1. What is a dummy/binary/indicator variable?
2. What are the 2 types of data?
3. What
when δ is small. Create a table with three columns:
i. In the first column, list values of x: 1.00, 1.01, 1.02, ..., 1.20.
ii. In the second column, evaluate ln(x) using a calculator.
iii. In the third column, show the difference between the actual value of ln(x) and the approximation ln(1+δ) ≈ δ, where x = 1+δ.
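A minimal Python sketch of the three-column table, assuming `math.log` stands in for the calculator and writing x = 1+δ. The function name `log_table` is only illustrative and does not come from the course.

```python
import math

def log_table(start=1.00, stop=1.20, step=0.01):
    """Return rows (x, ln(x), ln(x) - delta), where delta = x - 1."""
    rows = []
    n = round((stop - start) / step)      # number of steps between start and stop
    for i in range(n + 1):
        x = round(start + i * step, 2)    # avoid floating-point drift in x
        delta = x - 1.0
        actual = math.log(x)              # natural logarithm
        rows.append((x, actual, actual - delta))
    return rows

rows = log_table()
for x, actual, diff in rows:
    print(f"{x:.2f}  {actual:.5f}  {diff:+.5f}")
```

The third column is never positive, because ln(1+δ) ≤ δ for every δ > -1, and its size grows as δ moves away from 0.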
1. Which fascist allegedly came up with associations between variables?
2. What is the purpose of regression models?
3. What is y determined by?
4. What is the "true model/relationship"?
5. What equation describes the trendline?
6. i. What is the covariance?
ii. What can you interpret from it?
7. What is the correlation coefficient?
a. What can you interpret from it?
b. How do you know if there is a real relationship between 2 variables?
8. What is the p-value? What is the range of it? What is considered low/high value?
9. What does statistically significant mean? What p value/range labels it as such?
10. What are the rules for means/variances/covariances?
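For questions 6, 7, and 10, here is a small Python sketch of the covariance, the correlation coefficient, and a few of the standard rules for linear transformations. The data and the constants `a` and `b` are made up for illustration, and the sample versions (dividing by n - 1) are an assumption about the course's convention.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance, dividing by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def variance(xs):
    # The variance is the covariance of a variable with itself.
    return covariance(xs, xs)

def correlation(xs, ys):
    # Covariance rescaled by both standard deviations; always in [-1, 1].
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))

xs = [1.0, 2.0, 3.0, 4.0]   # made-up illustration data
ys = [2.0, 1.0, 4.0, 3.0]
a, b = 3.0, 5.0
scaled = [a * x + b for x in xs]   # the transformed variable aX + b

print(mean(scaled), a * mean(xs) + b)               # E[aX+b] = a*E[X] + b
print(variance(scaled), a ** 2 * variance(xs))      # Var(aX+b) = a^2 * Var(X)
print(covariance(scaled, ys), a * covariance(xs, ys))  # Cov(aX+b, Y) = a*Cov(X, Y)
print(correlation(scaled, ys), correlation(xs, ys))    # unchanged when a > 0
```

The covariance depends on the units of both variables, while the correlation does not, which is why the correlation is the easier one to interpret.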
1. Tell me about regression model! yay...
2. What do we use regressions for?
3. What are the requirements of the regression model?
STATA:
1. What are the qualifiers/if conditions in Stata?
2. How do you generate a dummy variable for people taller than 6 feet?
3. How do you generate a graph?
4. How do you find correlation?
What is a do file?
Problem Set 2:
Part 1) Suppose a dataset has N=5 individuals, with values of Xi and Yi:
Xi: 12; 8; 10; 3; 12
Yi: 12; 3.6; 9.6; 3.7; 6.5
1) Calculate b0 and b1
2) Calculate the predicted value ŷ for each individual
3) Calculate the residual û for each individual
4) Calculate the R^2 value
5) Suppose that, instead of using the variable X, we used V = 100X as the explanatory variable. How does it change the estimates of b0 and b1? (Answer this question using properties of regression coefficients, not by recalculating the coefficients.)
6) Suppose that, instead of using the variable X, we used W = X+10 as the explanatory variable. How does it change the estimates of b0 and b1? (Again, answer using properties of regression coefficients.)
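The calculations in Part 1 can be checked with a short Python sketch of the usual least-squares formulas, b1 = Sxy/Sxx and b0 = ȳ - b1·x̄. The helper names `ols` and `r_squared` are illustrative. The last lines re-estimate the line with V = 100X and W = X+10 only as a numerical check on the property-based answers, not as a replacement for them.

```python
def ols(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def r_squared(xs, ys):
    """Share of the variation in y explained by the fitted line."""
    b0, b1 = ols(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

X = [12, 8, 10, 3, 12]
Y = [12, 3.6, 9.6, 3.7, 6.5]

b0, b1 = ols(X, Y)
fitted = [b0 + b1 * x for x in X]               # predicted values
residuals = [y - f for y, f in zip(Y, fitted)]  # observed minus predicted
r2 = r_squared(X, Y)
print(b0, b1, r2)

# Numerical check for questions 5 and 6.
b0_v, b1_v = ols([100 * x for x in X], Y)   # V = 100X
b0_w, b1_w = ols([x + 10 for x in X], Y)    # W = X + 10
print(b0_v, b1_v)
print(b0_w, b1_w)
```

Multiplying X by 100 divides the slope by 100 and leaves the intercept alone. Adding 10 to X leaves the slope alone and lowers the intercept by 10·b1.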
Problem Set 2:
Part 2) Suppose a dataset has N=4 individuals, with values of Xi and Yi:
Xi: 10; 12; 14; 16
Yi: 18; 15; 10; 1
7) Estimate the model ln(Yi) = β0 + β1*Xi + ui.
8) Describe the relationship between a change in X and the change in Y (The answer involves percentages.)
--> Suppose we estimate b1 = -0.45. Which statement would accurately describe the relationship between changes in x and y?
A. When x increases by 1, y decreases by 45%.
B. When x increases by 100%, y decreases by 0.45%.
C. When x increases by 100%, y decreases by 45.
D. When x increases by 1, y decreases by 0.45.
E. When x increases by 1, y decreases by 0.45%.
9) Estimate the model ln(Yi) = b0 + b1*ln(Xi) + ui.
10) Describe the relationship between a change in X and the change in Y. (Again, the answer involves percentages.)
--> Suppose that we estimated β1 = 1.2. Which statement would accurately describe the relationship between changes in x and y?
A. When x increases by 100%, y increases by 1.2.
B. When x increases by 1, y increases by 1.2%.
C. When x increases by 100%, y increases by 120%.
D. When x increases by 1%, y increases by 12%.
E. When x increases by 1, y increases by 1.2.
F. When x increases by 1, y increases by 120%.
11) Can the R^2 values of the two regressions (one with x as an explanatory variable, the other with ln(x)) be compared, to determine which model fits the data better?
A. No, since one uses x as the explanatory variable and the other uses ln(x).
B. Yes, since the outcome variable is the same, ln(y), and the sample is the same.
C. No, since the models are not linear.
D. No, since the sample has fewer than 30 observations.
E. Yes, since the sample is the same and the number of observations is the same.
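Both models in Part 2 can be estimated with the same least-squares formulas after transforming the data. This is a sketch with an illustrative `ols` helper. The function `pct_change_per_unit` is an added assumption: it computes the exact percentage change implied by a log-linear slope, which the "100·b1 percent" reading only approximates.

```python
import math

def ols(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    return my - b1 * mx, b1

X = [10, 12, 14, 16]
Y = [18, 15, 10, 1]
ln_x = [math.log(x) for x in X]
ln_y = [math.log(y) for y in Y]

b0_semi, b1_semi = ols(X, ln_y)     # ln(Y) = b0 + b1*X
b0_log, b1_log = ols(ln_x, ln_y)    # ln(Y) = b0 + b1*ln(X)
print(b0_semi, b1_semi)
print(b0_log, b1_log)

def pct_change_per_unit(b1):
    """Exact % change in y when x rises by 1 in the model ln(y) = b0 + b1*x."""
    return 100 * (math.exp(b1) - 1)

# The usual reading of b1 = -0.45 is "about a 45% decrease per unit of x";
# the exact figure differs because b1 is not small.
print(pct_change_per_unit(-0.45))
```

In the log-log model the slope is read as an elasticity: a 1% increase in x goes with roughly a b1% change in y.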
Problem Set 2:
Part 3) Suppose that we want to study the demand function Q = a*P^b, where Q and P are quantities and prices, and a and b are unknown parameters.
12) Which of the following models is equivalent to Q = a*P^b and can be estimated by linear regression?
A. ln(Q) = ln(a) + b * ln(P)
B. ln(Q) = b * ln(a*Q)
C. ln(Q) = b * (ln(a) + ln(P))
D. Q = ln(a) + b * P
13) What is the economic interpretation of b?
A. Marginal utility
B. Opportunity cost
C. Discount factor
D. Marginal cost
E. Elasticity
14) According to standard demand theory, what can we predict about the sign or magnitude of b?
A. Negative, but nothing more
B. Positive and greater than 1
C. Positive and less than 1
D. Equal to -1
E. Between -1 and 0
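Taking logs of Q = a*P^b gives a model that is linear in ln(P), so it can be fit by ordinary least squares. The Python sketch below illustrates this with made-up values: `a_true`, `b_true`, and the price list are hypothetical, and the data contain no noise, so the regression recovers the parameters exactly (up to rounding).

```python
import math

def ols(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    return my - b1 * mx, b1

a_true, b_true = 50.0, -0.8                 # hypothetical parameters
prices = [1.0, 2.0, 4.0, 8.0, 16.0]         # hypothetical prices
quantities = [a_true * p ** b_true for p in prices]

# Regress ln(Q) on ln(P): the intercept estimates ln(a), the slope estimates b.
ln_a_hat, b_hat = ols([math.log(p) for p in prices],
                      [math.log(q) for q in quantities])
a_hat = math.exp(ln_a_hat)
print(a_hat, b_hat)
```

Because the slope multiplies ln(P) in an equation for ln(Q), it measures the percentage change in quantity per 1% change in price.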
Quiz 1:
1. In statistics, the word "skewed" means:
A. Extreme
B. Unbalanced
C. Biased
D. Obscured
2. A dummy variable takes only values of -1, 0, and +1; indicating whether a condition is false, neutral, or true. (T/F)
2. In the past three years, the annual interest rate on my savings account has been R1, R2, R3. (These values are all expressed as decimals.) Which formula can be used to calculate the average interest rate, R?
A. R = ((1+R1)(1+R2)(1+R3))^(1/3) - 1
B. R = 1 + ((1+R1) + (1+R2) + (1+R3))^(1/3)
C. R = 1 + ((1+R1)(1+R2)(1+R3))^(1/2)
D. R = √((1+R1)(1+R2)(1+R3))
E. R = 1 - √((1+R1) + (1+R2) + (1+R3)) / 3
3. Your sample contains 3 observations: 10, 12, 14. Which value is closest to the variance in the sample?
A. 1.00
B. 1.15
C. 1.33
D. 1.41
E. 2.00
F. 2.33
G. 4.00
4. If a variable is "skewed to the right", then:
A. The mean is greater than the median.
B. The mean is the same as the median.
C. The mean is less than the median.
D. The mean is a hyperreal number, and the median is a surreal number.
5. According to the empirical rule, the fractions of observations within one, two, and three standard deviations of the mean are often approximately:
A. 0, ⅔, 95%
B. ½, ⅔, 99%
C. 0, ¾, 88.9%
D. ½, ⅔, 95%
E. 33.3%, 66.6%, 99.9%
F. ½, 95%, 99%
G. ½, ⅔, ¾
H. ⅔, 95%, 99.9%
6. In all samples, at least what fraction of observations are within 1, 2, and 3 standard deviations of the mean?
A. 0, 1/3, 2/3
B. 1/2, 3/4, 7/8
C. 1/3, 2/3, 3/3
D. 0, 3/4, 8/9
7. In a sample, the variable X has a mean of 18 and a standard deviation of 5.9. What is the "standardized value" associated with 11.7?
8. If a statistician knows the "standardized value" or "z-score" of a particular observation, the statistician knows roughly how common or uncommon the observation is — without knowing the particular details of the distribution, like the mean or variance. (T/F)
9. What are two problems with using the range as a measure of dispersion? (multiple choice)
A. The range cannot be calculated in a sample with an odd number of observations.
B. The range is always less than the variance.
C. The range cannot be calculated when the sample is skewed.
D. The range depends on the variance in observations.
E. The range is determined by unusual observations.
F. The range is expected to change with the sample size.
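Several Quiz 1 questions come down to short formulas, sketched here in Python. The function names are illustrative. `sample_variance` assumes the n - 1 divisor (dividing by n gives a smaller number), and `chebyshev_lower_bound` is Chebyshev's inequality, which holds for every sample, unlike the empirical rule.

```python
import math

def sample_variance(values):
    """Variance with the n - 1 divisor (an assumption about the convention)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)

def z_score(value, mean, sd):
    """Standardized value: distance from the mean in standard deviations."""
    return (value - mean) / sd

def average_rate(rates):
    """Geometric average of per-period rates given as decimals."""
    growth = math.prod(1 + r for r in rates)    # total growth factor
    return growth ** (1 / len(rates)) - 1

def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k SDs of the mean (k >= 1)."""
    return 1 - 1 / k ** 2

print(sample_variance([10, 12, 14]))
print(z_score(11.7, 18, 5.9))
print(average_rate([0.02, 0.03, 0.01]))         # made-up rates
print([chebyshev_lower_bound(k) for k in (1, 2, 3)])
```

The geometric average is the constant rate that would produce the same total growth over the three years.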
Suppose that we estimate the regression line y = 3 + 4(x)
a) What is the change Δy when Δx = 2 ?
b) Suppose we have two observations in the dataset, and the difference in the x values is exactly Δx = 2, while the difference in their y values is Δy = 9.
--> Why is this finding different from the result in part (a)?
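One way to see the gap between parts (a) and (b) is to split each observation into its point on the line plus a residual. This Python sketch uses a hypothetical pair of observations, chosen only so that Δx = 2 and Δy = 9. The specific x and y values are not from the problem.

```python
b0, b1 = 3.0, 4.0                 # the estimated line y = 3 + 4x

def predicted_change(slope, dx):
    """Change in the fitted value when x changes by dx."""
    return slope * dx

dy_line = predicted_change(b1, 2.0)

# Hypothetical observations that are 2 apart in x and 9 apart in y.
x1, y1 = 1.0, 7.5
x2, y2 = 3.0, 16.5

u1 = y1 - (b0 + b1 * x1)          # residual of the first observation
u2 = y2 - (b0 + b1 * x2)          # residual of the second observation

# Observed difference = change along the line + change in the residuals.
dy_observed = y2 - y1
print(dy_line, dy_observed, u2 - u1)
```

The line describes the average relationship. Individual observations sit off the line, so differences between two particular observations also reflect their residuals.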
What is the relationship between x and y in the following circumstances?
1) Y=B0 + B1*ln(X) + u
2) ln(Y)=B0 + B1*ln(X) + u
3) ln(Y)=B0 + B1*X + u
4) Y=B0 + B1*(X) + u
5) y=A*B^X
1. Does correlation imply causality?
2. Does causality imply correlation?
3. What does the causal effect of x on y imply?
What are the factors contributing to observed correlation in observational data (aside from the causal effect), otherwise known as endogeneity?
What is the causal effect?
1. Another term for regression?
2. How do you measure how effective the regression model is at predicting outcome?
3. If MPG is the outcome in a regression model, what are potential explanatory (X) variables?
1) What is the marginal effect, AKA the slope?
2) Find marginal effect aka slope for the following:
i. y = a+bx
ii. y = a+bx+c(x^2)
iii. ln(y) = a+bx
4. How do you calculate % change?
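The marginal effect is the derivative dy/dx. The Python sketch below writes out the derivative for each of the three forms and compares two of them with a numerical derivative as a sanity check. All function names and the parameter values in the checks are illustrative. `pct_change` uses the usual (new - old) / old definition.

```python
import math

def slope_linear(a, b, x):
    """y = a + b*x  ->  dy/dx = b (the same at every x)."""
    return b

def slope_quadratic(a, b, c, x):
    """y = a + b*x + c*x^2  ->  dy/dx = b + 2*c*x (depends on x)."""
    return b + 2 * c * x

def slope_log_linear(a, b, x):
    """ln(y) = a + b*x  ->  y = exp(a + b*x), so dy/dx = b*y."""
    return b * math.exp(a + b * x)

def numerical_slope(f, x, h=1e-6):
    """Central-difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def pct_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old

print(slope_quadratic(1, 2, 3, 2.0),
      numerical_slope(lambda x: 1 + 2 * x + 3 * x ** 2, 2.0))
print(slope_log_linear(0.5, 0.1, 2.0),
      numerical_slope(lambda x: math.exp(0.5 + 0.1 * x), 2.0))
print(pct_change(50, 60))
```

In the log-linear case the slope in levels is b·y, so the proportional effect (dy/dx)/y is the constant b.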
Why do we have multiple variables in experiments?
What are the 2 types of variables you can have in regression?
EXAM 2 (NOT ON MIDTERM 1):
Probability Theory!
1. What is S or space?
2. What are A, B?
3. What are the 2 types of events?
4. What is the complement of an event?
5. What is the intersection of A and B?
6. What is ∩? What is ∪?
7. What is mutually exclusive?
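These definitions map directly onto set operations. Here is a Python sketch, assuming a sample space of one roll of a six-sided die. The events A, B, and C are made up for illustration.

```python
S = {1, 2, 3, 4, 5, 6}     # sample space: all possible outcomes
A = {2, 4, 6}              # event: the roll is even
B = {1, 2, 3}              # event: the roll is at most 3
C = {1, 3, 5}              # event: the roll is odd

complement_A = S - A       # outcomes in S that are not in A
intersection = A & B       # outcomes in both A and B
union = A | B              # outcomes in A or B (or both)

# Two events are mutually exclusive when they share no outcomes.
mutually_exclusive = A.isdisjoint(C)

print(complement_A, intersection, union, mutually_exclusive)
```

Here A and C are mutually exclusive, while A and B are not, since both contain the outcome 2.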
Quiz 2
1) Y = B0+B1*X
Mean of x = 5; SD of x = 2
Mean of y = -4; SD of y = 3
Cov(x,y) = -0.5
Calculate B0 and B1.
2)
a) In Stata, what do we use to set a variable equal to some value (often with commands "generate" or "replace")?
b) In Stata, what do we use to describe the condition when 2 values happen to already be the same (often an "if" statement or when defining a dummy variable)?
3) Work this question without a calculator: ln(2) = 0.69; ln(3) = 1.10...ln(6) = ?; ln(32) = ?
4) A "hat" on a variable means:
a. optimal
b. going to a party
c. standardized
d. derivative
e. estimate
5) In a regression, what is a residual?
a. difference between observed and predicted values of y
b. minimum value of objective function
c. unobserved characteristics that contribute to outcome
d. error in the estimates of the Bs
6) Looking at the following image (https://imgur.com/gallery/kNXqVUk), find the:
i. B0
ii. B1
iii. Fraction of differences in y that can be explained by x
iv. p-value to test whether there is a statistically significant relationship between x and y
7) Correlation implies causality (T/F)
8) Causality implies correlation (T/F)
9) What is unobserved heterogeneity?
a. chalk
b. linear dependence with an omitted variable
c. some other characteristic, associated with x, causing an effect on y
d. reverse causality
e. variable with no direct effect on y
f. difference in variability of the unobservables
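Questions 1 and 3 of Quiz 2 are direct formula applications, sketched in Python below. The slope is Cov(x, y)/Var(x) and the intercept follows from the fact that the fitted line passes through the means. The helper name `coefficients_from_moments` is illustrative. The log values are the rounded ones given in the question.

```python
def coefficients_from_moments(mean_x, sd_x, mean_y, cov_xy):
    """Return (B0, B1) using B1 = Cov(x, y) / Var(x), B0 = mean_y - B1*mean_x."""
    b1 = cov_xy / sd_x ** 2       # Var(x) is the squared standard deviation
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = coefficients_from_moments(mean_x=5, sd_x=2, mean_y=-4, cov_xy=-0.5)
print(b0, b1)

# Log rules: ln(a*b) = ln(a) + ln(b) and ln(a^k) = k*ln(a).
ln2, ln3 = 0.69, 1.10             # rounded values given in the question
ln6 = ln2 + ln3                   # 6 = 2 * 3
ln32 = 5 * ln2                    # 32 = 2^5
print(ln6, ln32)
```

The standard deviation of y is not needed for the coefficients. It would matter for the correlation or R^2.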
3. What is econometrics?
4. What are the 2 branches of statistics?
1. What is a population?
2. What is a parameter?
3. What is a sample?
4. What is a statistic?
1. What are the different organizations of data?
2. What are the common measures of central tendency? What do they express?
3. i.What is left skew?
ii. What is right skew?
iii. What is symmetric distribution?
What is