Suppose a researcher is considering developing an IV regression model with one regressor, Xi, and one instrument, Zi. If she has a sample of n = 113, in what range must the correlation coefficient between Xi and Zi lie for Zi to be considered a strong instrument?
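One common way to make this question concrete is the rule of thumb that an instrument is strong when the first-stage F-statistic exceeds 10. With a single instrument, the homoskedasticity-only first-stage F equals the squared t-statistic, F = (n − 2)r²/(1 − r²), where r is the sample correlation between Xi and Zi. The sketch below assumes that rule of thumb and that formula (the function names are illustrative, not from the source) and solves F > 10 for |r|.

```python
import math

def first_stage_f(r, n):
    """Homoskedasticity-only first-stage F with one instrument.

    Assumption: F = t^2 = (n - 2) * r^2 / (1 - r^2), where r = corr(X, Z).
    """
    return (n - 2) * r**2 / (1 - r**2)

def min_abs_corr(n, f_threshold=10.0):
    """Smallest |r| at which the first-stage F reaches f_threshold.

    Solving (n - 2) r^2 / (1 - r^2) = f for r^2 gives r^2 = f / (n - 2 + f).
    The threshold of 10 is the usual rule of thumb, assumed here.
    """
    return math.sqrt(f_threshold / (n - 2 + f_threshold))

r_min = min_abs_corr(113)
print(f"|corr(X, Z)| must exceed about {r_min:.4f}")
print(f"first-stage F at that bound: {first_stage_f(r_min, 113):.2f}")
```

Under these assumptions the bound is |r| > sqrt(10/121) = sqrt(10)/11, so the correlation would need to be below about −0.29 or above about 0.29.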
> Consider three random variables, X, Y, and Z. Suppose that Y takes on k values y1, …, yk; that X takes on l values x1, …, xl; and that Z takes on m values z1,
> Use the probability distribution given in Table 2.2 to compute (a) E(Y) and E(X); (b) σ²X and σ²Y; and (c) σXY and corr(X, Y). Data from Table 2.2:
> Macroeconomists have also noticed that interest rates change following oil price jumps. Let Rt denote the interest rate on three-month Treasury bills (in percentage points at an annual rate). The distributed lag regression relating the change in Rt
> Suppose Yt = 0 + ut, where ut follows a stationary stationary AR (1) ut = 1ut - 1 + u∼t with u∼t i.i.d. with mean 0 and variance 2u and |1
> Suppose that a(L) = (1 - L), with |1| < 1, and b(L) = 1 + L +2L2 +3L3 ……. a. Show that the product b(L)a(L) = 1, so that b(L) = a(L)-1. b. Why is the restriction |1| < 1 important?
> Consider the ADL model Yt = 5.3 + 0.2Yt−1 + 1.5Xt − 0.1Xt−1 + ũt, where Xt is strictly exogenous. a. Derive the impact effect of X on Y. b. Derive the first five dynamic multipliers. c. Derive the first five cumulative multipliers. d. Derive the lon
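For the ADL model above, the dynamic multipliers follow a simple recursion: the impact effect is the coefficient on Xt, the one-period multiplier adds the lagged-X coefficient to the autoregressive carry-over, and later multipliers decay geometrically. The sketch below is one way to compute them; it treats horizon 0 as the impact effect, which is a convention assumed here, and the function name is illustrative.

```python
from itertools import accumulate

def adl_multipliers(phi, b0, b1, max_horizon):
    """Dynamic multipliers of Y_t = c + phi*Y_{t-1} + b0*X_t + b1*X_{t-1} + u_t.

    Horizon 0 is the impact effect (b0); horizon 1 is phi*b0 + b1;
    each later horizon is phi times the previous multiplier.
    """
    mults = [b0]
    if max_horizon >= 1:
        mults.append(phi * b0 + b1)
    for _ in range(2, max_horizon + 1):
        mults.append(phi * mults[-1])
    return mults

# Coefficients taken from the exercise: 0.2 on Y_{t-1}, 1.5 on X_t, -0.1 on X_{t-1}.
phi, b0, b1 = 0.2, 1.5, -0.1
dynamic = adl_multipliers(phi, b0, b1, 5)
cumulative = list(accumulate(dynamic))   # running sums of the dynamic multipliers
long_run = (b0 + b1) / (1 - phi)         # limit of the cumulative multipliers

for h, (d, c) in enumerate(zip(dynamic, cumulative)):
    print(f"h={h}: dynamic={d:.5f}, cumulative={c:.5f}")
print(f"long-run cumulative multiplier = {long_run:.4f}")
```

The intercept 5.3 does not enter any multiplier, and the cumulative multipliers converge to (1.5 − 0.1)/(1 − 0.2) = 1.75.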
> Increases in oil prices have been blamed for several recessions in developed countries. To quantify the effect of oil prices on real economic activity, researchers have run regressions like those discussed in this chapter. Let GDPt denote the value of qu
> The moving average model of order q has the form Yt = β0 + et + b1et−1 + b2et−2 + … + bqet−q, where et is a serially uncorrelated random var
> Suppose Yt is the monthly value of the number of new home construction projects started in the United States. Because of the weather, Yt has a pronounced seasonal pattern; for example, housing starts are low in January and high in June. Let Jan denote t
> Suppose Yt follows the stationary AR (1) model Yt = 2.5 + 0.7Yt - 1 + ut, where ut is i.i.d. with E (ut) = 0 and var (ut) = 9. a. Compute the mean and variance of Yt. b. Compute the first two autocovariances of Yt. c. Compute the first two autocorrelatio
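The moments asked for in this AR(1) exercise have closed forms under stationarity: the mean is c/(1 − φ), the variance is σ²u/(1 − φ²), the j-th autocovariance is φ^j times the variance, and the j-th autocorrelation is φ^j. The following sketch evaluates them for the stated values; the function and variable names are illustrative.

```python
def ar1_moments(c, phi, sigma2_u, max_lag=2):
    """Stationary moments of Y_t = c + phi*Y_{t-1} + u_t with var(u_t) = sigma2_u."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    mean = c / (1 - phi)
    variance = sigma2_u / (1 - phi**2)
    autocov = [phi**j * variance for j in range(1, max_lag + 1)]
    autocorr = [phi**j for j in range(1, max_lag + 1)]
    return mean, variance, autocov, autocorr

# Values from the exercise: intercept 2.5, AR coefficient 0.7, var(u_t) = 9.
mean, variance, autocov, autocorr = ar1_moments(2.5, 0.7, 9.0)
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
print("autocovariances:", [round(g, 4) for g in autocov])
print("autocorrelations:", [round(r, 4) for r in autocorr])
```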
> In this exercise, you will conduct a Monte Carlo experiment to study the phenomenon of spurious regression discussed. In a Monte Carlo study, artificial data are generated using a computer, and then those artificial data are used to calculate the statist
> Prove the following results about conditional means, forecasts, and forecast errors: a. Let W be a random variable with mean μW and variance σ²W, and let c be a constant. Show that E[(W − c)²] = σ²W + (μW − c)². b. Consider the problem of forecasting Yt
> Consider two random variables, X and Y. Suppose that Y takes on k values y1, …, yk and that X takes on l values x1, …, xl. a. Show that; b. Use your answer to (a) to veri
> The forecaster in Exercise 15.2 augments her AR (4) model for IP growth to include four lagged values of ∆Rt, where Rt is the interest rate on three-month U.S. Treasury bills (measured in percentage points at an annual rate). a. The F-s
> Using the same data as in Exercise 15.2, a researcher tests for a stochastic trend in ln (IPt), using the following regression: where the standard errors shown in parentheses are computed using the homoskedasticity-only formula and the regressor t is a l
> The Index of Industrial Production (IPt) is a monthly time series that measures the quantity of industrial commodities produced in a given month. This problem uses data on this index for the United States. All regressions are estimated over the sample pe
> Suppose Yt follows a random walk, Yt = Yt−1 + ut, for t = 1, …, T, where Y0 = 0 and ut is i.i.d. with mean 0 and variance σ²u. a. Compute the mean and variance of Yt. b. Compute the covariance between Yt and Yt−k. c. Use the results in (a) and (b) to sh
> Consider the stationary AR(1) model Yt = b0 + b1Yt−1 + ut, where ut is i.i.d. with mean 0 and variance σ²u. The model is estimated using data from time periods t = 1 through t = T, yielding the OLS estimators b^0 and b^1. You are interested in forecasti
> Suppose ∆Yt follows the AR(1) model ∆Yt = β0 + β1∆Yt−1 + ut. a. Show that Yt follows an AR(2) model. b. Derive the AR(2) coefficients for Yt as a function of β0 and β1.
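Part (a) of this exercise follows from substituting ∆Yt = Yt − Yt−1 and collecting terms. The derivation below writes the intercept and slope of the ∆Yt model as β0 and β1 (assumed notation for the two coefficients).

```latex
\begin{aligned}
\Delta Y_t &= \beta_0 + \beta_1 \Delta Y_{t-1} + u_t \\
Y_t - Y_{t-1} &= \beta_0 + \beta_1 \left( Y_{t-1} - Y_{t-2} \right) + u_t \\
Y_t &= \beta_0 + (1 + \beta_1)\, Y_{t-1} - \beta_1\, Y_{t-2} + u_t
\end{aligned}
```

So Yt is an AR(2) with coefficient 1 + β1 on Yt−1 and −β1 on Yt−2. The two AR coefficients sum to 1, which reflects the unit root in Yt.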
> A researcher carries out a QLR test using 30% trimming, and there are q = 5 restrictions. Answer the following questions, using the values in Table 15.5 (“Critical Values of the QLR Statistic with 15% Trimming”) and Ap
> Consider the AR(1) model Yt = β0 + β1Yt−1 + ut. Suppose the process is stationary. a. Show that E(Yt) = E(Yt−1). b. Show that E(Yt) = β0/(1 − β1).
> You have a sample of size n = 1 with data y1 = 2 and x1 = 1. You are interested in the value of β in the regression Y = βX + u. a. Plot the sum of squared residuals (y1 − bx1)² as a function of b. b. Show that the least squares estimate of b is b^OLS = 2.
> Let X and Y be two random variables. Denote the mean of Y given X = x by μ(x) and the variance of Y by σ²(x). a. Show that the best (minimum MSPE) prediction of Y given X = x is μ(x) and the resulting MSPE is σ²(x). b. Suppose X is chosen at random. Use
> In any year, the weather can inflict storm damage to a home. From year to year, the damage is random. Let Y denote the dollar value of damage in any given year. Suppose that in 95% of the years Y = $0, but in 5% of the years Y = $30,000. a. What are the
> In Exercise 14.5(b), suppose you predict Y using Ȳ − 1 instead of Ȳ. a. Compute the bias of the prediction. b. Compute the mean of the prediction error. c. Compute the variance of the prediction error. d. Compute the MSPE of the prediction. e. Does Ȳ − 1
> In Exercise 14.5(b), suppose you predict Y using Ȳ/2 instead of Ȳ. a. Compute the bias of the prediction. b. Compute the mean of the prediction error. c. Compute the variance of the prediction error. d. Compute the MSPE of the prediction. e. Does Ȳ/2 prod
> Y is a random variable with mean μ = 2 and variance σ² = 25. a. Suppose you know the value of μ. i. What is the best (lowest MSPE) prediction of the value of Y? That is, what is the oracle prediction of Y? ii. What is the MSPE of this prediction? b. Supp
> Describe the relationship, if any, between the standard error of a regression and the square root of the MSPE of the regression’s out-of-sample predictions.
> Consider the fixed-effects panel data model Yjt = αj + ujt for j = 1, …, k and t = 1, …, T. Assume that ujt is i.i.d. across entities j and over time t wi
> Let X1 and X2 be two positively correlated random variables, both with variance 1. a. (Requires calculus) The first principal component, PC1, is the linear combination of X1 and X2 that maximizes var (w1X1 + w2X2), where Show that b. The second principal
> You have a sample of size n = 1 with data y1 = 2 and x1 = 1. You are interested in the value of β in the regression Y = βX + u. (Note there is no intercept.) a. Plot the sum of squared residuals (y1 − bx1)² as a function of b. b. Show that the least square
> A researcher is interested in predicting average test scores for elementary schools in Arizona. She collects data on three variables from 200 randomly chosen Arizona elementary schools: average test scores (TestScore) on a standardized test, the fraction
> Derive the final equality in Equation (13.10). (Use the definition of the covariance, and remember that, because the actual treatment Xi is random, b1i and Xi are independently distributed.) Data from Equation 13.10:
> Suppose you have the same data as in Exercise 13.7 (panel data with two periods, n observations), but ignore the W regressor. Consider the alternative regression model where Gi = 1 if the individual is in the treatment group and Gi = 0 if the individual
> Yi, i = 1, …, n, are i.i.d. Bernoulli random variables with p = 0.6. Let Ȳ denote the sample mean. a. Use the central limit theorem to compute approximations for i. Pr(Ȳ ≥ 0.64) when n = 50. ii. Pr (Y Y > 0.55) = 0.95? (Use the central limit theorem
> Suppose you have panel data from an experiment with T = 2 periods (so t = 1, 2). Consider the panel data regression model with fixed individual and time effects and individual characteristics Wi that do not change over time. Let the treatment be binary,
> Suppose there are panel data for T = 2 time periods for a randomized controlled experiment, where the first observation (t = 1) is taken before the experiment and the second observation (t = 2) is for the post treatment period. Suppose the treatment is b
> Consider a study to evaluate the effect on college student grades of dorm room Internet connections. In a large dorm, half the rooms are randomly wired for high-speed Internet connections (the treatment group), and final course grades are collected for a
> A new law will increase minimum wages in City A next year but not in City B, a city much like City A. You collect employment data from a randomly selected sample of restaurants in cities A and B this year, and you plan to return and collect data at restaur
> Suppose that, in a randomized controlled experiment of the effect of an SAT preparatory course on SAT scores, the following results are reported: a. Estimate the average treatment effect on test scores. b. Is there evidence of non-random assignment? Expl
> For the following calculations, use the results in column (3) of Table 13.2. Consider two classrooms, A and B, which have identical values of the regressors in column (3) of Table 13.2, except that: a. Classroom A is a small class, and classroom B is a r
> Consider the potential outcomes framework. Suppose Xi is a binary treatment that is independent of the potential outcomes Yi(1) and Yi(0). Let TEi = Yi(1) − Yi(0) denote the treatment effect for individual i. a. Can you consistently estimate E [Yi (
> Results of a study by McClellan, McNeil, and Newhouse are reported. They estimate the effect of cardiac catheterization on patient survival times. They instrument the use of cardiac catheterization by the distance between a patient’s home and a hospital
> Consider the regression model with heterogeneous regression coefficients Yi = β0 + β1iXi + vi, where (vi, Xi, β1i) are i.i.d. random variables with β1 = E(β1i). a. Show that the model can be written as Yi = β0 + β1Xi + ui, where ui = (β1i − β1)Xi + vi.
> How would you calculate the small class treatment effect from the results in Table 13.1? Can you distinguish this treatment effect from the aide treatment effect? How would you have to change the program to correctly estimate both effects? Data from Tab
> Y is distributed N (10, 100) and you want to calculate Pr (Y
> A researcher is interested in the effect of more secure property rights on income across countries. He collects recent data from 60 countries and runs the OLS regression Yi = β0 + β1Xi + ui, where Yi is a country’s GDP per capita and Xi is an index takin
> Consider a product market with a supply function Qsi = β0 + β1Pi + usi, a demand function Qdi = γ0 + udi, and a market equilibrium condition Qsi = Qdi, where usi and udi are mutually independent i.i.d. random variables, both with a mean of 0. a. Show tha
> A classmate has developed an IV regression model with one regressor, Xi, and two instruments, Z1i and Z2i. She has a strong theoretical basis as to why corr (Z1i, ui) = 0, namely that Z1i is the result of a random lottery. Preliminary work, however, show
> Consider the IV regression model Yi = β0 + β1Xi + β2Wi + ui, where Xi is correlated with ui and Zi is an instrument. Suppose that the first three assumptions in key concept are satisfied. Which IV assumption is not satisfied when a. Zi is independent of
> Consider TSLS estimation of the effect of a single included endogenous variable, Xi, on Yi using one binary instrument, Zi, which takes values of either 0 or 1. Noting that show that the Wald estimator can be derived from the TSLS estimator in this circu
> A classmate is interested in estimating the variance of the error term in Equation (12.1). a. Suppose she uses the estimator from the second-stage regression of where X^i is the fitted value from the first-stage regression. Is this estimator consistent?
> Consider the regression model with a single regressor: Yi = β0 + β1Xi + ui. Suppose the least squares assumptions in Key Concept 4.3 are satisfied. a. Show that Xi is a valid instrument. That is, show that Zi = Xi. b. Show that the IV regression assumpti
> Two classmates are comparing their answers to an assignment. One classmate has specified an instrumental variable regression model Yi = β0 + β1Xi + β2Wi + ui, using Zi as an instrument. The other student has specified the same model, but has omitted Wi.
> This question refers to the panel data IV regressions summarized in Table. a. Suppose the federal government is considering a new tax on cigarettes that is estimated to increase the retail price by $0.25 per pack. If the current price per pack is $6.75,
> Suppose Yi, i = 1, 2, …, n, are i.i.d. random variables, each distributed N(20, 4). a. Compute Pr (19.6
> Use the estimated linear probability model shown in column (1) of Table 11.2 to answer the following: a. Two applicants, one self-employed and one in salaried employment, apply for a mortgage. They have the same values for all the regressors other than e
> Consider the linear probability model Yi = β0 + β1Xi + ui, and assume that E(ui | Xi) = 0. a. Show that Pr(Yi = 1 | Xi) = β0 + β1Xi. b. Show that var(ui | Xi) = (β0 + β1Xi)[1 − (β0 + β1Xi)]. c. Is ui heteroskedastic? Explain. d. Derive the likelih
> Repeat Exercise 11.6 using the logit model in Equation (11.10). Are the logit and probit results similar? Explain. Data from Equation 11.10: Data from Exercise 11.6: Use the estimated probit model in Equation (11.8) to answer the following questions: a
> Use the estimated probit model in Equation (11.8) to answer the following questions: a. A black mortgage applicant has a P/I ratio of 0.35. What is the probability that his application will be denied? b. Suppose the applicant reduced this ratio to 0.30.
> Seven hundred income-earning individuals from a district were randomly selected and asked whether they are government employees (Govi = 1) or not (Govi = 0); data were also collected on their gender (Malei = 1 if male and = 0 if female) and their years o
> State which model you would use for: a. A study explaining the number of hours a person spends working in a factory during one week. b. A study explaining the level of satisfaction (0 through 5) a person gains from their job. c. A study of consumers’ cho
> Suppose a random variable Y has the following probability distribution: Pr(Y = 1) = p, Pr(Y = 2) = q, and Pr(Y = 3) = 1 − p − q. A random sample of size n is drawn from this distribution, and the random variables are denoted Y1, Y2, …, Yn. a. Derive
> In a population, μY = 50 and σ²Y = 21. Use the central limit theorem to answer the following questions: a. In a random sample of size n = 50, find Pr (Y 49). c. In a random sample of size n = 45, find Pr (50.5
> a. In the fixed effects regression model, are the fixed entity effects, αi, consistently estimated as n → ∞ with T fixed? b. If n is large (say, n = 2000) but T is small (say, T = 4), do you think that the estimated values of αi are approxima
> Consider observations (Yit, Xit) from the linear panel data model Yit = Xitβ1 + αi + λit + uit, where t = 1, …, T; i = 1, …, n; and αi + λit is an unobserved entity-specific time trend. How would you estimate β1?
> Suppose a researcher believes that the occurrence of natural disasters such as earthquakes leads to increased activity in the construction industry. He decides to collect province-level data on employment in the construction industry of an earthquake-pro
> Do the fixed effects regression assumptions imply that cov(ṽit, ṽis) = 0 for t ≠ s in Equation (10.28)? Explain. Data from Equation 10.28:
> Consider the model with a single regressor. This model also can be written as Yit = β0 + β1X1,it + δ2B2t + … + δTBTt + γ2D2i + … + γnDni + uit, where B2t = 1 if t = 2 and 0 otherwise, D2i = 1 if i = 2 and 0 otherwise, and so forth. How are the coefficien
> Using the regression in Equation (10.11), what are the slope and intercept for a. Entity 1 in time period 1? b. Entity 1 in time period 3? c. Entity 3 in time period 1? d. Entity 3 in time period 3? Data from Equation 10.11:
> Section 9.2 gave a list of five potential threats to the internal validity of a regression study. Apply that list to the empirical analysis and thereby draw conclusions about its internal validity. Data from Section 9.2: 1. Omitted Variable Bias 2. Meas
> Consider the binary variable version of the fixed effects model in Equation except with an additional regressor, D1i; that is, let a. Suppose that n = 3. Show that the binary regressors and the “constant” regressor are
> Let ^1DM denote the entity-demeaned estimator given in Equation (10.22), and let ^BA1 denote the “before and after” estimator without an intercept, so that Show that, if Data from E
> X is a Bernoulli random variable with Pr (X = 1) = 0.90; Y is distributed N (0, 4); W is distributed N (0, 16); and X, Y, and W are independent. Let S = XY + (1 – X) W. (That is, S = Y when X = 1, and S = W when X = 0.) a. Show that E(Y2) = 4 and E(W2) =
> A researcher wants to estimate the determinants of annual earnings—age, gender, schooling, union status, occupation, and sector of employment. He has been told that if he collects panel data on a large number of randomly chosen individuals over time, he
> This exercise refers to the drunk driving panel data regression summarized in Table 10.1. a. New Jersey has a population of 8.85 million people. Suppose New Jersey increased the tax on a case of beer by $2 (in 1988 dollars). Use the results in column (5)
> Consider the linear regression of TestScore on Income shown in Figure 8.2 and the nonlinear regression in Equation (8.18). Would either of these regressions provide a reliable estimate of the causal effect of income on test scores? Would either of these
> Would the regression in Equation (4.9) in chapter 4 be useful for predicting test scores in a school district in Massachusetts? Why or why not? Data from Equation 4.9:
> Are the following statements true or false? Explain your answer. a. “An ordinary least squares regression of Y onto X will not be internally valid if Y is correlated with the error term.” b. “If the error term exhibits heteroskedasticity, then the estima
> Suppose that n = 50 i.i.d. observations for (Yi, Xi) yield the following regression results: Y^ = 49.2 + 73.9X, SER = 13.4, R² = 0.78. Another researcher is interested in the same regression, but he makes an error when he enters the data into his regress
> The demand for a commodity is given by Q = β0 + β1P + u, where Q denotes quantity, P denotes price, and u denotes factors other than price that determine demand. Supply for the commodity is given by Q = γ0 + γ1P + v, where v denotes factors other than pr
> Using the regressions shown in columns (2) of Table 9.3, and column (2) of Table 9.2, construct a table like Table 9.3 and compare the estimated effects of a 10 percentage point increase in the students eligible for free lunch on test scores in Californi
> Labor economists studying the determinants of women’s earnings discovered a puzzling empirical result. Using randomly selected employed women, they regressed earnings on the women’s number of children and a set of control variables (age, education, occup
> Consider the one-variable regression model Yi = β0 + β1Xi + ui, and suppose it satisfies the least squares assumptions. Suppose Yi is measured with error, so the data are Ỹi = Yi + wi, where wi is the measurement error, which is i.i.d. and independent o
> Compute the following probabilities: a. If Y is distributed t12, find Pr (Y
> Assume that the regression model Yi = β0 + β1Xi + ui satisfies the least squares assumptions. You and a friend collect a random sample of 300 observations on Y and X. a. Your friend reports that he inadvertently scrambled the X observations for 20% of th
> Consider the one-variable regression model Yi = β0 + β1Xi + ui, and suppose it satisfies the least squares assumptions in Key Concept 4.3. The regressor Xi is missing, but data on a related variable, Zi, are available, and the value of Xi is estimated us
> Read the box “The Demand for Economics Journals” in Section 8.3. Discuss the internal and external validity of the estimated effect of price per citation on subscriptions. Data from Section 8.3: Professional economist
> Read the box “The Effect of Ageing on Healthcare Expenditures: A Red Herring?” in Section 8.3. Discuss the internal and external validity as a causal effect of the relationship between age and healthcare expenditures,
> Suppose that you have just read a careful statistical study of the effect of improved health of children on their test scores at school. Using data from a project in a West African district in 2000, the study concluded that students who received multivit
> Explain how you would use approach 2 to calculate the confidence interval discussed below Equation (8.8). Data from Equation 8.8:
> X is a continuous variable that takes on values between 5 and 100. Z is a binary variable. Sketch the following regression functions (with values of X between 5 and 100 on the horizontal axis and values of Y^ on the vertical axis): a. Y^ = 2.0 + 3.0 * ln
> This problem is inspired by a study of the gender gap in earnings in top corporate jobs (Bertrand and Hallock, 2001). The study compares total compensation among top executives in a large set of U.S. public corporations in the 1990s. (Each year these pub
> Refer to Table 8.3. a. A researcher suspects that %Eligible for subsidized lunch has a nonlinear effect on test scores. In particular, he conjectures that increases in this variable from 10% to 20% have little effect on test scores but that
> Read the box “The Demand for Economics Journals” in Section 8.3. a. The box reaches three conclusions. Looking at the results in the table, what is the basis for each of these conclusions? b. Using the results in regre
> Compute the following probabilities: a. If Y is distributed χ²3, find Pr (Y