Use CONSUMP for this exercise. One version of the permanent income hypothesis (PIH) of consumption is that the growth in consumption is unpredictable. [Another version is that the change in consumption itself is unpredictable; see Mankiw (1994, Chapter 15) for discussion of the PIH.] Let $gc_t = \log(c_t) - \log(c_{t-1})$ be the growth in real per capita consumption (of nondurables and services). Then the PIH implies that $E(gc_t \mid I_{t-1}) = E(gc_t)$, where $I_{t-1}$ denotes information known at time $t-1$; in this case, $t$ denotes a year.

(i) Test the PIH by estimating $gc_t = \beta_0 + \beta_1 gc_{t-1} + u_t$. Clearly state the null and alternative hypotheses. What do you conclude?

(ii) To the regression in part (i) add the variables $gy_{t-1}$, $i3_{t-1}$, and $\text{inf}_{t-1}$. Are these new variables individually or jointly significant at the 5% level? (Be sure to report the appropriate p-values.)

(iii) In the regression from part (ii), what happens to the p-value for the t statistic on $gc_{t-1}$? Does this mean the PIH hypothesis is now supported by the data?

(iv) In the regression from part (ii), what is the F statistic and its associated p-value for joint significance of the four explanatory variables? Does your conclusion about the PIH now agree with what you found in part (i)?
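Part (i) amounts to a two-sided t test of H0: β1 = 0 (consumption growth is unpredictable from its own lag) against H1: β1 ≠ 0 in the regression of consumption growth on its first lag. The following is a minimal sketch of that calculation in plain Python. The function name `ar1_t_test` and the series `fake_gc` are hypothetical, and the data are simulated stand-ins, not the CONSUMP file, so the printed numbers say nothing about the actual exercise.

```python
import math
import random


def ar1_t_test(g):
    """Regress g[t] on an intercept and g[t-1] by OLS.

    Returns (b0, b1, se_b1, t_b1), where t_b1 is the usual t statistic
    for H0: beta1 = 0 (homoskedasticity-only standard error).
    """
    y = g[1:]    # g_t
    x = g[:-1]   # g_{t-1}
    n = len(y)
    if n < 3:
        raise ValueError("need at least 4 observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = ssr / (n - 2)            # error variance estimate, n - 2 df
    se_b1 = math.sqrt(sigma2 / sxx)
    return b0, b1, se_b1, b1 / se_b1


# Simulated stand-in for the consumption growth series (NOT the CONSUMP data).
random.seed(0)
fake_gc = [0.02 + random.gauss(0.0, 0.01) for _ in range(40)]
b0, b1, se_b1, t_b1 = ar1_t_test(fake_gc)
print(f"b1 = {b1:.3f}, se = {se_b1:.3f}, t = {t_b1:.2f}")
```

With the real data, the t statistic would be compared with a critical value from the t distribution with n - 2 degrees of freedom; a rejection of H0 counts as evidence against this version of the PIH.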
> Use the data in SCHOOL93_98 to answer the following questions. Use the command xtset schid year to set the cross section and time dimensions. (i) How many schools are there? Does each school have a record for each of the six years? Verify that lavgrexpp
> Use the data in COUNTYMURDERS to answer this question. The data set covers murders and executions (capital punishment) for 2,197 counties in the United States. (i) Consider the model murdrateit = (t + 0execsit + 1execsi,t-1 + 2execsi,t-2 + 3execsi,t
> This question assumes that you have access to a statistical package that computes standard errors robust to arbitrary serial correlation and heteroskedasticity for panel data methods. (i) For the pooled OLS estimates, obtain the standard errors that allo
> Use the data in RENTAL for this exercise. The data on rental prices and other variables for college towns are for the years 1980 and 1990. The idea is to see whether a stronger presence of students affects rental rates. The unobserved effects model is wh
> The purpose of this exercise is to compare the estimates and standard errors obtained by correctly using 2SLS with those obtained using inappropriate procedures. Use the data file WAGE2. (i) Use a 2SLS routine to estimate the equation Log(wage) = 0 + 1
> Use the data in PHILLIPS for this exercise. (i) We estimated an expectation augmented Phillips curve of the form $\Delta\text{inf}_t = \beta_0 + \beta_1 \text{unem}_t + e_t$, where $\Delta\text{inf}_t = \text{inf}_t - \text{inf}_{t-1}$. In estimating this equation by OLS, we assumed that the supply shock, $e_t$, was uncorre
> Use the data in CARD for this exercise. (i) In Table 15.1, the difference between the IV and OLS estimates of the return to education is economically important. Obtain the reduced form residuals, $\hat{v}_2$, from the reduced form regression educ on nearc4, expe
> Consider the analysis in Computer Exercise C11 in Chapter 4 using the data in HTV, where educ is the dependent variable in a regression. (i) How many different values are taken on by educ in the sample? Does educ have a continuous distribution? (ii) Plot
> Use the data in LABSUP to answer the following questions. These are data on almost 32,000 black or Hispanic women. Every woman in the sample is married. It is a subset of the data used in Angrist and Evans (1998). Our interest here is in determining how
> Use the data in WAGE2 for this exercise. (i) If sibs is used as an instrument for educ, the IV estimate of the return to education is .122. To convince yourself that using sibs as an IV for educ is not the same as just plugging sibs in for educ and runni
> For this exercise, use the data in AIRFARE, but only for the year 1997. (i) A simple demand function for airline seats on routes in the United States is $\log(\text{passen}) = \beta_{10} + \alpha_1 \log(\text{fare}) + \beta_{11} \log(\text{dist}) + \beta_{12} [\log(\text{dist})]^2 + u_1$, where passen = average passen
> (i) Suppose that, after differencing to remove the unobserved effect, you think $\Delta\log(\text{polpc})$ is simultaneously determined with $\Delta\log(\text{crmrte})$; in particular, increases in crime are associated with increases in police officers. How does this help to explain
> Use the Economic Report of the President (2005 or later) to update the data in CONSUMP, at least through 2003. Reestimate equation. Do any important conclusions change?
> (i) Because log(pcinc) is insignificant in both (16.22) and the reduced form for open, drop it from the analysis. Estimate by OLS and IV without log(pcinc). Do any important conclusions change? (ii) Still leaving log(pcinc) out of the analysis, is land o
> A common method for estimating Engel curves is to model expenditure shares as a function of total expenditure, and possibly demographic variables. A common specification has the form $\text{sgood} = \beta_0 + \beta_1 \text{ltotexpend} + \text{demographics} + u$, where sgood is the fract
> (i) A model to estimate the effects of smoking on annual income (perhaps through lost work days due to illness, or productivity effects) is $\log(\text{income}) = \beta_0 + \beta_1 \text{cigs} + \beta_2 \text{educ} + \beta_3 \text{age} + \beta_4 \text{age}^2 + u_1$, where cigs is number of cigarettes smoked per day, on av
> Use the data in APPLE for this exercise. These are telephone survey data attempting to elicit the demand for a (fictional) “ecologically friendly” apple. Each family was (randomly) presented with a set of prices for regular apples and the ecolabeled appl
> (i) Using the 428 women who were in the workforce, estimate the return to education by OLS including exper, $\text{exper}^2$, nwifeinc, age, kidslt6, and kidsge6 as explanatory variables. Report your estimate on educ and its standard error. (ii) Now, estimate the
> Use the data in MLB1 for this exercise. (i) Use the model estimated in equation (4.31) and drop the variable rbisyr. What happens to the statistical significance of hrunsyr? What about the size of the coefficient on hrunsyr? (ii) Add the variables runsyr
> There, we used the data in FERTIL1 to estimate a linear model for kids, the number of children ever born to a woman. (i) Estimate a Poisson regression model for kids. Interpret the coefficient on y82. (ii) What is the estimated percentage difference in
> (i) For what percentage of the workers in the sample is pension equal to zero? What is the range of pension for workers with nonzero pension benefits? Why is a Tobit model appropriate for modeling pension? (ii) Estimate a Tobit model explaining pension i
> Use the data in JTRAIN98 to answer the following questions. Here you will use a Tobit model because the outcome, earn98, sometimes is zero. (i) How many observations (men) in the sample have earn98 = 0? Is it a large percentage of the sample? (ii) Estima
> Use the data set in ALCOHOL, obtained from Terza (2002), to answer this question. The data, on 9,822 men, includes labor market information, whether the man abuses alcohol, and demographic and background variables. In this question you will study the eff
> (i) Using OLS on the full sample, estimate a model for log(wage) using explanatory variables educ, abil, exper, nc, west, south, and urban. Report the estimated return to education and its standard error. (ii) Now estimate the equation from part (i) usin
> Use the data in CPS91 for this exercise. These data are for married women, where we also have information on each husband’s income and demographics. (i) What fraction of the women report being in the labor force? (ii) Using only the data for working wo
> (i) The variable favwin is a binary variable if the team favored by the Las Vegas point spread wins. A linear probability model to estimate the probability that the favored team wins is $P(\text{favwin} = 1 \mid \text{spread}) = \beta_0 + \beta_1 \text{spread}$. Explain why, if the spread inc
> (i) Let $y_t$ be real per capita disposable income. Use the data through 1989 to estimate the model $y_t = \alpha + \beta t + \rho y_{t-1} + u_t$ and report the results in the usual form. (ii) Use the estimated equation from part (i) to forecast y in 1990. What is the forecast
> (i) Graph gfr against time. Does it contain a clear upward or downward trend over the entire sample period? (ii) Using the data through 1979, estimate a cubic time trend model for gfr (that is, regress gfr on $t$, $t^2$, and $t^3$, along with an intercept). Comm
> (i) Estimate the linear trend model $\text{chnimp}_t = \alpha + \beta t + u_t$, using the first 119 observations (this excludes the last 12 months of observations for 1988). What is the standard error of the regression? (ii) Now, estimate an AR(1) model for chnimp, again usi
> (i) It may be that the expected value of the return at time t, given past returns, is a quadratic function of $\text{return}_{t-1}$. To check this possibility, use the data in NYSE to estimate $\text{return}_t = \beta_0 + \beta_1 \text{return}_{t-1} + \beta_2 \text{return}_{t-1}^2 + u_t$; report the results in sta
> Use the data in PHILLIPS for this exercise. (i) Estimate the models represented in equations using the data through 2015. (ii) Use the new equations to forecast unem2016; round to two decimal places. Which equation produces a better forecast? (iii) Use t
> (i) In Example 18.7, we estimated an error correction model for the holding yield on six-month T-bills, where one lag of the holding yield on three-month T-bills is the explanatory variable. We assumed that the cointegration parameter was one in the equa
> In testing for cointegration between gfr and pe, add $t^2$ to equation to obtain the OLS residuals. Include one lag in the augmented DF test. The 5% critical value for the test is -4.15.
> (i) Estimate an AR(3) model for pcip. Now, add a fourth lag and verify that it is very insignificant. (ii) To the AR(3) model from part (i), add three lags of pcsp to test whether pcsp Granger causes pcip. Carefully state your conclusion. (iii) To the m
> (i) Using all of the years, through 2017, run the regression $\Delta\text{inf}_t$ on $\text{inf}_{t-1}$ (and an intercept) and test the null hypothesis that $\{\text{inf}_t\}$ is I(1) against the alternative that it is I(0). At what significance level do you reject the null hypothesis? (ii) Wha
> This question asks you to study the so-called Beveridge Curve from the perspective of cointegration analysis. The U.S. monthly data from December 2000 through February 2012 are in BEVERIDGE. (i) Test for a unit root in urate using the usual Dickey-Fuller
> Use the data in MINWAGE.DTA for sector 232 to answer the following questions. (i) Confirm that lwage232t and lemp232t are best characterized as I(1) processes. Use the augmented DF test with one lag of gwage232 and gemp232, respectively, and a linear tim
> Use the data in TRAFFIC2 for this exercise. These monthly data, on traffic accidents in California over the years 1981 to 1989, were used in Computer Exercise C11 in Chapter 10. (i) Using the standard Dickey-Fuller regression, test whether ltotacct has a
> This exercise also uses the data from VOLAT. Here, you will study the question of Granger causality using the percentage changes. (i) Estimate an AR(3) model for pcipt, the percentage change in industrial production (reported at an annualized rate). Show
> Use the data in VOLAT for this exercise. (i) Confirm that lsp500 = log(sp500) and lip = log(ip) appear to contain unit roots. Use Dickey-Fuller tests with four lagged changes and do the tests with and without a linear time trend. (ii) Run a simple regres
> In equation (4.42) of Chapter 4, using the data set BWGHT, compute the LM statistic for testing whether motheduc and fatheduc are jointly significant. In obtaining the residuals for the restricted model, be sure that the restricted model is estimated usi
> (i) Using the data from all but the last four years (16 quarters), estimate an AR(1) model for $\Delta r6_t$. (We use the difference because it appears that $r6_t$ has a unit root.) Find the RMSE of the one-step-ahead forecasts for $\Delta r6$, using the last 16 quarters. (
> Use the data in WAGEPRC for this exercise. Problem 5 in Chapter 11 gave estimates of a finite distributed lag model of gprice on gwage, where 12 lags of gwage are used. (i) Estimate a simple geometric DL model of gprice on gwage. In particular, estimate
> Use the data in RENTAL for this exercise. The data for the years 1980 and 1990 include rental prices and other variables for college towns. The idea is to see whether a stronger presence of students affects rental rates. The unobserved effects model is L
> Use the data in KIELMC for this exercise. (i) The variable dist is the distance from each home to the incinerator site, in feet. Consider the model $\log(\text{price}) = \beta_0 + \delta_0 y81 + \beta_1 \log(\text{dist}) + \delta_1 y81 \cdot \log(\text{dist}) + u$. If building the incinerator reduces the value
> Consider the version of Fair’s model in Example 10.6. Now, rather than predicting the proportion of the two-party vote received by the Democrat, estimate a linear probability model for whether or not the Democrat wins. (i) Use the binary variable demwins
> (i) In part (i) of Computer Exercise C6 in Chapter 11, you were asked to estimate the accelerator model for inventory investment. Test this equation for AR(1) serial correlation. (ii) If you find evidence of serial correlation, re-estimate the equation b
> Use the data in OKUN to answer this question. (i) Estimate the equation $\text{pcrgdp}_t = \beta_0 + \beta_1 \text{cunem}_t + u_t$ and test the errors for AR(1) serial correlation, without assuming $\{\text{cunem}_t: t = 1, 2, \ldots\}$ is strictly exogenous. What do you conclude? (ii) Regress th
> Use the data in NYSE to answer these questions. (i) Estimate the model in equation (12.47) and obtain the squared OLS residuals. Find the average, minimum, and maximum values of $\hat{u}_t^2$ over the sample. (ii) Use the squared OLS residuals to estimate the fol
> Okun’s Law—see, for example, Mankiw (1994, Chapter 2)—implies the following relationship between the annual percentage change in real GDP, pcrgdp, and the change in the annual unemployment rate, cunem: pcrgdp = 3 - 2 * cunem. If the unemployment rate is
> (i) Test for a unit root in log(invpc), including a linear time trend and two lags of $\Delta\log(\text{invpc}_t)$. Use a 5% significance level. (ii) Use the approach from part (i) to test for a unit root in log(price). (iii) Given the outcomes in parts (i) and (ii), do
> In this exercise, you are to compare OLS and LAD estimates of the effects of 401(k) plan eligibility on net financial assets. The model is $\text{nettfa} = \beta_0 + \beta_1 \text{inc} + \beta_2 \text{inc}^2 + \beta_3 \text{age} + \beta_4 \text{age}^2 + \beta_5 \text{male} + \beta_6 \text{e401k} + u$. (i) Use the data in 401KSUBS to estimate the
> Use the data in MURDER only for the year 1993 for this question, although you will need to first obtain the lagged murder rate, say $\text{mrdrte}_{-1}$. (i) Run the regression of mrdrte on exec, unem. What are the coefficient and t statistic on exec? Does this regr
> Use the data in JTRAIN98 to answer this question. The variable unem98 is a binary variable indicating whether a worker was unemployed in 1998. It can be used to measure the effectiveness of the job training program in reducing the probability of being un
> We computed the OLS and a set of WLS estimates in a cigarette demand equation. (i) Obtain the OLS estimates in equation (8.35). (ii) Obtain the $\hat{h}_i$ used in the WLS estimation of equation (8.36) and reproduce equation (8.36). From this equation, obtain th
> (i) Estimate the model $\text{children} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{age}^2 + \beta_3 \text{educ} + \beta_4 \text{electric} + \beta_5 \text{urban} + u$ and report the usual and heteroskedasticity-robust standard errors. Are the robust standard errors always bigger than the nonrobust ones? (ii) Add the three religio
> Suppose that the return from holding a particular firm’s stock goes from 15% in one year to 18% in the following year. The majority shareholder claims that “the stock return only increased by 3%,” while the chief executive officer claims that “the return
> Much is made of the fact that certain mutual funds outperform the market year after year (that is, the return from holding shares in the mutual fund is higher than the return from holding a portfolio such as the S&P 500). For concreteness, consider a 10-
> In Example, quantity of compact discs was related to price and income by quantity = 120 - 9.8 price + .03 income. What is the demand for CDs if price = 15 and income = 200? What does this suggest about using linear functions to describe demand curves?
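The arithmetic in the compact disc question can be checked directly. This is a small sketch that assumes the income term enters with a plus sign (quantity = 120 - 9.8 price + .03 income); the function name `cd_demand` is hypothetical.

```python
def cd_demand(price, income):
    """Linear demand from the exercise: 120 - 9.8*price + .03*income."""
    return 120 - 9.8 * price + 0.03 * income


# Evaluate at the values given in the question.
quantity = cd_demand(price=15, income=200)
print(quantity)
```

By hand, 120 - 147 + 6 = -21. A negative quantity demanded is impossible, which suggests that a linear function can describe demand only over a limited range of prices and incomes.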
> Use the data set GPA1 to answer this question. It was used in Computer Exercise C13 in Chapter 3 to estimate the effect of PC ownership on college GPA. (i) Run the regression colGPA on PC, hsGPA, and ACT and obtain a 95% confidence interval for PC. Is t
> Suppose that a high school student is preparing to take the SAT exam. Explain why his or her eventual SAT score is properly viewed as a random variable.
> The following table contains monthly housing expenditures for 10 families. (i) Find the average monthly housing expenditure. (ii) Find the median monthly housing expenditure. (iii) If monthly housing expenditures were measured in hundreds of dollars, rat
> We estimated an equation to test for a tradeoff between minutes per week spent sleeping (sleep) and minutes per week spent working (totwrk) for a random sample of individuals. We also included education and age in the equation. Because sleep and totwrk a
> Write a two-equation system in “supply and demand form,” that is, with the same variable $y_1$ (typically, “quantity”) appearing on the left-hand side: $y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$, $y_1 = \alpha_2 y_2 + \beta_2 z_2 + u_2$. (i) If $\alpha_1 = 0$ or $\alpha_2 = 0$, explain why a reduced form exists f
> Suppose you are hired by a university to study the factors that determine whether students admitted to the university actually come to the university. You are given a large random sample of students who were admitted the previous year. You have informati
> Let patents be the number of patents applied for by a firm during a given year. Assume that the conditional expectation of patents given sales and RD is $E(\text{patents} \mid \text{sales}, \text{RD}) = \exp[\beta_0 + \beta_1 \log(\text{sales}) + \beta_2 \text{RD} + \beta_3 \text{RD}^2]$, where sales is annual firm sales and RD
> (i) Suppose in the Tobit model that x1 = log(z1), and this is the only place z1 appears in x. Show that where 1 is the coefficient on log(z1). (ii) If x1 = z1, and x2 = z21, show that where 1 is the coefficient on z1 a
> (i) For a binary response y, let $\bar{y}$ be the proportion of ones in the sample (which is equal to the sample average of the $y_i$). Let $\hat{q}_0$ be the percent correctly predicted for the outcome y = 0 and let $\hat{q}_1$ be the percent correctly predicted for the outcome y
> Let $\{y_t\}$ be an I(1) sequence. Suppose that $\hat{g}_n$ is the one-step-ahead forecast of $\Delta y_{n+1}$ and let $\hat{f}_n = \hat{g}_n + y_n$ be the one-step-ahead forecast of $y_{n+1}$. Explain why the forecast errors for forecasting $\Delta y_{n+1}$ and $y_{n+1}$ are identical.
> Suppose that $y_t$ follows the model $y_t = \alpha + \delta_1 z_{t-1} + u_t$, $u_t = \rho u_{t-1} + e_t$, $E(e_t \mid I_{t-1}) = 0$, where $I_{t-1}$ contains y and z dated at t - 1 and earlier. (i) Show that $E(y_{t+1} \mid I_t) = (1 - \rho)\alpha + \rho y_t + \delta_1 z_t - \rho\delta_1 z_{t-1}$. (ii) Suppose that you use n observations to estimat
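The identity behind the forecast-error question can be written out in one line. In this sketch, $\hat{g}_n$ is assumed to be the symbol for the forecast of the change $\Delta y_{n+1}$ and $\hat{f}_n = \hat{g}_n + y_n$ the forecast of the level; the key fact is that $y_n$ is known at time $n$, so it cancels.

```latex
\begin{aligned}
y_{n+1} - \hat{f}_n
  &= y_{n+1} - (\hat{g}_n + y_n) \\
  &= (y_{n+1} - y_n) - \hat{g}_n \\
  &= \Delta y_{n+1} - \hat{g}_n .
\end{aligned}
```

The left-hand side is the error in forecasting the level and the last line is the error in forecasting the change, so the two coincide.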
> Use the data in GPA1 to answer this question. We can compare multiple regression estimates, where we control for student achievement and background variables, and compare our findings with the difference-in-means estimate in Computer Exercise C11 in Chap
> Let $gM_t$ be the annual growth in the money supply and let $\text{unem}_t$ be the unemployment rate. Assuming that $\text{unem}_t$ follows a stable AR(1) process, explain in detail how you would test whether gM Granger causes unem.
> Using the monthly data in VOLAT, the following model was estimated: where pcip is the percentage change in monthly industrial production, at an annualized rate, and pcsp is the percentage change in the Standard & Poor’s 500 Index, a
> Suppose the process $\{(x_t, y_t): t = 0, 1, 2, \ldots\}$ satisfies the equations $y_t = \beta x_t + u_t$ and $\Delta x_t = \gamma \Delta x_{t-1} + v_t$, where $E(u_t \mid I_{t-1}) = E(v_t \mid I_{t-1}) = 0$, $I_{t-1}$ contains information on x and y dated at time t - 1 and earlier, $\beta \neq 0$, and $|\gamma| < 1$ [so that $x_t$, and t
> Consider the error correction model in equation (18.37). Show that if you add another lag of the error correction term, $y_{t-2} - \beta x_{t-2}$, the equation suffers from perfect collinearity. Data from Equation 18.37:
> Suppose that $\{y_t\}$ and $\{z_t\}$ are I(1) series, but $y_t - \beta z_t$ is I(0) for some $\beta \neq 0$. Show that for any $\delta \neq \beta$, $y_t - \delta z_t$ must be I(1).
> An interesting economic model that leads to an econometric model with a lagged dependent variable relates $y_t$ to the expected value of $x_t$, say, $x_t^*$, where the expectation is based on all observed information at time t - 1: $y_t = \alpha_0 + \alpha_1 x_t^* + u_t$. A natural
> Consider the geometric distributed lag model in equation, written in estimating equation form as in equation: $y_t = \alpha_0 + \gamma z_t + \rho y_{t-1} + v_t$, where $v_t = u_t - \rho u_{t-1}$. (i) Suppose that you are only willing to assume the sequential exogeneity assumption in (18.6). W
> Consider equation (18.15) with k = 2. Using the IV approach to estimating the $\gamma_h$ and $\rho$, what would you use as instruments for $y_{t-1}$? Data from Equation 18.15:
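One way to see the cointegration result is to rewrite the new linear combination in terms of the cointegrating one. In this sketch, $\beta$ is assumed to denote the coefficient for which $y_t - \beta z_t$ is I(0), and $\delta$ is any other value.

```latex
y_t - \delta z_t
  = \underbrace{(y_t - \beta z_t)}_{I(0)}
  + \underbrace{(\beta - \delta)\, z_t}_{I(1)\ \text{since}\ \beta - \delta \neq 0} .
```

The sum of an I(0) process and an I(1) process is I(1), so $y_t - \delta z_t$ is I(1) whenever $\delta \neq \beta$.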
> Why can we not use first differences when we have independent cross sections in two years (as opposed to panel data)?
> (i) In the enterprise zone event study in Computer Exercise C5 in Chapter 10, a regression of the OLS residuals on the lagged residuals produces $\hat{\rho} = .841$ and $\text{se}(\hat{\rho}) = .053$. What implications does this have for OLS? (ii) If you want to use OLS but also w
> Use the data in HTV to answer this question. (i) Estimate the regression model $\text{educ} = \beta_0 + \beta_1 \text{motheduc} + \beta_2 \text{fatheduc} + \beta_3 \text{abil} + \beta_4 \text{abil}^2 + u$ by OLS and report the results in the usual form. Test the null hypothesis that educ is linearly related to abil agai
> In Example 10.6, we used the data in FAIR to estimate a variant on Fair’s model for predicting presidential election outcomes in the United States. (i) What argument can be made for the error term in this equation being serially uncorrelated? (ii) When
> A partial adjustment model is $y_t^* = \gamma_0 + \gamma_1 x_t + e_t$, $y_t - y_{t-1} = \lambda(y_t^* - y_{t-1}) + a_t$, where $y_t^*$ is the desired or optimal level of y and $y_t$ is the actual (observed) level. For example, $y_t^*$ is the desired growth in firm inventories, and $x_t$ is growth in firm
> Let $\{x_t: t = 1, 2, \ldots\}$ be a covariance stationary process and define $\gamma_h = \text{Cov}(x_t, x_{t+h})$ for $h \geq 0$. Show that $\text{Corr}(x_t, x_{t+h}) = \gamma_h/\gamma_0$.
> In Example 10.4, we wrote the model that explicitly contains the long-run propensity, $\theta_0$, as $\text{gfr}_t = \alpha_0 + \theta_0 \text{pe}_t + \delta_1(\text{pe}_{t-1} - \text{pe}_t) + \delta_2(\text{pe}_{t-2} - \text{pe}_t) + u_t$, where we omit the other explanatory variables for simplicity. As always with multiple regression ana
> Consider the simple regression model with classical measurement error, $y = \beta_0 + \beta_1 x^* + u$, where we have m measures on $x^*$. Write these as $z_h = x^* + e_h$, h = 1, . . . , m. Assume that $x^*$ is uncorrelated with $u$, $e_1$, . . . ,
> (i) In column (3) of Table 9.2, the coefficient on educ is .018 and it is statistically insignificant, and that on IQ is actually negative, -.0009, and also statistically insignificant. Explain what is happening. (ii) What regression might you run that s
> Consider the potential outcomes framework, where w is a binary treatment indicator and the potential outcomes are y(0) and y(1). Assume that w is randomly assigned, so that w is independent of [y(0),y(1)]. Let $\mu_0 = E[y(0)]$, $\mu_1 = E[y(1)]$, $\sigma_0^2 = \text{Var}[y(0)]$,
> Consider a model at the employee level, $y_{i,e} = \beta_0 + \beta_1 x_{i,e,1} + \beta_2 x_{i,e,2} + \ldots + \beta_k x_{i,e,k} + f_i + v_{i,e}$, where the unobserved variable $f_i$ is a “firm effect” to each employee at a given firm i. The error term $v_{i,e}$ is specific to employee e at firm i. The c
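The autocorrelation result follows from the definition of correlation once covariance stationarity is used. In this sketch, $\gamma_h$ is assumed to denote $\text{Cov}(x_t, x_{t+h})$, so that $\gamma_0 = \text{Var}(x_t)$, and stationarity gives $\text{Var}(x_{t+h}) = \text{Var}(x_t) = \gamma_0$.

```latex
\operatorname{Corr}(x_t, x_{t+h})
  = \frac{\operatorname{Cov}(x_t, x_{t+h})}
         {\sqrt{\operatorname{Var}(x_t)}\,\sqrt{\operatorname{Var}(x_{t+h})}}
  = \frac{\gamma_h}{\sqrt{\gamma_0}\,\sqrt{\gamma_0}}
  = \frac{\gamma_h}{\gamma_0} .
```

The step that needs stationarity is the replacement of both variances by the same constant $\gamma_0$; without it the denominator would depend on $t$.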
> (i) In the context of potential outcomes with a sample of size n, let [yi(0), yi(1)] denote the pair of potential outcomes for unit i. Define the averages and define the sample average treatment effect (SATE) as SATE = y(1) – y(0). Can
> Using the data in SLEEP75 (see also Problem 3 in Chapter 3), we obtain the estimated equation The variable sleep is total minutes per week spent sleeping at night, totwrk is total weekly minutes spent working, educ and age are measured in years, and male
> Use the data in GPA1 to answer these questions. It is a sample of Michigan State University undergraduates from the mid-1990s, and includes current college GPA, colGPA, and a binary variable indicating whether the student owned a personal computer (PC).
> If we start with (6.38) under the CLM assumptions, assume large n, and ignore the estimation error in the ^j, a 95% prediction interval for y0 is [exp(-1.96^) exp(logy0) , exp(1.96^) exp(logy0)]. The point prediction for y0 is y^0 = exp(^2/2)exp(logy
> Consider the equation $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$, $E(u \mid x) = 0$, where the explanatory variable x has a standard normal distribution in the population. In particular, $E(x) = 0$, $E(x^2) = \text{Var}(x) = 1$, and $E(x^3) = 0$. This last condition holds because the standard no
> In the simple regression model under MLR.1 through MLR.4, we argued that the slope estimator, $\hat{\beta}_1$, is consistent for $\beta_1$. Using $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}_1$, show that plim $\hat{\beta}_0 = \beta_0$. [You need to use the consistency of $\hat{\beta}_1$ and the law of large numbers, along with the
> In Problem 3 in Chapter 3, we estimated the equation where we now report standard errors along with the estimates. (i) Is either educ or age individually significant at the 5% level against a two-sided alternative? Show your work. (ii) Dropping educ and
> We used data on nonunionized manufacturing firms to estimate the relationship between the scrap rate and other firm characteristics. We now look at this example more closely and use all available firms. (i) The population model estimated in Example 4.7 c
> The data in MEAPSINGLE were used to estimate the following equations relating school-level performance on a fourth-grade math test to socioeconomic characteristics of students attending school. The variable free, measured at the school level, is the perc
> The following table was created using the data in CEOSAL2, where standard errors are in parentheses below the coefficients: The variable mktval is market value of the firm, profmarg is profit as a percentage of sales, ceoten is years as CEO with the curr