
Question: You need to use two data sets


You need to use two data sets for this exercise, JTRAIN2 and JTRAIN3. The former is the outcome of a job training experiment. The file JTRAIN3 contains observational data, where individuals themselves largely determine whether they participate in job training. The data sets cover the same time period.
(i) In the data set JTRAIN2, what fraction of the men received job training? What is the fraction in JTRAIN3? Why do you think there is such a big difference?
(ii) Using JTRAIN2, run a simple regression of re78 on train. What is the estimated effect of participating in job training on real earnings?
(iii) Now add as controls to the regression in part (ii) the variables re74, re75, educ, age, black, and hisp. Does the estimated effect of job training on re78 change much? How come? (Hint: Remember that these are experimental data.)
(iv) Do the regressions in parts (ii) and (iii) using the data in JTRAIN3, reporting only the estimated coefficients on train, along with their t statistics. What is the effect now of controlling for the extra factors, and why?
(v) Define avgre = (re74 + re75)/2. Find the sample averages, standard deviations, and minimum and maximum values in the two data sets. Are these data sets representative of the same populations in 1978?
(vi) Almost 96% of men in the data set JTRAIN2 have avgre less than $10,000. Using only these men, run the regression
re78 on train, re74, re75, educ, age, black, hisp
and report the training estimate and its t statistic. Run the same regression for JTRAIN3, using only men with avgre ≤ 10. For the subsample of low-income men, how do the estimated training effects compare across the experimental and nonexperimental data sets?
(vii) Now use each data set to run the simple regression re78 on train, but only for men who were unemployed in 1974 and 1975. How do the training estimates compare now?
(viii) Using your findings from the previous regressions, discuss the potential importance of having comparable populations underlying comparisons of experimental and nonexperimental estimates.
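One way to set up parts (i) through (vi) is sketched below. This is a minimal illustration rather than a worked answer. It assumes each data set is loaded into a pandas DataFrame with the column names used in the exercise (`train`, `re74`, `re75`, `re78`, `educ`, `age`, `black`, `hisp`). The `ols` helper and the synthetic `demo` frame are hypothetical stand-ins, so the numbers it produces say nothing about JTRAIN2 or JTRAIN3.

```python
import numpy as np
import pandas as pd


def ols(df, y, xs):
    """OLS of y on xs plus an intercept; returns coefficients and usual t statistics."""
    X = np.column_stack([np.ones(len(df))] + [df[x].to_numpy(float) for x in xs])
    yv = df[y].to_numpy(float)
    beta, *_ = np.linalg.lstsq(X, yv, rcond=None)
    resid = yv - X @ beta
    dof = X.shape[0] - X.shape[1]          # n minus number of estimated parameters
    sigma2 = resid @ resid / dof           # usual (homoskedastic) error variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return pd.DataFrame({"coef": beta, "t": beta / se}, index=["const"] + list(xs))


# Synthetic stand-in data (an assumption for illustration only, not the real files).
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "train": rng.integers(0, 2, n),
    "re74": rng.gamma(2.0, 2.0, n),
    "re75": rng.gamma(2.0, 2.0, n),
    "educ": rng.integers(8, 17, n),
    "age": rng.integers(18, 50, n),
    "black": rng.integers(0, 2, n),
    "hisp": rng.integers(0, 2, n),
})
demo["re78"] = 5 + 1.8 * demo["train"] + 0.3 * demo["re75"] + rng.normal(0, 3, n)

controls = ["re74", "re75", "educ", "age", "black", "hisp"]

share_trained = demo["train"].mean()               # part (i): fraction trained
simple = ols(demo, "re78", ["train"])              # part (ii): simple regression
full = ols(demo, "re78", ["train"] + controls)     # part (iii): add the controls

demo["avgre"] = (demo["re74"] + demo["re75"]) / 2  # part (v): average prior earnings
summary = demo["avgre"].agg(["mean", "std", "min", "max"])

low = demo[demo["avgre"] <= 10]                    # part (vi): low-income subsample
low_fit = ols(low, "re78", ["train"] + controls)

print(share_trained)
print(simple.loc["train"])
print(full.loc["train"])
print(summary)
print(low_fit.loc["train"])
```

Running the same calls on each of the two real data sets, then comparing the `train` rows, gives the side-by-side estimates that parts (iv), (vi), and (viii) ask about. In the simple regression the coefficient on `train` is just the difference in mean `re78` between trained and untrained men, which is a useful check on the output.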


> Use the data in HPRICE1 to obtain the heteroskedasticity-robust standard errors for equation (8.17). Discuss any important differences with the usual standard errors. (ii) Repeat part (i) for equation (8.18). (iii) What does this example suggest about he

> Consider the following model to explain sleeping behavior: (i) Write down a model that allows the variance of u to differ between men and women. The variance should not depend on other factors. (ii) Use the data in SLEEP75 to estimate the parameters of t

> If we start with (6.38) under the CLM assumptions, assume large n, and ignore the estimation error in the

> Suppose we want to estimate the effects of alcohol consumption (alcohol) on college grade point average (colGPA). In addition to collecting information on grade point averages and alcohol usage, we also obtain attendance information (say, percentage of

> The following three equations were estimated using the 1,534 observations in 401K. Which of these three models do you prefer? Why? prate = 80.29 + 5.44 mrate + .269 age – .00013 totemp, with standard errors (.78), (.52), (.045), (.00004); R² = .100, adjusted R² = .098. 97.32 + 5.02 mrat

> When atndrte² and ACT · atndrte are added to the equation estimated in (6.19), the R-squared becomes .232. Are these additional terms jointly significant at the 10% level? Would you include them in the model? stndfnl = 2.05 - .0067 atndrte 1.63 pri

> In Example 4.2, where the percentage of students receiving a passing score on a tenth-grade math exam (math10) is the dependent variable, does it make sense to include sci11—the percentage of eleventh graders passing a science exam—as an ad

> The following model allows the return to education to depend upon the total amount of both parents’ education, called pareduc: Show that, in decimal form, the return to another year of education in this model is What sign do you expect

> Use the data in 401KSUBS for this question, restricting the sample to fsize = 1. (i) To the model estimated in Table 8.1, add the interaction term, e401k · inc. Estimate the equation by OLS and obtain the usual and robust standard errors. W

> Using the data in RDCHEM, the following equation was obtained by OLS: (i) At what point does the marginal effect of sales on rdintens become negative? (ii) Would you keep the quadratic term in the model? Explain. (iii) Define salesbil as sales measured i

> Let 

> The following two equations were estimated using the data in MEAPSINGLE. The key explanatory variable is lexppp, the log of expenditures per student at the school level. (i) If you are a policy maker trying to estimate the causal effect of per-student s

> The following equation was estimated using the data in CEOSAL1: log(salary) = 4.322 + .276 log(sales) + .0215 roe – .00008 roe². This equation allows roe to have a diminishing effect on log(salary). Is this generality necessary? Explain why or why not.

> Let d be a dummy (binary) variable and let z be a quantitative variable. Consider the model this is a general version of a model with an interaction between a dummy variable and a quantitative variable. (i) Since it changes nothing important, set the err

> Suppose you collect data from a survey on wages, education, experience, and gender. In addition, you ask for information about marijuana usage. The original question is: “On how many separate occasions last month did you smoke marijuana?” (i) Write an e

> In equation (7.29), suppose that we define outlf to be one if the woman is out of the labor force, and zero otherwise. (i) If we regress outlf on all of the independent variables in equation (7.29), what will happen to the intercept and slope estimates?

> To test the effectiveness of a job training program on the subsequent wages of workers, we specify the model where train is a binary variable equal to unity if a worker participated in the program. Think of the error term u as containing unobserved worke

> Let noPC be a dummy variable equal to one if the student does not own a PC, and zero otherwise. (i) If noPC is used in place of PC in equation (7.6), what happens to the intercept in the estimated equation? What will be the coefficient on noPC? (Hint: Wr

> An equation explaining chief executive officer salary is The data used are in CEOSAL1, where finance, consprod, and utility are binary variables indicating the financial, consumer products, and utilities industries. The omitted industry is transportatio

> Use the data set 401KSUBS for this exercise. (i) Using OLS, estimate a linear probability model for e401k, using as explanatory variables inc, inc², age, age², and male. Obtain both the usual OLS standard errors and the heteroskedasticity-robust versions

> Using the data in GPA2, the following equation was estimated: The variable sat is the combined SAT score; hsize is size of the student’s high school graduating class, in hundreds; female is a gender dummy variable; and black is a race

> The following equations were estimated using the data in BWGHT: And The variables are defined as in Example 4.9, but we have added a dummy variable for whether the child is male and a dummy variable indicating whether the child is classified as white. (

> The following equations were estimated using the data in ECONMATH, with standard errors reported under coefficients. The average class score, measured as a percentage, is about 72.2; exactly 50% of the students are male; and the average of colgpa (grade

> For a child i living in a particular school district, let voucheri be a dummy variable equal to one if a child is selected to participate in a school voucher program, and let scorei be that child’s score on a subsequent standardized exam. Suppose that th

> The estimated equation The variable sleep is total minutes per week spent sleeping at night, totwrk is total weekly minutes spent working, educ and age are measured in years, and male is a gender dummy. (i) All other factors being equal, is there evidenc

> The following equations were estimated using the data in ECONMATH. The first equation is for men and the second is for women. The third and fourth equations combine men and women. (i) Compute the usual Chow statistic for testing the null hypothesis that

> Consider a model at the employee level, where the unobserved variable fi is a “firm effect” to each employee at a given firm i. The error term vi,e is specific to employee e at firm i. The composite error is ui,e = fi

> There are different ways to combine features of the Breusch-Pagan and White tests for heteroskedasticity. One possibility not covered in the text is to run the regression where the µ I are the OLS residuals and the on X;1, X2, ... Xik

> The variable smokes is a binary variable equal to one if a person smokes, and zero otherwise. Using the data in SMOKE, we estimate a linear probability model for smokes: The variable white equals one if the respondent is white, and zero otherwise. Both t

> Using the data in GPA3, the following equation was estimated for the fall and second semester students: Here, trmgpa is term GPA, crsgpa is a weighted average of overall GPA in courses taken, cumgpa is GPA prior to the current semester, tothrs is total

> (i) Obtain the OLS estimates in equation (8.35). (ii) Obtain the /used in the WLS estimation of equation (8.36) and reproduce equation (8.36). From this equation, obtain the unweighted residuals and fitted values; call these

> True or False: WLS is preferred to OLS when an important variable has been omitted from the model.

> Consider a linear model to explain monthly beer consumption: Write the transformed equation that has a homoskedastic error term. beer = β0 + β1 inc + β2 price + β3 educ + β4 female + u, E(u | inc, price, educ, female) = 0, Var(u | inc, price, educ, female) = σ

> Which of the following are consequences of heteroskedasticity? (i) The OLS estimators, β j, are inconsistent. (ii) The usual F statistic no longer has an F distribution. (iii) The OLS estimators are no longer BLUE.

> This exercise shows that in a simple regression model, adding a dummy variable for missing data on the explanatory variable produces a consistent estimator of the slope coefficient if the “missingness” is unrelated to

> Suppose that log(y) follows a linear model with a linear form of heteroskedasticity. We write this as so that, conditional on x, u has a normal distribution with mean (and median) zero but with variance h(x) that depends on x. Because Med(u|x) = 0, equat

> The point of this exercise is to show that tests for functional form cannot be relied on as a general test for omitted variables. Suppose that, conditional on the explanatory variables

> Consider the simple regression model with classical measurement error, y =

> In the model (9.17), show that OLS consistently estimates a and b if

> In Example 4.4, we estimated a model relating number of campus crimes to student enrollment for a sample of colleges. The sample we used was not a random sample of colleges in the United States, because many schools in 1992 did not report campus crimes.

> The following equation explains weekly hours of television viewing by a child in terms of the child’s age, mother’s education, father’s education, and number of siblings: We are worried that tvhoursp

> Use the data set GPA1 for this exercise. (i) Use OLS to estimate a model relating colGPA to hsGPA, ACT, skipped, and PC. Obtain the OLS residuals. (ii) Compute the special case of the White test for heteroskedasticity. In the regression of

> Let math10 denote the percentage of students at a Michigan high school receiving a passing score on a standardized math test (see also Example 4.2). We are interested in estimating the effect of per-student spending on math performance. A simple model is

> Let us modify Computer Exercise C4 in Chapter 8 by using voting outcomes in 1990 for incumbents who were elected in 1988. Candidate A was elected in 1988 and was seeking reelection in 1990; voteA90 is Candidate A’s share of the two-part

> The R-squared from estimating the model using the data in CEOSAL2 was R² = .353 (n = 177). When ceoten² and comten² are added, R² = .375. Is there evidence of functional form misspecification in this model? log(salary) = β0 + β1 log(sales) + β2 log(m

> Use the data in ECONMATH to answer this question. The population model is (i) For how many students is the ACT score missing? What is the fraction of the sample? Define a new variable, actmiss, which equals one if act is missing, and zero otherwise. (ii)

> Use the data in CEOSAL2 to answer this question. (i) Estimate the model by OLS using all of the observations, where lsalary, lsales, and lmktvale are all natural logarithms. Report the results in the usual form with the usual OLS standard errors. (You ma

> (i) Using all of the data, run the regression lavgsal on bs, lenrol, lstaff, and lunch. Report the coefficient on bs along with its usual and heteroskedasticity-robust standard errors. What do you conclude about the economic and statistical significance

> Use the data in MURDER only for the year 1993 for this question, although you will need to first obtain the lagged murder rate, say mrdrte₋₁. (i) Run the regression of mrdrte on exec, unem. What are the coefficient and t statistic on exec? Does this regr

> You are to compare OLS and LAD estimates of the effects of 401(k) plan eligibility on net financial assets. The model is (i) Use the data in 401KSUBS to estimate the equation by OLS and report the results in the usual form. Interpret the coefficient on e

> Use the data in TWOYEAR for this exercise. (i) The variable stotal is a standardized test variable, which can act as a proxy variable for unobserved ability. Find the sample mean and standard deviation of stotal. (ii) Run simple regressions of jc and uni

> Use the data in LOANAPP for this exercise. (i) Estimate the equation in part (iii) of Computer Exercise C8 in Chapter 7, computing the heteroskedasticity-robust standard errors. Compare the 95% confidence interval on β_white with the nonrobust confidence

> Use the data in LOANAPP for this exercise. (i) How many observations have obrat > 40, that is, other debt obligations more than 40% of total income? (ii) Reestimate the model in part (iii) of Computer Exercise C8, excluding observations with obrat > 40.

> In example given below by dropping schools where teacher benefits are less than 1% of salary. (i) How many observations are lost? (ii) Does dropping these observations have any important effects on the estimated tradeoff? Example: Let totcomp denote aver

> Use the data in RDCHEM to further examine the effects of outliers on OLS estimates and to see how LAD is less sensitive to outliers. The model is where you should first change sales to be in billions of dollars to make the estimates easier to interpret.

> Use the data for the year 1990 in INFMRT for this exercise. (i) Reestimate equation (9.43), but now include a dummy variable for the observation on the District of Columbia (called DC). Interpret the coefficient on DC and comment on its size and signific

> Use the data from JTRAIN for this exercise. (i) Consider the simple regression model where scrap is the firm scrap rate and grant is a dummy variable indicating whether a firm received a job training grant. Can you think of some reasons why the unobserve

> Use the data set WAGE2 for this exercise. (i) Use the variable KWW (the “knowledge of the world of work” test score) as a proxy for ability. What is the estimated return to education in this case? (ii) Now, use IQ and KWW together as proxy variables. Wha

> (i) Apply RESET from equation (9.3) to the model estimated.. Is there evidence of functional form misspecification in the equation? (ii) Compute a heteroskedasticity-robust form of RESET. Does your conclusion from part (i) change? y β + βχ + + Brik +

> The data set NBASAL contains salary information and career statistics for 269 players in the National Basketball Association (NBA). Estimate a model relating points-per-game (points) to years in the league (exper), age, and years played in college (coll)

> Use the data in HPRICE1 for this exercise. (i) Estimate the model and report the results in the usual form, including the standard error of the regression. Obtain predicted price, when we plug in lotsize = 10,000, sqrft = 2,300, and bdrms = 4; round this

> Use the data in ATTEND for this exercise. In the model of Example 6.3, argue that Use equation (6.19) to estimate the partial effect when priGPA = 2.59 and atndrte = 82. Interpret your estimate. Show that the equation can be written as where

> (i) Using the data in CRIME1, estimate this model by OLS and verify that all fitted values are strictly between zero and one. What are the smallest and largest fitted values? (ii) Estimate the equation by weighted least squares, as discussed in Section 8

> Use the data in VOTE1 for this exercise. Consider a model with an interaction between expenditures: What is the partial effect of expendB on voteA, holding prtystrA and expendA fixed? What is the partial effect of expendA on voteA? Is the expected sign f

> Use the housing price data in HPRICE1 for this exercise. Estimate the model and report the results in the usual OLS format. (ii) Find the predicted value of log(price), when lotsize = 20,000, sqrft = 2,500, and bdrms = 4. Find the predicted value of pric

> Use the data in GPA2 for this exercise. Estimate the model where hsize is the size of the graduating class (in hundreds), and write the results in the usual form. Is the quadratic term statistically significant? (ii) Using the estimated equation from par

> Consider a model where the return to education depends upon the amount of work experience (and vice versa): (i) Show that the return to another year of education (in decimal form), holding exper fixed, is β1 + β3exper. (ii) State

> Use the data in WAGE1 for this exercise. Use OLS to estimate the equation and report the results using the usual format. Is exper² statistically significant at the 1% level? Using the approximation find the approximate return to the fifth year of exper

> Use the data in BENEFITS to answer this question. It is a school-level data set at the K–5 level on average teacher salary and benefits. See Example 4.10 for background. (i) Regress lavgsal on bs and report the results in the usual form

> Use the data in MEAP00 to answer this question. (i) Estimate the model / by OLS, and report the results in the usual form. Is each explanatory variable statistically significant at the 5% level? (ii) Obtain the fitted values from the regression in part

> Use the subset of 401KSUBS with fsize = 1; this restricts the analysis to single-person households; see also Computer Exercise C8 in Chapter 4. (i) What is the youngest age of people in this sample? How many people are at that age? (ii) In the model wha

> (i) Run the regression ecolbs on ecoprc, regprc and report the results in the usual form, including the R-squared and adjusted R-squared. Interpret the coefficients on the price variables and comment on their signs and magnitudes. (ii) Are the price va

> Use the data in BWGHT2 for this exercise. (i) Estimate the equation by OLS, and report the results in the usual way. Is the quadratic term significant? (ii) Show that, based on the equation from part (i), the number of prenatal visits that maximizes log(

> Use the data in PNTSPRD for this exercise. (i) The variable sprdcvr is a binary variable equal to one if the Las Vegas point spread for a college basketball game was covered. The expected value of sprdcvr, say m, is the probability that the spread is cov

> Use the data in KIELMC, only for the year 1981, to answer the following questions. The data are for houses that sold during 1981 in North Andover, Massachusetts; 1981 was the year construction began on a local garbage incinerator. (i) To study the effect

> There has been much interest in whether the presence of 401(k) pension plans, available to many U.S. workers, increases net savings. The data set 401KSUBS contains information on net financial assets (nettfa), family income (inc), a binary variable for

> Use the data in LOANAPP for this exercise. The binary variable to be explained is approve, which is equal to one if a mortgage loan to an individual was approved. The key explanatory variable is white, a dummy variable equal to one if the applicant was w

> Use the data in WAGE1 for this exercise. (i) Use equation (7.18) to estimate the gender differential when educ = 12.5. Compare this with the estimated differential when educ = 0. (ii) Run the regression used to obtain (7.18), but with female (educ - 12.5

> Use the data in SLEEP75 for this exercise. The equation of interest is (i) Estimate this equation separately for men and women and report the results in the usual form. Are there notable differences in the two estimated equations? (ii) Compute the Chow t

> Define a dummy variable, rosneg, which is equal to one if ros Discuss the interpretation and statistical significance of log(salary) = β0 + β1 log(sales) + β2 roe + β3 rosneg + u.

> Use the data in GPA2 for this exercise. (i) Consider the equation where colgpa is cumulative college grade point average; hsize is size of high school graduating class, in hundreds; hsperc is academic percentile in graduating class; sat is combined SAT

> A model that allows major league baseball player salary to differ by position is where outfield is the base group. (i) State the null hypothesis that, controlling for other factors, catchers and outfielders earn, on average, the same amount. Test this hy

> Use the data in WAGE2 for this exercise. (i) Estimate the model and report the results in the usual form. Holding other factors fixed, what is the approximate difference in monthly salary between blacks and nonblacks? Is this difference statistically s

> Use the data in CHARITY to answer this question. The variable respond is a dummy variable equal to one if a person responded with a contribution on the most recent mailing sent by a charitable organization. The variable resplast is a dummy variable equal

> Use VOTE1 for this exercise. (i) Estimate a model with voteA as the dependent variable and prtystrA, democA, log(expendA), and log(expendB) as independent variables. Obtain the OLS residuals,

> Use the data in APPLE to answer this question. (i) Define a binary variable as ecobuy = 1 if ecolbs > 0 and ecobuy = 0 if ecolbs = 0. In other words, ecobuy indicates whether, at the prices given, a family would buy any ecologically friendly apples. W

> Use the data set in BEAUTY, which contains a subset of the variables (but more usable observations than in the regressions) reported by Hamermesh and Biddle (1994). (i) Find the separate fractions of men and women that are classified as having above aver

> Use the data in 401KSUBS for this exercise. (i) Compute the average, standard deviation, minimum, and maximum values of nettfa in the sample. (ii) Test the hypothesis that average nettfa does not differ by 401(k) eligibility status; use a two-sided altern

> Use the data in NBASAL for this exercise. (i) Estimate a linear regression model relating points per game to experience in the league and position (guard, forward, or center). Include experience in quadratic form and use centers as the base group. Report

> Use the data in GPA1 for this exercise. Add the variables mothcoll and fathcoll to the equation estimated in (7.6) and report the results in the usual form. What happens to the estimated effect of PC ownership? Is PC still statistically significant? T

> Use the data in CATHOLIC to answer this question. (i) In the entire sample, what percentage of the students attend a Catholic high school? What is the average of math12 in the entire sample? (ii) Run a simple regression of math12 on cathhs and report the

> Use the data in FERTIL2 to answer this question. (i) Find the smallest and largest values of children in the sample. What is the average of children? Does any woman have exactly the average number of children? (ii) What percentage of women have electrici

> Use the data in BEAUTY for this question. (i) Using the data pooled for men and women, estimate the equation and report the results using heteroskedasticity-robust standard errors below coefficients. Are any of the coefficients surprising in either their

> Use the data in FERTIL2 to answer this question. (i) Estimate the model and report the usual and heteroskedasticity-robust standard errors. Are the robust standard errors always bigger than the nonrobust ones? (ii) Add the three religious dummy variable

> Use the data in MEAP00 to answer this question. (i) Estimate the model by OLS and obtain the usual standard errors and the fully robust standard errors. How do they generally compare? (ii) Apply the special case of the White test for heteroskedasticity.

> Apply the full White test for heteroskedasticity to equation (8.18). Using the chi-square form of the statistic, obtain the p-value. What do you conclude? log(price) = –1.30 + .168 log(lotsize) + .700 log(sqrft) + 0.37 bdrms, with standard errors (.65), (.038), (.093), (.028)

> Jordan Mendelson is interested in purchasing a franchise in a meal-preparation business. Customers will come to the business to assemble gourmet dinners and then take the prepared meals to their homes for cooking. The franchisor requires each store to us

> Two African American plaintiffs sued the producers of the reality television series The Bachelor and The Bachelorette for racial discrimination. The plaintiffs claimed that the shows had never featured a person of color in the lead role. Plaintiffs also

> Go to Appendix G at the end of this text and examine the excerpt of Case No. 4, Dees v. United Rentals North America, Inc. Review and then brief the case, making sure that your brief answers the following questions. 1. issue: What conduct on the part of

> Why has the federal government limited the application of the statutes discussed in this chapter to firms with a specified number of employees, such as fifteen or twenty? Should these laws apply to all employers, regardless of size? Why or why not?
