How does the price of a house depend on its size? Data from Saratoga, New York, on 1063 randomly selected houses that had been sold include data on price ($1000s) and size (1000 ft²), producing the following graphs and computer output:
Dependent variable is Price
1. Explain in context what the regression says.
2. The intercept is negative. What does this mean? (Hint: Notice the P-value.)
3. The output reports s=53.79. Explain what that means in this context.
4. What is the value of the standard error of the slope of the regression line?
5. Explain what that means in this context.
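The regression table for this exercise is not reproduced here, so the quantities in parts 3 to 5 can only be illustrated, not recomputed. The following minimal sketch shows how the residual standard deviation s and the standard error of the slope are obtained in a simple regression. The five (size, price) pairs and the function name `simple_regression` are hypothetical, chosen only for illustration; they are not the Saratoga data.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y on x; returns slope, intercept, s, and SE(slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residuals: observed price minus predicted price
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # s: typical size of a residual, using n - 2 degrees of freedom
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # SE(slope): how much the slope would vary from sample to sample
    se_slope = s / math.sqrt(sxx)
    return slope, intercept, s, se_slope

# Hypothetical data: size in 1000 ft^2, price in $1000s (NOT the Saratoga sample)
size = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [100, 160, 190, 260, 300]

slope, intercept, s, se_slope = simple_regression(size, price)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, s={s:.2f}, SE(slope)={se_slope:.2f}")
```

In this sketch s is in the units of the response ($1000s), so it describes how far actual prices typically fall from the prices the line predicts. SE(slope) is in $1000s per 1000 ft² and describes how much the fitted slope would be expected to vary across other random samples of the same size.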
> Here is a scatterplot matrix of the variables as re-expressed in Exercise 13 using a version that places Normal probability plots on the diagonal. 1. Comment on their suitability for a regression model to predict Life expectancy. The points are colored a
> Union rated frozen pizzas. Their report includes the number of Calories, Fat content, and Type (cheese or pepperoni, represented here as an indicator variable that is 1 for cheese and 0 for pepperoni). Here’s a regression model to predict the Score awarded
> The United States Central Intelligence Agency maintains a public site called the World Factbook at www.cia.gov/library/publications/the-worldfactbook/. There you find a wealth of variables about all the countries of the world. Let’s exa
> In Chapter 9, Exercises 14, 18, 29, and 30, we considered data on hill races in Scotland. These are overland races that climb and descend hills, sometimes several hills in the course of one race. Here is a regression analysis to predict the Women Record
> In Exercise 25 of Chapter 9, we considered a multiple regression model for predicting calories in breakfast cereals. The regression looked like this: Dependent variable is: Calories R-squared = 38.4% R-squared (adjusted) = 35.9% s = 15.60 with 77 − 4 = 73 degrees
> In previous chapters we have looked at data from the 50 states. Here’s an analysis of data from a few years earlier. The Murder rate is per 100,000, HS Graduation rate is in %, Income is per capita income in dollars, Illiteracy rate is per 1000, and Life E
> The following software output provides information about the Size (in square feet) of 18 homes in Ithaca, New York, and the city assessed Value of those homes. Dependent variable is Value 1. Explain why inference for linear regression is appropriate with
> The Pew Research survey cited in Exercise 27 also asked what employment sector the respondents worked in and whether their job gave them a sense of identity or whether it was just what they do for a living. This table summarizes their responses: 1. Is th
> For each of the following, list the sample space and tell whether you think the events are equally likely: 1. Toss 2 coins; record the order of heads and tails. 2. A family has 3 children; record the number of boys. 3. Flip a coin until you get a head or
> The following software output is based on the mortality rate (deaths per 100,000 people) and the education level (average number of years in school) for 58 U.S. cities. Dependent variable is Mortality 1. Comment on the assumptions for inference. 2. Is th
> A sample of 84 model-2011 cars from an online information service was examined to see how fuel efficiency (as highway mpg) relates to the cost (Manufacturer Suggested Retail Price in dollars) of cars. Here are displays and computer output: Dependent var
> Remember the Little League instructional video discussed in Chapter 21, Exercise 35? Ads claimed it would improve the performances of Little League pitchers. To test this claim, 20 Little Leaguers threw 50 pitches each, and we recorded the number of stri
> The professor teaching the introductory statistics class discussed in Exercise 57 wonders whether performance on homework can accurately predict midterm scores. 1. To investigate it, she fits a regression of the sum of the two midterm scores on homework
> The dataset below shows midterm and homework scores from an introductory statistics course. 1. Fit a model predicting the second midterm score from the first. 2. Comment on the model you found, including a discussion of the assumptions and conditions for
> Researchers at the University of Denver Infant Study Center wondered whether temperature might influence the age at which babies learn to crawl. Perhaps the extra clothing that babies wear in cold weather would restrict movement and delay the age at whic
> Tablet computers 2014. Cnet.com tests tablet computers and continuously updates its list. As of January 2014, the list included the battery life (in hours) and luminous intensity (i.e., screen brightness, in cd/m²). We want to know if Battery life is rela
> Consider again the relationship between the sales and profits of Fortune 500 companies that you analyzed in Exercise 52. 1. Find a 95% confidence interval for the slope of the regression line. Interpret your interval in context. 2. Last year, the drug ma
> Consider again the relationship between the population and ozone level of U.S. cities that you analyzed in Exercise 51. 1. Give a 90% confidence interval for the slope of the relationship between ozone level and population. 2. For the cities studied, the
> A business analyst was interested in the relationship between a company’s sales and its profits. She collected data (in millions of dollars) from a random sample of Fortune 500 companies and created the regression analysis and summary statistics shown. The
> Pew Research surveyed 5006 U.S. adults to ask their opinions about the state of jobs in the United States in 2016. (www.pewsocialtrends.org/2016/10/06/the-state-of-american-jobs/) Respondents were asked how satisfied they are with their current job and
> The Environmental Protection Agency is examining the relationship between the ozone level (in parts per million) and the population (in millions) of U.S. cities. Part of the regression analysis is shown. Dependent variable is Ozone Dependent variable is
> A skeptic suggests that reduced sea ice isn’t due to global climate change at all. He offers the following model, including Year since 1979 as another predictor as an alternative to the model in Exercise 23 (Data in Sea ice): Response variable is: Extent
> The output shows an attempt to model the association between average January Temperature (in degrees Fahrenheit) and Latitude (in degrees north of the equator) for 59 U.S. cities. Which of the assumptions for inference do you think are violated? Explain.
> Further analysis of the data for the breakfast cereals in Exercise 46 looked for an association between Fiber content and Calories by attempting to construct a linear model. Here are three graphs. Which of the assumptions for inference are violated? Expl
> Is your IQ related to the size of your brain? A group of female college students took a test that measured their verbal IQs and also underwent an MRI scan to measure the size of their brains (in 1000s of pixels). The scatterplot and regression analysis a
> A healthy cereal should be low in both calories and sodium. Data for 77 cereals were examined and judged acceptable for inference. The 77 cereals had between 50 and 160 calories per serving and between 0 and 320 mg of sodium per serving. Here
> Consider once again the CO2 and global temperature data of Exercise 41. The mean CO2 level for these data is 352.566 ppm. 1. Find a 90% confidence interval for the mean global temperature anomaly if the CO2 level reaches 450 ppm. 2. Find a 90% prediction
> Consider again the data in Exercise 40 about the gas mileage and weights of cars. 1. Create a 95% confidence interval for the average fuel efficiency among cars weighing 2500 pounds, and explain what your interval means. 2. Create a 95% prediction interv
> Consider the CO2 and global temperature data of Exercise 41. 1. Find a 90% confidence interval for the slope of the true line describing the association between Temp and CO2. 2. Explain in this context what your confidence interval means.
> Consider again the data in Exercise 40 about the gas mileage and weights of cars. 1. Create a 95% confidence interval for the slope of the regression line. 2. Explain in this context what your confidence interval means.
> In an effort to reduce the number of gun-related homicides, some cities have run buyback programs in which the police offer cash (often $50) to anyone who turns in an operating handgun. Chance magazine looked at results from a four-year period in Milwauk
> Data collected from around the globe (including the sea ice data of Exercise 23) show that the earth is getting warmer. The generally accepted explanation relates climate change to an increase in atmospheric levels of carbon dioxide (CO2) because CO2 is
> A consumer organization has reported test data for 50 car models. We will examine the association between the weight of the car (in thousands of pounds) and the fuel efficiency (in miles per gallon). Here are the scatterplot, summary statistics, and regr
> The price of a car depends on its age as well as on its mileage. Here is a regression in which the age of the cars (in years) is included in the regression model from Exercise 34: Response variable is: Price 1. What is the interpretation of the coefficie
> Biologists studying the effects of acid rain on wildlife collected data from 172 streams in the Adirondack Mountains. They recorded the pH (acidity) of the water and the BCI, a measure of biological diversity. Here’s a scatterplot of BCI against pH for the
> Based on the analysis of marriage ages given in Exercise 33, find a 95% confidence interval for the rate at which the age gap is closing. Explain what your confidence interval means
> On January 22, 2017, www.autotrader.com listed 55 used Honda Civics for sale by owner. Here’s a scatterplot of the asking price vs. the number of miles on the odometer (in thousands): 1. Do you think a linear model is appropriate? Explain. Here is the regr
> Chapter 8, Exercises 42, 44, and 49, looked at how the age at first marriage has changed over time for men and women. One trend was that people have been waiting until they are older to get married. Generally, men are older at their first marriage th
> Based on the regression output seen in Exercise 28, create a 95% confidence interval for the slope of the regression line and interpret it in context.
> Here is a mosaic plot of the data on Diet and Politics from Exercise 5 combined with data on Gender. 1. Are there more men or women in the survey? Explain briefly. 2. Does there appear to be an association between Politics and Gender? Explain briefly. 3.
> Based on the regression output seen in Exercise 27, create a 95% confidence interval for the slope of the regression line and interpret your interval in context.
> Look again at Exercise 28’s regression output for age and cholesterol level. (Data in Framingham) 1. The output reports s = 46.16. Explain what that means in this context. 2. What is the value of the standard error of the slope of the regression line? 3. Expl
> Look again at Exercise 27’s regression output for the calorie and sodium content of hot dogs. 1. The output reports s = 59.66. Explain what that means in this context. 2. What is the value of the standard error of the slope of the regression line? 3. Explain wh
> Does a person’s cholesterol level tend to change with age? Data collected from 1406 adults aged 45 to 62 as part of the Framingham study produced the regression analysis shown. Assuming that the data satisfy the conditions for inference, examine the associ
> Healthy eating probably doesn’t include hot dogs, but if you are going to have one, you’d probably hope it’s low in both calories and sodium. Recently, Consumer Reports listed the number of calories and sodium content (i
> Exercise 24 shows computer output examining the association between the sizes of houses and their sale prices. 1. Check the assumptions and conditions for inference. 2. Find a 95% confidence interval for the slope and interpret it in context.
> Exercise 23 shows computer output examining the association between Arctic sea ice extent and global mean temperature. Find a 95% confidence interval for the slope and interpret it in context.
> Climate scientists have been observing the extent of sea ice in the northern Arctic using satellite observations. Many have expressed concern because in recent decades the extent of sea ice has declined precipitously, possibly due to global climate change
> The 2013 World Drug Report investigated the prevalence of drug use as a percentage of the population aged 15 to 64. Data from 32 European countries are shown in the following scatterplot and regression analysis. (World Drug Report, 2013. www.unodc.org/un
> The dataset Student survey contains 299 responses to a student survey from a statistics project. The questions asked included: How would you rate yourself politically? (1=Far left, 9 = Far right) What is your gender? Do you believe in God? Pick a random
> In Chapter 6, we looked at data from the National Oceanic and Atmospheric Administration about their success in predicting hurricane tracks. Here is a scatterplot of the error (in nautical miles) for predicting hurricane locations 24 hours in the future
> The coach from Exercise 2 called a team meeting to summarize the results from his study. Would it be a good strategy to tell the players that all they need to do is to shoot more and the goals will follow?
> An SAT preparation course wants to advertise based on the analyses we’ve seen that raising your SAT scores will increase your eventual earnings. Is that conclusion supported by these analyses?
> Use the survey results in the table to investigate differences in education level attained among different age groups in the United States.
> Most pregnancies are full term, but some are preterm (less than 37 weeks). Of those that are preterm, the Centers for Disease Control and Prevention classifies them as early (less than 34 weeks) and late (34 to 36 weeks). A December 2010 National Vital S
> Titanic. Newspaper headlines at the time, and traditional wisdom in the succeeding decades, have held that women and children escaped the Titanic in greater proportions than men. Here’s a table with the relevant data. Do you think that survival was independ
> A subtle form of racial discrimination in housing is racial steering. Racial steering occurs when real estate agents show prospective buyers only homes in neighborhoods already dominated by that family’s race. This violates the Fair Housing Act of 1968. Ac
> In Exercise 44, you found that the expected cell counts failed to satisfy the conditions for inference. 1. Find a sensible way to combine some cells that will make the expected counts acceptable. 2. Test a hypothesis about the full moon and state your co
> In some situations where the expected cell counts are too small, as in the case of the grades given by Professors Alpha and Beta in Exercise 43, we can complete an analysis anyway. We can often proceed after combining cells in some way that makes sense a
> Some people believe that a full moon elicits unusual behavior in people. The table shows the number of arrests made in a small town during weeks of six full moons and six other randomly selected weeks in the same year. We wonder if there is evidence of a
> The following data show the percentage change in population for the 50 states and the District of Columbia from the 2000 census to the 2010 census. Using appropriate graphical displays and summary statistics, write a report on the percentage change in po
> Two different professors teach an introductory statistics course. The table shows the distribution of final grades they reported. We wonder whether one of these professors is an easier grader. 1. Will you test goodness-of-fit, homogeneity, or independenc
> In April 2009, Gallup published results from data collected from a large sample of adults in the 27 European Union member states. One of the questions asked was, Which is the most practicable and realistic option for child care, taking into account the n
> The poll described in Exercise 39 also investigated the respondents’ party affiliations based on what area of the state they lived in. Test an appropriate hypothesis about this table and state your conclusions. (Data in Montana revisited)
> Medical researchers followed 6272 Swedish men for 30 years to see if there was any association between the amount of fish in their diet and prostate cancer. (Fatty Fish Consumption and Risk of Prostate Cancer, Lancet, June 2001) 1. Is this a survey, a re
> A poll conducted by the University of Montana classified respondents by whether they were male or female and political party, as shown in the table. We wonder if there is evidence of an association between being male or female and party affiliation. 1. I
> A random survey of autos parked in the student lot and the staff lot at a large university classified the brands by country of origin, as seen in the table. Are there differences in the national origins of cars driven by students and staff? 1. Is this a
> It’s common folk wisdom that drinking cranberry juice can help prevent urinary tract infections in women. In 2001, the British Medical Journal reported the results of a Finnish study in which three groups of 50 women were monitored for these infections ove
> Examine and comment on this table of the standardized residuals for the chi-square test you looked at in Exercise 34.
> Examine and comment on this table of the standardized residuals for the chi-square test you looked at in Exercise 33.
> The table below shows the rank attained by male and female officers in the New York City Police Department (NYPD). Do these data indicate that men and women are equitably represented at all levels of the department? 1. What’s the probability that a person
> In 2015, the website NewGeography.com listed its ranking of the best cities for job growth in the United States. The magazine’s top 20 large cities, along with their weighted job rating indices, are given in the table. The full dataset contains 70 cities.
> Here is a table we first saw in Chapter 2 showing who survived the sinking of the Titanic based on whether they were crew members, or passengers booked in first-, second-, or third-class staterooms: 1. If we draw an individual at random, what’s the probabi
> In Exercises 24, 26, 28, and 30, we considered data on articles in the NEJM. The original study listed 23 different statistics methods. (The list read: t-tests, contingency tables, linear regression, . . . .) Why would it not be appropriate to use a chi-
> In Exercises 23, 25, 27, and 29, we’ve looked at a study examining epidurals as one factor that might inhibit successful breastfeeding of newborn babies. Suppose a broader study included several additional issues, including whether the mother drank alcoh
> In Exercises 24, 26, and 28, we’ve tested a hypothesis about whether the use of statistics in NEJM medical articles has changed over time. The table shows the test residuals. 1. Show how the residual for the 1989/No cell was calculated.
> In Exercises 23, 25, and 27, we’ve tested a hypothesis about the impact of epidurals on successful breastfeeding. The following table shows the test residuals. 1. Show how the residual for the epidural/no breastfeeding cell was calculat
> In Exercises 24 and 26, we’ve begun to examine whether the use of statistics in NEJM medical articles has changed over time. 1. Calculate the component of chi-square for the 1989/No cell. 2. For this test, χ² = 25.28. What’s the P-value? 3. State your concl
> In Exercises 23 and 25, we’ve begun to examine the possible impact of epidurals on successful breastfeeding. 1. Calculate the component of chi-square for the epidural/no breastfeeding cell. 2. For this test, χ² = 14.87. What’s the P-value? 3. State your con
> The table in Exercise 24 shows whether NEJM medical articles during various time periods included statistics or not. We’re planning to do a chi-square test. 1. How many degrees of freedom are there? 2. The smallest expected count will be in the 1989/No ce
> In Exercise 23, the table shows results of a study investigating whether aftereffects of epidurals administered during childbirth might interfere with successful breastfeeding. We’re planning to do a chi-square test. 1. How many degrees of freedom are the
> A survey of articles from the New England Journal of Medicine (NEJM) classified them according to the principal statistics methods used. The articles recorded were all non-editorial articles appearing during the indicated years. Let’s just look at whether
> The National Center for Education Statistics reports average mathematics achievement scores for eighth graders in all 50 states (nces.ed.gov/nationsreportcard/): 1. Using technology and the provided data file, find the median, IQR, mean, and standard dev
> There is some concern that if a woman has an epidural to reduce pain during childbirth, the drug can get into the baby’s bloodstream, making the baby sleepier and less willing to breastfeed. The International Breastfeeding Journal published results of a st
> The fairness of the South African lottery was recently challenged by one of the country’s political parties. The lottery publishes historical statistics at its website (www.nationallottery.co.za). Here is a table of the number of times each number appeared
> The National Hurricane Center provides data that list the numbers of large (category 3, 4, or 5) hurricanes that have struck the United States, by decade since 1851 (www.nhc.noaa.gov/dcmi.shtml). The data are given below. Recently, there’s been some concer
> Many people know the mathematical constant π is approximately 3.14. But that’s not exact. To be more precise, here are 20 decimal places: 3.14159265358979323846. Still not exact, though. In fact, the actual value is irrational, a decimal that goes on forever
> Offspring of certain fruit flies may have yellow or ebony bodies and normal wings or short wings. Genetic theory predicts that these traits will appear in the ratio 9:3:3:1 (9 yellow, normal: 3 yellow, short: 3 ebony, normal: 1 ebony, short). A researche
> In its study When Men Murder Women: An Analysis of 2009 Homicide Data, 2011, the Violence Policy Center (www.vpc.org) reported that 1818 women were murdered by men in 2009. Of these victims, a weapon could be identified for 1654 of them. Of those for who
> Census data for New York City indicate that 29.2% of the under-18 population is white, 28.2% black, 31.5% Latino, 9.1% Asian, and 2% other ethnicities. The New York Civil Liberties Union points out that, of 26,181 police officers, 64.8% are white, 14.5%
> A salesman who is on the road visiting clients thinks that, on average, he drives the same distance each day of the week. He keeps track of his mileage for several weeks and discovers that he averages 122 miles on Mondays, 203 miles on Tuesdays, 176 mile
> A company says its premium mixture of nuts contains 10% Brazil nuts, 20% cashews, 20% almonds, and 10% hazelnuts, and the rest are peanuts. You buy a large can and separate the various kinds of nuts. On weighing them, you find there are 112 grams of Braz
> As noted in an earlier chapter, Mars Inc. says that until very recently yellow candies made up 20% of its milk chocolate M&M’s, red another 20%, and orange, blue, and green 10% each. The rest are brown. On his way home from work the day he was writing these
> Here are some summary statistics to go with the histogram of the ZIP codes of 500 customers from the Holes-R-Us Internet Jewelry Salon that we saw in Exercise 81: What can these statistics tell you about the company’s sales?
> After getting trounced by your little brother in a children’s game, you suspect the die he gave you to roll may be unfair. To check, you roll it 60 times, recording the number of times each face appears. Do these results cast doubt on the die’s fairness? 1.
> For each of the following situations, state whether you’d use a chi-square goodness-of-fit test, a chi-square test of homogeneity, a chi-square test of independence, or some other statistical test: 1. Is the quality of a car affected by what day it was b
> For each of the following situations, state whether you’d use a chi-square goodness-of-fit test, a chi-square test of homogeneity, a chi-square test of independence, or some other statistical test: 1. A brokerage firm wants to see whether the type of acc
> Can a food additive increase egg production? Agricultural researchers want to design an experiment to find out. They have 100 hens available. They have two kinds of feed: the regular feed and the new feed with the additive. They plan to run their experim
> In the experiment about hormone injections in cows described in Exercise 39, a group of 52 Jersey cows increased average milk production from 43 pounds to 52 pounds per day, with a standard deviation of 4.8 pounds. Is this evidence that the hormone may