The regression of Duration of a roller coaster ride on the height of its initial Drop, described in Exercise 30, had R² = 29.4%.
1. What is the correlation between Drop and Duration?
2. What would you predict about the Duration of the ride on a coaster whose initial Drop was 1 standard deviation below the mean Drop?
3. What would you predict about the Duration of the ride on a coaster whose initial Drop was 3 standard deviations above the mean Drop?
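
The arithmetic behind these three parts can be sketched as follows. This is a minimal illustration, not the textbook's solution: it assumes the slope of the regression is positive (the related exercise reports the model Duration = 87.22 + 0.389 Drop), so r is the positive square root of R². The function names are hypothetical.

```python
import math

def correlation_from_r2(r_squared, slope_sign=1):
    """Return r from R^2 (as a proportion), taking its sign from the slope."""
    return math.copysign(math.sqrt(r_squared), slope_sign)

def predicted_z(r, z_x):
    """Predicted z-score of the response: z_y_hat = r * z_x."""
    return r * z_x

# R^2 = 29.4%; the positive slope sign is an assumption taken from the fitted model.
r = correlation_from_r2(0.294, slope_sign=1)
print(round(r, 3))                    # 0.542
print(round(predicted_z(r, -1), 3))   # -0.542
print(round(predicted_z(r, 3), 3))    # 1.627
```

Under that assumption, a coaster whose Drop is 1 SD below the mean is predicted to have a Duration about 0.54 SD below the mean Duration, and one 3 SDs above the mean Drop about 1.63 SDs above the mean Duration.
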
> The United Nations Development Programme (UNDP) uses the Human Development Index (HDI) in an attempt to summarize in one number the progress in health, education, and economics of a country (hdr.undp.org/en/data#). In 2015, the HDI was as high as 0.94 fo
> The Centers for Disease Control and Prevention tracks cigarette smoking in the United States (www.cdc.gov/nchs). How has the percentage of people who smoke changed since the danger became clear during the last half of the 20th century? The scatterplot sh
> Is there evidence that the age at which women get married has changed over the past 100 years? The scatterplot shows the trend in age at first marriage for American women (www.census.gov). 1. Is there a clear pattern? Describe the trend. 2. Is the associ
> The data file Receivers 2015 holds information about the 488 NFL players who caught at least one pass during the 2015 football season. A typical 53-man roster has about 13 players who would be expected to catch passes (primarily wide receivers, tight end
> Chapter 6, Exercise 41 looked at a sample of 35 vehicles to examine the relationship between gas mileage and engine displacement. The full dataset holds data on 1211 cars. How well did our sample of 35 represent the underlying relationship between displac
> Consider the four points (200,1950), (400,1650), (600,1800), and (800,1600). The least squares line is ŷ = 1975 − 0.45x. Explain what least squares means, using these data as a specific example.
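
For the four points in the exercise above, the idea of "least squares" can be checked numerically. The sketch below is illustrative only and the helper names are hypothetical. It fits the line from the data (the slope comes out to −0.45 with intercept 1975), sums the squared residuals, and compares that total with the total for one arbitrarily chosen alternative line.

```python
def least_squares_line(points):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sum_squared_residuals(points, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)

points = [(200, 1950), (400, 1650), (600, 1800), (800, 1600)]
intercept, slope = least_squares_line(points)
best = sum_squared_residuals(points, intercept, slope)

# An arbitrary competitor line (same intercept, slightly different slope).
other = sum_squared_residuals(points, 1975, -0.40)

print(round(intercept, 2), round(slope, 2))  # 1975.0 -0.45
print(round(best), round(other))             # 34500 37500
```

The residuals from the fitted line are 65, −145, 95, and −15; their squares sum to 34,500, and no other line gives a smaller total for these four points.
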
> A study of traffic delays in 68 U.S. cities found the following relationship between Total Delay (in total hours lost) and Mean Highway Speed: Is it appropriate to summarize the strength of association with a correlation? Explain.
> Consider the four points (10,10), (20,50), (40,20), and (50,80). The least squares line is ŷ = 7.0 + 1.1x. Explain what least squares means, using these data as a specific example.
> Wildlife researchers monitor many wildlife populations by taking aerial photographs. Can they estimate the weights of alligators accurately from the air? Here is a regression analysis of the Weight of alligators (in pounds) and their Length (in inches) b
> In an investigation of environmental causes of disease, data were collected on the annual mortality rate (deaths per 100,000) for males in 61 large towns in England and Wales. In addition, the water hardness was recorded as the calcium concentration (par
> We saw the data for the women's 2016 Olympic heptathlon in Exercise 73. Are the two jumping events associated? Perform a regression of the long-jump results on the high-jump results. 1. What is the regression equation? What does the slope mean? 2. What per
> We discussed the women's 2016 Olympic heptathlon in Chapter 5. Here are the results from the high jump, 800-meter run, and long jump for the 27 women who successfully completed all three events of the heptathlon in the 2016 Olympics: Let's examine the associ
> Would a model that uses the person's Waist size be able to predict the %Body Fat more accurately than one that uses Weight? Using the data in Exercise 71, create and analyze that model.
> It is difficult to determine a person's body fat percentage accurately without immersing him or her in water. Researchers hoping to find ways to make a good estimate immersed 20 male subjects, then measured their waists and recorded their weights shown in
> In Exercise 69, we saw the relationship between CO2 measured at Mauna Loa and average global temperature anomaly from 1959 to 2016. Here is a plot of average global temperatures plotted against the yearly final value of the Dow Jones Industrial Average f
> The earth's climate is getting warmer. The most common theory attributes the increase to an increase in atmospheric levels of carbon dioxide (CO2), a greenhouse gas. Here is a scatterplot showing the mean annual temperature anomaly (the difference between
> The table shows the number of live births per 1000 population in the United States, starting in 1965. (National Center for Health Statistics, www.cdc.gov/nchs/) 1. Make a scatterplot and describe the general trend in Birthrates. (Enter Year as years sinc
> In a study of streams in the Adirondack Mountains, the following relationship was found between the water pH and its hardness (measured in grains): Is it appropriate to summarize the strength of association with a correlation? Explain. (Data in Streams)
> We saw in this chapter that in Tompkins County, New York, older bridges were in worse condition than newer ones. Tompkins is a rural area. Is this relationship true in New York City as well? Here are data on the Condition (as measured by the state Depart
> Numbeo.com lists the cost of living (COL) for 576 cities around the world. It reports the typical cost of a number of staples. Here are a scatterplot and regression relating the cost of a cappuccino to the cost of a third of a liter of water: 1. Using th
> In Exercise 63, you created a model that can estimate the number of Calories in a burger when the Fat content is known. 1. Explain why you cannot use that model to estimate the fat content of a burger with 600 calories. 2. Using an appropriate model, est
> Chicken sandwiches are often advertised as a healthier alternative to beef because many are lower in fat. Tests on 11 brands of fast-food chicken sandwiches produced the following summary statistics and scatterplot from a graphing calculator: 1. Do you t
> In Chapter 6, you examined the association between the amounts of Fat and Calories in fast-food hamburgers. Here are the data: 1. Create a scatterplot of Calories vs. Fat. 2. Interpret the value of R² in this context. 3. Write the equation
> Burger King introduced a meat-free burger in 2002. The nutrition label for the 2014 BK Veggie burger (no mayo) is shown here: (Data in Burger King items) 1. Use the regression model created in this chapter, Fat=8.4+0.91 Protein to predict the fat content
> Use the advertised prices for Toyota Corollas given in Exercise 59 to create a linear model for the relationship between a car's Age and its Price. 1. Find the equation of the regression line. 2. Explain the meaning of the slope of the line. 3. Explain the
> Chapter 6, Exercise 42 examines results of a survey conducted in the United States and 10 countries of Western Europe to determine the percentage of teenagers who had used marijuana and other drugs. Below is the scatterplot. Summary statisti
> Carmax.com lists numerous Toyota Corollas for sale within a 250 mile radius of Redlands, CA. The table lists the ages of the cars and the advertised prices. 1. Make a scatterplot for these data. 2. Describe the association between Age and Price of a used
> We saw in Exercise 57 that the number of fires was nearly constant. But has the damage they cause remained constant as well? Here is a regression that examines the trend in Acres per Fire (in hundreds of thousands of acres) together with some supporting plo
> A study compared the effectiveness of several antidepressants by examining the experiments in which they had passed the FDA requirements. Each of those experiments compared the active drug with a placebo, an inert pill given to some of the subjects. In e
> The National Interagency Fire Center (www.nifc.gov) reports statistics about wildfires. Here is an analysis of the number of wildfires between 1985 and 2015. 1. Is a linear model appropriate for these data? Explain. 2. Interpret the slope in this context. 3
> Based on the statistics for college freshmen given in Exercise 54, what SAT score would you predict for a freshman who attained a first-semester GPA of 3.0?
> Suppose we wanted to use SAT math scores to estimate verbal scores based on the information in Exercise 53. 1. What is the correlation? 2. Write the equation of the line of regression predicting verbal scores from math scores. 3. In general, what would a
> Colleges use SAT scores in the admissions process because they believe these scores provide some insight into how a high school student will perform at the college level. Suppose the entering freshmen at a certain college have mean combined SAT Scores of
> The SAT is a test often used as part of an application to college. SAT scores are between 200 and 800, but have no units. Tests are given in both Math and Verbal areas. SAT-Math problems require the ability to read and understand the questions, but can a
> For the online clothing retailer discussed in the previous problem, the scatterplot of Total Yearly Purchases by Income looks like this: The correlation between Total Yearly Purchases and Income is 0.722. Summary statistics for the two variables are: 1.
> An online clothing retailer keeps track of its customers' purchases. For those customers who signed up for the company credit card, the company also has information on the customer's Age and Income. A random sample of 500 of these customers shows the follow
> In Chapter 6, Exercise 40, we saw a plot of mortgages in the United States (in trillions of 2013 dollars) vs. the interest rate at various times over the past 25 years. The correlation is r=0.845. The mean mortgage amount is $8.207 T and the mea
> In Chapter 6, Exercise 39, we learned that the Office of Federal Housing Enterprise Oversight (OFHEO) collects data on various aspects of housing costs around the United States. Here is a scatterplot (by state) of the Housing Cost Index (HCI) vs. the Median
> Refer again to the regression analysis for home average attendance and games won by baseball teams, seen in Exercise 44. 1. Write the equation of the regression line. 2. Estimate the Home Average Attendance for a team with 750 Runs. 3. Interpret the mean
> Most roller coasters get their speed by dropping down a steep initial incline, so it makes sense that the height of that drop might be related to the speed of the coaster. Here is a scatterplot of top Speed and largest Drop for 118 roller coasters around th
> Take another look at the regression analysis of tar and nicotine content of the cigarettes in Exercise 43. 1. Write the equation of the regression line. 2. Estimate the Nicotine content of cigarettes with 4 milligrams of Tar. 3. Interpret the meaning of
> Consider again the regression of Home Average Attendance on Runs for the baseball teams examined in Exercise 44. 1. What is the correlation between Runs and Home Average Attendance? 2. What would you predict about the Home Average Attendance for a team t
> Consider again the regression of Nicotine content on Tar (both in milligrams) for the cigarettes examined in Exercise 43. 1. What is the correlation between Tar and Nicotine? 2. What would you predict about the average Nicotine content of cigarettes that
> In Chapter 6, Exercise 45 looked at the relationship between the number of runs scored by American League baseball teams and the average attendance at their home games for the 2016 season. Here are the scatterplot, the residuals plot, and par
> Is the nicotine content of a cigarette related to the tar? A collection of data (in milligrams) on 816 cigarettes produced the scatterplot, residuals plot, and regression analysis shown: 1. Do you think a linear model is appropriate here? Explain. 2. Exp
> Consider the roller coasters (with the outlier removed) described in Exercise 30 again. The regression analysis gives the model Duration=87.22+0.389 Drop. 1. Explain what the slope of the line says about how long a roller coaster ride may last and the he
> Consider the Albuquerque home sales from Exercise 29 again. The regression analysis gives the model Price=47.82+0.061 Size. 1. Explain what the slope of the line says about housing prices and house size. 2. What price would you predict for a 3000-square
> Players in any sport who are having great seasons, turning in performances that are much better than anyone might have anticipated, often are pictured on the cover of Sports Illustrated. Frequently, their performances then falter somewhat, leading some a
> People who claim to have extrasensory perception (ESP) participate in a screening test in which they have to guess which of several images someone is thinking of. You and a friend both took the test. You scored 2 standard deviations above the mean, and y
> The National Insurance Crime Bureau reports that Honda Accords, Honda Civics, and Toyota Camrys are the cars most frequently reported stolen, while Ford Tauruses, Pontiac Vibes, and Buick LeSabres are stolen least often. Is it reasonable to say that ther
> The regression of Price on Size of homes in Albuquerque had R² = 71.4%, as described in Exercise 29. 1. What is the correlation between Size and Price? 2. What would you predict about the Price of a home 1 SD above average in Size? 3. What would you predict
> A sociology student investigated the association between a country's Literacy Rate and Life Expectancy, and then drew the conclusions listed below. Explain why each statement is incorrect. (Assume that all the calculations were done properly.) 1. The R² of
> A biology student who created a regression model to use a bird's Height when perched for predicting its Wingspan made these two statements. Assuming the calculations were done correctly, explain what is wrong with each interpretation. 1. My R² of 93% shows
> Exercise 30 examined the association between the Duration of a roller coaster ride and the height of its initial Drop, reporting that R² = 29.4%. Write a sentence (in context, of course) summarizing what the R² says about this regression.
> The regression of Price on Size of homes in Albuquerque had R² = 71.4%, as described in Exercise 29. Write a sentence (in context, of course) summarizing what the R² says about this regression.
> If you create a regression model for estimating the Height of a pine tree (in feet) based on the Circumference of its trunk (in inches), is the slope most likely to be 0.1, 1, 10, or 100? Explain.
> If you create a regression model for predicting the Weight of a car (in pounds) from its Length (in feet), is the slope most likely to be 3, 30, 300, or 3000? Explain.
> The dataset on roller coasters lists the Duration of the ride in seconds in addition to the Drop height in feet for some of the coasters. One coaster (the Tower of Terror) is unusual for having a large drop but a short ride. After setting it aside, a r
> A random sample of records of home sales from Feb. 15 to Apr. 30, 1993, from the files maintained by the Albuquerque Board of Realtors gives the Price and Size (in square feet) of 117 homes. A regression to predict Price (in thousands of dollars) from Si
> Tell what each of the residual plots below indicates about the appropriateness of the linear model that was fit to the data.
> A candidate for office claims that there is a correlation between television watching and crime. Criticize this statement on statistical grounds.
> Tell what each of the residual plots below indicates about the appropriateness of the linear model that was fit to the data.
> Fill in the missing information in the following table.
> Fill in the missing information in the following table.
> For Exercise 16's regression model predicting fuel economy (in mpg) from the car's engine size, s_e = 3.522. Explain in this context what that means.
> For Exercise 15's regression model predicting potassium content (in milligrams) from the amount of fiber (in grams) in breakfast cereals, s_e = 30.77. Explain in this context what that means.
> The correlation between a car's engine size and its fuel economy (in mpg) is r = −0.774. What fraction of the variability in fuel economy is accounted for by the engine size?
> The correlation between a cereal's fiber and potassium contents is r = 0.903. What fraction of the variability in potassium is accounted for by the amount of fiber that servings contain?
> In Exercise 16, the regression model Combined MPG = 33.46 − 3.23 Displacement relates cars' engine size to their fuel economy (Combined mpg). Explain what the slope means.
> In Exercise 15, the regression model Potassium^=38+27 Fiber relates fiber (in grams) and potassium content (in milligrams) in servings of breakfast cereals. Explain what the slope means.
> Exercise 16 describes a regression model that uses a car's engine displacement to estimate its fuel economy. In this context, what does it mean to say that a certain car has a positive residual?
> Here are several scatterplots. The calculated correlations are 0.977, 0.021, 0.736, and 0.951. Which is which?
> The data set Igf13 contains the data from Igf for children under 13 years old. Most of the data was collected from physical examinations in schools. 1. Fit a linear regression to igf using age as the predictor variable. Comment on the appropriateness of
> Exercise 15 describes a regression model that estimates a cereal's potassium content from the amount of fiber it contains. In this context, what does it mean to say that a cereal has a negative residual?
> In Chapter 6, Exercise 41, we examined the relationship between the fuel economy (Combined MPG) and Displacement (in liters) for 1211 models of cars. (Data in Fuel economy 2016) Further analysis produces the regression model Combined MPG^=3
> For many people, breakfast cereal is an important source of fiber in their diets. Cereals also contain potassium, a mineral shown to be associated with maintaining a healthy blood pressure. An analysis of the amount of fiber (in grams) and the potassium
> Agricultural scientists are working on developing an improved variety of Roma tomatoes. Marketing research indicates that customers are likely to bypass Romas that weigh less than 70 grams. The current variety of Roma plants produces fruit that averages
> Hens usually begin laying eggs when they are about 6 months old. Young hens tend to lay smaller eggs, often weighing less than the desired minimum weight of 54 grams. 1. The average weight of the eggs produced by the young hens is 50.9 grams, and only 28
> Most people think that the normal adult body temperature is 98.6°F. That figure, based on a 19th-century study, has recently been challenged. In a 1992 article in the Journal of the American Medical Association, researchers reported that a more accurate f
> Companies that design furniture for elementary school classrooms produce a variety of sizes for kids of different ages. Suppose the heights of kindergarten children can be described by a Normal model with a mean of 38.2 inches and standard deviation of 1
> A tire manufacturer believes that the tread life of its snow tires can be described by a Normal model with a mean of 32,000 miles and standard deviation of 2500 miles. 1. If you buy one of these tires, would it be reasonable for you to hope it will last
> Assume the cholesterol levels of adult American women can be described by a Normal model with a mean of 188 mg/dL and a standard deviation of 24. 1. Draw and label the Normal model. 2. What percent of adult women do you expect to have cholesterol levels
> Consider the IQ model N(100,15) one last time. 1. What IQ represents the 15th percentile? 2. What IQ represents the 98th percentile? 3. What is the IQR of the IQs?
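
Percentile questions like the one above ask for the inverse of the Normal cumulative distribution. A minimal sketch using Python's standard-library `statistics.NormalDist` (an illustration, not the textbook's table-based method; answers read from a z-table may differ slightly in the last digit):

```python
from statistics import NormalDist

# The N(100, 15) model for IQ scores from the exercise.
iq = NormalDist(mu=100, sigma=15)

p15 = iq.inv_cdf(0.15)                      # IQ at the 15th percentile
p98 = iq.inv_cdf(0.98)                      # IQ at the 98th percentile
iqr = iq.inv_cdf(0.75) - iq.inv_cdf(0.25)   # Q3 minus Q1

print(round(p15, 1), round(p98, 1), round(iqr, 1))  # 84.5 130.8 20.2
```

The same pattern applies to any N(μ, σ) model: cutoff values come from `inv_cdf`, and percentages below a value come from `cdf`.
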
> Here are several scatterplots. The calculated correlations are 0.923, 0.487, 0.006, and 0.777. Which is which?
> Consider the Angus weights model N(1152,84) one last time. 1. What weight represents the 40th percentile? 2. What weight represents the 99th percentile? 3. What is the IQR of the weights of these Angus steers?
> In the Normal model N(100,15) from Exercise 10, what cutoff value bounds 1. the highest 5% of all IQs? 2. the lowest 30% of the IQs? 3. the middle 80% of the IQs?
> Based on the model N(1152,84) describing Angus steer weights from Exercise 29, what are the cutoff values for 1. the highest 10% of the weights? 2. the lowest 20% of the weights? 3. the middle 40% of the weights?
> Suppose we take logarithms of the CEO compensations in Exercise 47. The histogram of log Compensation looks like this: with a mean of 1.07 and a standard deviation of 0.26. 1. According to the Normal model, what percent of CEOs would you expect to earn
> The Glassdoor Economic Research Blog published the compensation (in millions of dollars) for the CEOs of large companies. The distribution looks like this: The mean CEO compensation is $14.1M and the standard deviation is $11.32M. 1. According to the Nor
> A large philanthropic organization keeps records on the people who have contributed. In addition to keeping records of past giving, the organization buys demographic data on neighborhoods from the U.S. Census Bureau. Eighteen of these variables concern e
> NFL data from the 2015 football season reported the number of yards gained by each of the league's 488 receivers: The mean is 274.73 yards, with a standard deviation of 327.32 yards. 1. According to the Normal model, what percent of receivers would you exp
> The mean of the 100 car speeds in Exercise 30 was 23.84 mph, with a standard deviation of 3.56 mph. 1. Using a Normal model, what values should border the middle 95% of all car speeds? 2. Here are some summary statistics. From your answer in part a, how
> Fifty-three men completed the men's alpine downhill at the 2018 Winter Olympics in PyeongChang. The gold medal winner finished in 100.25 seconds. Here are the times (in seconds) for all competitors. 1. The mean time was 103.883 seconds, with a standard dev
> For the car speed data in Exercise 30, here are the histogram, boxplot, and Normal probability plot of the 100 readings. Do you think it is appropriate to apply a Normal model here? Explain.
> Owners of a new coffee shop tracked sales for the first 20 days and displayed the data in a scatterplot (by day). 1. Make a histogram of the daily sales since the shop has been in business. 2. State one fact that is obvious from the scatterplot, but not
> Later on, the forester in Exercise 39 shows you a histogram of the tree diameters he used in analyzing the woods that was for sale. Do you think he was justified in using a Normal model? Explain, citing some specific concerns.
> A company that manufactures rivets believes the shear strength (in pounds) is modeled by N(800,50). 1. Draw and label the Normal model. 2. Would it be safe to use these rivets in a situation requiring a shear strength of 750 pounds? Explain. 3. About wha
> A forester measured 27 of the trees in a large woods that is up for sale. He found a mean diameter of 10.4 inches and a standard deviation of 4.7 inches. Suppose that these trees provide an accurate description of the whole forest and that a Normal model
> Exercise 10 proposes modeling IQ scores with N(100,15). What IQ would you consider to be unusually high? Explain.
> In Exercise 29, we suggested the model N(1152,84) for weights in pounds of yearling Angus steers. What weight would you consider to be unusually low for such an animal? Explain.