In Exercise 46, we saw in the Swim the Lake 2016 data that Vicki Keith's round-trip swim of Lake Ontario was an obvious outlier among the other one-way times. Here is the new regression after this unusual point is removed:
Dependent variable is Time
R-Squared = 2.8% s = 302.1
1. In this new model, the value of s_e (shown as s in the output) is smaller. Explain what that means in this context.
2. Are you more convinced (compared to the previous regression) that Ontario swimmers are getting faster (or slower)?
> Sammy Salsa, a small local company, produces 20 cases of salsa a day. Each case contains 12 jars and is imprinted with a code indicating the date and batch number. To help maintain consistency, at the end of each day, Sammy selects three jars of salsa, w
> A manufacturing company employs 14 project managers, 48 supervisors, and 377 laborers. In an effort to keep informed about any possible sources of employee discontent, management wants to conduct job satisfaction interviews with a sample of employees eve
> Between quarterly audits, a company likes to check on its accounting procedures to address any problems before they become serious. The accounting staff processes payments on about 120 orders each day. The next day, the supervisor rechecks 10 of the tran
> Occasionally, when I fill my car with gas, I figure out how many miles per gallon my car got. I wrote down those results after six fill-ups in the past few months. Overall, it appears my car gets 28.8 miles per gallon. 1. What statistic have I calculated
> How long is your arm compared with your hand size? Put your right thumb at your left shoulder bone, stretch your hand open wide, and extend your hand down your arm. Put your thumb at the place where your little finger is, and extend down the arm again. R
> A researcher studies children in elementary school and finds a strong positive linear association between height and reading scores. 1. Does this mean that taller children are generally better readers? 2. What might explain the strong correlation?
> What about drawing a random sample only from cell phone exchanges? Discuss the advantages and disadvantages of such a sampling method compared with surveying randomly generated telephone numbers from non-cell phone exchanges. Do you think these advantage
> Any time we conduct a survey, we must take care to avoid undercoverage. Suppose we plan to select 500 names from the city phone book, call their homes between noon and 4 pm, and interview whoever answers, anticipating contacts with at least 200 people.
> Examine each of the following questions for possible bias. If you think the question is biased, indicate how and propose a better question. 1. Do you think high school students should be required to wear uniforms? 2. Given humanity's great tradition of exp
> Examine each of the following questions for possible bias. If you think the question is biased, indicate how and propose a better question. 1. Should companies that pollute the environment be compelled to pay the costs of cleanup? 2. Given that 18-year-o
> An online poll on a website asked: A nationwide ban of the diet supplement ephedra went into effect recently. The herbal stimulant has been linked to 155 deaths and many more heart attacks and strokes. Ephedra manufacturer NVE Pharmaceuticals, claiming t
> Two members of the PTA committee in Exercises 26 and 27 have proposed different questions to ask in seeking parents' opinions. Question 1: Should elementary school age children have to pass high-stakes tests in order to remain with their classmates? Quest
> The survey described in Exercise 29 asked: "Many people believe this playground is too small and in need of repair. Do you think the playground should be repaired and expanded even if that means raising the entrance fee to the park?" Describe two ways this
> An amusement park has opened a new roller coaster. It is so popular that people are waiting for up to 3 hours for a 2-minute ride. Concerned about how patrons (who paid a large amount to enter the park and ride on the rides) feel about this, they survey
> Some people have been complaining that the children's playground at a municipal park is too small and is in need of repair. Managers of the park decide to survey city residents to see if they believe the playground should be rebuilt. They hand out question
> For your political science class, you’d like to take a survey from a sample of all the Catholic church members in your city. A list of churches shows 17 Catholic churches within the city limits. Rather than try to obtain a list of all members of all thes
> Students in the economics class discussed in Exercise 31 also wrote these conclusions. Explain the mistakes they made. 1. There was a very strong correlation of 1.22 between Life Expectancy and GDP. 2. The correlation between Literacy Rate and GDP was 0.
> Let's revisit the school system described in Exercise 26. Four new sampling strategies have been proposed to help the PTA determine whether parents favor requiring elementary students to pass a test in order to be promoted to the next grade. For each, indi
> In a large city school system with 20 elementary schools, the school board is considering the adoption of a new policy that would require elementary students to pass a test in order to be promoted to the next grade. The PTA wants to find out whether pare
> Prior to the mayoral election discussed in Exercise 24, the newspaper also conducted a poll. The paper surveyed a random sample of registered voters stratified by political party, age, sex, and area of residence. This poll predicted that Amabo would win
> A local TV station conducted a PulsePoll about the upcoming mayoral election. Evening news viewers were invited to text in their votes, with the results to be announced on the late-night news. Based on the texts, the station predicted that Amabo would wi
> Dairy inspectors visit farms unannounced and take samples of the milk to test for contamination. If the milk is found to contain dirt, antibiotics, or other foreign matter, the milk will be destroyed and the farm re-inspected until purity is restored.
> A company packaging snack foods maintains quality control by randomly selecting 10 cases from each day's production and weighing the bags. Then they open one bag from each case and inspect the contents.
> State police set up a roadblock to estimate the percentage of cars with up-to-date registration, insurance, and safety inspection stickers. They usually find problems with about 10% of the cars they stop.
> The Environmental Protection Agency took soil samples at 16 locations near a former industrial waste dump and checked each for evidence of toxic chemicals. They found no elevated levels of any harmful substances.
> Hoping to learn what issues may resonate with voters in the coming election, the campaign director for a mayoral candidate selects one block from each of the city's election districts. Staff members go there and interview all the adult residents they can f
> A question posted on the gamefaqs.com website asked visitors to the site, "Do you have an active social life outside the Internet?" 22% of the 55,581 respondents said "No" or "Not really, most of my personal contact is online."
> Your economics instructor assigns your class to investigate factors associated with the gross domestic product (GDP) of nations. Each student examines a different factor (such as Life Expectancy, Literacy Rate, etc.) for a few countries and reports to th
> Consumers Union asked all subscribers whether they had used alternative medical treatments and, if so, whether they had benefited from them. For almost all of the treatments, approximately 20% of those responding reported cures or substantial improvement
> At its website (www.gallup.com), the Gallup Poll publishes results of a new survey each day. Scroll down to the end, and you find a statement that includes words such as these: Results are based on telephone interviews with 1016 national adults, aged 18
> Major League Baseball tests players to see whether they are using performance-enhancing drugs. Officials select a team at random, and a drug-testing crew shows up unannounced to test all 40 players on the team. Each testing day can be considered a study
> For their class project, a group of statistics students decide to survey the student body to assess opinions about the proposed new student center. Their sample of 200 contained 50 first-year students, 50 sophomores, 50 juniors, and 50 seniors. 1. Do you
> Through their Roper Reports Worldwide, GfK Roper conducts a global consumer survey to help multinational companies understand different consumer attitudes throughout the world. Within 30 countries, the researchers interview 1000 people aged 13-65. Their s
> Consider again the post-1950 trend in U.S. GDP we examined in Exercise 61. Here are regression output and a residual plot when we use the log of GDP in the model. Is this a better model for GDP? Explain. Would you want to consider a different re-expressi
> The scatterplot shows the gross domestic product (GDP) of the United States in trillions of 2009 dollars plotted against years since 1950. A linear model fit to the relationship looks like this: Dependent variable is: GDP($T) R-squared = 96.9% s = 0.8137
> In Exercise 58 we looked at United Nations data about a country's GDP and the average number of people per room (Crowdedness) in housing there. For a re-expression, a student tried the reciprocal 10000/GDP, representing the number of people per $10,000
> Let's try the re-expressed variable Fuel Consumption (gal/100 mi) to examine the fuel efficiency of the 11 cars in Exercise 57. Here are the revised regression analysis and residuals plot: Dependent variable is: Fuel Consumption R-squared = 89.2% 1. Explai
> In a Chance magazine article (Summer 2005), Danielle Vasilescu and Howard Wainer used data from the United Nations Center for Human Settlements to investigate aspects of living conditions for several countries. Among the variables they looked at were the
> Hurricane Katrina's hurricane-force winds extended 120 miles from its center. Katrina was a big storm, and that affects how we think about the prediction errors. Suppose we add 120 miles to each error to get an idea of how far from the predicted track we m
> Now consider the variables igf and weight in the data set Igf13. 1. Fit a regression model. If the data violate any assumptions, find a suitable re-expression of igf. 2. Add sex to the model in part a as a predictor. Interpret the coefficient of sex. 3
> As the example in the chapter indicates, one of the important factors determining a car's Fuel Efficiency is its Weight. Let's examine this relationship again, for 11 cars. 1. Describe the association between these variables shown in the scatterplot at the t
> In Chapter 4, we examined the wind speeds in the Hopkins Forest over the course of a year. Here's the scatterplot we saw then: (Data in Hopkins Forest) 1. Describe the pattern you see here. 2. Should we try re-expressing either variable to make th
> In Exercise 29, we considered whether a linear model would be appropriate to describe the trend in the number of passengers departing from the Oakland (CA) airport each month since the start of 1997. If we fit a regression model, we obtain this residual
> Look once more at the data from Tour de France 2016. In Exercise 52, we looked at the whole history of the race, but now let's consider just the modern era from 1967 on. 1. Make a scatterplot and find the regression of Avg Speed by Year only for years from
> The Consumer Price Index (CPI) tracks the prices of consumer goods in the United States, as shown in the following table. The CPI is reported monthly, but we can look at selected values. The table shows the January CPI at five-year intervals. It indicate
> We met the Tour de France dataset in Chapter 1 (in Just Checking). Look at the Tour de France dataset. One hundred years ago, the fastest rider finished the course at an average speed of about 25.3 kph (around 15.8 mph). By the 21st century, winning ride
> The World Bank reports many demographic statistics about countries of the world. The data file holds the Fertility rate (births per woman) and the female Life Expectancy at birth (in years) for 200 countries of the world. Response variable is: Life expec
> In Chapter 7, we found a relationship between the age of a bridge in Tompkins County, New York, and its condition as found by inspection. (Data in Tompkins County Bridges 2016) But we considered only bridges built or replaced since 1900. Tompkin
> Look again at the graph of the age at first marriage for women in Exercise 42. Here is a regression model for the data on women, along with a residuals plot: Response variable is: Women R-squared = 61.1% s = 1.474 1. Based on this model, what would you p
> The errors in predicting hurricane tracks (examined in this chapter) were given in nautical miles. A statutory mile is 0.86898 nautical mile. Most people living on the Gulf Coast of the United States would prefer to know the prediction errors in statutor
> We removed humans from the scatterplot of the Gestation data in Exercise 45 because our species was an outlier in life expectancy. The resulting scatterplot (below) shows two points that now may be of concern. The point in the upper right corner of this
> People swam across Lake Ontario from Niagara-on-the-Lake to Toronto (52 km, or about 32.3 mi) 62 times between 1954 and 2016. We might be interested in whether the swimmers are getting any faster or slower. Here are the regression of the crossing Times (
> For humans, pregnancy lasts about 280 days. In other species of animals, the length of time from conception to birth varies. Is there any evidence that the gestation period is related to the animal's life span? The first scatterplot shows Gestation Period
> Has the trend of decreasing difference in age at first marriage seen in Exercise 42 gotten stronger recently? The scatterplot and residual plot for the data from 1980 through 2015, along with a regression for just those years, are below. 1. Is this linea
> In Exercise 41, you investigated the federal rate on 3-month Treasury bills between 1950 and 1980. The scatterplot below shows that the trend changed dramatically after 1980, so we computed a new regression model for the years 1981 to 2015. Here's the mode
> The graph shows the ages of both men and women at first marriage (www.census.gov). Clearly, the patterns for men and women are similar. But are the two lines getting closer together? Here are a timeplot showing the difference in average age (men
> Here are a plot and regression output showing the federal rate on 3-month Treasury bills from 1950 to 1980, and a regression model fit to the relationship between the Rate (in %) and Years Since 1950 (www.gpoaccess.gov/eop/). 1. What is the correlation b
> How does the speed at which you drive affect your fuel economy? To find out, researchers drove a compact car for 200 miles at speeds ranging from 35 to 75 miles per hour. From their data, they created the model Fuel Efficiency = 32 - 0.1 Speed and created thi
> After keeping track of his heating expenses for several winters, a homeowner believes he can estimate the monthly cost from the average daily Fahrenheit temperature by using the model Cost = 133 - 2.13 Temp. Here is the residuals plot for his data: 1. Interp
> A college admissions officer, defending the college's use of SAT scores in the admissions process, produced the following graph. It shows the mean GPAs for last year's freshmen, grouped by SAT scores. How strong is the evidence that SAT Score is a good predi
> A researcher investigating the association between two variables collected some data and was surprised when he calculated the correlation. He had expected to find a fairly strong association, yet the correlation was near 0. Discouraged, he didn’t bother
> To measure progress in reading ability, students at an elementary school take a reading comprehension test every year. Scores are measured in grade-level units; that is, a score of 4.2 means that a student is reading at slightly above the expected level
> A researcher studying violent behavior in elementary school children asks the children's parents how much time each child spends playing computer games and has their teachers rate each child on the level of aggressiveness they display while playing with ot
> Suppose a researcher studying health issues measures blood pressure and the percentage of body fat for several adult males and finds a strong positive association. Describe three different possible cause-and-effect relationships that might be present.
> The original five points in Exercise 33 produce a regression line with slope 0. Match each of the red points (a-e) with the slope of the line after that one point is added: 1. 0.45 2. 0.30 3. 0.00 4. 0.05 5. 0.85
> The scatterplot shows five blue data points at the left. Not surprisingly, the correlation for these points is r=0. Suppose one additional data point is added at one of the five positions suggested below in red. Match each point (a-e) with the correct new
> Each of the following scatterplots shows a cluster of points and one stray point. For each, answer these questions: 1. In what way is the point unusual? Does it have high leverage, a large residual, or both? 2. Do you think that point is an influential p
> Each of these four scatterplots shows a cluster of points and one stray point. For each, answer these questions: 1. In what way is the point unusual? Does it have high leverage, a large residual, or both? 2. Do you think that point is an influential poin
> In Chapter 6, we saw data on the errors (in nautical miles) made by the National Hurricane Center in predicting the path of hurricanes. The scatterplot below shows the trend in the 24-hour tracking errors since 1970 (www.nhc.noaa.gov). 1. Interpret the s
> The scatterplot below shows the number of passengers at Oakland (CA) airport month by month since 1997 (oaklandairport.com/news/statistics/passenger-history/). 1. Describe the patterns in passengers at Oakland airport that you see in this time plot. 2. U
> In Exercise 22, we examined the percentage of men aged 18-24 who smoked from 1965 to 2014 according to the Centers for Disease Control and Prevention. How about women? Here's a scatterplot showing the corresponding percentages for both men and women along w
> Is there an association between time of year and the nighttime temperature in North Dakota? A researcher assigned the numbers 1-365 to the days January 1-December 31 and recorded the temperature at 2:00 A.M. for each. What might you expect the correlation
> Here's a scatterplot of the production budgets (in millions of dollars) vs. the running time (in minutes) for major release movies in 2005. Dramas are plotted as red x's and all other genres are plotted as blue dots. (The re-make of King Kong is plotted as a
> A student who has created a linear model is disappointed to find that her R² value is a very low 13%. 1. Does this mean that a linear model is not appropriate? Explain. 2. Does this model allow the student to make accurate predictions? Explain.
> In justifying his choice of a model, a student wrote, "I know this is the correct model because R² = 99.4%." 1. Is this reasoning correct? Explain. 2. Does this model allow the student to make accurate predictions? Explain.
> As explained in Exercise 23, the Human Development Index (HDI) is a measure that attempts to summarize in one number the progress in health, education, and economics of a country. The percentage of older people (65 and older) in a country is positively a
> The United Nations Development Programme (UNDP) uses the Human Development Index (HDI) in an attempt to summarize in one number the progress in health, education, and economics of a country (hdr.undp.org/en/data#). In 2015, the HDI was as high as 0.94 fo
> The Centers for Disease Control and Prevention tracks cigarette smoking in the United States (www.cdc.gov/nchs). How has the percentage of people who smoke changed since the danger became clear during the last half of the 20th century? The scatterplot sh
> Is there evidence that the age at which women get married has changed over the past 100 years? The scatterplot shows the trend in age at first marriage for American women (www.census.gov). 1. Is there a clear pattern? Describe the trend. 2. Is the associ
> The data file Receivers 2015 holds information about the 488 NFL players who caught at least one pass during the 2015 football season. A typical 53-man roster has about 13 players who would be expected to catch passes (primarily wide receivers, tight end
> Exercise 41 Chapter 6 looked at a sample of 35 vehicles to examine the relationship between gas mileage and engine displacement. The full dataset holds data on 1211 cars. How well did our sample of 35 represent the underlying relationship between displac
> Consider the four points (200,1950), (400,1650), (600,1800), and (800,1600). The least squares line is ŷ = 1975 - 0.45x. Explain what least squares means, using these data as a specific example.
> A study of traffic delays in 68 U.S. cities found the following relationship between Total Delay (in total hours lost) and Mean Highway Speed: Is it appropriate to summarize the strength of association with a correlation? Explain.
> Consider the four points (10,10), (20,50), (40,20), and (50,80). The least squares line is y=7.0+1.1x. Explain what least squares means, using these data as a specific example.
> Wildlife researchers monitor many wildlife populations by taking aerial photographs. Can they estimate the weights of alligators accurately from the air? Here is a regression analysis of the Weight of alligators (in pounds) and their Length (in inches) b
> In an investigation of environmental causes of disease, data were collected on the annual mortality rate (deaths per 100,000) for males in 61 large towns in England and Wales. In addition, the water hardness was recorded as the calcium concentration (par
> We saw the data for the women's 2016 Olympic heptathlon in Exercise 73. Are the two jumping events associated? Perform a regression of the long-jump results on the high-jump results. 1. What is the regression equation? What does the slope mean? 2. What per
> We discussed the women's 2016 Olympic heptathlon in Chapter 5. Here are the results from the high jump, 800-meter run, and long jump for the 27 women who successfully completed all three events of the heptathlon in the 2016 Olympics: Let's examine the associ
> Would a model that uses the person's Waist size be able to predict the %Body Fat more accurately than one that uses Weight? Using the data in Exercise 71, create and analyze that model.
> It is difficult to determine a person's body fat percentage accurately without immersing him or her in water. Researchers hoping to find ways to make a good estimate immersed 20 male subjects, then measured their waists and recorded their weights shown in
> In Exercise 69, we saw the relationship between CO2 measured at Mauna Loa and average global temperature anomaly from 1959 to 2016. Here is a plot of average global temperatures plotted against the yearly final value of the Dow Jones Industrial Average f
> The earth's climate is getting warmer. The most common theory attributes the increase to an increase in atmospheric levels of carbon dioxide (CO2), a greenhouse gas. Here is a scatterplot showing the mean annual temperature anomaly (the difference between
> The table shows the number of live births per 1000 population in the United States, starting in 1965. (National Center for Health Statistics, www.cdc.gov/nchs/) 1. Make a scatterplot and describe the general trend in Birthrates. (Enter Year as years sinc
> In a study of streams in the Adirondack Mountains, the following relationship was found between the water pH and its hardness (measured in grains): Is it appropriate to summarize the strength of association with a correlation? Explain. (Data in Streams)
> We saw in this chapter that in Tompkins County, New York, older bridges were in worse condition than newer ones. Tompkins is a rural area. Is this relationship true in New York City as well? Here are data on the Condition (as measured by the state Depart
> Numbeo.com lists the cost of living (COL) for 576 cities around the world. It reports the typical cost of a number of staples. Here are a scatterplot and regression relating the cost of a cappuccino to the cost of a third of a liter of water: 1. Using th
> In Exercise 63, you created a model that can estimate the number of Calories in a burger when the Fat content is known. 1. Explain why you cannot use that model to estimate the fat content of a burger with 600 calories. 2. Using an appropriate model, est
> Chicken sandwiches are often advertised as a healthier alternative to beef because many are lower in fat. Tests on 11 brands of fast-food chicken sandwiches produced the following summary statistics and scatterplot from a graphing calculator: 1. Do you t
> In Chapter 6, you examined the association between the amounts of Fat and Calories in fast-food hamburgers. Here are the data: 1. Create a scatterplot of Calories vs. Fat. 2. Interpret the value of R² in this context. 3. Write the equation
> Burger King introduced a meat-free burger in 2002. The nutrition label for the 2014 BK Veggie burger (no mayo) is shown here: (Data in Burger King items) 1. Use the regression model created in this chapter, Fat=8.4+0.91 Protein to predict the fat content
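Two of the exercises listed above give four data points each and ask what "least squares" means. One way to see it, sketched here in Python, is to fit the line from the usual formulas and then confirm that changing the slope only makes the sum of squared residuals larger. The function names `least_squares` and `sum_squared_residuals` are illustrative choices, not something taken from the textbook.

```python
def least_squares(points):
    """Return (intercept, slope) of the least squares line for (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    # The least squares line always passes through (x_bar, y_bar).
    return y_bar - slope * x_bar, slope


def sum_squared_residuals(points, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


# The two four-point data sets quoted in the exercises above.
set_a = [(200, 1950), (400, 1650), (600, 1800), (800, 1600)]
set_b = [(10, 10), (20, 50), (40, 20), (50, 80)]

for pts in (set_a, set_b):
    b0, b1 = least_squares(pts)
    best = sum_squared_residuals(pts, b0, b1)
    # Any other line, such as one with a slightly different slope, does worse.
    nudged = sum_squared_residuals(pts, b0, b1 + 0.05)
    print(f"intercept={b0:.2f} slope={b1:.2f} SSR={best:.1f} nudged SSR={nudged:.1f}")
```

Running this should give an intercept and slope close to 1975 and -0.45 for the first set, and close to 7.0 and 1.1 for the second. In both cases the nudged line has a larger sum of squared residuals, which is the sense in which the fitted line is the "least squares" line.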