The website www.nobelprize.org allows you to look up all the Nobel prizes awarded in any year. The data are not listed in a table. Rather you drag a slider to the year and see a list of the awardees for that year. Describe the Who in this scenario.
> Engineers at a computer production plant tested two methods for accuracy in drilling holes into a PC board. They tested how fast they could set the drilling machine by running 10 boards at each of two different speeds. To assess the results, they measure
> A start-up company has developed an improved electronic chip for use in laboratory equipment. The company needs to project the manufacturing cost, so it develops a spreadsheet model that takes into account the purchase of production equipment, overhead,
> Most water tanks have a drain plug so that the tank may be emptied when it is to be moved or repaired. How long it takes a certain size of tank to drain depends on the size of the plug, as shown in the table. Create a model.
> The dataset Movies 06-15 introduced in the Chapter 3 exercises includes the distributor, number of tickets sold, and gross revenue in addition to the MPAA rating and the genre for each of the 10 years 2006 to 2015. Investigate the associations among the
> The Student survey dataset introduced in the Chapter 3 exercises includes responses to 13 questions. Investigate the associations among the variables that you find interesting. Write a short report on what you discover. Be sure to include summary statist
> The Titanic dataset includes more variables than just those discussed in Chapter 2. Others include such variables as the crew job and where each person boarded the ship. Stories, biographies, and pictures can be found on this site: www.encyclopedia-titan
> The Hopkins Forest dataset includes all 24 weather variables reported by the researchers. Many of the variables (e.g., temperature, relative humidity, solar radiation, wind) are reported as daily averages, minima and maxima. Using any of these variables,
> Is the mean amount of salt higher in menu items that contain meat? 1. Compare the sodium content of the meat and non-meat items with displays and summary statistics. 2. By shuffling the variable Meat 1000 times, investigate whether the mean sodium conten
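
The shuffling step in this exercise amounts to a permutation test. The sketch below shows one way it might be set up, assuming a 0/1 style `Meat` indicator and a one-sided comparison of means. The sodium values and the function name `permutation_test` are hypothetical and are not the exercise's data.

```python
import random
from statistics import mean

def permutation_test(values, labels, n_shuffles=1000, seed=0):
    """Return (observed mean difference, share of shuffles at least that large)."""
    rng = random.Random(seed)

    def diff(lbls):
        # Mean of the labeled group minus mean of the unlabeled group.
        in_group = [v for v, l in zip(values, lbls) if l]
        out_group = [v for v, l in zip(values, lbls) if not l]
        return mean(in_group) - mean(out_group)

    observed = diff(labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # break any link between label and value
        if diff(shuffled) >= observed:
            count += 1
    return observed, count / n_shuffles

# Hypothetical sodium values (mg) for illustration only, NOT the menu data.
sodium = [1200, 980, 1500, 1100, 400, 350, 620, 500]
meat = [True, True, True, True, False, False, False, False]
observed, p_value = permutation_test(sodium, meat)
print(observed, p_value)
```

A small share of shuffles matching or exceeding the observed difference would suggest the difference is unlikely to arise from random labeling alone.
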
> Consumer groups are concerned that cereals with a high sugar content (usually designed for children) are placed just where kids are most likely to see them in the middle shelf of the supermarket. The variable Middle indicates whether the cereal is locate
> Here is a stem-and-leaf display showing profits (in $M) for 30 of the 500 largest global corporations (as measured by revenue). The stems are split; each stem represents a span of 5000 ($M), from a profit of 43,000 ($M) to a loss of 7000 ($M). Use the st
> A company that markets build-it-yourself furniture sells a computer desk that is advertised with the claim "less than an hour to assemble." However, through postpurchase surveys the company has learned that only 25% of its customers succeeded in building t
> In an experiment to determine whether seeding clouds with silver iodide increases rainfall, 52 clouds were randomly assigned to be seeded or not. The amount of rain they generated was then measured (in acre-feet). Here are the summary statistics: 1. Whic
> The Bicycle Helmet Safety Institute website includes a report on the number of bicycle fatalities per year in the United States. The table below shows the counts for the years 1994–2015. 1. What are the W's for these data? 2. Display the data in a stem-and
> Consider again the Pew Research Center results on age and political party in Exercise R1.33. 1. What is the marginal distribution of party affiliation? 2. Create segmented bar graphs displaying the conditional distribution of party affiliation for each
> According to the Bureau of Labor Statistics, the mean hourly wage for Chief Executives in 2009 was $80.43 and the median hourly wage was $77.27. By contrast, for General and Operations Managers, the mean hourly wage was $53.15 and the median was $44.55.
> The Pew Research Center conducts surveys regularly asking respondents which political party they identify with or lean toward. Among their results is the following table relating preferred political party and age. 1. What percent of people surveyed were
> Horsepower is another measure commonly used to describe auto engines. Here are the summary statistics and histogram displaying horsepowers of the same group of 38 cars discussed in Exercise R1.31. 1. Describe the shape, center, and spread of this distrib
> One measure of the size of an automobile engine is its displacement, the total volume (in liters or cubic inches) of its cylinders. Summary statistics for several models of new cars are shown. These displacements were measured in cubic inches. 1. How man
> Consider again the data on birth order and college majors in Exercise R1.28. 1. What is the marginal distribution of majors? 2. What is the conditional distribution of majors for the oldest children? 3. What is the conditional distribution of majors for
> Researchers for the Herbal Medicine Council collected information on people's experiences with a new herbal remedy for colds. They went to a store that sold natural health products. There they asked 100 customers whether they had taken the cold remedy and,
> Is your birth order related to your choice of major? A statistics professor at a large university polled his students to find out what their majors were and what position they held in the family birth order. The results are summarized in the table. 1. Wh
> Here are the number of pieces of mail received at a school office for 36 days. 1. Plot these data. 2. Find appropriate summary statistics. 3. Write a brief description of the school mail deliveries. 4. What percent of the days actually lie within one sta
> A class of fourth graders takes a diagnostic reading test, and the scores are reported by reading grade level. The 5-number summaries for the 14 boys and 11 girls are shown: 1. Which group had the highest score? 2. Which group had the greater range? 3. W
> Is it a good idea to listen to music when studying for a big test? In a study conducted by some statistics students, 62 people were randomly assigned to listen to rap music, Mozart, or no music while attempting to memorize objects pictured on a page. The
> Avoiding an accident when driving can depend on reaction time. That time, measured from the moment the driver first sees the danger until he or she steps on the brake pedal, is thought to follow a Normal model with a mean of 1.5 seconds and a standard de
> Babe Ruth was the first great slugger in baseball. His record of 60 home runs in one season held for 34 years until Roger Maris hit 61 in 1961. Mark McGwire (with the aid of steroids) set a new standard of 70 in 1998. Listed below are the home run totals
> A study in South Africa focusing on the impact of health insurance identified 1590 children at birth and then sought to conduct follow-up health studies 5 years later. Only 416 of the original group participated in the 5-year follow-up study. This made r
> The times of skaters in the qualifying heats for the women's short track race at the 2018 Olympics in PyeongChang are given in the table below. 1. The mean finishing time was 45.075 seconds, with a standard deviation of 4.50 seconds. If the Normal model is
> Is the Statue of Liberty's nose too long? Her nose measures 4′6″, but she is a large statue, after all. Her arm is 42 feet long. That means her arm is 42/4.5 = 9.3 times as long as her nose. Is that a reasonable ratio? Shown in the ta
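
The ratio quoted in this exercise can be checked quickly. This is a minimal sketch that assumes the nose length reads as 4 feet 6 inches (4.5 feet), which matches the 42/4.5 in the text. The variable names are illustrative.

```python
# Convert the nose length to feet: 4 ft 6 in = 4.5 ft (assumed reading).
nose_ft = 4 + 6 / 12
arm_ft = 42

ratio = arm_ft / nose_ft
print(round(ratio, 1))  # 9.3
```
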
> The National Highway Traffic Safety Administration reported that there were 3206 fatal accidents involving drivers between the ages of 15 and 19 years old the previous year, of which 65.5% involved male drivers. Of the male drivers, 18.4% involved drinki
> Does the duration of an eruption have an effect on the length of time that elapses before the next eruption? 1. The histogram below shows the duration (in minutes) of those 222 eruptions. Describe this distribution. 2. Explain why it is not appropriate t
> It is a common belief that Yellowstone's most famous geyser erupts once an hour at very predictable intervals. The histogram below shows the time gaps (in minutes) between 222 successive eruptions. Describe this distribution.
> Average daily temperatures in January and July for 60 large U.S. cities are graphed in the histograms below. (Data in City climate) 1. What aspect of these histograms makes it difficult to compare the distributions? 2. What differences do you see between
> The Framingham Heart Study recorded the cholesterol levels of more than 1400 participants. (Data in Framingham) Here is an ogive of the distribution of these cholesterol measures. (An ogive shows the percentage of cases at or below a certain value.) Cons
> Which of these scatterplots show 1. little or no association? 2. a negative association? 3. a linear association? 4. a moderately strong association? 5. a very strong association?
> The dataset from England and Wales also notes for each town whether it was south or north of Derby. Here are some summary statistics and a comparative boxplot for the two regions. 1. What is the overall mean mortality rate for the two regions? 2. Do you
> In an investigation of environmental causes of disease, data were collected on the annual mortality rate (deaths per 100,000) for males in 61 large towns in England and Wales. In addition, the water hardness was recorded as the calcium concentration (par
> Progressive Insurance asked customers who had been involved in auto accidents how far they were from home when the accident happened. The data are summarized in the table. 1. Create an appropriate graph of these data. 2. Do these data indicate that drivi
> You pick a card from a standard deck and record its denomination (7, say) and its suit (maybe spades). 1. Is the variable suit categorical or quantitative? 2. Name a game you might be playing for which you would consider the variable denomination to be c
> A study by the Pew Internet & American Life Project found that 78% of U.S. residents over 16 years old read a book in the past 12 months. They also found that 21% had read an e-book using a reader or computer during that period. A newspaper reporting on
> One Thursday, researchers gave students enrolled in a section of basic Spanish a set of 50 new vocabulary words to memorize. On Friday, the students took a vocabulary test. When they returned to class the following Monday, they were retested without advan
> As part of the course work, a class at an upstate NY college collects data on streams each year. Students record a number of biological, chemical, and physical variables, including the stream name, the substrate of the stream (limestone (L), shale (S), o
> A credit card bank is investigating the incidence of fraudulent card use. The bank suspects that the type of product bought may provide clues to the fraud. To examine this situation, the bank looks at the North American Industry Classification System (NA
> Based on long-term investigation, researchers have suggested that the acidity (pH) of rainfall in the Shenandoah Mountains can be described by the Normal model N(4.9,0.6). 1. Draw and carefully label the model. 2. What percent of storms produce rainfall
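
For labeling the model, the 68-95-99.7 landmarks can be computed directly. The sketch below assumes N(4.9, 0.6) means a mean of 4.9 and a standard deviation of 0.6, and uses Python's standard-library `statistics.NormalDist`. No cutoff from the truncated question is assumed.

```python
from statistics import NormalDist

# Assumed reading of N(4.9, 0.6): mean 4.9, standard deviation 0.6.
model = NormalDist(mu=4.9, sigma=0.6)

# Axis landmarks at the mean and 1, 2, and 3 standard deviations either side.
landmarks = [round(model.mean + k * model.stdev, 1) for k in range(-3, 4)]
print(landmarks)  # [3.1, 3.7, 4.3, 4.9, 5.5, 6.1, 6.7]

# Share of storms within one standard deviation of the mean (about 68%).
within_one_sd = model.cdf(5.5) - model.cdf(4.3)
print(round(within_one_sd, 3))
```

The same `model.cdf(x)` call gives the proportion of storms with pH at or below any value `x`, so `1 - model.cdf(x)` gives the proportion above it.
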
> Public relations staff members at State U phoned 850 local residents. After identifying themselves, the callers asked the survey participants their ages, whether they had attended college, and whether they had a favorable opinion of the university. The o
> How fast do horses run? Kentucky Derby winners run well over 30 miles per hour, as shown in this graph. The graph shows the percentage of Derby winners that have run slower than each given speed. Note that few have won running less than 33 miles per hour
> Clarksburg Bakery is trying to predict how many loaves to bake. In the past 100 days, they have sold between 95 and 140 loaves per day. Here is a histogram of the number of loaves they sold for the past 100 days. 1. Describe the distribution. 2. Which sh
> Facebook uploads more than 350 million photos every day onto its servers. For this collection, describe the Who and the What.
> The National Center for Health Statistics (NCHS) conducts an extensive survey consisting of an interview and medical examination with a representative sample of about 5000 people a year. The interview includes demographic, socioeconomic, dietary, and oth
> Sports announcers love to quote statistics. During the Super Bowl, they particularly love to announce when a record has been broken. They might have a list of all Super Bowl games, along with the scores of each team, total scores for the two teams, margi
> Satellites send back nearly continuous data on the earth's land masses, oceans, and atmosphere from space. How might researchers use this information in both the short and long terms to help study changes in the earth's climate?
> Sensors in parking lots are able to detect and communicate when spaces are filled in a large covered parking garage next to an urban shopping mall. How might the owners of the parking garage use this information both to attract customers and to help the
> Online retailers such as Amazon.com keep data on products that customers buy, and even products they look at. What does Amazon hope to gain from such information?
> Many grocery store chains offer customers a card they can scan when they check out and offer discounts to people who do so. To get the card, customers must give information, including a mailing address and e-mail address. The actual purpose is not to rew
> Here is the ANOVA table for the cookie experiment of Exercise 2 along with an interaction plot. What does the interaction term say about the cookie recipes?
> Here are the summary statistics for Verbal SAT scores for a high school graduating class: 1. Create side-by-side boxplots comparing the scores of boys and girls as best you can from the information given. 2. Write a brief report on these results. Be sure
> Here is an ANOVA table with an interaction term and the corresponding interaction plot for the TV watching data of Exercise 1. What does the interaction term mean here?
> The student performing the chocolate chip cookie experiment of Exercise 2 planned to analyze his results with an Analysis of Variance on two factors. Here are some displays. Do you think the assumptions for ANOVA are satisfied?
> The TV watching study of Exercise 1 was collected as a survey of students at a small college. Do the assumptions of ANOVA appear to be met? Here are some displays to help in your decision:
> A student performed an experiment to compare chocolate chip cookie recipes. He baked batches of cookies with different amounts of Sugar: 0.5, 0.375, and 0.25 cups, and with different kinds of Chips: milk, semisweet, and dark chocolate. Cookie quality was
> In the previous chapter we considered TV watching by male and female student athletes. In that example, we categorized the students into four groups, but now we have seen that these data could be analyzed with two factors, Sex and Athlete. Write the ANOV
> A bank is studying the time that it takes 6 of its tellers to serve an average customer. Customers line up in the queue and then go to the next available teller. Here is a boxplot of the last 200 customers and the times it took each teller: 1. What are t
> Here are case prices (in dollars) of wines produced by wineries along three of the Finger Lakes. 1. What null and alternative hypotheses would you test for these data? Talk about prices and location, not symbols. 2. Do the conditions for an ANOVA seem to
> Here are boxplots that show the relationship between the number of cylinders a car's engine has and the car's fuel economy for a sample of cars. 1. State the null and alternative hypotheses that you might consider for these data. 2. Do the conditions for an
> A student performed an experiment with three different grips to see what effect it might have on the distance of a backhanded Frisbee throw. She tried it with her normal grip, with one finger out, and with the Frisbee inverted. She measured in paces how
> To shorten the time it takes him to make his favorite pizza, a student designed an experiment to test the effect of sugar and milk on the activation times for baking yeast. Specifically, he tested four different recipes and measured how many seconds it t
> A student study of the effects of caffeine asked volunteers to take a memory test 2 hours after drinking soda. Some drank caffeine-free cola, some drank regular cola (with caffeine), and others drank a mixture of the two (getting a half-dose of caffeine)
> A student interested in improving her dart-throwing technique designs an experiment to test 4 different stances to see whether they affect her accuracy. After warming up for several minutes, she randomizes the order of the 4 stances, throws a dart at a t
> A student runs an experiment to study the effect of three different mufflers on gas mileage. He devises a system so that his Jeep Wagoneer uses gasoline from a one-liter container. He tests each muffler 8 times, carefully recording the number of miles he
> A figure skater tried various approaches to her Salchow jump in a designed experiment using 5 different places for her focus (arms, free leg, midsection, takeoff leg, and free). She tried each jump 6 times in random order, using two of her skating partne
> An intern from the marketing department at the Holes R Us online piercing salon has recently finished a study of the company's 500 customers. He wanted to know whether the mean ZIP code of customers purchasing different products varied according to the las
> A survey of 1021 school-age children was conducted by randomly selecting children from several large urban elementary schools. Two of the questions concerned eye and hair color. In the survey, the following codes were used: The students analyzing the dat
> A researcher investigated four different word lists for use in hearing assessment. She wanted to know whether the lists were equally difficult to understand in the presence of a noisy background. To find out, she tested 96 subjects with normal hearing ra
> A student runs an experiment to test four different popcorn brands, recording the number of kernels left un-popped. She pops measured batches of each brand 4 times, using the same popcorn popper and randomizing the order of the brands. After collecting h
> Joe wants to impress his boss. He builds a regression model to predict sales that has 20 predictors and an R² of 80%. Sally builds a competing model with only 5 predictors, but an R² of only 78%. Which model is likely to be most useful for understanding
> In a regression to predict compensation of employees in a large firm, the predictors in the regression were Years with the Firm, Age, and Years of Experience. The coefficient of Age is negative and statistically significantly different from zero. Does th
> For each of the following cases, would your primary concern about them be that they had a large residual, large leverage, or likely large influence on the regression model? 1. In a regression to predict freshman grade point averages as part of the admiss
> Here are summary statistics for the sizes (in acres) of a collection of vineyards in the Finger Lakes region of New York State: Suppose you didn’t have access to the data. Answer the following questions from the summary statistics alone
> For each of the following cases, would your primary concern about them be that they had a large residual, large leverage, or likely large influence on the regression model? Explain your thinking. 1. In a regression to predict the construction cost of rol
> 1. Look up additional nutrition information about the BK items and combine a file holding that information with the existing BK data. 2. Define the new variable Fat/Carb as the ratio of Fat grams to Carbohydrate grams in each BK item
> 1. In the Burger King items data, use one of the variables to separate the items containing meat from the items that do not contain meat, and analyze those separately. 2. Combine data about McDonald's menu items using the same variables with the data from
> Use the information in Exercise 1 to test the hypotheses H0: β1 = 0 vs. HA: β1 ≠ 0. What do you conclude about the relationship between earnings and SAT scores?
> Shoot to score, another one. Returning to the results of Exercise 2, write a sentence to explain the meaning of the standard error of the slope of the regression line, SE(b1) = 0.0125, and the corresponding P-value.
> Continuing with the regression of Exercise 1, write a sentence that explains the meaning of the standard error of the slope of the regression line, SE(b1)=1.545, and the corresponding P-value.
> Using the regression output from Exercise 2, identify the residual standard deviation and explain its meaning with a sentence in context.
> Using the regression output in Exercise 1, identify the residual standard deviation and explain what it means in the context of the problem.
> Discuss the assumptions and conditions necessary for proceeding with the regression analysis in Exercise 2. Do you think the conditions are satisfied?
> Discuss the assumptions and conditions necessary for proceeding with the regression analysis in Exercise 1. Do you think the conditions are satisfied?
> A survey of major universities asked what percentage of incoming freshmen usually graduate on time in 4 years. Use the summary statistics given to answer the questions that follow. 1. Would you describe this distribution as symmetric or skewed? Explain.
> A college hockey coach collected data from the 2016–2017 National Hockey League season. He hopes to convince his players that the number of shots taken has an effect on the number of goals scored. The coa
> The coach we’ve been following wants to predict how many goals each of his players will score this season. Explain why a model like the ones we’ve made won’t be very successful at doing that.
> Naturally, you would like to know what you are going to earn in the next few years. Explain why a regression model such as the ones we have found won’t do a very good job of such a prediction. (Sorry.)
> Continuing from Exercise 14, the coach responds to the players by claiming that shooting accuracy is more important than time on the ice. He adds Shoot% (% of shots on goal) to the model. Response variable is: Goals R squared = 95.7%, s = 0.8850, with 65 − 4 = 61 d
> A second predictor in Exercise 13 improved the regression model of Exercise 1, so let's try a third. Here's a model with average ACT score of the entering class included: Response variable is: Earn R squared = 36.5%, s = 5372, with 687 − 4 = 683 degrees of freedom 1. T
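
The degrees-of-freedom figure in this output follows the usual pattern for a regression with an intercept: cases minus predictors minus one. The sketch below assumes the garbled output reads 687 − 4 = 683 with three predictors, and that the hockey model in the preceding exercise likewise has 65 cases and three predictors. The helper name `residual_df` is illustrative.

```python
def residual_df(n_cases: int, n_predictors: int) -> int:
    """Residual degrees of freedom for a regression that includes an intercept."""
    return n_cases - n_predictors - 1

# Assumed counts: 687 cases and 3 predictors for the earnings model,
# 65 cases and 3 predictors for the hockey model.
earn_df = residual_df(687, 3)
goals_df = residual_df(65, 3)
print(earn_df, goals_df)  # 683 61
```
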
> The players on the team in Exercise 2 point out to the coach that they can’t shoot if they are not on the ice. They add the variable TimeOnIce/Game (TOI/G) (in minutes per game) to the regression: (Reminder: if you are using the full da
> Continuing with the data from Exercise 1, here's a regression with the percent of students who receive merit-based financial aid included in the model: Response variable is: Earn R squared = 35.5% 1. Write the regression model. 2. What is the interpretation
> The coach in Exercise 2 found a 95% confidence interval for the slope of his regression line. Recall that he is trying to understand how the number of goals scored is related to shots taken. Interpret with a sentence the meaning of the interval 0.099267±
> Construct a 95% confidence interval for the slope of the regression line in Exercise 1. Interpret the meaning of the interval. Be sure to state it in the context of the data and the question about the data.
> What can the hockey coach in Exercise 2 conclude about shooting and scoring goals from the fact that the P-value < 0.0001 for the slope of the regression line? Write a sentence in context.
> A survey of 1021 school-age children was conducted by randomly selecting children from several large urban elementary schools. Two of the questions concerned eye and hair color. In the survey, the following codes were used: The statistics students analyz