The managers of a large company wished to know the percentage of employees who feel extremely satisfied to work there. The company has roughly 24,000 employees. They contacted a random sample of employees and asked them about their job satisfaction, obtaining 437 completed responses. How does their study deal with the three Big Ideas of sampling?
> A citrus farmer has observed the following distribution for the number of oranges per tree. How many oranges does he expect on average?
> Facebook reports that 70% of its users are from outside the United States and that 50% of its users log on to Facebook every day. Suppose that 20% of its users are U.S. users who log on every day. Make a probability table. Why is a table better than a tr
> If the sex of a child is independent of all other births, is the probability of a woman giving birth to a girl after having four boys greater than it was on her first birth? Explain.
> On the Titanic, the probability of survival was 0.323. Among first-class passengers, it was 0.625. Were survival and ticket class independent? Explain.
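
One way to approach the Titanic question above: if survival and ticket class were independent, the survival probability among first-class passengers would equal the overall survival probability. A minimal sketch of that comparison follows; the function name `looks_independent` and the tolerance are illustrative assumptions, not part of the exercise.

```python
def looks_independent(p_a: float, p_a_given_b: float, tol: float = 1e-9) -> bool:
    """Return True when P(A | B) matches P(A) within a small tolerance."""
    return abs(p_a - p_a_given_b) <= tol


# Figures quoted in the exercise.
p_survive = 0.323              # P(survived)
p_survive_given_first = 0.625  # P(survived | first class)

independent = looks_independent(p_survive, p_survive_given_first)
print(independent)
```

Because the two probabilities differ, the check reports that the events do not look independent.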
> A nervous kicker usually makes 70% of his first field goal attempts. If he makes his first attempt, his success rate rises to 90%. What is the probability that he makes his first two kicks?
> A student figures that he has a 30% chance of being let out of class late. If he leaves class late, there is a 45% chance that he will miss his train. What is the probability that it will cause him to miss the train?
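
The kicker and late-train questions above both ask for the probability that two events happen in sequence, which the general multiplication rule gives as P(A and B) = P(A) × P(B | A). A short sketch, with the helper name `joint_probability` chosen here purely for illustration:

```python
def joint_probability(p_first: float, p_second_given_first: float) -> float:
    """General multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_first * p_second_given_first


# Kicker: makes the first attempt (0.70), then the second given the first (0.90).
p_two_kicks = joint_probability(0.70, 0.90)

# Student: leaves class late (0.30), then misses the train given he is late (0.45).
p_late_and_miss = joint_probability(0.30, 0.45)

print(round(p_two_kicks, 3), round(p_late_and_miss, 3))
```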
> From Exercise 3, if someone doesn’t like to watch basketball, what is the probability that she will be a football fan?
> What is the probability that a person likes to watch football, given that she also likes to watch basketball?
> Forty-five percent of Americans like to cook and 59% of Americans like to shop, while 23% enjoy both activities. What is the probability that a randomly selected American either enjoys cooking or shopping or both?
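
The cooking and shopping question above is an application of the general addition rule, P(A or B) = P(A) + P(B) − P(A and B), which subtracts the overlap so it is not counted twice. A minimal sketch, where `union_probability` is a hypothetical helper name:

```python
def union_probability(p_a: float, p_b: float, p_both: float) -> float:
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


# Percentages from the exercise, written as proportions.
p_cook_or_shop = union_probability(0.45, 0.59, 0.23)
print(round(p_cook_or_shop, 2))
```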
> Given the probabilities in Exercise 12, what is the probability that a person is younger than 50 given that she uses online banking? Has the probability that she is younger than 50 increased or decreased with the additional information?
> Sugar is a major ingredient in many breakfast cereals. The histogram displays the sugar content as a percentage of weight for 49 brands of cereal. The boxplots compare sugar content for adult and children’s cereals. 1. What is the range of the sugar conten
> Given the probabilities in Exercise 11, what is the probability that a person is from the United States given that he logs on to Facebook every day? Has the probability that he is from the United States increased or decreased with the additional informat
> Suppose that the information in Exercise 10 had been presented in the following way. A national survey of bank customers finds that 40% are younger than 50. Of those younger than 50, 5 of 8 conduct their banking online. Of those older than 50, only 1 of
> Suppose that the information in Exercise 9 had been presented in the following way. Facebook reports that 70% of its users are from outside the United States. Of the U.S. users, two-thirds log on every day. Of the non-U.S. users, three-sevenths log on ev
> A national survey indicated that 30% of adults conduct their banking online. It also found that 40% are younger than 50, and that 25% are younger than 50 and conduct their banking online. Make a probability table. Why is a table better than a tree here?
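
For the online banking question above, the two marginal probabilities and the joint probability are enough to fill every cell of a 2×2 probability table by subtraction. The sketch below does this; the function name `two_way_table` and the generic labels `A` and `B` are assumptions made for illustration (here A = banks online, B = younger than 50).

```python
def two_way_table(p_a: float, p_b: float, p_a_and_b: float) -> dict:
    """Fill a 2x2 probability table from P(A), P(B), and P(A and B)."""
    cells = {
        ("A", "B"): p_a_and_b,
        ("A", "not B"): p_a - p_a_and_b,
        ("not A", "B"): p_b - p_a_and_b,
        ("not A", "not B"): 1 - p_a - p_b + p_a_and_b,
    }
    # Negative cells would mean the three inputs are inconsistent.
    if any(value < -1e-12 for value in cells.values()):
        raise ValueError("inputs do not describe a valid probability table")
    return cells


# A = conducts banking online, B = younger than 50.
table = two_way_table(p_a=0.30, p_b=0.40, p_a_and_b=0.25)
for labels, value in table.items():
    print(labels, round(value, 2))
```

A tree would need conditional probabilities as its branch labels, whereas this exercise supplies marginal and joint probabilities, which drop straight into a table.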
> Suppose that 25% of people have a dog, 29% of people have a cat, and 12% of people own both. What is the probability that someone owns a dog or a cat?
> The survey by the National Center for Health Statistics further found that 49% of adults ages 25–29 had only a cell phone and no landline. We randomly select four 25–29-year-olds: 1. What is the probability that all of these adults have only a cell pho
> A 2010 study conducted by the National Center for Health Statistics found that 25% of U.S. households had no landline service. This raises concerns about the accuracy of certain surveys, as they depend on random-digit dialing to households via landlines.
> Your list of favorite songs contains 10 rock songs, 7 rap songs, and 3 country songs. 1. What is the probability that a randomly played song is a rap song? 2. What is the probability that a randomly played song is not country?
> In your dresser are five blue shirts, three red shirts, and two black shirts. 1. What is the probability of randomly selecting a red shirt? 2. What is the probability that a randomly selected shirt is not black?
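
The song and shirt questions above both assume every item is equally likely to be picked, so each probability is a count of favorable items divided by the total. A small sketch using exact fractions; the helper name `probability` and the dictionary layout are illustrative choices, not part of the exercises.

```python
from fractions import Fraction


def probability(counts: dict, favorable: set) -> Fraction:
    """P(event) = favorable items / all items, assuming equally likely picks."""
    total = sum(counts.values())
    hits = sum(n for kind, n in counts.items() if kind in favorable)
    return Fraction(hits, total)


songs = {"rock": 10, "rap": 7, "country": 3}
shirts = {"blue": 5, "red": 3, "black": 2}

p_rap = probability(songs, {"rap"})
p_not_country = probability(songs, {"rock", "rap"})   # complement of country
p_red = probability(shirts, {"red"})
p_not_black = probability(shirts, {"blue", "red"})    # complement of black

print(p_rap, p_not_country, p_red, p_not_black)
```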
> After rolling doubles on a pair of dice three times in a row, your friend exclaims, “I can’t get doubles four times in a row!” Explain why this thinking is incorrect.
> The Men’s Giant Slalom skiing event consists of two runs whose times are added together for a final score. Two displays of the giant slalom times in the 2018 Winter Olympics at PyeongChang are shown below. 1. What features of the distribution can you see
> Suppose you were to collect data for each pair of variables. You want to make a scatterplot. Which variable would you use as the explanatory variable and which as the response variable? Why? What would you expect to see in the scatterplot? Discuss the li
> Your friend says: “I flipped five heads in a row! The next one has to be tails!” Explain why this thinking is incorrect.
> Rolling a fair six-sided die is supposed to randomly generate the numbers 1 through 6. Explain what random means in this context.
> Flipping a fair coin is said to randomly generate heads and tails with equal probability. Explain what random means in this context.
> Is the experiment of Exercise 3 blind? Can it be double-blind? Explain.
> For the experiment of Exercise 4, discuss variables that could be controlled or that could not be controlled. Is the experiment randomized and replicated?
> For the experiment of Exercise 3, name some variables the driver did or should have controlled. Was the experiment randomized and replicated?
> For the experiment described in Exercise 4, name the factor and its levels. How might the response be measured?
> For the experiment described in Exercise 3, list the factor, the levels, and the response variable.
> You want to compare the tastiness and juiciness of tomatoes grown with three amounts of a new fertilizer: none, half the recommended amount, and the full recommended amount. You allocate 6 tomato plants to receive each amount of fertilizer, assigning the
> A pizza delivery driver, always trying to increase tips, runs an experiment on his next 40 deliveries. He flips a coin to decide whether or not to call a customer from his mobile phone when he is five minutes away, hoping this slight bump in customer ser
> Crowd Management Strategies (www.crowdsafe.com) monitors accidents at rock concerts. In their database, they list the names and other variables of victims whose deaths were attributed to crowd crush at rock concerts. Here are the histogram and boxplot of
> A business student conjectures that the Internet caused companies to become more profitable, since many transactions previously handled face-to-face could now be completed online. The student compares earnings from a sample of companies from the 1980s to
> What factors might confound the results of the experiment in Exercise 4?
> For the experiment of Exercise 3, name some confounding variables that might influence the experiment results.
> To obtain enough plants for the tomato experiment of Exercise 4, experimenters have to purchase plants from two different garden centers. They then randomly assign the plants from each garden center to all three fertilizer treatments. Is the experiment b
> The driver of Exercise 3 wants to know about tipping in general. So he recruits several other drivers to participate in the experiment. Each driver randomly decides whether to phone customers before delivery and records the tip percentage. Is this experi
> If the tomato taster doesn’t know how the tomatoes have been treated, is the experiment single- or double-blind? How might the blinding be improved further?
> The 1990s and early 2000s could be considered the steroids era in Major League Baseball, as many players have admitted to using the drug to increase performance on the field. If a sports writer wanted to compare home run totals from the steroids era to a
> What problems do you see with asking the following question of students? “Are you the first member of your family to seek higher education?”
> For each scenario, determine the sampling method used by the managers from Exercise 2. 1. Use the company e-mail directory to contact 150 employees from among those employed for less than 5 years, 150 from among those employed for 5–10 years, and 150 from
> For each scenario, identify the kind of sample used by the university administrators from Exercise 1: 1. Select several dormitories at random and contact everyone living in the selected dorms. 2. Using a computer-based list of registered students, contac
> Here are the same three prices as in Exercise 15 but for 576 cities around the world. (Prices are all in US$ as of August 2016; data in COLall 2016.) 1. In general, which commodity is the most expensive? 2. Is a carton of eggs ever more expensive than a
> A company hoping to assess employee satisfaction surveys employees by assigning computer-generated random numbers to each employee on a list of all employees and then contacting all those whose assigned random number is divisible by 7. Is this a simple r
> A professor teaching a large lecture class of 350 students samples her class by rolling a die. Then, starting with the row number on the die (1 to 6), she passes out a survey to every fourth row of the large lecture hall. She says that this is a simple r
> The company’s annual report states, “Our survey shows that 87.34% of our employees are very happy working here.” Comment on that claim. Use appropriate statistics terminology.
> The president of the university plans a speech to an alumni group. He plans to talk about the proportion of students who responded in the survey that they are the first in their family to attend college, but the first draft of his speech treats that prop
> The company of Exercise 2 is considering ways to survey their employees. For each of these proposed designs, identify the problem. 1. Leave a stack of surveys out in the employee cafeteria so people can pick them up and return them. 2. Stuff a questionna
> The university administration of Exercise 1 is considering a variety of ways to sample students for a survey. For each of these proposed survey designs, identify the problem. 1. Publish an advertisement inviting students to visit a website and answer que
> The company plans to have the head of each corporate division hold a meeting of their employees to ask whether they are happy on their jobs. They will ask people to raise their hands to indicate whether they are happy. What problems do you see with this
> Administrators at Texas A&M University were interested in estimating the percentage of students who are the first in their family to go to college. The A&M student body has about 46,000 members. How might the administrators answer their question by apply
> For each of these potential predictor variables, say whether they should be represented in a regression model by indicator variables. If so, then suggest what specific indicators should be used (that is, what values they would have). 1. In a regression t
> To help travelers know what to expect, researchers collected the prices of commodities in 16 cities throughout the world. Here are boxplots comparing the average prices of a bottle of water, a dozen eggs, and a cappuccino in the 16 cities (prices are all
> In Chapters 4 and 6 we looked at data from the Hopkins Forest. Here’s a regression that models the maximum daily wind speed in terms of the average temperature and precipitation: Response variable is: Max wind (mph) R-squar
> Look back at the regression in Exercise 3. Here is the partial regression plot for the coefficient of Budget. 1. What is the slope of the least squares regression line in the partial regression plot? 2. The point plotted with a red x is the movie Avatar,
> For the movies regression in Exercise 3, here is a histogram of the residuals. What does it tell us about the assumptions and conditions below? 1. Linearity Condition 2. Nearly Normal Condition 3. Equal Spread Condition
> For the movies examined in Exercise 3, here is a scatterplot of USGross vs. Budget: What (if anything) does this scatterplot tell us about the following assumptions and conditions for the regression? 1. Linearity Condition 2. Equal Spread Condition 3. No
> A middle manager at an entertainment company, upon seeing the analysis of Exercise 3, concludes that longer movies make more money. He argues that his company’s films should all be padded by 30 minutes to improve their gross. Explain the flaw in his interp
> What can predict how much a motion picture will make? We have data on 609 recent releases that includes the USGross (in $M), the Budget ($M), the Run Time (minutes), and the score given by the critics on the Rotten Tomatoes website. The first several ent
> The dataset Grades shows the five scores from an introductory statistics course. Find a model for final exam score by trying all possible models with two predictor variables. Which model would you choose? Be sure to check the conditions for multiple regr
> We saw a model in Exercise 24 for the calorie count of a breakfast cereal. Can we predict the calories of a serving from its vitamin and mineral content? Here’s a multiple regression model of Calories per serving on its Sodium (mg), Potassium (mg), and Sug
> We saw in Chapter 7 that the calorie content of a breakfast cereal is linearly associated with its sugar content. Is that the whole story? Here’s the output of a regression model that regresses Calories per serving on each serving’s Protein(g), Fat(g), Fiber
> The dataset on body fat contains 15 body measurements on 250 men from 22 to 81 years old. Is average %Body Fat related to Weight? Here’s a scatterplot: And here’s the simple regression: Dependent variable is: Pct BF R-squared = 38.1% s = 6.538 1. What does t
> Find data on the Internet (or elsewhere) for two or more groups. Make appropriate displays to compare the groups, and interpret what you find.
> A large section of Stat 101 was asked to fill out a survey on grade point average and SAT scores. A regression was run to find out how well Math and Verbal SAT scores could predict academic performance as measured by GPA. The regression was run on a comp
> The AFL-CIO has undertaken a study of the yearly salaries (in thousands of dollars) of 30 administrative assistants. The organization wants to predict salaries from several other variables. The variables considered to be potential predictors of salary ar
> Here are some diagnostic plots for the home prices data from Exercise 17. These were generated by a computer package and may look different from the plots generated by the packages you use. (In particular, note that the axes of the Normal probability plo
> A candy maker surveyed chocolate bars available in a local supermarket and found the following least squares regression model: Calories=28.4+11.37 Fat(g)+2.91 Sugar(g). 1. The hand-crafted chocolate she makes has 15 g of fat and 20 g of sugar. How many
> Here are some diagnostic plots for the final exam data from Exercise 13. These were generated by a computer package and may look different from the plots generated by the packages you use. (In particular, note that the axes of the Normal probability plot
> Here is the regression for the women’s records for the same Scottish hill races we considered in Exercise 14: Dependent variable is: Women Time (mins) R-squared = 96.7% s = 10.06 1. Compare the regression model for the women’s records with that found for the
> Many variables have an impact on determining the price of a house. Among these are Living Area of the house (square feet) and number of Bathrooms. Information for a random sample of homes for sale in the Statesboro, GA, area was obtained from the Interne
> A student collected nutrition data about candy bars by reading the labels in a supermarket. Because candy bars have different serving sizes, the data are given as values per serving. Here is a regression predicting calories from the sugar (g/serving). (F
> Several exercises in Chapter 7 showed that attendance at American League baseball games increases with the number of runs scored. But fans may respond more to winning teams than to high-scoring games. Here is a regression of average
> Hill running (races up and down hills) has a written history in Scotland dating back to the year 1040. Races are held throughout the year at different locations around Scotland. A recent compilation of information for 90 races (for which full information w
> Find an article in a newspaper, a magazine, or the Internet that compares two or more groups of data. 1. Does the article discuss the W’s? 2. Is the chosen display appropriate? Explain. 3. Discuss what the display reveals about the groups. 4. Does the arti
> How well do exams given during the semester predict performance on the final? One class had three tests during the semester. Computer output of the regression gives Dependent variable is: Final s = 13.46 R-Sq = 77.7% 1. Write the equation of the regressi
> A household appliance manufacturer wants to analyze the relationship between total sales and the company’s three primary means of advertising (television, magazines, and radio). All values were in millions of dollars. They found the regression equation Sal
> A regression performed to predict the selling price of houses found the equation Price = 169,328 + 35.3 Area + 0.718 Lotsize − 6543 Age where Price is in dollars, Area is in square feet, Lotsize is in square feet, and Age is in years. The R² is 92%. One of the
> For each of these potential predictor variables say whether they should be represented in a regression model by indicator variables. If so, then suggest what specific indicators should be used (that is, what values they would have). 1. In a regression to
> The following regression model was found for the houses in upstate New York considered in the chapter: Price = 20,986.09 − 7483.10 Bedrooms + 93.84 Living Area. 1. Find the predicted price of a 2-bedroom, 1000-sq-ft house from this model. 2. The house just so
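
Part 1 of the upstate New York question above is a direct substitution into the fitted equation. The sketch below assumes the Bedrooms coefficient is negative, that is, Price = 20,986.09 − 7483.10 Bedrooms + 93.84 Living Area, because the sign appears to have been lost in the text as extracted; treat that as an assumption. The function name `predicted_price` is hypothetical.

```python
def predicted_price(bedrooms: int, living_area_sqft: float) -> float:
    """Evaluate the fitted model, assuming a negative Bedrooms coefficient."""
    intercept = 20_986.09
    bedrooms_coef = -7_483.10    # ASSUMPTION: minus sign restored
    living_area_coef = 93.84
    return intercept + bedrooms_coef * bedrooms + living_area_coef * living_area_sqft


# A 2-bedroom, 1000-sq-ft house, as in part 1 of the exercise.
price = predicted_price(bedrooms=2, living_area_sqft=1000)
print(round(price, 2))
```

Under that assumption the arithmetic is 20,986.09 − 14,966.20 + 93,840.00.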
> Abalones are edible sea snails that include over 100 species. A researcher is working with a model that uses the number of rings in an abalone shell to predict its age. He finds an observation that he believes has been miscalculated. After deleting this
> The production company of Exercise 7 offers advanced sales to Frequent Buyers through its website. Here’s a relevant scatterplot: One performer refused to permit advanced sales. What effect has that point had on the regression to model Total Revenue from A
> A regression of Total Revenue on Ticket Sales by the concert production company of Exercises 2 and 4 finds the model Revenue=14,228+36.87 Ticket Sales. 1. Management is considering adding a stadium-style venue that would seat 10,000. What does this mode
> Using data from 20 compact cars, a consumer group develops a model that predicts the stopping time for a vehicle by using its weight. You consider using this model to predict the stopping time for your large SUV. Explain why this is not advisable.
> Noting a recent study predicting the increase in cell phone costs, a friend remarks that by the time he’s a grandfather, no one will be able to afford a cell phone. Explain where his thinking went awry.
> Can you design a Simpson’s paradox? Two companies are vying for a city’s Best Local Employer award to be given to the company most committed to hiring local residents. Although both employers hired 300 new people in the past year, Company A brags that it des
> The concert production company of Exercise 2 made a second scatterplot, this time relating Total Revenue to Ticket Sales. 1. Describe the relationship between Ticket Sales and Total Revenue. 2. How are the results for the two venues similar? 3. How are t
> The analyst in Exercise 1 tried fitting the regression line to each market segment separately and found the following: What does this say about her concern in Exercise 1? Was she justified in worrying that the overall model Jan=$612.07+0.403 Dec might no
> Exercise 18 revisited the relationship between life expectancy and TVs per capita and saw that re-expression to the square root of TVs per capita made the plot more nearly straight. But was that the best choice of re-expression? Here is a scatterplot of
> A concert production company examined its records. The manager made the scatterplot at the top of the next column. The company places concerts in two venues, a smaller, more intimate theater (plotted with blue circles) and a larger auditorium-style venue
> Exercise 17 looked at the distribution of protein in the Burger King menu items, comparing meat and non-meat items. That exercise offered the logarithm as a re-expression of Protein. Here are two other alternatives, the square root and the reciprocal. Wo
> Recall the example of life expectancy vs. TVs per person in the chapter. In that example, we use the square root of TVs per person. Here are the original data and the re-expressed version. Which of the goals of re-expression does this illustrate? (Data i
> Recall the data about the Burger King menu items in Chapter 7. Here are boxplots of protein content comparing items that contain meat with those that do not. The plot on the right graphs log(Protein). Which o
> Suppose you have fit a linear model to some data and now take a look at the residuals. For each of the following possible residuals plots, tell whether you would try a re-expression and, if so, why.
> An athletic director proudly states that he has used the average GPAs of the university sports teams and is predicting a high graduation rate for the teams. Why is this method unsafe?
> A 1975 article in the magazine Science examined the graduate admissions process at Berkeley for evidence of sex discrimination. The table below shows the number of applicants accepted to each of four graduate programs: 1. What percent of total applicants
> A team of calculus teachers is analyzing student scores on a final exam compared to the midterm scores. One teacher proposes that they already have every teacher’s class averages and they should just work with those averages. Explain why this is problemati