For the experiment you designed in the Brief Case of Chapter 9, analyze the results of your experiment and write up your analysis and conclusions, including any recommendations for further testing.
> A sample of eight states was selected randomly from each of three regions in the United States (Northeast, Southeast, and West). Mean annual salaries for marketing managers were retrieved from the U.S. Bureau of Labor Statistics (data.bls.gov/oes). The b
> Cell phone adoption rates are available for various countries in the United Nations Database (unstats.un.org). Countries were randomly selected from three regions (Africa, Asia, and Europe), and cell phone adoption (per 100 inhabitants) rates retrieved.
> Refer to Exercise 1. a) State the null and alternative hypotheses. b) Calculate the F-ratio. c) What is the P-value? d) State your conclusion at α = 0.05. Exercise 1: In a completely randomized design, ten subjects were assigned to each of
> In a completely randomized design, ten subjects were assigned to each of four treatments of a factor. Below is the partially completed ANOVA table. a) What are the degrees of freedom for treatment, error, and total? b) What is SSE? c) What is MSTr? d)
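
The degrees of freedom in part a) follow directly from the design described above (four treatments, ten subjects each). The sketch below is a minimal illustration for a balanced one-way design; the function names are illustrative assumptions, and the sums of squares passed to `f_ratio` would have to come from the ANOVA table, which is not reproduced here.

```python
def anova_degrees_of_freedom(num_treatments, subjects_per_treatment):
    """Return (df_treatment, df_error, df_total) for a balanced one-way design."""
    total_subjects = num_treatments * subjects_per_treatment
    df_treatment = num_treatments - 1
    df_error = total_subjects - num_treatments
    df_total = total_subjects - 1
    return df_treatment, df_error, df_total


def f_ratio(ss_treatment, ss_error, df_treatment, df_error):
    """F = MSTr / MSE, where each mean square is a sum of squares over its df."""
    mean_square_treatment = ss_treatment / df_treatment
    mean_square_error = ss_error / df_error
    return mean_square_treatment / mean_square_error


# Four treatments with ten subjects each, as stated in the exercise.
df_treatment, df_error, df_total = anova_degrees_of_freedom(4, 10)
print(df_treatment, df_error, df_total)
```

The treatment and error degrees of freedom always add up to the total, which is a quick check on a partially completed table.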
> Vendors of hearing aids test them by having patients listen to lists of words and repeat what they hear. The word lists are supposed to be equally difficult to hear accurately. But the challenge of hearing aids is perception when there is background nois
> For the decision tree of Exercise 4, a) Suppose P(Warm) = 0.5, P(Moderate) = 0.3, and P(Cold) = 0.2. What is the expected value of each action? b) What is the best choice using the expected-value approach?
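
The expected-value approach weights each outcome by its probability and picks the action with the largest total. The sketch below uses the probabilities stated in the exercise, but the action names and profit figures are hypothetical placeholders, because the decision tree itself is not shown here.

```python
# Probabilities as stated in the exercise.
probabilities = {"Warm": 0.5, "Moderate": 0.3, "Cold": 0.2}

# HYPOTHETICAL profits (in $000's) and action names, for illustration only.
profits = {
    "Action A": {"Warm": 10, "Moderate": 20, "Cold": 40},
    "Action B": {"Warm": 25, "Moderate": 20, "Cold": 5},
}


def expected_value(payoffs, probs):
    """Probability-weighted average payoff for one action."""
    return sum(probs[state] * payoffs[state] for state in probs)


expected = {action: expected_value(p, probabilities) for action, p in profits.items()}
best_action = max(expected, key=expected.get)  # profits: larger is better

for action, value in expected.items():
    print(f"{action}: {value:.1f}")
print("Best by expected value:", best_action)
```

For a cost matrix the same calculation applies, with `min` in place of `max`.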
> For the cost matrix of Exercise 3, a) Suppose P(Recession) = 0.2, P(Stable) = 0.2, and P(Expansion) = 0.6. What is the expected value of each action? b) What is the best choice using the expected-value approach? Exercise 3: You are called on to decide h
> For the decision tree of Exercise 4, a) What is the maximin choice? b) What is the maximax choice? Exercise 4: Here is a decision tree for the profits (in $000’s) you project for your sales of the cell phone screen defroster, depending
> For the cost matrix of Exercise 3, a) What is the minimax choice? b) What is the minimin choice?
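
With costs, the minimax choice minimizes the worst (largest) cost of each action, and the minimin choice minimizes the best (smallest) cost. The sketch below illustrates both rules; the production method names and cost figures are hypothetical, since the cost matrix of Exercise 3 is not reproduced here.

```python
# HYPOTHETICAL costs (in $000's) and method names, for illustration only.
costs = {
    "Method X": {"Recession": 220, "Stable": 350, "Expansion": 300},
    "Method Y": {"Recession": 200, "Stable": 240, "Expansion": 390},
}

# Worst case (largest cost) and best case (smallest cost) for each action.
worst_case = {action: max(by_state.values()) for action, by_state in costs.items()}
best_case = {action: min(by_state.values()) for action, by_state in costs.items()}

minimax_choice = min(worst_case, key=worst_case.get)  # smallest worst-case cost
minimin_choice = min(best_case, key=best_case.get)    # smallest best-case cost

print("Minimax choice:", minimax_choice)
print("Minimin choice:", minimin_choice)
```

For a profit matrix the analogous rules are maximin (largest worst-case profit) and maximax (largest best-case profit).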
> Here is a decision tree for the profits (in $000’s) you project for your sales of the cell phone screen defroster, depending on the weather this coming winter and your choice of advertising method: Write out the corresponding profit m
> You are called on to decide how your company should produce its new cell phone screen defroster (for use by skiers and others spending time out of doors in the cold.) You develop the following cost matrix ($000’s): Draw the correspond
> Which of the following are Actions and which are States of Nature? a) Whether Unemployment is above or below 10%. b) Whether a new product should be brought to market before or after the beginning of the fiscal year. c) Whether to tell employees to stay
> For which one of the following situations would Kendall’s tau be appropriate? a) Comparing the ratings of a new product on a 5-point Likert scale by a panel of consumers to their ratings of a competitor’s product on the same scale. b) Comparing the swe
> For which one of the following situations would a Wilcoxon signed-rank test be appropriate? a) The Mohs scale rates the hardness of minerals. If one mineral can scratch another, it is judged to be harder. (Diamond, the hardest mineral, is a 10.) Is hard
> A bank is studying the average time that it takes 6 of its tellers to serve a customer. Customers line up in the queue and are served by the next available teller. Here is a boxplot of the times it took to serve the last 140 customers. a) What are th
> For which one of the following situations would a Wilcoxon signed-rank test be appropriate? a) Comparing the ratings of a new product on a 5-point Likert scale by a panel of consumers to their ratings of a competitor’s product on the same scale. b) Com
> For which one of the following situations would a Kruskal-Wallis test be appropriate? a) The Mohs scale rates the hardness of minerals. If one mineral can scratch another, it is judged to be harder. (Diamond, the hardest mineral, is a 10.) Is hardness r
> For which one of the following situations would a Kruskal-Wallis test be appropriate? a) Comparing the ratings of a new product on a 5-point Likert scale by a panel of consumers to their ratings of a competitor’s product on the same scale. b) Comparing
> For which one of the following situations would a Wilcoxon rank-sum (or Mann-Whitney) test be appropriate? a) The Mohs scale rates the hardness of minerals. If one mineral can scratch another, it is judged to be harder. (Diamond, the hardest mineral, is
> For which one of the following situations would a Wilcoxon rank-sum (or Mann-Whitney) test be appropriate? a) Comparing the ratings of a new product on a 5-point Likert scale by a panel of consumers to their ratings of a competitor’s product on the same
> Which of the following variables are ranks? For those that are not ranks, give the units. a) U.S. change by size: dime, penny, nickel, quarter, half dollar, dollar. b) U.S. change by value: penny, nickel, dime, quarter, half dollar, dollar. c) The cha
> For which one of the following situations would Kendall’s tau be appropriate? a) The Mohs scale rates the hardness of minerals. If one mineral can scratch another, it is judged to be harder. (Diamond, the hardest mineral, is a 10.) Is hardness related t
> Which of the following variables are ranks? For those that are not ranks, give the units. a) Student ratings of a course on a 5-point Likert scale. b) Students’ letter grades in that course. c) Students’ point scores on the final exam.
> Your manager asked you to fit many models to predict which customers will buy this holiday season. The firm is going to pick only one. What criteria would you recommend that the firm use to evaluate the various models and to compare them?
> Rather than use the proportion of packages that contain defective floss, another quality control inspector simply counts the number of packages. If historically, only 1 package out of 50 is defective, a) What is the standard deviation of the number of d
> The boxplots from Ex. 24 Chapter 9 display case prices (in dollars) of wine produced by wineries along three of the Finger Lakes in upstate New York. a) What are the null and alternative hypotheses? Talk about prices and location, not symbols. b) Do the
> One of the ways in which the dental floss discussed in Exercise 5 can fail is if it snaps when pulled with a certain tension. Historically, only 2% of the tested floss snaps. Suppose boxes of size 50 are selected at random to be tested. a) What is the
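
A control chart for a proportion is usually built from the standard deviation of the sample proportion, sqrt(p(1 - p)/n). The sketch below applies that formula to the figures in the exercise (p = 0.02, n = 50). The 3-sigma limits and the clipping of the lower limit at zero are common conventions assumed here, not something the truncated exercise text confirms.

```python
import math


def p_chart_limits(p, n, num_sigmas=3):
    """Return (sd, lower, upper) for the sample proportion of n items."""
    sd = math.sqrt(p * (1 - p) / n)
    lower = max(0.0, p - num_sigmas * sd)  # a proportion cannot go below 0
    upper = min(1.0, p + num_sigmas * sd)  # or above 1
    return sd, lower, upper


# Historical snap rate of 2% and boxes of size 50, as stated in the exercise.
sd_proportion, lower_limit_p, upper_limit_p = p_chart_limits(0.02, 50)
print(f"SD of sample proportion: {sd_proportion:.4f}")
print(f"Limits: {lower_limit_p:.4f} to {upper_limit_p:.4f}")
```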
> As in Exercise 5, a process to make dental floss produces spools with a desired mean length of 91.4 m (100 yds). The historical standard deviation is 1 cm (0.01 m). If samples of size 5 are taken and measured, a) What is the center line for the R chart?
> A process to make dental floss produces spools with a desired mean length of 91.4 m (100 yds). The historical standard deviation is 1 cm (0.01 m). If samples of size 5 are taken and measured, a) What is the standard deviation of the sample mean? b) How
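
The standard deviation of a sample mean is the process standard deviation divided by the square root of the sample size. The sketch below applies this to the figures in the exercise; the 3-sigma limits around the desired mean are an assumed convention for an x-bar chart, since part b) is cut off above.

```python
import math

target_mean_m = 91.4  # desired mean length in metres, from the exercise
process_sd_m = 0.01   # historical standard deviation (1 cm), from the exercise
sample_size = 5       # spools per sample, from the exercise


def sample_mean_sd(process_sd, n):
    """Standard deviation of the mean of n independent measurements."""
    return process_sd / math.sqrt(n)


def xbar_limits(center, process_sd, n, num_sigmas=3):
    """Control limits placed num_sigmas standard deviations from the center."""
    half_width = num_sigmas * sample_mean_sd(process_sd, n)
    return center - half_width, center + half_width


sd_xbar = sample_mean_sd(process_sd_m, sample_size)
lower_limit, upper_limit = xbar_limits(target_mean_m, process_sd_m, sample_size)
print(f"SD of sample mean: {sd_xbar:.5f} m")
print(f"3-sigma limits: {lower_limit:.4f} m to {upper_limit:.4f} m")
```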
> For the data in Exercise 1, suppose the historical standard deviation had been 0.1°C instead of 0.2°C. How would that change your answers to the questions in Exercise 3?
> For the data in Exercise 1, instead of the specification limits, a) Use the historical standard deviation to set up 1σ, 2σ, and 3σ limits and draw the control chart. b) Are any points outside the 3σ limits on either side? c) Are two of three consecutive po
> A computer screen manufacturing process must produce screens of uniform size. In particular, for a new tablet, the screen should be 10.1 inches long. The engineers have set 10.16 and 10.04 as the upper and lower specifications on screen length. If the ac
> Your manager has read a couple of articles about data mining and artificial intelligence. He wants to use data mining to find the best customers. What would you suggest to structure this problem?
> Your manager learned all about databases, and frequently makes queries such as: “What fraction of our customers who bought a product in the last six months are female and live within 5 miles of the store?” He says that he is data mining. Is he right? Exp
> Your manager is confused by Big Data. Explain what makes “big data” different than “data.”
> A producer of beverage containers wants to ensure that a liquid at 90°C will lose no more than 4°C after 30 minutes. Containers are selected at random and subjected to testing. Historical data shows the standard deviation to be 0.2°C. The quality control te
> These boxplots from Ex. 23 Chapter 9 show the relationship between the number of cylinders in a car’s engine and its fuel economy from a study conducted by a major car manufacturer. a) What are the null and alternative hypotheses? Talk
> Your manager wants to just find and use “the best” model, but you have found that a combined model (boosting) is better. Explain why boosting might help, and why it may be better than trying to find “the best.”
> What are the pros and cons of combining multiple models to produce a prediction? Should we always combine models?
> Your manager wants to use the total accurate classification rate (percent of all cases properly classified) as the metric to evaluate the division’s models. Is this a good idea? Why or why not?
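
One way to explore this question is to compute the overall classification rate alongside the per-class rates for an unbalanced data set. The sketch below uses entirely hypothetical counts: 1,000 customers, of whom 50 buy, and a model that predicts "no purchase" for everyone.

```python
def classification_metrics(true_pos, false_neg, false_pos, true_neg):
    """Overall accuracy plus the rates for each actual class."""
    total = true_pos + false_neg + false_pos + true_neg
    actual_pos = true_pos + false_neg
    actual_neg = true_neg + false_pos
    return {
        "accuracy": (true_pos + true_neg) / total,
        # Share of actual buyers the model identifies.
        "sensitivity": true_pos / actual_pos if actual_pos else 0.0,
        # Share of actual non-buyers the model identifies.
        "specificity": true_neg / actual_neg if actual_neg else 0.0,
    }


# HYPOTHETICAL counts: 50 buyers all missed, 950 non-buyers all correct.
metrics = classification_metrics(true_pos=0, false_neg=50, false_pos=0, true_neg=950)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up case the model is right 95% of the time yet finds none of the buyers, which is the kind of situation the question invites you to consider.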
> What are the advantages and disadvantages of using tree vs. neural network models?
> Is any one portion of the CRISP-DM more important than the others? Why?
> For which one of the following situations would Spearman’s rho be appropriate? a) The Mohs scale rates the hardness of minerals. If one mineral can scratch another, it is judged to be harder. (Diamond, the hardest mineral, is a 10.) Is hardness related
> For which one of the following situations would Spearman’s rho be appropriate? a) Comparing the ratings of a new product on a 5-point Likert scale by a panel of consumers to their ratings of a competitor’s product on the same scale. b) Comparing the sw
> For the probabilities of Exercise 8 and the decision tree of Exercise 4, using the expected values found in Exercise 8, compute the standard deviations of the values associated with each action and the corresponding coefficient of variation.
> For the probabilities of Exercise 7 and the cost matrix of Exercise 3, using the expected values you found in Exercise 7, compute the standard deviation of values associated with each action and the corresponding coefficient of variation.
> Cyanoacrylates, the generic name for several compounds with strong adhesive properties, were invented during WWII during experiments to make a special extra-clear plastic suitable for gun sights. They didn’t work for gun sights, however, because they stu
> A company that specializes in developing concrete for construction strives to continually improve the properties of its materials. To increase the compressive strength of one of its new formulations, they varied the amount of alkali content (low, medium,
> Here are some diagnostic plots for the home prices data from Exercise 29. These were generated by a computer package and may look different from the plots generated by the packages you use. (In particular, note that the axes of the Normal probability plo
> Many variables have an impact on determining the price of a house. A few of these are size of the house (square feet), lot size, and number of bathrooms. Information for a random sample of homes for sale in the Statesboro, Georgia, area was obtained from
> A study by the U.S. Small Business Administration used historical data to model the GDP per capita of 24 of the countries in the Organization for Economic Cooperation and Development (OECD) (Crain, M. W., The Impact of Regulatory Costs on Small Firms, ava
> What is the financial impact of pollution abatement on small firms? The U.S. government’s Small Business Administration studied this and reported the following model. Pollution abatement/employee = -2.494 - 0.431 ln(Number of Employees) + 0.698 ln(Sales
> Here are some more interpretations of the regression model to predict the price of wine developed in Exercise 24. One of these interpretations is correct. Which is it? Explain what is wrong with the others. a) The minimum price for a bottle of wine that
> A household appliance manufacturer wants to analyze the relationship between total sales and the company’s three primary means of advertising (television, magazines, and radio). All values were in millions of dollars. They found the following regression
> Many factors affect the price of wine, including such qualitative characteristics as the variety of grape, location of winery, and label. Researchers developed a regression model considering two quantitative variables: the tasting score of the wine and t
> A regression was performed to predict selling Price of houses in dollars from their Area in square feet, Lotsize in square feet, and Age in years. The R² is 92%. The equation from this regression is given here. Price = 169,328 + 35.3 Area + 0.718 Lotsiz
> We really should have examined the residuals. Here is a scatterplot of the residuals from the regression of Exercise 14. a) Which assumptions and conditions for regression can you check with this plot? What do you conclude? Perhaps we should re-express
> Here are some plots of residuals for the regression of Exercise 13. Which of the regression conditions can you check with these plots? Do you find that those conditions are met? [Figure: residual plots, including residuals vs. Predicted values]
> A second-order autoregressive model for the gas prices is: Using values from the table, what is the predicted value for January 2007 (the value just past those given in the table)? Dependent variable is: Gas R squared = 82.2% R squared (adjusted) =
> The investor in Exercise 18 now accepts your analysis but claims that it demonstrates that it doesn’t matter how many weeks a show plays on Broadway; receipts will be essentially the same. Explain why this interpretation is not a valid use of this regres
> A police union leader accepts your analysis in Exercise 17 but claims that it proves that paying police more will reduce violent crime. Explain why this interpretation is not a valid use of this regression model. Offer some alternative explanations.
> Consider the coefficient of Playing Weeks in the regression table of Exercise 14. a) State the standard null and alternative hypotheses for the true coefficient of Playing Weeks. b) Test the null hypothesis (at α = 0.05) and state your conclusion. c) A
> Suppose you have fit a linear model to some data and now take a look at the residuals. For each of the following possible residuals plots, tell whether you would try a re-expression and, if so, why. a) b) c)
> Suppose you have fit a linear model to some data and now take a look at the residuals. For each of the following possible residuals plots, tell whether you would try a re-expression and, if so, why. a) b) c)
> A real estate agent collects data to develop a model that will use the Size of a new home (in square feet) to predict its Sale Price (in thousands of dollars). Which of these is most likely to be the slope of the regression line: 0.008, 0.08, 0.8, or 8?
> Although some women are colorblind, this condition is found primarily in men. An advertisement for socks marked so they were easy for someone who was colorblind to match started out “There’s a strong correlation between sex and colorblindness.” Explain i
> Here is a regression of Women’s age vs. Men’s age, and a plot of the residuals. a) The residual plot shows 4 outliers, labeled according to the years they correspond to. Explain what they say about the data for those
> In Exercise 39 you investigated the federal rate on 3-month Treasury bills between 1950 and 1980. The scatterplot below shows that the trend changed dramatically after 1980, so we’ve built a new regression model that includes only the d
> A second-order autoregressive model for the apple prices (for all 4 years of data) is / Using the values from the table, what is the predicted value for January 2007 (the value just past those given in the table)?
> In Exercise 21 we looked at the age at which women married as one of the variables considered by those selling wedding services. Another variable of concern is the difference in age of the two partners. The graph shows the ages of both men and women at f
> Here’s a plot showing the federal rate on 3-month Treasury bills from 1950 to 1980, and a regression model fit to the relationship between the Rate (in %) and Years since 1950. (www.gpoaccess.gov/eop/) a) What is the correlation betw
> How does the speed at which a car drives affect fuel economy? Owners of a taxi fleet, watching their bottom line sink beneath fuel costs, hired a research firm to tell them the optimal speed for their taxis to drive. Researchers drove a compact car for 2
> Small businesses must track every expense. A flower shop owner tracked her costs for heating and related it to the average daily Fahrenheit temperature, finding the model Cost = 133 - 2.13 Temp. The residuals plot for her data is shown. a) Interpret
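
The fitted model above can be evaluated directly to see what the slope and intercept mean. The sketch below encodes Cost = 133 - 2.13 Temp as given; the temperature of 40°F is a hypothetical input, and the units of Cost are not stated in the text shown here.

```python
INTERCEPT = 133.0  # predicted cost at an average temperature of 0°F
SLOPE = -2.13      # change in predicted cost per additional degree Fahrenheit


def predicted_cost(avg_daily_temp_f):
    """Predicted heating cost from the model Cost = 133 - 2.13 Temp."""
    return INTERCEPT + SLOPE * avg_daily_temp_f


# HYPOTHETICAL temperature, chosen only to illustrate the calculation.
cost_at_40 = predicted_cost(40)
one_degree_change = predicted_cost(41) - predicted_cost(40)
print(f"Predicted cost at 40°F: {cost_at_40:.2f}")
print(f"Change per extra degree: {one_degree_change:.2f}")
```

The difference between predictions one degree apart is always the slope, whatever starting temperature is used.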
> Published reports about violence in computer games have become a concern to developers and distributors of these games. One firm commissioned a study of violent behavior in elementary-school children. The researcher asked the children’s parents how much
> A researcher gathering data for a pharmaceutical firm measures blood pressure and the percentage of body fat for several adult males and finds a strong positive association. Describe three different possible cause-and-effect relationships that might be p
> The original five points in Exercise 33 produce a regression line with slope 0. Match each of the green points (a–e) with the slope of the line after that one point is added: 1) -0.45 2) -0.30 3) 0.00 4) 0.05 5) 0.85
> The scatterplot shows five blue data points at the left. Not surprisingly, the correlation for these points is r = 0. Suppose one additional data point is added at one of the five positions suggested below in green. Match each point (a–
> Each of the following scatterplots a–d shows a cluster of points and one “stray” point. For each, answer questions 1–4: 1) In what way is the point unusual? Does it have high leverag
> Each of the four scatterplots a–d that follow shows a cluster of points and one “stray” point. For each, answer questions 1–4: 1) In what way is the point unusual? Does it have high
> For the Gas prices of Exercise 6, find the lag2 version of the prices.
> Like many businesses, the National Hurricane Center also participates in a program to improve the quality of data and predictions by government agencies. They report their errors in predicting the path of hurricanes. The following scatterplot shows the t
> Much attention has been paid to the challenges faced by the airline industry. Patterns in customer demand are an important variable to watch. The scatterplot below shows the number of passengers departing from Oakland (CA) airport month by month from 199
> How does what a movie earns relate to its run time? Will audiences pay more for a longer film? Does the relationship depend on the type of film? The scatterplot shows the relationship for the films in Exercise 27 between U.S. Gross earnings and Run Time.
> Here’s a scatterplot of the production budgets (in millions of dollars) vs. the running time (in minutes) for a collection of major movies. Dramas are plotted in red and all other genres are plotted in blue. A separate least squares reg
> An intern who has created a linear model is disappointed to find that her R² value is a very low 13%. a) Does this mean that a linear model is not appropriate? Explain. b) Does this model allow the intern to make accurate predictions? Explain.
> In justifying his choice of a model, a consultant says “I know this is the correct model because R² = 99.4%.” a) Is this reasoning correct? Explain. b) Does this model allow the consultant to make accurate predictions? Explain.
> The United Nations Development Programme (UNDP) uses the Human Development Index (HDI) in an attempt to summarize in one number the progress in health, education, and economics of a country. The mean years of schooling is positively associated with HDI.
> The United Nations Development Programme (UNDP) collects data in the developing world to help countries solve global and national development challenges. In the UNDP annual Human Development Report, you can find data on over 100 variables for each of 197
> Even with campaigns to reduce smoking, Americans still consume more than four packs of cigarettes per month per adult (libraries.ucsd.edu/ssds/pub/CTS/tobacco/sales). The Centers for Disease Control and Prevention track cigarette smoking in the United S
> Weddings are one of the fastest growing businesses; about $40 billion is spent on weddings in the United States each year. But demographics may be changing, and this could affect wedding retailers’ marketing plans. Is there evidence tha
> For the Apple prices of Exercise 5, find the lag1 version of the prices.
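
A lagged version of a series shifts every value later by a fixed number of periods, leaving the first positions empty. The sketch below is a minimal illustration; the function name is an assumption and the prices are hypothetical, since the data of Exercise 5 are not shown here. The same function with `periods=2` gives a lag2 series.

```python
def lag(series, periods):
    """Shift a series later by `periods` steps, padding the start with None."""
    if not 0 <= periods <= len(series):
        raise ValueError("periods must be between 0 and the series length")
    return [None] * periods + list(series[: len(series) - periods])


# HYPOTHETICAL monthly prices, for illustration only.
prices = [0.98, 1.02, 1.05, 1.01, 0.97]

lag1_prices = lag(prices, 1)
lag2_prices = lag(prices, 2)
print(lag1_prices)
print(lag2_prices)
```

Each lagged series keeps the original length, so it can be lined up row by row with the original prices when fitting an autoregressive model.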
> Orange growers know that the larger an orange the higher the price it will bring. But as the number of oranges on a tree increases, the fruit tends to be smaller. Here’s a table of that relationship. Create a model for this relationship
> The Organization for Economic Cooperation and Development (OECD) is an organization comprised of thirty countries. To belong, a country must support the principles of representative democracy and a free market economy. How have these countries grown in t
> For the regression model in Exercise 8, the leverage values look like this: The movie with the highest leverage of 0.219 is Walt Disney’s John Carter, which grossed $66M but had a budget of $300M. If the budget for John Carter had bee
> Here is the scatterplot of the variables in Exercise 7 with regression lines added for each kind of movie: The regression model is: a) Write out the regression model. b) In this regression, the variable Budget*R Rating is an interaction term. How wou
> Are R rated movies as profitable as those rated PG-13? Here’s a scatterplot of USGross ($M) vs. Budget ($M) for PG-13 (green) and R (purple) rated movies. a) How would you code the indicator variable? (Use PG-13 as the base level.) b) H
> A marketing manager has developed a regression model to predict quarterly sales of his company’s mid-weight microfiber jackets based on price and amount spent on advertising. An intern suggests that he include indicator (dummy) variables for each quarter
> For each of the following, show how you would code dummy (indicator) variables to include in a regression model. a) Type of residence (Apartment, Condominium, Townhouse, Single family home) b) Employment status (Full-time, Part-time, Unemployed)
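
A categorical variable with k levels is usually coded with k - 1 indicator variables, leaving one level as the base. The sketch below builds that coding for the two variables named in the exercise; the function name and the choice of base level for each variable are assumptions made for illustration, and any level could serve as the base.

```python
def indicator_coding(levels, base_level):
    """Map each level to its 0/1 values on the k - 1 non-base indicators."""
    if base_level not in levels:
        raise ValueError("base_level must be one of the levels")
    indicators = [level for level in levels if level != base_level]
    coding = {
        level: [1 if level == indicator else 0 for indicator in indicators]
        for level in levels
    }
    return indicators, coding


residence_levels = ["Apartment", "Condominium", "Townhouse", "Single family home"]
employment_levels = ["Full-time", "Part-time", "Unemployed"]

# Base levels below are arbitrary choices, not given in the exercise.
residence_indicators, residence_coding = indicator_coding(residence_levels, "Apartment")
employment_indicators, employment_coding = indicator_coding(employment_levels, "Unemployed")

print(residence_indicators)
for level, values in residence_coding.items():
    print(level, values)
```

The base level is the one coded as all zeros, so each coefficient is interpreted relative to it.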
> Here is the regression for Exercise 3 with an indicator variable: a) Write out the regression model. b) In this regression, the variable R Rating is an indicator variable that is 1 for movies that have an R rating. How would you interpret the coeffici