Answer true or false to the following statement, and explain your answer: A strong correlation between two variables doesn’t necessarily mean that they’re causally related.
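One way to see why the statement holds is a small simulation in which two variables are both driven by a third, lurking variable. The sketch below is illustrative only: the data are synthetic, and the names `z`, `x`, `y`, and `pearson_r` are assumptions made for this example, not part of the original question.

```python
import math
import random


def pearson_r(xs, ys):
    """Linear correlation coefficient r for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical lurking variable z; x and y each depend on z but not on each other.
z = [float(i) for i in range(200)]
x = [zi + rng.gauss(0, 5) for zi in z]
y = [3 * zi + rng.gauss(0, 5) for zi in z]

# x and y are strongly correlated even though neither causes the other.
r_xy = pearson_r(x, y)

# Remove z's contribution (the coefficients are known here only because the
# data are simulated); what remains is independent noise.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - 3 * zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(f"r(x, y) = {r_xy:.3f}")
print(f"r after removing z = {r_resid:.3f}")
```

In this setup the first correlation is close to 1, while the second is close to 0, which is consistent with the strong correlation coming entirely from the shared dependence on `z`.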
> Consider the events (not J), (H & I), (H or K), and (H & K) discussed in Problem 31. a. Find the probability of each of those four events, using the f/N rule. b. Compute P(J), using the complementation rule and your answer for P(not J) from part (a).
> Refer to Problems 30 and 31. a. Use the second column of Table 5.21 and the f/N rule to compute the probability of each of the events H, I, J, and K. b. Express each of the events H, I, J, and K in terms of the mutually exclusive events displayed. c. Co
> For the following groups of events, determine which are mutually exclusive. a. H and I b. I and K c. H and (not J) d. H, (not J), and K
> A federal individual income tax return is selected at random. Let H = event the return shows an AGI between $20K and $100K, I = event the return shows an AGI of less than $50K, J = event the return shows an AGI of less than $100K, and K = event the retur
> The Internal Revenue Service compiles data on income tax returns and summarizes its findings in Statistics of Income. The first two columns of Table 5.21 show a frequency distribution (number of returns) for adjusted gross income (AGI) from federal indiv
> What meaning is given to the probability of an event by the frequentist interpretation of probability?
> The Television Bureau of Advertising publishes a report titled TV Basics for the purpose of providing information to help advertisers make the most effective and efficient use of local and national spot television advertisements. The following table give
> Name three common discrete probability distributions other than the binomial distribution.
> Explain the difference between a frequency histogram and a relative-frequency histogram.
> Suppose that a simple random sample of size n is taken from a finite population in which the proportion of members having a specified attribute is p. Let X be the number of members sampled that have the specified attribute. a. If the sampling is done wit
> The following are two probability histograms of binomial distributions. For each, specify whether the success probability is less than, equal to, or greater than 0.5.
> The game of craps is played by rolling two balanced dice. A first roll of a sum of 7 or 11 wins; and a first roll of a sum of 2, 3, or 12 loses. To win with any other first sum, that sum must be repeated before a sum of 7 is thrown. It can be shown that
> In 10 Bernoulli trials, how many outcomes contain exactly three successes?
> What is the relationship between Bernoulli trials and the binomial distribution?
> List the three requirements for repeated trials of an experiment to constitute Bernoulli trials.
> Determine the value of each binomial coefficient.
> Determine 0!, 3!, 4!, and 7!.
> Regarding the equal-likelihood model, a. what is it? b. how are probabilities computed?
> Two random variables, X and Y, have standard deviations 2.4 and 3.6, respectively. Which one is more likely to take a value close to its mean? Explain your answer.
> We used slightly different methods for determining the “middle” of a class with limit grouping and cutpoint grouping. Identify the methods and the corresponding terminologies.
> A random variable X has mean 3.6. If you make a large number of repeated independent observations of the random variable X, the average value of those observations will be approximately ___.
> A random variable X equals 2 with probability 0.386. a. Use probability notation to express that fact. b. If you make repeated independent observations of the random variable X, in approximately what percentage of those observations will you observe the
> If you sum the probabilities of the possible values of a discrete random variable, the result always equals ___.
> How do you graphically portray the probability distribution of a discrete random variable?
> What does the probability distribution of a discrete random variable tell you?
> Fill in the blanks. a. A ____ is a quantitative variable whose value depends on chance. b. A discrete random variable is a random variable whose possible values ____.
> A and B are events such that P(A) = 0.2, P(B) = 0.6, and P(A & B) = 0.1. Find P(A or B).
> E is an event and P(not E) = 0.4. Find P(E).
> A, B, and C are mutually exclusive events such that P(A) = 0.2, P(B) = 0.6, and P(C) = 0.1. Find P(A or B or C).
> Why is probability theory important to statistics?
> For quantitative data, we examined three types of grouping: single-value grouping, limit grouping, and cutpoint grouping. For each type of data given, decide which of these three grouping types is usually best. Explain your answers. a. Continuous data d
> In an on-line press release, ABCNews.com reported that “. . . 73 percent of Americans. . . favor a law that would require every gun sold in the United States to be test-fired first, so law enforcement would have its fingerprint in case it were ever used
> Based on the least-squares criterion, the line that best fits a set of data points is the one with the ____ possible sum of squared errors.
> Regarding the variables in a regression analysis, a. what is the independent variable called? b. what is the dependent variable called?
> Identify one use of a regression equation.
> What kind of plot is useful for deciding whether finding a regression line for a set of data points is reasonable?
> Explain your answers. If a line has a positive slope, y-values on the line decrease as the x-values decrease.
> Explain your answers. A horizontal line has no slope.
> Explain your answers. The y-intercept of a line has no effect on the steepness of the line.
> From the website Golf.com, part of Sports Illustrated Sites, we obtained the scores for the first and second rounds of the 2013 U.S. Open golf tournament. You will find those scores on the WeissStats site. For part (d), predict the second-round score of
> The National Oceanic and Atmospheric Administration publishes temperature and precipitation information for cities around the world in Climates of the World. Data on average high temperature (in degrees Fahrenheit) in July and average precipitation (in inc
> From the International Data Base, published by the U.S. Census Bureau, we obtained data on infant mortality rate (IMR) and life expectancy (LE), in years, for a sample of 60 countries. The data are presented on the WeissStats site. For part (d), predict
> With regard to grouping quantitative data into classes in which each class represents a range of possible values, we discussed two methods for depicting the classes. Identify the two methods and explain the relative advantages and disadvantages of each m
> In the article “Effects of Human Population, Area, and Time on Non-native Plant and Fish Diversity in the United States” (Biological Conservation, Vol. 100, No. 2, pp. 243–252), M. McKinney investigated the relationship of various factors on the number o
> Refer to Problem 21. a. Compute the linear correlation coefficient, r. b. Interpret your answer from part (a) in terms of the linear relationship between student-to-faculty ratio and graduation rate. c. Discuss the graphical implications of the value of
> Refer to Problem 21. a. Determine SST, SSR, and SSE by using the computing formulas. b. Obtain the coefficient of determination. c. Obtain the percentage of the total variation in the observed graduation rates that is explained by student-to-faculty rati
> Graduation rate—the percentage of entering freshmen attending full time and graduating within 5 years— and what influences it is a concern in U.S. colleges and universities. U.S. News and World Report’s “College Guide” provides data on graduation rates f
> A small company has purchased a computer system for $7200 and plans to depreciate the value of the equipment by $1200 per year for 6 years. Let x denote the age of the equipment, in years, and y denote the value of the equipment, in hundreds of dollars.
> Consider the linear equation y = 4 − 3x. a. At what y-value does its graph intersect the y-axis? b. At what x-value does its graph intersect the x-axis? c. What is its slope? d. By how much does the y-value on the line change when the x-value increases b
> A value of r close to ____ suggests at most a weak linear relationship between the variables.
> A value of r close to −1 suggests a strong ____ linear relationship between the variables.
> A positive linear relationship between two variables means that one variable tends to increase linearly as the other ____.
> State three of the most important guidelines in choosing the classes for grouping a quantitative data set.
> One use of the linear correlation coefficient is as a descriptive measure of the strength of the ____ relationship between two variables.
> For each of the sums of squares in regression, state its name and what it measures. a. SST b. SSR c. SSE
> Identify a use of the coefficient of determination as a descriptive measure.
> In the context of regression analysis, what is an a. outlier? b. influential observation?
> Using a regression equation to make predictions for values of the predictor variable outside the range of the observed values of the predictor variable is called ____.
> The line that best fits a set of data points according to the least squares criterion is called the ____ line.
> For a linear equation y = b0 + b1x, identify the a. independent variable. b. dependent variable. c. slope. d. y-intercept.
> A quantitative data set of size 87 has mean 80 and standard deviation 10. At least how many observations lie between 60 and 100?
> What does Chebyshev’s rule say about the percentage of observations in any data set that lie within a. six standard deviations to either side of the mean? b. 1.5 standard deviations to either side of the mean?
> Complete the statement: Almost all the observations in any data set lie within ____ standard deviations to either side of the mean.
> Do the concepts of class limits, marks, cutpoints, and midpoints make sense for qualitative data? Explain your answer.
> Data Set A has more variation than Data Set B. Decide which of the following statements are necessarily true. a. Data Set A has a larger mean than Data Set B. b. Data Set A has a larger standard deviation than Data Set B.
> Specify the mathematical symbol used for each of the following descriptive measures. a. Sample mean b. Sample standard deviation c. Population mean d. Population standard deviation
> Identify the most appropriate measure of variation corresponding to each of the following measures of center. a. Mean b. Median
> Philosophical and health issues are prompting an increasing number of Taiwanese to switch to a vegetarian lifestyle. In the paper “LDL of Taiwanese Vegetarians Are Less Oxidizable than Those of Omnivores” (Journal of Nutrition, Vol. 130, pp. 1591–1596),
> The U.S. National Oceanic and Atmospheric Administration publishes temperature data in Climatography of the United States. According to that document, the annual average maximum and minimum temperatures for selected cities in the United States are as pro
> From The World Bank, in the document Life Expectancy at Birth, we obtained data on the expectation of life (in years) at birth for people in various countries. Those data are presented on the WeissStats site. a. Obtain the mean, median, and mode(s) of th
> The U.S. Department of Agriculture collects data pertaining to the value of agricultural exports and publishes its findings in U.S. Agricultural Trade Update. For one year, the values of these exports, by state, are provided on the WeissStats site. Data
> Among the measures of center discussed, which is the only one appropriate for qualitative data?
> The U.S. Census Bureau classifies the states in the United States by region and division. The data giving the region and division of each state are presented on the WeissStats site. Use the technology of your choice to determine the mode(s) of the a. reg
> The U.S. Energy Information Administration reports weekly figures on retail gasoline prices in Weekly Retail Gasoline and Diesel Prices. Every Monday, retail prices for all three grades of gasoline are collected by telephone from a sample of approximatel
> Identify an important reason for grouping data.
> According to the Statistical Summary of Students and Staff, prepared by the Department of Information Resources and Communications, Office of the President, University of California, the Fall 2012 enrollment figures for undergraduates at the University
> Beachbody, LLC, provides fitness programs, including home workout videos and nutrition. P90X, or Power 90 Extreme, is a home exercise program that consists of an intense series of workout DVDs. It is a 90-day program that uses the term “muscle confusion”
> An independent golf equipment testing facility compared the difference in the performance of golf balls hit off a regular 2-3/4” wooden tee to those hit off a 3” Stinger Competition golf tee. A Callaway Great Big Bertha driver with 10 degrees of loft was
> In the article “Distribution of Oxygen in Surface Sediments from Central Sagami Bay, Japan: In Situ Measurements by Microelectrodes and Planar Optodes” (Deep Sea Research Part I: Oceanographic Research Papers, Vol. 52, Issue 10, pp. 1974–1987), R. Glud e
> The ages of the 36 millionaires sampled are arranged in increasing order in the following table. a. Determine the quartiles for the data. b. Obtain and interpret the interquartile range. c. Find and interpret the five-number summary. d. Calculate the low
> The U.S. Census Bureau publishes annual price figures for new mobile homes in Manufactured Housing Statistics. The prices of a sample of 250 new mobile homes have roughly a bell-shaped distribution with mean $63.3 thousand and standard deviation $7.9 tho
> The objective of the article, “Caffeinated and Caffeine-free Beverages and Risk of Type 2 Diabetes” (American Journal of Clinical Nutrition, Vol. 97, No. 1, pp. 155–166) by S. Bhupathiraju et al., was to examine the association between caffeinated bevera
> Dr. Thomas Stanley of Georgia State University has collected information on millionaires, including their ages, since 1973. A sample of 36 millionaires has a mean age of 58.5 years and a standard deviation of 13.4 years. a. Complete the following graph.
> Identify the two most commonly used measures of center for quantitative data. Explain the relative advantages and disadvantages of each.
> In the paper “Injuries and Risk Factors in a 100-Mile (161-km) Infantry Road March” (Preventative Medicine, Vol. 28, pp. 167–173), K. Reynolds et al. reported on a study commissioned by the U.S. Army. The purpose of the study was to improve medical plann
> In Issue 338 of the Amstat News, then-president of the American Statistical Association, F. Scheuren, reported the results of a survey on how members would prefer to receive ballots in annual elections. On the WeissStats site, you will find data for prefe
> In the article “Fossil Argonauts (Mollusca: Cephalopoda: Octopodida) from Late Miocene Siltstones of the Los Angeles Basin, California” (Journal of Paleontology, Vol. 79, No. 3, pp. 520–531), paleontologists L. Saul and C. Stadum discussed fossilized Arg
> The U.S. National Center for Health Statistics collects data on causes of death and publishes its findings in National Vital Statistics Reports. Which of the three main measures of center is appropriate for causes of death? Explain your answer.
> The National Center for Health Statistics publishes information on the duration of marriages in Vital Statistics of the United States. Which measure of center is more appropriate for data on the duration of marriages, the mean or the median? Explain your
> An integral part of doing business in the dot-com culture of the late 1990s was frequenting the party circuit centered in San Francisco. Here high-tech companies threw as many as five parties a night to recruit or retain talented workers in a highly comp
> Regarding z-scores: a. How is a z-score obtained? b. What is the interpretation of a z-score? c. An observation has a z-score of 2.9. Roughly speaking, what is the relative standing of the observation?
> Regarding outliers: a. What is an outlier? b. Explain how you can identify potential outliers, using only the first and third quartiles.
> Regarding the five-number summary: a. Identify its components. b. How can it be employed to describe center and variation? c. What graphical display is based on it?
> A data set of size 152 with roughly a bell-shaped distribution has mean 25 and standard deviation 4. Approximately how many observations lie between 17 and 33?
> A data set with roughly a bell-shaped distribution has mean 45 and standard deviation 12. Approximately what percentage of the observations lie between 33 and 57?
> Define a. descriptive measures. b. measures of center. c. measures of variation.
> Research by W. Clark and L. Midanik (Alcohol Consumption and Related Problems: Alcohol and Health Monograph 1. DHHS Pub. No. (ADM) 82–1190) examined, among other issues, alcohol consumption patterns of U.S. adults by marital status. Data for marital stat
> A quantitative data set has been grouped by using limit grouping with equal-width classes. The lower and upper limits of the first class are 3 and 8, respectively, and the class width is 6. a. What is the class mark of the second class? b. What are the l
> When is the use of single-value grouping particularly appropriate?
> Some users of statistics prefer pie charts to bar charts because people are accustomed to having the horizontal axis of a graph show order. For example, someone might infer that “Republican” is less than “Other” because “Republican” is shown to the left
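Several of the short computational questions listed above (the addition and complementation rules, the count of outcomes with exactly three successes in 10 Bernoulli trials, the factorials, and the Chebyshev and empirical-rule counts) can be checked with a few lines of Python. This is a minimal sketch using only the numbers stated in those questions; the variable and function names are assumptions made for illustration.

```python
import math

# General addition rule: P(A or B) = P(A) + P(B) - P(A & B)
p_a_or_b = 0.2 + 0.6 - 0.1

# Complementation rule: P(E) = 1 - P(not E)
p_e = 1 - 0.4

# Special addition rule for mutually exclusive events A, B, C
p_a_or_b_or_c = 0.2 + 0.6 + 0.1

# Number of outcomes with exactly three successes in 10 Bernoulli trials
outcomes_three_successes = math.comb(10, 3)

# 0!, 3!, 4!, and 7!
factorials = {n: math.factorial(n) for n in (0, 3, 4, 7)}


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


# Mean 80, standard deviation 10: the interval 60 to 100 is k = 2 standard
# deviations either side, so at least 75% of the 87 observations lie in it.
min_count = math.ceil(chebyshev_lower_bound(2) * 87)

# Empirical rule (bell-shaped data): mean 25, standard deviation 4, so
# 17 to 33 is 2 standard deviations either side, roughly 95% of 152.
approx_count = round(0.95 * 152)

print(p_a_or_b, p_e, p_a_or_b_or_c)
print(outcomes_three_successes, factorials)
print(min_count, approx_count)
print(chebyshev_lower_bound(6), chebyshev_lower_bound(1.5))
```

The printed probabilities may show small floating-point rounding; the Chebyshev count is rounded up because it is a guaranteed minimum number of whole observations, whereas the empirical-rule count is only an approximation.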