
Question: An issue of BARRON’S presented information


An issue of BARRON’S presented information on top wealth managers in the United States, based on individual clients with accounts of $1 million or more. Data were given for various variables, two of which were number of private client managers and private client assets. Those data are provided on the WeissStats site, where private client assets are in billions of dollars.
a. Obtain a scatterplot for the data.
b. Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)–(f).
c. Determine and interpret the regression equation for the data.
d. Identify potential outliers and influential observations.
e. In case a potential outlier is present, remove it and discuss the effect.
f. In case a potential influential observation is present, remove it and discuss the effect.
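
Parts (c) through (f) can be sketched in code. The following is a minimal sketch, not a worked answer: the BARRON'S data live on the WeissStats site and are not reproduced here, so the `managers` and `assets` lists are hypothetical placeholder values, and the helper names (`fit_line`, `residuals`, `leave_one_out_slopes`) are illustrative. It fits the least-squares line from Sxx and Sxy, lists the residuals (a large residual suggests a potential outlier), and refits the line with each point removed in turn (a large change in slope suggests a potential influential observation).

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(xs, ys, b0, b1):
    """Observed y minus predicted y for every data point."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def leave_one_out_slopes(xs, ys):
    """Slope of the refitted line after removing each point in turn."""
    return [
        fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])[1]
        for i in range(len(xs))
    ]


# Hypothetical placeholder data, NOT the BARRON'S values.
managers = [10, 20, 30, 40, 50, 200]      # number of private client managers
assets = [2.0, 4.1, 5.9, 8.2, 9.8, 15.0]  # private client assets, $ billions

b0, b1 = fit_line(managers, assets)
print(f"regression equation: y-hat = {b0:.3f} + {b1:.4f} x")

# A residual that is large relative to the others flags a potential outlier.
for x, e in zip(managers, residuals(managers, assets, b0, b1)):
    print(f"x = {x:>3}  residual = {e:+.2f}")

# A point whose removal changes the slope markedly is potentially influential.
for x, slope in zip(managers, leave_one_out_slopes(managers, assets)):
    print(f"without x = {x:>3}: slope = {slope:.4f} (shift {slope - b1:+.4f})")
```

With the real data, the scatterplot from part (a) should still be inspected first, because a regression line is reasonable only if the points are scattered about a line.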


> The Atlantic Hurricane Database extends back to 1851, recording among other things the number of major hurricanes striking the U.S. Atlantic and Gulf Coast per year. A major hurricane is a hurricane measuring at least a Category 3 on the Saffir-Simpson h

> From the document American Housing Survey for the United States, published by the U.S. Census Bureau, we obtained the following frequency distribution for the number of persons per occupied housing unit, where we have used “7” in place of “7 or more.” Fr

> The National Aeronautics and Space Administration (NASA) compiles data on space-shuttle launches and publishes them on its website. The following table displays a frequency distribution for the number of crew members on each shuttle mission from April 12

> A variable x of a finite population has the following frequency distribution: Suppose a member is selected at random from the population and let X denote the value of the variable x for the member obtained. a. Determine the probability distribution of th

> A variable y of a finite population has the following frequency distribution: Suppose a member is selected at random from the population and let Y denote the value of the variable y for the member obtained. a. Determine the probability distribution of th

> A variable y of a finite population has the following frequency distribution: Suppose a member is selected at random from the population and let Y denote the value of the variable y for the member obtained. a. Determine the probability distribution of th

> Apply the empirical rule to solve each exercise. The data set has size 80. Approximately how many observations lie within two standard deviations to either side of the mean?

> In the article, “Reasons for Non-uptake of Measles, Mumps, and Rubella Catch Up Immunization in a Measles Epidemic and Side Effects of the Vaccine” (British Medical Journal, Vol. 310, pp. 1629–1632), R. Roberts et al. discussed a follow-up survey to exam

> Which of the following numbers could not possibly be a probability? Justify your answer. a. 5/6 b. 3.5 c. 0

> A variable x of a finite population has the following frequency distribution: Suppose a member is selected at random from the population and let X denote the value of the variable x for the member obtained. a. Determine the probability distribution of th

> What rule of probability permits you to obtain any probability for a discrete random variable by simply knowing its probability distribution?

> Suppose that you make a large number of independent observations of a random variable and then construct a table giving the possible values of the random variable and the proportion of times each value occurs. What will this table resemble?

> Fill in the blank. For a discrete random variable, the sum of the probabilities of its possible values equals ___ .

> Let X denote the number of siblings of a randomly selected student. Explain the difference between {X = 3} and P(X = 3).

> Provide an example (other than one discussed in the text) of a random variable that does not arise from a quantitative variable of a finite population in the context of randomness.

> Fill in the blanks. a. A relative-frequency distribution is to a variable as a ____ distribution is to a random variable. b. A relative-frequency histogram is to a variable as a ____ histogram is to a random variable.

> The general addition rule for two events is presented and that for three events. a. Verify the general addition rule for three events. b. Write the general addition rule for four events and explain your reasoning.

> Apply the empirical rule to solve each exercise. The data set has size 40. Approximately how many observations lie within two standard deviations to either side of the mean?

> A certain city has three major newspapers, the Times, the Herald, and the Examiner. Circulation information indicates that 47.0% of households get the Times, 33.4% get the Herald, 34.6% get the Examiner, 11.9% get the Times and the Herald, 15.1% get the

> In the article, “Non-probability Sampling Designs for Litigation Surveys” (Trademark Reporter, Vol. 81, pp. 169–179), J. Jacoby and H. Handlin discussed the controversy about whether nonprobability samples are acceptable as evidence in litigation. The aut

> Suppose that A and B are mutually exclusive events. a. Use the special addition rule to express P(A or B) in terms of P(A) and P(B). b. Show that the general addition rule gives the same answer as that in part (a).

> Which of the following numbers could not possibly be a probability? Justify your answer. a. 0.462 b. −0.201 c. 1

> Roughly speaking, what is an experiment? an event?

> Following are the age and price data for Corvettes. a. compute SST, SSR, and SSE. b. compute the coefficient of determination, r². c. determine the percentage of variation in the observed values of the response variable explained by the regression, and

> Following are the data on percentage of investments in energy securities and tax efficiency. a. compute SST, SSR, and SSE. b. compute the coefficient of determination, r². c. determine the percentage of variation in the observed values of the response v

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> Apply the empirical rule to solve each exercise. The data set has mean 30 and standard deviation 4. Approximately what percentage of the observations lie between 22 and 38?

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> In one of his books, Ted Sorenson, Special Counsel to President John F. Kennedy, presents an intimate biography of the extraordinary man. According to Sorenson, Kennedy “read every fiftieth letter of the thirty thousand coming weekly to the White House.”

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> y = 0.5x − 2 a. find the y-intercept and slope. b. determine whether the line slopes upward, slopes downward, or is horizontal, without graphing the equation. c. use two points to graph the equation.

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> a. compute the three sums of squares, SST, SSR, and SSE, using the defining formulas. b. verify the regression identity, SST = SSR + SSE. c. compute the coefficient of determination. d. determine the percentage of variation in the observed values of the

> For a regression analysis, SST = 8291.0 and SSR = 7626.6. a. Obtain and interpret the coefficient of determination. b. Determine SSE.

> Apply the empirical rule to solve each exercise. The data set has mean 15 and standard deviation 2. Approximately what percentage of the observations lie between 13 and 17?

> A measure of the amount of variation in the observed values of the response variable not explained by the regression is the ___. The mathematical abbreviation for it is ___.

> A measure of the amount of variation in the observed values of the response variable explained by the regression is the ___. The mathematical abbreviation for it is ____.

> A measure of total variation in the observed values of the response variable is the ___. The mathematical abbreviation for it is ___.

> The National Agricultural Statistics Service (NASS) conducts studies of the number of acres devoted to farms in each county of the United States. Suppose that we divide the United States into the four census regions (Northeast, North Central, South, and

> In this section, we introduced a descriptive measure of the utility of the regression equation for making predictions. a. Identify the term and symbol for that descriptive measure. b. Provide an interpretation.

> A collection of observations of a variable y taken at regular intervals over time is called a time series. Economic data and electrical signals are examples of time series. We can think of a time series as providing data points (xi, yi), where xi is the

> For a set of n data points, the sample covariance, sxy, is given by The sample covariance can be used as an alternative method for finding the slope and y intercept of a regression line. The formulas are where sx denotes the sample standard deviation of

> The ability to estimate the volume of a tree based on a simple measurement, such as the tree’s diameter, is important to the lumber industry, ecologists, and conservationists. Data on volume, in cubic feet, and diameter at breast height, in inches, for 7

> y = −8 − 4x a. find the y-intercept and slope. b. determine whether the line slopes upward, slopes downward, or is horizontal, without graphing the equation. c. use two points to graph the equation.

> Apply the empirical rule to solve each exercise. The data set has mean 25 and standard deviation 5. Fill in the following blanks: a. Approximately 68% of the observations lie between ___ and ___. b. Approximately 95% of the observations lie between ___ and ___. c. Approximately 99.7% of the ob

> The magazine Consumer Reports publishes information on automobile gas mileage and variables that affect gas mileage. In one issue, data on gas mileage (in miles per gallon) and engine displacement (in liters) were published for 121 vehicles. Those data a

> Does a higher state per capita income equate to a higher per capita beer consumption? From the document Survey of Current Business, published by the U.S. Bureau of Economic Analysis, and from the Brewer’s Almanac, published by the Beer Institute, we obta

> Polychlorinated biphenyls (PCBs), industrial pollutants, are known to be carcinogens and a great danger to natural ecosystems. As a result of several studies, PCB production was banned in the United States in 1979 and by the Stockholm Convention on Persi

> The National Oceanic and Atmospheric Administration publishes temperature information of cities around the world in Climates of the World. A random sample of 50 cities gave the data on average high and low temperatures in January shown on the WeissStats

> In the article, “Ghost of Speciation Past” (Nature, Vol. 435, pp. 29–31), T. Kocher looked at the origins of a diverse flock of cichlid fishes in the lakes of southeast Africa. Suppose that you wanted to select a sample from the hundreds of species of ci

> On the WeissStats site are data on home size (in square feet) and assessed value (in thousands of dollars) for the same homes. a. Obtain a scatterplot for the data. b. Decide whether finding a regression line for the data is reasonable. If so, then also

> The document Arizona Residential Property Valuation System, published by the Arizona Department of Revenue, describes how county assessors use computerized systems to value single-family residential properties for property tax purposes. On the WeissStats

> Box Office Mojo collects and posts data on movie grosses. For a random sample of 50 movies, we obtained both the domestic (U.S.) and overseas grosses, in millions of dollars. The data are presented on the WeissStats site. a. Obtain a scatterplot for the

> The Information Please Almanac provides data on the ages at inauguration and of death for the presidents of the United States. We give those data on the WeissStats site for those presidents who are not still living at the time of this writing. a. Obtain

> How important are birdies (a score of one under par on a given golf hole) in determining the final total score of a woman golfer? From the U.S. Women’s Open website, we obtained data on number of birdies during a tournament and final score for 63 women g

> Apply the empirical rule to solve each exercise. The data set has mean 10 and standard deviation 3. Fill in the following blanks: a. Approximately 68% of the observations lie between ___ and ___. b. Approximately 95% of the observations lie between ___ a

> y = 6 − 7x a. find the y-intercept and slope. b. determine whether the line slopes upward, slopes downward, or is horizontal, without graphing the equation. c. use two points to graph the equation.

> In the paper “Mating System and Sex Allocation in the Gregarious Parasitoid Cotesia glomerata” (Animal Behaviour, Vol. 66, pp. 259–264), H. Gu and S. Dorn reported on various aspects of the mating system and sex allocation strategy of the wasp C. glomera

> We provided data on age and price for a sample of 11 Orions between 2 and 7 years old. On the WeissStats site, we have given the ages and prices for a sample of 31 Orions between 1 and 11 years old. a. Obtain a scatterplot for the data. b. Is it reasonab

> The negative relation between study time and test score found in Exercise 4.63 has been discovered by many investigators. Provide a possible explanation for it. Data from Exercise 4.63: An instructor at Arizona State University asked a random sample of

> In the article “Graphs in Statistical Analysis” (American Statistician, Vol. 27, Issue 1, pp 17–21), F. Anscombe presented four sets of data points with almost identical basic statistical properties (means, standard deviations, regression lines, etc.) bu

> The members of a population have been numbered 1–500. A sample of size 10 is to be taken from the population, using stratified random sampling with proportional allocation. The strata are of sizes 200, 150, and 150, where stratum #1 consists of the membe

> In Exercise 4.59, you determined a regression equation that can be used to predict the price of a Corvette, given its age. a. Should that regression equation be used to predict the price of a 4-year-old Corvette? a 10-year-old Corvette? Explain your answ

> In Exercise 4.58, you determined a regression equation that relates the variables percentage of investments in energy securities and tax efficiency for mutual fund portfolios. a. Should that regression equation be used to predict the tax efficiency of a

> An instructor at Arizona State University asked a random sample of eight students to record their study times in a beginning calculus course. She then made a table for total hours studied (x) over 2 weeks and test score (y) at the end of the 2 weeks. He

> In the article “The Human Vomeronasal Organ. Part II: Prenatal Development” (Journal of Anatomy, Vol. 197, Issue 3, pp. 421–436), T. Smith and K. Bhatnagar examined the controversial issue of the human vomeronasal organ, regarding its structure, function

> We have provided simple data sets for you to practice the basics of finding measures of center. 2, 5, 0, −1 a. mean. b. median. c. mode(s).

> Identify two statistical methods other than a census for obtaining information.

> Plants emit gases that trigger the ripening of fruit, attract pollinators, and cue other physiological responses. N. Agelopolous et al. examined factors that affect the emission of volatile compounds by the potato plant Solanum tuberosum and published th

> Hanna Properties specializes in custom-home resales in the Equestrian Estates, an exclusive subdivision in Phoenix, Arizona. A random sample of nine custom homes currently listed for sale provided the following information on size and price. Here, x denot

> y = −1 + 2x a. find the y-intercept and slope. b. determine whether the line slopes upward, slopes downward, or is horizontal, without graphing the equation. c. use two points to graph the equation.

> The Kelley Blue Book provides information on wholesale and retail prices of cars. Following are age and price data for 10 randomly selected Corvettes between 1 and 6 years old. Here, x denotes age, in years, and y denotes price, in hundreds of dollars. F

> Tax efficiency is a measure, ranging from 0 to 100, of how much tax due to capital gains stock or mutual funds investors pay on their investments each year; the higher the tax efficiency, the lower is the tax. In the article “At the Mercy of the Manager”

> The data points in Exercise 4.45. a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points. Data from Exercise 4.45: Line A: y = 3 − 0.6

> The members of a population have been numbered 1–1000. A sample of size 20 is to be taken from the population, using stratified random sampling with proportional allocation. The strata are of sizes 300, 200, 400, and 100, where stratum #1 consists of the

> The data points in Exercise 4.44. a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points. Data from Exercise 4.44: Line A: y = 1.5 + 0

> The data points in Exercise 4.43. a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points. Data from Exercise 4.43: Line A: y = −1 + 3x

> The data points in Exercise 4.42. a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points. Data from Exercise 4.42: Line A: y = 9 − 2x,

> A quantitative data set of size 150 has mean 35 and standard deviation 4. At least how many observations lie between 23 and 47?

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> y = 3 + 4x a. find the y-intercept and slope. b. determine whether the line slopes upward, slopes downward, or is horizontal, without graphing the equation. c. use two points to graph the equation.

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> a. find the regression equation for the data points. Use the defining formulas in Definition 4.4 to obtain Sxx and Sxy. b. graph the regression equation and the data points.

> The members of a population have been numbered 1–100. A sample of size 30 is to be taken from the population, using cluster sampling. The clusters are of equal size 10, where cluster #1 consists of the members of the population numbered 1–10, cluster #2

> We presented summary statistics for data on bank robberies for five variables: amount stolen, number of bank staff present, number of customers present, number of bank raiders, and travel time from the bank to the nearest police station. These summary st

> For each of the following sets of data points, determine the regression equation both without and with the use of Formula.

> A quantitative data set of size 200 has mean 20 and standard deviation 4. At least how many observations lie between 8 and 32?

> For a data set consisting of two data points: a. Identify the regression line. b. What is the sum of squared errors for the regression line? Explain your answer.

> Line A: y = 3 − 0.6x, Line B: y = 4 − x Data points: a. plot the data points and the first linear equation on one graph and the data points and the second linear equation on another. b. construct tables for x, y, ŷ, e, and e². c. determine which line fi

> Line A: y = 1.5 + 0.5x, Line B: y = 1.125 + 0.375x Data points: a. plot the data points and the first linear equation on one graph and the data points and the second linear equation on another. b. construct tables for x, y, ŷ, e, and e². c. determine wh

> Line A: y = −1 + 3x, Line B: y = 1 + 2x Data points: a. plot the data points and the first linear equation on one graph and the data points and the second linear equation on another. b. construct tables for x, y, ŷ, e, and e². c. determine which line f

> Line A: y = 9 − 2x, Line B: y = 6 − x Data points: a. plot the data points and the first linear equation on one graph and the data points and the second linear equation on another. b. construct tables for x, y, ŷ, e, and e². c. determine which line fits
