Examtest with the 9th edition of Statistics for Business and Economics by Newbold

How to describe data graphically? - ExamTests 1

 

Questions

Question 1

Indicate whether each of the following variables is categorical or numeric. If the variable is categorical, specify the measurement level. If the variable is numeric, specify the measurement level and indicate whether the variable is discrete or continuous:

  1. The number of shares of a stock purchased by a broker.
  2. The nationality of a student.
  3. The grade point average of a student.
  4. The temperature in degrees Celsius.

Question 2

Upon visiting a newly opened H&M store, customers were given a brief survey. Is the answer to each of the following questions categorical or numerical? If categorical, give the level of measurement. If numerical, is it discrete or continuous?

  1. Is this your first visit to this H&M store?
  2. On a scale from 1 (very dissatisfied) to 5 (very satisfied), how satisfied are you with today's purchase(s)?
  3. What was the cost of your purchase(s)?

Question 3

Tourists visiting Croatia are asked to fill in a survey. The survey consists of various questions about how they experienced their holiday. Describe for each question the type of data obtained.

Question | Type of data

Which of the following areas did you visit?

  • Coast.
  • Islands.
  • Mountains.
  • The capital (Zagreb).
 

Did you rent a sailing boat?

  • Yes.
  • No.
 
What was the average amount of money you spent on food per day? 
What would you recommend as the optimal number of days for tourists to spend in Croatia? 

How often would you recommend visiting Croatia?

  • Every year.
  • Once every five years.
  • Once in a lifetime.
  • Never.
 

    Question 4a

    An administrator examines the travel expenses of faculty members who attended various professional meetings. He found that 36% of the travel expenses were spent on transportation, 17% on accommodation, 13% on food, 9% on conference fees, 10% on registration costs, and the remainder on miscellaneous costs.

    Construct a pie chart for these data.

    Question 4b

    Construct a bar chart for these data.

    Question 5

    A company has defined seven codes for possible defects for one of its products. Construct a Pareto diagram for the following frequencies:

    Defect code | A  | B  | C  | D  | E | F | G
    Frequency   | 10 | 70 | 15 | 90 | 8 | 4 | 3

    Question 6

    Construct a time-series plot for the following data of customers shopping at a new mall during a particular week.

    Day       | Number of customers
    Monday    | 516
    Tuesday   | 534
    Wednesday | 451
    Thursday  | 487
    Friday    | 558
    Saturday  | 641
    Sunday    | 830

    Question 7

    Determine an appropriate interval width for a random sample of 370 observations with scores that fall between 40 and 200.

    Question 8a

    Construct a stem-and-leaf display for the following data.

    17 16 15 17 17
    20 30 25 25 14
    12 18 31 26 26
    12 15 16 16 28

    Question 8b

    Construct a histogram for these data.

    Question 8c

    Is the distribution of these data symmetric, right-skewed, or left-skewed?

    Question 9

    Prepare a scatter plot of the following data:

    • (3, 10).
    • (2, 8).
    • (3, 12).
    • (4, 15).
    • (6, 20).
    • (5, 15).
    • (4, 12).

    Question 10a

    The following table shows the age of faculty members who have obtained a PhD degree from the largest university in the Netherlands.

    Age     | Percent
    26 - 28 | 18.00
    29 - 32 | 23.50
    33 - 40 | 30.51
    41 - 55 | 12.99
    56+     | 15.00

    What percent of faculty members who obtained a PhD are 41 years or older?

    Question 10b

    What percent of faculty members who obtained a PhD are under the age of 33 years?

    Question 10c

    Construct a relative cumulative frequency distribution of the data.

    Question 10d

    Suppose we have 200 observations. What are the cumulative frequencies for the data described?

    Question 10e

    Interpret the cumulative frequencies.

    Question 11

    The following data are presented:

    Age    | 30 - 40 | 40 - 50 | 50 - 60 | 60 - 70
    Number | 12      | 13      | 22      | 34

    Describe possible errors in this table.

    Question 12

    Suppose, the amount of money a person spends on movie tickets each month (in euros) is:
    6.0, 5.3, 4.0, 5.7, 10.0, 8.4, 2.5, 10.0, 9.5, 0.0, 5.0, 10.0
    What graph would you use to visually display these data?

    Question 13

    In Germany, it was found that 32% of shoppers with incomes less than 50,000 shop online. Of the remaining 68%, half of the individuals never shop, and the other half shops by going to the actual store. Use a pie chart to plot this data.

    Question 14a

    Four types of checking accounts are offered by a bank. Suppose a random sample of 300 customers was surveyed. It was found that 60% of the respondents preferred "Easy Checking", 12% preferred "Intelligent Checking", 18% preferred "Super Checking", and the remainder preferred "Ultimate Checking". Of the participants who selected Easy Checking, 100 were female. Of those who selected Intelligent Checking, a third were female. Of those who selected Super Checking, half were female. Finally, of those who selected Ultimate Checking, 80% were female. Describe the data with a cross table.

    Question 14b

    How many females are there in total, and how many males?

    Question 14c

    What type of graph is appropriate for these data?

    1. Histogram.
    2. Scatter plot.
    3. Time-series plot.
    4. Bar chart.

    Question 15

    What type of graph is most appropriate for two numerical variables?

    Answer indication

    Question 1

    • The number of shares of a stock purchased by a broker: Numerical; ratio; discrete.
    • The nationality of a student: Categorical; nominal.
    • The grade point average of a student: Numerical; ratio; continuous.
    • The temperature in degrees Celsius: Numerical; interval; continuous.

    Question 2

    1. Categorical; nominal.
    2. Categorical; ordinal.
    3. Numerical; continuous.

    Question 3

    Question | Type of data

    Which of the following areas did you visit?

    • Coast.
    • Islands.
    • Mountains.
    • The capital (Zagreb).
    Either categorical (nominal, with each area binary coded as yes/no) or numerical (discrete), when recorded as the number of areas visited.

    Did you rent a sailing boat?

    • Yes.
    • No.
    Categorical; nominal; binary coded.
    What was the average amount of money you spent on food per day? Numerical; ratio; continuous.
    What would you recommend as the optimal number of days for tourists to spend in Croatia? Numerical; ratio; discrete.

    How often would you recommend visiting Croatia?

    • Every year.
    • Once every five years.
    • Once in a lifetime.
    • Never.
    Categorical; ordinal.

    Question 4a

    No answer indication available.

    Question 4b

    No answer indication available.

    Question 5

    No answer indication available.

    Question 6

    Plot the number of customers on the vertical axis against time on the horizontal axis. The time points on the horizontal axis can be numbered 1 through 7 or labeled with the days (Monday - Sunday).

    Question 7

    According to the quick guide, a sample of 370 observations calls for about eight to ten classes.
    Using the formula for interval width yields:
    w = (200 - 40) / 8 = 20; or
    w = (200 - 40) / 10 = 16
    Thus, an appropriate interval width lies somewhere between 16 and 20.

    Question 8a

    1 | 2, 2, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8.
    2 | 0, 5, 5, 6, 6, 8.
    3 | 0, 1.

    Question 8b

    No answer indication available.

    Question 8c

    Right skewed (positively skewed); the tail is at the right side of the distribution.

    Question 9

    No answer indication available.

    Question 10a

    12.99 + 15.00 = 27.99%

    Question 10b

    18.00 + 23.50 = 41.50%

    Question 10c

    Age     | Cumulative percent
    26 - 28 | 18.00
    29 - 32 | 41.50
    33 - 40 | 72.01
    41 - 55 | 85.00
    56+     | 100.00

    Question 10d

    The cumulative frequencies for 200 observations are: 36, 83, 144, 170, 200 (each cumulative percentage multiplied by 200 and rounded to a whole number).

    Question 10e

    For sample size n = 200, there are 36 individuals in the 26 - 28 age class. There are 83 individuals under the age of 33. There are 144 individuals under the age of 41, and so forth.
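
The cumulative percentages and frequencies of questions 10c and 10d can be reproduced with a running total. The following is a minimal sketch; the variable names are illustrative and not taken from the textbook, and the counts are rounded to whole individuals.

```python
from itertools import accumulate

classes = ["26 - 28", "29 - 32", "33 - 40", "41 - 55", "56+"]
percents = [18.00, 23.50, 30.51, 12.99, 15.00]
n = 200  # sample size assumed in question 10d

# Running totals of the class percentages (rounded to remove float noise).
cumulative_percents = [round(p, 2) for p in accumulate(percents)]

# Convert each cumulative percentage to a count out of n observations.
cumulative_counts = [round(p / 100 * n) for p in cumulative_percents]

for label, pct, count in zip(classes, cumulative_percents, cumulative_counts):
    print(f"{label:8} {pct:7.2f} {count:4d}")
```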

    Question 11

    A possible error lies in the boundaries of the frequency classes. First, there are no classes below 30 or above 70, which (possibly) excludes some observations. Second, the classes overlap: it is unclear from this frequency distribution to which class boundary values such as 40, 50, and 60 belong.

    Question 12

    A time-series plot would be appropriate here. Data are given for t number of time points, with t = 12.

    Question 13

    No answer indication available.

    Question 14a

    Type of checking account | Female | Male | Total
    Easy Checking            | 100    | 80   | 180
    Intelligent Checking     | 12     | 24   | 36
    Super Checking           | 27     | 27   | 54
    Ultimate Checking        | 24     | 6    | 30
    Total                    | 163    | 137  | 300

    Question 14b

    There are 163 females and 137 males in the sample of 300 participants.
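
The cross table can be rebuilt from the percentages in the question. This is a sketch under the assumption that the percentages translate into whole customers; the dictionary names are illustrative only.

```python
from fractions import Fraction

n = 300  # customers surveyed
percent = {"Easy Checking": 60, "Intelligent Checking": 12, "Super Checking": 18}
percent["Ultimate Checking"] = 100 - sum(percent.values())  # the remainder

# Female share per account type; Easy Checking is given as a count (100 of 180).
female_share = {
    "Easy Checking": Fraction(100, 180),
    "Intelligent Checking": Fraction(1, 3),
    "Super Checking": Fraction(1, 2),
    "Ultimate Checking": Fraction(4, 5),
}

# Each row holds (female, male, total) for one account type.
table = {}
for account, pct in percent.items():
    total = n * pct // 100
    female = int(total * female_share[account])
    table[account] = (female, total - female, total)

females = sum(row[0] for row in table.values())
males = sum(row[1] for row in table.values())
print(table, females, males)
```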

    Question 14c

    Option 4, a bar chart. The other graphs are appropriate for numerical variables. Here, we have frequencies for two categorical variables. This is best displayed by a bar chart (or pie chart).

    Question 15

    A scatter plot.

    How to describe data numerically? - ExamTests 2

     

    Questions

    Question 1

    A random sample of five numbers was drawn:

    18 71 80 80 84

    Compute the mean, median, and mode.

    Question 2

    The number of cars crossing the border between Israel and Jordan is recorded. Over a 6-day period, the following number of cars for each day is found:

    16 21 12 19 1 2

    Compute the mean, median, and mode.

    Question 3a

    The records of the university of Groningen over a 12-year period show the following percentage increase in the number of students enrolled:

    4.1 3.2 3.5 4.5 5.1 3.8

    2.1 2.2 3.1 5.1 1.5 1.0

    Compute the mean increase in the number of students enrolled.

    Question 3b

    Compute the median increase in the number of students enrolled.

    Question 3c

    Find the mode.

    Question 4a

    The finances over the past decade are reviewed. The records are shown per year.

    2.51 3.74 4.15 5.33 6.18

    6.65 7.18 6.92 6.95 7.54

    Calculate the mean.

    Question 4b

    Calculate the median.

    Question 5a

    During the past years, many countries faced depopulation. We collected the number of elementary schools that were closed for ten countries:

    10 6 13 5 11 5 6 3 7 9

    Find the mean, median, and mode of the number of schools closed.

    Question 5b

    Find the five-number summary.

    Question 6

    A textile manufacturer obtains a sample of 50 bolts of cloth and carefully inspects each bolt. Based on this inspection, the manufacturer records the number of imperfections. The following frequency table is obtained:

    Number of imperfections | 0  | 1  | 2 | 3
    Number of bolts         | 33 | 12 | 4 | 1

    Calculate the mean, median, and mode for these sample data.

    Question 7

    Compute the variance and standard deviation of the following sample data:

    6 8 10 12 14 9 11 7 13 11

    Question 8

    Compute the variance and standard deviation of the following sample data:

    5 -3 0 2 -1 7 4

    Question 9

    Consider two different investments, stock A and stock B. The mean closing price for stock A is 4.00 and the mean closing price for stock B is 80.00. The mean rate of return is the same for both stock A and stock B. We might think that stock B is more volatile than stock A. Now, suppose the standard deviations were found to be considerably different, with SA = 2.00 and SB = 8.00. Compute the coefficient of variation for these sample data and compare these competing investment opportunities.

    Question 10

    Calculate the coefficient of variation for the following data:

    13 15 12 14 11

    Question 11a

    A set of data is mounded (bell-shaped) with a mean of 300 and a variance of 144.

    Approximately what proportion of observations is greater than 288?

    Question 11b

    Approximately what proportion of observations is less than 324?

    Question 11c

    Approximately what proportion of observations is greater than 336?

    Question 12a

    The number of cars that pass through a tunnel during a period of 35 are as follows:

    60 70 74 56 84 54 50

    47 80 71 50 95 121 90

    75 84 70 61 110 64 80

    85 85 43 76 60 91 90

    60 87 110 85 44 94 69

    What is the mean number of cars?

    Question 12b

    What is the standard deviation?

    Question 12c

    What is the coefficient of variation?

    Question 12d

    Construct a stem-and-leaf display of the number of cars that pass through the tunnel. Next, find the interquartile range.

    Question 12e

    Provide the five-number summary for the sample data.

    Question 13a

    The daily exchange rate from EUR to USD for six business days is:

    1.14 1.14 1.13 1.13 1.12 1.11

    Over the same period, the daily exchange rate from EUR to JPY is:

    110 110 109 109 108 109

    Compare the means of these two distributions.

    Question 13b

    Compare the standard deviations of these two distributions.

    Question 14a

    A company produces light bulbs with a mean lifetime of 1,200 hours and a standard deviation of 50 hours. Find the z-score for a light bulb that lasts only 1,120 hours.

    Question 14b

    Consider the z-score computed by question 14a. What percentage of light bulbs lasts longer than 1,120 hours?

    Question 14c

    Consider again the mean and standard deviation from question 14a. Find the z-score corresponding to a light bulb that lasts 1,300 hours.

    Question 14d

    What percentage of light bulbs lasts longer than 1300 hours?

    Question 15a

    Suppose that a student completed courses worth 15 ECTS in total during his first semester of college. He received one A, one B, one C, and one D. Now, suppose that a value of 4 is assigned to an A, a value of 3 to a B, a value of 2 to a C, and a value of 1 to a D. Calculate the student's semester GPA, treating every course as equally weighted.

    Question 15b

    Now, however, each course is not worth the same number of credits. The A was earned in a 3-credit English course, the B in a 3-credit biology course, the C in a 4-credit biology course, and the D in a 5-credit Spanish course. Using these weights, calculate the student's weighted semester GPA.

    Question 16a

    Consider the following data:

    xi  | wi
    4.7 | 8
    3.8 | 7
    5.7 | 4
    2.6 | 3
    5.5 | 2

    What is the arithmetic mean of the xi values?

    Question 16b

    What is the weighted mean of the xi values?

    Question 16c

    What is the sample variance?

    Question 16d

    What is the sample standard deviation?

    Question 17a

    Consider the following data:

    (15,45) (6,18) (11,33) (12,36) (16,48) (14,42)

    (5,15) (17,51) (4,12) (19,57) (7,21)

    Compute the covariance.

    Question 17b

    Compute the correlation coefficient.

    Question 17c

    Draw a scatter plot to display the relationship between the two variables.

    Question 18a

    Consider the following data:

    Quiz score (x) | 4   | 3.4 | 3  | 5  | 1.1
    Exam score (y) | 100 | 66  | 78 | 80 | 30

    Compute the covariance.

    Question 18b

    Compute the correlation coefficient.

    Answer indication

    Question 1

    Mean = (18+71+80+80+84)/5 = 66.6; median = 80; mode = 80.

    Question 2

    Mean = (16+21+12+19+1+2)/6 = 11.8; median = (12+16)/2 = 14; there is no mode.

    Question 3a

    Mean = (4.1 + 3.2 + 3.5 + 4.5 + 5.1 + 3.8 + 2.1 + 2.2 + 3.1 + 5.1 + 1.5 + 1.0) / 12 = 3.3.

    Question 3b

    Median = (3.2 + 3.5) / 2 = 3.35.

    Question 3c

    Mode = 5.1.

    Question 4a

    Mean = (2.51 + 3.74 + 4.15 + 5.33 + 6.18 + 6.65 + 7.18 + 6.92 + 6.95 + 7.54) / 10 = 5.7.

    Question 4b

    Median = 6.4.

    Question 5a

    Mean = 7.5; median = 6.5; modes = 5 and 6 (each value occurs twice).
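
As a quick check, the three measures of central tendency can be computed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

schools_closed = [10, 6, 13, 5, 11, 5, 6, 3, 7, 9]

mean = statistics.mean(schools_closed)
median = statistics.median(schools_closed)
# multimode returns every value tied for the highest count,
# in the order in which it is first encountered in the data.
modes = statistics.multimode(schools_closed)

print(mean, median, modes)
```

Using `multimode` rather than `mode` makes ties visible, which matters whenever two values occur equally often.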

    Question 5b

    For the five-number summary, order the data in ascending order, that is:

    3 5 5 6 6 7 9 10 11 13

    Q1 is the value located in the 0.25(10+1)th ordered position, that is the 2.75th position.
    The second value is 5, the third value is also 5.
    Q1 = 5 + 0.75*(5 - 5)
    Q1 = 5 + 0
    Q1 = 5

    Q3 is the value located in the 0.75(10+1)th ordered position, that is the 8.25th position.
    The eighth value is 10, the ninth value is 11.
    Q3 = 10 + 0.25*(11 - 10)
    Q3 = 10 + 0.25
    Q3 = 10.25

    Thus, the five-number summary is: 3 (minimum); 5 (Q1); 6.5 (median); 10.25 (Q3); 13 (maximum).
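
The position rule used above can be sketched in code. The helper name `value_at_position` is hypothetical; it implements the p(n + 1) ordered-position rule with linear interpolation between neighbouring values, which is one of several quartile conventions in use.

```python
def value_at_position(data, p):
    """Value at the p(n + 1)th ordered position, interpolating linearly."""
    ordered = sorted(data)
    n = len(ordered)
    position = p * (n + 1)        # 1-based position, may be fractional
    lower = int(position)         # whole part of the position
    fraction = position - lower   # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

schools = [10, 6, 13, 5, 11, 5, 6, 3, 7, 9]
five_number = (
    min(schools),
    value_at_position(schools, 0.25),
    value_at_position(schools, 0.50),
    value_at_position(schools, 0.75),
    max(schools),
)
print(five_number)
```

Spreadsheet and library defaults often use a different interpolation rule, so their quartiles may differ slightly from the ones obtained with this convention.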

    Question 6

    Mean = (0*33 + 1*12 + 2*4 + 3*1) / 50 = 23/50 = 0.46.
    Median = 0
    Mode = 0

    Question 7

    To calculate the sample variance and standard deviation, follow these steps:

    • Step 1: Calculate the sample mean. The sample mean here is equal to 10.1.
    • Step 2: Find the difference between each of the values and the sample mean of 10.1.
    • Step 3: Square each difference.
    • Step 4: Sum the squared differences and divide by n - 1 to obtain the variance.
    • Step 5: Take the square root of the variance to obtain the standard deviation.

    The squared deviations from the mean for all observations are: 16.81, 4.41, 0.01, 3.61, 15.21, 1.21, 0.81, 9.61, 8.41, and 0.81. The sum of these squared deviations equals 60.9. Next, s² = 60.9 / (n - 1) = 60.9 / 9 = 6.77. Thus, the variance equals 6.77. The standard deviation is the square root of the variance. That is: s = √6.77 = 2.6.
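
The steps above can be sketched in a few lines of Python. The variable names are illustrative; the last line cross-checks the hand calculation against the standard library's `statistics.variance`, which also divides by n - 1.

```python
import statistics

sample = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]

mean = statistics.mean(sample)                           # step 1
squared_deviations = [(x - mean) ** 2 for x in sample]   # steps 2 and 3
variance = sum(squared_deviations) / (len(sample) - 1)   # divide by n - 1
std_dev = variance ** 0.5                                # square root

print(round(variance, 2), round(std_dev, 2))
print(statistics.variance(sample))  # library cross-check
```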

    Question 8

    Again, apply the same steps as in question 7. The sample mean is equal to 2. The squared deviations from the mean for the observations are: 9, 25, 4, 0, 9, 25, 4. The sum of these squared differences is equal to 76. The variance is s² = 76/6 = 12.67. The standard deviation is the square root of the variance, that is: s = √12.67 = 3.56.

    Question 9

    CVA = 2.00 / 4.00 x 100% = 50%.
    CVB = 8.00 / 80.00 x 100% = 10%.
    The market value of stock A fluctuates more from period to period than does the market value of stock B. The coefficient of variation (CV) indicates that for stock A the sample standard deviation is 50% of the mean, whereas for stock B the sample standard deviation is only 10% of the mean.

    Question 10

    Use the formula:
    \[CV = \frac{s}{\bar{x}} \times 100\% \hspace{5mm} \text{if} \hspace{5mm} \bar{x} > 0 \]
    CV = (1.58 / 13) x 100% = 12.15%
    Thus, the sample standard deviation is 12.15% of the mean.
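
A small sketch of the same calculation, assuming the sample standard deviation (divisor n - 1). The function name `coefficient_of_variation` is illustrative. Carrying full precision gives about 12.16%; the 12.15% above results from rounding s to 1.58 first.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean (mean must be > 0)."""
    mean = statistics.mean(data)
    if mean <= 0:
        raise ValueError("the coefficient of variation requires a positive mean")
    return statistics.stdev(data) / mean * 100

cv = coefficient_of_variation([13, 15, 12, 14, 11])
print(f"CV = {cv:.2f}%")
```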

    Question 11a

    Use the formula:
    \[z = \frac{x_{i} - \mu}{\sigma} \]
    The standard deviation, σ, is equal to the square root of the variance, σ², that is: √144 = 12
    z = (288 - 300) / 12 = -12/12 = -1
    According to the empirical rule, approximately 68% of the observations fall within 1 standard deviation of the mean. The remaining 32% is split evenly between the two tails, so 0.5 x 32% = 16% of the observations fall below z = -1. Consequently, 100% - 16% = 84% of the observations are greater than 288.

    Question 11b

    z = (324 - 300) / 12 = 24/12 = 2
    According to the empirical rule, approximately 95% of the observations fall within 2 standard deviations of the mean. The remaining 5% is split evenly between the lower and upper ends of the distribution, so 2.5% lies above z = 2. Thus, 97.5% of observations are less than 324.

    Question 11c

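    z = (336 - 300) / 12 = 36/12 = 3. According to the empirical rule, approximately 99.7% of the observations fall within 3 standard deviations of the mean, and the remaining 0.3% is split evenly between the two tails. Thus, almost no observations (about 0.15%) are greater than 336.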
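
The three answers to question 11 follow one pattern, which the sketch below makes explicit. It hard-codes the empirical-rule percentages (68%, 95%, 99.7%) and therefore only handles z-scores of ±1, ±2, and ±3; the names `WITHIN` and `share_above` are illustrative.

```python
# Share of observations within z standard deviations of the mean,
# according to the empirical rule (bell-shaped data only).
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}

def z_score(x, mean, std_dev):
    return (x - mean) / std_dev

def share_above(z):
    """Approximate share of observations above a z-score of +-1, +-2 or +-3."""
    tail = (1 - WITHIN[abs(z)]) / 2   # one of the two equal tails
    return tail if z > 0 else 1 - tail

mean, std_dev = 300, 144 ** 0.5       # standard deviation = sqrt(variance) = 12

greater_than_288 = share_above(z_score(288, mean, std_dev))
less_than_324 = 1 - share_above(z_score(324, mean, std_dev))
greater_than_336 = share_above(z_score(336, mean, std_dev))
print(greater_than_288, less_than_324, greater_than_336)
```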

    Question 12a

    Mean = 75.

    Question 12b

    Standard deviation = 19.26.

    Question 12c

    CV = (19.26/75) x 100% = 25.68%.

    Question 12d

    4 | 3 4 7
    5 | 0 0 4 6
    6 | 0 0 0 1 4 9
    7 | 0 0 1 4 5 6
    8 | 0 0 4 4 5 5 5 7
    9 | 0 0 1 4 5
    10 |
    11 | 0 0
    12 | 1
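    Q1 is the value in the 0.25(35+1) = 9th ordered position, that is 60. Q3 is the value in the 0.75(35+1) = 27th ordered position, that is 87. The interquartile range is IQR = Q3 - Q1 = 87 - 60 = 27.

    Question 12e

    Minimum = 43; Q1 = 60; Median = 75; Q3 = 87; Maximum = 121.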

    Question 13a

    The means are 1.13 and 109.17.

    Question 13b

    The standard deviations are 0.0117 and 0.75. Because the two exchange rates are measured on very different scales, compare them with the coefficient of variation:
    CVA = (0.0117/1.13) x 100% = 1.04%
    CVB = (0.75/109.17) x 100% = 0.69%
    The coefficients of variation tell us that the sample standard deviation for EUR to USD is 1.04% of the mean, whereas the sample standard deviation for EUR to JPY is 0.69% of the mean. Thus, relative to its mean, the exchange rate for EUR to USD fluctuates more from day to day than does that of EUR to JPY.

    Question 14a

    z = (1,120 - 1,200) / 50 = -1.6.

    Question 14b

    94.52% (you can find the probability corresponding to this z-score in the table of the standard normal distribution).

    Question 14c

    z = (1,300 - 1,200) / 50 = 2.

    Question 14d

    According to the empirical rule, approximately 2.5% of observations are more than two standard deviations above the mean.
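
The table lookups in question 14 can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch assumes that bulb lifetimes are normally distributed, which the question implies but does not state; the variable names are illustrative.

```python
from statistics import NormalDist

mean_life, std_dev = 1200, 50   # hours

def z_score(x):
    return (x - mean_life) / std_dev

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_short = z_score(1120)
z_long = z_score(1300)

# P(Z > z) = 1 - P(Z <= z), read from the cumulative distribution function.
share_longer_than_1120 = 1 - standard_normal.cdf(z_short)
share_longer_than_1300 = 1 - standard_normal.cdf(z_long)

print(z_short, round(share_longer_than_1120, 4))
print(z_long, round(share_longer_than_1300, 4))
```

The exact normal tail beyond z = 2 is slightly smaller than the 2.5% given by the empirical rule, because the rule rounds 95.45% down to 95%.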

    Question 15a

    \[ \bar{x} = \frac{4+3+2+1}{4} = 2.5\]

    Question 15b

    Use the formula for the weighted mean, that is:
    \[\bar{x} = \frac{\Sigma w_{i}x_{i}}{\Sigma w_{i}} \]
    \[\bar{x} = \frac{4*3 + 3*3 + 2*4 + 1*5}{15} = \frac{34}{15} = 2.267 \]
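
A minimal sketch of the weighted mean, with the sum of the weights in the denominator. The function name `weighted_mean` is illustrative. Setting all weights to 1 reproduces the unweighted GPA of question 15a.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

grade_points = [4, 3, 2, 1]   # A, B, C, D
credits = [3, 3, 4, 5]        # credits per course, 15 in total

weighted_gpa = weighted_mean(grade_points, credits)
unweighted_gpa = weighted_mean(grade_points, [1, 1, 1, 1])
print(round(weighted_gpa, 3), unweighted_gpa)
```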

    Question 16a

    \[\bar{x} = \frac{4.7+3.8+5.7+2.6+5.5}{5} = \frac{22.3}{5} = 4.46\]

    Question 16b

    \[\bar{x} = \frac{4.7*8 + 3.8*7 + 5.7*4 + 2.6*3 + 5.5*2}{24} = \frac{105.8}{24} = 4.41 \]

    Question 16c

    The variance is 1.643.

    Question 16d

    The standard deviation is √1.643 = 1.282.

    Question 17a

    The covariance = 82.42.

    Question 17b

    The correlation coefficient between x and y is r = 1.0 (a perfect positive linear relationship).
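
The covariance and correlation for question 17 can be verified with a short sketch based on the sample formulas (divisor n - 1). The function names are illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    var_x = sample_covariance(xs, xs)   # covariance of x with itself = variance
    var_y = sample_covariance(ys, ys)
    return sample_covariance(xs, ys) / (var_x * var_y) ** 0.5

xs = [15, 6, 11, 12, 16, 14, 5, 17, 4, 19, 7]
ys = [45, 18, 33, 36, 48, 42, 15, 51, 12, 57, 21]

cov_xy = sample_covariance(xs, ys)
r = correlation(xs, ys)
print(round(cov_xy, 2), round(r, 2))
```

Every y value equals three times its x value, which is why the points fall on a straight line and the correlation is perfect.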

    Question 17c

    No answer indication available.

    Question 18a

    Cov(x,y) = 30.8.

    Question 18b

    r = 0.83.

    How to use probability calculation? - ExamTests 3

     

     

    Questions

    Question 1a

    The sample space S = [E1, E2, E3, E4, E5, E6]. Given A = [E1, E2, E3] and B = [E3, E4, E5].

    What is A intersection B?

    Question 1b

    What is the union of A and B?

    Question 1c

    Is the union of A and B collectively exhaustive?

    Question 2a

    Use the following sample space S: S = [E1, E2, E3, E4, E5, E6, E7, E8, E9, E10].

    Given A = [E1, E2, E3, E4], what is Ā?

    Question 2b

    Given Ā = [E1, E4, E5, E7] and B̄ (complement B) = [E2, E3, E5, E8]. What is A intersection B̄ (complement B)?

    Question 2c

    What is A intersection B?

    Question 2d

    What is the union of A and B?

    Question 2e

    Is the union of A and B collectively exhaustive?

    Question 3

    Suppose, two letters are to be selected from A, B, C, D, and E. Further, these two letters have to be arranged in order. How many permutations are possible?

    Question 4

    Suppose, there are 8 candidates that applied for a particular job. Yet, there are only 4 positions available. Of these 8 candidates, 5 are men and 3 are women. If every combination of candidates is equally likely to occur, what is then the probability that no women will be hired?

    Question 5a

    Suppose there are 10 Apple iPads, 5 Samsung tablets, and 5 Huawei tablets on offer in a store. A person enters the store and wants to buy 3 tablets. These tablets are selected purely by chance. What is the total number of outcomes in the sample space?

    Question 5b

    What is the probability that this person selects 2 Apple iPads and 1 Samsung tablet?

    Question 6a

    A sample space consists of 5 A's and 7 B's. Now, suppose we want to randomly draw two letters from this sample space. What is the total number of possible combinations?

    Question 6b

    What is the probability that a randomly selected set of 2 will include 1 A and 1 B?

    Question 7

    In a family of 6 family members, there are three males and three females. What is the probability that a random sample of two family members consists of two males?

    Question 8a

    Suppose there are 12 employees who could be assigned to an editorial task. Of these 12 employees, 7 are women and 5 are men. Two of the men are brothers. The manager of the company has to assign the editorial task randomly to one employee. Let A be the event "chosen employee is a man". Let B be the event "chosen employee is one of the brothers". What is the probability of event A?

    Question 8b

    What is the probability of event B?

    Question 8c

    What is the probability of the intersection of A and B?

    Question 9a

    Suppose, P(A) = 0.75, P(B) = 0.80, and P(A ∩ B) = 0.65. What is P (A ∪ B)?

    Question 9b

    What is the conditional probability of event B, given that event A has occurred?

    Question 9c

    What is the joint probability of both event A and event B?

    Question 10

    Suppose, within the Netherlands, 54% of all master's degrees are earned by women. Of all master's degrees that are obtained, 20% is obtained in psychology. In addition, 8% of all master's degrees are obtained by women in psychology. Are the events "the diploma holder is a woman" and the event "the diploma is in psychology" statistically independent?

    Question 11

    Suppose, the odds in favor of winning are 3 to 2. What is then the probability of winning?

    Question 12a

    Suppose, we are interested in examining the effect of alcohol on highway crashes. Obviously, it is unethical to provide one group of drivers with alcohol and compare their crash involvement to that of a sober group. We know, however, that 10.3% of the nighttime drivers have been drinking, and that 32.4% of the single-vehicle-accident drivers had been drinking. In this example, single-vehicle accidents are chosen to ensure that any driving error could be assigned to the driver only.
    Based on these data, what is the sample space?

    Question 12b

    What is the conditional probability that the driver had been drinking, given that he was not involved in a crash?

    Question 12c

    Do these numbers provide sufficient evidence to conclude that alcohol increases the probability of crashes?

    Question 13

    For questions 13-17, the sample space is defined by events A1, A2, B1, and B2.
    Given that P(A1) = 0.15, P(B1) = 0.20, and P(B1|A1) = 0.60. What is P(A1|B1)?

    Question 14

    Given that P(A1 ∩ B1) = 0.09 and P(B1) = 0.18. What is P(A1|B1)?

    Question 15

    Given that P(A2 ∩ B2) = 0.81 and P(B2) = 0.82. What is P(A2|B2)?

    Question 16

    Given that P(A1) = 0.10 and P(B1|A1) = 0.90. What is P(A1 ∩ B1)?

    Question 17

    Given that P(A1) = 0.10, P(B1|A1) = 0.90, and P(B2|A1) = 0.10. What is P(A2)?

    Answer indication

    Question 1a

    A ∩ B = [E3].

    Question 1b

    A ∪ B = [E1, E2, E3, E4, E5].

    Question 1c

    No, A and B are not collectively exhaustive, because E6 is not covered in the union.

    Question 2a

    Ā = [E5, E6, E7, E8, E9, E10]

    Question 2b

    First recover the events from their complements: A = [E2, E3, E6, E8, E9, E10] and B = [E1, E4, E6, E7, E9, E10]. Then A ∩ B̄ = [E2, E3, E8], the basic outcomes that are in A but not in B.

    Question 2c

    A ∩ B = [E6, E9, E10], the basic outcomes that are in neither Ā nor B̄.

    Question 2d

    A ∪ B = [E1, E2, E3, E4, E6, E7, E8, E9, E10]

    Question 2e

    No, E5 is not covered in the union of A and B, because it belongs to both Ā and B̄.

    Question 3

    There are five letters, that is n = 5, and two letters have to be selected in order, that is x = 2.
    Using the formula for the number of permutations yields:
    \[ P^{5}_{2} = \frac{5!}{(5-2)!} = \frac{5!}{3!} = \frac{120}{6} = 20 \]
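
The count can be confirmed both with the formula and by listing every ordered pair. This sketch uses `math.perm` and `itertools.permutations` from the Python standard library.

```python
from itertools import permutations
from math import perm

letters = ["A", "B", "C", "D", "E"]

count = perm(len(letters), 2)                   # 5! / (5 - 2)!
arrangements = list(permutations(letters, 2))   # ("A", "B") and ("B", "A") both appear

print(count, len(arrangements))
```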

    Question 4

    First, calculate the total number of possible combinations of four candidates selected from the eight possible candidates. That is:
    \[ C^{8}_{4} = \frac{8!}{4!4!} = 70 \]
    Then, if no woman is to be hired, the four successful candidates must all come from the five available men. The number of such combinations is as follows:
    \[ C^{5}_{4} = \frac{5!}{4!1!} = 5 \]
    To conclude, if each of the 70 possible combinations is equally likely to be chosen, the probability that one of the 5 all-male combinations is selected is 5/70 = 1/14 = 0.07 (that is, 7%).
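
The same reasoning as a short sketch, using `math.comb` for the number of combinations and `fractions.Fraction` to keep the probability exact. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

men, women, positions = 5, 3, 4

all_groups = comb(men + women, positions)   # every possible group of four
all_male_groups = comb(men, positions)      # groups drawn from the men only
p_no_women = Fraction(all_male_groups, all_groups)

print(all_groups, all_male_groups, p_no_women, float(p_no_women))
```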

    Question 5a

    \[N = C^{20}_{3} = \frac{20!}{3!(20-3)!} = 1,140 \]
    Thus, there are 1,140 outcomes in the sample space.

    Question 5b

    The number of ways that we can select 2 Apple iPads from the available 10 is as follows:
    \[ C^{10}_{2} = \frac{10!}{2!(10-2)!} = 45 \]
    Similarly, the number of ways that we can select 1 Samsung tablet from the available 5 is 5.
    \[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
    Therefore, the number of outcomes that satisfy event A is as follows:
    \[ N_{A} = C^{10}_{2} \times C^{5}_{1} = 45 \times 5 = 225 \]
    Hence, the probability of A [i.e., 2 Apple iPads and 1 Samsung tablet] is:
    \[ P_{A} = \frac{N_{A}}{N} = \frac{225}{1140} = 0.197 \]
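
Questions 5b and 6b share one structure: multiply the combinations within each group and divide by the combinations over all items. The helper `selection_probability` below is a hypothetical name for that pattern, assuming random selection without replacement.

```python
from math import comb, prod

def selection_probability(available, chosen):
    """Probability of drawing chosen[i] items from group i in a random draw."""
    favourable = prod(comb(a, c) for a, c in zip(available, chosen))
    total = comb(sum(available), sum(chosen))
    return favourable / total

# 10 Apple, 5 Samsung, 5 Huawei; draw 2 Apple, 1 Samsung, 0 Huawei.
p_tablets = selection_probability([10, 5, 5], [2, 1, 0])

# 5 A's and 7 B's; draw 1 A and 1 B.
p_letters = selection_probability([5, 7], [1, 1])

print(round(p_tablets, 3), round(p_letters, 2))
```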

    Question 6a

    The total number of possible combinations of 2 letters selected from 12 is as follows:
    \[ C^{12}_{2} = \frac{12!}{2!10!} = 66 \]

    Question 6b

    The number of ways that we can select 1 A from the 5 available A's is as follows:
    \[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
    Similarly, the number of ways that we can select 1 B from the 7 available B's is as follows:
    \[ C^{7}_{1} = \frac{7!}{1!(7-1)!} = \frac{7!}{1!6!} = 7 \]
    Therefore, the number of ways that we can select one A and one B, that is the number of outcomes that satisfy event A, is as follows:
    \[ N_{A} = C^{5}_{1} \times C^{7}_{1} = 5 \times 7 = 35 \]
    Finally, the probability of event A (that is, one A and one B) is as follows:
    \[ P_{A} = \frac{N_{A}}{N} = \frac{35}{66} = 0.53 \]

    Question 7

    The total number of ways to select two of the six family members is:
    \[ N = C^{6}_{2} = \frac{6!}{2!4!} = \frac{720}{48} = 15 \]
    Now, the number of combinations for two males is:
    \[ C^{3}_{2} = \frac{3!}{2!1!} = \frac{6}{2} = 3 \]
    Therefore, the probability of selecting two males is 3/15 = 0.20 (that is: 20%).

    Question 8a

    \[P_{A} = \frac{N_{A}}{N} = \frac{5}{12} = 0.42 \]

    Question 8b

    \[P_{B} = \frac{N_{B}}{N} = \frac{2}{12} = 0.17 \]

    Question 8c

    P(A ∩ B) = P(B) = 2/12 = 0.17, because both brothers are men, so every outcome in B is also in A.

    Question 9a

    Use the addition rule of probabilities.
    \[ P(A ∪ B) = P(A) + P(B) - P(A ∩ B) \]
    This gives:
    \[ P(A ∪ B) = 0.75 + 0.80 - 0.65 = 0.90 \]

    Question 9b

    \[ P(B|A) = \frac{P(A ∩ B)}{P(A)} = \frac{0.65}{0.75} = 0.8667 \]

    Question 9c

    To answer this question, use the multiplication rule of probabilities. That is:
    \[ P(A ∩ B) = P(A|B) P(B) = (0.8125)(0.80) = 0.65 \]
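
The three parts of question 9 can be checked together. This is a plain arithmetic sketch; the variable names are illustrative.

```python
p_a, p_b, p_a_and_b = 0.75, 0.80, 0.65

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probabilities: P(B|A) = P(A and B) / P(A), and likewise for P(A|B)
p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

# Multiplication rule recovers the joint probability: P(A|B) * P(B)
joint = p_a_given_b * p_b

print(round(p_a_or_b, 2), round(p_b_given_a, 4), round(p_a_given_b, 4), round(joint, 2))
```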

    Question 10

    \[ P(A) = 0.54, P(B) = 0.20, P(A ∩ B) = 0.08 \]
    Since
    \[ P(A)P(B) = (0.54)(0.20) = 0.108 \neq 0.08 = P(A ∩ B) \]
    these events are not independent.
    The dependence can be found from the conditional probability:
    \[ P(A|B) = \frac{P(A ∩ B)}{P(B)} = \frac{0.08}{0.20} = 0.40 \neq 0.54 = P(A) \]
    That means that, in the Netherlands, only 40% of psychology degrees go to women, whereas women constitute 54% of all degree recipients.

    Question 11

    \[ \frac{3}{2} = \frac{P(A)}{1-P(A)} \]
    \[ 3(1-P(A)) = 2P(A) \]
    \[ 5P(A) = 3 \]
    \[ P(A) = \frac{3}{5} = 0.6 \]
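
Solving the odds equation in general gives P(A) = a / (a + b) for odds of a to b in favor. A minimal sketch, with the hypothetical helper name `odds_to_probability`:

```python
from fractions import Fraction

def odds_to_probability(in_favor, against):
    """Odds of in_favor to against correspond to in_favor / (in_favor + against)."""
    return Fraction(in_favor, in_favor + against)

p_win = odds_to_probability(3, 2)
print(p_win, float(p_win))
```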

    Question 12a

    A1: the driver had been drinking.
    A2: the driver had not been drinking.
    C1: the driver was involved in a single-vehicle crash.
    C2: the driver was not involved in a single-vehicle crash.

    Question 12b

    P(A1|C2) = 0.103, taking the nighttime drivers in general as the drivers who were not involved in a crash.

    Question 12c

    P(A1|C1) = 0.324 and P(A1|C2) = 0.103.

    To answer this question, use the overinvolvement ratio. That is:
    \[ \frac{P(A_{1}|C_{1})}{P(A_{1}|C_{2})} = \frac{0.324}{0.103} = 3.15 \]
    Because this ratio of 3.15 is greater than 1, drinking drivers are overrepresented among crash drivers. We can conclude that there is evidence that alcohol increases the probability of car crashes.
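
The overinvolvement ratio is a single division of two conditional probabilities. The sketch below uses an illustrative function name and the two percentages given in the question.

```python
def overinvolvement_ratio(p_given_crash, p_given_no_crash):
    """Ratio of P(drinking | crash) to P(drinking | no crash)."""
    return p_given_crash / p_given_no_crash

ratio = overinvolvement_ratio(0.324, 0.103)

# A ratio above 1 means drinking drivers are overrepresented among crash drivers.
print(round(ratio, 2), ratio > 1)
```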

    Question 13

    Using Bayes' theorem, we find that P(A1|B1) = (0.60*0.15)/(0.20) = 0.45.
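
Bayes' theorem, as used here, reverses the direction of a conditional probability. A minimal sketch with an illustrative function name:

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_a1_given_b1 = bayes(p_b_given_a=0.60, p_a=0.15, p_b=0.20)
print(round(p_a1_given_b1, 2))
```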

    Question 14

    \[ P(A_{1}|B_{1}) = \frac{P(A_{1} ∩ B_{1})}{P(B_{1})} = \frac{0.09}{0.18} = 0.50 \]

    Question 15

    \[ P(A_{2}|B_{2}) = \frac{P(A_{2} ∩ B_{2})}{P(B_{2})} = \frac{0.81}{0.82} = 0.988 \]

    Question 16

    P(A1 ∩ B1) = 0.90 * 0.10 = 0.09

    Question 17

    Use both:
    P(A1 ∩ B1) = 0.90 * 0.10 = 0.09
    and:
    P(A1 ∩ B2) = 0.10 * 0.10 = 0.01
    to find that:
    P(A1) = 0.09 + 0.01 = 0.10
    A2 is the complement of A1, thus P(A2) = 1 - P(A1) = 1 - 0.10 = 0.90

     


     

    How to use probability models for discrete random variables? - ExamTests 4

     

     

    Questions

    Question 1

    A researcher is studying the number of owl eggs found in Denmark. Is the number of eggs a discrete or continuous random variable?

    Question 2

    The weight of students is recorded as part of a national health study. Is the weight of students a discrete or continuous random variable?

    Question 3

    Indicate for each of the following if a discrete or continuous random variable provides the best definition:

    • The number of sunny days in the Netherlands.
    • The level of pressure in the tires of a car.
    • The amount of oil exported by Saudi Arabia in 2019.

    Question 4

    Give the probability distribution function of the face values of a single die when a fair die is rolled.

    Question 5

    What is the probability of a value of 5 or higher, when rolling a single fair die once?

    Question 6a

    Use the following probability distribution:

    x      0     1     2     3     4     5     6
    P(x)   0.03  0.15  0.11  0.19  0.22  0.26  0.04

    P(3 ≤ x < 6) = ?

    Question 6b

    P(x ≥ 3) = ?

    Question 6c

    P(2 < x ≤ 5) = ?

    Question 6d

    P(x < 4) = ?

    Question 6e

    What is the mean of this probability distribution?

    Question 7

    Suppose the probability distribution of the number of errors (X) on pages from a business textbook is as follows: P(0) = 0.81; P(1) = 0.17; P(2) = 0.02.
    What is the mean number of errors per page?

    Question 8a

    Someone is interested in the total costs of a project on which he intends to bid. He estimates that the materials will cost €25,000,- and that the labor will cost €900,- per day. Suppose the project takes X days to complete. Provide the linear function for the total costs, denoted by C, of the project.

    Question 8b

    Now, assume that the following probability distribution is provided for the completion time of the project. What is the mean for completion time X?

    Completion time (x)   10    11    12    13    14
    P(x)                  0.1   0.3   0.3   0.2   0.1

    Question 8c

    What is the variance for completion time X?

    Question 8d

    What is the mean for the total costs, C?

    Question 8e

    What is the variance for the total costs, C?

    Question 9a

    Suppose that a real estate agent has five contacts and believes that for each contact the probability of making a sale is 0.40. What is the probability that the real estate agent makes at most 1 sale?

    Question 9b

    What is the probability that the real estate agent makes between 2 and 4 sales (inclusive)?

    Question 10a

    It is predicted that 3.5% of all small corporations will file for bankruptcy in 2020. For a random sample of 100 small corporations, estimate the probability that at least 3 will file for bankruptcy in 2020, assuming that this prediction is correct. To do so, use the Poisson distribution.

    Question 10b

    Now, do the same using the (actual) binomial distribution. Is the Poisson distribution a close estimate of the actual binomial distribution?

    Question 11a

    Consider the following joint probability distribution for two random variables X and Y. Find the marginal probabilities.

                Y return
    X return    0%      5%      10%     15%
    0%          0.0625  0.0625  0.0625  0.0625
    5%          0.0625  0.0625  0.0625  0.0625
    10%         0.0625  0.0625  0.0625  0.0625
    15%         0.0625  0.0625  0.0625  0.0625

    Question 11b

    Are X and Y independent?

    Question 11c

    Find the mean of X.

    Question 11d

    Find the mean of Y.

    Question 11e

    What is the variance of X?

    Question 11f

    What is the standard deviation of X?

    Question 12

    Consider the following probability distribution

             X
    Y        0      1
    0        0.25   0.35
    1        0.10   0.30

    Compute the marginal probability distributions for X and Y.

    Question 13a

    Consider the following information for questions 13a-13c. An investor has €1000,- to invest and two investment opportunities, each requiring a minimum of €500,-. The profit per €100,- from the first investment (X) can be represented by the following probability distribution: P(X = -5) = 0.4 and P(X = 20) = 0.6. The profit per €100,- from the second investment (Y) is represented by the following probability distribution: P(Y = 0) = 0.6 and P(Y = 25) = 0.4. Random variables X and Y are independent. The investor has the following possible strategies:

    1. €1000,- in the first investment.
    2. €1000,- in the second investment.
    3. €500,- in each investment.

    Find the mean and variance for the first strategy.

    Question 13b

    Find the mean and variance for the second strategy.

    Question 13c

    Find the mean and variance for the third strategy.

    Answer indication

    Question 1

    It is a discrete random variable, because it can take on only a countable number of values (0, 1, 2, ...).

    Question 2

    The weight of students is a continuous random variable.

    Question 3

    • The number of sunny days in the Netherlands: discrete.
    • The level of pressure in the tires of a car: continuous.
    • The amount of oil exported by Saudi Arabia in 2019: continuous.

    Question 4

    Probability distribution of a single fair die

    x    P(x)
    1    0.16667
    2    0.16667
    3    0.16667
    4    0.16667
    5    0.16667
    6    0.16667

    Question 5

    P(X ≥ 5) = P(5) + P(6) = 1/6 + 1/6 = 1/3 ≈ 0.3333

    Question 6a

    P(3 ≤ x < 6) = P(3) + P(4) + P(5) = 0.19 + 0.22 + 0.26 = 0.67

    Question 6b

    P(x ≥ 3) = P(3) + P(4) + P(5) + P(6) = 0.19 + 0.22 + 0.26 + 0.04 = 0.71

    Question 6c

    P(2 < x ≤ 5) = P(3) + P(4) + P(5) = 0.19 + 0.22 + 0.26 = 0.67

    Question 6d

    P(x < 4) = 0.03 + 0.15 + 0.11 + 0.19 = 0.48

    Question 6e

    \[ \mu_{X} = 0(0.03) + (1)(0.15) + (2)(0.11) + (3)(0.19) + (4)(0.22) + (5)(0.26) + (6)(0.04) = 3.36 \]
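
    The mean computed above, and the event probability from Question 6d, can be reproduced with a short Python sketch. The dictionary `pmf` is an illustrative name holding the distribution given in Question 6a.

```python
# Probability distribution from Question 6a: value -> probability.
pmf = {0: 0.03, 1: 0.15, 2: 0.11, 3: 0.19, 4: 0.22, 5: 0.26, 6: 0.04}

# A valid distribution sums to 1 (up to floating-point rounding).
total = sum(pmf.values())

# Mean: sum of x * P(x) over all values of x.
mean = sum(x * p for x, p in pmf.items())

# Event probability P(x < 4), as in Question 6d.
p_less_than_4 = sum(p for x, p in pmf.items() if x < 4)

print(f"total = {total:.2f}, mean = {mean:.2f}, P(x < 4) = {p_less_than_4:.2f}")
```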

    Question 7

    \[ \mu_{x} = E[X] = \sum_{x} xP(x) = (0)(0.81) + (1)(0.17) + (2)(0.02) = 0.21 \]
    Thus, the mean number of errors per page is 0.21.

    Question 8a

    C = 25,000 + 900X.

    Question 8b

    \[ \mu_{X} = E[X] = \sum_{x}xP(x) = (10)(0.1) + (11)(0.3) + (12)(0.3) + (13)(0.2) + (14)(0.1) = 11.9 \]
    So, the mean for completion time X is 11.9 days.

    Question 8c

    \[ \sigma^{2}_{X} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu_{X})^{2}P(x) \]
    \[ (10 - 11.9)^{2}(0.1) + (11 - 11.9)^{2}(0.3) + ... + (14 - 11.9)^{2}(0.1) = 1.29 \]
    So, the variance for completion time X is 1.29 (in days squared).

    Question 8d

    \[ \mu_{C} = E[25,000 + 900X] = (25,000 + 900\mu_{X}) = 25,000 + (900)(11.9) = €35,710,- \]

    Question 8e

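    \[ \sigma^{2}_{C} = Var(25,000 + 900X) = (900)^{2}\sigma^{2}_{X} = (810,000)(1.29) = 1,044,900 \]
    The variance is expressed in euros squared; the corresponding standard deviation is approximately €1,022,-.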
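
    The calculations for Questions 8b to 8e can be checked with the sketch below. It assumes the probabilities used in the answer to Question 8b (0.1, 0.3, 0.3, 0.2, 0.1) and applies the rules E[a + bX] = a + bE[X] and Var(a + bX) = b²Var(X); the names `a` and `b` are illustrative.

```python
# Completion-time distribution as used in the answer to Question 8b.
pmf = {10: 0.1, 11: 0.3, 12: 0.3, 13: 0.2, 14: 0.1}

# Cost function C = a + bX: fixed materials cost plus labor cost per day.
a, b = 25_000, 900

mean_x = sum(x * p for x, p in pmf.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf.items())

# Linear-function rules for the mean and the variance.
mean_c = a + b * mean_x
var_c = b ** 2 * var_x

print(f"E[X] = {mean_x:.1f}, Var(X) = {var_x:.2f}")
print(f"E[C] = {mean_c:.0f}, Var(C) = {var_c:.0f}")
```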

    Question 9a

    \[ P(0) = \frac{5!}{0!5!}(0.4)^{0}(0.6)^{5} = (0.6)^{5} = 0.078 \]
    \[ P(1) = \frac{5!}{1!4!}(0.4)^{1}(0.6)^{4} = 5(0.4)(0.6)^{4} = 0.259 \]
    P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.078 + 0.259 = 0.337

    Question 9b

    \[ P(2) = \frac{5!}{2!3!}(0.4)^{2}(0.6)^{3} = 10(0.4)^{2}(0.6)^{3} = 0.346 \]
    \[ P(3) = \frac{5!}{3!2!}(0.4)^{3}(0.6)^{2} = 10(0.4)^{3}(0.6)^{2} = 0.230 \]
    \[ P(4) = \frac{5!}{4!1!}(0.4)^{4}(0.6)^{1} = 5(0.4)^{4}(0.6)^{1} = 0.077 \]
    P(2 ≤ X ≤ 4) = P(2) + P(3) + P(4) = 0.346 + 0.230 + 0.077 = 0.653
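
    The binomial probabilities for Questions 9a and 9b can be verified with a short Python sketch. The helper `binomial_pmf` is a hypothetical name; it implements the formula used above with n = 5 and P = 0.40.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.40

# At most 1 sale: k = 0 or k = 1.
p_at_most_1 = sum(binomial_pmf(k, n, p) for k in range(0, 2))

# Between 2 and 4 sales, inclusive: k = 2, 3, 4.
p_2_to_4 = sum(binomial_pmf(k, n, p) for k in range(2, 5))

print(f"P(at most 1) = {p_at_most_1:.3f}, P(2 to 4) = {p_2_to_4:.3f}")
```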

    Question 10a

    The distribution of X is binomial with n = 100 and P = 0.035, so that the mean of the distribution is equal to nP = 3.5. Next, using the Poisson distribution to approximate the probability of at least 3 bankruptcies, we find:
    \[ P(X \geq 3) = 1 - P(X \leq 2) \]
    \[ P(0) = \frac{e^{-3.5}(3.5)^{0}}{0!} = e^{-3.5} = 0.030197 \]
    \[ P(1) = \frac{e^{-3.5}(3.5)^{1}}{1!} = (3.5)(0.030197) = 0.1056895 \]
    \[ P(2) = \frac{e^{-3.5}(3.5)^{2}}{2!} = (6.125)(0.030197) = 0.1849566 \]
    Hence,
    \[ P(X \leq 2) = P(0) + P(1) + P(2) = 0.3208431 \]
    \[ P(X \geq 3) = 1 - 0.3208431 = 0.6791569 \]

    Question 10b

    Using the binomial distribution, we compute the probability of X ≥ 3 as: P(X ≥ 3) = 0.684093.
    Thus, the Poisson probability is a close estimate of the actual binomial distribution.
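
    The comparison between the Poisson approximation and the exact binomial probability can be sketched as follows. The helper names are illustrative; the inputs n = 100 and P = 0.035 come from the question.

```python
from math import comb, exp, factorial

n, p = 100, 0.035
lam = n * p  # Poisson mean, equal to nP = 3.5

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# P(X >= 3) = 1 - P(X <= 2) under each model.
poisson_tail = 1 - sum(poisson_pmf(k, lam) for k in range(3))
binomial_tail = 1 - sum(binomial_pmf(k, n, p) for k in range(3))

print(f"Poisson:  {poisson_tail:.4f}")
print(f"Binomial: {binomial_tail:.4f}")
```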

    Question 11a

    \[ P(X = 0) = \sum_{y}P(0,y) = 0.0625 + 0.0625 + 0.0625 + 0.0625 = 0.25\]
    Note that for every combination of values for X and Y, P(x,y) = 0.0625. Therefore, all the marginal probabilities of X are 25%. The same holds for the marginal probabilities of Y. Note that the sum of the marginal probabilities for a random variable is 1.

    Question 11b

    To test independence, we need to check if P(x,y) = P(x)P(y) for all possible pairs of values x and y.
    P(x,y) = 0.0625 for all possible values of x and y.
    P(x) = 0.25 and P(y) = 0.25 for all possible values of x and y.
    P(x,y) = 0.0625 = (0.25)(0.25) = P(x)P(y)
    Thus, X and Y are independent.

    Question 11c

    \[ \mu_{X} = E[X] = \sum_{x}xP(x) = 0(0.25) + 0.05(0.25) + 0.10(0.25) + 0.15(0.25) = 0.075 \]

    Question 11d

    The mean of Y is equal to the mean of X, that is 0.075.

    Question 11e

    \[ \sigma^{2}_{X} = \sum_{X}(x-\mu_{X})^{2}P(x) = (0.25)[(0 - 0.075)^{2} + (0.05 - 0.075)^{2} + (0.10 - 0.075)^{2} + (0.15 - 0.075)^{2}] = 0.003125 \]

    Question 11f

    The standard deviation of X is the square root of the variance, that is 0.0559016, or 5.59%.
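
    The steps of Questions 11a to 11f (marginal probabilities, independence check, mean, variance, and standard deviation) can be sketched in Python. The joint table is stored as a dictionary keyed by (x, y); all names are illustrative.

```python
import math

returns = [0.00, 0.05, 0.10, 0.15]

# Joint distribution from Question 11a: every cell equals 0.0625.
joint = {(x, y): 0.0625 for x in returns for y in returns}

# Marginal probabilities: sum the joint probabilities over the other variable.
marginal_x = {x: sum(joint[(x, y)] for y in returns) for x in returns}
marginal_y = {y: sum(joint[(x, y)] for x in returns) for y in returns}

# Independence: P(x, y) = P(x)P(y) must hold for every pair.
independent = all(
    math.isclose(joint[(x, y)], marginal_x[x] * marginal_y[y])
    for x in returns
    for y in returns
)

mean_x = sum(x * p for x, p in marginal_x.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in marginal_x.items())
sd_x = math.sqrt(var_x)

print(independent, round(mean_x, 4), round(var_x, 6), round(sd_x, 4))
```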

    Question 12

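    \[ P(X = 0) = \sum_{y}P(0,y) = 0.25 + 0.10 = 0.35 \]
    \[ P(X = 1) = \sum_{y}P(1,y) = 0.35 + 0.30 = 0.65 \]
    \[ P(Y = 0) = \sum_{x}P(x,0) = 0.25 + 0.35 = 0.60 \]
    \[ P(Y = 1) = \sum_{x}P(x,1) = 0.10 + 0.30 = 0.40 \]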

    Question 13a

    \[ \mu_{X} = E[X] = \sum_{x}xP(x) = (-5)(0.4) + (20)(0.6) = €10,- \]
    \[ \sigma^{2}_{x} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu)^{2} P(x) = (-5 - 10)^{2}(0.4) + (20 - 10)^{2}(0.6) = 150 \]
    Strategy 1 has a mean profit of E[10X] = €100,- and variance of Var(10X) = 100Var(X) = 15,000.

    Question 13b

    \[ \mu_{Y} = E[Y] = \sum_{y}yP(y) = (0)(0.6) + (25)(0.4) = €10,- \]
    \[ \sigma^{2}_{y} = E[(Y - \mu_{Y})^{2}] = \sum_{y}(y - \mu)^{2} P(y) = (0 - 10)^{2}(0.6) + (25 - 10)^{2}(0.4) = 150 \]
    Strategy 2 has a mean profit of E[10Y] = €100,- and variance of Var(10Y) = 100Var(Y) = 15,000.

    Question 13c

    \[ E[5X + 5Y] = E[5X] + E[5Y] = 5E[X] + 5E[Y] = €100,- \]
    \[ Var(5X + 5Y) = Var(5X) + Var(5Y) = 25Var(X) + 25Var(Y) = 7,500 \]
    The variance of strategy 3 is smaller than that of strategies 1 and 2, reflecting the decrease in risk that follows from diversification in an investment portfolio. Most investors would prefer strategy 3, because this strategy yields the same expected return as the other two strategies, but with a lower risk.
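
    The three strategies can be compared with the sketch below. It relies on the independence of X and Y stated in the question, so that Var(aX + bY) = a²Var(X) + b²Var(Y). The function names are hypothetical.

```python
def mean_var(pmf):
    """Return the mean and variance of a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var

# Profit per 100 euros for each investment, from Question 13a.
mean_x, var_x = mean_var({-5: 0.4, 20: 0.6})
mean_y, var_y = mean_var({0: 0.6, 25: 0.4})

def portfolio(a, b):
    """Mean and variance of aX + bY, assuming X and Y are independent."""
    return a * mean_x + b * mean_y, a ** 2 * var_x + b ** 2 * var_y

# a and b count the blocks of 100 euros placed in each investment.
strategies = {1: portfolio(10, 0), 2: portfolio(0, 10), 3: portfolio(5, 5)}

for number, (mean, var) in strategies.items():
    print(f"Strategy {number}: mean = {mean:.0f}, variance = {var:.0f}")
```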

     


     

     

    How to use probability models for continuous random variables? - ExamTests 5

     

     

    Questions

    Question 1

    Consider the uniform probability density function f(x) = 0.5 on the range 0 to 2, so that the cumulative distribution function is F(x) = 0.5x. What is the probability that a random variable X is between 1.4 and 1.8?

    Question 2

    Consider the uniform probability density function f(x) = 0.5 on the range 0 to 2, so that the cumulative distribution function is F(x) = 0.5x. What is the probability that a random variable X is between 0.5 and 1.6?

    Question 3

    Consider the uniform probability density function f(x) = 0.5 on the range 0 to 2, so that the cumulative distribution function is F(x) = 0.5x. What is the probability that a random variable X is less than 0.8?

    Question 4

    Consider the uniform probability density function f(x) = 0.5 on the range 0 to 2, so that the cumulative distribution function is F(x) = 0.5x. What is the probability that a random variable X is greater than 1.3?

    Question 5

    A homeowner estimates the heating bill based on the range of likely temperatures in January. He obtains the following linear equation: Y = 290 - 5T, in which T refers to the average temperature for the month in degrees Fahrenheit. If the average temperature in January has mean 24 and standard deviation 4, what are the mean and standard deviation of this homeowner's January heating bill?

    Question 6

    The profit for a production process is equal to 6000 dollars minus three times the number of units produced. The mean and variance for the number of units produced are 1000 and 900 respectively. Find the mean and variance of the profit.

    Question 7

    The profit of a particular production process is equal to €2000,- minus two times the number of units produced. The mean and variance for the number of units produced are 500 and 900 respectively. What are the mean and variance of the profit?

    Question 8

    The profit of a particular production process is equal to €1000,- minus two times the number of units produced. The mean and variance for the number of units produced are 50 and 90 respectively. What are the mean and variance of the profit?

    Question 9

    Consider the standard normal distribution for questions 9-19.
    P(Z < 1.16) = ?

    Question 10

    P(Z > 1.73) = ?

    Question 11

    P(Z > -2.29) = ?

    Question 12

    P(Z > -1.35) = ?

    Question 13

    P(1.16 < Z < 1.73) = ?

    Question 14

    P(-2.29 < Z < 1.26) = ?

    Question 15

    P(-2.29 < Z < -1.35) = ?

    Question 16

    The probability is 0.70 that Z is less than what number?

    Question 17

    The probability is 0.25 that Z is less than what number?

    Question 18

    The probability is 0.2 that Z is greater than what number?

    Question 19

    The probability is 0.6 that Z is greater than what number?

    Question 20

    Let a continuous random variable X be normally distributed with X ~ N(30, 81), that is, with mean 30 and variance 81. What is the probability that X is greater than 40?

    Question 21

    The anticipated consumer demand at a restaurant can be modeled by a normal random variable with mean 1,500 pounds and standard deviation 110 pounds. What is the probability that the demand will exceed 1,300 pounds?

    Question 22

    The scores on an achievement test are known to be normally distributed with a mean of 420 and a standard deviation of 80. What is the minimum test score needed in order to be in the top 10% of all people taking the test?

    Question 23

    Given a random sample size of n = 900 from a binomial probability distribution with P = 0.30. Can the normal distribution be used to approximate probabilities for this distribution? If so, why?

    Question 24

    Given a random sample size of n = 900 from a binomial probability distribution with P = 0.30. What is the probability that the number of successes is greater than 305?

    Question 25

    Service times for customers at a library information desk can be modeled by an exponential distribution with a mean service of 5 minutes. What is the probability that a customer service time will take longer than 10 minutes?

    Question 26

    A company in the Netherlands with 2000 employees has a mean number of lost-time accidents per week equal to λ = 0.4 and the number of accidents follow a Poisson distribution. What is the probability that the time between accidents is less than 2 weeks?

    Question 27a

    An investor has asked you for assistance in establishing a portfolio containing two stocks. The investor has €1000,- which can be allocated in any proportion to two alternative stocks. The returns per euro from these two investments are denoted by random variables X and Y. Both of these variables are normally distributed. Investment X has a mean of 25 and variance of 81. The second investment has a mean of 40 and a variance of 121. These two stock prices have a negative correlation, ρxy = -0.40. Define the linear equation of the value of the portfolio, denoted by W.

    Question 27b

    What is the mean value for the stock portfolio?

    Question 27c

    What is the standard deviation for the stock portfolio?

    Question 27d

    What is the probability that the portfolio value exceeds 2,000?

    Answer indication

    Question 1

    P(1.4 < X < 1.8) = F(1.8) - F(1.4) = (0.5)(1.8) - (0.5)(1.4) = 0.9 - 0.7 = 0.2.

    Question 2

    P(0.5 < X < 1.6) = F(1.6) - F(0.5) = (0.5)(1.6) - (0.5)(0.5) = 0.8 - 0.25 = 0.55.

    Question 3

    P(X < 0.8) = F(0.8) = (0.5)(0.8) = 0.40.

    Question 4

    P(X > 1.3) = P(1.3 < X < 2.0) = F(2.0) - F(1.3) = (0.5)(2.0) - (0.5)(1.3) = 1.0 - 0.65 = 0.35.

    Question 5

    \[ \mu_{Y} = 290 - 5\mu_{T} = 290 - (5)(24) = 170 \]
    \[ \sigma_{Y} = |-5| \sigma_{T} = (5)(4) = 20 \]

    Question 6

    Let U be the number of units produced, with mean 1000 and variance 900.
    \[ Y = 6000 - 3U \]
    \[ \mu_{Y} = 6000 - 3\mu_{U} = 6000 - (3)(1000) = 3000 \]
    \[ \sigma^{2}_{Y} = (-3)^{2}\sigma^{2}_{U} = (9)(900) = 8100 \]
    Thus, the mean and variance of the profit are 3,000 dollars and 8,100 respectively.

    Question 7

    Let U be the number of units produced, with mean 500 and variance 900.
    \[ Y = 2000 - 2U \]
    \[ \mu_{Y} = 2000 - 2\mu_{U} = 2000 - (2)(500) = 1000 \]
    \[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(900) = 3600 \]
    Thus, the mean and variance of the profit are €1000,- and 3,600 respectively.

    Question 8

    Let U be the number of units produced, with mean 50 and variance 90.
    \[ Y = 1000 - 2U \]
    \[ \mu_{Y} = 1000 - 2\mu_{U} = 1000 - (2)(50) = 900 \]
    \[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(90) = 360 \]
    Thus, the mean and variance of the profit are €900,- and 360 respectively.

    Question 9

    P(Z < 1.16) = 0.8770

    Question 10

    P(Z > 1.73) = 1 - 0.9582 = 0.0418

    Question 11

    P(Z > -2.29) = P(Z < 2.29) = 0.9890

    Question 12

    P(Z > -1.35) = P(Z < 1.35) = 0.9115

    Question 13

    P(1.16 < Z < 1.73) = 0.9582 - 0.8770 = 0.0812
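
    The table lookups for Questions 9, 10, and 13 can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch; the results agree with the table values up to rounding in the fourth decimal.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_below_116 = z.cdf(1.16)                # P(Z < 1.16)
p_above_173 = 1 - z.cdf(1.73)            # P(Z > 1.73)
p_between = z.cdf(1.73) - z.cdf(1.16)    # P(1.16 < Z < 1.73)

print(f"{p_below_116:.4f} {p_above_173:.4f} {p_between:.4f}")
```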

    Question 14

    P(-2.29 < Z < 1.26) = P(Z < 1.26) - P(Z < -2.29) = 0.8962 - (1 - 0.9890) = 0.8962 - 0.0110 = 0.8852

    Question 15

    P(-2.29 < Z < -1.35) = (1 - 0.9115) - (1 - 0.9890) = 0.0885 - 0.0110 = 0.0775

    Question 16

    z = 0.525

    Question 17

    z = -0.675

    Question 18

    z = 0.842

    Question 19

    z = -0.256

    Question 20

    \[ Z = \frac{X - \mu}{\sigma} = \frac{40 - 30}{\sqrt{81}} = \frac{10}{9} = 1.11 \]
    P(Z > 1.11) = 1 - 0.8665 = 0.1335

    Question 21

    \[ Z = \frac{(1300 - 1,500)}{110} = -1.82 \]
    P(Z > -1.82) = 0.9656

    Question 22

    The top 10% corresponds to z = 1.28 (the cumulative probability closest to 0.90 in the Standard Normal Distribution Table).
    \[ 1.28 = \frac{X - 420}{80} \]
    \[ 1.28*80 = X - 420 \]
    \[ 102.4 + 420 = X\]
    Thus, X = 522.4. One needs to score at least 523 to be in the top 10% of all people taking this test.

    Question 23

    nP(1 - P) = 900*0.30(1 - 0.30) = 189 > 5, thus the binomial distribution can be approximated by the normal distribution with mean nP and variance nP(1 - P).

    Question 24

    \[ \mu = nP = 270 \]
    \[ \sigma^{2} = 189 \]
    \[ \sigma = \sqrt{189} = 13.75 \]
    \[ z = \frac{305 - 270}{13.75} = 2.55 \]
    P(Z > 2.55) = 1 - 0.9946 = 0.0054
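
    The normal approximation used above can be sketched in Python. Like the worked answer, this sketch applies no continuity correction; that choice is an assumption carried over from the answer.

```python
from math import sqrt
from statistics import NormalDist

n, p = 900, 0.30

# Mean and standard deviation of the approximating normal distribution.
mu = n * p
sigma = sqrt(n * p * (1 - p))

# P(X > 305) under the normal approximation.
p_more_than_305 = 1 - NormalDist(mu, sigma).cdf(305)

print(f"mu = {mu:.0f}, sigma = {sigma:.2f}, P(X > 305) = {p_more_than_305:.4f}")
```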

    Question 25

    \[ P(T > 10) = 1 - P(T < 10) = 1 - F(10) = 1 - (1 - e^{-(0.20)(10)}) = e^{-2.0} = 0.1353 \]
    Thus, the probability that a service time exceeds 10 minutes is 0.1353.

    Question 26

    \[ P(T < 2) = F(2) = 1 - e^{-(0.4)(2)} = 1 - e^{-0.8} = 1 - 0.4493 = 0.5507 \]
    Thus, the probability of less than 2 weeks between accidents is about 55%.
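
    Both exponential calculations (Questions 25 and 26) use the cumulative distribution function F(t) = 1 - e^(-λt). A minimal Python sketch, with a hypothetical helper name:

```python
from math import exp

def exponential_cdf(t, rate):
    """P(T < t) for an exponential distribution with the given rate (lambda)."""
    return 1 - exp(-rate * t)

# Question 25: mean service time 5 minutes, so lambda = 1/5 = 0.20.
p_service_over_10 = 1 - exponential_cdf(10, rate=1 / 5)

# Question 26: lambda = 0.4 accidents per week.
p_gap_under_2 = exponential_cdf(2, rate=0.4)

print(f"{p_service_over_10:.4f} {p_gap_under_2:.4f}")
```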

    Question 27a

    W = 20X + 30Y

    Question 27b

    E[W] = 20*25 + 30*40 = 1,700

    Question 27c

    \[ \sigma^{2}_{W} = 20^{2} \sigma^{2}_{X} + 30^{2} \sigma^{2}_{Y} + 2*20*30 \rho_{XY} \sigma_{X} \sigma_{Y} \]
    \[ \sigma^{2}_{W} = 20^{2}*81 + 30^{2}*121 + 2*20*30*(-0.40)*9*11 = 93,780 \]
    \[ \sigma = \sqrt{\sigma^{2}} = \sqrt{93,780} = 306.24 \]

    Question 27d

    \[ Z = \frac{2000 - 1700}{306.24} = 0.980 \]
    P(Z > 0.980) = 0.1635
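
    The portfolio calculations of Questions 27b to 27d can be reproduced with the sketch below. It takes the weights 20 and 30 from the answer to Question 27a as given and uses Cov(X, Y) = ρσXσY; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a, b = 20, 30              # weights in W = 20X + 30Y
mean_x, var_x = 25, 81
mean_y, var_y = 40, 121
rho = -0.40                # correlation between X and Y

mean_w = a * mean_x + b * mean_y

# Variance of a linear combination of two correlated variables.
cov_xy = rho * sqrt(var_x) * sqrt(var_y)
var_w = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
sd_w = sqrt(var_w)

# P(W > 2000), treating W as normally distributed.
p_above_2000 = 1 - NormalDist(mean_w, sd_w).cdf(2000)

print(f"mean = {mean_w}, sd = {sd_w:.2f}, P(W > 2000) = {p_above_2000:.4f}")
```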

     


     

     

    How to obtain a proper sample from a population? - ExamTests 6

     

     

    Questions

    Question 1a

    Suppose that we know that the annual percentage salary increase is normally distributed with a mean of 12.2% and a standard deviation of 3.6%. A random sample of 9 observations is obtained from this population and the sample mean is computed. What is the standard error of the sample mean?

    Question 1b

    What is the probability that the sample mean exceeds 14.4%?

    Question 2a

    Given a population with a mean of 105 and a variance of 16, the central limit theorem applies when the sample size is n ≥ 25. A random sample of size 25 is obtained. What are the mean and variance of the sampling distribution for the sample means?

    Question 2b

    What is the probability that x̅ > 106?

    Question 2c

    What is the probability that 104 < x̅ < 106?

    Question 2d

    What is the probability that x̅ < 105.5?

    Question 3a

    Given a population with a mean of 150 and a variance of 1600, the central limit theorem applies when the sample size is n > 25. A random sample of size 36 is obtained. What are the mean and variance of the sampling distribution for the sample means?

    Question 3b

    What is the probability that x̅ > 155?

    Question 3c

    What is the probability that 145 < x̅ < 165?

    Question 3d

    What is the probability that x̅ > 165?

    Question 4a

    The lifetimes of light bulbs produced by a company have a mean of 1,200 hours and a standard deviation of 400 hours. The population is normally distributed. Suppose that you buy nine light bulbs, which can be regarded as a proper random sample from the population. What is the mean of the sample mean lifetime?

    Question 4b

    What is the variance of the sample mean?

    Question 4c

    What is the standard error of the sample mean?

    Question 4d

    What is the probability that, on average, those nine light bulbs have lifetimes of less than 1050 hours?

    Question 5a

    To get some feeling for possible magnitudes of the finite population correction factor, calculate it for samples of n = 20 observations from populations of N = 20, 100, and 10,000 members.

    Question 5b

    Explain why the result found in the previous question is precisely what one should expect on intuitive grounds.

    Question 6a

    A random sample of 270 students was taken from a large population of students taking a statistics exam. If, in fact, 20% of the students fail the test, what is the probability that the sample proportion of students failing the test will be between 16 and 24%?

    Question 6b

    Now, compute the same probability for 16 to 24%, but this time use a sample of 400 students.

    Question 7

    It has been estimated that 43% of the students drink alcohol. Find the probability that more than half of a random sample of 80 students drink alcohol.

    Question 8

    Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 58% of a random sample of 250 adult Americans eat McDonald's once a week?

    Question 9

    Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 55% of a random sample of 250 adult Americans eat McDonald's once a week?

    Question 10

    Given is n = 6. Determine an upper limit for the sample variance such that the probability of exceeding this limit, given a population standard deviation of 3.6, is less than 0.05. Use the chi-square distribution to solve this problem.

    Question 11a

    There are six employees with the following years of experience:

    2, 4, 6, 6, 7, 8

    Two of these employees are to be chosen at random.

    What is the mean number of years of experience for these six employees?

    Question 11b

    How many possible samples of two employees are there?

    Question 11c

    List all possible samples

    Question 11d

    Find the sampling distribution of the sample means.

    Question 12

    What is the central limit theorem?

    Question 13a

    Suppose a population distribution is left-skewed with mean 100 and variance 15. From this population, we draw a random sample of n = 100. What is the expected mean of this sample?

    Question 13b

    What is the variance of the sample mean?

    Question 13c

    What shape is expected for the sampling distribution?

    Answer indication

    Question 1a

    μ = 12.2; σ = 3.6; n = 9.
    \[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.6}{\sqrt{9}} = 1.2 \]

    Question 1b

    \[ P(\bar{x} > 14.4) = P( \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} > \frac{14.4 - 12.2}{1.2} ) = P(z > 1.83) = 0.0336 \]
    To conclude, the probability that the sample mean will exceed 14.4% is only 0.0336.
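
    The standard error and tail probability for Questions 1a and 1b can be checked as follows. The sketch uses the unrounded z-value, so the result differs slightly from the table value obtained with z = 1.83.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.2, 3.6, 9

# Standard error of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# The sample mean is normal with mean mu and standard deviation standard_error.
p_exceeds = 1 - NormalDist(mu, standard_error).cdf(14.4)

print(f"standard error = {standard_error:.1f}, P(mean > 14.4) = {p_exceeds:.4f}")
```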

    Question 2a

    The central limit theorem applies, thus the sampling distribution has mean 105 and variance σ²/n = 16/25 = 0.64. The standard error is √0.64 = 0.8.

    Question 2b

    \[ \sigma_{\bar{X}} = \sqrt{\frac{16}{25}} = 0.8 \]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{106 - 105}{0.8} = 1.25\]
    P(Z > 1.25) = 1 - 0.8944 = 0.1056

    Question 2c

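    \[ \sigma_{\bar{X}} = \sqrt{\frac{16}{25}} = 0.8 \]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{104 - 105}{0.8} = -1.25\]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{106 - 105}{0.8} = 1.25\]
    P(104 < x̅ < 106) = P(-1.25 < Z < 1.25) = 0.8944 - (1 - 0.8944) = 0.7888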

    Question 2d

    \[ \sigma_{\bar{X}} = \sqrt{\frac{16}{25}} = 0.8 \]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{105.5 - 105}{0.8} = 0.625\]
    P(Z < 0.625) = (0.7324 + 0.7357)/2 = 0.7340

    Question 3a

    The central limit theorem applies, thus the mean of the sampling distribution is 150 and the variance is σ²/n = 1600/36 = 44.44. The standard error is 40/√36 = 6.67.

    Question 3b

    \[ \sigma_{\bar{X}} = \frac{40}{\sqrt{36}} = 6.67 \]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{155 - 150}{6.67} = 0.75\]
    P(Z > 0.75) = 1 - 0.7734 = 0.2266

    Question 3c

    \[ \sigma_{\bar{X}} = \frac{40}{\sqrt{36}} = 6.67 \]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{145 - 150}{6.67} = -0.75\]
    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{165 - 150}{6.67} = 2.25\]
    P(145 < x̅ < 165) = P(-0.75 < Z < 2.25) = 0.9878 - (1 - 0.7734) = 0.9878 - 0.2266 = 0.7612

    Question 3d

    P(x̅ > 165) = P(Z > 2.25) = 1 - 0.9878 = 0.0122

    Question 4a

    The population is normally distributed. Therefore, the sampling distribution of the sample means is normal. Hence, the mean of the sampling distribution is 1,200.

    Question 4b

    The variance of the sample mean is σ²/n = (400)²/9 = 160,000/9 = 17,777.78.

    Question 4c

    The standard error is σ/√n = 400/√9 = 133.33.

    Question 4d

    \[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{1050 - 1200}{133.33} = -1.1250\]
    P(x̅ < 1050) = P(Z < -1.1250) = 1 - (0.8686 + 0.8708)/2 = 1 - 0.8697 = 0.1303

    Question 5a

    The finite population correction factor is calculated as follows: (N - n)/(N - 1).
    The population correction factor for sample size n = 20 for a population with 20 members is: (20 - 20)/(20 - 1) = 0.
    The population correction factor for sample size n = 20 for a population with 100 members is: (100 - 20)/(100 - 1) = 0.8081.
    The population correction factor for sample size n = 20 for a population with 10,000 members is: (10,000 - 20)/(10,000 - 1) = 0.9981.
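
    The three correction factors can be computed with a one-line helper. The function name `fpc` is hypothetical and used only for this sketch.

```python
def fpc(population_size, sample_size):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (population_size - sample_size) / (population_size - 1)

# Sample size n = 20 drawn from populations of 20, 100, and 10,000 members.
factors = {N: fpc(N, 20) for N in (20, 100, 10_000)}

for N, factor in factors.items():
    print(f"N = {N}: {factor:.4f}")
```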

    Question 5b

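    When the sample contains the whole population (n = N = 20), the sample mean equals the population mean exactly, so its variance is 0 and the correction factor is 0. As the population becomes larger relative to the sample, the correction factor approaches 1 and the variance of the sample mean approaches σ²/n. For a large population it is therefore the absolute sample size, not the fraction of the population in the sample, that determines the precision of the results from a random sample.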

    Question 6a

    P = 0.20 and n = 270.
    \[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{270} } = 0.024 \]
    The required probability is:
    \[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.024} < Z < \frac{0.24 - 0.20}{0.024} ) \]
    P(-1.67 < Z < 1.67) = 0.9525 - (1 - 0.9525) = 0.9050
    Thus, we see that the probability is 0.9050 that the sample proportion is within the interval from 0.16 to 0.24, given P = 0.20 and sample size n = 270. This interval can be called a 90.50% acceptance interval. Note that, if the sample proportion were actually outside this interval, we might suspect that the population proportion P is not 0.20.

    Question 6b

    P = 0.20; n = 400.
    \[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{400} } = 0.0200 \]
    The required probability is:
    \[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.0200} < Z < \frac{0.24 - 0.20}{0.0200} ) \]
    P(-2.00 < Z < 2.00) = 0.9772 - (1 - 0.9772) = 0.9544
    This interval can thus be called a 95.44% acceptance interval (given P = 0.20 and sample size n = 400).
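
    The effect of the sample size on the acceptance interval can be sketched in Python. The function name is hypothetical. The sketch uses the unrounded standard error, so for n = 270 it returns about 0.90 rather than the 0.9050 obtained above with the rounded value 0.024.

```python
from math import sqrt
from statistics import NormalDist

def prob_proportion_between(p, n, low, high):
    """P(low < sample proportion < high) using the normal approximation."""
    standard_error = sqrt(p * (1 - p) / n)
    dist = NormalDist(p, standard_error)
    return dist.cdf(high) - dist.cdf(low)

p_270 = prob_proportion_between(0.20, 270, 0.16, 0.24)
p_400 = prob_proportion_between(0.20, 400, 0.16, 0.24)

print(f"n = 270: {p_270:.4f}")
print(f"n = 400: {p_400:.4f}")
```

    The larger sample gives a smaller standard error, so the same interval captures a larger share of the sampling distribution.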

    Question 7

    P = 0.43; n = 80.
    \[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.43(1 - 0.43)}{80} } = 0.055 \]
    \[ P(\hat{p} > 0.50) = P(Z > \frac{0.50 - 0.43}{0.055}) \]
    P (Z > 1.27) = 0.1020

    Question 8

    P = 0.50; n = 250
    \[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.50(1 - 0.50)}{250} } = 0.0316 \]
    \[ P(\hat{p} > 0.58) = P(Z > \frac{0.58 - 0.50}{0.0316}) = P(Z > 2.53) \]
    P (Z > 2.53) = 1 - 0.9943 = 0.0057

    Question 9

    P = 0.50; n = 250, so the standard error is again 0.0316.
    \[ P(\hat{p} > 0.55) = P(Z > \frac{0.55 - 0.50}{0.0316}) = P(Z > 1.58) \]
    P (Z > 1.58) = 1 - 0.9429 = 0.0571

    Question 10

    n = 6; σ² = (3.6)² = 12.96.
    Using the chi-square distribution, we can state that:
    \[ P(s^{2} > K) = P ( \frac{(n - 1) s^{2}}{12.96} > 11.070) = 0.05 \]
    where K is the desired upper limit and 11.070 is the upper 0.05 critical value of the chi-square distribution with n - 1 = 5 degrees of freedom. The required upper limit for s² is obtained by solving:
    \[ \frac{(n - 1)K}{12.96} = 11.070 \]
    \[ K = \frac{(11.070)(12.96)}{(6 - 1)} = 28.69 \]
    Thus, if the sample variance s² from a random sample of size n = 6 exceeds 28.69, there is strong evidence to suspect that the population variance exceeds 12.96.

    Question 11a

    \[ \mu = \frac{2 + 4 + 6 + 6 + 7 + 8}{6} = 5.5 \]

    Question 11b

    Two of these employees are to be chosen randomly. We are sampling without replacement, thus, the first observation has a probability of 1/6 of being selected, while the second observation has a probability of 1/5 of being selected. Fifteen possible random samples of two employees could be selected. Note that some samples (such as 2,6) occur twice because there are two employees with six years of experience in the population.

    Question 11c

    2 4
    2 6 (2x)
    2 7
    2 8
    4 6 (2x)
    4 7
    4 8
    6 6
    6 7 (2x)
    6 8 (2x)
    7 8

    Question 11d

    Sample mean    Probability of sample mean
    3.0            1/15
    4.0            2/15
    4.5            1/15
    5.0            3/15
    5.5            1/15
    6.0            2/15
    6.5            2/15
    7.0            2/15
    7.5            1/15
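
    The list of samples and the sampling distribution can be generated by enumeration. A minimal sketch using `itertools.combinations`, which treats the two employees with six years of experience as distinct because it combines positions rather than values:

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

experience = [2, 4, 6, 6, 7, 8]

# All samples of two employees, drawn without replacement.
samples = list(combinations(experience, 2))

# Count how often each sample mean occurs.
counts = Counter((a + b) / 2 for a, b in samples)

# Each sample is equally likely, so the probability is count / number of samples.
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

# The mean of the sampling distribution equals the population mean.
mean_of_means = sum(mean * float(prob) for mean, prob in sampling_distribution.items())

for mean, prob in sampling_distribution.items():
    print(mean, prob)
```

    Note that `Fraction` reduces 3/15 to 1/5 when printed.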

    Question 12

    The central limit theorem shows that, if the sample size is large enough, the mean of a random sample drawn from a population with any probability distribution will be approximately normally distributed with mean μ and variance σ²/n.

    Question 13a

    100

    Question 13b

    σ²/n = 15/100 = 0.15

    Question 13c

    According to the central limit theorem, for a sample as large as n = 100 the sampling distribution of the sample mean is approximately normal, even though the population itself is left-skewed.

     


     

     

    How to obtain estimates for a single population? - ExamTests 7

     

     

    Questions

    Question 1

    Let x1, x2, ..., xn be a random sample from a normally distributed population with mean μ and variance σ². Assuming that a population is normally distributed with a very large population size compared to the sample size, should the sample mean or the sample median be used to estimate the population mean?

    Question 2

    Give one advantage of the median over the mean for estimating a population mean.

    Question 3

    Give one disadvantage of the median in comparison to the mean for estimating a population mean.

    Question 4

    Which two properties should an estimator possess?

    Question 5a

    Suppose that shopping times for customers at a local mall follow a normal distribution. The population standard deviation is equal to 20 minutes. A random sample of 64 shoppers in the local grocery store has a mean time of 75 minutes. What is the standard error?

    Question 5b

    What is the margin of error?

    Question 5c

    What is the 95% confidence interval for the population mean μ?

    Question 5d

    Give an interpretation of this confidence interval.

    Question 6

    How can the margin of error be reduced?

    Question 7

    What distribution is used when the population variance is known?

    Question 8

    What distribution is used when the population variance is unknown?

    Question 9

    Find the standard error for n = 17 and s = 16.

    Question 10

    Find the upper critical value of Student's t distribution with ν = 23 degrees of freedom for α = 0.05.

    Question 11a

    From a random sample of 344 employees, it was found that 261 were in favor of a modified bonus plan. What is the sample proportion?

    Question 11b

    What is the reliability factor for a 90% confidence interval?

    Question 11c

    What is the margin of error for a 90% confidence interval?

    Question 11d

    Provide the 90% confidence interval.

    Question 11e

    Interpret the 90% confidence level.

    Question 12

    What is the number that is exceeded with probability 0.10 by a chi-square random variable with 4 degrees of freedom?

    Question 13

    What is the number that is exceeded with probability 0.05 by a chi-square random variable with 18 degrees of freedom?

    Question 14

    The following information is provided: n = 25, s2 = 100. What are the critical values for a 95% confidence interval with α = 0.05?

    Question 15

    Use the information provided in the previous question. Find the 95% confidence interval for the population variance.

    Question 16a

    Suppose there are 1,395 secondary schools in the Netherlands. From a simple random sample of 400 of these schools, it was found that the sample mean enrollment during the past year in biology courses was 320.8 students, and the sample standard deviation was found to be 149.7 students. What is the point estimate for the population total, Nμ?

    Question 16b

    Find the corresponding 99% confidence interval for this population total.

    Question 17a

    From a simple random sample of 400 of the 1,395 schools in our population, it is found that biology was a two-semester course in 141 of the sampled schools. Estimate the proportion of all schools for which the biology course is two semesters long.

    Question 17b

    Provide the 90% confidence interval for the proportion of all schools for which the biology course is two semesters long.

    Question 18

    Suppose we have: ME = 0.50; σ = 1.8; and zα/2 = z0.005 = 2.576. What is the required sample size for a 99% confidence interval?

    Question 19

    It is given that ME = 0.06 and zα/2 = z0.025 = 1.96. What is the required sample size for estimating a population proportion?

    Question 20

    Suppose that an opinion survey is conducted about the presidential election. The survey was said to have a 3% margin of error. The implication is that a 95% confidence interval for the population proportion holding a particular opinion is the sample proportion plus or minus 3%. How many citizens of voting age need to be sampled to obtain this 3% margin of error?

    Question 21

    Suppose that a simple random sample of the 1,395 Dutch secondary schools is taken. Whatever the true proportion, a 95% confidence interval must extend no further than 0.04 on each side of the sample proportion. How many sample observations should be taken?

    Answer indication

    Question 1

    The sample mean should be used. For a normally distributed population, both the sample mean and the sample median are unbiased estimators of the population mean, but the sample mean has the smaller variance and is therefore the more efficient estimator.

    Question 2

    The median gives less weight to extreme observations and, thus, is less sensitive to outliers.

    Question 3

    The relative efficiency of the median is lower than that of the mean.

    Question 4

    Unbiasedness and being the most efficient.

    Question 5a

    Standard error = σ/√n = 20/√64 = 2.5

    Question 5b

    The margin of error = zα/2 * (σ/√n) = 1.96*2.5 = 4.9

    Question 5c

    The 95% confidence interval runs from 75 - 4.9 to 75 + 4.9, that is: [70.1; 79.9].
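The steps of questions 5a to 5c can be reproduced with a short Python sketch. The function name `z_interval` is an illustrative choice, not something from the book, and the critical value comes from the standard library instead of a printed table, so it is 1.95996 rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence=0.95):
    """Interval for a population mean when sigma is known (hypothetical helper)."""
    se = sigma / sqrt(n)                                # standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    me = z * se                                         # margin of error
    return se, me, mean - me, mean + me

se, me, lower, upper = z_interval(mean=75, sigma=20, n=64)
print(f"SE = {se:.2f}, ME = {me:.2f}, CI = [{lower:.1f}; {upper:.1f}]")
```

Raising `n` or lowering `confidence` in the call shows directly how the margin of error shrinks.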

    Question 5d

    In the long run, 95% of the intervals found in this manner contain the true value of the population mean.

    Question 6

    Decrease the population standard deviation, increase the sample size, or decrease the confidence level.

    Question 7

    The standard normal distribution (z distribution).

    Question 8

    Student's t distribution.

    Question 9

    Standard error = s/√n = 16/√17 = 3.88

    Question 10

    Use Table 8 (Appendix) to find that the upper critical value is 1.714.

    Question 11a

    \[ \hat{p} = 261/344 = 0.759 \]

    Question 11b

    \[ z_{\alpha/2} = z_{0.05} = 1.645\]

    Question 11c

    \[ ME = z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 1.645 \sqrt{\frac{(0.759)(0.241)}{344}} = 0.038 \]

    Question 11d

    0.759 +/- 0.038 = [0.721; 0.797]
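As a check on questions 11a to 11d, the same interval can be computed in a few lines of Python. This is a minimal sketch; `proportion_interval` is a hypothetical helper name, and it uses the unrounded sample proportion, so intermediate values differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.90):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # reliability factor
    me = z * sqrt(p_hat * (1 - p_hat) / n)              # margin of error
    return p_hat, me, p_hat - me, p_hat + me

p_hat, me, lower, upper = proportion_interval(261, 344)
print(f"p = {p_hat:.3f}, ME = {me:.3f}, CI = [{lower:.3f}; {upper:.3f}]")
```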

    Question 11e

    Imagine taking a very large number of independent random samples of size n = 344 from this population and calculating a 90% confidence interval for each sample. The 90% confidence level means that, in the long run, 90% of the intervals found in this manner contain the true value of the population proportion.

    Question 12

    7.779

    Question 13

    28.869

    Question 14

    \[ \chi^{2}_{n-1,1-\alpha/2} = \chi^{2}_{24,0.975} = 12.401 \]
    \[ \chi^{2}_{n-1,\alpha/2} = \chi^{2}_{24,0.025} = 39.364 \]

    Question 15

    \[ LCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,\alpha/2} } = \frac{(24)(100)}{39.364} = 60.97 \]
    \[ UCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,1 - \alpha/2} } = \frac{(24)(100)}{12.401} = 193.53 \]
    Hence, the 95% confidence interval is: [60.97; 193.53]
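The interval for the variance can be sketched in Python as follows. The standard library has no chi-square quantile function, so the two critical values from question 14 are passed in as table values; the function and parameter names are illustrative assumptions.

```python
def variance_interval(n, s2, chi2_lower_tail, chi2_upper_tail):
    """Confidence interval for a population variance.

    chi2_lower_tail is chi^2_{n-1, 1-alpha/2} and chi2_upper_tail is
    chi^2_{n-1, alpha/2}, both read from a chi-square table.
    """
    df = n - 1
    lcl = df * s2 / chi2_upper_tail  # the larger critical value gives the lower limit
    ucl = df * s2 / chi2_lower_tail
    return lcl, ucl

lcl, ucl = variance_interval(n=25, s2=100, chi2_lower_tail=12.401, chi2_upper_tail=39.364)
print(f"[{lcl:.2f}; {ucl:.2f}]")
```

Note that the interval is not symmetric around s² = 100, because the chi-square distribution is skewed.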

    Question 16a

    Nx̄ = (1,395)(320.8) = 447,516. Thus, we estimate a total of 447,516 students to be enrolled in biology courses.

    Question 16b

    \[ N\hat{\sigma}_{\bar{x}} = \frac{Ns}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{(1,395)(149.7)}{\sqrt{400}} \sqrt{\frac{995}{1,394}} = 8,821.6 \]
    Because the sample size is large, we can use the central limit theorem with zα/2 = 2.58 for a 99% confidence interval. Hence:
    \[ N\bar{x} \pm z_{\alpha/2} N \hat{\sigma}_{\bar{x}} \]
    \[ 447,516 \pm 2.58(8,821.6) \]
    \[ 447,516 \pm 22,760 \]
    Thus, the 99% confidence interval runs from 424,756 to 470,276 students.
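The calculation for the population total, including the finite population correction factor (N - n)/(N - 1), can be sketched as below. The helper name `total_interval` is hypothetical, and z = 2.58 is the rounded table value used in the answer.

```python
from math import sqrt

def total_interval(N, n, mean, s, z):
    """Point estimate and confidence interval for a population total N*mu."""
    total = N * mean
    # Standard error of the total, with the finite population correction
    se_total = N * s / sqrt(n) * sqrt((N - n) / (N - 1))
    me = z * se_total
    return total, se_total, total - me, total + me

total, se_total, lower, upper = total_interval(N=1395, n=400, mean=320.8, s=149.7, z=2.58)
print(f"total = {total:,.0f}, SE = {se_total:,.1f}, CI = [{lower:,.0f}; {upper:,.0f}]")
```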

    Question 17a

    N = 1,395; n = 400.
    \[ \hat{p} = \frac{141}{400} = 0.3525 \]
    The point estimate of the population proportion P is simply equal to this sample proportion, that is: 0.3525.

    Question 17b

    \[ \hat{\sigma}^{2}_{\hat{p}} = \frac{\hat{p} (1 - \hat{p})}{n} \left( \frac{N - n}{N - 1} \right) = \frac{(0.3525)(0.6475)}{400} \left( \frac{995}{1,394} \right) = 0.0004073 \]
    so
    \[ \hat{\sigma}_{\hat{p}} = \sqrt{0.0004073} = 0.0202 \]
    For a 90% confidence interval: zα/2 = 1.645.
    \[ ME = z_{\alpha/2} \hat{\sigma}_{\hat{p}} = 1.645(0.0202) \approx 0.0332 \]
    Thus, the 90% confidence interval is 0.3525 +/- 0.0332, that is, from 31.93% to 38.57%.

    Question 18

    \[ n = \frac{z^{2}_{\alpha/2} \sigma^{2}}{ME^{2}} = \frac{ (2.576)^{2} (1.8)^{2} }{(0.5)^{2}} \approx 86 \]

    Question 19

    \[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{0.25(1.96)^{2}}{(0.06)^{2}} = 267 \]

    Question 20

    \[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{(0.25)(1.96)^{2}}{(0.03)^{2}} = 1067.11 \]
    Rounding up, at least 1,068 citizens need to be sampled.

    Question 21

    \[ 1.96 \sigma_{\hat{p}} = 0.04 \]
    \[ \sigma_{\hat{p}} = 0.020408 \]
    \[ n_{max} = \frac{0.25N}{(N - 1) \sigma^{2}_{\hat{p}} + 0.25 } = \frac{(0.25)(1,395)}{(1,394)(0.020408)^{2} + 0.25} = 419.88 \]
    Rounding up, 420 sample observations should be taken.
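The sample-size formulas of questions 18 to 21 can be collected in a small Python sketch. The three function names are illustrative assumptions; each rounds up, because a fractional observation cannot be sampled, and p = 0.5 is the worst case that yields the factor 0.25.

```python
from math import ceil

def n_for_mean(z, sigma, me):
    """Sample size for estimating a mean with known sigma."""
    return ceil((z * sigma / me) ** 2)

def n_for_proportion(z, me, p=0.5):
    """Sample size for a proportion; p = 0.5 gives the maximum 0.25."""
    return ceil(p * (1 - p) * z ** 2 / me ** 2)

def n_for_proportion_finite(z, me, N, p=0.5):
    """Sample size for a proportion when sampling from a finite population of size N."""
    sigma_p = me / z  # required standard error of the sample proportion
    return ceil(p * (1 - p) * N / ((N - 1) * sigma_p ** 2 + p * (1 - p)))

print(n_for_mean(2.576, 1.8, 0.50))               # question 18
print(n_for_proportion(1.96, 0.06))               # question 19
print(n_for_proportion(1.96, 0.03))               # question 20
print(n_for_proportion_finite(1.96, 0.04, 1395))  # question 21
```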

     


    How to estimate parameters for two populations? - ExamTests 8

     

     

    Questions

    Question 1a

    The following information is provided for a dependent random sample from two normally distributed populations:
    \[ n = 11 \hspace{3mm} \bar{d} = 28.5 \hspace{3mm} s_{d} = 3.3 \]
    Find the 98% confidence interval for the difference between the means of the two populations.

    Question 1b

    What is the margin of error for a 98% confidence interval for the difference between the means of the two populations?

    Question 1c

    What do you conclude based on the confidence interval found in question 1a?

    Question 2a

    Consider the following data:

    Before    After
    6         8
    12        14
    8         9
    10        13
    6         7

    What type of dependent sample is depicted here?

    Question 2b

    What is the sample mean of the differences?

    Question 2c

    It is given that, for a sample of n = 140 pairs, the mean difference is equal to 7.7 with standard deviation sd = 43.68901. Compute the 95% confidence interval using the normal approximation.

    Question 3a

    An educational study is conducted to examine the effectiveness of a mathematics reading program for elementary-school-age children. Each child was given a pretest and a posttest. Higher scores indicate improvement in mathematics. From a very large population, a random sample was drawn. The data obtained from this sample are provided in the table below. What is the mean difference score?

    Child    Pretest score    Posttest score
    1        40               48
    2        36               42
    3        32               (missing)
    4        38               36
    5        (missing)        43
    6        33               38
    7        35               45

    Question 3b

    What is the standard deviation of the difference scores?

    Question 3c

    Find the t value corresponding to a 95% confidence interval.

    Question 3d

    Compute a 95% confidence interval.

    Question 3e

    Can we conclude, based on this 95% confidence interval, that there is a significant improvement in mathematics?

    Question 3f

    Compute a 95% confidence interval using the normal approximation.

    Question 3g

    What do we conclude based on this interval?

    Question 4a

    A study regarding students' GPAs was conducted. From a very large university, independent random samples of 120 students majoring in economics and 90 students majoring in finance were selected. The mean GPA for the random sample of economics majors was found to be 3.08. The mean GPA for the random sample of finance majors was found to be 2.88. From similar past studies, the population standard deviation is 0.42 for the economics majors and 0.64 for the finance majors. Denote the population mean for economics by μx and the population mean for finance by μy. With which scenario are we dealing here?

    1. Population variances known.
    2. Population variances unknown, but assumed to be equal.
    3. Population variances unknown, and not assumed to be equal.

    Question 4b

    Compute the 95% confidence interval for the difference between the two population means (μx - μy), using the information provided in the previous question.

    Question 4c

    What do we conclude based on this 95% confidence interval (from question 4b)?

    Question 5a

    Consider the following data:

    X    100    125    135    128    140    142    128    137    156    142
    Y    95     87     100    75     110    105    85     95

    Suppose these are independent samples with unknown variances, but the variances are assumed to be equal. Give nx, ny, x̄, ȳ and the sample variances s²x and s²y.

    Question 5b

    Compute the pooled variance.

    Question 5c

    What are the degrees of freedom?

    Question 5d

    Find the t value corresponding to a 95% confidence interval.

    Question 5e

    Compute a 95% confidence interval.

    Question 6a

    Assuming equal population variances, determine the number of degrees of freedom for:
    nx = 16; s²x = 30
    ny = 9; s²y = 36

    Question 6b

    Compute the pooled sample variance for the information provided in the previous question.

    Question 7a

    Assuming equal population variances, determine the number of degrees of freedom for:
    nx = 12; s²x = 30
    ny = 14; s²y = 36

    Question 7b

    Compute the pooled sample variance for the information provided in the previous question.

    Question 8a

    Assuming equal population variances, determine the number of degrees of freedom for:
    nx = 20; s²x = 16
    ny = 8; s²y = 25

    Question 8b

    Compute the pooled sample variance for the information provided in the previous question.

    Question 9

    The following information is provided:
    \[ n_{x} = 120; \hat{p}_{x} = 0.892 \]
    \[ n_{y} = 141; \hat{p}_{y} = 0.518 \]
    Compute a 95% confidence interval for the population difference (Px - Py).

    Question 10

    Calculate the margin of error for a 95% confidence interval with:
    \[ n_{x} = 300; \hat{p}_{x} = 0.62 \]
    \[ n_{y} = 350; \hat{p}_{y} = 0.72 \]

    Question 11

    Calculate the margin of error for a 95% confidence interval with:
    \[ n_{x} = 100; \hat{p}_{x} = 0.44 \]
    \[ n_{y} = 150; \hat{p}_{y} = 0.55 \]

    Answer indication

    Question 1a

    \[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} = 28.5 \pm 2.764 \frac{3.3}{\sqrt{11}} = 28.5 \pm 2.7502 \]
    The 98% confidence interval is: [25.75; 31.25].

    Question 1b

    ME = 2.7502

    Question 1c

    Zero lies outside the confidence interval. Based on these sample data, we therefore conclude that there is sufficient evidence of a significant difference between the means of the two populations.

    Question 2a

    Repeated measurements

    Question 2b

    \[ \bar{d} = \frac{2 + 2 + 1 + 3 + 1}{5} = 1.8 \]

    Question 2c

    Using the normal approximation we have tn-1,α/2 = t139,0.025 ≅ 1.96.
    \[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
    \[ 7.7 \pm 1.96 \frac{43.68901}{\sqrt{140}} \]
    \[ 7.7 \pm 7.2 \]
    This results in the following 95% confidence interval: [0.5; 14.9]

    Question 3a

    Only the five children with both a pretest and a posttest score are used:
    \[ \bar{d} = \frac{8 + 6 - 2 + 5 + 10}{5} = 5.4 \]

    Question 3b

    sd ≅ 4.56

    Question 3c

    t4,0.025 = 2.776

    Question 3d

    \[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
    \[ 5.4 \pm 2.776 \frac{4.56}{\sqrt{5}} \]
    \[ 5.4 \pm 5.6620 \]
    The 95% confidence interval is: [-0.26; 11.06]

    Question 3e

    No, because zero lies within the confidence interval. Thus, there is insufficient evidence to conclude that there is a significant difference.

    Question 3f

    Using the normal approximation, we replace t by z, that is: z = 1.96.
    \[ 5.4 \pm 1.96 \frac{4.56}{\sqrt{5}} \]
    \[ 5.4 \pm 3.9976 \]
    The 95% confidence interval is: [1.40; 9.40]
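To see how the choice of distribution changes the interval, the calculations of questions 3a to 3f can be sketched in Python. The five difference scores are those used in answer 3a; the critical values 2.776 and 1.96 are entered as table values, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

diffs = [8, 6, -2, 5, 10]  # posttest minus pretest, complete pairs only
n = len(diffs)
d_bar = mean(diffs)        # mean difference
s_d = stdev(diffs)         # sample standard deviation (n - 1 in the denominator)
se = s_d / sqrt(n)

t_crit = 2.776             # t_{4, 0.025}, table value
z_crit = 1.96              # z_{0.025}, table value

t_interval = (d_bar - t_crit * se, d_bar + t_crit * se)
z_interval = (d_bar - z_crit * se, d_bar + z_crit * se)

print(f"mean = {d_bar:.1f}, sd = {s_d:.2f}")
print("t interval contains zero:", t_interval[0] < 0 < t_interval[1])
print("z interval contains zero:", z_interval[0] < 0 < z_interval[1])
```

With only four degrees of freedom the t interval is noticeably wider than the z interval, which is why the two lead to different conclusions.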

    Question 3g

    Based on the 95% confidence interval computed with the normal approximation, we would conclude that there is a significant improvement in the mathematics scores. Note, however, that the sample is very small (n = 5) and that the standard deviation of the differences is estimated from the sample. Therefore, the normal approximation is not a valid procedure here, and Student's t distribution should be used. The comparison does show how much the choice of distribution can affect the statistical inference.

    Question 4a

    1. Population variances known.

    Question 4b

    \[ (\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{\sigma^{2}_{x}}{n_{x}} + \frac{\sigma^{2}_{y}}{n_{y}}} \]
    \[ (3.08 - 2.88) \pm 1.96 \sqrt{ \frac{(0.42)^{2}}{120} + \frac{(0.64)^{2}}{90} } = 0.20 \pm 0.1521 \]
    Thus, the 95% confidence interval extends from 0.0479 to 0.3521.

    Question 4c

    The confidence interval does not contain zero, thus we conclude that there is a significant difference in the mean GPA of students majoring in economics and students majoring in finance. More precisely, the mean GPA of students majoring in economics is higher than the mean GPA of students majoring in finance.

    Question 5a

    nx = 10; x̄ = 133.30; s²x = 218.0111
    ny = 8; ȳ = 94.00; s²y = 129.4286

    Question 5b

    \[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} = \frac{(10 - 1)(218.0111) + (8 - 1)(129.4286) }{10 + 8 - 2} = 179.2563 \]

    Question 5c

    The degrees of freedom are given by: nx + ny - 2 = 10 + 8 - 2 = 16

    Question 5d

    t16,0.025 = 2.12

    Question 5e

    \[ (\bar{x} - \bar{y}) \pm t_{n_{x} + n_{y} - 2, \alpha/2} \sqrt{\frac{s^{2}_{p}}{n_{x}} + \frac{s^{2}_{p}}{n_{y}}} \]
    \[ 39.3 \pm (2.12) \sqrt{ \frac{179.2563}{10} + \frac{179.2563}{8} } \]
    \[ 39.3 \pm 13.46 \]
    Thus, the 95% confidence interval is: [25.84; 52.76]
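Questions 5a to 5e can be verified from the raw data with the sketch below. It assumes the X and Y values are as listed in question 5a and takes the critical value 2.120 for 16 degrees of freedom from the t table; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

x = [100, 125, 135, 128, 140, 142, 128, 137, 156, 142]
y = [95, 87, 100, 75, 110, 105, 85, 95]
nx, ny = len(x), len(y)

# Pooled variance: weighted average of the two sample variances
s2_pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)

t_crit = 2.120  # t_{16, 0.025}, table value
diff = mean(x) - mean(y)
me = t_crit * sqrt(s2_pooled / nx + s2_pooled / ny)

print(f"pooled variance = {s2_pooled:.4f}")
print(f"CI = [{diff - me:.2f}; {diff + me:.2f}]")
```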

    Question 6a

    df = nx + ny - 2 = 16 + 9 - 2 = 23

    Question 6b

    \[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} \]
    \[ s^{2}_{p} = \frac{ (16-1)30 + (9 - 1)36}{16 + 9 - 2} = \frac{738}{23} = 32.09 \]

    Question 7a

    df = nx + ny - 2 = 12 + 14 - 2 = 24

    Question 7b

    \[ s^{2}_{p} = \frac{ (12-1)30 + (14 - 1)36}{12 + 14 - 2} = \frac{798}{24} = 33.25 \]

    Question 8a

    df = nx + ny - 2 = 20 + 8 - 2 = 26

    Question 8b

    \[ s^{2}_{p} = \frac{ (20-1)16 + (8 - 1)25}{20 + 8 - 2} = \frac{479}{26} = 18.42 \]

    Question 9

    \[ (\hat{p}_{x} - \hat{p}_{y}) \pm z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
    \[ (0.892 - 0.518) \pm 1.96 \sqrt{ \frac{(0.892)(0.108)}{120} + \frac{(0.518)(0.482)}{141} } = 0.374 \pm 0.0994 \]
    From this, it follows that the 95% confidence interval runs from 0.275 to 0.473.
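The margin of error for a difference between two proportions follows the same pattern in questions 9 to 11, so a small helper is convenient. This is a sketch; `diff_proportions_me` is a hypothetical name and z = 1.96 is the rounded table value.

```python
from math import sqrt

def diff_proportions_me(px, nx, py, ny, z=1.96):
    """Margin of error for the difference between two sample proportions."""
    return z * sqrt(px * (1 - px) / nx + py * (1 - py) / ny)

# Data of question 9
px, nx, py, ny = 0.892, 120, 0.518, 141
me = diff_proportions_me(px, nx, py, ny)
lower, upper = (px - py) - me, (px - py) + me
print(f"ME = {me:.4f}, CI = [{lower:.4f}; {upper:.4f}]")
```

The same function can be called with the data of questions 10 and 11 to check those margins of error.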

    Question 10

    \[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
    \[ 1.96 \sqrt{ \frac{(0.62)(0.38)}{300} + \frac{(0.72)(0.28)}{350} } \]
    ME = 0.0723

    Question 11

    \[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
    \[ 1.96 \sqrt{ \frac{(0.44)(0.56)}{100} + \frac{(0.55)(0.45)}{150} } \]
    ME = 0.1257

     


    How to develop hypothesis testing procedures for a single population? - ExamTests 9

     

     

    Questions

    Question 1a

    Kees wants to use the results of a random sample market survey to seek strong evidence that his brand of cereal has more than 20% of the total market. Formulate the null hypothesis and alternative hypothesis using P as the population proportion.

    Question 1b

    Is the alternative hypothesis you formulated a one-sided or two-sided composite alternative hypothesis?

    Question 2

    A car factory has proposed a process to monitor the diameter of pistons on a regular schedule. They want to test whether the diameter is equal to 3800. Formulate the null hypothesis and alternative hypothesis.

    Question 3

    What is a type I error?

    Question 4

    What is a type II error?

    Question 5a

    A random sample is obtained from a population with variance σ2 = 625. The sample mean is computed. Test the null hypothesis H0: μ = 100 versus the alternative hypothesis H1: μ > 100 with α = 0.05. Compute the critical value x̅c and state your decision rule regarding a sample size of n = 25.

    Question 5b

    Do the same for n = 16.

    Question 5c

    Do the same for n = 44.

    Question 5d

    Do the same for n = 32.

    Question 6a

    A random sample of n = 25 is obtained from a population with known variance. The sample mean is computed. Test the null hypothesis: H0: μ = 120 versus the alternative hypothesis H1: μ > 120 with α = 0.10. Compute the critical value x̅c and state your decision rule regarding the population variance σ2 = 196.

    Question 6b

    Do the same for σ2 = 625.

    Question 6c

    Do the same for σ2 = 900.

    Question 6d

    Do the same for σ2 = 500.

    Question 7

    Test the hypotheses H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 108; s = 20.

    Question 8

    Test the hypotheses H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 104; s = 10.

    Question 9

    Test the hypotheses H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 96; s = 10.

    Question 10

    Mention four conditions that will raise the power function.

    Question 11

    Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.56 to be β = 0.31 using a significance level of α = 0.05. What is the power?

    Question 12

    Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.66 to be β = 0.25 using a significance level of α = 0.10. What is the power?

    Question 13a

    A random sample of 20 products is obtained, and the weight of each product is measured. The sample variance is computed to be 6.62. The hypothesis is tested, at a significance level of α = 0.05, that the population variance of the weights does not exceed 4. Formulate the null hypothesis and alternative hypothesis.

    Question 13b

    What are the degrees of freedom?

    Question 13c

    What is the critical value?

    Question 13d

    What is the test statistic?

    Question 13e

    Based on these sample data, can we reject the null hypothesis?

    Question 14a

    Suppose we are testing the following hypotheses:
    H0: μ ≤ 100
    H1: μ > 100
    using a random sample of n = 49 and a probability of type I error equal to 0.05.
    Suppose the population variance is unknown. What distribution should you use?

    Question 14b

    Test the hypotheses using the following sample statistics: x̅ = 108; s = 20

    Question 14c

    Test the hypotheses using the following sample statistics: x̅ = 104; s = 10

    Question 14d

    Test the hypotheses using the following sample statistics: x̅ = 96; s = 10

    Question 14e

    Test the hypotheses using the following sample statistics: x̅ = 95; s = 8

    Answer indication

    Question 1a

    H0: P = 0.20
    H1: P > 0.20

    Question 1b

    A one-sided composite alternative hypothesis.

    Question 2

    H0: μ = 3800
    H1: μ ≠ 3800

    Question 3

    A type I error refers to rejecting the null hypothesis, while the null hypothesis is true.

    Question 4

    A type II error refers to failing to reject the null hypothesis, while the null hypothesis is false.

    Question 5a

    For a one-sided hypothesis test with significance level α = 0.05, the value of zα = 1.645 from the standard normal table. The variance is 625, thus the standard deviation is √625 = 25.
    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{25}) = 108.23 \]
    The decision rule is: reject H0 if x̅ > 108.23

    Question 5b

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{16}) = 110.28 \]
    The decision rule is: reject H0 if x̅ > 110.28

    Question 5c

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{44}) = 106.20 \]
    The decision rule is: reject H0 if x̅ > 106.20

    Question 5d

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{32}) = 107.27 \]
    The decision rule is: reject H0 if x̅ > 107.27

    Question 6a

    For a one-sided hypothesis test with significance level α = 0.10, the value of zα = 1.282 from the standard normal table. The variance is 196, thus the standard deviation is √196 = 14.
    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (14 / \sqrt{25}) = 123.59 \]
    The decision rule is: reject H0 if x̅ > 123.59

    Question 6b

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{625} / \sqrt{25}) = 126.41 \]
    The decision rule is: reject H0 if x̅ > 126.41

    Question 6c

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{900} / \sqrt{25}) = 127.69 \]
    The decision rule is: reject H0 if x̅ > 127.69

    Question 6d

    \[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{500} / \sqrt{25}) = 125.73 \]
    The decision rule is: reject H0 if x̅ > 125.73
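The critical value of the sample mean for an upper-tail test with known variance can be computed with the sketch below. `critical_mean` is a hypothetical helper; it obtains zα from the standard library, so results can differ from the table-based answers in the last decimal.

```python
from math import sqrt
from statistics import NormalDist

def critical_mean(mu0, sigma2, n, alpha):
    """Critical sample mean for H0: mu = mu0 against H1: mu > mu0, sigma^2 known."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical z value
    return mu0 + z_alpha * sqrt(sigma2) / sqrt(n)

# Question 6a: reject H0 if the sample mean exceeds this value
xc = critical_mean(mu0=120, sigma2=196, n=25, alpha=0.10)
print(f"reject H0 if the sample mean exceeds {xc:.2f}")
```

Calling the function with a larger variance or a smaller sample size shows that the critical value moves further away from μ0.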

    Question 7

    t30,0.05 = 1.697
    \[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{108 - 100}{20 / \sqrt{31}} = 2.23 \]
    Thus, t > t30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.

    Question 8

    t30,0.05 = 1.697
    \[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{104 - 100}{10 / \sqrt{31}} = 2.23 \]
    Thus, t > t30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.
    The t value is the same as in the previous question, because both the numerator and the denominator are half of their previous values, which yields the same ratio.

    Question 9

    t30,0.05 = 1.697
    \[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{96 - 100}{10 / \sqrt{31}} = -2.23 \]
    Thus, t < t30,0.05. Because we are testing a one-sided alternative hypothesis with H1: μ > μ0, we cannot reject the null hypothesis (be aware that the sample mean is lower, rather than higher, than the hypothesized value).
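Questions 7 to 9 apply the same upper-tail t test to three sets of sample statistics, which can be sketched as a loop. The function name `t_statistic` is an illustrative choice and the critical value 1.697 is the table value quoted above.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic for testing a population mean."""
    return (x_bar - mu0) / (s / sqrt(n))

t_crit = 1.697  # t_{30, 0.05}, table value
for x_bar, s in [(108, 20), (104, 10), (96, 10)]:
    t = t_statistic(x_bar, mu0=100, s=s, n=31)
    decision = "reject H0" if t > t_crit else "do not reject H0"
    print(f"x_bar = {x_bar}, s = {s}: t = {t:.2f}, {decision}")
```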

    Question 10

    (1) the true mean is farther from the hypothesized mean μ0; (2) the significance level is higher; (3) the population variance is lower; (4) the sample size is larger.

    Question 11

    Power = 1 - β = 1 - 0.31 = 0.69

    Question 12

    Power = 1 - β = 1 - 0.25 = 0.75

    Question 13a

    H0: σ² ≤ σ0² = 4
    H1: σ² > 4

    Question 13b

    df = n - 1 = 20 - 1 = 19

    Question 13c

    For this test with a significance level of α = 0.05 and 19 degrees of freedom, the critical value of the chi-square variable is 30.144 (see Appendix Table 7 of the book).

    Question 13d

    \[ \chi^{2}_{n-1} = \frac{(n - 1)s^{2}}{\sigma^{2}_{0}} = \frac{(20 - 1)(6.62)}{4} = 31.445 \]

    Question 13e

    31.445 > 30.144. Therefore, we can reject the null hypothesis and conclude that the variability of the weight of the products exceeds the standard.
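The variance test of question 13 reduces to one formula and one comparison, as the following sketch shows. The critical value 30.144 is entered as the table value for 19 degrees of freedom, and the function name is a hypothetical choice.

```python
def chi_square_statistic(n, s2, sigma2_0):
    """Test statistic (n - 1) s^2 / sigma_0^2 for a test of a population variance."""
    return (n - 1) * s2 / sigma2_0

stat = chi_square_statistic(n=20, s2=6.62, sigma2_0=4)
critical = 30.144          # chi^2_{19, 0.05}, table value
reject = stat > critical   # upper-tail test
print(f"statistic = {stat:.3f}, reject H0: {reject}")
```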

    Question 14a

    Student's t distribution

    Question 14b

    With n - 1 = 48 degrees of freedom, the critical t value is approximately tc = 1.684 (the tabulated value for 40 degrees of freedom).
    \[ t = \frac{108 - 100}{20 / \sqrt{49}} = 2.8 \]
    t > tc, therefore we can reject the null hypothesis.

    Question 14c

    \[ t = \frac{104 - 100}{10 / \sqrt{49}} = 2.8 \]
    t > tc, therefore we can reject the null hypothesis.

    Question 14d

    \[ t = \frac{96 - 100}{10 / \sqrt{49}} = -2.8 \]
    t < tc, whereas H0 is rejected only if t > tc. Therefore we cannot reject the null hypothesis ("wrong side").

    Question 14e

    \[ t = \frac{95 - 100}{8 / \sqrt{49}} = -4.38 \]
    t < tc, whereas H0 is rejected only if t > tc. Therefore we cannot reject the null hypothesis ("wrong side").

     


    What test procedures are there for testing the difference between two populations? - ExamTests 10

     

     

    Questions

    Question 1a

    A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ1 and the mean of production process 2 is defined as μ2. The null and alternative hypotheses are as follows: H0: μ1 – μ2 = 0 and H1: μ1 – μ2 > 0. From the populations, a random sample is drawn of 25 matched pairs. The sample means are respectively 60 and 50 for populations 1 and 2. Give the decision rule using a probability of type I error α = 0.05.

    Question 1b

    Can you reject the null hypothesis if the sample standard deviation of the difference is 20?

    Question 1c

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 30?

    Question 1d

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 15?

    Question 1e

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 40?

    Question 2a

    A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ1 and the mean of production process 2 is defined as μ2. The null and alternative hypotheses are as follows: H0: μ1 – μ2 = 0 and H1: μ1 – μ2 < 0. From the populations, a random sample is drawn of 25 matched pairs. The sample standard deviation of the differences is found to be 25. Give the decision rule using a probability of type I error α = 0.05.

    Question 2b

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 50 and 56 for populations 1 and 2?

    Question 2c

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 50 and 59 for populations 1 and 2?

    Question 2d

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 48 and 56 for populations 1 and 2?

    Question 2e

    Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 50 and 54 for populations 1 and 2?

    Question 3a

    A researcher wants to conduct a hypothesis test for the difference in means between two populations with independent samples. The following information is provided:
    nx = 25; x̄ = 115; σ²x = 625
    ny = 25; ȳ = 100; σ²y = 400
    Compute the test statistic.

    Question 3b

    The researcher decides to test at a significance level of α = 0.05. Determine the critical z value.

    Question 3c

    Compare the critical z value to the test statistic. Can the researcher reject the null hypothesis?

    Question 4

    How large should the sample size be in order to obtain a good approximation if we replace the population variances with the sample variances?

    Question 5a

    Use the following information:
    nx = 25; x̄ = 1078; sx = 633
    ny = 25; ȳ = 908.2; sy = 469.8
    We are interested in testing the difference in population means between X and Y. The alternative hypothesis states that the mean of population 2 is larger than the mean of population 1. For this hypothesis test, we are using a significance level of α = 0.05. Note that the population variances are unknown and that the sample standard deviations are given.
    Formulate the null hypothesis and alternative hypothesis.

    Question 5b

    Compute the pooled variance estimate.

    Question 5c

    What are the degrees of freedom?

    Question 5d

    What is the critical value of t?

    Question 5e

    Compute the test statistic.

    Question 5f

    Provide the decision rule for this hypothesis test.

    Question 6

    Can the null hypothesis be rejected?

    Question 7

    How large should the sample size be in order to be able to use the standard normal distribution for testing the equality of two population proportions?

    Question 8a

    Consider the following information:
    nx = 270; p̂x = 0.185
    ny = 203; p̂y = 0.399
    Compute the estimate of the common proportion, p̂0, under the null hypothesis.

    Question 8b

    Compute the test statistic.

    Question 8c

    Suppose we are testing with the alternative hypothesis: H1: Px < Py. For this test, we are using a significance level of α = 0.05. What is the critical value?

    Question 8d

    Formulate the decision rule.

    Question 8e

    Can we reject the null hypothesis?

    Question 9a

    Consider the following information:
    nx = 17; s²x = 123.35
    ny = 11; s²y = 8.02
    What are the degrees of freedom for the F distribution?

    Question 9b

    Given a significance level of α = 0.02, what is the critical value of F?

    Question 9c

    Compute the test statistic. Can the null hypothesis be rejected?

    Answer indication

    Question 1a

    tn-1,α = t24,0.05 = 1.711
    The general decision rule here is: reject H0 if t > t24,0.05 = 1.711.

    Question 1b

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{20 / \sqrt{25}} = 2.5 \]
    t > t24,0.05 and, thus, we can reject the null hypothesis.

    Question 1c

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{30 / \sqrt{25}} = 1.67 \]
    t < t24,0.05 and, thus, we cannot reject the null hypothesis.

    Question 1d

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{15 / \sqrt{25}} = 3.33 \]
    t > t24,0.05 and, thus, we can reject the null hypothesis.

    Question 1e

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{40 / \sqrt{25}} = 1.25 \]
    t < t24,0.05 and, thus, we cannot reject the null hypothesis.
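
The test statistics in Questions 1b to 1e all use the same formula, so they can be checked with a few lines of Python. This is only a sketch: the function name `paired_t` is made up for illustration, and the critical value 1.711 is copied from the answer to Question 1a rather than computed.

```python
from math import sqrt

def paired_t(mean_diff, sd_diff, n):
    """t statistic for a matched-pairs test: d_bar / (s_d / sqrt(n))."""
    return mean_diff / (sd_diff / sqrt(n))

T_CRIT = 1.711  # t(24, 0.05), taken from the answer to Question 1a

# Standard deviations of the differences used in Questions 1b to 1e
results = {sd: paired_t(10, sd, 25) for sd in (20, 30, 15, 40)}
for sd, t in results.items():
    print(f"s_d = {sd}: t = {t:.2f}, reject H0: {t > T_CRIT}")
```

A smaller standard deviation of the differences gives a larger t statistic, which is why only the cases with s_d = 20 and s_d = 15 lead to rejection.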

    Question 2a

    -tn-1,α = -t24,0.05 = -1.711
    The general decision rule here is: reject H0 if t < -t24,0.05 = -1.711.

    Question 2b

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-6}{25 / \sqrt{25}} = -3.8 \]
    t < -t24,0.05 and, thus, we can reject the null hypothesis.

    Question 2c

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-9}{25 / \sqrt{25}} = -1.8 \]
    t < -t24,0.05 and, thus, we can reject the null hypothesis.

    Question 2d

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-8}{25 / \sqrt{25}} = -1.6 \]
    t > -t24,0.05 and, thus, we cannot reject the null hypothesis.

    Question 2e

    \[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-4}{25 / \sqrt{25}} = -0.8 \]
    t > -t24,0.05 and, thus, we cannot reject the null hypothesis.

    Question 3a

    \[ z = \frac{115 - 100}{\sqrt{\frac{625}{25} + \frac{400}{25}}} = 2.34 \]

    Question 3b

    z0.05 = 1.645

    Question 3c

    z > z0.05 thus the null hypothesis can be rejected.
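
The calculation in Questions 3a to 3c can be reproduced as follows. This is a minimal sketch assuming an upper-tail test, as the comparison z > z0.05 suggests; `two_sample_z` is a hypothetical helper name, and the critical value comes from Python's `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_x, mean_y, var_x, var_y, n_x, n_y):
    """z statistic for H0: mu_x - mu_y = 0 with known population variances."""
    return (mean_x - mean_y) / sqrt(var_x / n_x + var_y / n_y)

z = two_sample_z(115, 100, 625, 400, 25, 25)
z_crit = NormalDist().inv_cdf(1 - 0.05)  # upper-tail critical value for alpha = 0.05
reject_h0 = z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```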

    Question 4

    The sample size should be larger than 100.

    Question 5a

    H0: μx – μy = 0
    H1: μx – μy < 0

    Question 5b

    \[ s^{2}_{p} = \frac{ (25-1)(633)^{2} + (25-1)(469.8)^{2} }{25 + 25 - 2} = 310{,}700 \]

    Question 5c

    df = 25 + 25 – 2 = 48

    Question 5d

    t48,0.05 = 1.677

    Question 5e

    \[ t = \frac{1078 - 908.2}{ \sqrt{ \frac{310{,}700}{25} + \frac{310{,}700}{25}}} = 1.08 \]

    Question 5f

    Reject H0 if t < -t48,0.05 = -1.677

    Question 6

    No. The test statistic (t = 1.08) does not fall below the critical value of -1.677. Thus, there is not sufficient evidence to reject the null hypothesis.
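
The pooled-variance steps of Question 5 can be sketched in code. The helper name `pooled_t` is an assumption for illustration, and the value 1.677 is the tabulated t48,0.05 quoted in the answer to Question 5d, not something the code derives.

```python
from math import sqrt

def pooled_t(mean_x, mean_y, s_x, s_y, n_x, n_y):
    """Pooled variance, degrees of freedom and t statistic for two independent samples."""
    df = n_x + n_y - 2
    s2_p = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df
    t = (mean_x - mean_y) / sqrt(s2_p / n_x + s2_p / n_y)
    return s2_p, df, t

s2_p, df, t = pooled_t(1078, 908.2, 633, 469.8, 25, 25)
T_TABLE = 1.677  # t(48, 0.05), from the answer to Question 5d
print(f"pooled variance = {s2_p:.0f}, df = {df}, t = {t:.2f}")
print("|t| exceeds the table value:", abs(t) > T_TABLE)
```

Because the statistic is smaller in absolute value than the table value, it cannot fall in a one-sided rejection region at α = 0.05.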

    Question 7

    nP0(1 – P0) > 5

    Question 8a

    \[ \hat{p}_{0} = \frac{n_{x} \hat{p}_{x} + n_{y} \hat{p}_{y}}{n_{x} + n_{y}} = \frac{(270)(0.185) + (203)(0.399)}{270 + 203} = 0.277 \]

    Question 8b

    \[ z = \frac{0.185 - 0.399}{ \sqrt{ \frac{ (0.277)(1 - 0.277) }{270} + \frac{ (0.277)(1 - 0.277) }{203} } } = -5.15 \]

    Question 8c

    –z0.05 = -1.645

    Question 8d

    Reject H0 if z < –z0.05 = -1.645

    Question 8e

    Yes, we can reject the null hypothesis that there is no difference in proportions between these two populations, because -5.15 < -1.645.
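
As a check on Questions 8a to 8e, the pooled proportion and the z statistic can be computed directly. This is a sketch under the stated lower-tail alternative (H1: Px < Py); `two_proportion_z` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p_x, p_y, n_x, n_y):
    """Pooled proportion and z statistic for H0: P_x = P_y."""
    p0 = (n_x * p_x + n_y * p_y) / (n_x + n_y)
    se = sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
    return p0, (p_x - p_y) / se

p0, z = two_proportion_z(0.185, 0.399, 270, 203)
z_crit = NormalDist().inv_cdf(0.05)  # lower-tail critical value, about -1.645
print(f"p0 = {p0:.3f}, z = {z:.2f}, reject H0: {z < z_crit}")
```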

    Question 9a

    dfnumerator = (nx - 1) = 17 – 1 = 16 and dfdenominator = (ny - 1) = 11 – 1 = 10.

    Question 9b

    From Appendix Table 9 (in the book) it follows that: F16,10,0.01 = 4.520

    Question 9c

    \[ F = \frac{s^{2}_{x}}{s^{2}_{y}} = \frac{123.35}{8.02} = 15.380 \]
    The test statistic F = 15.380 clearly exceeds the critical value of 4.520. Hence, the null hypothesis can be rejected in favor of the alternative hypothesis.

     


     

     

    How to conduct a simple regression? - ExamTests 11

     

     

    Questions

    Question 1a

    Suppose we are interested in the relationship between the number of workers (denoted by X) and the number of tables produced per hour (Y). A sample of 10 workers is provided. The following descriptive statistics are obtained:
    \[Cov(x,y) = 106.93 \hspace{5mm} s^{2}_{x} = 42.01 \hspace{5mm} \bar{y} = 41.2 \hspace{5mm} \bar{x} = 21.3 \]
    Compute the slope of the sample regression.

    Question 1b

    Compute the y-intercept for the sample regression.

    Question 1c

    What is the equation of the regression line?

    Question 1d

    If management decides to employ 25 workers, how many tables would we expect to be produced?

    Question 2

    The following regression equation is given: Y = 559 + 0.3815X.
    What is the expected value of Y for X = 55,000?

    Question 3a

    Use the following regression equation:
    Y = 100 + 21X
    Interpret the slope of the regression line.

    Question 3b

    What is the change in Y when X changes by +5?

    Question 3c

    What is the change in Y when X changes by -7?

    Question 3d

    What is the predicted value of Y when X = 14?

    Question 3e

    What is the predicted value of Y when X = 27?

    Question 3f

    Does this equation prove that a change in X causes a change in Y?

    Question 4a

    Given the regression equation:
    Y = 107 + 10X
    What is the change in Y when X changes by +2?

    Question 4b

    What is the change in Y when X changes by -4?

    Question 4c

    What is the predicted value of Y when X = 15?

    Question 4d

    What is the predicted value of Y when X = 22?

    Question 5

    Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 10; ȳ = 50; sx = 80; sy = 75; rxy = 0.4; n = 60.

    Question 6

    Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 60; ȳ = 50; sx = 80; sy = 65; rxy = 0.7; n = 60.

    Question 7

    Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 90; ȳ = 100; sx = 60; sy = 70; rxy = 0.4; n = 60.

    Question 8

    The following information is provided: SSE = 17.89 and SST = 68.22. What is the percent explained variability?

    Question 9

    What absolute value of the Student's t statistic indicates a relationship between two variables when we use a two-tailed test with α = 0.05 and n > 60?

    Question 10a

    Given the simple regression model
    \[ Y = \beta_{0} + \beta_{1}X \]
    and the regression results that follow, test the null hypothesis that the slope coefficient is zero against the alternative hypothesis that the slope coefficient differs from zero, using a probability of Type I error equal to 0.01, and determine the two-sided 99% confidence interval. The following sample statistics are provided: n = 22; b1 = 0.3815; sb1 = 0.0253.

    Question 10b

    Consider your answer on the previous question. Based on this result, what do you conclude about the slope coefficient?

    Question 11

    Which four factors result in narrower prediction intervals?

    Question 12a

    Suppose we want to test H0: ρ = 0 against H1: ρ > 0 using the sample information: n = 49 and r = 0.43.
    What is the test statistic?

    Question 12b

    What is the critical value if we are testing at a 0.5% significance level?

    Question 12c

    What do we conclude about the population correlation?

    Question 13

    Suppose we have the following information: n = 25. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?

    Question 14

    Suppose we have the following information: n = 64. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?

    Question 15

    Which two factors can influence the estimated regression equation?

    Question 16

    Points with a high leverage will have a .... standard error of the residual.

    Answer indication

    Question 1a

    \[ b_{1} = \frac{Cov(x,y)}{s^{2}_{x}} = r \frac{s_{y}}{s_{x}} = \frac{106.93}{42.01} = 2.545 \]

    Question 1b

    \[ b_{0} = \bar{y} - b_{1}\bar{x} = 41.2 - 2.545(21.3) = -13.02 \]

    Question 1c

    \[ \hat{y} = b_{0} + b_{1}x = -13.02 + 2.545x \]

    Question 1d

    \[ \hat{y} = -13.02 + 2.545(25) = 50.605 \]
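
The slope, intercept and prediction of Question 1 can be reproduced from the summary statistics alone. The sketch below uses a hypothetical helper `fit_from_summary`; because it keeps the unrounded coefficients, the prediction comes out at about 50.62 instead of the 50.605 obtained above with rounded values.

```python
def fit_from_summary(cov_xy, var_x, mean_x, mean_y):
    """Least squares coefficients from Cov(x, y), s_x^2 and the two sample means."""
    b1 = cov_xy / var_x
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = fit_from_summary(cov_xy=106.93, var_x=42.01, mean_x=21.3, mean_y=41.2)
y_hat_25 = b0 + b1 * 25  # expected tables per hour with 25 workers
print(f"b1 = {b1:.3f}, b0 = {b0:.2f}, prediction at x = 25: {y_hat_25:.2f}")
```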

    Question 2

    Y = 559 + 0.3815*55,000 = 21,542

    Question 3a

    For every one-unit change in X, Y changes by 21.

    Question 3b

    If X changes by +5, Y changes by (21)(5) = 105

    Question 3c

    If X changes by -7, Y changes by (21)(-7) = -147

    Question 3d

    Y = 100 + (21)(14) = 394

    Question 3e

    Y = 100 + (21)(27) = 667

    Question 3f

    No, regression results summarize the information contained in the data. They do not prove causation.

    Question 4a

    If X changes by +2, Y changes by (10)(2) = 20

    Question 4b

    If X changes by -4, Y changes by (10)(-4) = -40

    Question 4c

    Y = 107 + (10)(15) = 257

    Question 4d

    Y = 107 + (10)(22) = 327

    Question 5

    \[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{75}{80} = 0.375 \]
    \[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.375(10) = 46.25 \]
    \[ \hat{y}_{i} = 46.25 + 0.375x_{i} \]

    Question 6

    \[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.7 \frac{65}{80} = 0.569 \]
    \[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.569(60) = 15.875 \]
    \[ \hat{y}_{i} = 15.875 + 0.569x_{i} \]

    Question 7

    \[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{70}{60} = 0.467 \]
    \[ b_{0} = \bar{y} - b_{1}\bar{x} = 100 - 0.467(90) = 58 \]
    \[ \hat{y}_{i} = 58 + 0.467x_{i} \]
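
When only the correlation, the standard deviations and the means are given, the same two formulas apply: b1 = r(sy/sx) and b0 = ȳ - b1x̄. A minimal sketch, with `coefficients_from_r` as an illustrative name, applied to the statistics of Questions 5 and 7:

```python
def coefficients_from_r(r, s_x, s_y, mean_x, mean_y):
    """Least squares intercept and slope from r, s_x, s_y and the sample means."""
    b1 = r * s_y / s_x
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0_q5, b1_q5 = coefficients_from_r(r=0.4, s_x=80, s_y=75, mean_x=10, mean_y=50)
b0_q7, b1_q7 = coefficients_from_r(r=0.4, s_x=60, s_y=70, mean_x=90, mean_y=100)
print(f"Question 5: y_hat = {b0_q5:.2f} + {b1_q5:.3f} x")
print(f"Question 7: y_hat = {b0_q7:.2f} + {b1_q7:.3f} x")
```

Note that the sample size n does not enter these formulas; it only matters later, for inference about the coefficients.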

    Question 8

    \[ R^{2} = 1 - \frac{SSE}{SST} = 1 - \frac{17.89}{68.22} = 0.738 \]
    Thus, 73.8% of the variability is explained by the regression model.

    Question 9

    According to the rule of thumb, the absolute value of the Student's t statistic should be greater than 2.0 to indicate that there is a relationship.

    Question 10a

    For a 99% confidence interval we have 1 - α = 0.99, so α/2 = 0.005, and n - 2 = 22 - 2 = 20 degrees of freedom. Hence, from Appendix Table 8 (see book) it follows that:
    \[ t_{n-2,\alpha/2} = t_{20,0.005} = 2.845 \]
    Therefore, the 99% confidence interval is:
    \[ 0.3815 - (2.845)(0.0253) < \beta_{1} < 0.3815 + (2.845)(0.0253) \]
    \[ 0.3095 < \beta_{1} < 0.4535 \]
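
The interval above is simply b1 plus or minus the critical t value times the standard error. A small sketch, where `slope_ci` is a hypothetical helper and the critical value 2.845 is passed in from the table because the Python standard library has no t quantile function:

```python
def slope_ci(b1, se_b1, t_crit):
    """Two-sided confidence interval for a regression slope."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

low, high = slope_ci(b1=0.3815, se_b1=0.0253, t_crit=2.845)  # t(20, 0.005) from the table
contains_zero = low <= 0 <= high
print(f"{low:.4f} < beta1 < {high:.4f}, contains zero: {contains_zero}")
```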

    Question 10b

    The confidence interval does not contain zero. Therefore, we can reject the null hypothesis and conclude that the slope coefficient is not equal to zero.

    Question 11

    1. A larger sample size (n).
    2. A smaller value of s²e.
    3. A larger dispersion of the observations of the independent variable.
    4. Smaller values of the quantity (xn+1 - x̅)².

    Question 12a

    \[ t = \frac{0.43 \sqrt{(49 - 2)}}{\sqrt{1 - (0.43)^{2}}} = 3.265 \]

    Question 12b

    Since there are (n - 2) = 47 degrees of freedom, it follows from Appendix Table 8 that t47,0.005 = 2.704

    Question 12c

    t47,0.005 = 2.704 < t. Therefore, we can reject the null hypothesis. There is strong evidence of a positive linear relationship between the two variables. Note, however, that we cannot conclude from this result that one variable caused the other, but only that they are related.
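
The test statistic for a zero population correlation, as used in Question 12, can be sketched as below. The function name is illustrative, r = 0.43 is the value used in the worked answer, and 2.704 is the table value quoted above.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r**2)

t = correlation_t(r=0.43, n=49)
T_TABLE = 2.704  # critical value quoted in the answer to Question 12b
print(f"t = {t:.3f}, reject H0: {t > T_TABLE}")
```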

    Question 13

    \[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{25}} = 0.4 \]

    Question 14

    \[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{64}} = 0.25 \]

    Question 15

    Points with a high leverage and outliers.

    Question 16

    Smaller.

     


     

     

    How to conduct a multiple regression? - ExamTests 12

     

     

    Questions

    Question 1a

    \[ \hat{y} = 12 + 5x_{1} + 6x_{2} + 2x_{3} \]
    Compute the expected value of y when x1 = 11, x2 = 24, and x3 = 27.

    Question 1b

    Compute the expected value of y when x1 = 31, x2 = 20, and x3 = 17.

    Question 1c

    Compute the expected value of y when x1 = 32, x2 = 29, and x3 = 13.

    Question 1d

    Compute the expected value of y when x1 = 30, x2 = 26, and x3 = 29.

    Question 2a

    \[ \hat{y} = 10 + 5x_{1} + 4x_{2} + 2x_{3} \]
    Compute the expected value of y when x1 = 20, x2 = 11, and x3 = 10.

    Question 2b

    Compute the expected value of y when x1 = 15, x2 = 14, and x3 = 20.

    Question 2c

    Compute the expected value of y when x1 = 35, x2 = 19, and x3 = 25.

    Question 2d

    Compute the expected value of y when x1 = 10, x2 = 17, and x3 = 30.

    Question 3a

    \[ \hat{y} = 10 - 2x_{1} - 14x_{2} + 6x_{3} \]
    What is the change in y when x1 increases by 4?

    Question 3b

    What is the change in y when x3 increases by 1?

    Question 3c

    What is the change in y when x2 increases by 2?

    Question 4

    What is the fifth assumption of a multiple linear regression model?

    Question 5

    Compute the coefficient b1 for the regression model
    \[ \hat{y}_{i} = b_{0} + b_{1}x_{1i} + b_{2}x_{2i} \]
    given the following summary statistics:
    rx1y = 0.80, rx2y = 0.30, rx1x2 = 0.90, sx1 = 500, sx2 = 400, sy = 100

    Question 6

    Compute the coefficient b2 for the regression model (using the regression model and summary statistics of question 5).

    Question 7

    The following data are available: n = 25; K = 2; SSE = 0.0625; SST = 0.4640.
    Compute the adjusted coefficient of determination.

    Question 8

    When is the adjusted coefficient of determination preferred over the standard coefficient of determination?

    Question 9

    How is the coefficient of multiple correlation related to the multiple coefficient of determination?

    Question 10a

    b1 = 0.2372; sb1 = 0.0556; b2 = -0.000249; sb2 = 0.00003205.
    What is the critical t statistic for a two-tailed hypothesis test with a 99% confidence interval?

    Question 10b

    Provide the 99% confidence interval for β1.

    Question 10c

    Provide the 99% confidence interval for β2.

    Question 11a

    A researcher is testing the influence of four independent variables on a certain dependent variable using multiple regression (n = 88). He finds that, for the complete model with four predictor variables, SSE = 1,149.14. For a multiple regression model with only two of the four predictor variables, SSE = 1,426.93. The variance estimator is s2e = 13.52. Compute the F statistic.

    Question 11b

    How many degrees of freedom does the F statistic have?

    Question 11c

    What is the critical value for F with a significance level of 0.01?

    Question 11d

    What is a dummy variable?

    Question 12

    Formulate the null hypothesis and the alternative hypothesis for testing the slope coefficient in the event of dummy variables.

    Question 13

    What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
    \[ \hat{y} = 9 + 6x_{1} + 9x_{2} \]

    Question 14

    What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
    \[ \hat{y} = 7 + 4x_{1} + 2x_{2} \]

    Question 15

    What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
    \[ \hat{y} = 4 + 4x_{1} + 8x_{2} + 9x_{1}x_{2} \]

    Question 16

    Consider the following equation: yi = 2x1.4
    Compute the value of yi when xi = 1

    Question 17

    Consider the following equation: yi = 2x1.4
    Compute the value of yi when xi = 1

    Answer indication

    Question 1a

    \[ \hat{y} = 12 + (5)(11) + (6)(24) + (2)(27) = 265 \]

    Question 1b

    \[ \hat{y} = 12 + (5)(31) + (6)(20) + (2)(17) = 321 \]

    Question 1c

    \[ \hat{y} = 12 + (5)(32) + (6)(29) + (2)(13) = 372 \]

    Question 1d

    \[ \hat{y} = 12 + (5)(30) + (6)(26) + (2)(29) = 376 \]

    Question 2a

    \[ \hat{y} = 10 + (5)(20) + (4)(11) + (2)(10) = 174 \]

    Question 2b

    \[ \hat{y} = 10 + (5)(15) + (4)(14) + (2)(20) = 181 \]

    Question 2c

    \[ \hat{y} = 10 + (5)(35) + (4)(19) + (2)(25) = 311 \]

    Question 2d

    \[ \hat{y} = 10 + (5)(10) + (4)(17) + (2)(30) = 188 \]

    Question 3a

    The change in y when x1 increases by 4 is equal to (-2)(4) = -8.

    Question 3b

    The change in y when x3 increases by 1 is equal to (6)(1) = 6.

    Question 3c

    The change in y when x2 increases by 2 is equal to (-14)(2) = -28.

    Question 4

    There is no direct linear relationship between the independent variables.

    Question 5

    \[ b_{1} = \frac{ s_{y} (r_{x1y} - r_{x1x2}r_{x2y} ) }{s_{x1} (1 - r^{2}_{x1x2})} = \frac{100 (0.80 - 0.90*0.30)}{500 (1 - 0.90^{2})} = 0.56 \]

    Question 6

    \[ b_{2} = \frac{s_{y} (r_{x2y} - r_{x1x2} r_{x1y} ) }{s_{x2} (1 - r^{2}_{x1x2})} = \frac{100 (0.30 - 0.90*0.80)}{400 (1 - 0.90^{2})} = -0.55 \]
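
Both slope formulas share the denominator factor (1 - r²x1x2), so they are convenient to compute together. A sketch with a hypothetical helper name:

```python
def two_predictor_slopes(r_x1y, r_x2y, r_x1x2, s_x1, s_x2, s_y):
    """Slopes b1 and b2 of a two-predictor regression from correlations and standard deviations."""
    shared = 1 - r_x1x2**2
    b1 = s_y * (r_x1y - r_x1x2 * r_x2y) / (s_x1 * shared)
    b2 = s_y * (r_x2y - r_x1x2 * r_x1y) / (s_x2 * shared)
    return b1, b2

b1, b2 = two_predictor_slopes(0.80, 0.30, 0.90, 500, 400, 100)
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}")
```

The negative b2, despite a positive correlation between x2 and y, reflects the strong correlation of 0.90 between the two predictors.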

    Question 7

    \[ \bar{R}^{2} = 1 - \frac{0.0625/22}{0.4640/24} = 0.853 \]
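
For comparison, the unadjusted and adjusted coefficients of determination for Question 7 can be computed side by side. The function names are illustrative only.

```python
def r_squared(sse, sst):
    """Coefficient of determination."""
    return 1 - sse / sst

def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2: each sum of squares is divided by its degrees of freedom."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

r2 = r_squared(0.0625, 0.4640)
r2_adj = adjusted_r_squared(0.0625, 0.4640, n=25, k=2)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```

The adjusted value is always somewhat smaller than the unadjusted one, and the gap widens as more predictors are added relative to the sample size.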

    Question 8

    The adjusted coefficient of determination corrects for the fact that even irrelevant independent variables will result in a (small) reduction in the error sum of squares (SSE). Consequently, the adjusted coefficient of determination offers a better comparison between multiple regression models with different numbers of independent variables.

    Question 9

    The coefficient of multiple correlation is equal to the square root of the multiple coefficient of determination.

    Question 10a

    tn-K-1,α/2 = t22,0.005 = 2.819

    Question 10b

    0.2372 - (2.819)(0.0556) < β1 < 0.2372 + (2.819)(0.0556)
    0.080 < β1 < 0.394

    Question 10c

    -0.000249 - (2.819)(0.0000320) < β2 < -0.000249 + (2.819)(0.0000320)
    -0.000339 < β2 < -0.000159

    Question 11a

    \[ F = \frac{(1426.93 - 1149.14)/2}{13.52} = 10.27 \]
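
The F statistic above compares the error sum of squares of the restricted model with that of the full model. A sketch, where `partial_f` is a hypothetical name and r denotes the number of coefficients tested simultaneously:

```python
def partial_f(sse_restricted, sse_full, r, s2_e):
    """F statistic for testing r coefficients jointly, given the full-model variance estimate."""
    return ((sse_restricted - sse_full) / r) / s2_e

f_stat = partial_f(sse_restricted=1426.93, sse_full=1149.14, r=2, s2_e=13.52)
print(f"F = {f_stat:.2f}")
```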

    Question 11b

    The F statistic has 2 degrees of freedom (i.e., for the two variables tested simultaneously) for the numerator and 85 degrees of freedom for the denominator.

    Question 11c

    F* = 4.9 (see Appendix Table 9)

    Question 11d

    A dummy variable is a variable with two possible outcomes: 0 and 1.

    Question 12

    \[ H_{0}: \beta_{3} = 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]
    \[ H_{1}: \beta_{3} \neq 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]

    Question 13

    18

    Question 14

    9

    Question 15

    12

    Question 16

    2.64

    Question 17

    5.28

     


     

     

    What other topics are important in regression analysis? - ExamTests 13

     

     

    Questions

    Question 1

    What are the four stages of model building?

    Question 2

    If a model cannot be verified, what should you do?

    Question 3

    In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for ... and ... variables.

    Question 4

    If a blocking variable has 4 levels, how many dummy variables should be created?

    Question 5

    What is a treatment variable?

    Question 6

    What is a blocking variable?

    Question 7

    What is a lagged value?

    Question 8

    What is multicollinearity?

    Question 9

    Suppose that all the coefficient Student's t statistics are small, indicating no individual effect, and yet the overall F statistic indicates a strong effect for the total regression model. What is this an indication of?

    Question 10

    How can multicollinearity be corrected?

    Question 11

    What is the danger of correcting multicollinearity by removing one or more of the highly correlated independent variables?

    Question 12

    What are the four assumptions made in a simple linear regression analysis?

    Question 13

    What is the fifth assumption that is added for multiple regression analysis?

    Question 14

    What is heteroscedasticity?

    Question 15

    Describe one procedure to check for heteroscedasticity.

    Question 16a

    From the regression of the squared residuals on the predicted values, we obtain the following estimated model (for n = 25):
    \[ e^{2} = 0.00621 - 0.00550 \hat{y} \hspace{2mm} with \hspace{2mm} R^{2} = 0.066 \]
    Compute the test statistic.

    Question 16b

    What is the critical value if we are testing with a 10% significance level?

    Question 16c

    Can we reject the null hypothesis that the regression model has uniform variance?

    Question 17

    What is the meaning of ρ for (auto)correlated errors?

    Question 18

    What does it imply if ρ = 0?

    Question 19

    What does it imply if ρ = 0.3?

    Question 20

    What does it imply if ρ = 0.9?

    Question 21a

    What is the most commonly used test to check possible autocorrelation of error terms?

    Question 21b

    Formulate the null hypothesis of this test.

    Question 22

    Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H1: ρ > 0.

    Question 23

    Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H1: ρ < 0.

    Question 24

    Suppose we found d = 0.2015, indicating positive autocorrelation. Estimate the serial correlation.

    Question 25

    Suppose we found d = 0.5213, indicating positive autocorrelation. Estimate the serial correlation.

    Question 26a

    In determining whether the errors in a regression model are positively correlated for the model
    \[ y_{t} = \beta_{0} + \beta_{1}x_{1t} + \epsilon_{t} \]
    we determine
    \[ \sum^{30}_{t = 1}e^{2}_{t} = 7587.9154 \]
    and
    \[ \sum^{30}_{t = 2} (e_{t} - e_{t - 1})^{2} = 8195.2065 \]
    Formulate the null and alternative hypothesis for the mentioned analysis.

    Question 26b

    Calculate the Durbin-Watson statistic.

    Answer indication

    Question 1

    Model building consists of four stages: (1) model specification; (2) coefficient estimation; (3) model verification, and; (4) interpretation and inference.

    Question 2

    Go back to the first stage; model specification.

    Question 3

    In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for treatment and blocking variables.

    Question 4

    3

    Question 5

    A treatment variable is a variable whose effect we are interested in estimating with minimum variance. For instance, we may want to know which of four different production machines provides the highest productivity per hour. In this example, the treatment variable is the production machine, represented by a four-level categorical variable.

    Question 6

    A blocking variable is a variable that is part of the environment. Therefore, the level of such a variable cannot be preselected.

    Question 7

    When time series are analyzed (i.e., when measurements are taken over time) lagged values of the dependent variable are an important issue. Often in time series data, the dependent variable in time period t is related to the value taken by this dependent variable in an earlier time period, that is yt-1. The lagged value then is the value of the dependent variable in this previous time period.

    Question 8

    Multicollinearity refers to a state of very high intercorrelations among the independent variables.

    Question 9

    Multicollinearity

    Question 10

    1. Remove one or more of the highly correlated independent variables.
    2. Change the model specification, including possibly a new independent variable that is a function of several correlated independent variables.
    3. Obtain additional data that do not have the same strong correlations between the independent variables.

    Question 11

    This might lead to a bias in coefficient estimation

    Question 12

    1. The Y's are linear functions of X, plus a random error term.
    2. The x values are fixed numbers that are independent of the error terms.
    3. The error terms are assumed to be random variables with a mean of zero and a variance of σ².
    4. The random error terms are not correlated with one another.

    Question 13

    There is no direct linear relationship between the Xj independent variables.

    Question 14

    Heteroscedasticity refers to the situation in which the error terms do not have uniform variance.

    Question 15

    One possibility to check for heteroscedasticity is by examining a scatter plot of the residuals versus the independent variable. If the magnitude of the error terms tends to increase (or decrease) for increasing values of the independent variable, this indicates that the error variances are not constant.

    Question 16a

    \[ nR^{2} = (25)(0.066) = 1.65 \]

    Question 16b

    From Appendix Table 7, it can be found that for a 10% significance level, the critical value is: χ²1,0.10 = 2.706

    Question 16c

    The test statistic does not exceed the critical value, therefore the null hypothesis cannot be rejected.

    Question 17

    This ρ is the correlation coefficient (range -1 to +1) between the error in time t and the error in the previous time point, that is t - 1.

    Question 18

    If ρ = 0, this means that there is no autocorrelation in the errors.

    Question 19

    There is a relatively weak autocorrelation.

    Question 20

    There is a quite strong autocorrelation.

    Question 21a

    Durbin-Watson test.

    Question 21b

    H0: ρ = 0.

    Question 22

    Reject H0 if d < dL. Accept H0 if d > dU. Test inconclusive if dL < d < dU.

    Question 23

    Reject H0 if d > 4 - dL. Accept H0 if d < 4 - dU. Test inconclusive if 4 - dL > d > 4 - dU.

    Question 24

    \[ r = 1 - \frac{d}{2} = 1 - \frac{0.2015}{2} = 0.90 \]

    Question 25

    \[ r = 1 - \frac{d}{2} = 1 - \frac{0.5213}{2} = 0.74 \]

    Question 26a

    H0: ρ = 0 and H1: ρ > 0.

    Question 26b

    \[ d = \frac{ \sum^{n}_{t = 2} (e_{t} - e_{t-1})^{2} }{\sum^{n}_{t=1} e^{2}_{t}} = \frac{8195.2065}{7587.9154} = 1.08 \]
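
When the residuals themselves are available, the Durbin-Watson statistic can be computed directly. The sketch below is illustrative: `durbin_watson` is a hypothetical helper, the short residual list is made up purely to show the mechanics, and the second calculation reuses the two sums given in Question 26 together with the estimate r = 1 - d/2 from Questions 24 and 25.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals that alternate in sign (negative autocorrelation pushes d above 2)
d_example = durbin_watson([1.0, -1.0, 1.0, -1.0])

# Question 26: only the two sums are given, so divide them directly
d_q26 = 8195.2065 / 7587.9154
r_q26 = 1 - d_q26 / 2  # estimated serial correlation
print(f"example d = {d_example:.1f}, Question 26 d = {d_q26:.2f}, r = {r_q26:.2f}")
```

Values of d near 2 correspond to no autocorrelation, values near 0 to strong positive autocorrelation, and values near 4 to strong negative autocorrelation.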

     


     

     

    How to analyze categorical data? - ExamTests 14

     

     

    Questions

    Question 1a

    Consider the following data:

    Category                                 A      B      C      D      Total
    Observed number of objects               43     53     60     44     200
    Probability (under H0)                   1/4    1/4    1/4    1/4    1
    Expected number of objects (under H0)    50     50     50     50     200

    Compute the chi-square test statistic.

    Question 1b

    What are the degrees of freedom for the critical test statistic?

    Question 1c

    Provide the range of the test statistic with probability .10 and .90 using Table 7a and 7b.

    Question 1d

    Can we reject the null hypothesis that there is no preference for any of the four categories?

    Question 2a

    Consider the following data:

    Category                                 A      B      C      D      Total
    Observed number of objects               50     93     45     12     200
    Probability (under H0)                   0.30   0.50   0.15   0.05   1
    Expected number of objects (under H0)    ...    ...    ...    ...    200

    Compute the expected values based on the null hypothesis that is specified in the table.

    Question 2b

    Compute the chi-square test statistic.

    Question 2c

    How many degrees of freedom are there?

    Question 2d

    From Appendix Table 7 with K - 1 degrees of freedom, it is found that the test statistic falls between .... and ....

    Question 2e

    Can the null hypothesis be rejected?

    Question 3a

    Consider the following data:

    Category                                 A      B      C      D      Total
    Observed number of objects               287    49     30     34     400
    Probability (under H0)                   0.80   0.10   0.06   0.04   1
    Expected number of objects (under H0)    ...    ...    ...    ...    400

    Compute the expected values based on the null hypothesis that is specified in the table.

    Question 3b

    Compute the chi-square test statistic.

    Question 3c

    How many degrees of freedom are there?

    Question 3d

    Find the critical value using a significance level of 0.001.

    Question 3e

    Can the null hypothesis be rejected?

    Question 4a

    It is tested whether the population distribution is Poisson. Consider the following data:

    Number of occurrences            0        1       2       3+
    Observed frequency               156      63      29      14
    Expected frequency under H0      135.4    89.4    29.5    7.7

    Compute the test statistic.

    Question 4b

    How many degrees of freedom are there?

    Question 4c

    Find the corresponding critical value using a 0.001 significance level.

    Question 4d

    Can the null hypothesis that the population distribution is Poisson be rejected?

    Question 5

    Suppose we are interested in whether people prefer pineapple on their pizza. We sample 7 participants under the null hypothesis H0: P = 0.5. What is the probability of obtaining no more than 2 people with a preference for pineapple on their pizza?

    Question 6

    Suppose the test statistic for the sign test in Question 5 is S = 2. Can we reject the null hypothesis?

    Question 7a

    A random sample of 100 students was asked to compare two new ice cream flavors: grilled BBQ and bubblegum surprise. After testing both flavors, 56 students preferred grilled BBQ, 40 students preferred bubblegum surprise, and 4 expressed no preference. Use the normal approximation to determine the mean and standard deviation of the sign test statistic (the number of students preferring bubblegum surprise) under the null hypothesis.

    Question 7b

    Compute the test statistic using the normal approximation and continuity correction.

    Question 7c

    Find the approximate p-value.

    Question 7d

    Can we reject the null hypothesis?

    Question 7e

    What will be the test statistic if the continuity correction is not used?

    Question 8

    Given a random sample of n = 31 matched pairs, compute the mean and standard deviation for the Wilcoxon statistic under the null hypothesis.

    Question 9

    Now, suppose we find that the observed value of the statistic is T = 189. If we test the null hypothesis against a lower-tail alternative hypothesis with significance level 0.05, what can we conclude about the null hypothesis?

    Question 10

    Two independent samples are considered with n1 = 10, n2 = 12 and R1 = 93.5.
    Compute the mean and variance for the Mann-Whitney statistic.

    Question 11

    Compute the Mann-Whitney U statistic.

    Question 12

    What can we conclude about the null hypothesis if we are testing with a significance level of 0.05?

    Answer indication

    Question 1a

    χ² = 3.88

    Question 1b

    df = K - 1 = 4 - 1 = 3.

    Question 1c

    Lower critical value (Appendix Table 7b): χ²3,0.90 = 0.584
    Upper critical value (Appendix Table 7a): χ²3,0.10 = 6.251

    Question 1d

    It is found that the test statistic of 3.88 falls between 0.584 and 6.251; from this it follows that 0.10 < p-value < 0.90. The null hypothesis can therefore not be rejected. However, this does not mean that we can conclude that all four categories are equally preferred. It only means that there is not enough evidence to support a preference.

    Question 2a

    \[ E_{A} = nP_{A} = 200(0.30) = 60 \]
    \[ E_{B} = nP_{B} = 200(0.50) = 100 \]
    \[ E_{C} = nP_{C} = 200(0.15) = 30 \]
    \[ E_{D} = nP_{D} = 200(0.05) = 10 \]

    Question 2b

    \[ \chi^{2} = 10.06 \]

    Question 2c

    df = K - 1 = 4 - 1 = 3.

    Question 2d

    From Appendix Table 7 with K - 1 = 3 degrees of freedom, it is found that the test statistic falls between the critical values 9.348 (significance level 0.025) and 11.345 (significance level 0.01).

    Question 2e

    0.01 < p-value < 0.025. Hence, the null hypothesis can be rejected at the 5% significance level.

    Question 3a

    \[ E_{A} = nP_{A} = 400(0.80) = 320 \]
    \[ E_{B} = nP_{B} = 400(0.10) = 40 \]
    \[ E_{C} = nP_{C} = 400(0.06) = 24 \]
    \[ E_{D} = nP_{D} = 400(0.04) = 16 \]

    Question 3b

    \[ \chi^{2} = 27.178 \]

    Question 3c

    df = K - 1 = 4 - 1 = 3.

    Question 3d

    From Appendix Table 7 with K - 1 = 3 degrees of freedom and significance level 0.001, it is found that \[ \chi^{2}_{3,0.001} = 16.266 \]

    Question 3e

    The test statistic is much larger than the critical value. Hence, the null hypothesis can be rejected.

    Question 4a

    \[ \chi^{2} = 16.08 \]

    Question 4b

    df = K - m - 1 = 4 - 1 - 1 = 2

    Question 4c

    \[ \chi^{2}_{2,0.001} = 13.816 \]

    Question 4d

    The test statistic exceeds the critical value, thus the null hypothesis that the population distribution is Poisson can be rejected at the 0.1% significance level.
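
    The table look-ups in questions 1 to 4 can be cross-checked numerically. The sketch below is not part of the original answers; it uses the closed-form upper-tail probability of the chi-square distribution, which is only simple for 2 and 3 degrees of freedom (the two cases that occur above). The function name `chi2_upper_tail` is an illustrative choice.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x).

    Only df = 2 and df = 3 are covered, because those have short closed forms.
    """
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 2 and df = 3")


# Test statistics and degrees of freedom taken from the answers to questions 1-4.
for label, stat, df in [("1", 3.88, 3), ("2", 10.06, 3), ("3", 27.178, 3), ("4", 16.08, 2)]:
    p_value = chi2_upper_tail(stat, df)
    print(f"Question {label}: chi-square = {stat}, df = {df}, p-value = {p_value:.4f}")
```

    A statistics library would normally be used for other degrees of freedom; the closed forms are used here only to keep the example free of dependencies.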

    Question 5

    p-value = P(x < 2) = 0.227 (see Appendix Table 3)

    Question 6

    No, with a p-value this large, the null hypothesis cannot be rejected.

    Question 7a

    Let P be the population proportion that prefers bubblegum surprise, given S = 40.
    \[ \mu = np = 0.5n = 0.5(96) = 48 \]
    \[ \sigma = 0.5 \sqrt{96} = 4.899 \]

    Question 7b

    Since 40 < 48, S* = 40.5
    \[ z = \frac{S* - \mu}{\sigma} = \frac{40.5 - 48}{4.899} = -1.53 \]

    Question 7c

    From the standard normal distribution, it follows that the approximate p-value = 2(0.0630) = 0.126

    Question 7d

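    The p-value of 0.126 exceeds conventional significance levels such as 5% and 10%, so the null hypothesis cannot be rejected at those levels. It can only be rejected at significance levels greater than 12.6%.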

    Question 7e

    If no continuity correction factor is used, the value for the test statistic becomes Z = -1.633, yielding a slightly smaller p-value of 0.1024.
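
    The calculations for questions 7a to 7e can be reproduced with a few lines of Python. This is a minimal sketch, assuming a two-sided sign test with p = 0.5 under the null hypothesis and n = 96 students who expressed a preference; the helper names `normal_cdf` and `sign_test_normal` are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def sign_test_normal(successes, n, continuity=True):
    """Normal approximation to the sign test with p = 0.5 under H0.

    Returns (mu, sigma, z, two_sided_p_value).
    """
    mu = 0.5 * n
    sigma = 0.5 * math.sqrt(n)
    s = successes
    if continuity:
        # Move the observed count half a unit toward the mean.
        if s < mu:
            s += 0.5
        elif s > mu:
            s -= 0.5
    z = (s - mu) / sigma
    p_value = 2 * normal_cdf(-abs(z))
    return mu, sigma, z, p_value


mu, sigma, z_cc, p_cc = sign_test_normal(40, 96, continuity=True)
_, _, z_raw, p_raw = sign_test_normal(40, 96, continuity=False)
print(f"mu = {mu}, sigma = {sigma:.3f}")
print(f"with correction:    z = {z_cc:.3f}, p = {p_cc:.4f}")
print(f"without correction: z = {z_raw:.3f}, p = {p_raw:.4f}")
```

    Small differences from the tabled p-values arise because the table look-up rounds z to two decimals first.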

    Question 8

    \[ \mu_{T} = \frac{n(n + 1)}{4} = \frac{(31)(32)}{4} = 248 \]
    \[ Var(T) = \sigma^{2}_{T} = \frac{n(n + 1)(2n + 1)}{24} = \frac{ (31)(32)(63) }{24} = 2604 \]
    \[ \sigma_{T} = \sqrt{2604} = 51.03 \]

    Question 9

    \[ Z = \frac{T - \mu_{T}}{\sigma_{T}} = \frac{189 - 248}{51.03} = \frac{-59}{51.03} = -1.16 \]
    For α = 0.05, zα = -1.645
    The test statistic (-1.16) does not fall below the critical value (-1.645), hence there is not enough evidence to reject the null hypothesis.
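
    The answers to questions 8 and 9 follow directly from the formulas for the mean and variance of the Wilcoxon statistic. A small sketch, with the hypothetical helper name `wilcoxon_normal`, makes the arithmetic explicit:

```python
import math


def wilcoxon_normal(t_observed, n):
    """Mean, variance and z-score of the Wilcoxon signed rank statistic under H0."""
    mu = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    z = (t_observed - mu) / math.sqrt(var)
    return mu, var, z


mu_t, var_t, z = wilcoxon_normal(189, 31)
critical_z = -1.645  # lower-tail critical value for alpha = 0.05, as in the answer
print(f"mu_T = {mu_t}, Var(T) = {var_t}, sigma_T = {math.sqrt(var_t):.2f}")
print(f"z = {z:.2f}, reject H0: {z < critical_z}")
```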

    Question 10

    \[ E(U) = \mu_{U} = \frac{n_{1}n_{2}}{2} = \frac{ (10)(12) }{2} = 60 \]
    \[ Var(U) = \sigma^{2}_{U} = \frac{ n_{1}n_{2} (n_{1} + n_{2} + 1) }{12} = \frac{ (10)(12)(23) }{12} = 230 \]

    Question 11

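    \[ U = n_{1}n_{2} + \frac{n_{1}(n_{1} + 1)}{2} - R_{1} = (10)(12) + \frac{(10)(11)}{2} - 93.5 = 81.5 \]
    \[ Z = \frac{U - \mu_{U}}{\sigma_{U}} = \frac{81.5 - 60}{ \sqrt{230} } = 1.42 \]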

    Question 12

    The corresponding p-value = 0.1556. With a 0.05 significance level, this test result is not sufficient to conclude that the null hypothesis can be rejected.
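
    Questions 10 to 12 can be checked in one pass. The sketch below assumes the usual definition U = n1 n2 + n1(n1 + 1)/2 - R1, which reproduces the value 81.5 used in the answer to question 11, and a two-sided alternative; the function name `mann_whitney_normal` is illustrative.

```python
import math


def mann_whitney_normal(n1, n2, r1):
    """Mann-Whitney U from the rank sum R1, with its normal approximation.

    Returns (u, mean_u, var_u, z, two_sided_p_value).
    """
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    z = (u - mean_u) / math.sqrt(var_u)
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * 0.5 * (1 + math.erf(-abs(z) / math.sqrt(2)))
    return u, mean_u, var_u, z, p_value


u, mean_u, var_u, z, p_value = mann_whitney_normal(10, 12, 93.5)
print(f"U = {u}, E(U) = {mean_u}, Var(U) = {var_u}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

    The p-value computed from the unrounded z differs slightly from the tabled 0.1556, which is based on z rounded to 1.42.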

    How to conduct an analysis of variance? - ExamTests 15

     

    Questions

    Question 1

    What is the null hypothesis of a one-way analysis of variance?

    Question 2

    Suppose we found the following data: SSW = 12.18, n = 20, k = 3. Compute an estimate of the within-groups mean square.

    Question 3

    Suppose we found the following data: SSG = 21.55, n = 20, k = 3. Compute an estimate of the between-groups mean square.

    Question 4

    Compute the F ratio for the MSW and MSG calculated in the previous two questions.

    Question 5

    What are the degrees of freedom corresponding to the information provided in questions 2 and 3?

    Question 6

    What is the critical F value if we are testing with a 1% significance level?

    Question 7

    What can we conclude about the population means based on this F ratio?

    Question 8a

    Consider the following analysis of variance table:

    Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio
    Between groups | 1728 | 4 | |
    Within groups | 624 | .. | |
    Total | 2352 | 17 | |

    How many degrees of freedom does the within-groups sum of squares have?

    Question 8b

    Compute the mean squares for between groups.

    Question 8c

    Compute the mean squares for within groups.

    Question 8d

    Compute the F ratio.

    Question 8e

    Find the critical F value corresponding to a significance level of 0.05.

    Question 8f

    What can be concluded about the null hypothesis?

    Question 9a

    Consider the following analysis of variance table:

    Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio
    Between groups | 879 | .. | |
    Within groups | 798 | 16 | |
    Total | 1677 | 19 | |

    How many degrees of freedom does the between-groups sum of squares have?

    Question 9b

    Compute the mean squares for between groups.

    Question 9c

    Compute the mean squares for within groups.

    Question 9d

    Compute the F ratio.

    Question 9e

    Find the critical F value corresponding to a significance level of 0.05.

    Question 9f

    What can be concluded about the null hypothesis?

    Question 10a

    Consider for questions 10a-10i a two-way analysis of variance with one observation per cell and randomized blocks with the following results:

    Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio
    Between groups | 36 | 3 | MSG = SSG / (K - 1) |
    Between blocks | 75 | 6 | MSB = SSB / (H - 1) |
    Error | 99 | 18 | MSE = SSE / ((K - 1) (H - 1)) |
    Total | 210 | 27 | |

    Compute the mean squares for the between groups.

    Question 10b

    Compute the mean squares for between blocks.

    Question 10c

    Compute the mean squares for the error.

    Question 10d

    Compute the F ratio MSG / MSE.

    Question 10e

    Find the critical value for the hypothesis test that the between group means are equal using a 5% significance level.

    Question 10f

    What do we conclude about the null hypothesis that the between group means are equal?

    Question 10g

    Compute the F ratio MSB / MSE.

    Question 10h

    Find the critical value for the hypothesis test that the between block means are equal using a 5% significance level.

    Question 10i

    What do we conclude about the null hypothesis that the between block means are equal?

    Question 11a

    Consider the following data:

    Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio
    Between groups | 62.04 | 1 | 62.04 |
    Between blocks | 0.06 | 1 | 0.06 |
    Interaction | 1.85 | ... | 1.85 |
    Error | 23.31 | 63 | 0.37 |
    Total | 87.26 | 66 | |

    Compute the degrees of freedom for the interaction term.

    Question 11b

    Compute the F ratio for the interaction term.

    Answer indication

    Question 1

    All population means are equal, that is: H0: μ1 = μ2 = ... = μk for K populations.

    Question 2

    MSW = (12.18) / (20 - 3) = 0.72

    Question 3

    MSG = (21.55) / (3 - 1) = 10.78

    Question 4

    F = MSG / MSW = 10.775 / 0.7165 = 15.039 (computed with the unrounded mean squares)

    Question 5

    df = (K - 1) = 3 - 1 = 2 for the numerator
    df = (n - K) = 20 - 3 = 17 for the denominator

    Question 6

    \[ F_{2,17,0.01} = 6.112 \] (Appendix Table 9)

    Question 7

    The test value (15.039) exceeds the critical value (6.112), therefore we can reject the null hypothesis that the population mean is the same for all three groups.
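
    Questions 2 to 7 chain together, which a short sketch makes visible. The function name `one_way_f` is illustrative, and the critical value 6.112 is simply copied from the table look-up in question 6 rather than computed.

```python
def one_way_f(ssg, ssw, n, k):
    """Between- and within-groups mean squares and the F ratio for one-way ANOVA."""
    msg = ssg / (k - 1)   # between-groups mean square, K - 1 degrees of freedom
    msw = ssw / (n - k)   # within-groups mean square, n - K degrees of freedom
    return msg, msw, msg / msw


msg, msw, f_ratio = one_way_f(ssg=21.55, ssw=12.18, n=20, k=3)
critical_f = 6.112  # F(2, 17) at the 1% level, taken from the answer to question 6
print(f"MSG = {msg:.3f}, MSW = {msw:.3f}, F = {f_ratio:.3f}")
print(f"reject H0 at the 1% level: {f_ratio > critical_f}")
```

    Dividing the rounded mean squares (10.78 / 0.72) gives about 14.97; the value 15.039 is obtained when the unrounded mean squares are used, as the code does.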

    Question 8a

    The between-groups sum of squares has K - 1 = 4 degrees of freedom, thus K = 5. Further, the total sum of squares has n - 1 = 17 degrees of freedom, thus n = 18.
    As a result, we obtain: df = n - K = 18 - 5 = 13.

    Question 8b

    MSG = SSG / (K - 1) = 1728 / 4 = 432

    Question 8c

    MSW = SSW / (n - K) = 624 / 13 = 48

    Question 8d

    F = MSG / MSW = 432 / 48 = 9

    Question 8e

    \[ F_{4,13,0.05} = 3.179 \]

    Question 8f

    \[ F = 9 > F_{4,13,0.05} = 3.179 \] Therefore, we can reject the null hypothesis that the population means are equal.

    Question 9a

    n - 1 = 19 --> n = 20
    n - k = 16 --> 20 - k = 16 --> k = 4
    df = k - 1 = 4 - 1 = 3
    Thus, there are 3 degrees of freedom.

    Question 9b

    MSG = SSG / (K - 1) = 879 / 3 = 293

    Question 9c

    MSW = SSW / (n - K) = 798 / 16 = 49.875

    Question 9d

    F = MSG / MSW = 293 / 49.875 = 5.875

    Question 9e

    \[ F_{3,16,0.05} = 3.239 \]

    Question 9f

    \[ F = 5.875 > F_{3,16,0.05} = 3.239 \] Therefore, we can reject the null hypothesis that the population means are equal.
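
    Questions 8 and 9 both ask to complete a one-way ANOVA table from partial information. A minimal sketch of that bookkeeping is given here; the function name `complete_anova` is hypothetical, and the critical values are copied from the answers above rather than computed.

```python
def complete_anova(ssg, ssw, df_total, df_between=None, df_within=None):
    """Fill in the missing degrees of freedom, the mean squares and the F ratio.

    Exactly one of df_between and df_within may be omitted.
    """
    if df_between is None:
        df_between = df_total - df_within
    if df_within is None:
        df_within = df_total - df_between
    msg = ssg / df_between
    msw = ssw / df_within
    return df_between, df_within, msg, msw, msg / msw


# Question 8: the within-groups degrees of freedom are missing.
q8 = complete_anova(ssg=1728, ssw=624, df_total=17, df_between=4)
# Question 9: the between-groups degrees of freedom are missing.
q9 = complete_anova(ssg=879, ssw=798, df_total=19, df_within=16)

for name, result, critical_f in [("8", q8, 3.179), ("9", q9, 3.239)]:
    df_b, df_w, msg, msw, f_ratio = result
    print(f"Question {name}: df = ({df_b}, {df_w}), MSG = {msg}, MSW = {msw}, "
          f"F = {f_ratio:.3f}, reject H0 at 5%: {f_ratio > critical_f}")
```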

    Question 10a

    MSG = SSG / (K - 1) = 36 / 3 = 12

    Question 10b

    MSB = SSB / (H - 1) = 75 / 6 = 12.5

    Question 10c

    MSE = SSE / ((K - 1) (H - 1)) = 99 / 18 = 5.5

    Question 10d

    F = MSG / MSE = 12 / 5.5 = 2.18

    Question 10e

    \[ F_{3,18,0.05} = 3.160 \]

    Question 10f

    The test statistic (2.18) does not exceed the critical value (3.160), therefore we cannot reject the null hypothesis that the between-groups means are equal.

    Question 10g

    F = MSB / MSE = 12.5 / 5.5 = 2.27

    Question 10h

    \[ F_{6,18,0.05} = 2.661 \]

    Question 10i

    The test statistic (2.27) does not exceed the critical value (2.661), therefore we cannot reject the null hypothesis that the between-blocks means are equal.

    Question 11a

    df = 1

    Question 11b

    F = MSI / MSE = 1.85 / 0.37 = 5.
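
    For question 11, the missing degrees of freedom follow from the fact that the component degrees of freedom add up to the total. A brief sketch, using the values from the table in question 11a and the hypothetical helper name `interaction_f`:

```python
def interaction_f(ss_interaction, ss_error, df_total, df_groups, df_blocks, df_error):
    """Degrees of freedom and F ratio for the interaction term in a two-way ANOVA."""
    # The component degrees of freedom sum to the total degrees of freedom.
    df_interaction = df_total - df_groups - df_blocks - df_error
    ms_interaction = ss_interaction / df_interaction
    ms_error = ss_error / df_error
    return df_interaction, ms_interaction / ms_error


df_interaction, f_ratio = interaction_f(
    ss_interaction=1.85, ss_error=23.31,
    df_total=66, df_groups=1, df_blocks=1, df_error=63,
)
print(f"df(interaction) = {df_interaction}, F = {f_ratio:.2f}")
```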

     


    How to analyze data sets with measurements over time? - ExamTests 16

     

     

    Questions

    Question 1

    What is meant by a time series?

    Question 2

    What are the four components of a time series?

    Question 3

    Let the estimates of level and trend in year 5 be as follows:
    \[ \hat{x}_{5} = 347 \]
    \[ T_{5} = 13 \]
    What is the forecast for the next year using the Holt-Winters method?

    Question 4

    What is the forecast for year 7 using the Holt-Winters method for nonseasonal series?

    Question 5

    What is the forecast for year 8 using the Holt-Winters method for nonseasonal series?

    Question 6

    What is the forecast for year 9 using the Holt-Winters method for nonseasonal series?

    Question 7

    Suppose we have 32 observations and a seasonal factor s = 4 indicating quarterly data. Write down the equation for the forecast of the next observation beyond the end of the series. Use the method developed by Holt-Winters for seasonal series.

    Question 8

    What is the null hypothesis in an autoregressive model?

    Question 9

    Provide the general equation that represents a series according to the autoregressive model.

    Question 10

    What algorithm is used to obtain the parameters for the autoregressive model?

    Answer indication

    Question 1

    A time series is a set of measurements, ordered over time, on a particular quantity of interest. In a time series, the sequence of observations is important.

    Question 2

    • T_t: Trend component.
    • S_t: Seasonality component.
    • C_t: Cyclical component.
    • I_t: Irregular component.

    Question 3

    \[ \hat{x}_{6} = 347 + 13 = 360 \]

    Question 4

    \[ \hat{x}_{7} = 347 + (2)(13) = 373 \]

    Question 5

    \[ \hat{x}_{8} = 347 + (3)(13) = 386 \]

    Question 6

    \[ \hat{x}_{9} = 347 + (4)(13) = 399 \]
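
    The forecasts in questions 3 to 6 all apply the same rule: the latest level estimate plus h times the latest trend estimate. A minimal sketch, with the illustrative function name `holt_forecasts`:

```python
def holt_forecasts(level, trend, horizons):
    """Nonseasonal Holt-Winters forecasts h steps ahead: level + h * trend."""
    return [level + h * trend for h in horizons]


# Level and trend estimates in year 5, as given in question 3.
forecasts = holt_forecasts(level=347, trend=13, horizons=range(1, 5))
for year, value in zip(range(6, 10), forecasts):
    print(f"forecast for year {year}: {value}")
```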

    Question 7

    \[ \hat{x}_{n+h} = ( \hat{x}_{n} + hT_{n} ) F_{n+h-s} \]
    With n = 32, h = 1 and s = 4:
    \[ \hat{x}_{33} = (\hat{x}_{32} + T_{32}) F_{29} \]

    Question 8

    \[ H_{0}: \phi_{p} = 0 \]

    Question 9

    \[ x_{t} = \gamma + \phi_{1}x_{t - 1} + \phi_{2}x_{t - 2} + ... + \phi_{p}x_{t - p} + \epsilon_{t} \]

    Question 10

    The least squares algorithm.
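
    To make the least squares idea concrete, the sketch below fits the simplest case, a first-order autoregressive model x_t = γ + φ1 x_(t-1) + ε_t, by regressing each observation on its predecessor. The series is made up for illustration (it follows x_t = 2 + 0.5 x_(t-1) exactly, without noise), and the function name `fit_ar1` is hypothetical; higher-order models need multiple regression, which is not shown.

```python
def fit_ar1(series):
    """Least squares estimates (gamma, phi) for x_t = gamma + phi * x_{t-1} + error."""
    x = series[:-1]   # lagged values x_{t-1}
    y = series[1:]    # current values x_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    gamma = mean_y - phi * mean_x
    return gamma, phi


# Hypothetical noise-free series generated from x_t = 2 + 0.5 * x_{t-1}, starting at 10.
series = [10.0]
for _ in range(7):
    series.append(2 + 0.5 * series[-1])

gamma, phi = fit_ar1(series)
print(f"gamma = {gamma:.3f}, phi = {phi:.3f}")
```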

     


    What other sampling procedures are available? - ExamTests 17

     

     

    Questions

    Question 1a

    Suppose we conducted a stratified sampling procedure. Use the following information:
    N1 = 75; N2 = 30; N3 = 20 (so N = 125).
    n1 = 15; n2 = 8; n3 = 2 (so n = 25).
    x̄1 = 21.2; s1 = 12.8.
    x̄2 = 13.3; s2 = 11.4.
    x̄3 = 26.1; s3 = 9.2.
    Compute the point estimate of the population mean.

    Question 1b

    Compute the point estimate of the variance for the first stratum.

    Question 1c

    Compute the point estimate of the variance for the second stratum.

    Question 1d

    Compute the point estimate of the variance for the third stratum.

    Question 1e

    Compute the point estimate of the variance for the population mean.

    Question 1f

    Compute the point estimate of the standard deviation for the population mean.

    Question 1g

    Compute a 95% confidence interval for the population mean.

    Question 2a

    Suppose we conducted a stratified sampling procedure. Use the following information:
    N1 = 364; N2 = 1031.
    n1 = 40; n2 = 60.
    p(hat)1 = 7/40 = 0.175
    p(hat)2 = 13/60 = 0.217
    Compute the point estimate of the population proportion.

    Question 2b

    Compute the point estimate of the variance of the proportion for the first stratum.

    Question 2c

    Compute the point estimate of the variance of the proportion for the second stratum.

    Question 2d

    Compute the point estimate of the variance of the proportion for the population.

    Question 2e

    Compute the point estimate of the standard deviation of the proportion for the population.

    Question 2f

    Compute the 90% confidence interval for the population proportion from these stratified samples.

    Question 3a

    Suppose we have a total of N = 125 which is divided into three strata with N1 = 75, N2 = 30, and N3 = 20. Now, suppose we want to select a sample of size n = 25.
    Compute the sample size for the first stratum using proportional allocation.

    Question 3b

    Compute the sample size for the second stratum using proportional allocation.

    Question 3c

    Compute the sample size for the third stratum using proportional allocation.

    Question 4a

    Suppose we have a total of N = 225 which is divided into three strata with N1 = 100, N2 = 75, and N3 = 50. Now, suppose we want to select a sample of size n = 50.
    Compute the sample size for the first stratum using proportional allocation.

    Question 4b

    Compute the sample size for the second stratum using proportional allocation.

    Question 4c

    Compute the sample size for the third stratum using proportional allocation.

    Question 5a

    Suppose we have a total of N = 500 which is divided into three strata with N1 = 250, N2 = 100, and N3 = 150. Now, suppose we want to select a sample of size n = 50.
    Compute the sample size for the first stratum using proportional allocation.

    Question 5b

    Compute the sample size for the second stratum using proportional allocation.

    Question 5c

    Compute the sample size for the third stratum using proportional allocation.

    Question 6a

    Suppose we have a total of N = 500 which is divided into three strata with N1 = 250, N2 = 100, and N3 = 150. Now, suppose we want to select a sample of size n = 100.
    Compute the sample size for the first stratum using proportional allocation.

    Question 6b

    Compute the sample size for the second stratum using proportional allocation.

    Question 6c

    Compute the sample size for the third stratum using proportional allocation.

    Question 7

    What is the difference between proportional allocation and optimal allocation in terms of sample effort?

    Question 8

    What is the difference between proportional allocation and optimal allocation in terms of estimating the sample size for strata for population proportions?

    Question 9

    What is the difference between stratified sampling and cluster sampling?

    Question 10

    Mention one advantage and one disadvantage of cluster sampling.

    Question 11

    Mention one advantage and one disadvantage of two-phase sampling.

    Answer indication

    Question 1a

    \[ \bar{x}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j}\bar{x}_{j} = \frac{ (75)(21.2) + (30)(13.3) + (20)(26.1) }{125} = 20.09 \]

    Question 1b

    \[ \hat{\sigma}^{2}_{\bar{x}_{1}} = \frac{ s^{2}_{1} }{n_{1}} \times \frac{ N_{1} - n_{1} }{N_{1} - 1 } = \frac{(12.8)^{2}}{15} \times \frac{60}{74} = 8.856 \]

    Question 1c

    \[ \hat{\sigma}^{2}_{\bar{x}_{2}} = \frac{ s^{2}_{2} }{n_{2}} \times \frac{ N_{2} - n_{2} }{N_{2} - 1 } = \frac{(11.4)^{2}}{8} \times \frac{22}{29} = 12.324 \]

    Question 1d

    \[ \hat{\sigma}^{2}_{\bar{x}_{3}} = \frac{ s^{2}_{3} }{n_{3}} \times \frac{ N_{3} - n_{3} }{N_{3} - 1 } = \frac{(9.2)^{2}}{2} \times \frac{18}{19} = 40.093 \]

    Question 1e

    \[ \hat{\sigma}^{2}_{\bar{x}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\bar{x}_{j}} = \frac{ (75)^{2}(8.856) + (30)^{2}(12.324) + (20)^{2}(40.093) }{125^{2}} = 4.924 \]

    Question 1f

    \[ \hat{\sigma}_{\bar{x}_{st}} = \sqrt{4.924} = 2.22 \]

    Question 1g

    20.09 +/- (1.96)(2.22) = [15.74; 24.44]
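
    The whole of question 1 can be reproduced in a few lines. The sketch below uses the stratum sizes that the answers work with (N3 = 20 and n3 = 2, so N = 125) and the finite population correction (N_j - n_j) / (N_j - 1); the function name `stratified_mean_estimate` is illustrative.

```python
import math


def stratified_mean_estimate(N, n, means, sds, z=1.96):
    """Stratified estimate of a population mean with its variance and confidence interval.

    N, n, means and sds are lists with one entry per stratum.
    """
    total = sum(N)
    mean_st = sum(Nj * xj for Nj, xj in zip(N, means)) / total
    # Variance of each stratum mean, including the finite population correction.
    stratum_vars = [(s ** 2 / nj) * (Nj - nj) / (Nj - 1) for Nj, nj, s in zip(N, n, sds)]
    var_st = sum(Nj ** 2 * v for Nj, v in zip(N, stratum_vars)) / total ** 2
    se = math.sqrt(var_st)
    return mean_st, stratum_vars, var_st, (mean_st - z * se, mean_st + z * se)


mean_st, stratum_vars, var_st, ci = stratified_mean_estimate(
    N=[75, 30, 20], n=[15, 8, 2], means=[21.2, 13.3, 26.1], sds=[12.8, 11.4, 9.2]
)
print(f"point estimate = {mean_st:.2f}")
print("stratum variances =", [round(v, 3) for v in stratum_vars])
print(f"variance = {var_st:.3f}, standard deviation = {math.sqrt(var_st):.2f}")
print(f"95% confidence interval = [{ci[0]:.2f}; {ci[1]:.2f}]")
```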

    Question 2a

    \[ \hat{p}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j} \hat{p}_{j} = \frac{ (364)(0.175) + (1031)(0.217) }{1395} = 0.206 \]

    Question 2b

    \[ \hat{\sigma}^{2}_{\hat{p}_{1}} = \frac{ \hat{p}_{1} (1 - \hat{p}_{1}) }{n_{1} - 1} \times \frac{ N_{1} - n_{1} }{N_{1} - 1} = \frac{ (0.175)(0.825) }{39} \times \frac{324}{363} = 0.003304 \]

    Question 2c

    \[ \hat{\sigma}^{2}_{\hat{p}_{2}} = \frac{ \hat{p}_{2} (1 - \hat{p}_{2}) }{n_{2} - 1} \times \frac{ N_{2} - n_{2} }{N_{2} - 1} = \frac{ (0.217)(0.783) }{59} \times \frac{971}{1030} = 0.002715 \]

    Question 2d

    \[ \hat{\sigma}^{2}_{\hat{p}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\hat{p}_{j}} = \frac{ (364)^{2}(0.003304) + (1031)^{2}(0.002715) }{ (1395)^{2} } = 0.001708 \]

    Question 2e

    \[ \hat{\sigma}_{\hat{p}_{st}} = \sqrt{0.001708} = 0.0413 \]

    Question 2f

    0.206 +/- (1.645)(0.0413) = [0.138; 0.274]
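
    The same structure applies to the stratified proportion in question 2. This sketch uses the rounded sample proportions 0.175 and 0.217 from the question and the divisor n_j - 1 that appears in the answers; the function name `stratified_proportion_estimate` is illustrative.

```python
import math


def stratified_proportion_estimate(N, n, props, z=1.645):
    """Stratified estimate of a population proportion with variance and confidence interval."""
    total = sum(N)
    p_st = sum(Nj * pj for Nj, pj in zip(N, props)) / total
    # Variance of each stratum proportion, including the finite population correction.
    stratum_vars = [
        pj * (1 - pj) / (nj - 1) * (Nj - nj) / (Nj - 1)
        for Nj, nj, pj in zip(N, n, props)
    ]
    var_st = sum(Nj ** 2 * v for Nj, v in zip(N, stratum_vars)) / total ** 2
    se = math.sqrt(var_st)
    return p_st, stratum_vars, var_st, (p_st - z * se, p_st + z * se)


p_st, stratum_vars, var_st, ci = stratified_proportion_estimate(
    N=[364, 1031], n=[40, 60], props=[0.175, 0.217]
)
print(f"point estimate = {p_st:.3f}")
print("stratum variances =", [round(v, 6) for v in stratum_vars])
print(f"variance = {var_st:.6f}, standard deviation = {math.sqrt(var_st):.4f}")
print(f"90% confidence interval = [{ci[0]:.3f}; {ci[1]:.3f}]")
```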

    Question 3a

    \[ n_{1} = \frac{75}{125} \times 25 = 15 \]

    Question 3b

    \[ n_{2} = \frac{30}{125} \times 25 = 6 \]

    Question 3c

    \[ n_{3} = \frac{20}{125} \times 25 = 4 \]

    Question 4a

    \[ n_{1} = \frac{100}{225} \times 50 = 22.2 \approx 22 \]

    Question 4b

    \[ n_{2} = \frac{75}{225} \times 50 = 16.7 \approx 17 \]

    Question 4c

    \[ n_{3} = \frac{50}{225} \times 50 = 11.1 \approx 11 \]

    Question 5a

    \[ n_{1} = \frac{250}{500} \times 50 = 25 \]

    Question 5b

    \[ n_{2} = \frac{100}{500} \times 50 = 10 \]

    Question 5c

    \[ n_{3} = \frac{150}{500} \times 50 = 15 \]

    Question 6a

    \[ n_{1} = \frac{250}{500} \times 100 = 50 \]

    Question 6b

    \[ n_{2} = \frac{100}{500} \times 100 = 20 \]

    Question 6c

    \[ n_{3} = \frac{150}{500} \times 100 = 30 \]
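
    Proportional allocation in questions 3 to 6 is the same calculation each time: stratum j receives the share N_j / N of the total sample. A small sketch, with the illustrative function name `proportional_allocation`:

```python
def proportional_allocation(strata_sizes, n):
    """Sample size per stratum under proportional allocation: n_j = (N_j / N) * n."""
    total = sum(strata_sizes)
    return [Nj * n / total for Nj in strata_sizes]


cases = {
    "Question 3": ([75, 30, 20], 25),
    "Question 4": ([100, 75, 50], 50),
    "Question 5": ([250, 100, 150], 50),
    "Question 6": ([250, 100, 150], 100),
}
for name, (sizes, n) in cases.items():
    exact = proportional_allocation(sizes, n)
    rounded = [round(value) for value in exact]
    print(f"{name}: exact = {[round(v, 2) for v in exact]}, rounded = {rounded}")
```

    Simple rounding happens to preserve the total sample size in these four cases, but that is not guaranteed in general.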

    Question 7

    Optimal allocation allocates relatively more sample effort to strata in which the population variance is highest.

    Question 8

    Optimal allocation allocates more sample observations to strata in which the true population proportions are closest to 0.50.
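
    The statements in questions 7 and 8 can be illustrated with the standard optimal allocation rule, in which stratum j receives a share proportional to N_j times its standard deviation (for proportions, the square root of P_j (1 - P_j)). The sketch below rests on assumptions: it borrows the sample standard deviations from question 1 as stand-ins for the unknown population values, and the proportions 0.5 and 0.1 are made up. The function names are hypothetical.

```python
import math


def optimal_allocation(strata_sizes, sigmas, n):
    """Optimal allocation for estimating a mean: n_j proportional to N_j * sigma_j."""
    weights = [Nj * sj for Nj, sj in zip(strata_sizes, sigmas)]
    total = sum(weights)
    return [n * w / total for w in weights]


def optimal_allocation_proportions(strata_sizes, props, n):
    """Optimal allocation for estimating a proportion: sigma_j = sqrt(P_j * (1 - P_j))."""
    sigmas = [math.sqrt(p * (1 - p)) for p in props]
    return optimal_allocation(strata_sizes, sigmas, n)


# Assumption: s1, s2, s3 from question 1 stand in for the population standard deviations.
means_case = optimal_allocation([75, 30, 20], [12.8, 11.4, 9.2], 25)
print("optimal (means):", [round(v, 2) for v in means_case])

# Made-up proportions: two equally large strata, one with P = 0.5 and one with P = 0.1.
props_case = optimal_allocation_proportions([100, 100], [0.5, 0.1], 50)
print("optimal (proportions):", [round(v, 2) for v in props_case])
```

    In the first case the most variable stratum receives more than its proportional share of 15 observations; in the second, the stratum whose proportion is closest to 0.50 receives the larger sample.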

    Question 9

    In stratified random sampling, a sample is taken from every stratum of the population in an attempt to ensure that important segments of the population are given corresponding weight. In cluster sampling, a random sample of clusters is taken, such that some clusters will have no members in the sample.

    Question 10

    Advantage: convenience. Disadvantage: the additional imprecision in the sample estimates.

    Question 11

    Advantage: it enables the researcher, at a low cost, to try out the survey. Disadvantage: it is time-consuming.

     

    Follow the author: Vintage Supporter