Examtest with the 9th edition of Statistics for Business and Economics by Newbold
- How to describe data graphically? - ExamTests 1
- How to describe data numerically? - ExamTests 2
- How to use probability calculation? - ExamTests 3
- How to use probability models for discrete random variables? - ExamTests 4
- How to use probability models for continuous random variables? - ExamTests 5
- How to obtain a proper sample from a population? - ExamTests 6
- How to obtain estimates for a single population? - ExamTests 7
- How to estimate parameters for two populations? - ExamTests 8
- How to develop hypothesis testing procedures for a single population? - ExamTests 9
- What test procedures are there for testing the difference between two populations? - ExamTests 10
- How to conduct a simple regression? - ExamTests 11
- How to conduct a multiple regression? - ExamTests 12
- What other topics are important in regression analysis? - ExamTests 13
- How to analyze categorical data? - ExamTests 14
- How to conduct an analysis of variance? - ExamTests 15
- How to analyze data sets with measurements over time? - ExamTests 16
- What other sampling procedures are available? - ExamTests 17
How to describe data graphically? - ExamTests 1
Questions
Question 1
Indicate whether each of the following variables is categorical or numeric. If the variable is categorical, specify the measurement level. If the variable is numeric, specify the measurement level and indicate whether the variable is discrete or continuous:
- The number of shares of a stock purchased by a broker.
- The nationality of a student.
- The grade point average of a student.
- The temperature in degrees Celsius.
Question 2
Upon visiting a newly opened H&M store, customers were given a brief survey. Is the answer to each of the following questions categorical or numerical? If categorical, give the level of measurement. If numerical, is it discrete or continuous?
- Is this your first visit to this H&M store?
- On a scale from 1 (very dissatisfied) to 5 (very satisfied), how satisfied are you with today's purchase(s)?
- What was the cost of your purchase(s)?
Question 3
Tourists visiting Croatia are asked to fill in a survey. The survey consists of various questions about how they experienced their holiday. Describe for each question the type of data obtained.
Question | Type of data |
Which of the following areas did you visit? | |
Did you rent a sailing boat? | |
What was the average amount of money you spent on food per day? | |
What would you recommend as the optimal number of days for tourists to spend in Croatia? | |
How often would you recommend visiting Croatia? | |
Question 4a
An administrator examines the travel expenses of faculty members that attended various professional meetings. He found that 36% of the travel expenses was spent for transportation costs, 17% was spent for accommodation, 13% was spent on food; 9% was spent on conference fees, 10% on registration costs, and the remainder was spent on miscellaneous costs.
Construct a pie chart for these data.
Question 4b
Construct a bar chart for these data.
Question 5
A company has defined seven codes for possible defects for one of its products. Construct a Pareto diagram for the following frequencies:
Defect code | A | B | C | D | E | F | G |
Frequency | 10 | 70 | 15 | 90 | 8 | 4 | 3 |
Question 6
Construct a time-series plot for the following data of customers shopping at a new mall during a particular week.
Day | Number of customers |
Monday | 516 |
Tuesday | 534 |
Wednesday | 451 |
Thursday | 487 |
Friday | 558 |
Saturday | 641 |
Sunday | 830 |
Question 7
Determine an appropriate interval width for a random sample of 370 observations with scores that fall between 40 and 200.
Question 8a
Construct a stem-and-leaf display for the following data.
17 | 16 | 15 | 17 | 17 |
20 | 30 | 25 | 25 | 14 |
12 | 18 | 31 | 26 | 26 |
12 | 15 | 16 | 16 | 28 |
Question 8b
Construct a histogram for these data.
Question 8c
Is the distribution of these data symmetric, right-skewed, or left-skewed?
Question 9
Prepare a scatter plot of the following data:
- (3, 10).
- (2, 8).
- (3, 12).
- (4, 15).
- (6, 20).
- (5, 15).
- (4, 12).
Question 10a
The following table shows the age of faculty members who have obtained a PhD degree from the largest university in the Netherlands.
Age | Percent |
26 - 28 | 18.00 |
29 - 32 | 23.50 |
33 - 40 | 30.51 |
41 - 55 | 12.99 |
56+ | 15.00 |
What percent of faculty members who obtained a PhD are 41 years or older?
Question 10b
What percent of faculty members who obtained a PhD are under the age of 33 years?
Question 10c
Construct a relative cumulative frequency distribution of the data.
Question 10d
Suppose, we have 200 observations. What are the cumulative frequencies for the data described?
Question 10e
Interpret the cumulative frequencies.
Question 11
The following data are presented:
Age | 30 - 40 | 40 - 50 | 50 - 60 | 60 - 70 |
Number | 12 | 13 | 22 | 34 |
Describe possible errors in this table.
Question 12
Suppose, the amount of money a person spends on movie tickets each month (in euros) is:
6.0, 5.3, 4.0, 5.7, 10.0, 8.4, 2.5, 10.0, 9.5, 0.0, 5.0, 10.0
What graph would you use to visually display these data?
Question 13
In Germany, it was found that 32% of shoppers with incomes less than 50,000 shop online. Of the remaining 68%, half of the individuals never shop, and the other half shops by going to the actual store. Use a pie chart to plot this data.
Question 14a
Four types of checking accounts are offered by a bank. Suppose a random sample of 300 customers was surveyed and asked some questions. It was found that 60% of the respondents preferred "Easy Checking", 12% preferred "Intelligent Checking", 18% preferred "Super Checking", and the remainder preferred "Ultimate Checking". Of the participants who selected Easy Checking, 100 were female. Of those who selected Intelligent Checking, a third were female. Of those who selected Super Checking, half were female. Finally, of those who selected Ultimate Checking, 80% were female. Describe the data with a cross table.
Question 14b
How many females are there in total, and how many males?
Question 14c
What type of graph is appropriate for these data?
- Histogram.
- Scatter plot.
- Time-series plot.
- Bar chart.
Question 15
What type of graph is most appropriate for two numerical variables?
Answer indication
Question 1
- The number of shares of a stock purchased by a broker: Numerical; ratio; discrete.
- The nationality of a student: Categorical; nominal
- The grade point average of a student: Numerical; ratio; continuous.
- The temperature in degrees Celsius: Numerical; interval; continuous.
Question 2
- Categorical; nominal.
- Categorical; ordinal.
- Numerical; continuous.
Question 3
Question | Type of data |
Which of the following areas did you visit? | Both categorical (nominal, binary coded: yes/no for each area) and numerical (discrete), through the number of areas visited. |
Did you rent a sailing boat? | Categorical; nominal; binary coded. |
What was the average amount of money you spent on food per day? | Numerical; ratio; continuous. |
What would you recommend as the optimal number of days for tourists to spend in Croatia? | Numerical; ratio; discrete. |
How often would you recommend visiting Croatia? | Categorical; ordinal. |
Question 4a
No answer indication available.
Question 4b
No answer indication available.
Question 5
No answer indication available.
Question 6
Note that the time points on the horizontal axis consist of numbers. These could, of course, also be replaced by the days (Monday - Sunday).
Question 7
According to the quick guide, a sample size of 370 can be approximated by eight to ten classes.
Using the formula for interval width yields:
w = (200 - 40) / 8 = 20; or
w = (200 - 40) / 10 = 16
Thus, an appropriate interval width lies somewhere between 16 and 20.
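As a quick check, the interval-width formula can be evaluated for both class counts. This is a minimal Python sketch; the function name `interval_width` is an illustrative choice and does not come from the textbook.

```python
def interval_width(smallest, largest, classes):
    """Interval width w = (largest - smallest) / number of classes."""
    return (largest - smallest) / classes


# Scores range from 40 to 200; the quick guide suggests 8 to 10 classes.
widths = {k: interval_width(40, 200, k) for k in (8, 10)}
print(widths)
```

In practice the width is usually rounded up to a convenient number so that all observations are covered.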
Question 8a
1 | 2, 2, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8.
2 | 0, 5, 5, 6, 6, 8.
3 | 0, 1.
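The display above can be reproduced with a few lines of Python. This is only a sketch for two-digit data, where the tens digit is the stem and the units digit is the leaf; the helper name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect the units digits (leaves)."""
    groups = defaultdict(list)
    for value in sorted(values):
        groups[value // 10].append(value % 10)
    return dict(groups)


data = [17, 16, 15, 17, 17, 20, 30, 25, 25, 14,
        12, 18, 31, 26, 26, 12, 15, 16, 16, 28]

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem} | {', '.join(map(str, leaves))}")
```

Note that this sketch does not print stems that have no leaves.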
Question 8b
No answer indication available.
Question 8c
Right skewed (positively skewed); the tail is at the right side of the distribution.
Question 9
No answer indication available.
Question 10a
12.99 + 15.00 = 27.99%
Question 10b
18.00 + 23.50 = 41.50%
Question 10c
Age | Percent |
26 - 28 | 18.00 |
29 - 32 | 41.50 |
33 - 40 | 72.01 |
41 - 55 | 85.00 |
56+ | 100.00 |
Question 10d
The cumulative frequencies for 200 observations are: 36, 83, 144, 170, 200.
Question 10e
For sample size n = 200, there are 36 individuals that obtained a PhD between the ages of 26 and 28. There are 83 individuals that obtained a PhD before the age of 33. There are 144 individuals that obtained a PhD before the age of 41, and so forth.
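The cumulative percentages and cumulative frequencies of questions 10c and 10d can be checked with a running sum. The sketch below assumes the cumulative frequencies are rounded to whole individuals; the variable names are illustrative.

```python
from itertools import accumulate

# Percent of faculty members per age class: 26-28, 29-32, 33-40, 41-55, 56+.
percents = [18.00, 23.50, 30.51, 12.99, 15.00]

# Running sum of the class percentages.
cumulative_percent = [round(p, 2) for p in accumulate(percents)]

# Scale the cumulative percentages to n = 200 observations.
n = 200
cumulative_freq = [round(p / 100 * n) for p in cumulative_percent]

print(cumulative_percent)
print(cumulative_freq)
```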
Question 11
A possible error lies in the boundaries of the frequency classes. First, there is no open-ended lowest or highest class, so observations below 30 or above 70 are (possibly) excluded. Second, the classes overlap: it is unclear from this frequency distribution to which class boundary observations such as 40 and 50 belong.
Question 12
A time-series plot would be appropriate here. The data are given for t = 12 time points (months).
Question 13
No answer indication available.
Question 14a
Type of checking account | Female | Male | Total |
Easy Checking | 100 | 80 | 180 |
Intelligent Checking | 12 | 24 | 36 |
Super checking | 27 | 27 | 54 |
Ultimate Checking | 24 | 6 | 30 |
Total | 163 | 137 | 300 |
Question 14b
There are 163 females and 137 males in the sample of 300 participants.
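The cross table can be rebuilt from the percentages in the question. The sketch below is one way to do this; the 10% share for Ultimate Checking is derived as the remainder (100% - 60% - 12% - 18%), and the dictionary names are illustrative.

```python
n = 300

# Share of respondents per account type; Ultimate Checking is the remainder.
share = {
    "Easy Checking": 0.60,
    "Intelligent Checking": 0.12,
    "Super Checking": 0.18,
    "Ultimate Checking": 0.10,
}
totals = {account: round(p * n) for account, p in share.items()}

# Number of females per account type, as described in the question.
females = {
    "Easy Checking": 100,
    "Intelligent Checking": totals["Intelligent Checking"] // 3,
    "Super Checking": totals["Super Checking"] // 2,
    "Ultimate Checking": round(0.80 * totals["Ultimate Checking"]),
}
males = {account: totals[account] - females[account] for account in totals}

for account in totals:
    print(account, females[account], males[account], totals[account])
print("Total", sum(females.values()), sum(males.values()), n)
```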
Question 14c
D, a bar chart. The other graphs are appropriate in the event of numerical variables. Here, we have frequencies for two categorical variables. This is best displayed by a bar chart (or pie chart).
Question 15
A scatter plot.
How to describe data numerically? - ExamTests 2
Questions
Question 1
A random sample of five numbers was drawn:
18 71 80 80 84
Compute the mean, median, and mode.
Question 2
The number of cars crossing the border between Israel and Jordan is recorded. Over a 6-day period, the following number of cars for each day is found:
16 21 12 19 1 2
Compute the mean, median, and mode.
Question 3a
The records of the university of Groningen over a 12-year period show the following percentage increase in the number of students enrolled:
4.1 3.2 3.5 4.5 5.1 3.8
2.1 2.2 3.1 5.1 1.5 1.0
Compute the mean increase in the number of students enrolled.
Question 3b
Compute the median increase in the number of students enrolled.
Question 3c
Find the mode.
Question 4a
The finances over the past decade are reviewed. The records are shown per year.
2.51 3.74 4.15 5.33 6.18
6.65 7.18 6.92 6.95 7.54
Calculate the mean.
Question 4b
Calculate the median.
Question 5a
During the past years, many countries faced depopulation. We collected the number of elementary schools that were closed for ten countries:
10 6 13 5 11 5 6 3 7 9
Find the mean, median, and mode of the number of schools closed.
Question 5b
Find the five-number summary.
Question 6
A textile manufacturer obtains a sample of 50 bolts of cloth and carefully inspects each bolt. Based on this inspection, the manufacturer records the number of imperfections. The following frequency table is obtained:
Number of imperfections | 0 | 1 | 2 | 3 |
Number of bolts | 33 | 12 | 4 | 1 |
Calculate the mean, median, and mode for these sample data.
Question 7
Compute the variance and standard deviation of the following sample data:
6 8 10 12 14 9 11 7 13 11
Question 8
Compute the variance and standard deviation of the following sample data:
5 -3 0 2 -1 7 4
Question 9
Consider two different investments, stock A and stock B. The mean closing price for stock A is 4.00 and the mean closing price for stock B is 80.00. The mean rate of return is the same for both stock A and stock B. We might think that stock B is more volatile than stock A. Now, suppose the standard deviations were found to be considerably different, with SA = 2.00 and SB = 8.00. Compute the coefficient of variation for these sample data and compare these competing investment opportunities.
Question 10
Calculate the coefficient of variation for the following data:
13 15 12 14 11
Question 11a
A set of data is mounded (bell-shaped) with a mean of 300 and a variance of 144.
Approximately what proportion of observations is greater than 288?
Question 11b
Approximately what proportion of observations is less than 324?
Question 11c
Approximately what proportion of observations is greater than 336?
Question 12a
The number of cars that pass through a tunnel during a period of 35 are as follows:
60 70 74 56 84 54 50
47 80 71 50 95 121 90
75 84 70 61 110 64 80
85 85 43 76 60 91 90
60 87 110 85 44 94 69
What is the mean number of cars?
Question 12b
What is the standard deviation?
Question 12c
What is the coefficient of variation?
Question 12d
Construct a stem-and-leaf display of the number of cars that pass through the tunnel. Next, find the interquartile range.
Question 12e
Provide the five-number summary for the sample data.
Question 13a
The daily exchange rate from EUR to USD for six business days is:
1.14 1.14 1.13 1.13 1.12 1.11
Over the same period, the daily exchange rate from EUR to JPY is:
110 110 109 109 108 109
Compare the means of these two distributions.
Question 13b
Compare the standard deviations of these two distributions.
Question 14a
A company produces light bulbs with a mean lifetime of 1,200 hours and a standard deviation of 50 hours. Find the z-score for a light bulb that lasts only 1,120 hours.
Question 14b
Consider the z-score computed in question 14a. What percentage of light bulbs lasts longer than 1,120 hours?
Question 14c
Consider again the mean and standard deviation from question 14a. Find the z-score corresponding to a light bulb that lasts 1,300 hours.
Question 14d
What percentage of light bulbs lasts longer than 1,300 hours?
Question 15a
Suppose that a student completed courses for 15 ECTS in total during his first semester of college. He received one A, one B, one C, and one D. Now, suppose that a value of 4 is assigned to an A, a value of 3 to a B, a value of 2 to a C, and a value of 1 to a D. Calculate the student's semester GPA.
Question 15b
Now, however, suppose that the courses are not all worth the same number of credit hours. The A was earned in a 3-credit English course, the B in a 3-credit biology course, the C in a 4-credit biology course, and the D in a 5-credit Spanish course. Using these weights, calculate the student's weighted semester GPA.
Question 16a
Consider the following data:
xi | wi |
4.7 | 8 |
3.8 | 7 |
5.7 | 4 |
2.6 | 3 |
5.5 | 2 |
What is the arithmetic mean of the xi values?
Question 16b
What is the weighted mean of the xi values?
Question 16c
What is the sample variance?
Question 16d
What is the sample standard deviation?
Question 17a
Consider the following data:
(15,45) (6,18) (11,33) (12,36) (16,48) (14,42)
(5,15) (17,51) (4,12) (19,57) (7,21)
Compute the covariance.
Question 17b
Compute the correlation coefficient.
Question 17c
Draw a scatter plot to display the relationship between the two variables.
Question 18a
Consider the following data:
Quiz score (x) | 4 | 3.4 | 3 | 5 | 1.1 |
Exam score (y) | 100 | 66 | 78 | 80 | 30 |
Compute the covariance.
Question 18b
Compute the correlation coefficient.
Answer indication
Question 1
Mean = (18+71+80+80+84)/5 = 66.6; median = 80; mode = 80.
Question 2
Mean = (16+21+12+19+1+2)/6 = 11.8; median = (12+16)/2 = 14; there is no mode.
Question 3a
Mean = (4.1 + 3.2 + 3.5 + 4.5 + 5.1 + 3.8 + 2.1 + 2.2 + 3.1 + 5.1 + 1.5 + 1.0) / 12 = 3.3.
Question 3b
Median = (3.2 + 3.5) / 2 = 3.35.
Question 3c
Mode = 5.1.
Question 4a
Mean = (2.51 + 3.74 + 4.15 + 5.33 + 6.18 + 6.65 + 7.18 + 6.92 + 6.95 + 7.54) / 10 = 5.7.
Question 4b
Median = 6.4.
Question 5a
Mean = 7.5; median = 6.5; the data are bimodal, with modes 5 and 6 (each occurs twice).
Question 5b
For the five number summary, order the data in ascending order, that is:
3 5 5 6 6 7 9 10 11 13
Q1 is the value located in the 0.25(10+1)th ordered position, that is the 2.75th position.
The second value is 5, the third value is also 5.
Q1 = 5 + 0.75*(5 - 5)
Q1 = 5 + 0
Q1 = 5
Q3 is the value located in the 0.75(10+1)th ordered position, that is the 8.25th position.
The eighth value is 10, the ninth value is 11.
Q3 = 10 + 0.25*(11 - 10)
Q3 = 10 + 0.25
Q3 = 10.25
Thus, the five number summary is: 3 (minimum); 5 (Q1); 6.5 (median); 10.25 (Q3); 13 (maximum).
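The position rule used above (quartiles at the 0.25(n+1)th and 0.75(n+1)th ordered positions, with linear interpolation) corresponds to the "exclusive" method of `statistics.quantiles` in the Python standard library (Python 3.8 or later). A minimal sketch for the school-closure data:

```python
import statistics

schools = [10, 6, 13, 5, 11, 5, 6, 3, 7, 9]

# method="exclusive" places the cut points at the 0.25(n+1), 0.50(n+1)
# and 0.75(n+1) ordered positions and interpolates between neighbours.
q1, median, q3 = statistics.quantiles(schools, n=4, method="exclusive")

five_number_summary = (min(schools), q1, median, q3, max(schools))
print(five_number_summary)
```

Other software may use a different interpolation rule and report slightly different quartiles for the same data.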
Question 6
Mean = (0*33 + 1*12 + 2*4 + 3*1) / 50 = 23/50 = 0.46.
Median = 0
Mode = 0
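For data given as a frequency table, the mean weights each value by its frequency. The sketch below expands the table back into the 50 individual observations to find the median; the variable names are illustrative.

```python
import statistics

# Number of imperfections -> number of bolts (from the frequency table).
frequency = {0: 33, 1: 12, 2: 4, 3: 1}

n = sum(frequency.values())
mean = sum(value * count for value, count in frequency.items()) / n

# Expand the table into individual observations to compute the median.
observations = [value for value, count in frequency.items() for _ in range(count)]
median = statistics.median(observations)

# The mode is the value with the highest frequency.
mode = max(frequency, key=frequency.get)

print(n, mean, median, mode)
```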
Question 7
To calculate the sample variance and standard deviation, follow these steps:
- Step 1: Calculate the sample mean. The sample mean here is equal to 10.1.
- Step 2: Find the difference between each of the values and the sample mean of 10.1.
- Step 3: Square each difference.
The squared deviations from the mean for all observations are: 16.81, 4.41, 0.01, 3.61, 15.21, 1.21, 0.81, 9.61, 8.41, and 0.81. The sum of these squared deviations equals 60.9. Next, s² = 60.9 / (n - 1) = 60.9 / 9 = 6.77. Thus, the variance equals 6.77. The standard deviation is the square root of the variance, that is: s = √6.77 = 2.60.
Question 8
Again, apply the same steps as in question 7. The sample mean is equal to 2. The squared deviations from the mean for each observation are: 9, 25, 4, 0, 9, 25, 4. The sum of these squared deviations is equal to 76. The variance is s² = 76/6 = 12.67. The standard deviation is the square root of the variance, that is: s = √12.67 = 3.56.
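The three steps used in questions 7 and 8 (mean, squared deviations, division by n - 1) can be sketched in Python as follows. The function name `sample_variance` is an illustrative choice; `statistics.stdev` from the standard library is used for the standard deviation.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


question_7 = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
question_8 = [5, -3, 0, 2, -1, 7, 4]

for sample in (question_7, question_8):
    variance = sample_variance(sample)
    std_dev = statistics.stdev(sample)
    print(round(variance, 2), round(std_dev, 2))
```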
Question 9
CVA = 2.00 / 4.00 x 100% = 50%.
CVB = 8.00 / 80.00 x 100% = 10%.
The market value of stock A fluctuates more from period to period than does the market value of stock B. The coefficient of variation (CV) indicates that, for stock A, the sample standard deviation is 50% of the mean, whereas for stock B the sample standard deviation is only 10% of the mean.
Question 10
Use the formula:
\[CV = \frac{s}{\bar{x}} \times 100\% \hspace{5mm} \text{if} \hspace{5mm} \bar{x} > 0 \]
CV = (1.58 / 13) x 100% = 12.15%
Thus, the sample standard deviation is 12.15% of the mean.
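The coefficient of variation in questions 9 and 10 can be sketched as a small helper. The function name is hypothetical. Note that with the unrounded standard deviation the result for question 10 is about 12.16%; rounding s to 1.58 first gives 12.15%.

```python
import statistics


def coefficient_of_variation(std_dev, mean):
    """CV = s / mean * 100%, defined only for a positive mean."""
    if mean <= 0:
        raise ValueError("the coefficient of variation requires a positive mean")
    return std_dev / mean * 100


# Question 9: stock A (s = 2.00, mean = 4.00) and stock B (s = 8.00, mean = 80.00).
cv_stock_a = coefficient_of_variation(2.00, 4.00)
cv_stock_b = coefficient_of_variation(8.00, 80.00)

# Question 10: CV computed directly from the sample.
data = [13, 15, 12, 14, 11]
cv_data = coefficient_of_variation(statistics.stdev(data), statistics.mean(data))

print(cv_stock_a, cv_stock_b, round(cv_data, 2))
```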
Question 11a
Use the formula:
\[z = \frac{x_{i} - \mu}{\sigma} \]
The standard deviation, σ, is equal to the square root of the variance, σ², that is: √144 = 12.
z = (288 - 300) / 12 = -12/12 = -1
According to the empirical rule, approximately 68% of the observations fall within 1 standard deviation above and below the mean. The remaining 32% is thus split evenly between the left and right of this interval. This means that 0.5*32 = 16% of the observations fall below z = -1. Conversely, 100 - 16 = 84% of the observations are greater than 288.
Question 11b
z = (324 - 300) / 12 = 24/12 = 2
According to the empirical rule, approximately 95% of the observations fall within 2 standard deviations above and below the mean. The remaining 5% is split evenly between the lower and upper ends of the distribution. Thus, 97.5% of the observations are less than 324.
Question 11c
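z = (336 - 300) / 12 = 36/12 = 3. According to the empirical rule, almost all (99.7%) observations fall within 3 standard deviations of the mean, so almost all observations are lower than 336. Thus, to answer the question, almost no (0.15%) observations are greater than 336.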
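The empirical-rule reasoning in questions 11a to 11c follows one pattern: take the percentage within k standard deviations and split the remainder evenly over both tails. A minimal sketch, with the hypothetical names `WITHIN` and `tail_percent`:

```python
# Empirical rule: approximate percent of observations within k standard deviations.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}


def tail_percent(k):
    """Approximate percent of observations in ONE tail beyond k standard deviations."""
    return (100.0 - WITHIN[k]) / 2


mean, sd = 300, 144 ** 0.5  # the standard deviation is the square root of the variance

greater_than_288 = 100 - tail_percent(1)  # 288 lies 1 sd below the mean
less_than_324 = 100 - tail_percent(2)     # 324 lies 2 sd above the mean
greater_than_336 = tail_percent(3)        # 336 lies 3 sd above the mean

print(greater_than_288, less_than_324, round(greater_than_336, 2))
```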
Question 12a
Mean = 75.
Question 12b
Standard deviation = 19.26.
Question 12c
CV = (19.26/75) x 100% = 25.68%.
Question 12d
4 | 3 4 7
5 | 0 0 4 6
6 | 0 0 0 1 4 9
7 | 0 0 1 4 5 6
8 | 0 0 4 4 5 5 5 7
9 | 0 0 1 4 5
10 |
11 | 0 0
12 | 1
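The quartiles are located at the 0.25(35+1) = 9th and 0.75(35+1) = 27th ordered positions, so Q1 = 60 and Q3 = 87. The interquartile range, IQR = 87 - 60 = 27.
Question 12e
Minimum = 43; Q1 = 60; Median = 75; Q3 = 87; Maximum = 121.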
Question 13a
The means are 1.13 and 109.17.
Question 13b
The standard deviations are 0.01 and 0.75
CVA = (0.01/1.13) x 100% = 1.04%
CVB = (0.75/109.17) x 100% = 0.69%
The coefficient of variation tells us that the sample standard deviation for EUR to USD is 1.04% of the mean, whereas the sample standard deviation for EUR to JPY is 0.69% of the mean. Thus, relative to its mean, the exchange rate for EUR to USD fluctuates more from day to day than does that of EUR to JPY.
Question 14a
z = (1,120 - 1,200) / 50 = -1.6.
Question 14b
94.52% (you can find the probability corresponding to this z-score in the table of the standard normal distribution).
Question 14c
z = (1,300 - 1,200) / 50 = 2.
Question 14d
According to the empirical rule, approximately 2.5% of observations are more than two standard deviations above the mean.
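Instead of a table lookup, the tail probabilities in question 14 can be computed with `statistics.NormalDist` from the Python standard library. The sketch below assumes that bulb lifetimes are exactly normally distributed, which the question does not state explicitly; under that assumption the exact share above 1,300 hours is slightly below the empirical-rule value of 2.5%.

```python
from statistics import NormalDist

# Assumption: lifetimes are normal with mean 1,200 and standard deviation 50 hours.
bulbs = NormalDist(mu=1200, sigma=50)

z_1120 = (1120 - bulbs.mean) / bulbs.stdev
z_1300 = (1300 - bulbs.mean) / bulbs.stdev

# Share of bulbs lasting longer than a given number of hours.
p_longer_1120 = 1 - bulbs.cdf(1120)
p_longer_1300 = 1 - bulbs.cdf(1300)

print(z_1120, round(p_longer_1120, 4))
print(z_1300, round(p_longer_1300, 4))
```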
Question 15a
\[ \bar{x} = \frac{4+3+2+1}{4} = 2.5\]
Question 15b
Use the formula for the weighted mean, that is:
\[\bar{x} = \frac{\sum w_{i}x_{i}}{n} \hspace{5mm} \text{with} \hspace{5mm} n = \sum w_{i} \]
\[\bar{x} = \frac{4 \cdot 3 + 3 \cdot 3 + 2 \cdot 4 + 1 \cdot 5}{15} = \frac{34}{15} = 2.267 \]
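The weighted GPA can be checked with a short Python sketch. The helper name `weighted_mean` is illustrative; the divisor is the sum of the weights (here the 15 credits).

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


grade_points = [4, 3, 2, 1]  # A, B, C, D
credits = [3, 3, 4, 5]       # credits of the corresponding courses

weighted_gpa = weighted_mean(grade_points, credits)
unweighted_gpa = weighted_mean(grade_points, [1, 1, 1, 1])

print(round(weighted_gpa, 3), unweighted_gpa)
```

With equal weights the function reduces to the ordinary arithmetic mean of question 15a.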
Question 16a
\[\bar{x} = \frac{4.7+3.8+5.7+2.6+5.5}{5} = \frac{22.3}{5} = 4.46\]
Question 16b
\[\bar{x} = \frac{4.7*8 + 3.8*7 + 5.7*4 + 2.6*3 + 5.5*2}{24} = \frac{105.8}{24} = 4.41 \]
Question 16c
The variance is 1.643.
Question 16d
The standard deviation is √1.643 = 1.282.
Question 17a
The covariance = 82.42.
Question 17b
The correlation coefficient between x and y is r = 1.0 (perfect positive linear relationship).
Question 17c
No answer indication available.
Question 18a
Cov(x,y) = 30.8.
Question 18b
r = 0.83.
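The covariances and correlation coefficients of questions 17 and 18 can be verified with the sample formulas (division by n - 1). This is a minimal sketch; the function names are illustrative.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Sample correlation: covariance divided by both sample standard deviations."""
    return covariance(xs, ys) / (covariance(xs, xs) ** 0.5 * covariance(ys, ys) ** 0.5)


# Question 17: every y equals 3 * x.
x17 = [15, 6, 11, 12, 16, 14, 5, 17, 4, 19, 7]
y17 = [45, 18, 33, 36, 48, 42, 15, 51, 12, 57, 21]

# Question 18: quiz scores and exam scores.
x18 = [4, 3.4, 3, 5, 1.1]
y18 = [100, 66, 78, 80, 30]

print(round(covariance(x17, y17), 2), round(correlation(x17, y17), 2))
print(round(covariance(x18, y18), 2), round(correlation(x18, y18), 2))
```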
How to use probability calculation? - ExamTests 3
Questions
Question 1a
The sample space S = [E1, E2, E3, E4, E5, E6]. Given A = [E1, E2, E3] and B = [E3, E4, E5].
What is A intersection B?
Question 1b
What is the union of A and B?
Question 1c
Is the union of A and B collectively exhaustive?
Question 2a
Use the following sample space S: S = [E1, E2, E3, E4, E5, E6, E7, E8, E9, E10].
Given A = [E1, E2, E3, E4], what is Ā?
Question 2b
Given Ā = [E1, E4, E5, E7] and B̄ (complement B) = [E2, E3, E5, E8]. What is A intersection B̄ (complement B)?
Question 2c
What is A intersection B?
Question 2d
What is the union of A and B?
Question 2e
Is the union of A and B collectively exhaustive?
Question 3
Suppose, two letters are to be selected from A, B, C, D, and E. Further, these two letters have to be arranged in order. How many permutations are possible?
Question 4
Suppose, there are 8 candidates that applied for a particular job. Yet, there are only 4 positions available. Of these 8 candidates, 5 are men and 3 are women. If every combination of candidates is equally likely to occur, what is then the probability that no women will be hired?
Question 5a
Suppose, there are 10 Apple iPads, 5 Samsung tablets, and 5 Huawei tablets on offer in a store. A person enters the store and wants to buy 3 tablets. These tablets are selected purely by chance. What is the total number of outcomes in the sample space?
Question 5b
What is the probability that this person selects 2 Apple iPads and 1 Samsung tablet?
Question 6a
A sample space consists of 5 A's and 7 B's. Now, suppose we want to randomly draw two letters from this sample space. What is the total number of possible combinations?
Question 6b
What is the probability that a randomly selected set of 2 will include 1 A and 1 B?
Question 7
In a family of 6 family members, there are three males and three females. What is the probability that a random sample of two family members consists of two males?
Question 8a
Suppose there are 12 employees who could be assigned to an editorial task. Of these 12 employees, 7 are women and 5 are men. Two of the men are brothers. The manager of the company has to assign the editorial task randomly to one employee. Let A be the event "chosen employee is a man". Let B be the event "chosen employee is one of the brothers". What is the probability of event A?
Question 8b
What is the probability of event B?
Question 8c
What is the probability of the intersection of A and B?
Question 9a
Suppose, P(A) = 0.75, P(B) = 0.80, and P(A ∩ B) = 0.65. What is P (A ∪ B)?
Question 9b
What is the conditional probability of event B, given that event A has occurred?
Question 9c
What is the joint probability of both event A and event B?
Question 10
Suppose, within the Netherlands, 54% of all master's degrees are earned by women. Of all master's degrees that are obtained, 20% is obtained in psychology. In addition, 8% of all master's degrees are obtained by women in psychology. Are the events "the diploma holder is a woman" and the event "the diploma is in psychology" statistically independent?
Question 11
Suppose, the odds in favor of winning are 3 to 2. What is then the probability of winning?
Question 12a
Suppose, we are interested in examining the effect of alcohol on highway crashes. Obviously, it is unethical to provide one group of drivers with alcohol and compare their crash involvement to that of a sober group. We know, however, that 10.3% of the nighttime drivers have been drinking, and that 32.4% of the single-vehicle-accident drivers had been drinking. In this example, single-vehicle accidents are chosen to ensure that any driving error could be assigned to the driver only.
Based on these data, what is the sample space?
Question 12b
What is the conditional probability that the driver had been drinking, given that he was not involved in a crash?
Question 12c
Do these numbers provide sufficient evidence to conclude that alcohol increases the probability of crashes?
Question 13
For questions 13-17, the sample space is defined by events A1, A2, B1, and B2.
Given that P(A1) = 0.15, P(B1) = 0.20, and P(B1|A1) = 0.60. What is P(A1|B1)?
Question 14
Given that P(A1 ∩ B1) = 0.09 and P(B1) = 0.18. What is P(A1|B1)?
Question 15
Given that P(A2 ∩ B2) = 0.81 and P(B2) = 0.82. What is P(A2|B2)?
Question 16
Given that P(A1) = 0.10, P(B1|A1) = 0.90. What is P(A1 ∩ B1)?
Question 17
Given that P(A1) = 0.10, P(B1|A1) = 0.90, P(B2|A1) = 0.10. What is P(A2)?
Answer indication
Question 1a
A ∩ B = [E3].
Question 1b
A ∪ B = [E1, E2, E3, E4, E5].
Question 1c
No, A and B are not collectively exhaustive, because E6 is not covered in the union.
Question 2a
Ā = [E5, E6, E7, E8, E9, E10]
Question 2b
Since Ā = [E1, E4, E5, E7], A = [E2, E3, E6, E8, E9, E10]. The outcomes that are in both A and B̄ are: A ∩ B̄ = [E2, E3, E8].
Question 2c
Since B̄ = [E2, E3, E5, E8], B = [E1, E4, E6, E7, E9, E10]. Thus, A ∩ B = [E6, E9, E10].
Question 2d
A ∪ B = [E1, E2, E3, E4, E6, E7, E8, E9, E10]
Question 2e
No, E5 is not covered in the union of A and B, because E5 belongs to both Ā and B̄.
Question 3
There are five outcomes, that is n = 5, and two outcomes have to be selected, that is x = 2.
Using the formula for the number of permutations yields:
\[P^{5}_{2} = \frac{5!}{(5-2)!} = \frac{5!}{3!} = \frac{120}{6} = 20 \]
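The count can be confirmed by listing the ordered pairs explicitly. The sketch below uses `math.perm` and `itertools.permutations` from the Python standard library (Python 3.8 or later).

```python
from itertools import permutations
from math import perm

letters = ["A", "B", "C", "D", "E"]

# Number of ordered selections of 2 letters out of 5: 5! / (5 - 2)!.
count = perm(len(letters), 2)

# Enumerate the ordered pairs; AB and BA count as different permutations.
ordered_pairs = ["".join(pair) for pair in permutations(letters, 2)]

print(count, ordered_pairs)
```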
Question 4
First, calculate the total number of possible combinations of four candidates selected from the eight possible candidates. That is:
\[ C^{8}_{4} = \frac{8!}{4!4!} = 70 \]
Then, if no woman is to be hired, the four successful candidates must all come from the five available men. The number of such combinations is:
\[ C^{5}_{4} = \frac{5!}{4!1!} = 5 \]
To conclude, if each of the 70 possible combinations is equally likely to be chosen, the probability that one of the 5 all-male combinations is selected is 5/70 = 1/14 = 0.07 (that is, 7%).
Question 5a
\[N = C^{20}_{3} = \frac{20!}{3!(20-3)!} = 1,140 \]
Thus, there are 1,140 outcomes in the sample space.
Question 5b
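Let A be the event that 2 Apple iPads and 1 Samsung tablet are selected. The number of ways that we can select 2 Apple iPads from the available 10 is 45.
\[ C^{10}_{2} = \frac{10!}{2!(10-2)!} = 45 \]
Similarly, the number of ways that we can select 1 Samsung tablet from the available 5 is 5.
\[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
Therefore, the number of outcomes that satisfy event A is as follows:
\[ N_{A} = C^{10}_{2} \times C^{5}_{1} = 45 \times 5 = 225 \]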
Hence, the probability of A [i.e., 2 Apple iPads and 1 Samsung tablet] is:
\[ P_{A} = \frac{N_{A}}{N} = \frac{225}{1140} = 0.197 \]
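The counting argument used in questions 4 and 5 (favourable combinations divided by all combinations) can be written as one small function. This is a sketch using `math.comb`; the name `selection_probability` and its argument layout are illustrative.

```python
from math import comb


def selection_probability(group_sizes, picks):
    """Probability of drawing picks[i] items from group i when sum(picks)
    items are drawn at random, without replacement, from all groups."""
    favourable = 1
    for size, k in zip(group_sizes, picks):
        favourable *= comb(size, k)
    total = comb(sum(group_sizes), sum(picks))
    return favourable / total


# Question 5b: 2 of 10 iPads, 1 of 5 Samsung tablets, 0 of 5 Huawei tablets.
p_tablets = selection_probability([10, 5, 5], [2, 1, 0])

# Question 4: 4 of 5 men and 0 of 3 women.
p_no_women = selection_probability([5, 3], [4, 0])

print(round(p_tablets, 3), round(p_no_women, 3))
```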
Question 6a
The total number of possible combinations of 2 letters selected from 12 is as follows:
\[ C^{12}_{2} = \frac{12!}{2!10!} = 66 \]
Question 6b
The number of ways that we can select 1 A from the 5 available A's is as follows:
\[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
Similarly, the number of ways that we can select 1 B from the 7 available B's is as follows:
\[ C^{7}_{1} = \frac{7!}{1!(7-1)!} = \frac{7!}{1!6!} = 7\]
Therefore, the number of ways that we can select one A and one B, that is the number of outcomes that satisfy event A, is as follows:
\[N_{A} = C^{5}_{1} \times C^{7}_{1} = 5 \times 7 = 35 \]
Finally, the probability of event A (that is, one A and one B) is as follows:
\[ P_{A} = \frac{N_{A}}{N} = \frac{35}{66} = 0.53\]
Question 7
The total number of possible combinations of 2 family members selected from 6 is:
\[ N = C^{6}_{2} = \frac{6!}{2!4!} = \frac{720}{48} = 15 \]
Now, the number of combinations for two males is:
\[ C^{3}_{2} = \frac{3!}{2!1!} = \frac{6}{2} = 3 \]
Therefore, the probability of selecting two males is 3/15 = 0.20 (that is: 20%).
Question 8a
\[P_{A} = \frac{N_{A}}{N} = \frac{5}{12} = 0.42 \]
Question 8b
\[P_{B} = \frac{N_{B}}{N} = \frac{2}{12} = 0.17 \]
Question 8c
Both brothers are men, so event B is contained in event A and A ∩ B = B. Thus, P(A ∩ B) = P(B) = 2/12 = 0.17.
Question 9a
Use the addition rule of probabilities.
\[ P (A ∪ B) = P(A) + P(B) - P(A ∩ B) \]
Substituting the given probabilities gives:
\[ P (A ∪ B) = 0.75 + 0.80 - 0.65 = 0.90 \]
Question 9b
\[ P(B|A) = \frac{P(A ∩ B)}{P(A)} = \frac{0.65}{0.75} = 0.8667 \]
Question 9c
To answer this question, use the multiplication rule of probabilities. That is:
\[ P(A ∩ B) = P(A|B) P(B) = (0.8125)(0.80) = 0.65 \]
Question 10
\[ P(A) = 0.54, P(B) = 0.20, P(A ∩ B) = 0.08 \]
Since
\[ P(A)P(B) = (0.54)(0.20) = 0.108 \neq 0.08 = P(A ∩ B) \]
these events are not independent.
The dependence can be found from the conditional probability:
\[ P(A|B) = \frac{P(A ∩ B)}{P(B)} = \frac{0.08}{0.20} = 0.40 \neq 0.54 = P(A) \]
That means that, in the Netherlands, only 40% of psychology degrees go to women, whereas women constitute 54% of all degree recipients.
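The probability rules applied in questions 9 and 10 (addition rule, conditional probability, and the independence check) can be sketched as follows. The function names are illustrative, and `math.isclose` is used so that floating-point rounding does not affect the independence check.

```python
from math import isclose


def union(p_a, p_b, p_ab):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_ab


def conditional(p_ab, p_condition):
    """Conditional probability: P(A and B) divided by the probability of the condition."""
    return p_ab / p_condition


def independent(p_a, p_b, p_ab):
    """Events are independent when P(A and B) equals P(A) * P(B)."""
    return isclose(p_a * p_b, p_ab)


# Question 9: P(A) = 0.75, P(B) = 0.80, P(A and B) = 0.65.
print(round(union(0.75, 0.80, 0.65), 2), round(conditional(0.65, 0.75), 4))

# Question 10: P(woman) = 0.54, P(psychology) = 0.20, P(both) = 0.08.
print(independent(0.54, 0.20, 0.08), round(conditional(0.08, 0.20), 2))
```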
Question 11
\[ \frac{3}{2} = \frac{P(A)}{1-P(A)} \]
\[ 3(1-P(A)) = 2P(A) \]
\[ 5P(A) = 3 \]
\[ P(A) = \frac{3}{5} = 0.6 \]
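The same conversion works for any odds of a to b: the probability is a / (a + b). A minimal sketch with exact fractions; the function name is hypothetical.

```python
from fractions import Fraction


def odds_to_probability(in_favor, against):
    """Odds of a to b in favor correspond to a probability of a / (a + b)."""
    return Fraction(in_favor, in_favor + against)


p_win = odds_to_probability(3, 2)
print(p_win, float(p_win))
```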
Question 12a
A1: the driver had been drinking.
A2: the driver had not been drinking.
C1: the driver was involved in a single-vehicle crash.
C2: the driver was not involved in a single-vehicle crash.
Question 12b
Taking the nighttime drivers as (approximately) the drivers who were not involved in a crash: P(A1|C2) = 0.103.
Question 12c
P(A1|C1) = 0.324 and P(A1|C2) = 0.103.
To answer this question, use the overinvolvement ratio. That is:
\[ \frac{P(A_{1}|C_{1})}{P(A_{1}|C_{2})} = \frac{0.324}{0.103} = 3.15 \]
Based on this ratio of 3.15, we can conclude that there is evidence that alcohol increases the probability of car crashes.
Question 13
Using Bayes' theorem, we find that P(A1|B1) = (0.60*0.15)/(0.20) = 0.45.
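Bayes' theorem as used here reverses a conditional probability: P(A1|B1) = P(B1|A1) P(A1) / P(B1). A minimal sketch, with an illustrative function name:

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Question 13: P(B1|A1) = 0.60, P(A1) = 0.15, P(B1) = 0.20.
p_a1_given_b1 = bayes(0.60, 0.15, 0.20)
print(round(p_a1_given_b1, 2))
```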
Question 14
\[ P(A_{1}|B_{1}) = \frac{P(A_{1} ∩ B_{1})}{P(B_{1})} = \frac{0.09}{0.18} = 0.50 \]
Question 15
\[ P(A_{2}|B_{2}) = \frac{P(A_{2} ∩ B_{2})}{P(B_{2})} = \frac{0.81}{0.82} = 0.988 \]
Question 16
P(A1 ∩ B1) = 0.90 * 0.10 = 0.09
Question 17
Use both:
P(A1 ∩ B1) = 0.90 * 0.10 = 0.09
and:
P(A1 ∩ B2) = 0.10 * 0.10 = 0.01
to find that:
P(A1) = 0.09 + 0.01 = 0.10
A2 is the complement of A1, thus P(A2) = 1 - P(A1) = 1 - 0.10 = 0.90.
How to use probability models for discrete random variables? - ExamTests 4
Questions
Question 1
A researcher is studying the number of owl eggs found in Denmark. Is the number of eggs a discrete or continuous random variable?
Question 2
The weight of students is recorded as part of a national health study. Is the weight of students a discrete or continuous random variable?
Question 3
Indicate for each of the following if a discrete or continuous random variable provides the best definition:
- The number of sunny days in the Netherlands.
- The level of pressure in the tires of a car.
- The amount of oil exported by Saudi Arabia in 2019.
Question 4
Give the probability distribution function of the face values of a single die when a fair die is rolled.
Question 5
What is the probability of a value of 5 or higher, when rolling a single fair die once?
Question 6a
Use the following probability distribution:
x | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
P(x) | 0.03 | 0.15 | 0.11 | 0.19 | 0.22 | 0.26 | 0.04 |
P(3 < x < 6) = ?
Question 6b
P(x > 3) = ?
Question 6c
P(2 < x < 5) = ?
Question 6d
P(x < 4) = ?
Question 6e
What is the mean of this probability distribution?
Question 7
Suppose, the probability distribution of the number of errors (X) on pages from a business textbook is as follows: P(0) = 0.81; P(1) = 0.17; P(2) = 0.02.
What is the mean number of errors per page?
Question 8a
Someone is interested in the total costs of a project on which he intends to bid. He estimates that the materials will cost €25,000,- and that the labor will cost €900,- per day. Suppose the project takes X days to complete. Provide the linear function for the total costs, denoted by C, of the project.
Question 8b
Now, assume that the following probability distribution is provided for the completion time of the project. What is the mean for completion time X?
Completion time (x) | 10 | 11 | 12 | 13 | 14 |
P(x) | 0.1 | 0.3 | 0.3 | 0.2 | 0.1 |
Question 8c
What is the variance for completion time X?
Question 8d
What is the mean for the total costs, C?
Question 8e
What is the variance for the total costs, C?
Question 9a
Suppose that a real estate agent has five contacts and believes that for each contact the probability of making a sale is 0.40. What is the probability that the real estate agent makes at most 1 sale?
Question 9b
What is the probability that the real estate agent makes between 2 and 4 sales (inclusive)?
Question 10a
It is predicted that 3.5% of all small corporations will file for bankruptcy in 2020. For a random sample of 100 small corporations, estimate the probability that at least 3 will file for bankruptcy in 2020, assuming that this prediction is correct. To do so, use the Poisson distribution.
Question 10b
Now, do the same using the (actual) binomial distribution. Is the Poisson distribution a close estimate of the actual binomial distribution?
Question 11a
Consider the following joint probability distribution for two random variables X and Y. Find the marginal probabilities.
X return \ Y return | 0% | 5% | 10% | 15% |
0% | 0.0625 | 0.0625 | 0.0625 | 0.0625 |
5% | 0.0625 | 0.0625 | 0.0625 | 0.0625 |
10% | 0.0625 | 0.0625 | 0.0625 | 0.0625 |
15% | 0.0625 | 0.0625 | 0.0625 | 0.0625 |
Question 11b
Are X and Y independent?
Question 11c
Find the mean of X.
Question 11d
Find the mean of Y.
Question 11e
What is the variance of X?
Question 11f
What is the standard deviation of X?
Question 12
Consider the following joint probability distribution:
Y \ X | 0 | 1 |
0 | 0.25 | 0.35 |
1 | 0.10 | 0.30 |
Compute the marginal probability distributions for X and Y.
Question 13a
Consider the following information for questions 13a-13c. An investor has €1000,- to invest and two investment opportunities, each requiring a minimum of €500,-. The profit per €100,- from the first investment (X) can be represented by the following probability distribution: P(X = -5) = 0.4 and P(X = 20) = 0.6. The profit per €100,- from the second investment (Y) is represented by the following probability distribution: P(Y = 0) = 0.6 and P(Y = 25) = 0.4. Random variables X and Y are independent. The investor has the following possible strategies:
- €1000,- in the first investment.
- €1000,- in the second investment.
- €500,- in each investment.
Find the mean and variance for the first strategy.
Question 13b
Find the mean and variance for the second strategy.
Question 13c
Find the mean and variance for the third strategy.
Answer indication
Question 1
It is a discrete random variable, because the number of eggs can only take on countable values (0, 1, 2, ...).
Question 2
The weight of students is a continuous random variable.
Question 3
- The number of sunny days in the Netherlands: discrete.
- The level of pressure in the tires of a car: continuous.
- The amount of oil exported by Saudi Arabia in 2019: continuous.
Question 4
x | P(x) |
1 | 0.16667 |
2 | 0.16667 |
3 | 0.16667 |
4 | 0.16667 |
5 | 0.16667 |
6 | 0.16667 |
Question 5
0.1667 + 0.1667 = 0.3333
Question 6a
P(3 < x < 6) = P(4) + P(5) = 0.22 + 0.26 = 0.48
Question 6b
P(x > 3) = P(4) + P(5) + P(6) = 0.22 + 0.26 + 0.04 = 0.52
Question 6c
P(2 < x < 5) = P(3) + P(4) = 0.19 + 0.22 = 0.41
Question 6d
P(x < 4) = 0.03 + 0.15 + 0.11 + 0.19 = 0.48
Question 6e
\[ \mu_{X} = 0(0.03) + (1)(0.15) + (2)(0.11) + (3)(0.19) + (4)(0.22) + (5)(0.26) + (6)(0.04) = 3.36 \]
Question 7
\[ \mu_{x} = E[X] = \sum_{x} xP(x) = (0)(0.81) + (1)(0.17) + (2)(0.02) = 0.21 \]
Thus, the mean number of errors per page is 0.21.
Question 8a
C = 25,000 + 900X.
Question 8b
\[ \mu_{X} = E[X] = \sum_{x}xP(x) = (10)(0.1) + (11)(0.3) + (12)(0.3) + (13)(0.2) + (14)(0.1) = 11.9 \]
So, the mean for completion time X is 11.9 days.
Question 8c
\[ \sigma^{2}_{X} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu_{X})^{2}P(x) \]
\[ \sigma^{2}_{X} = (10 - 11.9)^{2}(0.1) + (11 - 11.9)^{2}(0.3) + ... + (14 - 11.9)^{2}(0.1) = 1.29 \]
So, the variance for completion time X is 1.29 (in days squared).
Question 8d
\[ \mu_{C} = E[25,000 + 900X] = (25,000 + 900\mu_{X}) = 25,000 + (900)(11.9) = €35,710,- \]
Question 8e
\[ \sigma^{2}_{C} = Var(25,000 + 900X) = (900)^{2}\sigma^{2}_{X} = (810,000)(1.29) = €1,044,900,- \]
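The calculations for Questions 8b-8e can be checked numerically. The following is a minimal Python sketch, assuming the completion-time probabilities used in the answer to Question 8b (0.1, 0.3, 0.3, 0.2, 0.1); the variable names are illustrative only.

```python
# Completion-time distribution (probabilities as used in the answer to 8b).
days = [10, 11, 12, 13, 14]
probs = [0.1, 0.3, 0.3, 0.2, 0.1]

# Mean and variance of the completion time X.
mean_x = sum(x * p for x, p in zip(days, probs))
var_x = sum((x - mean_x) ** 2 * p for x, p in zip(days, probs))

# Linear cost function C = a + bX: E[C] = a + b*E[X] and Var(C) = b^2 * Var(X).
a, b = 25_000, 900
mean_c = a + b * mean_x
var_c = b ** 2 * var_x

print(round(mean_x, 2), round(var_x, 2))   # 11.9 1.29
print(round(mean_c), round(var_c))         # 35710 1044900
```

Note that the fixed materials cost shifts the mean of C but does not contribute to its variance.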
Question 9a
\[ P(0) = \frac{5!}{0!5!}(0.4)^{0}(0.6)^{5} = (0.6)^{5} = 0.078 \]
\[ P(1) = \frac{5!}{1!4!}(0.4)^{1}(0.6)^{4} = 5(0.4)(0.6)^{4} = 0.259 \]
P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.078 + 0.259 = 0.337
Question 9b
\[ P(2) = \frac{5!}{2!3!}(0.4)^{2}(0.6)^{3} = 10(0.4)^{2}(0.6)^{3} = 0.346 \]
\[ P(3) = \frac{5!}{3!2!}(0.4)^{3}(0.6)^{2} = 10(0.4)^{3}(0.6)^{2} = 0.230 \]
\[ P(4) = \frac{5!}{4!1!}(0.4)^{4}(0.6)^{1} = 5(0.4)^{4}(0.6)^{1} = 0.077 \]
P(2 ≤ X ≤ 4) = P(2) + P(3) + P(4) = 0.346 + 0.230 + 0.077 = 0.653
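The binomial probabilities for Questions 9a and 9b can be reproduced with a few lines of Python. This is a small sketch using only the standard library; `binom_pmf` is a helper name introduced here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.40
at_most_one = sum(binom_pmf(k, n, p) for k in range(0, 2))   # P(X <= 1)
two_to_four = sum(binom_pmf(k, n, p) for k in range(2, 5))   # P(2 <= X <= 4)

print(round(at_most_one, 3), round(two_to_four, 3))   # 0.337 0.653
```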
Question 10a
The distribution of X is binomial with n = 100 and P = 0.035, so that the mean of the distribution is equal to nP = 3.5. Next, using the Poisson distribution to approximate the probability of at least 3 bankruptcies, we find:
\[ P(X \geq 3) = 1 - P(X \leq 2) \]
\[ P(0) = \frac{e^{-3.5}(3.5)^{0}}{0!} = e^{-3.5} = 0.030197 \]
\[ P(1) = \frac{e^{-3.5}(3.5)^{1}}{1!} = (3.5)(0.030197) = 0.1056895 \]
\[ P(2) = \frac{e^{-3.5}(3.5)^{2}}{2!} = (6.125)(0.030197) = 0.1849566 \]
Hence,
\[ P(X \leq 2) = P(0) + P(1) + P(2) = 0.3208431 \]
\[ P(X \geq 3) = 1 - 0.3208431 = 0.6791569 \]
Question 10b
Using the binomial distribution, we compute the probability belonging to X ≥ 3 as: P(X ≥ 3) = 0.684093.
Thus, the Poisson probability is a close estimate of the actual binomial distribution.
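To see how close the two results are, both probabilities can be computed side by side. The sketch below assumes n = 100 and a bankruptcy probability of 0.035 (that is, 3.5%), and uses only the Python standard library.

```python
from math import comb, exp, factorial

n, p = 100, 0.035      # 3.5% of 100 small corporations
lam = n * p            # Poisson mean, 3.5

# P(X <= 2) under the Poisson approximation and under the exact binomial.
poisson_le2 = sum(exp(-lam) * lam ** k / factorial(k) for k in range(3))
binom_le2 = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(3))

print(round(1 - poisson_le2, 4))   # 0.6792
print(round(1 - binom_le2, 4))     # 0.6841
```

The two probabilities differ by about 0.005, which supports the conclusion that the Poisson distribution is a close approximation here.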
Question 11a
\[ P(X = 0) = \sum_{y}P(0,y) = 0.0625 + 0.0625 + 0.0625 + 0.0625 = 0.25\]
Note that for every combination of values for X and Y, P(x,y) = 0.0625. Therefore, all the marginal probabilities of X are 25%. The same holds for the marginal probabilities of Y. Note that the sum of the marginal probabilities for a random variable is 1.
Question 11b
To test independence, we need to check if P(x,y) = P(x)P(y) for all possible pairs of values x and y.
P(x,y) = 0.0625 for all possible values of x and y.
P(x) = 0.25 and P(y) = 0.25 for all possible values of x and y.
P(x,y) = 0.0625 = (0.25)(0.25) = P(x)P(y)
Thus, X and Y are independent.
Question 11c
\[ \mu_{X} = E[X] = \sum_{x}xP(x) = 0(0.25) + 0.05(0.25) + 0.10(0.25) + 0.15(0.25) = 0.075 \]
Question 11d
The mean of Y is equal to the mean of X, that is 0.075.
Question 11e
\[ \sigma^{2}_{X} = \sum_{X}(x-\mu_{X})^{2}P(x) = (0.25)[(0 - 0.075)^{2} + (0.05 - 0.075)^{2} + (0.10 - 0.075)^{2} + (0.15 - 0.075)^{2}] = 0.003125 \]
Question 11f
The standard deviation of X is the square root of the variance, that is 0.0559016, or 5.59%.
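The steps for Questions 11a-11f (marginal probabilities, the independence check, and the mean and variance of X) can be sketched in Python as follows. The dictionary layout and variable names are assumptions made for this illustration.

```python
# Joint distribution: every (x, y) pair of returns has probability 0.0625.
returns = [0.00, 0.05, 0.10, 0.15]
joint = {(x, y): 0.0625 for x in returns for y in returns}

# Marginal distributions: sum the joint probabilities over the other variable.
p_x = {x: sum(joint[(x, y)] for y in returns) for x in returns}
p_y = {y: sum(joint[(x, y)] for x in returns) for y in returns}

# X and Y are independent if P(x, y) = P(x) * P(y) for every pair.
independent = all(abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for x, y in joint)

mean_x = sum(x * p for x, p in p_x.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in p_x.items())
sd_x = var_x ** 0.5

print(independent, round(mean_x, 3), round(var_x, 6), round(sd_x, 4))
# True 0.075 0.003125 0.0559
```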
Question 12
\[ P(X = 0) = \sum_{y}P(0,y) = 0.25 + 0.10 = 0.35 \]
\[ P(Y = 0) = \sum_{x}P(x,0) = 0.25 + 0.35 = 0.60 \]
Hence, P(X = 1) = 1 - 0.35 = 0.65 and P(Y = 1) = 1 - 0.60 = 0.40.
Question 13a
\[ \mu_{X} = E[X] = \sum_{x}xP(x) = (-5)(0.4) + (20)(0.6) = €10,- \]
\[ \sigma^{2}_{x} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu)^{2} P(x) = (-5 - 10)^{2}(0.4) + (20 - 10)^{2}(0.6) = 150 \]
Strategy a has a mean profit of E[10X] = €100,- and variance of Var(10X) = 100Var(X) = 15,000.
Question 13b
\[ \mu_{Y} = E[Y] = \sum_{y}yP(y) = (0)(0.6) + (25)(0.4) = €10,- \]
\[ \sigma^{2}_{y} = E[(Y - \mu_{Y})^{2}] = \sum_{y}(y - \mu)^{2} P(Y) = (0 - 10)^{2}(0.6) + (25 - 10)^{2}(0.4) = 150 \]
Strategy b has a mean profit of E[10Y] = €100,- and variance of Var(10Y) = 100Var(Y) = 15,000.
Question 13c
\[ E[5X + 5Y] = E[5X] + E[5Y] = 5E[X] + 5E[Y] = €100,- \]
\[ Var(5X + 5Y) = Var(5X) + Var(5Y) = 25Var(X) + 25Var(Y) = 7,500 \]
The variance of strategy c is smaller than that of the strategies of a and b, reflecting the decrease in risk that follows from diversification in an investment portfolio. Most investors would prefer strategy c, because this strategy yields the same expected return as the other two strategies, but with a lower risk.
How to use probability models for continuous random variables? - ExamTests 5
Questions
Question 1
Consider the uniform probability density function f(x) = 0.5 with a range of 0 to 2 (cumulative distribution function F(x) = 0.5x). What is the probability that a random variable X is between 1.4 and 1.8?
Question 2
Consider the uniform probability density function f(x) = 0.5 with a range of 0 to 2 (cumulative distribution function F(x) = 0.5x). What is the probability that a random variable X is between 0.5 and 1.6?
Question 3
Consider the uniform probability density function f(x) = 0.5 with a range of 0 to 2 (cumulative distribution function F(x) = 0.5x). What is the probability that a random variable X is less than 0.8?
Question 4
Consider the uniform probability density function f(x) = 0.5 with a range of 0 to 2 (cumulative distribution function F(x) = 0.5x). What is the probability that a random variable X is greater than 1.3?
Question 5
A homeowner estimates the heating bill based on the range of likely temperatures in January. He obtains the following linear equation: Y = 290 - 5T, in which T refers to the average temperature for the month in degrees Fahrenheit. If the average temperature in January has mean 24 and standard deviation 4, what is then the mean and standard deviation of this homeowner's January heating bill?
Question 6
The profit for a production process is equal to 6000 dollars minus three times the number of units produced. The mean and variance for the number of units produced are 1000 and 900 respectively. Find the mean and variance of the profit.
Question 7
The profit of a particular production process is equal to €2000,- minus two times the number of units produced. The mean and variance for the number of units produced are 500 and 900 respectively. What are the mean and variance of the profit?
Question 8
The profit of a particular production process is equal to €1000,- minus two times the number of units produced. The mean and variance for the number of units produced are 50 and 90 respectively. What are the mean and variance of the profit?
Question 9
Consider for questions 9-15 the standard normal distribution.
P(Z < 1.16) = ?
Question 10
P(Z > 1.73) = ?
Question 11
P(Z > -2.29) = ?
Question 12
P(Z > -1.35) = ?
Question 13
P(1.16 < Z < 1.73) = ?
Question 14
P(-2.29 < Z < 1.26) = ?
Question 15
P(-2.29 < Z < -1.35) = ?
Question 16
The probability is 0.70 that Z is less than what number?
Question 17
The probability is 0.25 that Z is less than what number?
Question 18
The probability is 0.2 that Z is greater than what number?
Question 19
The probability is 0.6 that Z is greater than what number?
Question 20
Let a continuous random variable X be normally distributed with X ~ N(30, 81). What is the probability that X is greater than 40?
Question 21
The anticipated consumer demand at a restaurant can be modeled by a normal random variable with mean 1,500 pounds and standard deviation 110 pounds. What is the probability that the demand will exceed 1,300 pounds?
Question 22
The scores on an achievement test are known to be normally distributed with a mean of 420 and a standard deviation of 80. What is the minimum test score needed in order to be in the top 10% of all people taking the test?
Question 23
Given a random sample size of n = 900 from a binomial probability distribution with P = 0.30. Can the normal distribution be used to compute probabilities belonging to this distribution? If so, why?
Question 24
Given a random sample size of n = 900 from a binomial probability distribution with P = 0.30. What is the probability that the number of successes is greater than 305?
Question 25
Service times for customers at a library information desk can be modeled by an exponential distribution with a mean service of 5 minutes. What is the probability that a customer service time will take longer than 10 minutes?
Question 26
A company in the Netherlands with 2000 employees has a mean number of lost-time accidents per week equal to λ = 0.4 and the number of accidents follows a Poisson distribution. What is the probability that the time between accidents is less than 2 weeks?
Question 27a
An investor has asked you for assistance in establishing a portfolio containing two stocks. The investor has €1000,- which can be allocated in any proportion to two alternative stocks. The returns per euro from these two investments are denoted by random variables X and Y. Both of these variables are normally distributed. Investment X has a mean of 25 and variance of 81. The second investment has a mean of 40 and a variance of 121. These two stock prices have a negative correlation, ρxy = -0.40. Define the linear equation of the value of the portfolio, denoted by W.
Question 27b
What is the mean value for the stock portfolio?
Question 27c
What is the standard deviation for the stock portfolio?
Question 27d
What is the probability that the portfolio value exceeds 2,000?
Answer indication
Question 1
P(1.4 < X < 1.8) = F(1.8) - F(1.4) = (0.5)(1.8) - (0.5)(1.4) = 0.9 - 0.7 = 0.2.
Question 2
P(0.5 < X < 1.6) = F(1.6) - F(0.5) = (0.5)(1.6) - (0.5)(0.5) = 0.8 - 0.25 = 0.55.
Question 3
P(X < 0.8) = F(0.8) = (0.5)(0.8) = 0.40.
Question 4
P(X > 1.3) = P(1.3 < X < 2.0) = F(2.0) - F(1.3) = (0.5)(2.0) - (0.5)(1.3) = 1.0 - 0.65 = 0.35.
Question 5
\[ \mu_{Y} = 290 - 5\mu_{T} = 290 - (5)(24) = 170 \]
\[ \sigma_{Y} = |-5| \sigma_{T} = (5)(4) = 20 \]
Question 6
\[ Y = 6000 - 3U \]
\[ \mu_{Y} = 6000 - 3\mu_{U} = 6000 - (3)(1000) = 3000 \]
\[ \sigma^{2}_{Y} = (-3)^{2}\sigma^{2}_{U} = (9)(900) = 8100 \]
Thus, the mean and variance of the profit are 3,000 dollars and 8,100 respectively.
Question 7
\[ Y = 2000 - 2U \]
\[ \mu_{Y} = 2000 - 2\mu_{U} = 2000 - (2)(500) = 1000 \]
\[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(900) = 3600 \]
Thus, the mean and variance of the profit are €1,000,- and 3,600 respectively.
Question 8
\[ Y = 1000 - 2U \]
\[ \mu_{Y} = 1000 - 2\mu_{U} = 1000 - (2)(50) = 900 \]
\[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(90) = 360 \]
Thus, the mean and variance of the profit are €900,- and 360 respectively.
Question 9
P(Z < 1.16) = 0.8770
Question 10
P(Z > 1.73) = 1 - 0.9582 = 0.0418
Question 11
P(Z > -2.29) = P(Z < 2.29) = 0.9890
Question 12
P(Z > -1.35) = P(Z < 1.35) = 0.9115
Question 13
P(1.16 < Z < 1.73) = 0.9582 - 0.8770 = 0.0812
Question 14
P(-2.29 < Z < 1.26) = 0.8962 - (1 - 0.9890) = 0.8962 - 0.0110 = 0.8852
Question 15
P(-2.29 < Z < -1.35) = (1 - 0.9115) - (1 - 0.9890) = 0.0885 - 0.0110 = 0.0775
Question 16
z = 0.525
Question 17
z = -0.675
Question 18
z = 0.842
Question 19
z = -0.253
Question 20
\[ Z = \frac{X - \mu}{\sigma} = \frac{40 - 30}{\sqrt{81}} = \frac{10}{9} = 1.11 \]
P(Z > 1.11) = 1 - 0.8665 = 0.1335
Question 21
\[ Z = \frac{(1300 - 1,500)}{110} = -1.82 \]
P(Z > -1.82) = 0.9656
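The normal probabilities for Questions 20 and 21 can also be obtained without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; because it does not round z to two decimals, the results may differ from the table-based answers in the third or fourth decimal.

```python
from statistics import NormalDist

# Question 20: X ~ N(30, 81), so the standard deviation is 9.
p_q20 = 1 - NormalDist(mu=30, sigma=9).cdf(40)

# Question 21: demand is normal with mean 1,500 and standard deviation 110.
p_q21 = 1 - NormalDist(mu=1500, sigma=110).cdf(1300)

print(round(p_q20, 4), round(p_q21, 4))
```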
Question 22
Top 10% corresponds to z = 1.28 (the entry closest to a cumulative probability of 0.90 in the Standard Normal Distribution Table).
\[ 1.28 = \frac{X - 420}{80} \]
\[ 1.28*80 = X - 420 \]
\[ 102.4 + 420 = X\]
Thus, X = 522.4. One needs to score at least 523 to be in the top 10% of all people taking this test.
Question 23
nP(1 - P) = 900*0.30(1 - 0.30) = 189 > 5, thus the binomial distribution can be approximated by the normal distribution.
Question 24
\[ \mu = nP = 270 \]
\[ \sigma^{2} = 189 \]
\[ \sigma = \sqrt{189} = 13.75 \]
\[ z = \frac{305 - 270}{13.75} = 2.55 \]
P(Z > 2.55) = 1 - 0.9946 = 0.0054
Question 25
\[ P(T > 10) = 1 - P(T < 10) = 1 - F(10) = 1 - (1 - e^{-(0.20)(10)}) = e^{-2.0} = 0.1353 \]
Thus, the probability that a service time exceeds 10 minutes is 0.1353.
Question 26
\[ P(T < 2) = F(2) = 1 - e^{-(0.4)(2)} = 1 - e^{-0.8} = 1 - 0.4493 = 0.5507 \]
Thus, the probability of less than 2 weeks between accidents is about 55%.
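Both exponential probabilities (Questions 25 and 26) follow from the cumulative distribution function F(t) = 1 - e^(-λt). A minimal Python sketch, with `expon_cdf` as a helper name introduced here:

```python
from math import exp

def expon_cdf(t, lam):
    """P(T <= t) for an exponential random variable with rate lam."""
    return 1 - exp(-lam * t)

# Question 25: mean service time of 5 minutes, so lam = 1/5 per minute.
p_service = 1 - expon_cdf(10, lam=1 / 5)

# Question 26: lam = 0.4 accidents per week.
p_accident = expon_cdf(2, lam=0.4)

print(round(p_service, 4), round(p_accident, 4))   # 0.1353 0.5507
```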
Question 27a
W = 20X + 30Y
Question 27b
μW = 20*25 + 30*40 = 1,700
Question 27c
\[ \sigma^{2}_{W} = 20^{2} \sigma^{2}_{X} + 30^{2} \sigma^{2}_{Y} + 2*20*30 \rho_{XY} \sigma_{X} \sigma_{Y} \]
\[ \sigma^{2}_{W} = 20^{2}*81 + 30^{2}*121 + 2*20*30*(-0.40)*9*11 = 93,780 \]
\[ \sigma_{W} = \sqrt{\sigma^{2}_{W}} = \sqrt{93,780} = 306.24 \]
Question 27d
\[ Z = \frac{2000 - 1700}{306.24} = 0.980 \]
P(Z > 0.980) = 0.1635
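The portfolio calculations in Questions 27b-27d can be verified with the sketch below. It assumes the portfolio W = 20X + 30Y from the answer to Question 27a and uses only the Python standard library; the last probability is computed without rounding z, so it may differ slightly from the table value.

```python
from math import sqrt
from statistics import NormalDist

a, b = 20, 30                 # portfolio weights from W = 20X + 30Y
mu_x, var_x = 25, 81
mu_y, var_y = 40, 121
rho = -0.40                   # correlation between X and Y

mean_w = a * mu_x + b * mu_y
cov_xy = rho * sqrt(var_x) * sqrt(var_y)
var_w = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
sd_w = sqrt(var_w)

# Probability that the portfolio value exceeds 2,000.
p_exceed = 1 - NormalDist(mean_w, sd_w).cdf(2000)

print(mean_w, round(var_w), round(sd_w, 2), round(p_exceed, 4))
```

The negative correlation reduces the portfolio variance: with ρ = 0 the variance would be 141,300 instead of 93,780.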
How to obtain a proper sample from a population? - ExamTests 6
Questions
Question 1a
Suppose that we know that the annual percentage salary increase is normally distributed with a mean of 12.2% and a standard deviation of 3.6%. A random sample of 9 observations is obtained from this population and the sample mean is computed. What is the standard error of the sample mean?
Question 1b
What is the probability that the sample mean exceeds 14.4%?
Question 2a
Given a population with a mean of 105 and a variance of 16, the central limit theorem applies when the sample size is n ≥ 25. A random sample of size 25 is obtained. What are the mean and variance of the sampling distribution for the sample means?
Question 2b
What is the probability that x̅ > 106?
Question 2c
What is the probability that 104 < x̅ < 106?
Question 2d
What is the probability that x̅ < 105.5?
Question 3a
Given a population with a mean of 150 and a variance of 1600, the central limit theorem applies when the sample size is n ≥ 25. A random sample of size 36 is obtained. What are the mean and variance of the sampling distribution for the sample means?
Question 3b
What is the probability that x̅ > 155?
Question 3c
What is the probability that 145 < x̅ < 165?
Question 3d
What is the probability that x̅ > 165?
Question 4a
The lifetimes of light bulbs produced by a company have a mean of 1,200 hours and a standard deviation of 400 hours. The population is normally distributed. Suppose that you buy nine light bulbs, which can be regarded as a proper random sample from the population. What is the mean of the sample mean lifetime?
Question 4b
What is the variance of the sample mean?
Question 4c
What is the standard error of the sample mean?
Question 4d
What is the probability that, on average, those nine light bulbs have lifetimes of less than 1050 hours?
Question 5a
To get some feeling for possible magnitudes of the finite population correction factor, calculate it for samples of n = 20 observations from populations with N = 20, 100, and 10,000 members.
Question 5b
Explain why the result found in the previous question is precisely what one should expect on intuitive grounds.
Question 6a
A random sample of 270 students was taken from a large population of students taking a statistics exam. If, in fact, 20% of the students fail the test, what is the probability that the sample proportion of students failing the test will be between 16 and 24%?
Question 6b
Now, compute the same probability for 16 to 24%, but this time use a sample of 400 students.
Question 7
It has been estimated that 43% of the students drink alcohol. Find the probability that more than half of a random sample of 80 students drink alcohol.
Question 8
Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 58% of a random sample of 250 adult Americans eat McDonald's once a week?
Question 9
Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 55% of a random sample of 250 adult Americans eat McDonald's once a week?
Question 10
Given is n = 6. Determine an upper limit for the sample variance such that the probability of exceeding this limit, given a population standard deviation of 3.6, is less than 0.05. Use the chi-square distribution to solve this problem.
Question 11a
There are six employees with the following years of experience:
2, 4, 6, 6, 7, 8
Two of these employees are to be chosen at random.
What is the mean number of years of experience for these six employees?
Question 11b
How many possible samples of two employees are there?
Question 11c
List all possible samples
Question 11d
Find the sampling distribution of the sample means.
Question 12
What is the central limit theorem?
Question 13a
Suppose a population distribution is left-skewed with mean 100 and variance 15. From this population, we draw a random sample of n = 100. What is the expected value of the sample mean?
Question 13b
What is the variance of the sample mean?
Question 13c
What shape is expected for the sampling distribution?
Answer indication
Question 1a
μ = 12.2; σ = 3.6; n = 9.
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.6}{\sqrt{9}} = 1.2 \]
Question 1b
\[ P(\bar{x} > 14.4) = P( \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} > \frac{14.4 - 12.2}{1.2} ) = P(z > 1.83) = 0.0336 \]
To conclude, the probability that the sample mean will exceed 14.4% is only 0.0336.
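The two steps for Question 1 (standard error, then the tail probability of the sample mean) can be sketched in Python as follows. The code uses `statistics.NormalDist` and does not round z, so the probability may differ from the table-based answer in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.2, 3.6, 9

# Standard error of the sample mean.
se = sigma / sqrt(n)

# The sample mean is normal with mean mu and standard deviation se.
p_exceed = 1 - NormalDist(mu, se).cdf(14.4)

print(round(se, 2), round(p_exceed, 4))
```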
Question 2a
The central limit theorem applies, thus the sampling distribution has mean 105 and variance σ²/n = 16/25 = 0.64. The standard error is √0.64 = 0.8.
Question 2b
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{106 - 105}{0.8} = 1.25\]
P(Z > 1.25) = 1 - 0.8944 = 0.1056
Question 2c
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{104 - 105}{0.8} = -1.25\]
P(104 < x̅ < 106) = P(-1.25 < Z < 1.25) = 0.8944 - (1 - 0.8944) = 0.7888
Question 2d
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{105.5 - 105}{0.8} = 0.625\]
P(Z < 0.625) = (0.7324 + 0.7357)/2 ≈ 0.734
Question 3a
The central limit theorem applies, thus the mean of the sampling distribution is 150 and the variance is σ²/n = 1600/36 = 44.44. The standard error is √44.44 = 6.67.
Question 3b
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{155 - 150}{6.67} = 0.75\]
P(Z > 0.75) = 1 - 0.7734 = 0.2266
Question 3c
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{145 - 150}{6.67} = -0.75\]
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{165 - 150}{6.67} = 2.25\]
P(145 < x̅ < 165) = P(-0.75 < Z < 2.25) = 0.9878 - (1 - 0.7734) = 0.9878 - 0.2266 = 0.7612
Question 3d
P(x̅ > 165) = P(Z > 2.25) = 1 - 0.9878 = 0.0122
Question 4a
The population is normally distributed. Therefore, the sampling distribution of the sample means is normal. Hence, the mean of the sampling distribution is 1,200.
Question 4b
The variance is σ²/n = 400²/9 = 160,000/9 = 17,777.78.
Question 4c
The standard error is σ/√n = 400/√9 = 133.33.
Question 4d
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{1050 - 1200}{133.33} = -1.125\]
P(x̅ < 1050) = P(Z < -1.125) = 1 - (0.8686 + 0.8708)/2 = 1 - 0.8697 = 0.1303
Question 5a
The finite population correction factor is calculated as follows: (N - n)/(N - 1).
The population correction factor for sample size n = 20 for a population with 20 members is: (20 - 20)/(20 - 1) = 0.
The population correction factor for sample size n = 20 for a population with 100 members is: (100 - 20)/(100 - 1) = 0.8081.
The population correction factor for sample size n = 20 for a population with 10,000 members is: (10,000 - 20)/(10,000 - 1) = 0.9981.
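The three correction factors can be reproduced with a one-line function. This is a small Python sketch; `fpc` is a name introduced here for illustration.

```python
def fpc(N, n):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

for N in (20, 100, 10_000):
    print(N, round(fpc(N, 20), 4))
# 20 0.0
# 100 0.8081
# 10000 0.9981
```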
Question 5b
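When n = N, the sample contains the entire population, so the sample mean equals the population mean and has no sampling variability; a correction factor of 0 reflects this. As N grows relative to n, the sample covers an ever smaller fraction of the population, the correction factor approaches 1, and the variance of the sample mean approaches σ²/n. For large populations it is therefore the sample size, not the fraction of the population in the sample, that determines the precision of the results from a random sample.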
Question 6a
P = 0.20 and n = 270.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{270} } = 0.024 \]
The required probability is:
\[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.024} < Z < \frac{0.24 - 0.20}{0.024} ) \]
P(-1.67 < Z < 1.67) = 0.9525 - (1 - 0.9525) = 0.9050
Thus, we see that the probability is 0.9050 that the sample proportion is within the interval [0.16 - 0.24] given P = 0.20 and sample size n = 270. This interval can be called a 90.50% acceptance interval. Note that, if the sample proportion was actually outside this interval, we may suspect that the population proportion P is not 0.20.
Question 6b
P = 0.20; n = 400.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{400} } = 0.0200 \]
The required probability is:
\[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.0200} < Z < \frac{0.24 - 0.20}{0.0200} ) \]
P(-2.00 < Z < 2.00) = 0.9772 - (1 - 0.9772) = 0.9544
This interval can thus be called a 95.44% acceptance interval (given P = 0.20 and sample size n = 400).
Question 7
P = 0.43; n = 80.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.43(1 - 0.43)}{80} } = 0.055 \]
\[ P(\hat{p} > 0.50) = P(Z > \frac{0.50 - 0.43}{0.055}) \]
P (Z > 1.27) = 0.1020
Question 8
P = 0.50; n = 250
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.50(1 - 0.50)}{250} } = 0.0316 \]
\[ P(\hat{p} > 0.58) = P(Z > \frac{0.58 - 0.50}{0.0316}) = P(Z > 2.53) \]
P (Z > 2.53) = 1 - 0.9943 = 0.0057
Question 9
\[ P(\hat{p} > 0.55) = P(Z > \frac{0.55 - 0.50}{0.0316}) = P(Z > 1.58) \]
P (Z > 1.58) = 1 - 0.9429 = 0.0571
Question 10
n = 6; σ² = (3.6)² = 12.96.
Using the chi-square distribution, we can state that:
\[ P(s^{2} > K) = P ( \frac{(n - 1) s^{2}}{12.96} > 11.070) = 0.05 \]
where K is the desired upper limit and χ²(5, 0.05) = 11.070 is the upper 0.05 critical value of the chi-square distribution with 5 degrees of freedom. The required upper limit for s² is obtained by solving:
\[ \frac{(n - 1)K}{12.96} = 11.070 \]
\[ K = \frac{(11.070)(12.96)}{(6 - 1)} = 28.69 \]
Thus, if the sample variance s² from a random sample of size n = 6 exceeds 28.69, there is strong evidence to suspect that the population variance exceeds 12.96.
Question 11a
\[ \mu = \frac{2 + 4 + 6 + 6 + 7 + 8}{6} = 5.5 \]
Question 11b
Two of these employees are to be chosen randomly. We are sampling without replacement, thus, the first observation has a probability of 1/6 of being selected, while the second observation has a probability of 1/5 of being selected. Fifteen possible random samples of two employees could be selected. Note that some samples (such as 2,6) occur twice because there are two employees with six years of experience in the population.
Question 11c
2 4
2 6 (2x)
2 7
2 8
4 6 (2x)
4 7
4 8
6 6
6 7 (2x)
6 8 (2x)
7 8
Question 11d
Sample mean | Probability of sample mean |
3.0 | 1/15 |
4.0 | 2/15 |
4.5 | 1/15 |
5.0 | 3/15 |
5.5 | 1/15 |
6.0 | 2/15 |
6.5 | 2/15 |
7.0 | 2/15 |
7.5 | 1/15 |
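The sampling distribution above can be generated by enumerating all possible samples. The following Python sketch uses `itertools.combinations`, which treats the two employees with six years of experience as distinct, so duplicated samples are counted twice automatically.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

experience = [2, 4, 6, 6, 7, 8]

# All samples of two employees drawn without replacement (order ignored).
samples = list(combinations(experience, 2))

# Count how often each sample mean occurs; Fraction keeps the means exact.
counts = Counter(Fraction(a + b, 2) for a, b in samples)

for mean in sorted(counts):
    print(f"{float(mean):.1f} | {counts[mean]}/{len(samples)}")

# The mean of the sampling distribution equals the population mean.
mean_of_means = sum(mean * c for mean, c in counts.items()) / len(samples)
print(float(mean_of_means))   # 5.5
```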
Question 12
The central limit theorem shows that, if the sample size is large enough, the mean of a random sample drawn from a population with any probability distribution will be approximately normally distributed with mean μ and variance σ²/n.
Question 13a
100
Question 13b
σ²/n = 15/100 = 0.15
Question 13c
According to the central limit theorem, with a sample size as large as n = 100 the sampling distribution of the sample mean is approximately normal, even though the population itself is left-skewed.
How to obtain estimates for a single population? - ExamTests 7
Questions
Question 1
Let x1, x2, ..., xn be a random sample from a normally distributed population with mean μ and variance σ². Assuming that a population is normally distributed with a very large population size compared to the sample size, should the sample mean or the sample median be used to estimate the population mean?
Question 2
Give one advantage of the median over the mean for estimating a population mean.
Question 3
Give one disadvantage of the median in comparison to the mean for estimating a population mean.
Question 4
Which two properties should an estimator possess?
Question 5a
Suppose that shopping times for customers at a local mall follow a normal distribution. The population standard deviation is equal to 20 minutes. A random sample of 64 shoppers in the local grocery store has a mean time of 75 minutes. What is the standard error?
Question 5b
What is the margin of error?
Question 5c
What is the 95% confidence interval for the population mean μ?
Question 5d
Give an interpretation of this confidence interval.
Question 6
How can the margin of error be reduced?
Question 7
What distribution is used when the population variance is known?
Question 8
What distribution is used when the population variance is unknown?
Question 9
Find the standard error for n = 17 and s = 16.
Question 10
Find the upper critical value of student's t distribution with v = 23 degrees of freedom for α = 0.05.
Question 11a
From a random sample of 344 employees, it was found that 261 were in favor of a modified bonus plan. What is the sample proportion?
Question 11b
What is the reliability factor for a 90% confidence interval?
Question 11c
What is the margin of error for a 90% confidence interval?
Question 11d
Provide the 90% confidence interval.
Question 11e
Interpret the 90% confidence level.
Question 12
What is the number that is exceeded with probability 0.10 by a chi-square random variable with 4 degrees of freedom?
Question 13
What is the number that is exceeded with probability 0.05 by a chi-square random variable with 18 degrees of freedom?
Question 14
The following information is provided: n = 25, s² = 100. What are the critical values for a 95% confidence interval with α = 0.05?
Question 15
Use the information provided in the previous question. Find the 95% confidence interval for the population variance.
Question 16a
Suppose there are 1,395 secondary schools in the Netherlands. From a simple random sample of 400 of these schools, it was found that the sample mean enrollment during the past year in biology courses was 320.8 students, and the sample standard deviation was found to be 149.7 students. What is the point estimate for the population total, Nμ?
Question 16b
Find the corresponding 99% confidence interval for this population total.
Question 17a
From a simple random sample of 400 of the 1,395 schools in our population, it is found that biology was a two-semester course in 141 of the sampled schools. Estimate the proportion of all schools for which the biology course is two semesters long.
Question 17b
Provide the confidence interval for the proportion of all schools for which the biology course is two semesters long.
Question 18
Suppose we have: ME = 0.50; σ = 1.8; and zα/2 = z0.005 = 2.576. What is the required sample size for a 99% confidence interval?
Question 19
It is given that ME = 0.06 and zα/2 = z0.025 = 1.96. What is the required sample size?
Question 20
Suppose that an opinion survey is conducted about the presidential election. The survey was said to have a 3% margin of error. The implication is that a 95% confidence interval for the population proportion holding a particular opinion is the sample proportion plus or minus 3%. How many citizens of voting age need to be sampled to obtain this 3% margin of error?
Question 21
Suppose that a simple random sample of the 1,395 Dutch secondary schools is taken. Whatever the true proportion, a 95% confidence interval must extend no further than 0.04 on each side of the sample proportion. How many sample observations should be taken?
Answer indication
Question 1
The sample mean should be used. For a normally distributed population, both the sample mean and the sample median are unbiased estimators of the population mean, but the sample mean has the smaller variance and is therefore the more efficient estimator.
Question 2
The median gives less weight to extreme observations and, thus, is less sensitive to outliers.
Question 3
The relative efficiency of the median is lower than that of the mean.
Question 4
Unbiasedness and being the most efficient.
Question 5a
Standard error = σ/√n = 20/√64 = 2.5
Question 5b
The margin of error = zα/2 * (σ/√n) = 1.96*2.5 = 4.9
Question 5c
The 95% confidence interval runs from 75 - 4.9 to 75 + 4.9, that is: [70.1; 79.9].
Question 5d
In the long run, 95% of the intervals found in this manner contain the true value of the population mean.
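The three steps for Question 5 (standard error, margin of error, confidence interval) can be sketched in Python as follows. The reliability factor is taken from `statistics.NormalDist().inv_cdf`, which gives about 1.96 for a 95% confidence level; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, xbar, conf = 20, 64, 75, 0.95

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # reliability factor, about 1.96
se = sigma / sqrt(n)                           # standard error
me = z * se                                    # margin of error
lower, upper = xbar - me, xbar + me

print(round(se, 2), round(me, 2), round(lower, 1), round(upper, 1))
# 2.5 4.9 70.1 79.9
```

Changing `conf` or `n` in this sketch shows directly how the margin of error responds to the confidence level and the sample size.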
Question 6
Decrease the population standard deviation, or increase the sample size, or decrease the confidence level.
Question 7
The standard normal distribution (z distribution).
Question 8
Student's t distribution.
Question 9
Standard error = s/√n = 16/√17 = 3.88
Question 10
Use Table 8 (Appendix) to find that the upper critical value is 1.714.
Question 11a
\[ \hat{p} = 261/344 = 0.759 \]
Question 11b
\[ z_{\alpha/2} = z_{0.05} = 1.645\]
Question 11c
\[ 1.645 \sqrt{\frac{(0.759)(0.241)}{344}} = 0.038 \]
Question 11d
0.759 +/- 0.038 = [0.721; 0.797]
Question 11e
Imagine taking a very large number of independent random samples of size n = 344 from this population, and, calculating a 90% confidence interval for each sample result. Then, the confidence level of the interval implies that in the long run 90% of the intervals found in this manner contain the true value of the population proportion.
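The steps of Question 11 (point estimate, margin of error, interval) can be sketched as follows. The counts 261 and 344 and the value z = 1.645 are taken from the answers above; the function name `proportion_interval` is hypothetical.

```python
import math

def proportion_interval(successes, n, z):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    me = z * math.sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat, me, p_hat - me, p_hat + me

p_hat, me, lower, upper = proportion_interval(successes=261, n=344, z=1.645)
print(f"p_hat = {p_hat:.3f}, ME = {me:.3f}, CI = [{lower:.3f}; {upper:.3f}]")
```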
Question 12
7.779
Question 13
28.869
Question 14
\[ \chi^{2}_{n-1,1-\alpha/2} = \chi^{2}_{24,0.975} = 12.401 \]
\[ \chi^{2}_{n-1,\alpha/2} = \chi^{2}_{24,0.025} = 39.364 \]
Question 15
\[ LCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,\alpha/2} } = \frac{(24)(100)}{39.364} = 60.97 \]
\[ UCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,1 - \alpha/2} } = \frac{(24)(100)}{12.401} = 193.53 \]
Hence, the 95% confidence interval is: [60.97; 193.53]
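As a check on Question 15, the interval for the variance can be recomputed from the two chi-square table values. This sketch takes n = 25 (so that n - 1 = 24) and s² = 100 from the formulas above and passes the table values in directly instead of looking them up; `variance_interval` is an illustrative name.

```python
def variance_interval(n, s2, chi2_upper, chi2_lower):
    """Confidence interval for a population variance.

    chi2_upper is the chi-square value with alpha/2 in the upper tail,
    chi2_lower the value with 1 - alpha/2 in the upper tail.
    """
    lcl = (n - 1) * s2 / chi2_upper
    ucl = (n - 1) * s2 / chi2_lower
    return lcl, ucl

lcl, ucl = variance_interval(n=25, s2=100, chi2_upper=39.364, chi2_lower=12.401)
print(f"95% CI for the variance: [{lcl:.2f}; {ucl:.2f}]")
```

Note that the larger chi-square value produces the lower limit, which is why the interval is not symmetric around s².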
Question 16a
Nx̄ = (1,395)(320.8) = 447,516. Thus, we estimate a total of 447,516 students to be enrolled in biology courses.
Question 16b
\[ N\hat{\sigma}_{\bar{x}} = \frac{Ns}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{(1,395)(149.7)}{\sqrt{400}} \sqrt{\frac{995}{1,394}} = 8,821.6 \]
Because the sample size is large, we can use the central limit theorem with zα/2 = 2.58 for a 99% confidence interval. Hence:
\[ N\bar{x} \pm z_{\alpha/2} N \hat{\sigma}_{\bar{x}} \]
\[ 447,516 \pm 2.58(8,821.6) \]
\[ 447,516 \pm 22,760 \]
Thus, the 99% confidence interval runs from 424,756 to 470,276 students.
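The estimate of the population total and its standard error, including the finite population correction, can be reproduced with a few lines. The inputs (N = 1,395, n = 400, sample mean 320.8, s = 149.7, z = 2.58) are the ones used in the answer to Question 16; the function name `total_interval` is an assumption made for this sketch.

```python
import math

def total_interval(N, n, x_bar, s, z):
    """Point estimate, standard error and margin of error for a population total."""
    total = N * x_bar
    fpc = math.sqrt((N - n) / (N - 1))          # finite population correction
    se_total = N * s / math.sqrt(n) * fpc
    me = z * se_total
    return total, se_total, me

total, se_total, me = total_interval(N=1395, n=400, x_bar=320.8, s=149.7, z=2.58)
print(f"total = {total:,.0f}, SE = {se_total:,.1f}, ME = {me:,.0f}")
print(f"99% CI: [{total - me:,.0f}; {total + me:,.0f}]")
```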
Question 17a
N = 1,395; n = 400.
\[ \hat{p} = \frac{141}{400} = 0.3525 \]
The point estimate of the population proportion P is simply equal to this sample proportion, that is: 0.3525.
Question 17b
\[ \hat{\sigma}^{2}_{\hat{p}} = \frac{\hat{p} (1 - \hat{p})}{n - 1} \left( \frac{N - n}{N - 1} \right) = \frac{(0.3525)(0.6475)}{399} \left( \frac{995}{1,394} \right) = 0.0004083 \]
so
\[ \hat{\sigma}_{\hat{p}} = \sqrt{0.0004083} = 0.0202 \]
For a 90% confidence interval: za/2 = 1.645.
\[ ME = z_{\alpha/2} \hat{\sigma}_{\hat{p}} = 1.645(0.0202) ≅ 0.0332 \]
Thus, the 90% confidence interval is 0.3525 +/- 0.0332, that is, from 31.93% to 38.57%.
Question 18
\[ n = \frac{z^{2}_{\alpha/2} \sigma^{2}}{ME^{2}} = \frac{ (2.576)^{2} (1.8)^{2} }{(0.5)^{2}} \approx 86 \]
Question 19
\[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{0.25(1.96)^{2}}{(0.06)^{2}} = 266.78 \approx 267 \]
Question 20
\[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{(0.25)(1.96)^{2}}{(0.03)^{2}} = 1067.11 \]
Rounding up to the next whole number gives n = 1068.
Question 21
\[ 1.96 \sigma_{\hat{p}} = 0.04 \]
\[ \sigma_{\hat{p}} = 0.020408 \]
\[ n_{max} = \frac{0.25N}{(N - 1) \sigma^{2}_{\hat{p}} + 0.25 } = \frac{(0.25)(1,395)}{(1,394)(0.020408)^{2} + 0.25} = 419.88 \approx 420 \]
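The four sample-size calculations in Questions 18 to 21 follow three formulas, which can be sketched as small functions. The function names are illustrative; the inputs are the values given in the questions. Each result is rounded up, because a fractional observation cannot be sampled.

```python
import math

def n_for_mean(z, sigma, me):
    """Sample size for estimating a mean with known sigma."""
    return math.ceil(z**2 * sigma**2 / me**2)

def n_for_proportion(z, me):
    """Worst-case sample size for a proportion (uses p(1 - p) <= 0.25)."""
    return math.ceil(0.25 * z**2 / me**2)

def n_for_proportion_finite(N, z, me):
    """Worst-case sample size for a proportion in a finite population of size N."""
    sigma_p = me / z   # required standard error of the sample proportion
    return math.ceil(0.25 * N / ((N - 1) * sigma_p**2 + 0.25))

n18 = n_for_mean(z=2.576, sigma=1.8, me=0.50)
n19 = n_for_proportion(z=1.96, me=0.06)
n20 = n_for_proportion(z=1.96, me=0.03)
n21 = n_for_proportion_finite(N=1395, z=1.96, me=0.04)
print(n18, n19, n20, n21)
```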
How to estimate parameters for two populations? - ExamTests 8
Questions
Question 1a
The following information is provided for a dependent random sample from two normally distributed populations:
\[ n = 11 \hspace{3mm} \bar{d} = 28.5 \hspace{3mm} s_{d} = 3.3 \]
Find the 98% confidence interval for the difference between the means of the two populations.
Question 1b
What is the margin of error for a 98% confidence interval for the difference between the means of the two populations?
Question 1c
What do you conclude based on the confidence interval found in question 1a?
Question 2a
Consider the following data:
Before | 6 | 12 | 8 | 10 | 6 |
After | 8 | 14 | 9 | 13 | 7 |
What type of dependent sample is depicted here?
Question 2b
What is the sample mean of the differences?
Question 2c
It is given that, for a sample of n = 140 pairs, the mean difference is equal to 7.7 with standard deviation sd = 43.68901. Compute the 95% confidence interval using the normal approximation.
Question 3a
An educational study is conducted to examine the effectiveness of a mathematics reading program for elementary-school-age children. Each child was given a pretest and a posttest. Higher scores indicate improvement in mathematics. From a very large population, a random sample was drawn. The data obtained from this sample are provided in the table below. What is the mean difference score?
Child | Pretest Score | Posttest score |
1 2 3 4 5 6 7 | 40 36 32 38 33 | 48 42 36 |
Question 3b
What is the standard deviation of the difference scores?
Question 3c
Find the t value corresponding to a 95% confidence interval.
Question 3d
Compute a 95% confidence interval.
Question 3e
Can we conclude, based on this 95% confidence interval, that there is a significant improvement in mathematics?
Question 3f
Compute a 95% confidence interval using the normal approximation.
Question 3g
What do we conclude based on this interval?
Question 4a
A study regarding students' GPA was conducted. From a very large university, independent random samples of 120 students majoring in economics and 90 students majoring in finance were selected. The mean GPA for the random sample of economics majors was found to be 3.08. The mean GPA for the random sample of finance majors was found to be 2.88. From similar past studies, the population standard deviation is 0.42 for the economics majors and 0.64 for the finance majors. Denote the population mean for economics by μx and the population mean for finance by μy. With which scenario are we dealing here?
- Population variances known.
- Population variances unknown, but assumed to be equal.
- Population variances unknown, and not assumed to be equal.
Question 4b
Compute the 95% confidence interval for the difference score for the information provided in the previous question.
Question 4c
What do we conclude based on this 95% confidence interval (from question 4b)?
Question 5a
Consider the following data:
X | 100 | 125 | 135 | 128 | 140 | 142 | 128 | 137 | 156 | 142 |
Y | 95 | 87 | 100 | 75 | 110 | 105 | 85 | 95 |
Suppose these are independent samples with unknown variances, but the variances are assumed to be equal. Give nx, ny, x̄, ȳ, σ2x and σ2y.
Question 5b
Compute the pooled variance.
Question 5c
What are the degrees of freedom?
Question 5d
Find the t value corresponding to a 95% confidence interval.
Question 5e
Compute a 95% confidence interval.
Question 6a
Assuming equal population variances, determine the number of degrees of freedom for:
nx = 16; s2x = 30
ny = 9; s2y = 36
Question 6b
Compute the pooled sample variance for the information provided in the previous question.
Question 7a
Assuming equal population variances, determine the number of degrees of freedom for:
nx = 12; s2x = 30
ny = 14; s2y = 36
Question 7b
Compute the pooled sample variance for the information provided in the previous question.
Question 8a
Assuming equal population variances, determine the number of degrees of freedom for:
nx = 20; s2x = 16
ny = 8; s2y = 25
Question 8b
Compute the pooled sample variance for the information provided in the previous question.
Question 9
The following information is provided:
\[ n_{x} = 120; \hat{p}_{x} = 0.892 \]
\[ n_{y} = 141; \hat{p}_{y} = 0.518 \]
Compute a 95% confidence interval for the population difference (Px - Py).
Question 10
Calculate the margin of error for a 95% confidence interval with:
\[ n_{x} = 300; \hat{p}_{x} = 0.62 \]
\[ n_{y} = 350; \hat{p}_{y} = 0.72 \]
Question 11
Calculate the margin of error for a 95% confidence interval with:
\[ n_{x} = 100; \hat{p}_{x} = 0.44 \]
\[ n_{y} = 150; \hat{p}_{y} = 0.55 \]
Answer indication
Question 1a
\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} = 28.5 \pm 2.764 \frac{3.3}{\sqrt{11}} = 28.5 \pm 2.7502 \]
The 98% confidence interval is: [25.75; 31.25].
Question 1b
ME = 2.7502
Question 1c
Based on these sample data we conclude that there is sufficient evidence to suggest that there is a significant difference between the two populations.
Question 2a
Repeated measurements
Question 2b
\[ \bar{d} = \frac{2 + 2 + 1 + 3 + 1}{5} = 1.8 \]
Question 2c
Using the normal approximation we have tn-1,a/2 = t139,0.025 ≅ 1.96.
\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
\[ 7.7 \pm 1.96 \frac{43.68901}{\sqrt{140}} \]
\[ 7.7 \pm 7.2 \]
This results in the following 95% confidence interval: [0.5; 14.9]
Question 3a
\[ \bar{d} = \frac{8 + 6 -2 + 5 + 10}{5} = 5.4 \]
Question 3b
sd ≅ 4.56
Question 3c
t4,0.025 = 2.776
Question 3d
\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
\[ 5.4 \pm 2.776 \frac{4.56}{\sqrt{5}} \]
\[ 5.4 \pm 5.6620 \]
The 95% confidence interval is: [-0.26; 11.06]
Question 3e
No, because the zero is within the range of the confidence interval. Thus, there is insufficient evidence to conclude that there is a significant difference.
Question 3f
Using the normal approximation, we replace t by z, that is: z = 1.96.
\[ 5.4 \pm 1.96 \frac{4.56}{\sqrt{5}} \]
\[ 5.4 \pm 3.9976 \]
The 95% confidence interval is: [1.40; 9.40]
Question 3g
Based on the 95% confidence interval computed by the normal approximation, we would conclude that there is a significant improvement in the mathematics scores. Note, however, that we are dealing with a very small dependent sample here (n = 5 repeated measures), so the normal approximation is not a valid procedure. It is, however, important to see the difference that the choice of distribution can make for the statistical inferences.
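The contrast between the t-based and z-based intervals in Question 3 can be reproduced from the five difference scores listed in the answer to Question 3a. This is a sketch using only the Python standard library; the critical values 2.776 and 1.96 are the table values quoted above, not computed here.

```python
import math
import statistics

# Difference scores (posttest minus pretest) as listed in the answer to Question 3a
differences = [8, 6, -2, 5, 10]

n = len(differences)
d_bar = statistics.mean(differences)     # mean difference
s_d = statistics.stdev(differences)      # sample standard deviation (n - 1 in the denominator)
se = s_d / math.sqrt(n)

me_t = 2.776 * se   # t with 4 degrees of freedom, 95% interval
me_z = 1.96 * se    # normal approximation, 95% interval

print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}")
print(f"t interval: [{d_bar - me_t:.2f}; {d_bar + me_t:.2f}]")
print(f"z interval: [{d_bar - me_z:.2f}; {d_bar + me_z:.2f}]")
```

With only five pairs, the t interval includes zero while the z interval does not, which is exactly the discrepancy discussed in Questions 3e and 3g.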
Question 4a
A. population variances known.
Question 4b
\[ (\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{\sigma^{2}_{x}}{n_{x}} + \frac{\sigma^{2}_{y}}{n_{y}}} \]
\[ (3.08 - 2.88) \pm 1.96 \sqrt{ \frac{(0.42)^{2}}{120} + \frac{(0.64)^{2}}{90} } = 0.20 \pm 0.1521 \]
Thus, the 95% interval extends from 0.0479 to 0.3521
Question 4c
The confidence interval does not contain zero, thus we conclude that there is a significant difference in the mean GPA of students majoring in economics and students majoring in finance. More precisely, the mean GPA of students majoring in economics is higher than the mean GPA of students majoring in finance.
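The interval in Question 4b can be verified with the sketch below. It uses the sample means and sample sizes from the question and the population standard deviations 0.42 and 0.64 that appear in the worked formula; `two_sample_z_interval` is a hypothetical helper name.

```python
import math

def two_sample_z_interval(x_bar, y_bar, sigma_x, sigma_y, n_x, n_y, z):
    """CI for a difference in means, independent samples, known population variances."""
    diff = x_bar - y_bar
    me = z * math.sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    return diff, me, diff - me, diff + me

diff, me, lower, upper = two_sample_z_interval(
    x_bar=3.08, y_bar=2.88, sigma_x=0.42, sigma_y=0.64, n_x=120, n_y=90, z=1.96
)
print(f"difference = {diff:.2f}, ME = {me:.4f}, CI = [{lower:.4f}; {upper:.4f}]")
```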
Question 5a
nx = 10; x̄ = 133.30; σ2x = 218.0111
ny = 8; ȳ = 94.00; σ2y = 129.4286
Question 5b
\[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} = \frac{(10 - 1)(218.0111) + (8 - 1)(129.4286) }{10 + 8 - 2} = 179.2563 \]
Question 5c
The degrees of freedom are given by: nx + ny - 2 = 10 + 8 - 2 = 16
Question 5d
t16,0.025 = 2.12
Question 5e
\[ (\bar{x} - \bar{y}) \pm t_{n_{x} + n_{y} - 2, \alpha/2} \sqrt{\frac{s^{2}_{p}}{n_{x}} + \frac{s^{2}_{p}}{n_{y}}} \]
\[ 39.3 \pm (2.12) \sqrt{ \frac{179.2563}{10} + \frac{179.2563}{8} } \]
\[ 39.3 \pm 13.46 \]
Thus, the 95% confidence interval is: [25.84; 52.76]
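The whole of Question 5 can be recomputed from the raw X and Y data given in Question 5a. The sketch below uses the standard library's `statistics` module for the means and sample variances and takes the critical value 2.12 from the answer to Question 5d; variable names are illustrative.

```python
import math
import statistics

x = [100, 125, 135, 128, 140, 142, 128, 137, 156, 142]
y = [95, 87, 100, 75, 110, 105, 85, 95]

n_x, n_y = len(x), len(y)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s2_x, s2_y = statistics.variance(x), statistics.variance(y)   # sample variances

# Pooled variance, assuming equal population variances
df = n_x + n_y - 2
s2_p = ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / df

t_crit = 2.12   # t with 16 degrees of freedom, upper-tail area 0.025 (table value)
me = t_crit * math.sqrt(s2_p / n_x + s2_p / n_y)
diff = x_bar - y_bar

print(f"s2_x = {s2_x:.4f}, s2_y = {s2_y:.4f}, s2_p = {s2_p:.4f}, df = {df}")
print(f"95% CI: [{diff - me:.2f}; {diff + me:.2f}]")
```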
Question 6a
df = nx + ny - 2 = 16 + 9 - 2 = 23
Question 6b
\[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} \]
\[ s^{2}_{p} = \frac{ (16-1)30 + (9 - 1)36}{16 + 9 - 2} = \frac{738}{23} = 32.09 \]
Question 7a
df = nx + ny - 2 = 12 + 14 - 2 = 24
Question 7b
\[ s^{2}_{p} = \frac{ (12-1)30 + (14 - 1)36}{12 + 14 - 2} = \frac{798}{24} = 33.25 \]
Question 8a
df = nx + ny - 2 = 20 + 8 - 2 = 26
Question 8b
\[ s^{2}_{p} = \frac{ (20-1)16 + (8 - 1)25}{20 + 8 - 2} = \frac{479}{26} = 18.42 \]
Question 9
\[ (\hat{p}_{x} - \hat{p}_{y}) \pm z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ (0.892 - 0.518) \pm 1.96 \sqrt{ \frac{(0.892)(0.108)}{120} + \frac{(0.518)(0.482)}{141} } \]
From this, it follows that the 95% confidence interval runs from 0.274 to 0.473.
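The interval for the difference between two proportions in Question 9 can be checked as follows. The inputs are the sample proportions and sample sizes from the question, with z = 1.96; `proportion_difference_interval` is an assumed name for this sketch.

```python
import math

def proportion_difference_interval(p_x, n_x, p_y, n_y, z):
    """Large-sample CI for the difference between two population proportions."""
    diff = p_x - p_y
    me = z * math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    return diff, me, diff - me, diff + me

diff, me, lower, upper = proportion_difference_interval(
    p_x=0.892, n_x=120, p_y=0.518, n_y=141, z=1.96
)
print(f"difference = {diff:.3f}, ME = {me:.4f}, CI = [{lower:.4f}; {upper:.4f}]")
```

The same function returns the margin of error asked for in Questions 10 and 11 when called with those sample sizes and proportions.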
Question 10
\[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ 1.96 \sqrt{ \frac{(0.62)(0.38)}{300} + \frac{(0.72)(0.28)}{350} } \]
ME = 0.0723
Question 11
\[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ 1.96 \sqrt{ \frac{(0.44)(0.56)}{100} + \frac{(0.55)(0.45)}{150} } \]
ME = 0.1257
How to develop hypothesis testing procedures for a single population? - ExamTests 9
Questions
Question 1a
Kees wants to use the results of a random sample market survey to seek strong evidence that his brand of cereal has more than 20% of the total market. Formulate the null hypothesis and alternative hypothesis using P as the population proportion.
Question 1b
Is the alternative hypothesis you formulated a one-sided or two-sided composite alternative hypothesis?
Question 2
A car factory has proposed a process to monitor the diameter of pistons on a regular schedule. They want to test whether the diameter is equal to 3800. Formulate the null hypothesis and alternative hypothesis.
Question 3
What is a type I error?
Question 4
What is a type II error?
Question 5a
A random sample is obtained from a population with variance σ2 = 625. The sample mean is computed. Test the null hypothesis H0: μ = 100 versus the alternative hypothesis H1: μ > 100 with α = 0.05. Compute the critical value x̅c and state your decision rule regarding a sample size of n = 25.
Question 5b
Do the same for n = 16.
Question 5c
Do the same for n = 44.
Question 5d
Do the same for n = 32.
Question 6a
A random sample of n = 25 is obtained from a population with known variance. The sample mean is computed. Test the null hypothesis: H0: μ = 120 versus the alternative hypothesis H1: μ > 120 with α = 0.10. Compute the critical value x̅c and state your decision rule regarding the population variance σ2 = 196.
Question 6b
Do the same for σ2 = 625.
Question 6c
Do the same for σ2 = 900.
Question 6d
Do the same for σ2 = 500.
Question 7
Test the hypotheses: H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 108; s = 20.
Question 8
Test the hypotheses: H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 104; s = 10.
Question 9
Test the hypotheses: H0: μ = 100 and H1: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 96; s = 10.
Question 10
Mention four conditions that will raise the power function.
Question 11
Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.56 to be β = 0.31 using a significance level of α = 0.05. What is the power?
Question 12
Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.66 to be β = 0.25 using a significance level of α = 0.10. What is the power?
Question 13a
A random sample of 20 products is obtained, and the weight of each product is measured. The sample variance is computed to be 6.62. Using a significance level of α = 0.05, we test whether the population variance of the weights exceeds the standard of 4. Formulate the null hypothesis and alternative hypothesis.
Question 13b
What are the degrees of freedom?
Question 13c
What is the critical value?
Question 13d
What is the test statistic?
Question 13e
Based on these sample data, can we reject the null hypothesis?
Question 14a
Suppose we are testing the following hypotheses:
H0: μ ≤ 100
H1: μ > 100
using a random sample of n = 49 and a probability of type I error equal to 0.05.
Suppose the population variance is unknown. What distribution should you use?
Question 14b
Test the hypotheses using the following sample statistics: x̅ = 108; s = 20
Question 14c
Test the hypotheses using the following sample statistics: x̅ = 104; s = 10
Question 14d
Test the hypotheses using the following sample statistics: x̅ = 96; s = 10
Question 14e
Test the hypotheses using the following sample statistics: x̅ = 95; s = 8
Answer indication
Question 1a
H0: P = 0.20
H1: P > 0.20
Question 1b
A one-sided composite alternative hypothesis.
Question 2
H0: μ = 3800
H1: μ ≠ 3800
Question 3
A type I error refers to rejecting the null hypothesis, while the null hypothesis is true.
Question 4
A type II error refers to failing to reject the null hypothesis, while the null hypothesis is false.
Question 5a
For a one-sided hypothesis test with significance level α = 0.05, the value of zα = 1.645 from the standard normal table. The variance is 625, thus the standard deviation is √625 = 25.
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{25}) = 108.23 \]
The decision rule is: reject H0 if x̅ > 108.23
Question 5b
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{16}) = 110.28 \]
The decision rule is: reject H0 if x̅ > 110.28
Question 5c
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{44}) = 106.20 \]
The decision rule is: reject H0 if x̅ > 106.20
Question 5d
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{32}) = 107.27 \]
The decision rule is: reject H0 if x̅ > 107.27
Question 6a
For a one-sided hypothesis test with significance level α = 0.10, the value of zα = 1.282 from the standard normal table. The variance is 196, thus the standard deviation is √196 = 14.
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (14 / \sqrt{25}) = 123.59 \]
The decision rule is: reject H0 if x̅ > 123.59
Question 6b
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{625} / \sqrt{25}) = 126.41 \]
The decision rule is: reject H0 if x̅ > 126.41
Question 6c
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{900} / \sqrt{25}) = 127.69 \]
The decision rule is: reject H0 if x̅ > 127.69
Question 6d
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{500} / \sqrt{25}) = 125.73 \]
The decision rule is: reject H0 if x̅ > 125.73
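The critical value of the sample mean in an upper-tail z test can also be computed without a table. The sketch below obtains zα from the standard library's `statistics.NormalDist` instead of the printed table, so results may differ from the hand calculation in the third decimal; the function name is an illustrative assumption. It is shown for Question 6a (μ0 = 120, σ² = 196, n = 25, α = 0.10).

```python
import math
from statistics import NormalDist

def upper_tail_critical_mean(mu0, sigma2, n, alpha):
    """Critical sample mean for H0: mu = mu0 versus H1: mu > mu0 with known variance."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-tail standard normal quantile
    return mu0 + z_alpha * math.sqrt(sigma2) / math.sqrt(n)

x_c = upper_tail_critical_mean(mu0=120, sigma2=196, n=25, alpha=0.10)
print(f"Reject H0 if the sample mean exceeds {x_c:.2f}")
```

Calling the function with a larger variance or a smaller sample size shows how the rejection threshold moves away from μ0.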
Question 7
t30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{108 - 100}{20 / \sqrt{31}} = 2.23 \]
Thus, t > t30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.
Question 8
t30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{104 - 100}{10 / \sqrt{31}} = 2.23 \]
Thus, t > t30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.
The t value is the same as in the previous question, because both the numerator and the denominator are half of their previous values, which yields the same ratio.
Question 9
t30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{96 - 100}{10 / \sqrt{31}} = -2.23 \]
Thus, t < t30,0.05. Because we are testing a one-sided alternative hypothesis with H1: μ > μ0, here, we cannot reject the null hypothesis (be aware that the sample mean is lower than the parameter of interest, rather than higher than the parameter).
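Questions 7 to 9 apply the same one-sample t statistic to three sets of sample statistics, which makes them easy to check in a loop. The critical value 1.697 is the table value quoted above; `one_sample_t` and the `cases` dictionary are names chosen for this sketch.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """t statistic for testing H0: mu = mu0 with unknown population variance."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_crit = 1.697   # t with 30 degrees of freedom, upper-tail area 0.05 (table value)
cases = {"Question 7": (108, 20), "Question 8": (104, 10), "Question 9": (96, 10)}

results = {}
for label, (x_bar, s) in cases.items():
    t = one_sample_t(x_bar, mu0=100, s=s, n=31)
    results[label] = (t, t > t_crit)   # upper-tail test: reject only if t > t_crit
    print(f"{label}: t = {t:.2f}, reject H0: {t > t_crit}")
```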
Question 10
(1) the true mean is farther from the hypothesized mean μ0; (2) the significance level is higher; (3) the population variance is lower; (4) the sample size is larger.
Question 11
Power = 1 - β = 1 - 0.31 = 0.69
Question 12
Power = 1 - β = 1 - 0.25 = 0.75
Question 13a
H0: σ2 ≤ σ20 = 4
H1: σ2 > 4
Question 13b
df = n - 1 = 20 - 1 = 19
Question 13c
For this test with a significance level of α = 0.05 and 19 degrees of freedom, the critical value of the chi-square variable is 30.144 (see Appendix Table 7 of the book).
Question 13d
\[ \frac{(n - 1)s^{2}}{\sigma^{2}_{0}} = \frac{(20 - 1)(6.62)}{4} = 31.445 \]
Question 13e
31.445 > 30.144. Therefore, we can reject the null hypothesis and conclude that the variability of the weight of the products exceeds the standard.
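The chi-square test for a variance in Question 13 reduces to one ratio and one comparison. The sketch below uses the sample size, sample variance and hypothesized variance from the answers above and passes in the table value 30.144 as the critical value; the function name is hypothetical.

```python
def variance_test_statistic(n, s2, sigma2_0):
    """Chi-square statistic for testing a population variance against sigma2_0."""
    return (n - 1) * s2 / sigma2_0

chi2_stat = variance_test_statistic(n=20, s2=6.62, sigma2_0=4)
chi2_crit = 30.144   # chi-square with 19 degrees of freedom, upper-tail area 0.05 (table value)
reject = chi2_stat > chi2_crit
print(f"chi-square = {chi2_stat:.3f}, reject H0: {reject}")
```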
Question 14a
Student's t distribution
Question 14b
The critical t value is: tc = t48,0.05 = 1.677
\[ t = \frac{108 - 100}{20 / \sqrt{49}} = 2.8 \]
t > tc, therefore we can reject the null hypothesis.
Question 14c
\[ t = \frac{104 - 100}{10 / \sqrt{49}} = 2.8 \]
t > tc, therefore we can reject the null hypothesis.
Question 14d
\[ t = \frac{96 - 100}{10 / \sqrt{49}} = -2.8 \]
t < tc, yet we are testing t > tc. Therefore we cannot reject the null hypothesis ("wrong side").
Question 14e
\[ t = \frac{95 - 100}{8 / \sqrt{49}} = -4.38 \]
t < tc, yet we are testing t > tc. Therefore we cannot reject the null hypothesis ("wrong side").
What test procedures are there for testing the difference between two populations? - ExamTests 10
Questions
Question 1a
A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ1 and the mean of production process 2 is defined as μ2. The null and alternative hypotheses are as follows: H0: μ1 – μ2 = 0 and H1: μ1 – μ2 > 0. From the populations, a random sample is drawn of 25 matched pairs. The sample means are respectively 50 and 60 for populations 1 and 2. Give the decision rule using a probability of type I error α = 0.05.
Question 1b
Can you reject the null hypothesis if the sample standard deviation of the difference is 20?
Question 1c
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 30?
Question 1d
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 15?
Question 1e
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 40?
Question 2a
A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ1 and the mean of production process 2 is defined as μ2. The null and alternative hypotheses are as follows: H0: μ1 – μ2 = 0 and H1: μ1 – μ2 < 0. From the populations, a random sample is drawn of 25 matched pairs. The standard deviation of the difference between the sample means is found to be 25. Give the decision rule using a probability of type I error α = 0.05.
Question 2b
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 56 and 50 for populations 1 and 2?
Question 2c
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 59 and 50 for populations 1 and 2?
Question 2d
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 56 and 48 for populations 1 and 2?
Question 2e
Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 54 and 50 for populations 1 and 2?
Question 3a
A researcher wants to conduct a hypothesis test for the difference in means between two populations with independent samples. The following information is provided:
nx = 25; x̄ = 115; σ2x = 625
ny = 25; ȳ = 100; σ2y = 400
Compute the test statistic.
Question 3b
The researcher decides to test at a significance level of α = 0.05. Determine the critical z value.
Question 3c
Compare the critical z value to the test statistic. Can the researcher reject the null hypothesis?
Question 4
How large should the sample size be in order to obtain a good approximation if we replace the population variances with the sample variances?
Question 5a
Use the following information:
nx = 25; x̄ = 1078; sx = 633
ny = 25; ȳ = 908.2; sy = 469.8
We are interested in testing the difference in population means between X and Y. The alternative hypothesis states that the mean of population 2 is larger than the mean of population 1. For this hypothesis test, we are using a significance level of α = 0.05. Note that the population variances are unknown and that the sample variances are given.
Formulate the null hypothesis and alternative hypothesis.
Question 5b
Compute the pooled variance estimate.
Question 5c
What are the degrees of freedom?
Question 5d
What is the critical value of t?
Question 5e
Compute the test statistic.
Question 5f
Provide the decision rule for this hypothesis test.
Question 6
Can the null hypothesis be rejected?
Question 7
How large should the sample size be in order to be able to use the standard normal distribution for testing the equality of two population proportions?
Question 8a
Consider the following information:
nx = 270; p̂x = 0.185
ny = 203; p̂y = 0.399
Compute the estimate of the common proportion, p̂0, under the null hypothesis.
Question 8b
Compute the test statistic.
Question 8c
Suppose we are testing with the alternative hypothesis: H1: Px < Py. For this test, we are using a significance level of α = 0.05. What is the critical value?
Question 8d
Formulate the decision rule.
Question 8e
Can we reject the null hypothesis?
Question 9a
Consider the following information:
nx = 17; s2x = 123.35
ny = 11; s2y = 8.02
What are the degrees of freedom for the F distribution?
Question 9b
Given a significance level of α = 0.02, what is the critical value of F?
Question 9c
Compute the test statistic. Can the null hypothesis be rejected?
Answer indication
Question 1a
tn-1,a = t24,0.05 = 1.711
The general decision rule here is: reject H0 if t > t24,0.05 = 1.711.
Question 1b
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{20 / \sqrt{25}} = 2.5 \]
t > t24,0.05 and, thus, we can reject the null hypothesis.
Question 1c
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{30 / \sqrt{25}} = 1.67 \]
t < t24,0.05 and, thus, we cannot reject the null hypothesis.
Question 1d
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{15 / \sqrt{25}} = 3.33 \]
t > t24,0.05 and, thus, we can reject the null hypothesis.
Question 1e
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{40 / \sqrt{25}} = 1.25 \]
t < t24,0.05 and, thus, we cannot reject the null hypothesis.
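Questions 1b to 1e differ only in the standard deviation of the differences, so they can be checked together. This sketch uses the mean difference of 10 and n = 25 from the worked answers and the table value 1.711 as the critical value; `paired_t` is an illustrative name.

```python
import math

def paired_t(d_bar, s_d, n):
    """t statistic for a matched-pairs test of H0: mean difference = 0."""
    return d_bar / (s_d / math.sqrt(n))

t_crit = 1.711   # t with 24 degrees of freedom, upper-tail area 0.05 (table value)

results = {}
for s_d in (20, 30, 15, 40):
    t = paired_t(d_bar=10, s_d=s_d, n=25)
    results[s_d] = (t, t > t_crit)
    print(f"s_d = {s_d}: t = {t:.2f}, reject H0: {t > t_crit}")
```

The output shows how a larger spread in the differences shrinks the test statistic, so that the same mean difference stops being significant.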
Question 2a
tn-1,α = t24,0.05 = 1.711, so the critical value for this lower-tail test is -1.711.
The general decision rule here is: reject H0 if t < -t24,0.05 = -1.711.
Question 2b
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-6}{25 / \sqrt{25}} = -1.2 \]
t > -t24,0.05 and, thus, we cannot reject the null hypothesis.
Question 2c
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-9}{25 / \sqrt{25}} = -1.8 \]
t < -t24,0.05 and, thus, we can reject the null hypothesis.
Question 2d
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-8}{25 / \sqrt{25}} = -1.6 \]
t > -t24,0.05 and, thus, we cannot reject the null hypothesis.
Question 2e
\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-4}{25 / \sqrt{25}} = -0.8 \]
t > -t24,0.05 and, thus, we cannot reject the null hypothesis.
Question 3a
\[ z = \frac{115 - 100}{\sqrt{\frac{625}{25} + \frac{400}{25}}} = 2.34 \]
Question 3b
z0.05 = 1.645
Question 3c
z > z0.05 thus the null hypothesis can be rejected.
Question 4
The sample size should be larger than 100.
Question 5a
H0: μx – μy = 0
H1: μx – μy < 0
Question 5b
\[ s^{2}_{p} = \frac{ (25 - 1)(633)^{2} + (25 - 1)(469.8)^{2} }{25 + 25 - 2} = 310,700 \]
Question 5c
df = 25 + 25 – 2 = 48
Question 5d
t48,0.05 = 1.677
Question 5e
\[ t = \frac{1078 - 908.2}{ \sqrt{ \frac{310,700}{25} + \frac{310,700}{25}}} = 1.08 \]
Question 5f
Reject H0 if t < -t48,0.05 = -1.677
Question 6
No, the test statistic (1.08) is not smaller than the critical value (-1.677). Thus, there is not sufficient evidence to reject the null hypothesis.
Question 7
nP0(1 – P0) > 5
Question 8a
\[ \hat{p}_{0} = \frac{n_{x} \hat{p}_{x} + n_{y} \hat{p}_{y}}{n_{x} + n_{y}} = \frac{(270)(0.185) + (203)(0.399)}{270 + 203} = 0.277 \]
Question 8b
\[ z = \frac{0.185 - 0.399}{ \sqrt{ \frac{ (0.277)(1 - 0.277) }{270} + \frac{ (0.277)(1 - 0.277) }{203} } } = -5.15 \]
Question 8c
–z0.05 = -1.645
Question 8d
Reject H0 if z < –z0.05 = -1.645
Question 8e
Yes, we can reject the null hypothesis that there is no difference in proportions between these two populations, because -5.15 < -1.645.
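The steps of Question 8 (pooled proportion, test statistic, decision) can be sketched in a few lines. The inputs are the sample sizes and sample proportions used in the answer to Question 8a, and the critical value -1.645 is the table value from Question 8c; the function name is an assumption.

```python
import math

def two_proportion_z(p_x, n_x, p_y, n_y):
    """Pooled proportion and z statistic for H0: Px = Py."""
    p0 = (n_x * p_x + n_y * p_y) / (n_x + n_y)            # pooled estimate under H0
    se = math.sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
    return p0, (p_x - p_y) / se

p0, z = two_proportion_z(p_x=0.185, n_x=270, p_y=0.399, n_y=203)
z_crit = -1.645   # lower-tail critical value for alpha = 0.05
print(f"p0 = {p0:.3f}, z = {z:.2f}, reject H0: {z < z_crit}")
```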
Question 9a
dfnumerator = (nx - 1) = 17 – 1 = 16 and dfdenominator = (ny - 1) = 11 – 1 = 10.
Question 9b
From Appendix Table 9 (in the book) it follows that: F16,10,0.01 = 4.520
Question 9c
\[ F = \frac{s^{2}_{x}}{s^{2}_{y}} = \frac{123.35}{8.02} = 15.380 \]
The test statistic F = 15.380 exceeds the critical value (4.520). Hence, the null hypothesis can be rejected in favor of the alternative hypothesis.
How to conduct a simple regression? - ExamTests 11
Questions
Question 1a
Suppose we are interested in the relationship between the number of workers (denoted by X) and the number of tables produced per hour (Y). A sample of 10 workers is provided. The following descriptive statistics are obtained:
\[Cov(x,y) = 106.93 \hspace{5mm} s^{2}_{x} = 42.01 \hspace{5mm} \bar{y} = 41.2 \hspace{5mm} \bar{x} = 21.3 \]
Compute the slope of the sample regression.
Question 1b
Compute the y-intercept for the sample regression.
Question 1c
What is the equation of the regression line?
Question 1d
If management decides to employ 25 workers, how many tables would we expect to be produced?
Question 2
The following regression equation is given: Y = 559 + 0.3815X.
What is the expected value of Y for X = 55,000.
Question 3a
Use the following regression equation:
Y = 100 + 21X
Interpret the slope of the regression line.
Question 3b
What is the change in Y when X changes by +5?
Question 3c
What is the change in Y when X changes by -7?
Question 3d
What is the predicted value of Y when X = 14?
Question 3e
What is the predicted value of Y when X = 27?
Question 3f
Does this equation prove that a change in X causes a change in Y?
Question 4a
Given the regression equation:
Y = 107 + 10X
What is the change in Y when X changes by +2?
Question 4b
What is the change in Y when X changes by -4?
Question 4c
What is the predicted value of Y when X = 15?
Question 4d
What is the predicted value of Y when X = 22?
Question 5
Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 10; ȳ = 50; sx = 80; sy = 75; rxy = 0.4; n = 60.
Question 6
Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 60; ȳ = 50; sx = 80; sy = 65; rxy = 0.7; n = 60.
Question 7
Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 90; ȳ = 100; sx = 60; sy = 70; rxy = 0.4; n = 60.
Question 8
The following information is provided: SSE = 17.89 and SST = 68.22. What is the percent explained variability?
Question 9
What absolute value of the Student's t statistic indicates a relationship between two variables when we use a two-tailed test with α= 0.05 and n > 60?
Question 10a
Given the simple regression model
\[ Y = \beta_{0} + \beta_{1}X \]
and the regression results that follow, test the null hypothesis that the slope coefficient is zero versus the alternative hypothesis that the slope coefficient differs from zero using probability of type I error rate equal to 0.005 and determine the two-sided 99% confidence interval. The following sample statistics are provided: n = 22; b1 = 0.3815; sb1 = 0.0253.
Question 10b
Consider your answer on the previous question. Based on this result, what do you conclude about the slope coefficient?
Question 11
Which four factors result in narrower prediction intervals?
Question 12a
Suppose we want to test H0: ρ = 0 against H1: ρ > 0 using the sample information: n = 49 and r = 0.43.
What is the test statistic?
Question 12b
What is the critical value if we are testing at a 0.5% significance level (α = 0.005)?
Question 12c
What do we conclude about the population correlation?
Question 13
Suppose we have the following information: n = 25. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?
Question 14
Suppose we have the following information: n = 64. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?
Question 15
Which two factors can influence the estimated regression equation?
Question 16
Points with a high leverage will have a .... standard error of the residual.
Answer indication
Question 1a
\[ b_{1} = \frac{Cov(x,y)}{s^{2}_{x}} = r \frac{s_{y}}{s_{x}} = \frac{106.93}{42.01} = 2.545 \]
Question 1b
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 41.2 - 2.545(21.3) = -13.02 \]
Question 1c
\[ \hat{y} = b_{0} + b_{1}x = -13.02 + 2.545x \]
Question 1d
\[ \hat{y} = -13.02 + 2.545(25) = 50.605 \]
Question 2
Y = 559 + 0.3815*55,000 = 21,542
Question 3a
For every one-unit increase in X, the predicted value of Y increases by 21.
Question 3b
If X changes by +5, Y changes by (21)(5) = 105
Question 3c
If X changes by -7, Y changes by (21)(-7) = -147
Question 3d
Y = 100 + (21)(14) = 394
Question 3e
Y = 100 + (21)(27) = 667
Question 3f
No, regression results summarize the information contained in the data. They do not prove causation.
Question 4a
If X changes by +2, Y changes by (10)(2) = 20
Question 4b
If X changes by -4, Y changes by (10)(-4) = -40
Question 4c
Y = 107 + (10)(15) = 257
Question 4d
Y = 107 + (10)(22) = 327
Question 5
\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{75}{80} = 0.375 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.375(10) = 46.25 \]
\[ \hat{y}_{i} = 46.25 + 0.375x_{i} \]
Question 6
\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.7 \frac{65}{80} = 0.8125 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.8125(60) = 1.25 \]
\[ \hat{y}_{i} = 1.25 + 0.8125x_{i} \]
Question 7
\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{70}{60} = 0.467 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 100 - 0.467(90) \approx 58 \]
\[ \hat{y}_{i} = 58 + 0.467x_{i} \]
Question 8
\[ R^{2} = 1 - \frac{SSE}{SST} = 1 - \frac{17.89}{68.22} = 0.738 \]
Thus, 73.8% of the variability is explained by the regression model.
Question 9
According to the rule of thumb, the absolute value of the Student's t statistic should be greater than 2.0 to indicate that there is a relationship.
Question 10a
For a 99% confidence interval we have 1 - α = 0.99, so α/2 = 0.005, and n - 2 = 22 - 2 = 20 degrees of freedom. Hence, from Appendix Table 8 (see book) it follows that:
\[ t_{n-2,\alpha/2} = t_{20,0.005} = 2.845 \]
Therefore, the 99% confidence interval is:
\[ 0.3815 - (2.845)(0.0253) < \beta_{1} < 0.3815 + (2.845)(0.0253) \]
\[ 0.3095 < \beta_{1} < 0.4535 \]
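As a quick check of the interval above, here is a minimal sketch in Python. The critical value 2.845 is taken from the table lookup in the answer rather than computed, and the function name is an illustrative assumption.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Two-sided confidence interval for a regression slope: b1 +/- t * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


# Values from Question 10a; t_crit = t(20, 0.005) is read from the t table.
lower, upper = slope_confidence_interval(b1=0.3815, s_b1=0.0253, t_crit=2.845)
print(f"{lower:.4f} < beta_1 < {upper:.4f}")

# The two-sided test rejects H0: beta_1 = 0 exactly when 0 lies outside the interval.
rejects_zero_slope = not (lower <= 0 <= upper)
print("Reject H0:", rejects_zero_slope)
```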
Question 10b
The confidence interval does not contain zero, therefore we can reject the null hypothesis and conclude that the slope coefficient is not equal to zero.
Question 11
- A larger sample size (n).
- A smaller value of s²ₑ.
- A larger dispersion of the observations of the independent variable.
- Smaller values of the quantity (xₙ₊₁ - x̅)².
Question 12a
\[ t = \frac{0.43 \sqrt{(49 - 2)}}{\sqrt{1 - (0.43)^{2}}} = 3.265 \]
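The test statistic for a zero population correlation can be reproduced with a short Python sketch. It uses the values that appear in the formula above (r = 0.43, n = 49); the function name is an illustrative assumption.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t = correlation_t_statistic(r=0.43, n=49)
print(f"t = {t:.3f} with {49 - 2} degrees of freedom")
```

The statistic grows with both |r| and n, which is why even a moderate correlation can be significant in a reasonably large sample.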
Question 12b
Since there are (n - 2) = 47 degrees of freedom, it follows from Appendix Table 8 that t47,0.005 = 2.704
Question 12c
t47,0.005 = 2.704 < t. Therefore, we can reject the null hypothesis. There is strong evidence of a positive linear relationship between the two variables. Note, however, that we cannot conclude from this result that one variable caused the other, but only that they are related.
Question 13
\[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{25}} = 0.4 \]
Question 14
\[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{64}} = 0.25 \]
Question 15
Points with a high leverage and outliers.
Question 16
Smaller.
How to conduct a multiple regression? - ExamTests 12
Questions
Question 1a
\[ \hat{y} = 12 + 5x_{1} + 6x_{2} + 2x_{3} \]
Compute the expected value of y when x1 = 11, x2 = 24, and x3 = 27.
Question 1b
Compute the expected value of y when x1 = 31, x2 = 20, and x3 = 17.
Question 1c
Compute the expected value of y when x1 = 32, x2 = 29, and x3 = 13.
Question 1d
Compute the expected value of y when x1 = 30, x2 = 26, and x3 = 29.
Question 2a
\[ \hat{y} = 10 + 5x_{1} + 4x_{2} + 2x_{3} \]
Compute the expected value of y when x1 = 20, x2 = 11, and x3 = 10.
Question 2b
Compute the expected value of y when x1 = 15, x2 = 14, and x3 = 20.
Question 2c
Compute the expected value of y when x1 = 35, x2 = 19, and x3 = 25.
Question 2d
Compute the expected value of y when x1 = 10, x2 = 17, and x3 = 30.
Question 3a
\[ \hat{y} = 10 - 2x_{1} - 14x_{2} + 6x_{3} \]
What is the change in y when x1 increases by 4?
Question 3b
What is the change in y when x3 increases by 1?
Question 3c
What is the change in y when x2 increases by 2?
Question 4
What is the fifth assumption of a multiple linear regression model?
Question 5
Compute the coefficient b1 for the regression model
\[ \hat{y}_{i} = b_{0} + b_{1}x_{1i} + b_{2}x_{2i} \]
given the following summary statistics:
rx1y = 0.80, rx2y = 0.30, rx1x2 = 0.90, sx1 = 500, sx2 = 400, sy = 100
Question 6
Compute the coefficient b2 for the regression model (using the regression model and summary statistics of question 5).
Question 7
The following data are available: n = 25; K = 2; SSE = 0.0625; SST = 0.4640.
Compute the adjusted coefficient of determination.
Question 8
When is the adjusted coefficient of determination preferred over the standard coefficient of determination?
Question 9
How is the coefficient of multiple correlation related to the multiple coefficient of determination?
Question 10a
b1 = 0.2372; sb1 = 0.0556; b2 = -0.000249; sb2 = 0.00003205.
What is the critical t statistic for a two-tailed hypothesis test with a 99% confidence interval?
Question 10b
Provide the 99% confidence interval for β1.
Question 10c
Provide the 99% confidence interval for β2.
Question 11a
A researcher is testing the influence of four independent variables on a certain dependent variable using multiple regression (n = 88). He finds that, for the complete model with four predictor variables, SSE = 1,149.14. For a multiple regression model with only two of the four predictor variables, SSE = 1,426.93. The variance estimator is s2e = 13.52. Compute the F statistic.
Question 11b
How many degrees of freedom does the F statistic have?
Question 11c
What is the critical value for F with a significance level of 0.01?
Question 11d
What is a dummy variable?
Question 12
Formulate the null hypothesis and the alternative hypothesis for testing the slope coefficient in the event of dummy variables.
Question 13
What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
\[ \hat{y} = 9 + 6x_{1} + 9x_{2} \]
Question 14
What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
\[ \hat{y} = 7 + 4x_{1} + 2x_{2} \]
Question 15
What is the model constant when the dummy variable equals 1 in the following equation, where x1 is a continuous variable and x2 is a dummy variable?
\[ \hat{y} = 4 + 4x_{1} + 8x_{2} + 9x_{1}x_{2} \]
Question 16
Consider the following equation: yi = 2(xi)^1.4
Compute the value of yi when xi = 1
Question 17
Consider the following equation: yi = 2(xi)^1.4
Compute the value of yi when xi = 2
Answer indication
Question 1a
\[ \hat{y} = 12 + (5)(11) + (6)(24) + (2)(27) = 265 \]
Question 1b
\[ \hat{y} = 12 + (5)(31) + (6)(20) + (2)(17) = 321 \]
Question 1c
\[ \hat{y} = 12 + (5)(32) + (6)(29) + (2)(13) = 372 \]
Question 1d
\[ \hat{y} = 12 + (5)(30) + (6)(26) + (2)(29) = 376 \]
Question 2a
\[ \hat{y} = 10 + (5)(20) + (4)(11) + (2)(10) = 174 \]
Question 2b
\[ \hat{y} = 10 + (5)(15) + (4)(14) + (2)(20) = 181 \]
Question 2c
\[ \hat{y} = 10 + (5)(35) + (4)(19) + (2)(25) = 311 \]
Question 2d
\[ \hat{y} = 10 + (5)(10) + (4)(17) + (2)(30) = 188 \]
Question 3a
The change in y when x1 increases by 4 is equal to (-2)(4) = -8.
Question 3b
The change in y when x3 increases by 1 is equal to (6)(1) = 6.
Question 3c
The change in y when x2 increases by 2 is equal to (-14)(2) = -28.
Question 4
There is no direct linear relationship between the independent variables.
Question 5
\[ b_{1} = \frac{ s_{y} (r_{x1y} - r_{x1x2}r_{x2y} ) }{s_{x1} (1 - r^{2}_{x1x2})} = \frac{100 (0.80 - 0.90*0.30)}{500 (1 - 0.90^{2})} = 0.56 \]
Question 6
\[ b_{2} = \frac{s_{y} (r_{x2y} - r_{x1x2} r_{x1y} ) }{s_{x2} (1 - r^{2}_{x1x2})} =
\frac{100 (0.30 - 0.90*0.80)}{400 (1 - 0.90^{2})} = -0.55 \]
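The two coefficient formulas used in Questions 5 and 6 share the same denominator term, which a small Python sketch makes explicit. The function name `two_predictor_slopes` is an illustrative assumption; the inputs are the summary statistics from Question 5.

```python
def two_predictor_slopes(s_y, s_x1, s_x2, r_x1y, r_x2y, r_x1x2):
    """Slopes b1, b2 of y-hat = b0 + b1*x1 + b2*x2 from correlations and standard deviations."""
    shared = 1 - r_x1x2 ** 2  # shrinks toward 0 as x1 and x2 become more correlated
    b1 = s_y * (r_x1y - r_x1x2 * r_x2y) / (s_x1 * shared)
    b2 = s_y * (r_x2y - r_x1x2 * r_x1y) / (s_x2 * shared)
    return b1, b2


b1, b2 = two_predictor_slopes(
    s_y=100, s_x1=500, s_x2=400, r_x1y=0.80, r_x2y=0.30, r_x1x2=0.90
)
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}")
```

With a correlation of 0.90 between the two predictors, b2 turns negative even though x2 is positively correlated with y, which illustrates how strongly correlated predictors can change the sign of a coefficient.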
Question 7
\[ \bar{R}^{2} = 1 - \frac{0.0625/22}{0.4640/24} = 0.853 \]
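The adjusted coefficient of determination divides each sum of squares by its degrees of freedom before taking the ratio. A minimal Python sketch, with an illustrative function name, reproduces the value for Question 7:

```python
def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2 for a regression with k independent variables and n observations."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


r2 = 1 - 0.0625 / 0.4640  # ordinary R^2, shown for comparison
r2_adj = adjusted_r_squared(sse=0.0625, sst=0.4640, n=25, k=2)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```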
Question 8
The adjusted coefficient of determination corrects for the fact that irrelevant independent variables will still result in a (small) reduction in the error sum of squares (SSE). Consequently, the adjusted coefficient of determination offers a better comparison between multiple regression models with different numbers of independent variables.
Question 9
The coefficient of multiple correlation is equal to the square root of the multiple coefficient of determination.
Question 10a
tn-K-1,a/2 = t22,0.005 = 2.819
Question 10b
0.2372 - (2.819)(0.0556) < β1 < 0.2372 + (2.819)(0.0556)
0.080 < β1 < 0.394
Question 10c
-0.000249 - (2.819)(0.0000320) < β2 < -0.000249 + (2.819)(0.0000320)
-0.000339 < β2 < -0.000159
Question 11a
\[ F = \frac{(1426.93 - 1149.14)/2}{13.52} = 10.27 \]
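The F statistic for testing a subset of coefficients compares the error sum of squares of the restricted and the full model. The following Python sketch uses the numbers from Question 11a; the function and parameter names are illustrative assumptions.

```python
def partial_f_statistic(sse_restricted, sse_full, num_tested, s2_e):
    """F statistic for testing that num_tested coefficients are jointly zero.

    sse_restricted: SSE of the model without the tested variables
    sse_full:       SSE of the complete model
    s2_e:           error variance estimate from the complete model
    """
    return ((sse_restricted - sse_full) / num_tested) / s2_e


f_stat = partial_f_statistic(
    sse_restricted=1426.93, sse_full=1149.14, num_tested=2, s2_e=13.52
)
print(f"F = {f_stat:.2f}")
```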
Question 11b
The F statistic has 2 degrees of freedom (i.e., for the two variables tested simultaneously) for the numerator and 85 degrees of freedom for the denominator.
Question 11c
F* = 4.9 (see Appendix Table 9)
Question 11d
A dummy variable is a variable with two possible outcomes: 0 and 1.
Question 12
\[ H_{0}: \beta_{3} = 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]
\[ H_{1}: \beta_{3} \neq 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]
Question 13
18
Question 14
9
Question 15
12
Question 16
2
Question 17
5.28
What other topics are important in regression analysis? - ExamTests 13
Questions
Question 1
What are the four stages of model building?
Question 2
If a model cannot be verified, what should you do?
Question 3
In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for ... and ... variables.
Question 4
If a blocking variable has 4 levels, how many dummy variables should be created?
Question 5
What is a treatment variable?
Question 6
What is a blocking variable?
Question 7
What is a lagged value?
Question 8
What is multicollinearity?
Question 9
Suppose that all the coefficient Student's t statistics are small, indicating no individual effect, and yet the overall F statistic indicates a strong effect for the total regression model. What is this an indication of?
Question 10
How can multicollinearity be corrected?
Question 11
What is the danger of correcting multicollinearity by removing one or more of the highly correlated independent variables?
Question 12
What are the four assumptions made in a simple linear regression analysis?
Question 13
What is the fifth assumption that is added for multiple regression analysis?
Question 14
What is heteroscedasticity?
Question 15
Describe one procedure to check for heteroscedasticity.
Question 16a
From the regression of the squared residuals on the predicted values, we obtain the following estimated model (for n = 25):
\[ e^{2} = 0.00621 - 0.00550 \hat{y} \hspace{2mm} with \hspace{2mm} R^{2} = 0.066 \]
Compute the test statistic.
Question 16b
What is the critical value if we are testing with a 10% significance level?
Question 16c
Can we reject the null hypothesis that the regression model has uniform variance?
Question 17
What is the meaning of ρ for (auto)correlated errors?
Question 18
What does it imply if ρ = 0?
Question 19
What does it imply if ρ = 0.3?
Question 20
What does it imply if ρ = 0.9?
Question 21a
What is the most commonly used test to check possible autocorrelation of error terms?
Question 21b
Formulate the null hypothesis of this test.
Question 22
Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H1: ρ > 0.
Question 23
Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H1: ρ < 0.
Question 24
Suppose we found d = 0.2015, indicating positive autocorrelation. Estimate the serial correlation.
Question 25
Suppose we found d = 0.5213, indicating positive autocorrelation. Estimate the serial correlation.
Question 26a
In determining whether the errors in a regression model are positively correlated for the model
\[ y_{t} = \beta_{0} + \beta_{1}x_{1t} + \epsilon_{t} \]
we determine
\[ \sum^{30}_{t = 1}e^{2}_{t} = 7587.9154 \]
and
\[ \sum^{30}_{t = 2} (e_{t} - e_{t - 1})^{2} = 8195.2065 \]
Formulate the null and alternative hypothesis for the mentioned analysis.
Question 26b
Calculate the Durbin-Watson statistic.
Answer indication
Question 1
Model building consists of four stages: (1) model specification; (2) coefficient estimation; (3) model verification, and; (4) interpretation and inference.
Question 2
Go back to the first stage; model specification.
Question 3
In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for treatment and blocking variables.
Question 4
3
Question 5
A treatment variable is a variable whose effect we are interested in estimating with minimum variance. For instance, we may desire to know which of five different production machines provides the highest productivity per hour. For this example, the treatment variable is the production machine, represented by a five-level categorical variable (coded with four dummy variables).
Question 6
A blocking variable is a variable that is part of the environment. Therefore, the level of such a variable cannot be preselected.
Question 7
When time series are analyzed (i.e., when measurements are taken over time) lagged values of the dependent variable are an important issue. Often in time series data, the dependent variable in time period t is related to the value taken by this dependent variable in an earlier time period, that is yt-1. The lagged value then is the value of the dependent variable in this previous time period.
Question 8
Multicollinearity refers to a state of very high intercorrelations among the independent variables.
Question 9
Multicollinearity
Question 10
1. Remove one or more of the highly correlated independent variables.
2. Change the model specification, including possibly a new independent variable that is a function of several correlated independent variables.
3. Obtain additional data that do not have the same strong correlations between the independent variables.
Question 11
This might lead to bias in the coefficient estimates.
Question 12
1. The Y's are linear functions of X, plus a random error term.
2. The x values are fixed numbers that are independent of the error terms.
3. The error terms are assumed to be random variables with a mean of zero and a variance of σ².
4. The random error terms are not correlated with one another.
Question 13
There is no direct linear relationship between the Xj independent variables.
Question 14
Heteroscedasticity refers to the situation in which the error terms do not have uniform variance.
Question 15
One possibility to check for heteroscedasticity is by examining a scatter plot of the residuals versus the independent variable. If the magnitude of the error terms tends to increase (or decrease) for increasing values of the independent variable, this indicates that the error variances are not constant.
Question 16a
\[ nR^{2} = (25)(0.066) = 1.65 \]
Question 16b
From Appendix Table 7, it can be found that for a 10% significance level and 1 degree of freedom, the critical value is: χ²(1, 0.10) = 2.706
Question 16c
The test statistic does not exceed the critical value, therefore the null hypothesis cannot be rejected.
Question 17
This ρ is the correlation coefficient (range -1 to +1) between the error in time t and the error in the previous time point, that is t - 1.
Question 18
If ρ = 0, this means that there is no autocorrelation in the errors.
Question 19
There is a relatively weak autocorrelation.
Question 20
There is a quite strong autocorrelation.
Question 21a
Durbin-Watson test.
Question 21b
H0: ρ = 0.
Question 22
Reject H0 if d < dL. Accept H0 if d > dU. Test inconclusive if dL < d < dU.
Question 23
Reject H0 if d > 4 - dL. Accept H0 if d < 4 - dU. Test inconclusive if 4 - dL > d > 4 - dU.
Question 24
\[ r = 1 - \frac{d}{2} = 1 - \frac{0.2015}{2} = 0.90 \]
Question 25
\[ r = 1 - \frac{d}{2} = 1 - \frac{0.5213}{2} = 0.74 \]
Question 26a
H0: ρ = 0 and H1: ρ > 0.
Question 26b
\[ d = \frac{ \sum^{n}_{t = 2} (e_{t} - e_{t-1})^{2} }{\sum^{n}_{t=1} e^{2}_{t}} = \frac{8195.2065}{7587.9154} = 1.08 \]
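The Durbin-Watson statistic and the rough estimate r = 1 - d/2 of the serial correlation can be sketched in Python as follows. The function names are illustrative assumptions, and the short residual list at the end is made-up data that only demonstrates the formula.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def estimated_serial_correlation(d):
    """Approximate first-order autocorrelation implied by a Durbin-Watson value."""
    return 1 - d / 2


# Question 26b: only the two sums are given, so d is their ratio.
d = 8195.2065 / 7587.9154
print(f"d = {d:.2f}, r = {estimated_serial_correlation(d):.2f}")

# Questions 24 and 25.
print(round(estimated_serial_correlation(0.2015), 2))
print(round(estimated_serial_correlation(0.5213), 2))

# Hypothetical residuals that alternate in sign give d well above 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

A value of d near 2 corresponds to r near 0, values near 0 to strong positive autocorrelation, and values near 4 to strong negative autocorrelation.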
How to analyze categorical data? - ExamTests 14
Questions
Question 1a
Consider the following data:
Category | A | B | C | D | Total |
Observed number of objects | 43 | 53 | 60 | 44 | 200 |
Probability (under H0) | 1/4 | 1/4 | 1/4 | 1/4 | 1 |
Expected number of objects (under H0) | 50 | 50 | 50 | 50 | 200 |
Compute the chi-square test statistic.
Question 1b
What are the degrees of freedom for the critical test statistic?
Question 1c
Using Appendix Tables 7a and 7b, find the chi-square critical values with upper-tail probabilities 0.10 and 0.90.
Question 1d
Can we reject the null hypothesis that there is no preference for any of the four categories?
Question 2a
Consider the following data:
Category | A | B | C | D | Total |
Observed number of objects | 50 | 93 | 45 | 12 | 200 |
Probability (under H0) | 0.30 | 0.50 | 0.15 | 0.05 | 1 |
Expected number of objects (under H0) | .. | .. | .. | .. | 200 |
Compute the expected values based on the null hypothesis that is specified in the table.
Question 2b
Compute the chi-square test statistic.
Question 2c
How many degrees of freedom are there?
Question 2d
From Appendix Table 7 with K - 1 degrees of freedom, it is found that the test statistic falls between .... and ....
Question 2e
Can the null hypothesis be rejected?
Question 3a
Consider the following data:
Category | A | B | C | D | Total |
Observed number of objects | 287 | 49 | 30 | 34 | 400 |
Probability (under H0) | 0.80 | 0.10 | 0.06 | 0.04 | 1 |
Expected number of objects (under H0) | .. | .. | .. | .. | 400 |
Compute the expected values based on the null hypothesis that is specified in the table.
Question 3b
Compute the chi-square test statistic.
Question 3c
How many degrees of freedom are there?
Question 3d
Find the critical value using a significance level of 0.001.
Question 3e
Can the null hypothesis be rejected?
Question 4a
It is tested whether the population distribution is Poisson. Consider the following data:
Number of occurrences | 0 | 1 | 2 | 3+ |
Observed frequency | 156 | 63 | 29 | 14 |
Expected frequency under H0 | 135.4 | 89.4 | 29.5 | 7.7 |
Compute the test statistic.
Question 4b
How many degrees of freedom are there?
Question 4c
Find the corresponding critical value using a 0.001 significance level.
Question 4d
Can the null hypothesis that the population distribution is Poisson be rejected?
Question 5
Suppose we are interested in whether people prefer pineapple on their pizza. We sample 7 participants under the null hypothesis H0: P = 0.5. What is the probability of obtaining no more than 2 people with a preference for pineapple on their pizza?
Question 6
If the test statistic for a sign test equals S = 2, can we reject the null hypothesis?
Question 7a
A random sample of 100 students was asked to compare two new ice cream flavors: grilled BBQ and bubblegum surprise. After testing both flavors, 56 students preferred grilled BBQ, 40 students preferred bubblegum surprise, and 4 expressed no preference. Use the normal approximation to determine the mean and standard deviation for preferring bubblegum surprise.
Question 7b
Compute the test statistic using the normal approximation and continuity correction.
Question 7c
Find the approximate p-value.
Question 7d
Can we reject the null hypothesis?
Question 7e
What will be the test statistic if the continuity correction is not used?
Question 8
Given a random sample of n = 31 matched pairs, compute the mean and standard deviation for the Wilcoxon statistic under the null hypothesis.
Question 9
Now, suppose we find that the observed value of the statistic is T = 189. If we test the null hypothesis against a lower-tail alternative hypothesis with significance level 0.05, what can we conclude about the null hypothesis?
Question 10
Two independent samples are considered with n1 = 10, n2 = 12 and R1 = 93.5.
Compute the mean and variance for the Mann-Whitney statistic.
Question 11
Compute the Mann-Whitney U statistic.
Question 12
What can we conclude about the null hypothesis if we are testing with a significance level of 0.05?
Answer indication
Question 1a
χ² = 3.88
Question 1b
df = K - 1 = 4 - 1 = 3.
Question 1c
Lower critical value (Appendix Table 7b): χ²(3, 0.90) = 0.584
Upper critical value (Appendix Table 7a): χ²(3, 0.10) = 6.251
Question 1d
It is found that the test statistic of 3.88 falls between 0.584 and 6.251; from this it follows that 0.10 < p-value < 0.90. The null hypothesis can therefore not be rejected. However, this does not mean that we can conclude that all four categories are equally preferred. It only means that there is not enough evidence to support a preference.
Question 2a
EA = nPA = 200(0.30) = 60
EB = nPB = 200(0.50) = 100
EC = nPC = 200(0.15) = 30
ED = nPD = 200(0.05) = 10
Question 2b
χ² = 10.06
Question 2c
df = K - 1 = 4 - 1 = 3.
Question 2d
From Appendix Table 7 with K - 1 degrees of freedom, it is found that the test statistic falls between 9.348 and 11.345.
Question 2e
0.01 < p-value < 0.025. Hence, the null hypothesis can be rejected at the 2.5% significance level.
Question 3a
EA = nPA = 400(0.80) = 320
EB = nPB = 400(0.10) = 40
EC = nPC = 400(0.06) = 24
ED = nPD = 400(0.04) = 16
Question 3b
χ² = 27.178
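The expected counts and the chi-square statistics for Questions 1 to 3 can be checked with a short Python sketch. The function name `chi_square_gof` is an illustrative assumption; the observed counts and null probabilities come from the question tables.

```python
def chi_square_gof(observed, probabilities):
    """Return (expected counts, chi-square statistic) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probabilities]  # E_i = n * P_i under H0
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic


tables = {
    "Question 1": ([43, 53, 60, 44], [0.25, 0.25, 0.25, 0.25]),
    "Question 2": ([50, 93, 45, 12], [0.30, 0.50, 0.15, 0.05]),
    "Question 3": ([287, 49, 30, 34], [0.80, 0.10, 0.06, 0.04]),
}

for label, (observed, probabilities) in tables.items():
    expected, statistic = chi_square_gof(observed, probabilities)
    df = len(observed) - 1  # K - 1 when no parameters are estimated
    print(f"{label}: chi-square = {statistic:.3f}, df = {df}")
```

Most of the statistic in Question 3 comes from category D, where 34 objects were observed against an expected count of 16.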
Question 3c
df = K - 1 = 4 - 1 = 3.
Question 3d
From Appendix Table 7 with K - 1 degrees of freedom and significance level 0.001, it is found that χ²(3, 0.001) = 16.266
Question 3e
The test statistic is much larger than the critical value. Hence, the null hypothesis can be rejected.
Question 4a
χ² = 16.08
Question 4b
df = K - m - 1 = 4 - 1 - 1 = 2
Question 4c
χ²(2, 0.001) = 13.816
Question 4d
The test statistic exceeds the critical value, thus the null hypothesis that the population distribution is Poisson can be rejected at the 0.1% significance level.
Question 5
p-value = P(x ≤ 2) = 0.227 (see Appendix Table 3)
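Instead of looking the value up in a table, the cumulative binomial probability can be computed directly. A minimal Python sketch, with an illustrative function name:

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Question 5: n = 7 participants, P = 0.5 under H0, at most 2 prefer pineapple.
p_value = binomial_cdf(k=2, n=7, p=0.5)
print(f"P(X <= 2) = {p_value:.3f}")
```

With P = 0.5 the sum reduces to (1 + 7 + 21) / 128 = 29/128.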
Question 6
No, with a p-value this large, the null hypothesis cannot be rejected.
Question 7a
Let P be the population proportion that prefers bubblegum surprise, given S = 40.
\[ \mu = np = 0.5n = 0.5(96) = 48 \]
\[ \sigma = 0.5 \sqrt{96} = 4.899 \]
Question 7b
Since 40 < 48, S* = 40.5
\[ z = \frac{S^{*} - \mu}{\sigma} = \frac{40.5 - 48}{4.899} = -1.53 \]
Question 7c
From the standard normal distribution, it follows that the approximate p-value = 2(0.0630) = 0.126
Question 7d
The null hypothesis can be rejected at all significance levels greater than 12.6%.
Question 7e
If no continuity correction factor is used, the value for the test statistic becomes Z = -1.633, yielding a slightly smaller p-value of 0.1024.
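The steps of Questions 7a to 7e can be sketched in Python using only the standard library. The function names are illustrative assumptions, and the standard normal CDF is obtained from `math.erf` rather than from a table, so the p-values may differ from the table-based ones in the last digit.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def sign_test_normal(s, n, continuity=True):
    """Normal approximation to the sign test under H0: P = 0.5.

    s: number of observations preferring the option of interest
    n: number of observations that expressed a preference
    Returns (z, two-sided p-value).
    """
    mu = 0.5 * n
    sigma = 0.5 * math.sqrt(n)
    s_star = s
    if continuity:
        # Move s half a unit toward the mean.
        if s < mu:
            s_star = s + 0.5
        elif s > mu:
            s_star = s - 0.5
    z = (s_star - mu) / sigma
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


n = 100 - 4  # the 4 students without a preference are dropped
z_cc, p_cc = sign_test_normal(s=40, n=n, continuity=True)
z_raw, p_raw = sign_test_normal(s=40, n=n, continuity=False)
print(f"with correction:    z = {z_cc:.2f}, p = {p_cc:.3f}")
print(f"without correction: z = {z_raw:.3f}, p = {p_raw:.4f}")
```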
Question 8
\[ \mu_{T} = \frac{n(n + 1)}{4} = \frac{(31)(32)}{4} = 248 \]
\[ Var(T) = \sigma^{2}_{T} = \frac{n(n + 1)(2n + 1)}{24} = \frac{ (31)(32)(63) }{24} = 2604 \]
\[ \sigma_{T} = \sqrt{2604} = 51.03 \]
Question 9
\[ Z = \frac{T - \mu_{T}}{\sigma_{T}} = \frac{189 - 248}{51.03} = \frac{-59}{51.03} = -1.16 \]
For α = 0.05, the lower-tail critical value is -zα = -1.645
The test statistic (-1.16) is not below the critical value (-1.645), hence there is not enough evidence to reject the null hypothesis.
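The mean, variance, and standardized value of the Wilcoxon statistic from Questions 8 and 9 can be reproduced with the following Python sketch. The function name is an illustrative assumption, and the critical value -1.645 is the table value quoted in the answer.

```python
import math


def wilcoxon_normal_approx(t_stat, n):
    """Mean, variance, and z value of the Wilcoxon signed rank statistic under H0."""
    mu = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (t_stat - mu) / math.sqrt(variance)
    return mu, variance, z


mu_t, var_t, z = wilcoxon_normal_approx(t_stat=189, n=31)
print(f"mean = {mu_t}, variance = {var_t}, sd = {math.sqrt(var_t):.2f}, z = {z:.2f}")

# Lower-tail test at alpha = 0.05: reject H0 only if z falls below -1.645.
reject_h0 = z < -1.645
print("Reject H0:", reject_h0)
```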
Question 10
\[ E(U) = \mu_{U} = \frac{n_{1}n_{2}}{2} = \frac{ (10)(12) }{2} = 60 \]
\[ Var(U) = \sigma^{2}_{U} = \frac{ n_{1}n_{2} (n_{1} + n_{2} + 1) }{12} = \frac{ (10)(12)(23) }{12} = 230 \]
Question 11
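\[ U = n_{1}n_{2} + \frac{n_{1}(n_{1} + 1)}{2} - R_{1} = (10)(12) + \frac{(10)(11)}{2} - 93.5 = 81.5 \]
\[ Z = \frac{U - \mu_{U}}{\sigma_{U}} = \frac{81.5 - 60}{ \sqrt{230} } = 1.42 \]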
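Questions 10 and 11 can be combined into one small Python sketch that derives U from the rank sum of the first sample and then standardizes it. The function name is an illustrative assumption.

```python
import math


def mann_whitney_normal_approx(n1, n2, r1):
    """U statistic from the rank sum r1 of sample 1, with its mean, variance, and z value."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mu = n1 * n2 / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    z = (u - mu) / math.sqrt(variance)
    return u, mu, variance, z


u, mu_u, var_u, z = mann_whitney_normal_approx(n1=10, n2=12, r1=93.5)
print(f"U = {u}, mean = {mu_u}, variance = {var_u}, z = {z:.2f}")
```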
Question 12
The corresponding p-value = 0.1556. With a 0.05 significance level, this test result is not sufficient to conclude that the null hypothesis can be rejected.
How to conduct an analysis of variance? - ExamTests 15
Questions
Question 1
What is the null hypothesis of a one-way analysis of variance?
Question 2
Suppose we found the following data: SSW = 12.18, n = 20, k = 3. Compute an estimate of the within-groups mean square.
Question 3
Suppose we found the following data: SSG = 21.55, n = 20, k = 3. Compute an estimate of the between-groups mean square.
Question 4
Compute the F ratio for the MSW and MSG calculated in the previous two questions.
Question 5
What are the degrees of freedom corresponding to the information provided in questions 2 and 3?
Question 6
What is the critical F value if we are testing with a 1% significance level?
Question 7
What can we conclude about the population means based on this F ratio?
Question 8a
Consider the following analysis of variance table:
Source of variation | Sum of Squares | Degrees of freedom | Mean Squares | F ratio |
Between groups | 1728 | 4 | ||
Within groups | 624 | .. | ||
Total | 2352 | 17 |
How many degrees of freedom does the within-groups sum of squares have?
Question 8b
Compute the mean squares for between groups.
Question 8c
Compute the mean squares for within groups.
Question 8d
Compute the F ratio.
Question 8e
Find the critical F value corresponding to a significance level of 0.05.
Question 8f
What can be concluded about the null hypothesis?
Question 9a
Consider the following analysis of variance table:
Source of variation | Sum of Squares | Degrees of freedom | Mean Squares | F ratio |
Between groups | 879 | .. | ||
Within groups | 798 | 16 | ||
Total | 1677 | 19 |
How many degrees of freedom does the between-groups sum of squares have?
Question 9b
Compute the mean squares for between groups.
Question 9c
Compute the mean squares for within groups.
Question 9d
Compute the F ratio.
Question 9e
Find the critical F value corresponding to a significance level of 0.05.
Question 9f
What can be concluded about the null hypothesis?
Question 10a
Consider for questions 10a-10i a two-way analysis of variance with one observation per cell and randomized blocks with the following results:
Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio |
Between groups | 3636 | 33 | MSG = SSG / (K - 1) | |
Between blocks | 7575 | 66 | MSB = SSB / (H - 1) | |
Error | 9999 | 1818 | MSE = SSE / ((K - 1) (H - 1)) | |
Total | 210210 | 2727 |
Compute the mean squares for the between groups.
Question 10b
Compute the mean squares for the between blocks.
Question 10c
Compute the mean squares for the error.
Question 10d
Compute the F ratio MSG / MSE.
Question 10e
Find the critical value for the hypothesis test that the between group means are equal using a 5% significance level.
Question 10f
What do we conclude about the null hypothesis that the between group means are equal?
Question 10g
Compute the F ratio MSB / MSE.
Question 10h
Find the critical value for the hypothesis test that the between block means are equal using a 5% significance level.
Question 10i
What do we conclude about the null hypothesis that the between block means are equal?
Question 11a
Consider the following data:
Source of variation | Sum of squares | Degrees of freedom | Mean squares | F ratio |
Between groups | 62.04 | 1 | 62.04 | |
Between blocks | 0.06 | 1 | 0.06 | |
Interaction | 1.85 | ... | 1.85 | |
Error | 23.31 | 63 | 0.37 | |
Total | 87.26 | 66 |
Compute the degrees of freedom for the interaction term.
Question 11b
Compute the F ratio for the interaction term.
Answer indication
Question 1
All population means are equal, that is: H0: μ1 = μ2 = ... = μK for K populations.
Question 2
MSW = (12.18) / (20 - 3) = 0.72
Question 3
MSG = (21.55) / (3 - 1) = 10.78
Question 4
F = MSG / MSW = (21.55 / 2) / (12.18 / 17) = 15.039 (computed from the unrounded mean squares)
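The mean squares and the F ratio of a one-way analysis of variance follow directly from the two sums of squares. The Python sketch below, with an illustrative function name, covers Questions 2 to 4 and the tables of Questions 8 and 9.

```python
def one_way_anova_f(ssg, ssw, n, k):
    """Return (MSG, MSW, F) for n observations in k groups."""
    msg = ssg / (k - 1)  # between-groups mean square, k - 1 degrees of freedom
    msw = ssw / (n - k)  # within-groups mean square, n - k degrees of freedom
    return msg, msw, msg / msw


examples = {
    "Questions 2-4": dict(ssg=21.55, ssw=12.18, n=20, k=3),
    "Question 8": dict(ssg=1728, ssw=624, n=18, k=5),
    "Question 9": dict(ssg=879, ssw=798, n=20, k=4),
}

for label, data in examples.items():
    msg, msw, f_ratio = one_way_anova_f(**data)
    print(f"{label}: MSG = {msg:.3f}, MSW = {msw:.3f}, F = {f_ratio:.3f}")
```

Working from the unrounded mean squares avoids the small discrepancy that appears when the rounded values 10.78 and 0.72 are divided.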
Question 5
df = (K - 1) = 3 - 1 = 2 for the numerator
df = (n - K) = 20 - 3 = 17 for the denominator
Question 6
F2,17,0.01 = 6.112 (Appendix Table 9)
Question 7
The test value (15.039) exceeds the critical value (6.112), therefore we can reject the null hypothesis that the population mean is the same for all three groups.
Question 8a
It follows from the degrees of freedom of the between-groups sum of squares that there are K - 1 = 4, thus K = 5. Further, from the degrees of freedom of the total sum of squares it follows that n - 1 = 17, thus n = 18.
As a result, we obtain: df = n - K = 18 - 5 = 13.
Question 8b
MSG = SSG / (K - 1) = 1728 / 4 = 432
Question 8c
MSW = SSW / (n - K) = 624 / 13 = 48
Question 8d
F = MSG / MSW = 432 / 48 = 9
Question 8e
F4,13,0.05 = 3.179
Question 8f
F > F4,13,0.05 , therefore we can reject the null hypothesis that the population means are equal.
Question 9a
n - 1 = 19 --> n = 20
n - k = 16 --> 20 - k = 16 --> k = 4
df = k - 1 = 4 - 1 = 3
Thus, there are 3 degrees of freedom.
Question 9b
MSG = SSG / (K - 1) = 879 / 3 = 293
Question 9c
MSW = SSW / (n - K) = 798 / 16 = 49.875
Question 9d
F = MSG / MSW = 293 / 49.875 = 5.875
Question 9e
F3,16,0.05 = 3.239
Question 9f
F > F3,16,0.05 (5.875 > 3.239), therefore we can reject the null hypothesis that the population means are equal.
Question 10a
MSG = SSG / (K - 1) = 3636 / 33 = 110.18
Question 10b
MSB = SSB / (H - 1) = 7575 / 66 = 114.77
Question 10c
MSE = SSE / ((K - 1) (H - 1)) = 9999 / 1818 = 5.5
Question 10d
F = MSG / MSE = 110.18 / 5.5 = 20.03
Question 10e
F33,1818,0.05 = 1.676
Question 10f
The test statistic exceeds the critical value, therefore we can reject the null hypothesis that the between-groups means are equal.
Question 10g
F = MSB / MSE = 114.77 / 5.5 = 20.87
Question 10h
F66,1818,0.05 = 1.676
Question 10i
The test statistic exceeds the critical value, therefore we can reject the null hypothesis that the between-blocks means are equal.
Question 11a
df = 1
Question 11b
F = MSI / MSE = 1.85 / 0.37 = 5.
How to analyze data sets with measurements over time? - ExamTests 16
Questions
Question 1
What is meant by a time series?
Question 2
What are the four components of a time series?
Question 3
Let the estimates of level and trend in year 5 be as follows:
\[ \hat{x}_{5} = 347 \]
\[ T_{5} = 13 \]
What is the forecast for the next year using the Holt-Winters method?
Question 4
What is the forecast for year 7 using the Holt-Winters method for nonseasonal series?
Question 5
What is the forecast for year 8 using the Holt-Winters method for nonseasonal series?
Question 6
What is the forecast for year 9 using the Holt-Winters method for nonseasonal series?
Question 7
Suppose we have 32 observations and a seasonal factor s = 4 indicating quarterly data. Write down the equation for the forecast of the next observation beyond the end of the series. Use for this the method developed by Holt-Winters for seasonal series.
Question 8
What is the null hypothesis in an autoregressive model?
Question 9
Provide the general equation that represents a series according to the autoregressive model.
Question 10
What algorithm is used to obtain the parameters for the autoregressive model?
Answer indication
Question 1
A time series is a set of measurements, ordered over time, on a particular quantity of interest. In a time series, the sequence of observations is important.
Question 2
- Tt: Trend component.
- St: Seasonality component.
- Ct: Cyclical component.
- It: Irregular component.
Question 3
\[ \hat{x}_{6} = 347 + 13 = 360 \]
Question 4
\[ \hat{x}_{7} = 347 + (2)(13) = 373 \]
Question 5
\[ \hat{x}_{8} = 347 + (3)(13) = 386 \]
Question 6
\[ \hat{x}_{9} = 347 + (4)(13) = 399 \]
Question 7
\[ \hat{x}_{n+h} = (\hat{x}_{n} + hT_{n}) F_{n+h-s} \]
With n = 32, h = 1 and s = 4:
\[ \hat{x}_{33} = (\hat{x}_{32} + T_{32}) F_{29} \]
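The forecasts in Questions 3 to 7 all follow one pattern, which the sketch below writes out. The function names are made up for illustration. The seasonal factor defaults to 1.0, which reduces the formula to the nonseasonal case, and the index helper assumes the forecast horizon h is no longer than the season length s.

```python
def holt_winters_forecast(level, trend, h, seasonal_factor=1.0):
    """Forecast h periods ahead from the latest level and trend estimates.

    With seasonal_factor = 1.0 this is the nonseasonal forecast level + h * trend.
    For a seasonal series, pass the factor F_{n+h-s} of the matching season.
    """
    return (level + h * trend) * seasonal_factor


def seasonal_factor_index(n, h, s):
    """Index of the seasonal factor used when forecasting period n + h (h <= s)."""
    return n + h - s


# Questions 3 to 6: level 347 and trend 13 estimated in year 5.
forecasts = {5 + h: holt_winters_forecast(347, 13, h) for h in range(1, 5)}
print(forecasts)

# Question 7: n = 32 quarterly observations (s = 4), one step ahead.
print(seasonal_factor_index(32, 1, 4))
```

The dictionary is keyed by the forecast year, so the entries for years 6 to 9 can be compared directly with the answers above.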
Question 8
H0: φp = 0
Question 9
\[ x_{t} = \gamma + \phi_{1}x_{t - 1} + \phi_{2}x_{t - 2} + \cdots + \phi_{p}x_{t - p} + \epsilon_{t} \]
Question 10
The least squares algorithm.
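As a minimal illustration of least squares estimation for an autoregressive model, the sketch below fits the first-order case, x_t = γ + φ x_{t-1} + ε_t, by regressing each observation on its predecessor. The function name and the noise-free series are invented for the example. Higher orders would need a multiple regression on p lagged values.

```python
def fit_ar1(series):
    """Least squares estimates (gamma, phi) for x_t = gamma + phi * x_{t-1} + e_t."""
    lagged = series[:-1]     # regressor: x_{t-1}
    current = series[1:]     # response: x_t
    m = len(current)
    mean_lag = sum(lagged) / m
    mean_cur = sum(current) / m
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    phi = sxy / sxx
    gamma = mean_cur - phi * mean_lag
    return gamma, phi


# Made-up, noise-free series generated with gamma = 2 and phi = 0.5.
series = [1.0]
for _ in range(9):
    series.append(2.0 + 0.5 * series[-1])

gamma_hat, phi_hat = fit_ar1(series)
print(round(gamma_hat, 6), round(phi_hat, 6))
```

Because the illustrative series contains no noise, the estimates recover the generating parameters up to floating-point error. With real data they would only approximate them.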
What other sampling procedures are available? - ExamTests 17
Questions
Question 1a
Suppose we conducted a stratified sampling procedure. Use the following information:
N1 = 75; N2 = 30; N3 = 20; N = 125.
n1 = 15; n2 = 8; n3 = 2; n = 25.
x̄1 = 21.2; s1 = 12.8.
x̄2 = 13.3; s2 = 11.4.
x̄3 = 26.1; s3 = 9.2.
Compute the point estimate of the population mean.
Question 1b
Compute the point estimate of the variance for the first stratum.
Question 1c
Compute the point estimate of the variance for the second stratum.
Question 1d
Compute the point estimate of the variance for the third stratum.
Question 1e
Compute the point estimate of the variance for the population mean.
Question 1f
Compute the point estimate of the standard deviation for the population mean.
Question 1g
Compute a 95% confidence interval for the population mean.
Question 2a
Suppose we conducted a stratified sampling procedure. Use the following information:
N1 = 364; N2 = 1031.
n1 = 40; n2 = 60.
p̂1 = 7/40 = 0.175
p̂2 = 13/60 = 0.217
Compute the point estimate of the population proportion.
Question 2b
Compute the point estimate of the variance of the proportion for the first stratum.
Question 2c
Compute the point estimate of the variance of the proportion for the second stratum.
Question 2d
Compute the point estimate of the variance of the proportion for the population.
Question 2e
Compute the point estimate of the standard deviation of the proportion for the population.
Question 2f
Compute the 90% confidence interval for the population proportion from these stratified samples.
Question 3a
Suppose we have a population of size N = 125 that is divided into three strata with N1 = 75, N2 = 30, and N3 = 20. Now, suppose we want to select a sample of size n = 25.
Compute the sample size for the first stratum using proportional allocation.
Question 3b
Compute the sample size for the second stratum using proportional allocation.
Question 3c
Compute the sample size for the third stratum using proportional allocation.
Question 4a
Suppose we have a population of size N = 225 that is divided into three strata with N1 = 100, N2 = 75, and N3 = 50. Now, suppose we want to select a sample of size n = 50.
Compute the sample size for the first stratum using proportional allocation.
Question 4b
Compute the sample size for the second stratum using proportional allocation.
Question 4c
Compute the sample size for the third stratum using proportional allocation.
Question 5a
Suppose we have a population of size N = 500 that is divided into three strata with N1 = 250, N2 = 100, and N3 = 150. Now, suppose we want to select a sample of size n = 50.
Compute the sample size for the first stratum using proportional allocation.
Question 5b
Compute the sample size for the second stratum using proportional allocation.
Question 5c
Compute the sample size for the third stratum using proportional allocation.
Question 6a
Suppose we have a population of size N = 500 that is divided into three strata with N1 = 250, N2 = 100, and N3 = 150. Now, suppose we want to select a sample of size n = 100.
Compute the sample size for the first stratum using proportional allocation.
Question 6b
Compute the sample size for the second stratum using proportional allocation.
Question 6c
Compute the sample size for the third stratum using proportional allocation.
Question 7
What is the difference between proportional allocation and optimal allocation in terms of sample effort?
Question 8
What is the difference between proportional allocation and optimal allocation in terms of choosing the stratum sample sizes when estimating population proportions?
Question 9
What is the difference between stratified sampling and cluster sampling?
Question 10
Mention one advantage and one disadvantage of cluster sampling.
Question 11
Mention one advantage and one disadvantage of two-phase sampling.
Answer indication
Question 1a
\[ \bar{x}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j}\bar{x}_{j} = \frac{ (75)(21.2) + (30)(13.3) + (20)(26.1) }{125} = 20.09 \]
Question 1b
\[ \hat{\sigma}^{2}_{\bar{x}_{1}} = \frac{s^{2}_{1}}{n_{1}} \times \frac{N_{1} - n_{1}}{N_{1} - 1} = \frac{(12.8)^{2}}{15} \times \frac{60}{74} = 8.856 \]
Question 1c
\[ \hat{\sigma}^{2}_{\bar{x}_{2}} = \frac{s^{2}_{2}}{n_{2}} \times \frac{N_{2} - n_{2}}{N_{2} - 1} = \frac{(11.4)^{2}}{8} \times \frac{22}{29} = 12.324 \]
Question 1d
\[ \hat{\sigma}^{2}_{\bar{x}_{3}} = \frac{s^{2}_{3}}{n_{3}} \times \frac{N_{3} - n_{3}}{N_{3} - 1} = \frac{(9.2)^{2}}{2} \times \frac{18}{19} = 40.093 \]
Question 1e
\[ \hat{\sigma}^{2}_{\bar{x}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\bar{x}_{j}} = \frac{ (75)^{2}(8.856) + (30)^{2}(12.324) + (20)^{2}(40.093) }{125^{2}} = 4.924 \]
Question 1f
\[ \hat{\sigma}_{\bar{x}_{st}} = \sqrt{4.924} = 2.22 \]
Question 1g
20.09 +/- (1.96)(2.22) = [15.74; 24.44]
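The chain of calculations in Questions 1a to 1g can be reproduced in a short script. This is a sketch using the stratum sizes that appear in the worked answers (N3 = 20 and n3 = 2, so N = 125). The variable names are chosen for the example only.

```python
from math import sqrt

# Stratum data as used in the worked answers to Question 1.
N = [75, 30, 20]              # stratum population sizes
n = [15, 8, 2]                # stratum sample sizes
xbar = [21.2, 13.3, 26.1]     # stratum sample means
s = [12.8, 11.4, 9.2]         # stratum sample standard deviations

N_total = sum(N)

# 1a: point estimate of the population mean (weighted average of stratum means).
mean_st = sum(Nj * xj for Nj, xj in zip(N, xbar)) / N_total

# 1b-1d: variance of each stratum mean, with the finite population correction.
var_strata = [
    (sj ** 2 / nj) * (Nj - nj) / (Nj - 1)
    for Nj, nj, sj in zip(N, n, s)
]

# 1e-1f: variance and standard error of the stratified estimator.
var_st = sum(Nj ** 2 * vj for Nj, vj in zip(N, var_strata)) / N_total ** 2
se_st = sqrt(var_st)

# 1g: 95% confidence interval, using the standard normal value 1.96.
z = 1.96
ci = (mean_st - z * se_st, mean_st + z * se_st)

print(round(mean_st, 2), [round(v, 3) for v in var_strata])
print(round(var_st, 3), round(se_st, 2), tuple(round(b, 2) for b in ci))
```

Working from unrounded intermediate values gives the same results as the hand calculation at the precision shown in the answers.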
Question 2a
\[ \hat{p}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j} \hat{p}_{j} = \frac{ (364)(0.175) + (1031)(0.217) }{1395} = 0.206 \]
Question 2b
\[ \hat{\sigma}^{2}_{\hat{p}_{1}} = \frac{ \hat{p}_{1} (1 - \hat{p}_{1}) }{n_{1} - 1} \times \frac{N_{1} - n_{1}}{N_{1} - 1} = \frac{ (0.175)(0.825) }{39} \times \frac{324}{363} = 0.003304 \]
Question 2c
\[ \hat{\sigma}^{2}_{\hat{p}_{2}} = \frac{ \hat{p}_{2} (1 - \hat{p}_{2}) }{n_{2} - 1} \times \frac{N_{2} - n_{2}}{N_{2} - 1} = \frac{ (0.217)(0.783) }{59} \times \frac{971}{1030} = 0.002715 \]
Question 2d
\[ \hat{\sigma}^{2}_{\hat{p}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\hat{p}_{j}} = \frac{ (364)^{2}(0.003304) + (1031)^{2}(0.002715) }{ (1395)^{2} } = 0.001708 \]
Question 2e
\[ \hat{\sigma}_{\hat{p}_{st}} = \sqrt{0.001708} = 0.0413 \]
Question 2f
(0.206) +/- (1.645)(0.0413) = [0.138; 0.274]
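The proportion calculations in Questions 2a to 2f follow the same steps as for the mean, with p(1 - p)/(n - 1) in place of s²/n. The sketch below uses the rounded stratum proportions 0.175 and 0.217 from the question. The variable names are illustrative.

```python
from math import sqrt

N = [364, 1031]          # stratum population sizes
n = [40, 60]             # stratum sample sizes
p = [0.175, 0.217]       # stratum sample proportions (rounded as in the question)

N_total = sum(N)

# 2a: point estimate of the population proportion.
p_st = sum(Nj * pj for Nj, pj in zip(N, p)) / N_total

# 2b-2c: variance of each stratum proportion, with the finite population correction.
var_strata = [
    pj * (1 - pj) / (nj - 1) * (Nj - nj) / (Nj - 1)
    for Nj, nj, pj in zip(N, n, p)
]

# 2d-2e: variance and standard error of the stratified proportion.
var_st = sum(Nj ** 2 * vj for Nj, vj in zip(N, var_strata)) / N_total ** 2
se_st = sqrt(var_st)

# 2f: 90% confidence interval, using the standard normal value 1.645.
z = 1.645
ci = (p_st - z * se_st, p_st + z * se_st)

print(round(p_st, 3), [round(v, 6) for v in var_strata])
print(round(var_st, 6), round(se_st, 4), tuple(round(b, 3) for b in ci))
```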
Question 3a
\[ n_{1} = \frac{75}{125} \times 25 = 15 \]
Question 3b
\[ n_{2} = \frac{30}{125} \times 25 = 6 \]
Question 3c
\[ n_{3} = \frac{20}{125} \times 25 = 4 \]
Question 4a
\[ n_{1} = \frac{100}{225} \times 50 = 22.2 \approx 22 \]
Question 4b
\[ n_{2} = \frac{75}{225} \times 50 = 16.7 \approx 17 \]
Question 4c
\[ n_{3} = \frac{50}{225} \times 50 = 11.1 \approx 11 \]
Question 5a
\[ n_{1} = \frac{250}{500} \times 50 = 25 \]
Question 5b
\[ n_{2} = \frac{100}{500} \times 50 = 10 \]
Question 5c
\[ n_{3} = \frac{150}{500} \times 50 = 15 \]
Question 6a
\[ n_{1} = \frac{250}{500} \times 100 = 50 \]
Question 6b
\[ n_{2} = \frac{100}{500} \times 100 = 20 \]
Question 6c
\[ n_{3} = \frac{150}{500} \times 100 = 30 \]
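Proportional allocation for Questions 3 to 6 is the single rule n_j = (N_j / N) × n, which the sketch below applies to all four scenarios. The function name is hypothetical. Results are rounded to the nearest integer, which happens to preserve the total here but is not guaranteed to in general.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Unrounded stratum sample sizes n_j = N_j * n / N."""
    total = sum(strata_sizes)
    return [size * sample_size / total for size in strata_sizes]


scenarios = [
    ([75, 30, 20], 25),       # Question 3
    ([100, 75, 50], 50),      # Question 4
    ([250, 100, 150], 50),    # Question 5
    ([250, 100, 150], 100),   # Question 6
]

allocations = []
for strata, n in scenarios:
    exact = proportional_allocation(strata, n)
    rounded = [round(x) for x in exact]
    allocations.append(rounded)
    print(strata, n, [round(x, 2) for x in exact], rounded)
```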
Question 7
Optimal allocation allocates relatively more sample effort to strata in which the population variance is highest.
Question 8
Optimal allocation allocates more sample observations to strata in which the true population proportions are closest to 0.50.
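Both statements can be illustrated with the usual optimal allocation rule, n_j = n × N_j σ_j / Σ N_i σ_i, where σ_j is the stratum standard deviation for means and √(P_j(1 − P_j)) for proportions. The sketch below is an illustration under loud assumptions: the sample values from Questions 1 and 2 stand in for the unknown population quantities, the overall sample sizes 25 and 100 are taken from those questions, and the function name is made up.

```python
from math import sqrt


def optimal_allocation(strata_sizes, spreads, sample_size):
    """Unrounded n_j = n * N_j * spread_j / sum_i(N_i * spread_i)."""
    weights = [size * spread for size, spread in zip(strata_sizes, spreads)]
    total = sum(weights)
    return [w / total * sample_size for w in weights]


# Means: sample standard deviations from Question 1 used as stand-ins
# for the unknown population standard deviations (an assumption).
alloc_mean = optimal_allocation([75, 30, 20], [12.8, 11.4, 9.2], 25)

# Proportions: the spread of a stratum is sqrt(P * (1 - P)), largest at P = 0.5.
# Sample proportions from Question 2 stand in for the true ones (an assumption).
p = [0.175, 0.217]
alloc_prop = optimal_allocation([364, 1031], [sqrt(pj * (1 - pj)) for pj in p], 100)

print([round(x, 2) for x in alloc_mean])
print([round(x, 2) for x in alloc_prop])
```

Compared with proportional allocation, the first stratum in the means example (largest standard deviation) and the second stratum in the proportions example (proportion closer to 0.50) each receive a larger share of the sample.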
Question 9
In stratified random sampling, a sample is taken from every stratum of the population in an attempt to ensure that important segments of the population are given corresponding weight. In cluster sampling, a random sample of clusters is taken, such that some clusters will have no members in the sample.
Question 10
Advantage: convenience. Disadvantage: the additional imprecision in the sample estimates.
Question 11
Advantage: it enables the researcher, at a low cost, to try out the survey. Disadvantage: time consuming.