Hypothesis Testing Petra Petrovics PhD Student
Inference from the Sample to the Population
Estimation: how to determine the value of an unknown population parameter from a sample.
Hypothesis testing: how to test a statement concerning a population parameter.
Hypotheses
Null hypothesis ($H_0$): the hypothesis to be tested. A proposition that is considered true unless the sample used to conduct the hypothesis test gives convincing evidence that it is false.
Alternative hypothesis ($H_1$): the hypothesis accepted when the null hypothesis is rejected.
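Hypothesis Testing
($\theta$ denotes the population parameter under test and $\theta_0$ its hypothesized value.)
Null hypothesis: $H_0: \theta = \theta_0$
Alternative hypothesis:
Two-tailed test: $H_1: \theta \neq \theta_0$
One-tailed test: $H_1: \theta < \theta_0$ or $H_1: \theta > \theta_0$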
Basic Terms I
Statistical test: a decision function that takes its values in the set of hypotheses.
Region of acceptance: the set of values of the test statistic for which we fail to reject the null hypothesis.
Region of rejection / Critical region: the set of values of the test statistic for which the null hypothesis is rejected.
Basic Terms II
p-value: the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the observed value of the test statistic.
Significance level ($\alpha$): the probability of rejecting the null hypothesis when it is true; equivalently, the share of the test statistic's distribution under the null hypothesis that lies outside the critical values.
Critical values: the limits of the rejection region.
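The relation between the p-value and the significance level can be sketched in Python for a test statistic that is standard normal under the null hypothesis. This is a minimal sketch, not part of the original slides: the function name `p_value_z` and the numbers `z_observed` and `alpha` are hypothetical.

```python
from statistics import NormalDist


def p_value_z(z, tail="two"):
    """p-value of an observed z statistic, assuming z ~ N(0, 1) under H0."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Hypothetical values, chosen only for illustration.
alpha = 0.05
z_observed = 2.1

p = p_value_z(z_observed, tail="two")
reject_h0 = p < alpha  # reject H0 when the p-value is below alpha
print(f"p-value = {p:.4f}, reject H0: {reject_h0}")
```

Comparing the p-value with α leads to the same decision as checking whether the test statistic falls in the rejection region.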
Conclusion vs. actual situation

Conclusion              H0 true             H0 false
Fail to reject H0       Correct decision    Type II error
Reject H0               Type I error        Correct decision
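To make the two error types concrete, the sketch below computes the probability of a Type II error for a right-tailed z-test with known σ. It is an illustration under stated assumptions, not a method from the slides: the function name and all numbers are hypothetical. The probability of a Type I error equals the chosen α by construction.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_right_tailed(m0, true_mean, sigma, n, alpha):
    """P(fail to reject H0: mu = m0) when the population mean is true_mean.

    Assumes a right-tailed z-test with known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)      # upper critical value
    shift = (true_mean - m0) / (sigma / sqrt(n))  # mean of z when mu = true_mean
    return std_normal.cdf(z_crit - shift)


# Hypothetical numbers, chosen only for illustration.
beta = type_ii_error_right_tailed(m0=0.0, true_mean=0.5, sigma=1.0, n=25, alpha=0.05)
print(f"P(Type II error) = {beta:.3f}")
```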
The Five-step Procedure for Hypothesis Testing
1. Set up the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
2. Define the test statistic. The test statistic will be evaluated, using the sample data, to determine whether the data are compatible with the null hypothesis.
3. Define a rejection region, having determined a value for $\alpha$.
4. Carry out the test. State the decision: reject $H_0$ or fail to reject $H_0$.
5. Give a conclusion in terms of the original problem or question.
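The five steps above can be sketched in Python for one concrete case: a two-tailed z-test of a population mean with known σ. The function name, the sample and the parameter values are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_mean_two_tailed(sample, m0, sigma, alpha):
    # Step 1: H0: mu = m0 against H1: mu != m0.
    # Step 2: test statistic z = (x_bar - m0) / (sigma / sqrt(n)).
    n = len(sample)
    z = (mean(sample) - m0) / (sigma / sqrt(n))
    # Step 3: rejection region |z| > z_crit for the chosen alpha.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Step 4: carry out the test and state the decision.
    reject = abs(z) > z_crit
    # Step 5: conclusion in terms of the original question.
    if reject:
        conclusion = f"the data do not support a population mean of {m0}"
    else:
        conclusion = f"the data are compatible with a population mean of {m0}"
    return z, z_crit, reject, conclusion


# Hypothetical sample and parameters.
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 10.0, 10.6]
z, z_crit, reject, conclusion = z_test_mean_two_tailed(sample, m0=10.0, sigma=0.3, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
print(conclusion)
```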
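Hypothesis Testing for the Population Mean
$H_0: \mu = m_0$
1.) Population with normal distribution, $\sigma$ known: $z = \frac{\bar{x} - m_0}{\sigma/\sqrt{n}}$
2.) Population with normal distribution, $\sigma$ unknown, small $n$: $t = \frac{\bar{x} - m_0}{s/\sqrt{n}}$, with $n - 1$ degrees of freedom
3.) $\sigma$ unknown, large $n$: $z = \frac{\bar{x} - m_0}{s/\sqrt{n}}$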
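When σ is unknown, the test statistic uses the sample standard deviation s. A minimal sketch of that computation follows; the function name and the data are hypothetical. The Python standard library has no Student t distribution, so the critical value would still have to come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, m0):
    """Return (t, degrees of freedom) for H0: mu = m0 with sigma unknown."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    t = (mean(sample) - m0) / (s / sqrt(n))
    return t, n - 1


# Hypothetical sample.
sample = [4.0, 6.0, 5.0, 7.0, 8.0]
t, df = t_statistic(sample, m0=5.0)
print(f"t = {t:.4f} with {df} degrees of freedom")
```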
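Critical Values in the Case of a Large Sample
($z_p$ denotes the $p$-quantile of the standard normal distribution.)
$H_1: \mu < m_0$: reject $H_0$ if $z < z_{\alpha}$
$H_1: \mu \neq m_0$: reject $H_0$ if $z < z_{\alpha/2}$ or $z > z_{1-\alpha/2}$
$H_1: \mu > m_0$: reject $H_0$ if $z > z_{1-\alpha}$

Critical Values in the Case of a Small Sample
($t_p$ denotes the $p$-quantile of the $t$ distribution with $n - 1$ degrees of freedom.)
$H_1: \mu < m_0$: reject $H_0$ if $t < t_{\alpha}$
$H_1: \mu \neq m_0$: reject $H_0$ if $t < t_{\alpha/2}$ or $t > t_{1-\alpha/2}$
$H_1: \mu > m_0$: reject $H_0$ if $t > t_{1-\alpha}$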
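The large-sample critical values are quantiles of the standard normal distribution, so they can be looked up in code instead of a table. The sketch below is one possible way to do this; the function name and the labels `"less"`, `"greater"` and `"two-sided"` are hypothetical.

```python
from statistics import NormalDist


def z_critical_values(alpha, alternative):
    """Return (lower, upper) critical values; None means no limit on that side."""
    quantile = NormalDist().inv_cdf  # quantile function of N(0, 1)
    if alternative == "less":        # H1: mu < m0, reject for small z
        return quantile(alpha), None
    if alternative == "greater":     # H1: mu > m0, reject for large z
        return None, quantile(1.0 - alpha)
    if alternative == "two-sided":   # H1: mu != m0, reject in both tails
        return quantile(alpha / 2.0), quantile(1.0 - alpha / 2.0)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


lower, upper = z_critical_values(0.05, "two-sided")
print(f"reject H0 if z < {lower:.3f} or z > {upper:.3f}")
```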
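Hypothesis Testing for a Population Proportion
$H_0: P = P_0$
In the case of a large sample: $z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}}$, where $Q_0 = 1 - P_0$

Hypothesis Testing for a Population Standard Deviation
$H_0: \sigma = \sigma_0$
Only in the case when the population distribution is normal: $\chi^2 = \frac{(n - 1)\, s^2}{\sigma_0^2}$, with $n - 1$ degrees of freedom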
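Critical values of the $\chi^2$-test
($\chi^2_p$ denotes the $p$-quantile of the $\chi^2$ distribution with $n - 1$ degrees of freedom.)
$H_1: \sigma < \sigma_0$: reject $H_0$ if $\chi^2 < \chi^2_{\alpha}$
$H_1: \sigma \neq \sigma_0$: reject $H_0$ if $\chi^2 < \chi^2_{\alpha/2}$ or $\chi^2 > \chi^2_{1-\alpha/2}$
$H_1: \sigma > \sigma_0$: reject $H_0$ if $\chi^2 > \chi^2_{1-\alpha}$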
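The test statistics for a population proportion and for a population standard deviation can be sketched as follows. Function names and numbers are hypothetical. The χ² critical values are not computed here, because the Python standard library has no χ² distribution; they would come from a table or a statistics package.

```python
from math import sqrt


def z_statistic_proportion(k, n, P0):
    """z statistic for H0: P = P0, given k successes in a large sample of size n."""
    p = k / n        # sample proportion
    Q0 = 1.0 - P0
    return (p - P0) / sqrt(P0 * Q0 / n)


def chi2_statistic_std(s, n, sigma0):
    """Return (chi-square statistic, degrees of freedom) for H0: sigma = sigma0."""
    return (n - 1) * s**2 / sigma0**2, n - 1


# Hypothetical numbers.
z = z_statistic_proportion(k=60, n=100, P0=0.5)
chi2, df = chi2_statistic_std(s=2.5, n=21, sigma0=2.0)
print(f"z = {z:.3f}; chi-square = {chi2:.2f} with {df} degrees of freedom")
```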
Exercise
The data below are a random sample from a whole coffee filling machine which fills 5 g bags of coffee. The distribution of the weight in the population is normal.
Weights of the bags (g) / Number of bags: 4 8 4 45 46 5 3 5 55 8 55 Total
a) Estimate the average weight of the coffee bags! Construct a 95 percent confidence interval for the population mean!
b) The management wants a more precise estimate. What sample size will provide a % less maximum error?
c) Construct a 95% confidence interval for the population standard deviation!
d) How much coffee does this machine need to fill the bags if it produces coffee bags per day? (π = 95%)
e) Estimate the proportion of the bags in the population which contain not more than 5 g of coffee (π = 99%)!
f) Do the data support the statement that the mean of the population is 5 g? (α = %)
g) Test the hypothesis that the population standard deviation is 6 g! Use a 5% significance level!
h) Test the hypothesis that the proportion of the coffee bags containing more than 5 g of coffee is at least 4%!
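Comparing Two Population Means
$H_0: \mu_1 - \mu_2 = \delta_0$

                   Sample 1                            Sample 2
Sample size        $m$                                 $n$
Data               $x_{11}, x_{12}, \dots, x_{1m}$     $x_{21}, x_{22}, \dots, x_{2n}$
Sample mean        $\bar{x}_1$                         $\bar{x}_2$
Sample variance    $s_1^2$                             $s_2^2$

With $d = \bar{x}_1 - \bar{x}_2$:
a) Using two small independent samples and normal populations. Assumption: equal standard deviations.
$t = \frac{d - \delta_0}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}}$, where $s_p^2 = \frac{(m - 1)\, s_1^2 + (n - 1)\, s_2^2}{m + n - 2}$, with $m + n - 2$ degrees of freedom
b) Using two large independent samples:
$z = \frac{d - \delta_0}{s_d}$, where $s_d = \sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$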
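Both two-sample statistics can be sketched in Python. The pooled variance and the standard error of the difference follow the usual textbook definitions; function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2, delta0=0.0):
    """Small independent samples, normal populations, equal standard deviations."""
    m, n = len(sample1), len(sample2)
    d = mean(sample1) - mean(sample2)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((m - 1) * variance(sample1) + (n - 1) * variance(sample2)) / (m + n - 2)
    t = (d - delta0) / (sqrt(sp2) * sqrt(1.0 / m + 1.0 / n))
    return t, m + n - 2  # statistic and degrees of freedom


def large_sample_z_statistic(mean1, var1, m, mean2, var2, n, delta0=0.0):
    """Large independent samples, computed from summary statistics."""
    s_d = sqrt(var1 / m + var2 / n)  # standard error of the difference
    return ((mean1 - mean2) - delta0) / s_d


# Hypothetical data.
t, df = pooled_t_statistic([5.0, 7.0, 9.0], [2.0, 4.0, 6.0])
z = large_sample_z_statistic(mean1=52.0, var1=16.0, m=64, mean2=50.0, var2=9.0, n=36)
print(f"t = {t:.4f} with {df} degrees of freedom; z = {z:.4f}")
```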
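Comparing Two Population Proportions
$H_0: P_1 - P_2 = \varepsilon_0$

                      Sample 1                  Sample 2
Sample size           $m$                       $n$
Proportion            $p_1 = k_1 / m$           $p_2 = k_2 / n$
Standard deviation    $s_1 = \sqrt{p_1 q_1}$    $s_2 = \sqrt{p_2 q_2}$

where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.
Using large independent samples:
$z = \frac{e - \varepsilon_0}{s_e}$, where $e = p_1 - p_2$ and $s_e = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}}$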
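A minimal Python sketch of the two-proportion statistic, using the unpooled standard error built from the two sample standard deviations. The function name and the counts are hypothetical.

```python
from math import sqrt


def z_statistic_two_proportions(k1, m, k2, n, eps0=0.0):
    """z statistic for H0: P1 - P2 = eps0 with large independent samples."""
    p1, p2 = k1 / m, k2 / n
    q1, q2 = 1.0 - p1, 1.0 - p2
    s_e = sqrt(p1 * q1 / m + p2 * q2 / n)  # standard error of p1 - p2
    return ((p1 - p2) - eps0) / s_e


# Hypothetical counts: 60 of 100 in sample 1, 50 of 100 in sample 2.
z = z_statistic_two_proportions(k1=60, m=100, k2=50, n=100)
print(f"z = {z:.4f}")
```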
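Comparing the Variances of Two Normal Populations
$H_0: \sigma_1 = \sigma_2$
Using independent samples: $F = \frac{s_1^2}{s_2^2}$
($F_p(\nu_1; \nu_2)$ denotes the $p$-quantile of the $F$ distribution, where $\nu_1$ and $\nu_2$ are the degrees of freedom of $s_1^2$ and $s_2^2$.)

Alternative                        Lower critical value ($c_l$)       Upper critical value ($c_u$)
$H_1: \sigma_1 < \sigma_2$         $F_{\alpha}(\nu_1; \nu_2)$         -
$H_1: \sigma_1 > \sigma_2$         -                                  $F_{1-\alpha}(\nu_1; \nu_2)$
$H_1: \sigma_1 \neq \sigma_2$      $F_{\alpha/2}(\nu_1; \nu_2)$       $F_{1-\alpha/2}(\nu_1; \nu_2)$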
Critical values of the F-test (figure: rejection regions for the left-tailed, right-tailed and two-tailed alternatives, bounded by critical values of the form $F(\nu_1; \nu_2)$)
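The F statistic for comparing two variances is the ratio of the two sample variances. The sketch below computes it together with the degrees of freedom needed to look up the critical values. The function name and the data are hypothetical, and the critical values themselves would come from an F table or a statistics package.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (F, nu1, nu2) for H0: sigma1 = sigma2 with independent samples."""
    F = variance(sample1) / variance(sample2)  # ratio of sample variances
    return F, len(sample1) - 1, len(sample2) - 1


# Hypothetical data.
F, nu1, nu2 = f_statistic([1.0, 3.0, 5.0, 7.0], [2.0, 3.0, 4.0, 5.0])
print(f"F = {F:.3f} with ({nu1}; {nu2}) degrees of freedom")
```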
Exercise 6 (cont.)
i) The management wants to buy a new filling machine which fills the coffee bags more precisely. Test the statement that the standard deviation of the weight of bags produced by the new machine is not equal to the standard deviation of the old machine. A sample of 5 coffee bags was taken and their total weight was 37.65 kg (x = 9 454 3). α =.
j) Should the management conclude, at the percent significance level, that the new machine fills at least 7 g more coffee per bag?
Exercise 7
The data below are a random sample from a tube cutting machine which cuts mm long pieces of tube.
8; 4; ; ; 94; 95; 5; 94; 97; 93; 5; ; 9; 95; 94
a) Test the hypothesis that the population standard deviation is 3 mm! Use a 5% significance level!
b) Do the data support the statement that the mean of the population is mm? (α = 5%)
Exercise 8
Doctors want to compare the height of basketball players and swimmers. The following sample of 4 basketball players was taken (in cm):
98; ; 99; ; 9; 98; 99; 5; 4; ; 99; 99; ; 4
The average height of the swimmers is 96 cm, and its standard deviation is 5. cm. (The distribution of height is normal.)
a) Determine a 98% confidence interval for the average height of basketball players.
b) At what significance level can you accept that at least half of the basketball players are taller than m?
c) Test the hypothesis that the average height of basketball players is more than that of the swimmers and that the difference is not more than 5 cm.
Exercise 9
From the 6, female employees of a multinational company a sample of 6 was taken. Their total age is 5,33 years (Σx =88).
a) Estimate the average age (π = 98.8%).
b) What sample size will provide half of the maximum error?
c) In the sample there are women younger than 35. Would you accept the statement that 75% of the women are younger than 35? (α = 5%)
d) Another sample was drawn from men. The average age of the men is 36.9 (s = 9.54). At what significance level can you accept that the average age of the women is less than that of the men by 3 years?
Exercise
To increase sales, McDonald's tried two kinds of promotions. On Monday it reduced the price of the menu, and on Tuesday free ice was given with the menu. To test which promotion is more successful, sales were examined in McDonald's restaurants:
.. 3. 4. 5. 6. 7. 8. 9..
Monday 6 9 98 5 39 6 99 93
Tuesday 53 88 93 8 35 97 4 85
Test the statement that the price reduction was more successful. (α = 5%)
Thank You for Your Attention!