More on Hypothesis Testing
Professor Jamal entered the class. After saying “Good morning”, he started writing “Intervening Variable”, the topic of the day, on the whiteboard. While writing, he noticed that not only was the class response lukewarm but there was also pin-drop silence. When he turned back towards the students, he was surprised to see many sleepy and drowsy faces, some with red eyes. A bad start, he thought, and somehow finished the period. There were more surprises in store for him, as he found the two other classes quite lively and cheerful.
After finishing the teaching sessions, he relaxed in a reclining chair, but his mind was occupied by the behavior of the morning class. Was it due to the first period, as students rush off from their houses and collapse in the class? Was it due to dull sessions? Were these students different from other students, as perhaps they do not get full sleep? Were there other distractions, such as a drama week or the forthcoming week-long tour to a neighboring country? These were relevant assumptions, suppositions or guesses, and each is called a "hypothesis". If the professor was looking for the root cause of the problem, these hypotheses would be tested one by one and only those that proved valid would be retained.
What is a hypothesis?
The word “hypothetical” means theoretical, imaginary, academic, contestable, disputable, questionable, refutable, and unconfirmed.
A hypothesis is an attempt to give a possible cause of a certain situation. It is a tentative statement that proposes a possible explanation for some problem or event. A useful hypothesis is a testable statement.
It differs from a theory, which has already been tested. In fact, a theory starts with a hypothesis. If the hypothesis is proved “true”, it becomes a theory and remains valid until proved wrong by subsequent scientific investigation. At one time it was found that eggs raise cholesterol. But subsequent investigations revealed that eating two egg whites and one egg yolk is a healthy protein diet.
The research starts
So the professor decided to conduct research to find out the real cause of the class behavior. He had already thought the matter over and made the following assumptions:
- (1) Was it due to the first period, as students rush off from their homes and collapse in the class?
- (2) Was it due to a dull lecturer?
- (3) Were these students different from other students, as perhaps they do not get full sleep?
- (4) Were there other distractions, such as a drama week or the forthcoming week-long tour to a neighboring country?
He discussed the above assumptions with students and other faculty members and came to the following conclusions:
Assumption (1) was not tenable, as other students were observed to be fresh and cheerful even in the first period.
As to (2), the sessions were not dull; rather, the course had entered an interesting phase.
While (3) would require some data, (4) was rejected right away as there were no extra-curricular activities going on in the University.
Formulate the null and alternative hypotheses
Hypothesis testing is one of the most important concepts in statistics. It starts with the null hypothesis, which states that (i) nothing was found, (ii) the sample is not different from the population, or (iii) all attempts to improve the situation have not been successful. The ‘alternate’ hypothesis indicates the contrary situation: (i) a cause has been found, (ii) the sample is different, or (iii) there has been a significant improvement which cannot be due to chance alone.
In fact, the researcher puts his expected finding in the alternate, but he tries to reject the null hypothesis in order to conclude that the alternate is supported. In other words, he tries to disprove the null hypothesis. If he disproves that nothing was found, then we conclude that SOMETHING was found. It is easier to reject a statement than to prove it.
NULL HYPOTHESIS: There is no difference in the sleeping pattern of final-year students and the other students.
ALTERNATE: Final-year students sleep less than the other students.
Choose a level of significance.
A result is called significant if we have reason to believe that it shows a real change or difference, one brought about by our efforts and not by chance or sampling error. We are basing our arguments about a population on the study of a small sample. The result would be called significant only if it is unlikely to be due to sampling error, for example because the sample was too small or happened by chance to contain better data.
Usually, the level of significance is chosen to be 0.05 or 5% and is denoted by α. It means that the chance of rejecting a null hypothesis that is actually true is only 5%. We can reduce this chance by choosing a significance level of 1% or even less.
Determine the sample size.
Sampling is the process of selecting a sufficient number of elements (members of the population) for study and generalization. The size of the sample depends on three factors: (i) confidence interval, (ii) confidence level and (iii) degree of heterogeneity as described by the standard deviation. In other words, it depends upon the precision, the confidence and the characteristics of the population.
It must be remembered that there is no direct relationship between population size and sample size. However, if a larger sample is taken from the same population, the sampling error is reduced. But this certainly increases the cost while not reducing the error in the same proportion. In other words, if we double the sample, the sampling error does not decline to one-half, because the standard error falls only with the square root of the sample size.
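A minimal Python sketch of the last point, assuming the usual formula for the standard error of a mean (standard deviation divided by the square root of the sample size). The function name and the figures used (a standard deviation of 0.5 and samples of 25, 50 and 100) are illustrative assumptions only.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 0.5  # illustrative standard deviation (assumption)

for n in (25, 50, 100):
    print(n, round(standard_error(sigma, n), 4))
# 25 0.1
# 50 0.0707
# 100 0.05
```

Doubling the sample from 25 to 50 cuts the error only to about 71% of its former value; it takes four times the sample (100) to halve it.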
In this case, the professor knew from past experience that there is a lot of homogeneity among the students, since most of them come from middle-class families living in the close vicinity of the university. Previous studies show that the students normally sleep for 7 hours each night with a small standard deviation of 0.5 hour. Now the professor wants to be 95% sure that the estimate would not differ by more than 20% on either side. If he makes it 2%, the sample size would be very high, costing a lot of time and money in interviewing students and subsequently checking and re-checking. Based on this information, the sample size would be 23 students as shown in the side sheet.
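The side sheet is not reproduced here, so the exact working behind the figure of 23 is not known. As a rough sketch only, the textbook formula for estimating a mean, n = (z × σ / E)², can be tried in Python. It is an assumption of this sketch that the "20%" above corresponds to a margin of error E of 0.2 hour (and "2%" to 0.02 hour); the function name is hypothetical as well.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n with z * sigma / sqrt(n) <= margin (normal approximation)."""
    # Two-sided z value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# sigma = 0.5 hour comes from the article; the margins are assumed readings
print(sample_size_for_mean(0.5, 0.2))   # 25
print(sample_size_for_mean(0.5, 0.02))  # 2401
```

Under these assumptions the formula gives a sample in the mid-twenties, close to but not identical with the 23 quoted above, and it shows why a margin ten times tighter needs roughly a hundred times as many students.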
Collect data and calculate statistics.
Since it was a university, a complete list of students was available. From this list, a random sample of 23 students was drawn and they were asked to fill in a questionnaire which, inter alia, included the question “How many hours do you sleep in 24 hours?”. The responses were processed and the mean was found to be 6.5 hours with a standard deviation of 0.5.
Calculate z or t score
Since we are testing a single sample based on 23 observations, a t-test would be used rather than a z-test, which is applicable to a sample of more than 30.
Based on the working given in the side sheet, we find the test statistic to be -4.8.
Now the professor consulted the t-table and looked for the value of t when the degrees of freedom are n-1, i.e. 23-1=22, and the level of significance is 0.05 (two-tailed). It is -2.07.
What is your conclusion?
Would you reject the null hypothesis, or would you fail to reject the null hypothesis, which says "There is no difference in sleeping pattern of final-year students and the other students"?
You reject the null hypothesis. Your test statistic is in the rejection region, which is shown in the curve on the side sheet. Since the calculated t-value is far beyond the table value, we reject the null hypothesis. In doing so, we accept the alternate, which says 'Final-year students sleep less than the other students'.
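The calculation and the decision can be retraced with a short Python sketch, assuming the standard one-sample formula t = (sample mean - hypothesized mean) / (s / √n). The figures are those given in the article; the table value of -2.07 is copied from the text rather than computed, and the function name is hypothetical.

```python
import math


def one_sample_t(sample_mean: float, hypothesized_mean: float,
                 sample_sd: float, n: int) -> float:
    """t statistic for a single sample mean against a hypothesized mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


t_stat = one_sample_t(6.5, 7.0, 0.5, 23)  # figures from the article
df = 23 - 1                               # degrees of freedom, n - 1
t_table = -2.07                           # table value quoted in the article

# Reject the null hypothesis when the statistic falls below the table value
reject_null = t_stat < t_table
print(round(t_stat, 2), df, reject_null)  # -4.8 22 True
```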
There can be an error in accepting or rejecting the null hypothesis. One can say there was a significant improvement when there was none. One can find no effect when there was one.
Suppose you go to a doctor to find out whether you are sick or not. This can lead to the following situations:
- You may be sick
- You may not be sick
- The doctor may tell you that you are sick
- The doctor may say that you are not sick
Now here the null hypothesis is: You are not sick
In this case, the alternate would be: You are sick
Consider the doctor as a researcher. He would reject the ‘null’ and accept the alternate if he found you sick. If you are not sick but the doctor says you are sick, he is making a mistake. Another error can be that you are sick but the doctor says you are not sick. These are incorrect conclusions. In statistical hypothesis testing, these are called Type I error and Type II error respectively.
TYPE I ERROR
- If a null hypothesis is incorrectly rejected when it is in fact true, this is called a Type I error.
- Also known as an "error of the first kind" or an α error, or a "false positive".
- It indicates "A Positive Assumption is False"
TYPE II ERROR
- A Type II error occurs when a null hypothesis is not rejected despite being false.
- Also known as an "error of the second kind", a β error, or a "false negative".
- It indicates "A Negative assumption is False".
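The two definitions can be summarized in a small Python sketch. The function name and the wording of the labels are illustrative choices; the doctor example uses the null hypothesis "You are not sick" from the text above.

```python
def classify_decision(null_is_true: bool, null_rejected: bool) -> str:
    """Label the outcome of a test from the truth and the decision taken."""
    if null_is_true and null_rejected:
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type II error (false negative)"
    return "correct decision"


# Null hypothesis: "You are not sick"
# Healthy patient, doctor says "sick": the true null was rejected
print(classify_decision(null_is_true=True, null_rejected=True))
# Sick patient, doctor says "not sick": the false null was not rejected
print(classify_decision(null_is_true=False, null_rejected=False))
```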
It is difficult to accept a statement directly but it is easier to reject one. We cannot prove that all dogs bark, as some breeds, such as the Basenji, are known for not barking. But we can reject the supposition that “No dog barks”, since a single barking dog is enough to disprove it. Side by side, we can put forward the alternate supposition that “Some dogs bark”. If we reject the first proposition, we accept the alternate.
While accepting or rejecting, we may make errors called Type I and Type II. If we want to reduce the Type I error, one way is to raise the confidence level to, say, 99.99% (that is, to lower the level of significance), but in that case we would increase the possibility of a Type II error. Both can be reduced if we increase the sample size and also increase the reliability of the data. In any case, we are trying to evolve a theory which may be nullified by subsequent investigations.
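This trade-off can be illustrated with a rough Python sketch. It is a simplification under stated assumptions: a one-sided z-test with a known standard deviation (a normal approximation, not the t-test used earlier) and a hypothetical true mean of 6.8 hours, chosen only for illustration. The function name is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha: float, n: int, mu0: float, mu1: float, sigma: float) -> float:
    """Probability of NOT rejecting H0: mu = mu0 when the true mean is mu1 < mu0.

    One-sided (lower-tail) z-test with known sigma; normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # critical z for the chosen alpha
    shift = (mu0 - mu1) * sqrt(n) / sigma       # true effect in standard-error units
    power = NormalDist().cdf(shift - z_alpha)   # chance of correctly rejecting H0
    return 1 - power


# mu0 = 7 and sigma = 0.5 come from the article; mu1 = 6.8 is an assumption
base = type_ii_error(alpha=0.05, n=23, mu0=7.0, mu1=6.8, sigma=0.5)
stricter = type_ii_error(alpha=0.01, n=23, mu0=7.0, mu1=6.8, sigma=0.5)
larger = type_ii_error(alpha=0.05, n=46, mu0=7.0, mu1=6.8, sigma=0.5)

print(stricter > base)  # True: a stricter alpha raises the Type II error
print(larger < base)    # True: a larger sample lowers it
```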
SOME RELEVANT TERMS
The level of significance (α) is also known as SIZE or TYPE I ERROR. This is the probability that, under some null hypothesis, a statistical test will generate a false-positive error: affirming a non-null pattern by chance.
The P value is the probability that random sampling would lead to a difference between sample means as large as (or larger than) the one you observed, if the null hypothesis were true.
Wilcoxon Rank Sum Test
This is a test for an EXPERIMENTAL DESIGN involving two INDEPENDENT GROUPS of experimental units, where data need only be ORDINAL-SCALE. (The Wilcoxon Signed Rank Test, by contrast, applies to paired or matched observations.)
A statistical test provides a mechanism for making quantitative decisions about a process or processes.