A sample is a part of the whole, called the population, whose properties are studied to gain information about the entire lot. When dealing with people, it can be defined as a set of respondents or representatives selected from a larger population for the purpose of a study.
A researcher need not study the entire population, for reasons such as (i) the cost is too high and (ii) the population is dynamic and could change over time. Sampling offers benefits such as (i) lower cost, (ii) faster data collection and (iii) accuracy.
In some cases, sampling is the only option because the entire population is inaccessible, as with wildlife in a forest. Moreover, a study or experiment may destroy the items being tested, and the entire population cannot be destroyed for the sake of a study. Suppose a researcher wants to find the average life of a light bulb. The researcher would switch on a number of bulbs and keep them on until they burn out. The average of their lifetimes gives an estimate of the life of a light bulb. But all the bulbs (the population) cannot be put to the test, or there would be nothing left to sell.
If the population is small, the researcher can study every member of the population. This is called a census. For example, if there are only 10 fertilizer plants in a country, it would be worthwhile to study all of them rather than a sample.
Representativeness of the sample
An appropriate and adequate sample is expected to reflect the properties or characteristics of the population. Depending upon the situation, the researcher may feel comfortable with a random sample. If the population is not homogeneous, the researcher can resort to other ways of taking samples, such as a stratified, cluster or even convenience sample. Moreover, the degree of representativeness of the sample may be limited by cost or convenience.
There are some other problems in sampling. The first is coverage error, which depends on how good the population frame is. If we want to study the people of Karachi and we draw our sample from the telephone directory, some percentage of the population may not have telephones and hence has no chance of being included in our survey. In such a situation, a sample drawn from the telephone directory may not be representative. Next is sampling error, which arises because only a sample rather than the whole population is observed, and which becomes smaller as the sample size grows. Similarly, there are non-sampling errors, such as those which arise when some of the selected respondents do not participate.
[Figure: Sample versus population mean]
TYPES OF SAMPLING
Basically, there are two types of sampling: Non-Probability and Probability Sampling.
In non-probability sampling, elements have no known or pre-determined chance of being selected. It is used when time is critical, and the researcher relies on convenience or judgement. It includes quota sampling, in which a quota is assigned to certain areas and the sample is selected as conveniently as possible. Another technique is snowballing, which is used when people are reluctant to speak or there is no easy way of locating them. Examples include illegal immigrants, drug addicts, left-handed people, vegetarians or homeless people. In this case, the researcher selects one person as a sample and asks him or her to refer the researcher to other people in the same category.
In probability sampling, elements have some known chance of being selected. If a researcher decides to pick 10 out of 100 by lottery, everyone has a 10% chance of being selected. This method is used when wider generalizability is needed or when selection bias is to be reduced to the minimum possible.
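The lottery example can be sketched in a few lines of Python. This is only an illustration: the function name `lottery_sample`, the numbering of the population from 1 to 100 and the fixed seed are assumptions made for the example, not part of the original text.

```python
import random

def lottery_sample(population, k, seed=None):
    """Draw k distinct members at random, each with the same chance (a lottery)."""
    rng = random.Random(seed)          # a seed only makes the draw repeatable
    return rng.sample(population, k)   # sampling without replacement

# Hypothetical population: 100 people numbered 1 to 100.
population = list(range(1, 101))
sample = lottery_sample(population, 10, seed=42)

# Every member has the same known chance of selection: 10 out of 100.
inclusion_probability = len(sample) / len(population)
print(sample)
print(f"Chance of selection: {inclusion_probability:.0%}")
```

Because the chance of selection is known in advance, results from such a sample can be generalized to the population, which is the point made above.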
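A brief description of the two types of sampling is given below, with the strengths and weaknesses of each technique:
Non-probability sampling
- Convenience sampling. Strengths: least expensive, least time consuming, most convenient. Weaknesses: selection bias, sample not representative.
- Judgemental sampling. Strengths: low cost, convenient, not time consuming. Weaknesses: subjective, does not allow generalization.
- Quota sampling. Strengths: sample can be controlled for certain characteristics. Weaknesses: selection bias, no assurance of representativeness.
- Snowball sampling. Strengths: can estimate rare characteristics.
Probability sampling
- Simple random sampling. Strengths: easily understood, results projectable. Weaknesses: difficult to construct sampling frame, expensive.
- Systematic sampling. Strengths: can increase representativeness, easier to implement than simple random sampling, sampling frame not necessary. Weaknesses: can decrease representativeness.
- Stratified sampling. Strengths: includes all important sub-populations, precision. Weaknesses: difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive.
- Cluster sampling. Strengths: easy to implement, cost effective. Weaknesses: imprecise, difficult to compute and interpret results.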
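Two probability designs, systematic and stratified sampling, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the population of 100 numbered members, the "urban" and "rural" strata and the 10% sampling fraction are all hypothetical, and the stratified version uses proportionate allocation, which is only one of several possible choices.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick every step-th member after a random start (assumes k <= len(population))."""
    rng = random.Random(seed)
    step = len(population) // k        # sampling interval
    start = rng.randrange(step)        # random starting point within the first interval
    return [population[start + i * step] for i in range(k)]

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportionate allocation)."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one member per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 members, 70 urban and 30 rural.
population = list(range(1, 101))
strata = {"urban": population[:70], "rural": population[70:]}

sys_sample = systematic_sample(population, 10, seed=1)
strat_sample = stratified_sample(strata, 0.10, seed=1)

print(sys_sample)
print({name: len(members) for name, members in strat_sample.items()})
```

The systematic draw needs only an ordered list and an interval, while the stratified draw guarantees that every sub-population appears in the sample.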
Issue of precision and confidence in determining sample size
Sampling is the process of selecting a sufficient number of elements for study and generalization. It may be noted that there is no direct relationship between population size and sample size. But, of course, the larger the sample, the smaller the sampling error. Also, the larger the sample, the higher the cost, but the error is not reduced in the same proportion. In other words, if we double the size of our sample, the cost would roughly double but the error would not be reduced by a corresponding 50%, because sampling error falls roughly with the square root of the sample size.
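The effect of doubling the sample can be checked with a short calculation. The sketch below assumes simple random sampling from a large population, where the standard error of the mean is the standard deviation divided by the square root of the sample size. The standard deviation of 1,000 and the sample sizes are made-up figures for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean under simple random sampling."""
    return sd / math.sqrt(n)

sd = 1000  # assumed population standard deviation (illustrative only)
for n in (100, 200, 400):
    print(n, round(standard_error(sd, n), 1))

# Doubling the sample from 100 to 200 cuts the error by about 29%, not 50%.
ratio = standard_error(sd, 200) / standard_error(sd, 100)
print(f"Error after doubling the sample: {ratio:.1%} of the original")
```

Under these assumptions, the sample has to be quadrupled, from 100 to 400, before the error is halved.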
In determining the sample size, we need three pieces of information:
- Confidence Interval or desired precision: ± 10 or ±1,000
- Confidence Level or in other words how sure you want to be: 68%, 95% or 99.9%
- How heterogeneous are the numbers as indicated by the standard deviation of the population.
A bank manager is interested in knowing the weekly cash withdrawal from her bank. She wants to estimate the weekly withdrawal with 95% confidence that the result will be within a narrow range of ± 500. How large a sample should she select?
First, she would pick a small sample at random just to estimate the standard deviation. Suppose she takes five observations of weekly withdrawal, which come to 200,000, 202,000, 204,000, 206,000 and 209,100. The standard deviation of these amounts is about 3,527.
The Z value for 95% confidence is 1.96. Using the formula given in the side sheet, the sample size works out to about 191.
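The side sheet is not reproduced here, so the sketch below assumes the standard formula for estimating a mean, n = (Z × s / E)², where s is the standard deviation of the small sample and E is the desired precision. The function name `required_sample_size` is chosen only for this example.

```python
import math
import statistics

def required_sample_size(sd, z, margin):
    """Sample size for estimating a mean: n = (z * sd / margin) ** 2 (assumed formula)."""
    return (z * sd / margin) ** 2

# The five weekly withdrawals from the small initial sample.
pilot = [200_000, 202_000, 204_000, 206_000, 209_100]

sd = statistics.stdev(pilot)             # sample standard deviation, about 3,527
n = required_sample_size(sd, 1.96, 500)  # about 191.2

print(round(sd))
print(round(n, 1))
print(math.ceil(n))  # rounded up to a whole number of observations
```

Under this assumed formula the result agrees with the figure of about 191. Rounding up to the next whole number, as is often done to be on the safe side, would give 192.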
There is a trade-off between confidence and precision. For a given sample, the more precise you want to be, the lower your confidence level will be. For example, you see a person and want to guess his income from his dress. You can be 90% sure that you have made a good guess if you say that his income is between Rs.5,000 and Rs.50,000. But that is a wide range. If you narrow it and try to be more precise by saying between Rs.40,000 and Rs.45,000, you can no longer be 90% confident but maybe only 50%.
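The same trade-off can be put in numbers using the bank example. The sketch below holds the sample size at 191 and the standard deviation at 3,527, and shows how the range widens as the confidence level rises. It assumes the normal approximation for the sample mean, and the function name `margin_of_error` is chosen only for this example.

```python
import math
from statistics import NormalDist

def margin_of_error(sd, n, confidence):
    """Half-width of the range around the sample mean for a given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided Z value
    return z * sd / math.sqrt(n)

sd, n = 3527, 191  # figures from the bank example
margins = {c: margin_of_error(sd, n, c) for c in (0.68, 0.95, 0.999)}

for confidence, margin in margins.items():
    print(f"{confidence:.1%} confidence: ± {margin:,.0f}")
```

With the sample fixed, a narrower range can only be claimed at a lower confidence level. To keep both high confidence and a narrow range, the sample size has to grow.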
[Figure: Random sample of cars]
In any research, selection of an appropriate sample is very important. This involves both the sampling plan and the sample size.
Probability samples are good for generalization, as the samples are selected on a random basis. But if time and cost are the main concerns, there are many non-probability sampling techniques which are cheaper and quicker.
Awareness of sample designs and sample size helps researchers strengthen their research. It also helps them assess the cost implications of different designs and the trade-off between precision and confidence.