Issues Regarding Reliable and Valid Research Studies
Leech, N. L. and Onwuegbuzie, A. J. (2007), in their article “An Array of Qualitative Data Analysis Tools”, state that data analysis is one of the most important parts of the research process and can be one of the most time-consuming. They give two main reasons for using multiple qualitative data analysis tools: representation and legitimation. Careful data analysis, being such an important part of a research study, can also make the study more attractive to readers interested in the chosen topic. For example, stating that ‘23% of students cycle to college’ sounds more reliable than stating that ‘not a lot of students cycle to college’.
Putting figures in a study makes it appear more reliable and more worthy of the reader's trust. However, analysed data collected for a research study can contain discrepancies, whether introduced deliberately or by mistake. So although adding analysed data strengthens a study, a single mistake, if found, could give you a bad reputation or even taint your future research.
The psychologist William James once said, “We are all ready to be savage in some cause. The difference between a good man and a bad one is the choice of the cause”. With this quote in mind, how can a person put their trust in anything someone does, or better yet, writes? In theory, no study can be fully trusted unless there is some way to show the reader that the data are correct. The best way to do this is to use a number of different data collection methods (questionnaires, observation studies or interviews) and attempt to reach the same result, thereby verifying the stated hypothesis. This is known as the triangulation of data.
Reliability is an important concept in research methods: it describes the consistency of a measure. For example, if you selected a group of five people and gave each of them a standard personality test, e.g. the MBTI (Myers-Briggs Type Indicator), you should be able to give the participants the test again and get the same results, no matter how many times you repeat it.
Inter-rater reliability refers to the degree to which two or more individuals agree on something. For example, suppose two or more researchers are observing students in a room as the students discuss a film they have just seen. The researchers use a rating scale on which 1 is most positive and 5 is most negative. Inter-rater reliability then assesses how consistently the rating system is applied. If one researcher gives a student response a "1" while another gives the same response a "5", the inter-rater reliability is clearly poor.
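As a minimal sketch of how such agreement can be quantified, one common statistic is Cohen's kappa, which corrects the raw proportion of agreement for the agreement two raters would reach by chance. The ratings below are invented for illustration, not taken from any actual study:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Proportion of responses on which the raters actually agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if each rater assigned categories at random,
    # following their own marginal frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings (1-5 scale) of ten student responses by two researchers
rater_1 = [1, 2, 1, 3, 5, 2, 1, 4, 2, 1]
rater_2 = [1, 2, 2, 3, 5, 2, 1, 4, 2, 1]
print(round(cohens_kappa(rater_1, rater_2), 2))  # → 0.86
```

A kappa near 1 indicates strong agreement beyond chance; a kappa near 0 indicates that the raters agree no more often than random assignment would produce.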
Test-retest reliability refers to measuring the same participants over time. To determine stability, a measure or test is repeated on the same participants at a later date, and the results are correlated with those of the earlier administrations to give a measure of stability. For example, a researcher assessing the personality of a person suffering from depression might give the participant the same test several times over a certain period. The test-retest method has one notable disadvantage: because the participant takes the same test repeatedly, he or she may remember the answers to the questions. One way of reducing this effect is to reword the questions slightly.
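The correlation mentioned above is typically a Pearson correlation between the two administrations. The sketch below uses invented scores for five hypothetical participants tested twice; a coefficient close to 1 would indicate high test-retest stability:

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two administrations of the same test."""
    mx, my = mean(x), mean(y)
    # Sample covariance divided by the product of sample standard deviations.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical questionnaire scores: same five participants, tested twice
time_1 = [12, 18, 25, 9, 30]
time_2 = [14, 17, 24, 11, 29]
print(round(pearson_r(time_1, time_2), 2))  # → 0.99
```

Here the two administrations track each other almost perfectly, which is what a stable measure should show (bearing in mind the memory effect noted above, which can inflate this figure).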
Internal consistency is the extent to which the items within a test or procedure assess the same characteristic, skill or quality. It gives a measure of how well the individual items of a measuring instrument agree with one another. This type of reliability is very useful and often helps researchers interpret data and predict the value of scores and the limits of the relationships among variables.
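Internal consistency is most often reported as Cronbach's alpha, which compares the variance of the individual items to the variance of participants' total scores. The following is a minimal sketch using an invented three-item questionnaire answered by five hypothetical participants:

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha: internal consistency of a set of test items.

    item_scores: one list per item, each holding every participant's score.
    """
    k = len(item_scores)
    # Sum of the variances of the individual items.
    item_vars = sum(variance(item) for item in item_scores)
    # Variance of each participant's total score across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    return (k / (k - 1)) * (1 - item_vars / variance(totals))

# Hypothetical 3-item questionnaire answered by five participants
items = [
    [4, 3, 5, 2, 4],   # item 1
    [4, 2, 5, 1, 3],   # item 2
    [5, 3, 4, 2, 4],   # item 3
]
print(round(cronbach_alpha(items), 2))  # → 0.94
```

An alpha above roughly 0.7 is conventionally taken to mean the items are measuring the same underlying quality; here the three items hang together very well.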
Difficulties in achieving reliability: researchers can spend weeks making every attempt to ensure accuracy in their studies, yet human error can still affect the results, particularly in naturalistic observation (the study of a person or animal in its own natural habitat). There, the only measuring device available to the researcher is his or her own observation of human or animal interaction, or of reactions to various stimuli. Because these methods are ultimately subjective in nature, results may be unreliable and multiple interpretations are possible. One of these inherent difficulties is quixotic reliability.
Quixotic Reliability refers to “the situation where a single manner of observation consistently, yet erroneously, yields the same result. It is often a problem when research appears to be going well. This consistency might seem to suggest that the experiment was demonstrating perfect stability reliability”.
Colorado State University (Howell et al., 2005) gives several examples of quixotic reliability. I have taken this example directly from the web site: “if a measuring device used in an Olympic competition always read 100 meters for every discus throw, this would be an example of an instrument consistently, yet erroneously, yielding the same result”.
Validity is the term given to the degree to which a study “accurately reflects or assesses the specific concept that the researcher is attempting to measure” (Howell et al., 2005). While reliability is concerned with the accuracy of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure.
Researchers are usually concerned with both external and internal validity. External validity refers to the extent to which the results of a study are generalizable (can be applied from the sample to a larger population) or transferable (can be applied from one context to another similar context). Internal validity refers to the rigour with which the study was conducted (e.g., the study's design, the care taken in conducting measurements, and decisions concerning what was and wasn't measured).
Face validity is concerned with how a measure or procedure appears: does it seem like a reasonable way to gain the information the researchers are attempting to obtain? “Unlike content validity, face validity does not depend on established theories for support” (Fink, 1995).
Criterion-related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure that has already been demonstrated to be valid.
Construct Validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher trying to develop a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.
In conclusion, reliability and validity are both essential parts of writing a research study. The amount of work put into preparing and collecting data is reflected in how credible the study, and its author, will appear. Although triangulation of data is not strictly essential and is very time-consuming, it greatly improves readers' trust in a study and gives the research a more methodical process. The importance of a test achieving a reasonable level of reliability and validity cannot be overemphasized. Dr Tracy Gilbert notes that you may have a reliable test that is not valid, but you cannot have a valid test that is not reliable: “Reliability gives an upper limit to validity”. A score of 6, for example, may be no different from a score of 4 or 8 in terms of what a student knows or has learned, as measured by the test the student has taken.
About.com. Famous psychology quotes. Retrieved February 23rd from: http://psychology.about.com/library/bl_quotes.htm
Carmines, E. G., & Zeller, R. A. (1991). Reliability and validity assessment. Newbury Park, CA: Sage Publications.
Gilbert, T. (2008, February 5th). Dr Tracy Gilbert's guide on reliability and validity. Retrieved February 25th, 2008, from http://www.essex.ac.uk/psychology/psychology/PTR/RESTRICTED/ps114R/ps114W19.pdf
Fink, L.A. (1995). How to measure survey reliability and validity. Thousand Oaks, CA: Sage.
Howell, J., Miller, P., Park, H.H., Sattler, D., Schack, T., Spery, E., Widhalm, S., & Palmquist, M. (2005). Reliability and Validity. Writing@CSU. Colorado State University Department of English. Retrieved February 23rd from: http://writing.colostate.edu/guides/research/relval/
Leech, N. L., & Onwuegbuzie, A. J. (2007). An array of qualitative data analysis tools. School Psychology Quarterly, 22, 557-584.