Can an assessment be reliable and not valid? (3 insights)
This blog post aims to answer the question, “Can an assessment be reliable and not valid?” and explore the various aspects of reliability and validity to help understand the answer.
Can an assessment be reliable and not valid?
Yes, an assessment can be reliable and not valid. The following are 3 insights into how an assessment can be reliable and not valid –
- Reliability and validity.
- High-quality assessments.
- Establishing reliability as well as validity.
These 3 insights into how an assessment can be reliable and not valid will be discussed in further detail below after taking a deeper look at what reliability and validity mean.
What is Reliability?
Reliability is the easier of the two concepts to express and understand. In a research setting, a reasonable definition is this: if a test is reliable, the results will be roughly the same no matter when someone takes it. If the results are inconsistent, the test is not considered reliable.
So, if you are evaluating a test's reliability, the question to ask is: are the test's results consistent? Will the results be the same if someone takes the test now, a week from now, and a month from now?
Assessment providers pay particular attention to two aspects of reliability when evaluating their tests: test-retest reliability and internal consistency.
Test-Retest Reliability.
Assessment providers use test-retest reliability to check that a test's results are stable over time. This technique involves administering the test to the same group of people twice (a few days or weeks apart) and checking how closely the two sets of results agree.
To examine the test's reliability, researchers calculate the correlation coefficient between the two sets of scores, a statistic that for this purpose is read on a scale from 0 (no correlation) to 1 (perfect correlation). Because no test is totally error-free, a perfect correlation is not expected; a correlation of 0.7 or greater is generally required for a test to be regarded as reliable.
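As a rough illustration of the test-retest check, the sketch below computes a Pearson correlation between two sittings of the same test. The scores, the variable names, and the pearson_r helper are all hypothetical and made up for this example; only the 0.7 rule of thumb comes from the text above.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six people, tested two weeks apart.
first_sitting = [72, 85, 90, 64, 78, 88]
second_sitting = [70, 88, 91, 66, 75, 85]

r = pearson_r(first_sitting, second_sitting)
is_reliable = r >= 0.7  # the 0.7 rule of thumb described above
print(f"test-retest r = {r:.2f}, meets 0.7 threshold: {is_reliable}")
```

With real data, each position in the two lists must refer to the same person, otherwise the correlation is meaningless.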
Internal Consistency.
Internal consistency checks that test items meant to measure the same thing are, in fact, related. It is often measured by comparing scores on the first half of the test with scores on the second half.
Because both halves should be measuring the same thing, the correlation between them should be 0.7 or greater. If a section of a pre-employment test is meant to assess math skills, for example, test-takers should do about equally well on the first and second halves of that section.
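The split-half comparison described above can be sketched as follows. The item responses are invented for illustration, and pearson_r is a hypothetical helper written for this example rather than part of any assessment product.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical item-level results (1 = correct, 0 = wrong) on an
# eight-item math section; one row per test-taker.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 0],
]

half = len(responses[0]) // 2
first_half = [sum(row[:half]) for row in responses]   # score on items 1-4
second_half = [sum(row[half:]) for row in responses]  # score on items 5-8

r = pearson_r(first_half, second_half)
print(f"split-half r = {r:.2f}, meets 0.7 threshold: {r >= 0.7}")
```

Splitting by position is the approach the text describes; it assumes the items in each half are of similar difficulty.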
What is Validity?
Validity is harder to measure than reliability, and its definition is correspondingly more involved. There are several ways to establish whether an assessment is valid. In research, validity refers to how accurate a test is or, put another way, how well it serves the purpose for which it is being used.
In pre-employment testing, that purpose is forecasting employee performance or identifying high-potential candidates. Content, criterion-related, and construct validity are three forms of validity that assessment providers may examine.
Content Validity.
An assessment has content validity when the attributes it measures correspond with, and adequately cover, the content of the job.
Evaluating content validity also involves looking at how closely that content relates to job performance.
For example, fast typing would certainly be regarded as a significant part of an executive secretary's job, but not of an executive's. While the executive may be expected to type on occasion, this ability is not nearly as crucial to their job performance as it is for the executive secretary.
Assessing the degree to which test items and job content match each other is one way to ensure that an assessment has content validity.
Criterion-Related Validity.
An assessment has criterion-related validity if its results predict an outcome linked to job performance.
So, how can we know if a test predicts future performance? The results of the assessment must be statistically compared to a measure of employee performance.
For example, an employer who wants to know how well a personality test identifies people who are likely to engage in counterproductive work behaviours might compare applicants' personality test scores with how many accidents or injuries they have on the job, whether they use drugs on the job, or how many times they violate company policies.
Criterion-related validity is the extent to which the assessment results are related to a performance measure, such as counterproductive work behaviours.
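One way to picture that statistical comparison is a correlation between test scores and the performance measure. In the sketch below the scores, the violation counts, and the pearson_r helper are all hypothetical. The sign comes out negative because higher scores go with fewer violations; the strength of the relationship is the size of the number, not its sign.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data for seven hires: personality test score at hiring,
# and number of policy violations recorded in their first year.
test_scores = [55, 62, 70, 74, 81, 90, 93]
violations = [6, 5, 3, 4, 2, 1, 0]

r = pearson_r(test_scores, violations)
print(f"score vs. violations r = {r:.2f}")
print(f"strength of relationship |r| = {abs(r):.2f}")
```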
Construct Validity.
An assessment displays construct validity if it is related to other assessments that measure the same psychological construct (a construct is a concept used to explain behaviour).
Cognitive ability, for example, is a construct that describes a person's ability to understand and solve problems.
To determine construct validity, a company would statistically compare its assessment to comparable assessments that, in principle, should be related because they measure the same thing.
Because they measure two separate constructs, there should not be a substantial correlation between a test that measures personality and one that measures cognitive ability. The personality test, on the other hand, should correlate well with other personality tests.
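The two expectations above (a strong correlation with a test of the same construct, a weak one with a test of a different construct) can be checked side by side. This is a minimal sketch with invented scores for eight people; the test names and the pearson_r helper are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same eight people on three tests.
new_personality = [12, 18, 25, 30, 22, 15, 28, 20]
established_personality = [14, 17, 27, 29, 20, 16, 26, 22]
cognitive_ability = [100, 95, 100, 105, 98, 102, 97, 108]

# Same construct: the two personality tests should agree closely.
same_construct_r = pearson_r(new_personality, established_personality)
# Different construct: personality should not track cognitive ability.
different_construct_r = pearson_r(new_personality, cognitive_ability)

print(f"personality vs. personality r = {same_construct_r:.2f}")
print(f"personality vs. cognitive   r = {different_construct_r:.2f}")
```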
What are these 3 insights into how an assessment can be reliable and not valid?
Reliability and validity.
Outside of statistics, the terms reliability and validity are often used interchangeably. However, when statisticians use these terms, they are referring to different aspects of the statistical or experimental approach.
Reliability is another word for consistency. A test is reliable if one individual takes the same personality test several times and gets the same results each time.
A test is valid if it measures what it claims to measure. The personality test would be invalid if the results showed that an extremely timid individual was actually extroverted.
If a test is not reliable, it cannot be considered valid. A test can, on the other hand, be reliable without being valid.
Suppose your bathroom scale is miscalibrated so that it reads ten pounds lighter than your actual weight. The weight it shows will be reliable (it will be the same every time you step on it), but it will not be valid, because it is not your real weight.
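The bathroom scale example can be put in numbers. In this sketch the true weight, the offset, and the function name are assumptions chosen for illustration: the spread of repeated readings stands in for reliability, and the gap between the readings and the true weight stands in for validity.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 150.0  # hypothetical true weight in pounds
OFFSET = -10.0       # the scale reads ten pounds too light

def miscalibrated_scale(true_weight):
    """Return the reading from a scale that is consistent but biased."""
    return true_weight + OFFSET

# Step on the scale five times.
readings = [miscalibrated_scale(TRUE_WEIGHT) for _ in range(5)]

spread = pstdev(readings)            # zero spread: perfectly repeatable
bias = mean(readings) - TRUE_WEIGHT  # nonzero bias: systematically wrong

print(f"readings: {readings}")
print(f"spread = {spread}, bias = {bias}")
```

A spread of zero with a nonzero bias is exactly the "reliable but not valid" case: the scale agrees with itself every time, yet never with the truth.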
Likewise, if you give potential recruits a personality test twice and get the same results both times, you have a reliable test.
However, if the personality test isn’t assessing the personality traits it purports to be measuring and instead correlates to a different evaluation, such as on-the-job abilities, the assessment isn’t likely to be valid.
High-quality assessments.
Validity and reliability are two of the cornerstones of high-quality assessments.
Though these two characteristics are frequently discussed in tandem, it is important to note that an assessment can be reliable (i.e., have repeatable results) without necessarily being valid (i.e., accurately measuring the skills it is intended to measure), but an assessment cannot be valid unless it is also reliable.
Other standards for high-quality assessments include fairness (the assessment is free of bias) and coherence (each assessment is used in a way that is consistent with its intended purpose).
Establishing reliability as well as validity.
The assessment should always be linked to a measurable personality trait, a real job outcome, or an objective problem.
The provider's industrial-organizational researchers should have done extensive research and engaged subject matter experts in your industry to review the test questions and make sure they are designed to measure what they are supposed to measure.
Furthermore, the sample population used to construct the test should be representative of the wider population that will take the assessment. (You wouldn't want a test built on a homogeneous population or a tiny sample, for example.)
Next, find out whether your potential providers have put safeguards for reliability in place by asking –
- “Does the assessment employ clear, easy-to-understand language with a range of questions to measure each category?”
- “Did the industrial-organizational researchers do a bias check on the test items?”
- “Can you tell me about the sample population that was utilised to create and validate the test?”
The assessment you pick should also provide specific instructions that minimise any differences in testing settings, from the amount of time allotted for taking the test to the amount of noise in the testing location.
When you grasp the differences between reliability and validity, you’ll realise that both are important for the success of every test you utilise.
For example, any pre-employment exam, from cognitive ability tests to personality tests to emotional intelligence tests, must measure what it claims to measure and generate consistent findings over time to be beneficial to you and your firm.
Conclusion –
This blog post aimed to answer the question, “Can an assessment be reliable and not valid?” and reviewed the various aspects of reliability and validity to help determine if an assessment can be reliable and not valid. Please feel free to reach out to us with any questions or comments you may have.