Friday, September 24, 2010

Classical Test Theory Basic Concepts

Part I:
Student Question 1:  What do I want to learn?
      I want to learn how to use appropriate tools to measure the scales of a study.
 Student Question 2:  What do you know now?
      Beyond my answer to question 1: I know reliability and validity (how important a role
      reliability and validity play in scale measurement), and CTT (the definition
      of CTT, its advantages, and its limitations). However, in class, we did not cover
      all of the information. Sometimes it is kind of frustrating to study alone at
      home; I need to pay more attention when studying that material alone.
Student Question 3:  What must change for me to learn what I do not know?
       1) I will do some research online and check the textbooks if I have trouble
           understanding those materials;
       2) If that still does not work, I will ask experts in this field, such as my friends who are
           familiar with measurement.
       3) Finally, bring the questions to class and ask Dr. Farmer.
  Student Question 4:  What can I do to make this happen?
       Our two class assignments, the final class project, and the annotated bibliography
       will guide me in learning and in applying what I have learned to the papers.
       1) Follow the weekly reading assignments; 2) complete the two assignments by their due dates.
       Sometimes I feel it is really helpful to put what we have learned into a paper,
       because there is an actual sample, data, and a result of the data analysis; it is a good
       learning experience/process.
     
X = T + E
where X is the observed score; T is the true score (the mean score a person would obtain on the same test over an infinite number of testing sessions); and E is the error (across multiple observations of the same person, error is normally distributed and uncorrelated with the true score).
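A minimal simulation sketch in Python (my own illustration, not from class; the true score, error spread, and session count are all made-up values) showing how the mean of many observed scores converges on the true score:

```python
import numpy as np

rng = np.random.default_rng(0)

true_score = 70.0        # T: the person's fixed true score (hypothetical value)
n_sessions = 10_000      # stand-in for the "infinite number of testing sessions"
errors = rng.normal(loc=0.0, scale=5.0, size=n_sessions)  # E: normal, mean 0
observed = true_score + errors                            # X = T + E per session

# Averaging the observed scores recovers T as the number of sessions grows
print(observed.mean())   # close to 70.0
```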

Variance & Reliability
VAR(X) = VAR(T) + VAR(E) + 2COV(T, E); since T and E are uncorrelated, COV(T, E) = 0, so
VAR(X) = VAR(T) + VAR(E)

Reliability = VAR(T)/VAR(X) (we cannot directly observe VAR(T))
Reliability = 1 - VAR(E)/VAR(X) (equivalent, since VAR(T) = VAR(X) - VAR(E))
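A quick numerical check of these two formulas (a sketch under assumed, made-up variances, not class data): simulate a group of examinees with VAR(T) = 100 and VAR(E) = 25, so reliability should come out near 100/125 = 0.8 by either formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people = 100_000

T = rng.normal(50, 10, n_people)   # true scores: VAR(T) = 10^2 = 100
E = rng.normal(0, 5, n_people)     # errors: VAR(E) = 5^2 = 25, uncorrelated with T
X = T + E                          # observed scores

print(T.var() / X.var())           # Reliability = VAR(T)/VAR(X)      ~ 0.8
print(1 - E.var() / X.var())       # Reliability = 1 - VAR(E)/VAR(X)  ~ 0.8
```

Both lines print nearly the same value, which is the point of the second formula: it gives the same reliability without needing VAR(T), which cannot be observed directly in real data.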


What sorts of things create measurement error?
1) Error can result from the way the test is designed, factors related to the individual students, the testing situation, and many other sources. Some students may know the answers, but fatigue, distractions, and nervousness affect their ability to concentrate. Students may know correct answers but accidentally mark wrong answers on an answer sheet. Students may misunderstand the instructions on a test or misinterpret a single question. Scores can also be an overestimate of true achievement: students may make random guesses and get some questions right (Johnson et al., 2000).

2) Test-specific sources of error are another source of measurement error.
For example, suppose the test uses reading selections as the basis for some questions. If a class happened to have previously studied the text passage being used, that class will probably do better than a class of students who have never seen the text before. For some tests, we know that changing the order of the items on the test leads to higher or lower scores. This means the order of the items is causing measurement error. Some test items may be biased in favor of or against particular groups of students. For example, if the reading passage contains a story that takes place on a farm, students from the inner city may be at a systematic disadvantage in making inferences based on the story.

Inter-rater Reliability (coefficient of agreement)
1) analogous to alternate forms
2) have two observers assess the same phenomenon, and assess consistency between the observers.
Source of measurement error: the observers of the phenomenon (observer 1 vs. observer 2)
- can introduce subjective bias

Cohen's Kappa (inter-rater reliability measure): more sophisticated; takes chance agreement into account.
Values range from -1 (less agreement than expected by chance) to +1 (perfect agreement)
above .75  "excellent"
.40-.75 "fair to good"
below .40 "poor"
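A small sketch of the kappa computation (my own example with made-up ratings; the cohens_kappa helper is written here for illustration, though sklearn.metrics.cohen_kappa_score computes the same statistic): observed agreement is corrected by the agreement expected by chance from each rater's marginal proportions.

```python
import numpy as np

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    r1, r2 = np.asarray(rater1), np.asarray(rater2)
    categories = np.union1d(r1, r2)
    p_o = np.mean(r1 == r2)  # observed proportion of agreement
    # chance agreement: product of the raters' marginal proportions, per category
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Two observers rating the same 10 phenomena
a = ["yes", "yes", "no", "yes", "no", "no",  "yes", "no", "yes", "yes"]
b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]
print(cohens_kappa(a, b))  # ~0.58: "fair to good" on the scale above
```

Here the raters agree on 8 of 10 items (p_o = .80), but chance alone would produce p_e = .52 agreement, so kappa credits only the agreement beyond that.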

Reliability coefficient values:
.90 and up "excellent"
.80-.89 "good"
.70-.79 "adequate"
below .70 "may have limited applicability"

Different procedures requiring two test administrations to the same group:

Test-retest (coefficient of stability)
Time 1 - Time 2; source of measurement error: the time factor (e.g., an intervention program between administrations)
1). A. Test-Retest Method: If you are concerned with error factors related to the passing of time, then you
     want to know how consistently examinees respond to this form at different times. Administer, wait, and
     then re-administer. The correlation coefficient from this procedure is called the coefficient of stability.
      B. Test-Retest with Alternate Forms: Administer form 1 of the test, wait, then administer form 2. The
     correlation coefficient is known as the coefficient of stability and equivalence.
2). The reliability coefficient reported is the correlation between the two administrations (see the sketch
     after this list). The assumption is that the correlation is less than perfect (not 1.00) because of error.
3). However, this technique is particularly prone to carry-over effects from one administration to another.
      Reliability will be overestimated.
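A minimal sketch of the test-retest computation (made-up scores for eight hypothetical examinees): the coefficient of stability is just the Pearson correlation between the two administrations.

```python
import numpy as np

# Hypothetical scores for the same 8 examinees at two administrations
time1 = np.array([85, 78, 92, 66, 74, 88, 95, 70])
time2 = np.array([83, 80, 90, 70, 71, 85, 97, 72])

# Pearson correlation between the administrations = coefficient of stability
r = np.corrcoef(time1, time2)[0, 1]
print(r)  # high here, since the made-up scores barely change between times
```

The same computation applied to Form A vs. Form B scores gives the coefficient of equivalence described in the next section.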

Parallel or alternate-form (coefficient of equivalence)
Two supposedly equivalent forms of the same instrument are administered to the same individuals, either immediately or in delayed succession.
Form A - Form B (from time 1 to time 2)
Alternate form method: To reduce the possibility of cheating, similar tests need to be given over time (e.g., board exams). The errors of measurement that concern the test user are those due to differences in the content of the test forms. A correlation coefficient should be used to see how different the tests are. This is called the coefficient of equivalence. It is usually between .8 and .9. (http://www.smaddicts.com/2008/09/what-is-reliability_28.html)

Internal analysis (coefficient of internal consistency)-Internal consistency is a method of estimating reliability that is computed from a single administration of a test. The coefficients reflect the degree to which the items are measuring the same construct and are homogeneous. Cronbach's alpha and the Kuder-Richardson formulas are measures of the internal consistency of a test. (http://www.csus.edu/indiv/d/deaner/glossary.htm#i)
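A sketch of Cronbach's alpha from its textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the score matrix below is made up for illustration. With 0/1-scored items, this reduces to the Kuder-Richardson formula (KR-20).

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha from a (people x items) score matrix."""
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical single administration: 5 examinees x 4 items (0 = wrong, 1 = right)
scores = [[1, 1, 1, 0],
          [1, 0, 1, 1],
          [0, 0, 0, 0],
          [1, 1, 1, 1],
          [0, 1, 0, 0]]
print(cronbach_alpha(scores))  # ~0.74 for these made-up data
```

Note that only one administration is needed: the items themselves play the role that the two forms or two time points play in the procedures above.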
