Testing the items

Once a preliminary version of the test is available, the developer usually administers it to a modest-sized sample of subjects in order to collect initial data about test-item characteristics.
Testing the items entails a variety of statistical procedures referred to collectively as item analysis. The purpose of item analysis is to determine which items should be retained, which revised, and which thrown out. In conducting a thorough item analysis, the test developer might make use of the following tools:

Item Difficulty Index

The item difficulty for a single test item is defined as the proportion of examinees in a large tryout sample who get that item correct. For any individual item i, the index of item difficulty is p_i, which varies from 0 to 1.
An item with a difficulty of .2 is more difficult than an item with a difficulty of .7, because fewer examinees answered it correctly. The item difficulty index is a useful tool for identifying items that should be altered or discarded. Item difficulties that hover around .5, ranging between .3 and .7, maximize the information the test provides about differences between examinees.
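As a rough illustration, the item difficulty index can be computed directly from a table of scored responses. The sketch below assumes each examinee's answers are stored as a row of 1s (correct) and 0s (incorrect); the function name and the tiny data set are made up for the example.

```python
def item_difficulty(responses):
    """Return p_i, the proportion of examinees answering each item correctly.

    `responses` holds one row per examinee; each entry is 1 (correct)
    or 0 (incorrect). This layout is an assumption of the sketch.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_examinees
            for i in range(n_items)]


# Hypothetical tryout data: 5 examinees, 3 items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 0],
]
p = item_difficulty(responses)
print(p)  # [0.8, 0.6, 0.2]
```

Here the third item (p = .2) is the hardest and the first (p = .8) the easiest, so both fall outside the .3 to .7 band and would be candidates for review.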

For items that can be answered correctly by guessing, the optimal level of item difficulty can be computed from the formula (1 + g)/2, where g is the chance success level. For example, for a four-option multiple-choice item, the chance success level is 1/4 = .25, and the optimal level of item difficulty would be (1 + .25)/2, or about .63.
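The arithmetic is easy to check in a few lines. This sketch assumes that every option of a multiple-choice item is equally likely to be picked by a guesser, so that g = 1 / (number of options); the function name is illustrative.

```python
def optimal_difficulty(n_options):
    """Optimal item difficulty (1 + g) / 2 for an item with n_options choices.

    Assumes blind guessing among equally attractive options,
    so the chance success level is g = 1 / n_options.
    """
    g = 1 / n_options
    return (1 + g) / 2


print(optimal_difficulty(4))  # 0.625 (four-option multiple choice)
print(optimal_difficulty(2))  # 0.75  (true/false)
```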

Item Reliability Index

The potential value of a dichotomously scored test item depends jointly on its internal consistency, as indexed by its correlation with the total score, and on its variability, as indexed by its standard deviation.

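If we compute the product of these two indices, we obtain the item reliability index:
Item reliability index = r_iT × s_i
where r_iT is the point-biserial correlation between item i and the total score, and s_i is the standard deviation of item i.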

A test developer may desire an instrument with a high level of internal consistency, in which the items are reasonably homogeneous. The higher the point-biserial correlation between an individual item and the total score, the more useful the item is from the standpoint of internal consistency.
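A minimal sketch of the item reliability index follows. It relies on two standard facts: the point-biserial correlation is the ordinary Pearson correlation computed with a 0/1 item, and the standard deviation of a 0/1 item is sqrt(p(1 - p)). The function names and data are hypothetical, and whether the total score should include the item itself is left to the reader.

```python
import math


def pearson(x, y):
    """Pearson correlation; equals the point-biserial r when x is 0/1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation undefined; treated as 0 here (an assumption)
    return cov / math.sqrt(vx * vy)


def item_reliability_index(item_scores, total_scores):
    """Product of the item-total correlation and the item standard deviation."""
    p = sum(item_scores) / len(item_scores)
    s_i = math.sqrt(p * (1 - p))  # SD of a dichotomous item
    return pearson(item_scores, total_scores) * s_i


# Hypothetical data: one item and the total scores of 5 examinees.
item = [1, 1, 1, 0, 0]
totals = [9, 8, 7, 4, 2]
idx = item_reliability_index(item, totals)
print(round(idx, 3))  # 0.46
```

Because s_i can never exceed .5 for a dichotomous item, the index is bounded by .5 in absolute value, and items that nearly everyone passes or fails score low even when their correlation with the total is high.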

Item Validity Index

The item validity index is a useful tool in the psychometrician’s quest to identify predictively useful test items. By computing the item validity index for every item in the preliminary test, the test developer can identify ineffectual items, eliminate or rewrite them, and produce a revised instrument with greater practical utility.

The item validity index consists of the product of the item's standard deviation and the point-biserial correlation between the item score and the score on the criterion variable. The higher the point-biserial correlation between the item score and the score on the criterion variable, the more useful the item is from the standpoint of predictive validity.
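The same product can be sketched for the validity index, with an external criterion in place of the total score. In the example below the criterion values (imagined as supervisor ratings), the two items, and the function name are all hypothetical.

```python
import math


def item_validity_index(item_scores, criterion_scores):
    """Item SD times the point-biserial correlation with the criterion."""
    n = len(item_scores)
    p = sum(item_scores) / n
    mean_c = sum(criterion_scores) / n
    s_i = math.sqrt(p * (1 - p))  # SD of a dichotomous item
    ss_c = sum((c - mean_c) ** 2 for c in criterion_scores)
    if s_i == 0 or ss_c == 0:
        return 0.0  # correlation undefined; treated as 0 here (an assumption)
    cov = sum((x - p) * (c - mean_c)
              for x, c in zip(item_scores, criterion_scores))
    # The sum of squared deviations of a 0/1 item equals n * p * (1 - p).
    r_ic = cov / math.sqrt(n * s_i ** 2 * ss_c)
    return s_i * r_ic


# Hypothetical criterion scores for 5 examinees.
criterion = [3.8, 3.1, 3.5, 2.0, 2.6]
item_a = [1, 1, 1, 0, 0]  # passed mostly by high-criterion examinees
item_b = [1, 0, 0, 1, 0]  # unrelated to the criterion

validity_a = item_validity_index(item_a, criterion)
validity_b = item_validity_index(item_b, criterion)
print(round(validity_a, 3))  # 0.436
print(round(validity_b, 3))  # -0.062
```

In this made-up data set, item B would be flagged as ineffectual for prediction, while item A would be kept.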

Item Characteristic Curves

An item characteristic curve (ICC) is a graphical display of the relationship between the probability of a correct response and the examinee’s position on the underlying trait measured by the test.
A separate ICC is graphed for each item, based upon a plot of the total test scores on the horizontal axis versus the proportion of examinees passing the item on the vertical axis. An ICC is actually a mathematical idealization of the relationship between the probability of a correct response and the amount of the trait possessed by test respondents. A good item has a positive ICC slope. If the ability to solve a particular item is normally distributed, the ICC will resemble a normal ogive. The normal ogive is simply the normal distribution graphed in cumulative form.
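Both ideas can be sketched briefly: an empirical ICC, tabulated as the proportion of examinees passing the item at each total score, and the normal ogive, which is the cumulative standard normal distribution. The function names and the data are illustrative assumptions; with real data the score groups would normally be larger, or neighbouring scores would be pooled.

```python
import math
from collections import defaultdict


def empirical_icc(item_scores, total_scores):
    """Proportion of examinees passing the item at each observed total score."""
    passed = defaultdict(int)
    count = defaultdict(int)
    for x, t in zip(item_scores, total_scores):
        passed[t] += x
        count[t] += 1
    return {t: passed[t] / count[t] for t in sorted(count)}


def normal_ogive(z):
    """Cumulative standard normal distribution evaluated at z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical data: 8 examinees with total scores of 1, 2 or 3.
item = [0, 0, 1, 0, 1, 1, 1, 1]
totals = [1, 1, 2, 2, 2, 3, 3, 3]
icc = empirical_icc(item, totals)
# Proportions passing: 0.0 at score 1, about 0.67 at score 2, 1.0 at score 3.
print(icc)

print(normal_ogive(0.0))  # 0.5
```

The proportions rise with the total score, which is the positive slope expected of a good item.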

Item Discrimination Index

An effective test item is one which discriminates between high scorers and low scorers on the entire test. An ideal test item is one which most of the high scorers pass and most of the low scorers fail.

An item-discrimination index is a statistical index of how efficiently an item discriminates between persons who obtain high and low scores on the entire test.
The item-discrimination index for a test item is calculated from the formula:
d = (U - L)/N
where U is the number of examinees in the upper range who answered the item correctly, L is the number of examinees in the lower range who answered the item correctly, and N is the total number of examinees in the upper or lower range.
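The formula can be sketched as follows. The text does not say how the upper and lower ranges are formed; the example assumes the top and bottom 27% of examinees by total score, a commonly used convention, and exposes that cut-off as a parameter. The function name and data are hypothetical.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Compute d = (U - L) / N for one item.

    The upper and lower ranges are the top and bottom `fraction` of
    examinees by total score. The 27% default is an assumption of
    this sketch, not something the formula itself fixes.
    """
    n_examinees = len(total_scores)
    n = max(1, round(n_examinees * fraction))  # N: size of each range
    # Examinee indices ordered from lowest to highest total score
    # (examinees with tied totals keep their original order).
    order = sorted(range(n_examinees), key=lambda k: total_scores[k])
    lower, upper = order[:n], order[-n:]
    u = sum(item_scores[k] for k in upper)  # U: correct in upper range
    l = sum(item_scores[k] for k in lower)  # L: correct in lower range
    return (u - l) / n


# Hypothetical data: 10 examinees, so each range holds 3 of them.
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d = discrimination_index(item, totals)
print(round(d, 2))  # 0.67
```

The index runs from -1 to +1. A value near +1 means the item is passed mainly by high scorers, while a value near zero or below zero marks an item that does not separate the two groups.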

Wednesday, 28 November 2012
