Measurement

Monday, December 13, 2010

Analysis

Research hypotheses

Hypothesis on convergent validity:

1) There will be a positive relationship between self-worth and connection (family and school dimensions).

Hypothesis on divergent validity:

2) There will be a negative relationship between self-worth and loneliness.

Hypothesis on self-worth and gender:

3) Adolescent boys will show a higher level of perceived self-worth than adolescent girls.

Last week, we were in the lab running the data. For our group (self-worth), we got a good reliability result, .80. We also got good results for both convergent and divergent validity.
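To make the reliability and validity checks more concrete, here is a minimal Python sketch. It is not the procedure we ran in the lab. It assumes the reliability figure is an internal-consistency coefficient such as Cronbach's alpha, and that convergent and divergent validity are checked with Pearson correlations. The function names (`cronbach_alpha`, `pearson_r`) and all scores are hypothetical, made up only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of scores per item)."""
    k = len(items)
    # Total scale score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical responses: 3 self-worth items answered by 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print("alpha:", round(cronbach_alpha(items), 2))

# Hypothetical composite scores for the same 5 respondents.
self_worth = [2.0, 3.5, 3.0, 4.5, 5.0]
connection = [2.5, 3.0, 3.5, 4.0, 4.5]
loneliness = [4.0, 3.5, 3.0, 2.5, 1.5]
print("self-worth vs connection:", round(pearson_r(self_worth, connection), 2))
print("self-worth vs loneliness:", round(pearson_r(self_worth, loneliness), 2))
```

With these made-up numbers, a positive correlation with connection and a negative correlation with loneliness would be in the direction of hypotheses 1 and 2.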

Tuesday, December 7, 2010

Warm up

Phase 3: What have I learned?

This semester is almost over. I have been recalling what I have learned in this class, and I went back to check our first class's discussion questions, listed below:

1) What are my learning objectives for this class? I guess our group final project, and teamwork as part of the learning process. Building on the positive youth development research being carried out by Dr. Lerner and colleagues at Tufts University’s Institute for Applied Research in Youth Development (e.g., Bowers et al., 2010; Lerner et al., 2005, 2010), we have been carrying out an observational study that examines the relationships among dimensions of positive youth development, weight, healthy eating habits, career aspirations, and community resources. Online surveys were used to collect data for this study. The 5-C measure of positive youth development (Lerner et al., 2005) was used to measure positive youth development. Our group focuses on the self-worth part of the study. The class developed measures for youths’ demographics, health status/healthy eating habits, career aspirations, and community resources. I think this course examines a number of approaches to data collection in social work research, such as surveys, scales, and observational techniques. We have been working on

2) What is measurement?

3) What role does measurement play in the research process? According to Barker et al. (2002), measurement (Chapters 4 to 7) comes after the research questions have been formulated: the next step is to decide how to measure the psychological constructs of interest. They use the term “measurement” in its broadest sense, to encompass qualitative as well as quantitative approaches to data collection. The topics are: quantitative measurement; psychometric theory (reliability; validity; generalizability theory; item response theory; standards for reliability and validity); qualitative methods; self-report methods (open-ended and closed-ended questions; quantitative self-report methods); and observation (qualitative observation; reliability and validity issues).

4) What are the distinctions between these two interpretations of the measurement process (see Boumans, 2005)?

Ellis - Associative interpretation
Correlation interpretation - Heidelberger (1994a, 1994b)

I have learned about the development and evaluation of the psychometric properties of quantitative social science measurement tools. The course addressed theories of measurement (True Score Theory and Item Response Theory), scale development, and item and scale analysis using advanced statistical procedures (e.g., factor analysis and structural equation modeling). a) I have learned how measurement error impacts the quality of a research study. b) I have learned how to match measurement strategies to the research design in order to reduce measurement error. Building on these two points, Dr. Farmer described the use of various approaches to measurement, including standardized scales, behavioral counts and ratings, and individualized rating scales. c) I have learned about classical test theory within the context of social work research, including the major assumptions, strengths, and weaknesses of classical measurement theory. I posted a lot of information about Item Response Theory (IRT) in the previous sections here. Furthermore, we now have a better understanding of reliability and validity within the context of classical measurement theory and IRT. As for developing and validating a measure, I guess my group did not engage in the process of developing and validating a measure. The last step is to learn how to evaluate the dimensionality of a measure. For the past few weeks, Dr. Farmer spent much time describing EFA/CFA, about which I also posted some information here.

Monday, November 22, 2010

Scale Dimensionality (Sessions 7 and 8)

Phase 3: What have I learned?

Student Question 9: What actions have I taken?

We are busy writing up the literature review and introduction for our final paper. But it seems we also need to deal with different personalities in the group. The three of us have different opinions, and I believe we should choose one person to be our group leader so that we won't have too many arguments and conflicts. But everyone is busy and we do not have much time to sit down with each other. I feel sad and frustrated. Most of the time we communicate with each other via email. One of my goals is to get our final paper done with everyone in the group on the same page.

Student Question 10: What barriers have been removed?

Gretta and I are focusing on the introduction and literature review. We both try to make time and send our own parts to each other so that we can merge our sections together. I try to listen to her opinions and thoughts regarding the paper. At the same time, Areen made a lot of timelines for us. Obviously, we did not really follow the schedule, because many of her timelines were unrealistic. But I have tried to discuss this with her.

Student Question 11: What has changed about what I don’t know?

Gretta gave me some suggestions regarding my introduction. At the same time, Areen wants us to follow the outline on the paper.

Student Question 12: Do I know what I want to know?

Areen seems to want to focus on writing up Chapter 3, methodology. Our study will use youth development data collected from a Web-based self-administered questionnaire. We will run the data in SPSS. We will have to compute a composite subscale score for self-worth.

Traditional statistical methods normally utilize one statistical test to determine the significance of the analysis. However, Structural Equation Modeling (SEM), CFA specifically, relies on several statistical tests to determine the adequacy of model fit to the data. The chi-square test indicates the amount of difference between expected and observed covariance matrices. A chi-square value close to zero indicates little difference between the expected and observed covariance matrices. In addition, the probability level must be greater than 0.05 when chi-square is close to zero.
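To see why a chi-square close to zero means the expected and observed covariance matrices are close, here is a small Python sketch. It assumes the maximum likelihood discrepancy function, F = ln|Sigma| - ln|S| + trace(S * Sigma^-1) - p, with chi-square = (N - 1) * F. That is one common convention, and some software multiplies by N instead. The helper names and the matrices are hypothetical, and the code only handles 2 x 2 matrices to stay short.

```python
from math import log


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """trace(A * B) for two 2 x 2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def ml_chi_square(observed, implied, n):
    """Chi-square fit statistic under the assumed ML discrepancy function."""
    p = 2  # number of observed variables
    f_ml = (log(det2(implied)) - log(det2(observed))
            + trace_of_product(observed, inv2(implied)) - p)
    return (n - 1) * f_ml


# Hypothetical observed covariance matrix for two items.
observed = [[1.0, 0.5], [0.5, 1.0]]

# A model that reproduces the observed matrix: chi-square is zero (up to rounding).
print(ml_chi_square(observed, observed, n=101))

# A model that wrongly implies the two items are uncorrelated: chi-square grows.
uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
print(ml_chi_square(observed, uncorrelated, n=101))
```

The probability level would then come from the chi-square distribution with the model's degrees of freedom, which this sketch does not compute.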

Tuesday, November 16, 2010

Survey development

Developing a survey for this class is always a learning experience.
I had developed a survey through SurveyMonkey before, while I was studying for my MSW at Rutgers. I developed the survey for my field placement. My supervisor gave me some directions on how to create a survey that is easily understandable to participants. First of all, we should give them an introduction explaining the purpose of the study, including confidentiality and the right to withdraw at any time. Then, in the directions, give the reader information on how to interpret the scale used in our questionnaire. Be careful to keep the questions simple and in the reader's language. Try to estimate how much time it will take to finish the survey. Do not expect that the reader will have much time or pay close attention.

Tuesday, October 19, 2010

Scale Dimensionality (Sessions 7 and 8)

Validity-
How accurately does the scale measure the concept it says it measures?
How much systematic error do I have?

Face Validity
1) on the face of it, does it seem to measure what I say it does.
2) assessed by asking individuals in the field to review items.

Content Validity
1) A scale or measure has content validity when all aspects of the concept have been covered. These are frequently referred to as domains.

Criterion-Related Validity
1) researcher compares scores on the measure under development with some external criterion known to or believed to measure the same concept.
2) researcher creating measure determines criterion. The closer the criterion is to the measure in concept the better.
3) concurrent validity: criterion is present simultaneously with the measure you are developing.
Predictive validity: criterion is in the future.
4) construct validity: has the unobserved construct underlying the measure being developed been measured accurately?
5) one traditional way of assessing construct validity is to look at a series of studies using the measure being developed. How well do the findings reflect the theory underlying the measure?
6) a statement of validity is based on the way a measure relates to other variables within a system of theoretical relationships.

Confirmatory Factor Analysis
1) another way is through confirmatory factor analysis in which one hypothesizes that the construct is made up of several domains and particular items belong to one particular domain.
2) one can then test the hypotheses and the model statistically.

Thursday, October 14, 2010

Literature review of self-worth

Currently, we have a group project.
Our group is focusing on adolescents' self-worth. Our group has been looking at a variety of literature related to adolescents' self-worth. A study conducted by Quarterly et al. (2006) examined adolescents' perceptions of social support in relationships with mothers, close friends, and romantic partners, and their contributions to individual adolescent self-worth and interpersonal competence.

Less is known about links between social support and adolescent well-being. Global self-worth is one measure of well-being: contemporary conceptualizations of self-esteem emphasize a distinctive array of perceived competencies in a variety of domains. Adolescents queried about different domains of interpersonal competence indicated that support from parents is associated with global self-worth, that support from friends is associated with perceived friendship competence and social acceptance, and that support from romantic partners is associated with perceived romantic competence (Connolly & Konarski, 1994).

Global self-worth (M α = .84) provides an assessment of overall self-esteem (e.g., "Some teenagers are disappointed with themselves BUT other teenagers are pretty pleased with themselves"). Social acceptance (M α = .84) provides an assessment of competence in the peer group (e.g., "Some teens are popular with others their age BUT other teens are not very popular"). Friendship competence (M α = .76) provides an assessment of capabilities in friendships (e.g., "Some teens are able to make really close friends BUT other teens find it hard to make really close friends"). Romantic competence (M α = .74) provides an assessment of capabilities in romantic relationships (e.g., "Some teens feel that people their age will be romantically attracted to them BUT other teens feel worry about whether people their age will be attracted to them").

In a study conducted by Sargent et al. (2006), the relationship between contingencies of self-worth and vulnerability to depressive symptoms was investigated in a longitudinal sample of 629 freshmen over the first semester of college. Higher levels of external contingencies of self-worth, in a composite measure of four external contingencies of self-worth (approval from others, appearance, competition, academics), predicted increases in depressive symptoms over the first semester of college, even controlling for initial level of depressive symptoms, social desirability, gender, and race. Internal contingencies of self-worth (God’s love, virtue) were not associated with the level of depressive symptoms. The authors conclude that external contingencies of self-worth may contribute to vulnerability to depressive symptoms.

Another study, conducted by Sanchez & Crocker (2005), examined the relationship between investment in gender ideals and well-being and the role of external contingencies of self-worth in a longitudinal survey of 677 college freshmen. The study proposed a model of how investment in gender ideals affects external contingencies and the consequences for self-esteem, depression, and symptoms of disordered eating. The study found that the negative relationship between investment in gender ideals and well-being is mediated through externally contingent self-worth. The model showed a good fit for the overall sample. Comparative model testing revealed a good fit for men and women as well as White Americans, Asian Americans, and African Americans.

Another study examined the effects of receiving negative interpersonal feedback on state self-esteem, affect, and goal pursuit as a function of trait self-esteem and contingencies of self-worth. Two same-sex participants interacted with each other and then received negative feedback. Participants then reported their state self-esteem, affect, and self-presentation goals (how they wanted to be perceived by others at the moment). Among participants who received negative feedback, those who more strongly based their self-worth on others’ approval experienced lower state self-esteem, lower positive affect, and greater negative affect than those whose self-worth was less contingent on others’ approval. Participants with low self-esteem showed greater desire to appear physically attractive to others the more they based self-worth on others’ approval and received negative feedback. In contrast, participants with high self-esteem showed greater desire to appear warm/caring/kind the more they based self-worth on others’ approval and received negative feedback.

In the literature on contingencies of self-worth, William James (1890) argued over a century ago that people derive self-esteem from succeeding in certain domains and not others. According to the contingencies of self-worth model (Crocker & Wolfe, 2001), people differ in their bases of self-esteem, which are shaped by their beliefs about what they think they need to be or do to be a person of worth. Crocker and colleagues (2003b) identified seven domains from which people may derive their self-worth: virtue, God’s love, family support, academic competence, physical attractiveness, competition, and gaining others’ approval. The more a person bases self-worth in a domain, the more he or she may be vulnerable to experiencing negative effects of self-threat in that domain. For example, research has shown that the more students base their self-worth on academics, the more likely they are to experience lower state self-esteem and greater negative affect and self-evaluative thoughts when they perform poorly on academic tasks, receive lower-than-expected grades, or are rejected from graduate schools.

Tuesday, October 5, 2010

Item Response Theory (IRT)

Limitations of Classical Test Theory

Examinee characteristics cannot be separated from test characteristics
1) The discrimination or difficulty of an item is sample dependent.
2) It does not allow you to predict how an examinee, given an ability level, is likely to respond to a particular item.
3) Only three sources of error can be estimated: A. error due to the lack of internal consistency of the items (coefficient alpha); B. error due to instability of a measure over repeated observations (test-retest reliability); C. error due to the lack of equivalence among parallel measures (correlation between parallel forms).
4) Comparison of individuals is limited to those situations when the same test was given to the individuals you want to compare. CTT also makes the false assumption that error variance is the same across all subjects (ex: that there is no relationship between your true score and error variance).

IRT allows for the development of items that are free from test and examinee biases.
IRT models are mathematical equations describing the association between a respondent's underlying level on a latent trait or ability and the probability of a particular item response (correct response) using a nonlinear monotonic function.
Most IRT modeling is done with unidimensional models.

IRT Theory

One can consider each examinee to have a numerical value, a score, that places him or her somewhere on the ability scale. 1) At each ability level, there will be a certain probability that an examinee with that ability will give a correct answer to the item. 2) This probability will be small for examinees of low ability and larger for examinees of high ability.

Item Characteristic Curve (ICC)
1) If one plotted the probability of getting a question correct as a function of ability, the result would be a smooth S-shaped curve.
2) Each item has its own ICC.
3) The item characteristic curve is the basic building block of item response theory; all the other constructs of the theory depend upon this curve.

IRT
1) These measurement models use responses to items on a test or survey questionnaire to simultaneously scale respondents and items on the same latent continuum (or latent space in the case of multidimensional IRT).
2) This enables one to measure individuals on the latent trait defined by the set of items (ex: ability, attitude, craving, satisfaction, quality of life, etc.) while simultaneously scaling each item on the very same dimension (ex: easy versus hard items in the case of an ability test, unfavorable versus favorable statements in the case of an attitude questionnaire).

Two families: unidimensional and multidimensional
1) unidimensional: unidimensional models require a single trait (ability) dimension.
2) multidimensional: multidimensional IRT models response data hypothesized to arise from multiple traits.

Binary vs. polytomous items
IRT models can also be categorized based on the number of scored responses.
Dichotomous: presence/absence, correct/incorrect
Polytomous outcomes: where each response has a different score value (Likert scaling)

Item difficulty and discrimination
There are two technical properties of an item characteristic curve that are used to describe it.
1) item difficulty: the difficulty of an item describes where the item functions along the ability scale (ex: an easy item functions among the low-ability examinees and a hard item functions among the high-ability examinees).
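2) item discrimination: the discrimination of an item describes how well the item can differentiate between examinees with abilities below the item location and those with abilities above it; it reflects the steepness of the item characteristic curve in its middle section.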

Number of IRT parameters:
IRT generally refers to three probabilistic measurement models:
1) 1-parameter logistic model (Rasch model): latent trait + item difficulty. Item difficulty is defined as the logit point at which the probability of answering the item correctly is 50%; guessing is irrelevant, and all items are equivalent in terms of discrimination.
2) 2-parameter logistic model (latent trait + item difficulty + item discrimination)
3) 3-parameter logistic model for dichotomous and polytomous responses. Latent trait + item difficulty + item discrimination + guessing parameter (this takes into consideration guessing by candidates at the lower end of the ability continuum).
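The three models can be sketched as one function for a dichotomous (correct/incorrect) item, with the 1- and 2-parameter models as special cases of the 3-parameter one. This is a minimal Python illustration under the standard logistic form, not output from any IRT package. The function name `irt_probability` and the parameter values are hypothetical.

```python
from math import exp


def irt_probability(theta, difficulty, discrimination=1.0, guessing=0.0):
    """Probability of a correct response under the 3-parameter logistic model.

    theta          -- examinee's level on the latent trait
    difficulty     -- item difficulty (location on the ability scale)
    discrimination -- item discrimination; leave at 1.0 for the 1-parameter model
    guessing       -- lower asymptote; leave at 0.0 for the 1- and 2-parameter models
    """
    logistic = 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing) * logistic


# Trace the item characteristic curve of one hypothetical item under each model.
for theta in (-3, -2, -1, 0, 1, 2, 3):
    one_pl = irt_probability(theta, difficulty=0.0)
    two_pl = irt_probability(theta, difficulty=0.0, discrimination=2.0)
    three_pl = irt_probability(theta, difficulty=0.0, discrimination=2.0, guessing=0.2)
    print(theta, round(one_pl, 3), round(two_pl, 3), round(three_pl, 3))
```

When ability equals item difficulty, the 1- and 2-parameter models give a probability of 50%, which matches the definition of item difficulty above. With a guessing parameter, the curve never drops below that parameter, even for very low ability.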

IRT Assumptions
1) examinee characteristics can be separated from test characteristics: the ease or difficulty of an item is sample independent; it allows you to predict how an examinee, given an ability level, is likely to respond to a particular item.
2) unidimensionality: only one ability or latent trait is measured
3) local independence
4) assuming a large pool of items, each measuring the same latent trait
5) assuming the existence of a large population of examinees, the descriptors of a test item are independent of the sample of examinees drawn for the purpose of item calibration
6) a statistic indicating the precision with which each examinee's ability is estimated is provided
7) person-free and item-free measurement.

IRT Item Selection/Test construction
1) describe the shape of the desired test information over the desired range of abilities (the target information function).
2) select items with item information functions that will fill up the hard-to-fill areas under the target information function.
3) after each item is added to the test, calculate the test information function for the selected test items.
4) continue selecting items until the test information function approximates the target information function to a satisfactory degree.
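The four steps above can be sketched in Python as a simple greedy loop. This is only an illustration under stated assumptions: items follow the 2-parameter logistic model, where item information is discrimination squared times P times (1 - P), the ability range is represented by a few points, and "hard-to-fill areas" is read as the remaining shortfall below the target. The function names, the item pool, and the target values are all hypothetical.

```python
from math import exp


def item_information(theta, difficulty, discrimination):
    """Item information at theta under the 2-parameter logistic model."""
    p = 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))
    return discrimination ** 2 * p * (1.0 - p)


def select_items(pool, thetas, target, max_items):
    """Greedily pick (difficulty, discrimination) items to approximate the target."""
    selected = []
    test_info = [0.0] * len(thetas)  # test information = sum of item information
    remaining = list(pool)
    while remaining and len(selected) < max_items:
        # Step 2: find where the test still falls short of the target.
        shortfall = [max(t - info, 0.0) for t, info in zip(target, test_info)]
        if sum(shortfall) == 0.0:
            break  # Step 4: the target is met at every ability point.

        def gain(item):
            b, a = item
            return sum(min(item_information(th, b, a), s)
                       for th, s in zip(thetas, shortfall))

        best = max(remaining, key=gain)
        remaining.remove(best)
        selected.append(best)
        # Step 3: recompute the test information after adding the item.
        test_info = [info + item_information(th, *best)
                     for th, info in zip(thetas, test_info)]
    return selected, test_info


# Step 1: hypothetical target information at three ability points.
thetas = [-1.0, 0.0, 1.0]
target = [0.6, 0.9, 0.6]
pool = [(-1.0, 1.0), (0.0, 1.5), (1.0, 1.0), (0.0, 0.5), (0.5, 1.2), (-0.5, 1.2)]

chosen, info = select_items(pool, thetas, target, max_items=4)
print("selected items:", chosen)
print("test information:", [round(value, 3) for value in info])
```

The test information also relates to the precision statistic mentioned in the assumptions: the standard error of an ability estimate is usually taken as one over the square root of the test information at that ability.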