Rating scales are widely used in psychological, sociological and epidemiological studies, and the Lights4Violence study is a clear example. Rating scales provide information on concepts that cannot be observed directly, known as constructs, such as symptoms (for example, pain, fatigue), mental and cognitive status (intelligence, depression), feelings (anger, euphoria, attitudes to dating violence), well-being and quality of life. Developing a rating scale requires systematic and structured procedures that ensure its quality through the analysis of several attributes. This process tries to answer several questions, for example: is the rating scale suitable and acceptable for the users? If we apply the rating scale to the same individuals at two separate moments, do we obtain similar results? Do the items that compose the scale cover all the relevant aspects of the construct? Are the items that compose the rating scale aimed at assessing the same construct? Are they measuring the construct in the same way?
These last two questions concern what is known in Psychometrics as internal consistency, and Cronbach’s alpha is one of the most widely used indices to analyze it. Cronbach’s alpha, first proposed by Lee Cronbach in 1951, is a function of the number of items in a scale and the average correlation among them: it compares the sum of the individual item variances with the variance of the total score. The values of Cronbach’s alpha usually range between 0 and 1, with higher values indicating higher internal consistency, although negative values can occur when the items are, on average, negatively correlated. It has been proposed that values equal to or higher than 0.70 indicate good internal consistency, that is, that the items are strongly associated and measure the same construct. However, values over 0.95 suggest redundancy: some items are not necessary for the rating scale and could be removed.
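To make the definition concrete, here is a minimal sketch of how Cronbach’s alpha can be computed from raw item scores, using the standard formula alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the Lights4Violence pilot.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 1-to-5 responses from six respondents to a four-item subscale
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

In practice this would normally be done with a statistical package; the sketch only shows what the index is built from.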
So, how are we using Cronbach’s alpha in our Lights4Violence study? From the results of the pilot study, we can take the Interpersonal Reactivity Index (IRI) as an example. We obtained the following Cronbach’s alphas in one of our pilot studies:
As we can see, Cronbach’s alpha is 0.781 for Personal distress, indicating adequate internal consistency for this subscale. However, it is 0.513 for Perspective taking, 0.496 for Fantasy, and -0.103 for Empathic concern. What does this mean? It means that some of the items of these subscales are not consistent with each other and that they are probably not measuring the same construct (perspective, or fantasy, or empathy). A negative value, as for Empathic concern, indicates that the items are, on average, negatively correlated, which typically points to a scoring problem such as reverse-worded items that were not recoded. A low Cronbach’s alpha does not necessarily mean that those items should be deleted. We first have to analyze those items individually and see what has happened: is the wording or the scoring correct? Are the items correctly translated? Are they understandable? Are they scored in a different way than the other items? Are they associated with items in other subscales? Take, for example, an individual item from one of the IRI subscales with a low Cronbach’s alpha, item 15: “If I'm sure I'm right about something, I don't waste much time listening to other people's arguments”. Is this item properly understood, or is it measuring another construct? It is a long item with several parts, so people with a similar level of perspective taking (the construct) might answer it in different ways.
When looking at the internal consistency results from the Lights4Violence pilot study, we were able to identify items that, in some countries, had translation problems or were scored in a reversed way. In other cases, such as the Empathic concern subscale, items were very long and negatively worded, which might have made them difficult for teenagers to understand and led to inconsistent answers in most countries.
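The effect of a reverse-scored item can be illustrated with a small sketch. It assumes a 1-to-5 response scale and uses made-up data in which the third item is negatively worded and has not been recoded; the helper names `cronbach_alpha` and `reverse_item` are hypothetical, and the numbers are not the pilot results.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def reverse_item(rows, index, low=1, high=5):
    """Recode one item so that high scores mean the same as on the other items."""
    return [
        [(low + high - v) if i == index else v for i, v in enumerate(r)]
        for r in rows
    ]


# Hypothetical data: the item at index 2 is negatively worded and left unrecoded
raw = [
    [4, 5, 2, 4],
    [2, 2, 3, 2],
    [5, 4, 1, 5],
    [1, 2, 5, 2],
    [3, 3, 3, 4],
    [4, 4, 1, 4],
]

alpha_raw = cronbach_alpha(raw)
alpha_recoded = cronbach_alpha(reverse_item(raw, 2))
print(f"alpha before recoding: {alpha_raw:.3f}")
print(f"alpha after recoding:  {alpha_recoded:.3f}")
```

With these toy data, alpha is negative before recoding and high afterwards, which is the pattern one would look for when a subscale shows an unexpectedly low or negative value.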
The subsequent action plan depends on the answers to these questions and on the results of the full psychometric analysis of data from all intervention sites. In case of low internal consistency, several strategies are available: correcting the translation; correcting the scoring scheme by reversing it; deleting one or two problematic items (those with low correlations with the rest of the items); using the total score instead of the subscale scores; or deleting a scale in one or all countries. All strategies have advantages and disadvantages, and the decision should be properly justified. It is important to resolve these problems at the pilot testing phase. A scale with good internal consistency carries less measurement error, which benefits the subsequent statistical analyses.
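One common way to spot problematic items is to look at each item's correlation with the sum of the remaining items, and at the alpha obtained if that item were deleted. The following is a minimal sketch under those assumptions, with hypothetical data and helper names (`pearson`, `item_diagnostics`); it is not the analysis used in the project.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def item_diagnostics(rows):
    """Return (item index, item-rest correlation, alpha if item deleted) per item."""
    results = []
    for i in range(len(rows[0])):
        item = [r[i] for r in rows]
        rest = [[v for j, v in enumerate(r) if j != i] for r in rows]
        rest_total = [sum(r) for r in rest]
        results.append((i, pearson(item, rest_total), cronbach_alpha(rest)))
    return results


# Hypothetical data: the last item is only weakly related to the other three
responses = [
    [4, 5, 4, 2],
    [2, 2, 3, 4],
    [5, 4, 5, 3],
    [1, 2, 1, 3],
    [3, 3, 3, 1],
    [4, 4, 5, 5],
]

alpha_full = cronbach_alpha(responses)
diagnostics = item_diagnostics(responses)
print(f"alpha with all items: {alpha_full:.3f}")
for index, item_rest_r, alpha_without in diagnostics:
    print(f"item {index}: item-rest r = {item_rest_r:.3f}, "
          f"alpha if deleted = {alpha_without:.3f}")
```

An item with a low item-rest correlation whose removal raises alpha is a candidate for review, but, as noted above, the wording, translation and scoring should be checked before deciding to delete it.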