This week was Valentine’s Day. Love may be a great emotion for many people, but for psychologists, it is another phenomenon that we can try to isolate, measure, and study.
By chance (really, I swear this was not planned), this Valentine’s Day, I met with two other researchers at UCLA (Josie Menkin and Ted Robles) to discuss how to assess love, in particular how many distinct dimensions it has and how to accurately measure each of them.
As is common in psychology, people completed self-report questionnaires, rating various characteristics they valued in a romantic partner on a Likert scale. But how do we really know what we are measuring? After all, these are abstract constructs, and all we have are individuals' self-reported responses to some questions.
There are many ways to try to establish validity, including relating scores to expected behaviors or outcomes (e.g., perhaps we would expect people high in love to spend more time together, argue a smaller proportion of the time, and be more physically proximal to each other than friends are). We can also examine the empirical structure of the data.
In our case, we had data on many face-valid items, ones that look like they are asking about aspects of what we consider love. Using statistics, we can see how many unique dimensions are required to explain most of the variability in all the items. When you have many items, but only a few dimensions explain most of their variability, we often think of these dimensions as representing some underlying, latent (or unobserved) factor that drives responses on sets of items. Concretely, people running out of a building, a fire truck siren, and smoke could be three indicators that allow us to infer a building is on fire, even without seeing the actual fire.
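To make the latent-factor idea concrete, here is a minimal simulation. The data are invented for illustration (nothing here comes from the actual study): three "items" are each generated as a shared latent score plus independent noise, and as a result they end up correlated with one another, which is exactly the pattern a factor analysis picks up on.

```python
import random

random.seed(42)  # reproducible illustration

# One latent factor ("love") driving three observed items.
n = 500
latent = [random.gauss(0, 1) for _ in range(n)]
# Each item = latent score + its own measurement noise.
items = [[f + random.gauss(0, 0.7) for f in latent] for _ in range(3)]

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# The items correlate substantially even though they never "see" each other,
# only the shared latent factor.
r = pearson(items[0], items[1])
```

The items are like the running people, the siren, and the smoke: correlated observables that let us infer the unobserved cause.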
The first step was to split our (fortunately large) dataset into two pieces: a training dataset and a validation (or test) dataset. The purpose of this is to let us explore the data in one piece and then validate our model on unused data. This helps reduce overfitting, because data-driven changes to the model are made using only the training dataset and then validated in the test data.
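A split like this takes only a few lines. In this sketch the participant count and the 50/50 ratio are placeholders, not the actual numbers from our dataset:

```python
import random

random.seed(1)  # make the split reproducible

participant_ids = list(range(1000))  # placeholder: 1,000 hypothetical participants
random.shuffle(participant_ids)

cut = len(participant_ids) // 2  # assumed 50/50 split; the real ratio may differ
train_ids = participant_ids[:cut]
test_ids = participant_ids[cut:]
# All exploratory work (factor analysis, model tweaks) uses train_ids only;
# test_ids stay untouched until final validation.
```

The key discipline is that the test half is never looked at while the model is being shaped.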
I used exploratory factor analysis to probe the question of how many factors (dimensions) existed in the set of items people responded to. Then we looked at the factors and the particular items that contributed most to each, and selected the best empirical and theoretical indicators for each construct. Next, I used confirmatory factor analysis to see how well our model fit the (still training) data. With a good fit, it is tempting to move on and start using the factors, but there are more steps. To be confident in the factors you have created, it is paramount that you are measuring the same thing in all the groups of people you care about. For love, we might worry that there are sex differences: females and males may systematically interpret the questions differently. Likewise, younger and older people may differ. Note that I am not talking about differences in the level of love, but in the configural patterns in the data, that is, in the structure of the different factors of love.
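One common heuristic for the "how many factors?" question is to look at the eigenvalues of the items' correlation matrix, for example keeping factors whose eigenvalue exceeds 1 (the Kaiser rule). Here is a sketch of the idea; the 4-item correlation matrix is invented, not estimated from our data, and real exploratory work would use a dedicated factor analysis routine rather than hand-rolled power iteration:

```python
def top_eigenvalue(mat, iters=200):
    """Leading eigenvalue of a symmetric matrix via power iteration."""
    n = len(mat)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    av = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient

# Invented example: four items that all correlate 0.6 with each other.
R = [[1.0 if i == j else 0.6 for j in range(4)] for i in range(4)]

lam = top_eigenvalue(R)          # 1 + 3 * 0.6 = 2.8 for this matrix
variance_share = lam / len(R)    # eigenvalues of a correlation matrix sum to p
```

Here a single dimension accounts for 70% of the total variance, which is the kind of evidence that suggests one dominant factor behind a set of items.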
As a simple (made-up) example, suppose that females who say their partner's personality is very important also tend to say their partner's moral values are important, whereas males who say their partner's personality is very important tend to say their partner's moral values are not very important. For females, the two items are positively correlated, but for males the items are negatively correlated. This would tell you that there is something fundamentally different about how the items work for females and males, and that they are probably not really measuring the same thing.
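That made-up pattern is easy to check empirically: compute the correlation between the two items separately within each group and compare the signs. Here is a sketch on simulated data built to mimic the hypothetical example (none of these numbers come from the real study):

```python
import random

random.seed(7)  # reproducible simulation

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

n = 300  # hypothetical respondents per group
# Simulated females: the two ratings move together.
f_personality = [random.gauss(0, 1) for _ in range(n)]
f_values = [x + random.gauss(0, 0.8) for x in f_personality]
# Simulated males: the two ratings move in opposite directions.
m_personality = [random.gauss(0, 1) for _ in range(n)]
m_values = [-x + random.gauss(0, 0.8) for x in m_personality]

r_female = pearson(f_personality, f_values)
r_male = pearson(m_personality, m_values)
# Opposite signs flag that the item pair behaves differently across groups,
# i.e., the items are probably not measuring the same thing in both.
```

A real invariance test does this formally, by comparing the fit of factor models with parameters constrained to be equal across groups versus allowed to differ.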
So this weekend, I have been taking the factors of love we identified on Valentine's Day and testing whether they hold up for both females and males, and across age groups. Working first with the training data, I can test our theories and modify them as needed when they do not fit the data. Once we have a solid model that both makes theoretical sense and fits the data well, it can be vetted by testing it in the validation dataset.
With a theoretically sensible model that also fits the sample data and is consistent across the groups we want to compare, we can be fairly confident that we have a good measure of the different facets of love. These factor scores can be used to compare groups on their levels of various aspects of love, to predict behavior, or for anything else we might be interested in. So even though love may seem like an abstract emotion, we can break it down into its constituent components and systematically measure and study their relations.