Language pervades every aspect of our daily lives. From the books we read to the TV shows we watch to the conversations we strike up on the bus home, we rely on words to communicate and share information about the world around us. We use language not only to share simple facts and pleasantries but also to communicate social stereotypes, that is, the associations between groups (for example, men/women) and their traits or attributes (such as competence/incompetence). As a result, studying patterns of language can provide the key to unlocking how social stereotypes become shared, widespread, and pervasive in society.

But the task of looking at stereotypes in language is not as straightforward as it might initially seem. Especially today, it is rare that we would hear or read an obviously and explicitly biased statement about a social group. And yet, even seemingly innocuous phrases such as “get mommy from the kitchen” or “daddy is late at work” connote stereotypes about the roles and traits that we expect of social groups. Thus, if we dig a little deeper into the relatively hidden patterns of language, we can uncover the ways that our culture may still represent groups in biased ways.

Using Computer Science to Uncover Hidden Biases

Recent advances in computer science methods (specifically, the area of Natural Language Processing) have shown the promise of word embeddings as a potential tool to uncover hidden biases in language. Briefly, the idea behind word embeddings is that word meaning can be represented as a “cloud” in which every word is placed according to its meaning. We place a given word (let’s say “kitchen”) in that cloud by looking at the words it co-occurs with in similar contexts (in this case, perhaps “cook,” “pantry,” “mommy,” and so on). If we have millions to billions of words to analyze, we eventually arrive at an accurate picture of word meaning in which words that are close in meaning (like “kitchen” and “pantry”) are placed close together in the cloud. Once we’ve achieved that, we can answer even more detailed questions, such as whether “mommy” is placed closer in meaning to “kitchen” or to “work.”
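The core operation can be sketched in a few lines of code. This is a toy illustration only: real embeddings are learned from millions of words and have hundreds of dimensions, whereas the vectors below are hand-set hypothetical values chosen purely to show how the “closer to kitchen or to work?” comparison works.

```python
import math

# Hypothetical 3-dimensional vectors (illustrative only; real embeddings
# are learned from large text corpora, not written by hand).
vectors = {
    "kitchen": [0.9, 0.1, 0.0],
    "pantry":  [0.8, 0.2, 0.1],
    "work":    [0.1, 0.9, 0.2],
    "mommy":   [0.7, 0.3, 0.1],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 means very similar direction (meaning),
    near 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Is "mommy" placed closer in meaning to "kitchen" or to "work"
# in this toy space?
to_kitchen = cosine(vectors["mommy"], vectors["kitchen"])
to_work = cosine(vectors["mommy"], vectors["work"])
```

In this hand-built example `to_kitchen` exceeds `to_work`, which is exactly the kind of relative-similarity question the embeddings in the study are used to answer, just at a vastly larger scale.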

Using these and other tools, my colleagues and I saw the potential to provide some of the first systematic insights into a long-standing question of the social sciences: just how widespread are gender stereotypes really? Are these stereotypes truly “collective” in the sense of being present across all types of language, from conversations to books to TV shows and movies? Are stereotypes “collective” in pervading not only adults’ language but also sneaking into the very early language environments of children? Although evidence for such biases has long been documented by scholars, our computer science tools allowed us to quantify the biases at a larger scale than ever before.

To study stereotype pervasiveness, we first created word embeddings from texts across seven different sources that were produced for adults or children, including classic books (from the early 1900s), everyday conversations between parents and children or between two adults (recorded around the 1990s), and contemporary TV and movie transcripts (from the 2000s), ultimately totaling over 65 million words. Next, we examined the consistency and strength of gender stereotypes across these seven very different sources of language. In our first study, we tested a small set of four gender stereotypes that have been well-studied in previous work and thus might reasonably be expected to emerge in our data. These were the stereotypes associating:

  • men-work/women-home
  • men-science/women-arts
  • men-math/women-reading
  • men-bad/women-good
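A common way to quantify such group-attribute associations in an embedding is to compare a target word’s average similarity to a list of male words against its average similarity to a list of female words. The sketch below is a simplified version of that idea, not the study’s exact procedure; the two-dimensional vectors and two-word lists are hypothetical stand-ins for learned embeddings and the longer word lists used in practice.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-D embeddings; real analyses use learned vectors
# and larger gender word lists.
emb = {
    "he":    [1.0, 0.1],  # male attribute words
    "man":   [0.9, 0.2],
    "she":   [0.1, 1.0],  # female attribute words
    "woman": [0.2, 0.9],
    "work":  [0.8, 0.3],  # target words being tested for bias
    "home":  [0.3, 0.8],
}

MALE_WORDS = ["he", "man"]
FEMALE_WORDS = ["she", "woman"]

def gender_association(word):
    """Mean similarity to male words minus mean similarity to female words.
    Positive scores mean the word sits closer to male words in the space."""
    male = sum(cosine(emb[word], emb[m]) for m in MALE_WORDS) / len(MALE_WORDS)
    female = sum(cosine(emb[word], emb[f]) for f in FEMALE_WORDS) / len(FEMALE_WORDS)
    return male - female

work_score = gender_association("work")   # positive in this toy space
home_score = gender_association("home")   # negative in this toy space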

Stereotypes Really Are Everywhere in Our Language

Even though our seven kinds of texts differed in many ways, we found pervasive evidence for the presence of gender stereotypes. All four gender stereotypes were strong and significant. Moreover, there were no notable differences across child versus adult language, across domains of stereotypes, or even across older texts versus newer texts. To us, this consistency was especially remarkable in showing that even speech produced by children (as young as 3 years old!) and speech from parents to those young children revealed the presence of gender stereotypes that had not previously been documented at such a large scale at such young ages.

Having shown pervasiveness for these four well-studied stereotype topics, we next turned to gender stereotypes for more than 600 traits and 300 occupation labels. Here, we found that 76% of traits and 79% of occupations revealed meaningful associations with one gender over another, although not all were large in magnitude. Gender stereotypes of occupations were stronger in older texts than in newer texts, and gender stereotypes of traits were stronger in adult texts than in child texts. And yet, we also saw continued evidence of consistency. For instance, across most of our seven different kinds of texts, the occupations “nurse,” “maid,” and “teacher” were stereotyped as female, while “pilot,” “guard,” and “excavator” were stereotyped as male.

By bringing together both the unprecedented availability of massive amounts of archived naturalistic texts and the rapid advances in computer science algorithms to systematically analyze those texts, we have provided strong evidence that gender stereotypes are indeed truly “collective” representations. Stereotypes are widely expressed across different language formats, age groups, and time periods. More than any individual finding, however, this work stands as a signal of the vast possibilities that lie ahead for using language to uncover the ways that biases are widely embedded in our social world.

For Further Reading

Charlesworth, T. E. S., Yang, V., Mann, T. C., Kurdi, B., & Banaji, M. R. (2021). Gender stereotypes in natural language: Word embeddings show robust consistency across child and adult language corpora of more than 65 million words. Psychological Science, 32(2), 218–240.

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186.

Tessa Charlesworth is a Postdoctoral Research Fellow in the Department of Psychology at Harvard University where she studies the patterns of long-term change in social cognition.