An audit of bias in performance reviews at a midsized law firm found sobering differences by both race and gender. The authors identified four patterns of bias in the evaluations and recommended two simple changes for the following year: 1) Reworking the performance evaluation form to break job categories down into competencies and require that ratings be backed by at least three pieces of evidence; and 2) Developing a simple, one-hour workshop in which participants were introduced to the patterns of bias and learned how to use the new form. One year later, people of color and women got more constructive feedback, and the playing field was leveled for everyone: Whereas white men had longer, more complex evaluations in year one, in year two, both word count and language complexity were similar across all groups. While there was still room for improvement, the intervention showed that evidence-based metrics can help companies make steady progress and improve outcomes for everyone.
About two years ago, a midsize U.S. law firm reached out to the Center for WorkLife Law to learn how bias was surfacing in their performance evaluations. The firm’s D&I director had spot-checked a sample of supervisor evaluations for bias and identified several red flags. They decided they wanted to go a step further and take a data-driven approach. (Music to our ears!)
We started by conducting an audit of the firm’s performance evaluations. The vast majority seemed useful and appropriate. But when we looked closer at the data, we found sobering differences by both race and gender. Most dramatic was that only 9.5% of people of color received mentions of leadership in their performance evaluations — more than 70 percentage points lower than white women. Not surprisingly, leadership mentions typically predicted higher competency ratings the next year.
We recommended a number of interventions — what we call bias interrupters — and agreed to test their efficacy by looking at the firm’s performance evaluations the following year.
The good news? The results of the interventions were striking. We saw sharp improvement in a single year. Here’s how.
The Four Patterns of Bias That Affect Evaluations
We identified four basic patterns of racial and gender bias, documented by decades of research, in our assessment of the evaluations:
1. Prove It Again
Groups stereotyped as less competent (including women, people of color, individuals with disabilities, older employees, LGBT+ employees, and professionals from blue-collar backgrounds) have to prove themselves over and over again. In performance evaluations, this means “prove-it-again” groups tend to be judged on their performance, with their mistakes noticed more and remembered longer, while white men, the majority group, are judged on their potential.
In year one of our study, 43% of people of color and 31% of white women had at least one mistake mentioned in their evaluation, compared to 26% of white men. Studies have shown that Black attorneys are consistently subjected to higher scrutiny, and this data set was no different, with 50% of Black men’s and 50% of Black women’s evaluations mentioning at least one mistake.
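For readers who want to run a similar tally on their own evaluations, here is a minimal sketch of how a "share with at least one mention" figure can be computed. It is not the tooling used in this audit. It assumes each evaluation has already been read and hand-coded, and is stored as a record with a group label and a set of codes such as "mistake". The record layout, the function name `share_with_code`, and the toy records are all hypothetical.

```python
from collections import Counter


def share_with_code(records, code):
    """Return {group: fraction of that group's evaluations tagged with `code`}."""
    totals = Counter()
    tagged = Counter()
    for record in records:
        totals[record["group"]] += 1
        if code in record["codes"]:
            tagged[record["group"]] += 1
    return {group: tagged[group] / totals[group] for group in totals}


# Hypothetical toy records for illustration only; these are not the firm's data.
toy_records = [
    {"group": "group_a", "codes": {"mistake", "leadership"}},
    {"group": "group_a", "codes": {"leadership"}},
    {"group": "group_b", "codes": {"mistake"}},
    {"group": "group_b", "codes": {"mistake", "personality"}},
]

mistake_share = share_with_code(toy_records, "mistake")

# Gap between the two groups, expressed in percentage points.
gap_in_points = (mistake_share["group_b"] - mistake_share["group_a"]) * 100
```

Counting each evaluation once, no matter how many times a code appears in it, matches the "at least one mistake mentioned" framing used above.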
2. The Tightrope
A narrower range of workplace behavior is accepted from women and people of color. White men simply need to be authoritative and ambitious in order to succeed, but women and people of color risk being seen as overly aggressive or “difficult” if they behave the same way.
The clearest evidence of tightrope bias in our audit concerned comments about personality. We found that people of color and white women were far more likely to have their personality mentioned in their evaluations (including negative personality traits). What’s optional for white men (getting along with others) seemed to be necessary for white women and people of color. Case in point: 83% of Black men were praised for having a “good attitude” vs. 46% of white men, and 27% of white women were praised for being “friendly and warm” vs. 10% of white men.
Personality wasn’t the only type of tightrope bias we found: 50% of Black women’s evaluations included mentions of doing the “office housework” (aka the undervalued, behind-the-scenes work) compared to 16% of white women and 3% of white men. Prescriptive stereotypes create pressure for women to be modest, helpful, and nice. (Think the “office mom.”)
3. The Maternal Wall
The maternal wall reflects assumptions that mothers are no longer committed to their work, that they probably shouldn’t be, and that they are less competent. (Think “pregnancy brain.”)
One of our most shocking findings was that almost 20% of white women received comments on their performance evaluations to the effect that they did not want to make partner. We suspect that many of these women had not said so and that managers were just making assumptions about their diminishing commitment to their work after having children. Women were also more likely to receive comments about being overworked than men.
4. Racial Stereotypes
Racial stereotypes that surface in performance evaluations can be overt, such as the stereotype that Asian Americans are good at technical tasks but lack leadership ability, or more subtle, such as the assumption that people of color need to be more willing to sacrifice work-life balance than white men. In our audit, we found that one-third (33%) of people of color received comments that they were willing to travel, as compared to 13% of white men.
Two Simple Changes
To combat these biases, we worked with the firm to make two simple and inexpensive tweaks to their performance evaluation system.
First, we changed the form itself. The original form had an open-ended prompt that neither specified which competencies the organization valued nor required evidence to justify the manager’s ratings. The new form broke job categories down into competencies and asked that ratings be backed by at least three pieces of evidence. The evidence requirement is meant to combat the “halo-horns” effect, in which white men are artificially advantaged by global ratings because they get halos (where one strength is generalized into an overall high rating) whereas other groups get horns (where one mistake is generalized into an overall low rating).
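An organization that collects evaluations electronically could enforce the evidence rule at the point of entry. The sketch below illustrates that idea under stated assumptions and does not describe the firm's actual form: the `CompetencyRating` class, its field names, and the integer rating are hypothetical. Only the three-item minimum comes from the form described above.

```python
from dataclasses import dataclass, field

MIN_EVIDENCE = 3  # the new form asked for at least three pieces of evidence


@dataclass
class CompetencyRating:
    """One rated competency on a (hypothetical) electronic evaluation form."""

    competency: str
    rating: int  # the rating scale is an assumption; the source does not specify one
    evidence: list = field(default_factory=list)

    def __post_init__(self):
        # Ignore blank entries so empty strings cannot satisfy the minimum.
        self.evidence = [item.strip() for item in self.evidence if item.strip()]
        if len(self.evidence) < MIN_EVIDENCE:
            raise ValueError(
                f"'{self.competency}' needs at least {MIN_EVIDENCE} pieces of "
                f"evidence; got {len(self.evidence)}"
            )
```

Rejecting an under-documented rating when it is entered, rather than flagging it in a later review, keeps a global impression from standing in for specifics.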
Second, we worked with the firm to develop a simple, one-hour workshop that taught everyone how to use the new form. The workshop showed actual comments from the prior year’s evaluations and asked a simple question: Which of the four basic patterns of bias does this comment represent, or does it represent no bias?
How the Intervention Helped Everybody — And Why
We then examined the next round of performance reviews after the interventions. In year two, not only did people of color get more leadership mentions (100% in year two), they also got far more constructive feedback. Only 17% of the comments given to people of color contained constructive feedback in year one, as compared to 49% in year two. Constructive feedback increased for white women, too (from 10.5% to 29.5%), and for white men (from 15% to 27%). This highlights a supremely important point: Using an evidence-based performance evaluation system helps all employees. In year two, the form’s specificity also allowed for far more effective assessments of the skills and contributions the company values most.
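To compare the year-over-year gains across groups, it helps to express each one as a change in percentage points. The short sketch below does that arithmetic using only the constructive-feedback figures quoted above. The variable and function names are illustrative.

```python
# Share of comments containing constructive feedback, in percent,
# as (year one, year two). Figures are the ones quoted in the text above.
constructive_feedback = {
    "people of color": (17.0, 49.0),
    "white women": (10.5, 29.5),
    "white men": (15.0, 27.0),
}


def change_in_points(year_one, year_two):
    """Absolute change between two percentages, in percentage points."""
    return year_two - year_one


changes = {
    group: change_in_points(year_one, year_two)
    for group, (year_one, year_two) in constructive_feedback.items()
}

for group, change in changes.items():
    print(f"{group}: +{change:.1f} percentage points")
```

The gains come to 32 points for people of color, 19 for white women, and 12 for white men: every group improved, and the group that started with the most ground to make up on leadership mentions gained the most.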
The intervention leveled the playing field in other important ways, too. White men had longer, more complex evaluations in year one; in year two, both word count and language complexity were similar across all groups. Negative personality comments sharply declined in year two for people of color: 14% had a negative personality comment in year one, but 0% in year two. The organization identified “taking initiative” as a core value, mentioned it as a competency on the form, and saw a dramatic increase in the number of employees who received a comment describing a time they took initiative — the change was most dramatic for people of color (19% in year one to 94% in year two) but was large for white people as well.
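Length and complexity comparisons like these can be approximated with very simple text measures. The sketch below is one way to do it, not the method used in this audit: the source does not say how language complexity was measured, so mean words per sentence is used here as an assumed stand-in. The function names and toy evaluations are hypothetical.

```python
import re
from statistics import mean


def text_metrics(text):
    """Word count and mean words per sentence for one evaluation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    words_per_sentence = len(words) / len(sentences) if sentences else 0.0
    return {"word_count": len(words), "words_per_sentence": words_per_sentence}


def metrics_by_group(evaluations):
    """Average the per-evaluation metrics within each group."""
    grouped = {}
    for group, text in evaluations:
        grouped.setdefault(group, []).append(text_metrics(text))
    return {
        group: {
            "mean_word_count": mean(row["word_count"] for row in rows),
            "mean_words_per_sentence": mean(row["words_per_sentence"] for row in rows),
        }
        for group, rows in grouped.items()
    }


# Hypothetical toy evaluations for illustration only; these are not the firm's data.
toy_evaluations = [
    ("group_a", "Drafts clear briefs. Meets every deadline."),
    ("group_b", "Good work."),
]

summary = metrics_by_group(toy_evaluations)
```

A large gap between groups on either measure is a prompt to read the underlying evaluations, not a finding on its own.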
White men are unfairly advantaged by global ratings provided without backup, which are petri dishes for bias. By shifting to specific skills and competencies, white men lost their unearned advantage over people of color in promotion recommendations, too.
Not One and Done: An Iterative Process
A single year of a single intervention will not transform a company culture. Think about it: If your company had a problem with sales, you would not expect to nail it in a single year. You would change one thing and see how it worked, then change another thing, then another until you achieved your sales goal. Companies need to use the same iterative approach with DEI.
The evaluations in year two suggest that this company still has a “women are wonderful” problem. Women had higher ratings on many different items, including being referred to as a value or an asset to the company, but this didn’t seem to translate into the opportunities that lead to promotion. White women were still far more likely than other groups to have comments in their evaluations saying they need additional opportunities (51% vs. 33%) and that they deserve promotions (37% vs. 22%).
People of color also still had “prove it again” problems. The new form asked evaluators to list the employee’s two or three top competencies. Only 33% of people of color had efficacy and effectiveness (a key value for the organization) listed, compared to 80% of white women and 63% of white men. People of color’s mistakes were also reported at higher levels than white people’s (78% vs. 43%).
Finally, the firm still has an office housework problem. White men were much less likely than people of color or white women to have mentions of less important, administrative tasks.
At this firm, as at any organization, solving DEI challenges will be a multi-year process that will require changing long-standing performance management practices. But one thing’s for sure: Working with evidence and using metrics will help companies make steady progress year after year and improve outcomes for everyone. This is the only road to sustainable change.
At this moment when many organizations are reckoning with their roles and responsibility to ensure racial equity, the good news is that a data-driven approach can deliver rapid concrete gains.