Performance Ratings, pt 4: New Ways Forward


Welcome back to our fourth and final episode of a long and rocky odyssey through the world of performance management. The short version of what we have established so far goes something like this: Scrapping the annual performance review is quite uncontroversial in light of the research. We know that is not how to motivate and enable improved performance – that is better achieved through close contact with the supervisor, useful feedback, challenging tasks, and so on. However, we are still stuck with the question of how to evaluate performance. This happens to be an incredibly difficult task – but if we want to work systematically with quality improvement, and aim for fair and unbiased decisions in, e.g., pay and promotions, we need to take it on. So today, after this long journey, I will finally try to answer the natural follow-up question: How are we supposed to do it?

I dare say that the utopia of completely objective performance judgments is out, or at least it should be. It is simply vain, and even dangerous, to believe that we can ever find a way to rate other people’s performance that is completely “accurate” in some universal sense. Let us draw on Waters et al. (2016) and call that a fantasy of the industrial era. Still, if we are to make these judgments and let people’s careers and salaries depend on them, we absolutely cannot settle for completely subjective statements either. So where do we go from here? A number of developments have arisen in recent years, and many of them seem promising. I base this overview both on current research in management and organizational psychology, and on accounts from foresighted practitioners whom I meet as part of my research.

Multiple raters. Usually called 360 ratings: using ratings not only from your supervisor but also from colleagues, subordinates, clients, etc. The logic is that all of the stakeholders you interact with are unlikely to hold the same biases or political interests about you (Pulakos et al., 2015). For example, if your supervisor constantly underrates your performance, your colleagues, subordinates, and customers will likely give you higher ratings, which will even out the adverse impact. The evidence is scattered, however: it is not certain that multiple raters substantially improve rating accuracy – sometimes, more raters seem to add marginal effects at best (e.g., Howard, 2016). Still, 360 ratings are increasingly being used both to get a broader picture and to decrease the risk of bias. We should keep in mind, however, that stereotypes based on, e.g., gender, background, or age could very well break through in 360 ratings as well.

Supervisor training. In research, opinions diverge as to the effectiveness of training managers to rate performance. Some state that these initiatives do not reliably improve rating quality (e.g., Adler et al., 2016). However, there is reason to believe that this is an effect of benchmarking against an unattainable goal (i.e., complete “accuracy”). As some scholars are now starting to argue, the goal can never be complete objectivity – instead, it should be the more humble state of intersubjectivity. That is, we should train raters to share a common way of thinking when making ratings, even if that way of thinking does not reflect an objective “truth” about performance. If we can attain that, there is good hope that rating accuracy, in this more realistic sense, can improve (Gorman & Rentsch, 2009; Schleicher & Day, 1998).

Self-rating. An idea increasingly heard in the start-up world and among avant-garde HR people: Why not let people rate their own 1) performance and 2) access to the help they needed? Personally, I believe it is a practice we will be seeing more of. It is probably not the best idea, however, if you are tying ratings to pay.

Evaluating only on goal fulfillment. Common in the tech world: you throw out all performance criteria that demand subjective judgment, and only ask one question: Did you deliver on your measurable goals or not? This strategy can be one way of decreasing subjectivity, as long as the goals are pre-determined and clearly measurable. As noted by Hunt (2016), however, it may invite attaining your goals in ways that do not align with company ethics or values. Further, it is badly suited for positions where the end goals are actually unclear or develop along the way.

Rating on behaviors. This is another development, sprung from behavioral psychology and organizational behavior management (OBM): Instead of trying to judge fluffy criteria like “team player” or “strategic thinking”, the rater only looks at observable behaviors such as “takes time to help colleagues with problems” or “identifies upcoming obstacles in planning meetings”. The advantage of this approach is that observable behaviors are less ambiguous than more general evaluative judgments (Lievens, 2001). The high level of specificity is also its problem, however – the criteria can easily get too specific and rigid, which makes it difficult to account for the fact that different people can demonstrate the same quality through different kinds of behaviors.

Skipping the compensation link. Some companies have come to the conclusion that if you ever want accurate, truthful ratings that can be used for systematic organizational development, you should stop tying them to pay. Some are switching to standard pay levels for different types of positions, usually with reference to the “startup philosophy” mentioned earlier in this series: We only hire really good people, so there is no reason why people in the same type of role should not get the same pay. Controversial? To some, absolutely. Probably less so in Sweden than in the US. Regardless, research supports the notion that lower stakes lead to more honest performance ratings. In addition, a survey study in progress (Ledford et al., 2016) has shown that organizations that drop the annual rating do not face increased reward costs.

Hopefully, this brief review has provided some hope for the future: New and promising solutions for better performance evaluations are emerging quickly. From the ashes of the old annual performance review, chances are we will see a new paradigm arise – one that holds humbler hopes of completely objective ratings, but still uses research and creativity to find new, high-quality ways of evaluating performance.



Performance Ratings, pt 3: Why the Really Daunting Task Is Still Looming


We continue our odyssey through the complex topic of performance management – a practice that is changing dramatically at the moment. The annual performance review, where employees’ performance is rated according to complex criteria and scales once a year, is increasingly being abandoned. To many people’s joy, one should add: This practice is disliked by most stakeholders and has been accused of not adding any substantial value. So, that’s it? Ding dong, the witch is dead? Not really.

As noted last time, performance management serves at least three purposes in organizations: Motivating performance improvement; enabling analysis of performance patterns and trends; and serving as a basis for decisions about, e.g., promotions, layoffs, talent nominations, and compensation. We also noted that the first function is actually the one where research can provide the most clear-cut answer with regard to ratings: In order to develop and motivate employees, ratings generally add little value. When it comes to the second and third purposes of performance management, however, the issue becomes a lot more complicated.

Let’s say you are an executive, and want to identify those managers in your organization who consistently succeed at growing high-performing individuals. Or you want to investigate whether there is a pattern in where low performers are located in the organization. Or, for that matter, you want to tie some aspect of compensation to performance. How are you going to do this? First, you need to be able to compare people to each other. That means you will need some kind of systematization and documentation of performance. And as soon as you start logging people’s performance in any standardized way, be it as a number or a qualitative judgment, you are indeed doing a performance evaluation. Bottom line: Even if you throw out the annual performance review and remove ratings from coaching sessions with employees, you will arguably still need some method for evaluating people’s performance.

In light of this, it should come as no surprise that the witch is not really dead. As pointed out by several scholars (Hunt, 2016; Ledford et al., 2016), most companies that claim to have gotten rid of performance ratings really refer only to the annual performance review. Most continue to rate their employees as part of, e.g., their talent review, their compensation process, or their leadership audits. And then we are back at what is really the key issue here: Evaluating someone’s work performance is an incredibly difficult task. As noted by Adler et al. (2016), sports judges spend their entire careers specializing in this – in contrast to managers, who are supposed to handle performance judgments as a “side task”. And yet, sports judges often disagree with each other…

Unsurprisingly, substantial research has shown that supervisor ratings of employee performance are pretty far from accurate or consistent (Levy & Williams, 2004; Murphy & Cleveland, 1995). A number of things besides performance tend to go into these ratings: personal liking, politics, various cognitive biases, stereotypes, and chance. Unfortunately, there is no real reason to believe that ratings become any more accurate just because they are conducted in another format than the annual performance review.

What I am trying to say here is this: In the general excitement about scrapping the annual performance review, there is a risk that organizations rush past the issue of how to actually evaluate employees’ performance. Chances are, then, that we just move this daunting task from one place to another (e.g., from the annual review to the leadership audit) without having improved our methods. Instead, why not take this time of change as a perfect opportunity to actually improve performance evaluation in a broader sense? As so often before, the most forward-thinking practitioners are already ahead of research on this matter. In the next blog post we will look closer at some of the concrete strategies now being used to try to make performance evaluations more fair, accurate, and fit for our knowledge-intensive, post-industrial era.



Performance Ratings, pt 2: The Paradox of Dual Purposes


As noted last time, one of the most striking trends in HR right now is the change going on in performance management. An increasing number of companies are throwing out their annual performance reviews, and often also their complex criteria and matrices for rating employees’ performance. An old paradigm thus seems to be on its way out. Why is this happening, and what broader underlying challenges are driving this change? Today, I thought we would dwell on one of the inherent difficulties of performance management: combining development with evaluation.

First of all, we need to acknowledge that performance management serves at least three broad purposes in organizations today. First, to enable and motivate employees to improve their performance. Second, to feed systematic analysis and business intelligence, e.g., finding patterns in performance differences. And third, to inform decisions about promotions, talent nominations, layoffs, and – not least – compensation and benefits. This multi-purpose nature of performance management is absolutely central if we want to understand what is happening now – and what an up-to-date performance management system might look like.

So far, the discussion has mostly focused on the first purpose. And if that were the only one, the issue of throwing out performance ratings would indeed be pretty clear-cut: There is little evidence to suggest that ratings should be a central part of motivating or enabling employees to improve their performance (DeNisi & Smith, 2014). On the contrary, there is quite a lot of research pointing to the demotivating effects of ratings (Aguinis et al., 2011; Culbertson et al., 2013), or at least indicating that ratings only motivate a minority of employees (usually – surprise! – the top-rated ones). This should come as no surprise. We have long known that motivation at work is fueled by frequent feedback, challenging goals combined with the right resources to achieve them, and close contact with a supportive supervisor. A label or number put on your performance once a year has scant chance of affecting your everyday behavior and engagement at work.

Furthermore, we know that evaluation and human growth tend to be like oil and water: They are virtually impossible to combine in the same process (a fact noted as early as 1965 by Meyer, Kay, and French). Performance management as it has been carried out to date thus carries an inherent paradox: In one and the same process, supervisors are supposed to help the employee develop and grow, while at the same time giving an evaluative judgment of his or her performance over the past year. There is also a lot of research showing that the use of numbers or categories in that process actually aggravates this paradox (e.g., Murphy, Cleveland, & Lim, 2007, in Langan-Fox, Cooper, & Klimoski, eds.). By introducing a rating scale, you forcefully direct the employee’s attention to the rating itself rather than to the qualitative feedback and discussion that go with it. No matter how much you emphasize that the process is forward-looking and developmental, the employee will tend to focus mainly on the rating.

Thus, you might say that many of the problems of performance management stem from its apparent Janus face: It includes both a developmental and a judgmental focus. One clear practical implication can be drawn from this: If you want to enable performance improvement among employees, try removing any talk of ratings and formal judgments from that conversation. Coaching or development sessions between supervisor and employee are best held with little focus on evaluation. In other words, the first of the above purposes of performance management is often best fulfilled when separated from the second and third. From that perspective, the scrapping of the performance review is a promising development.

As is evident above, we actually know quite a lot about how to motivate employees at work. We definitely know enough to say that performance ratings seldom serve that purpose. But – and this is an important but – when it comes to the second and third purposes of performance management, the issue of ratings becomes a lot more complicated. The reason is that they relate to one of the most difficult issues organizational psychology has to offer: How do you fairly evaluate another person’s performance? This daunting task does not go away just by getting rid of the annual performance review – and we will dig into it in depth in the next blog post.