Welcome back to our fourth and final episode of a long and rocky odyssey through the world of performance management. The short version of what we have established so far goes something like this: Scrapping the annual performance review is quite uncontroversial in light of the research. We know that is not how to motivate and enable improved performance – that is better done through close contact with the supervisor, useful feedback, challenging tasks, and so on. However, we are still stuck with the question of how to evaluate performance. That happens to be an incredibly difficult task – but if we want to work systematically with quality improvement, and aim for fair and unbiased decisions on e.g. pay and promotions, we need to take it on. So today, after this long journey, I will finally get to the natural follow-up question: How are we supposed to do it?
I dare say that the utopia of completely objective performance judgments is out, or at least it should be. It is simply vain, and even dangerous, to believe that we can ever find a way to rate other people’s performance that is completely ”accurate” in some universal sense. Let us draw on Waters et al. (2016) and call that a fantasy of the industrial era. Still, if we are to make these judgments and let people’s careers and salaries depend on them, we absolutely cannot settle for completely subjective statements either. So where do we go from here? A number of developments have emerged in recent years, and many of them seem promising. I base the following overview both on current research in management and organizational psychology, and on accounts from forward-thinking practitioners whom I meet as part of my research.
Multiple raters. Usually called 360 ratings: using ratings not only from your supervisor but also from colleagues, subordinates, clients, etc. The logic is that it is unlikely that all of the stakeholders you interact with hold the same biases or political interests about you (Pulakos et al., 2015). If, for example, your supervisor is constantly underrating your performance, your colleagues, subordinates, and customers will likely give you a higher rating, which evens out the adverse impact. The evidence is scattered, however. It is not certain that multiple raters substantially improve rating accuracy – sometimes, adding raters seems to have marginal effects at best (e.g. Howard, 2016). Still, 360 ratings are increasingly used to get a broader picture and to decrease the risk of bias. We should keep in mind, however, that stereotypes based on e.g. gender, background, or age could very well break through in 360 ratings as well.
Supervisor training. In research, opinions diverge as to the effectiveness of training managers in rating performance. Some state that these initiatives do not reliably improve rating quality (e.g. Adler et al., 2016). However, there is reason to believe that this is an effect of benchmarking against an unattainable goal (i.e., complete ”accuracy”). As some scholars are now starting to argue, the goal can never be complete objectivity – instead, it should be the more humble state of intersubjectivity. That is, we should train raters to share a common way of thinking when making the ratings, even if that way of thinking does not reflect an objective ”truth” about performance. If we can attain that, there is good hope that rating accuracy, in this more realistic sense, can improve (Gorman & Rentsch, 2009; Schleicher & Day, 1998).
Self-rating. An idea increasingly heard in the start-up world and among avant-garde HR people: Why not let people rate 1) their own performance and 2) their access to the help they needed? Personally, I believe it is a practice we will be seeing more of. However, it is probably not the best idea if the ratings are tied to pay.
Evaluating only on goal fulfillment. Common in the tech world. That is, you throw out all performance criteria that demand subjective judgment and only ask one question: Did you deliver on your measurable goals or not? This strategy could be one way of decreasing subjectivity, as long as the goals are pre-determined and clearly measurable. As noted by Hunt (2016), however, it might open the door to attaining your goals in ways that clash with company ethics or values. Further, it is poorly suited for positions where the end goals are actually unclear or evolve along the way.
Rating on behaviors. Another development, sprung from behavioral psychology and organizational behavior management (OBM): Instead of trying to judge fluffy criteria like ”team player” or ”strategic thinking”, the rater only looks at observable behaviors such as ”takes time to help colleagues with problems” or ”identifies upcoming obstacles in planning meetings”. The advantage of this approach is that observable behaviors are less ambiguous than more general evaluative judgments (Lievens, 2001). The high level of specificity is also its weakness, however – behavior lists can easily become too specific and rigid, making it difficult to account for the fact that different people can demonstrate the same quality through different kinds of behaviors.
Skipping the compensation link. Some companies have concluded that if you want accurate, truthful ratings that can be used for systematic organizational development, you should stop tying them to pay. Some are switching to standard pay levels for different types of positions, usually with reference to the ”startup philosophy” mentioned earlier in this series: We only hire really good people, so there is no reason why everyone in the same type of role should not get the same pay. Controversial? To some, absolutely. Probably less so in Sweden than in the US. Regardless, research supports the notion that lower stakes lead to more honest performance ratings. In addition, an ongoing survey study (Ledford et al., 2016) indicates that organizations that drop the annual rating do not face increasing reward costs.
Hopefully, this brief review provided some hope for the future: New and promising approaches to performance evaluation are emerging quickly. From the ashes of the old annual performance review, chances are we will see a new paradigm arise – one that holds humbler hopes of completely objective ratings, but still uses research and creativity to find new, high-quality ways of evaluating performance.