To Stop Cheating, Nuclear Officers Ditch the Grades

hliyan · on July 29, 2014

In other words, you get what you measure. This is one of the reasons I discourage measuring team performance in terms of bug counts or lines-of-code. The former leads to what some call "issue-tennis" and informal fixes, and the latter leads to code bloat. After years of trying, I've finally concluded that "key performance indicators" attached to individuals or teams (as opposed to artifacts and periods) usually lead to more trouble than they're worth.

616c · on July 29, 2014

So I was in language courses in Cairo for a while. I was the only language major; all the other students were students from my university in a specialized diplomacy and government service track. Unlike me, they had to pass language proficiency exams, in a language of their choosing, or they did not graduate until they did.

So, I had a great teacher. He did not care about grades at all. He would give bad grades, and when kids complained, he would give them good ones if they wanted to retest. He would play games with students regarding this. Now the students who wanted good marks to count towards proficiency were really pissed. He put it quite simply: "I'll give you all the A's you want; but when you go back you will still not speak the language. I do not really care and that is not the focus of my classes. I am here to teach and discuss the language and material with you. That is what interests me." I was one of the few that enjoyed that course, and I was admittedly the best student. I continued to be a strong student in all courses, even those I failed (like economics and computer science) because I was far more interested in the material than in grades.

In the end, I graduated with a high GPA. I see this as a cautionary tale. Now if only I could finally learn C and submit kernel patches like I promised myself in college. But such is life.

protonfish · on July 29, 2014

I graded like this when I taught community college and most students hated it. What surprised me is that it was not enough for a student to get an "A" - they wanted better grades than the other students, like it was a competition.

TeMPOraL · on July 29, 2014

Because it always is. First there are parents, who want their kids to be above average / top of the class. Kids' mindsets then turn to competition, as students themselves start to evaluate each other by their grades. And somewhere in the middle of this the system starts throwing in various scholarships and other perks for a fixed number of best students. No wonder people treat it like competition, because it is one.

616c · on July 29, 2014

Because kids are sad sad sad. Haha. I also work in an educational institution doing IT. This, as you get from my anecdote above, was not my educational background.

It is funny: people laugh when they hear me bust out into Arabic in front of them. I tell them I am a language major by trade, and did IT in college to make money. Many people congratulate me on mastering these "magical bodies of knowledge that are witchcraft, because surely I (the person I am helping) could never get it." If you want scary, these are other people married to Arab men and women, as am I. Decades of exposure mean nothing.

Now why is that? Grades breed mediocrity. Good students stroke their ego, and most students give up when they cannot compete with good students. So we encourage people to believe "I am a magician that speaks computer" and "Arabic is impossible" (if you believe that, try teaching and explaining inconsistent English grammar; Arabic has encoded in the morphology a lot of syntax that would scare you into seeing what looks like a human programming language) and accept mediocrity because evaluation and rank are more important than effort. I see it so often, even from students, that I mess with them when they come to my office. I never bitched to change my grades. Why? I was busy reading. Who cares!

Please keep being that professor, I hope you save some of them. The most cautionary part of that tale is that the people in my story all end up working for the government, some as counter-terrorism and intel guys. They cannot string a sentence together in Arabic, despite the A, and dictate USG policy "on dem terrorist A-rabs", so this culture will get us real far.

watwut · on July 29, 2014

While the relative grade does not matter in some majors, it may matter in others. It can influence the internships or scholarships accessible to you.

It matters a lot when a student wants to study medicine. Law firms look at grades when selecting graduates. If you want to go to grad school, your grades matter a lot too.

So, in a sense, grades are real competition.

WalterBright · on July 29, 2014

I learned that 30 years ago. It seems to be one of those things that every generation needs to learn the hard way.

A reason why an effective team needs a few older guys who've been around the block a few times :-)

lobotryas · on July 29, 2014

True. Unfortunately you then get a new VP who still believes in KPIs (or needs to demonstrate KPIs to his boss because that's the only thing upper management "gets") and then you're back to square one :(

jib · on July 29, 2014

It is fine to have KPIs. When KPIs go out of control it is a sign of culture issues, or whatever you want to call it.

KPIs are just that. Indicators. Temperature is an indicator of good climate. If someone's answer to "it is cold in here" is to put a lighter under the thermometer then the problem isn't the thermometer, right? It is the people who are confusing indicators with goals. And that is a culture thing.

If that is your climate though you can still recover from it (I know we are doing way better than before at least:)).

First steps are to embrace ambiguity and uncertainty in evaluations. The culture twist (when I have seen it) comes from "No one recognises my work" -> "Let's define KPIs for your work" -> "I am not evaluated fairly, look at how much work I do compared to guys that are evaluated better" -> "Let's define guidelines for evaluations based on our KPIs" -> "Everyone (except me) is cheating to score high on the KPIs to get better evaluations" -> "We need better KPIs that can't be cheated on" -> "We are doing a whole amount of stupid stuff because people worry about KPIs".

I think the first two steps are fine. There is direct business value in being able to compare how you are doing compared to the previous month, or some other department or whatever.

The problem comes in the third step - once the KPIs determine how someone is evaluated, then you are in a downwards spiral, because now you are saying that how you get there isn't as important as getting there. To get out of that you need to embrace the ambiguity that comes with letting evaluations happen, and maybe be "unfair".

Basically - uncertainty in evaluations is better than defining evaluations on KPIs. Solve the uncertainty with more discussions and reasoning, not with more rules.

superuser2 · on July 30, 2014

> If someone's answer to "it is cold in here" is to put a lighter under the thermometer then the problem isn't the thermometer, right? It is the people who are confusing indicators with goals. And that is a culture thing.

But you're not using indicators to do some kind of science, you're trying to use them to make decisions about who gets fired and who gets a raise.

If you're a self-interested human being, you're going to do whatever you can to make sure the indicator points to "give me a raise" not "throw my ass on the street." If the indicator is not a perfect measure of job performance, then you're going to let job performance slide to maximize the indicator.

jib · on July 30, 2014

I think we agree.

That's the thing - the indicators when used right are used for science. Namely improvement of self or others. It gets harmful when they are solely used to determine if you should get a raise or get fired.

They can certainly be used to indicate problem areas, e.g. "This guy is producing 50% less according to RANDOMKPI; does that make sense or do we need to fix something?" But if they are used like "X is at 80% of RANDOMKPI, he will be fired if he is not at 90%" then they are harmful, obviously, because they don't take into account the much more important things like "X is at 80% of RANDOMKPI because he is unofficially the teacher of how to deal with anything in our payment systems, so he spends half his time helping out others, which adds way more value to us than if he was just implementing features" or whatever.

Once KPIs start to drive evaluations, you're down a slippery slope.

lobotryas · on July 29, 2014

Thank you for your response. Unfortunately for me, I have zero influence on this decision process (in our case, the rabid hunger for "Leading Indicators"). I get shut down or over-ruled whenever I bring up how some level of ambiguity is unavoidable and it makes little sense to treat software engineering work like a factory process.

Having to suck it up and deal right now. I appreciate you reminding me that KPIs are just a tool. :)

allochthon · on July 29, 2014

In my experience it's been a ready flow of money -- people with funding (e.g. VCs, or upper management at a large firm) are able to advance an agenda that includes KPIs, and then you're back to square one. Somehow they're shielded from the learning at the grassroots. By this means learning processes are disrupted.

616c · on July 29, 2014

I agree. I abhor political talk on HN, but this rang true for me when I heard US troop operations in Iraq compared to those in Vietnam. Think of kill counts, popular in Vietnam, the targeted assassinations of militant leaders in Iraq (made famous by the playing cards on which people were ranked), and general metrics based on opposition death counts. I reflected on this and wondered how so many officers who survived Vietnam, and saw what defining performance metrics the wrong way does, could come back and allow the same thing, or even encourage it, later as officers in Iraq. It is a recurring horror story in large-org management. It scares me that these mistakes are repeated, from war to situations with far less consequence like software development. I am sure there are good psych articles for more details on such issues, and I have always been interested in why such problems recur.

mathattack · on July 29, 2014

Here's the obligatory Dilbert -> http://dilbert.com/strips/comic/1995-11-13/

This sounds like a cheating problem, not a measuring problem. The same thing happens when people try to measure schools.

My sense is you can't give up on measuring things, because you can't scale an organization without metrics. You just have to balance them with understanding what is really going on. Or rather you need to have the metrics tell us "What's going on with the business" rather than "What's going on with the individual".

Perdition · on July 29, 2014

The problem isn't measurement, it is tying rewards to achieving certain measurements. People will do what you reward them for doing, even if that isn't what you actually wanted them to do.

The real issue is that identifying the correct things to measure and reward is really, really hard.

I read once about a factory that was trying to improve output so they started rewarding shifts of workers for reaching production goals. But to achieve those goals workers started cutting corners and quality went down. So the company installed a device to measure the quality of the widgets, which worked for a while but then quality went down again. The workers had figured out they could just keep measuring the same in-spec widget over and over again to make the machine happy. Next the company altered the quality measurement machine so that it would issue an error if it did two identical measurements in a row. Again quality went up for a while until the workers figured out they could just rotate a couple of good parts through the machine to fool it.
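A minimal sketch of the last two rounds of that story, in Python. The spec limits, the readings, and the name `make_checker` are all made up for illustration; nothing here describes the real machine. It only shows why a "no two identical measurements in a row" rule stops re-measuring one good part but not rotating two of them.

```python
def make_checker(spec_low, spec_high):
    """Build a hypothetical quality check that also rejects a repeated reading."""
    last = None  # the previous reading seen by this checker

    def check(measurement):
        nonlocal last
        repeated = last is not None and measurement == last
        last = measurement
        if repeated:
            return "error"  # identical to the previous measurement
        return "pass" if spec_low <= measurement <= spec_high else "fail"

    return check


# Workers measure the same in-spec widget over and over: caught after the first.
checker = make_checker(9.9, 10.1)
same_part = [checker(10.0) for _ in range(3)]

# Workers rotate two different in-spec widgets: every reading passes.
checker = make_checker(9.9, 10.1)
rotated = [checker(reading) for reading in [10.0, 10.05, 10.0, 10.05]]

print(same_part)  # ['pass', 'error', 'error']
print(rotated)    # ['pass', 'pass', 'pass', 'pass']
```

Each patch to the check only blocks the exact trick it was written for, which is the point of the anecdote.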

hibikir · on July 29, 2014

I have a good book on the subject: Measuring and Managing Performance in Organizations.

The TL;DR version is that any job worth doing has some hard to measure components, so the best you will get using measures as incentives is to lag in the hard to measure parts. So the recommendation is not to stop measuring, but to make individual measurements invisible to management, and be just tools that people can use to improve their own performance. If your team is doing far worse than average in a measurement that you all find important, you will work on it anyway.

TeMPOraL · on July 29, 2014

> The same thing happens when people try to measure schools.

I think you have this a bit backwards - the cheating is actually the symptom of people trying to measure stuff in schools and then make those measures directly affect social status and future success of students. Cheating exists because the system doesn't reward you for learning the material, but for passing a test.

mathattack · on July 29, 2014

I was thinking more of the cases where the teachers and principals change the students' test scores to pass state exams.

Here is one example of many -> http://www.cnn.com/2013/04/02/justice/georgia-cheating-scand...

golemotron · on July 29, 2014

You can't give up on measuring things, but you don't have to let people know what you are measuring or make the measure material to them. That's where the trouble starts.

mathattack · on July 29, 2014

The challenge is then you lose accountability, and the ability for people to work on what they are weak at.

golemotron · on July 29, 2014

There's a good argument that we shouldn't attempt to work on our weaknesses but that we should work on our strengths instead. I don't have a link but I've seen reference to this in research about performance reviews.

mathattack · on July 31, 2014

I've seen similar. I kind of view it as "A couple As and a couple Cs are better than all Bs." The one caveat is that you don't want any Fs. For example, to be a great programmer, you don't need to be a great English writer, but if you can't communicate at all, your work will suffer. There's a minimum level after which it's better to focus on programming. If you're a great accountant, you don't need to be a master of spreadsheets, but you need to be good enough to get the job done.

sp332 · on July 29, 2014

How do you know if the soldiers you have assigned to a project are any good without testing them?

golemotron · on July 29, 2014

Test the project.

Spooky23 · on July 29, 2014

Measurement wasn't the problem; the interpretation of the measurement was.

They would apply the scores to the metrics used to determine suitability for promotion, probably because there isn't much else you can use to measure the performance of a guy who sits in a hole all day.

sitkack · on July 29, 2014

What metrics does an ant hill use?

jessaustin · on July 29, 2014

I believe the current consensus is that KPIs are communicated via pheromones. Both the location and density of pheromones are meaningful to the decision-making process of an individual ant. The large-scale behavior of the colony is a result of the totality of those decisions.

You make an unfair comparison, however. Numerous factors that constrain ant behavior (brain size, genetics, sterility, energy requirements, etc.) simply do not obtain for an organization of unrelated human beings.

Ntrails · on July 29, 2014

If the KPIs were, say, something to be minimised - does this still hold? For example, I cannot see a way to game "Lower system downtime is better", or "lower load time is better". Or am I fundamentally misunderstanding what a KPI is?

TeMPOraL · on July 29, 2014

Sure you can.

- lower system downtime = don't do maintenance or do it in a half-assed way

- decrease system load time = make everything possible "optional post-load components", thus reducing load time by factor of, say, 10, and increasing time to combat readiness by a factor of 5.

People can pervert any metric they're measured by. Just look at what companies do to squeeze that last little dollar of profit out of their customers.
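A rough sketch of the load-time example, in Python. Every number below is hypothetical, including the assumed penalty for loading components on demand; the only point is that the reported metric and the thing you actually care about can move in opposite directions.

```python
# All values are made up for illustration; none come from a real system.
CORE_SECONDS = 1.0                   # work done before "loaded" is reported
COMPONENT_SECONDS = [3.0, 3.0, 3.0]  # components needed before the system is usable
LAZY_PENALTY = 5.0                   # assumed slowdown when a component loads on demand


def honest_startup():
    """Load everything up front: reported load time equals time to readiness."""
    load_time = CORE_SECONDS + sum(COMPONENT_SECONDS)
    return load_time, load_time


def gamed_startup():
    """Report 'loaded' after the core only; components load later and slower."""
    load_time = CORE_SECONDS
    ready_time = CORE_SECONDS + LAZY_PENALTY * sum(COMPONENT_SECONDS)
    return load_time, ready_time


print(honest_startup())  # (10.0, 10.0)
print(gamed_startup())   # (1.0, 46.0)
```

With these assumed numbers the measured load time drops tenfold while the time until the system is actually usable gets several times worse.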

jacquesm · on July 29, 2014

One company I recently worked with had 0 scheduled system down time. They also had about 500 man-years worth of technical debt and had to jump through increasingly complex hoops in order to maintain the 0 down time policy.

Short term that looks pretty good, though longer term it is a guaranteed-to-fail strategy. When and if it catches up with you, you're in deep trouble.

markgraydk · on July 29, 2014

In that case you should have more than just one KPI to combat any perverse incentives. Further, any decision making based on the KPIs should take into account thorough analysis.

It's far from perfect but making things quantifiable does have a lot of benefits.

watwut · on July 29, 2014

The trouble is, the decisions based on KPIs do not take these things into account. They are usually made higher up in the hierarchy by people entirely detached from what is done day to day. They usually do not know the harm done and wonder where the company is losing money.

Your combination of multiple KPIs amounts to a somewhat more complicated KPI. There will be perverse incentives unless you get it perfectly right. And that is possible only if the work, including the long-term consequences of decisions, is easily quantifiable - which is not the case for most non-trivial jobs.

markgraydk · on July 29, 2014

I agree that KPIs are misused quite a lot but that doesn't change the fact that they can provide valuable information and set up incentives to follow certain goals.

I think it is rather simplistic to say that having more than one KPI can be reduced to "a more complicated KPI". Sure, it won't solve all problems with badly set up KPIs but to allow for detailed incentive structures you need detailed KPIs.

My major point is still that decision making based on KPIs should not be made solely on what the figures (or any red/yellow/green flag next to it) show. It requires analysis - which is something that I've always seen done.

The good thing about KPIs is that they can be a great communication tool and open discussions about why a trend has changed or similar things.

TeMPOraL · on July 29, 2014

> I agree that KPIs are misused quite a lot but that doesn't change the fact that they can provide valuable information and set up incentives to follow certain goals.

The problem is that KPIs are misused like, 99.9% of the time, starting from elementary school. It's rare to see someone using such scores as (one of many) evaluation input and not an output. It's good that you put effort into doing real analysis; sadly, not many people do that.

I think that KPIs, or grades, let you as a manager/evaluator be lazy. You don't have to think, to investigate why the scores are bad, to talk with poor performers. Probably you'll be even discouraged from doing so by your superiors / the whole system - you are not allowed to completely disregard the scores at your discretion.

markgraydk · on July 29, 2014

Sadly, you are probably right, at least that the majority of cases misuse KPIs. But don't throw the baby out with the bathwater. KPIs reveal patterns, force your attention towards certain issues and create a base for discussion. They can also pigeonhole you into misunderstanding what is happening though.

In my experience, KPIs work better as a tool for communication and as input for analysis rather than for optimization of goals. In a previous job I worked with aggregating the Balanced Scorecard for a 10,000 employee corporation. On such an aggregate level where the KPIs are composites of many underlying decisions there is less of a risk of perverse incentives. Instead, they are used for comparing trends across time and similar purposes.

KPIs targeted at individuals or smaller teams are very different than that example though.

tomjen3 · on July 29, 2014

If I was punished for downtime I wouldn't have upgraded OpenSSL after heartbleed.

I wouldn't put new features online.

In general I would do little that might cause the system to be unstable, even if it was actually a good thing.

calinet6 · on July 29, 2014

Absolutely. In attempting to solve the problem of the influence of testing, they've also taken a huge step toward improving quality.

This applies to all of us.

http://blog.deming.org/2013/02/the-idea-of-performance-ratin...

melling · on July 29, 2014

Counting lines of code went out of style 30 years ago.

http://www.folklore.org/StoryView.py?story=Negative_2000_Lin...

dan_bk · on July 29, 2014

John Oliver (Last Week Tonight) has a good round-up: http://www.youtube.com/watch?v=1Y1ya-yF35g

thisjepisje · on July 29, 2014

Classical example of Goodhart's law:

http://en.wikipedia.org/wiki/Goodhart%27s_law

teddyh · on July 29, 2014

I would think that Campbell’s law is a slightly better fit:

https://en.wikipedia.org/wiki/Campbell%27s_law

solarexplorer · on July 29, 2014

And a good example is probably Google ranking...

EGreg · on July 29, 2014

"As a team, they need to make the right decisions, but as individuals they're not required to be perfect."

So why not let them cheat like they do? That's teamwork.

Better yet why not have a computer program remember all these codes instead?

albemuth · on July 29, 2014

> Inside a full mock-up of a nuclear launch control center, Andrew Beckner and Patrick Romenafski practice the launch of nuclear weapons with the turn of a key

"Hey people, this isn't pointing to production, is it?"

PythonicAlpha · on July 29, 2014

Sometimes people can be like sub-atomic particles. You can't measure them without changing them.

Any form of measurement will fall back on those who want to make the measurement. You can even ruin company culture by having the wrong measurement. I saw it happen, when people started to cheat on colleagues to get better "performance".

mkesper · on July 29, 2014

The world could sleep better if those fatal missiles worldwide had a delay of 24 hours, at least.

jedberg · on July 29, 2014

So War Games was right then... We really do need to replace the officers with computers.

arethuza · on July 29, 2014

The Soviets did have an automated system that could launch their missiles automatically:

http://en.wikipedia.org/wiki/Dead_Hand_%28nuclear_war%29

It may still be operational, but not normally turned on - for obvious reasons....

At the other end of the spectrum the UK has no automated control preventing unauthorized launches (no Permissive Action Locks), and we check whether Radio 4 is still broadcasting to detect whether civilisation has ended, and use handwritten letters to instruct the captains of our Trident subs what to do in that event.

soneil · on July 29, 2014

(off-topic, but I find it fascinating)

The UK system sounds back-asswards, but it strikes me as a surprisingly sane set of checks.

Man vs Machine has always been a Royal Navy doctrine. The US teaches nuclear engineers how to drive a boat. The UK teaches sailors how to run a nuke. Similarly, if the boat has to maintain independent firing capability (the whole point of the deterrent being in submarines is to avoid command&control decapitation), then any access control is a formality. If control ultimately falls to the crew, then don't pretend it doesn't - focus on the men.

Radio 4 sounds anachronistic, but it's a UK-based, govt-controlled transmitter, independent from military systems, which broadcasts every second of the day during routine operation - and because it's still cranked out on longwave (198? kHz), can be heard across our waters & much of the North Atlantic (the stomping grounds of our deterrent).

While civilian maritime broadcasts would sound like a more sensible failsafe, most of them are either broadcast or controlled from Northwood, so they're not independent of military systems, and are routinely silent between scheduled broadcasts.

What I do find interesting is the actual point of checking for civilian broadcasts - given the loss or failure of command&control communications, the result isn't a railroad to Armageddon - they're expected to implement some common sense. Compare this to the scenes in Crimson Tide, where Gene Hackman prefers blind obedience to the Protocol ..

And "hand-written" is a red herring. The actual detail here is that only the PM ever knows what went in that letter once it's sealed. They don't have to be hand-written, and probably aren't anymore - but traditionally this was a side effect of not dictating to a secretary/typist.

dpierce9 · on July 29, 2014

The Air Force was also seriously against permissive action locks. When forced to add them, the generals made the launch codes the same on every warhead, all zeros.

Command and control is very hard: being able to always and immediately launch a nuclear attack but never subject to an unauthorized launch presents a variety of challenges. PALs help with the latter but not the former. Maybe because the Brits have fewer warheads (and therefore the ability to exercise more control) or needed to respond faster (being closer to the USSR and other nuclear powers), they made a strategic choice to skip PALs. Eric Schlosser's book Command and Control is worth a read if you are interested in the history of the US nuclear arsenal.

Perdition · on July 29, 2014

In reality that system was not mechanically/electronically automated, it just enabled a team of officers in a bunker to give a general launch order without contact with the leadership. And the launch orders would still need to be manually carried out by the crews of strategic missile assets.

It was a hedge against a decapitation strike and the chaos that would result in the Soviet military due to its officers' habitual deference to central authority.

serf · on July 29, 2014

The American Project Pluto/SLAM system was meant to be autonomous and self-launching too. The Western MAD system.

Boy am I glad that project never got off the ground. A MAD-gap/race would have been terrifying.

A plane that can circle the Earth 4 and a half times is pretty cool, but at the cost of irradiating the whole planet? Maybe not so much.

superuser2 · on July 30, 2014

... how far into that movie did you stop watching? That policy didn't work out very well.

ck2 · on July 29, 2014

We're pretty much guaranteed to have a nuclear accident at this point.

If ditching the grades is the answer to cheating, I say ditch nuclear weapons to solve nuclear problems.

chiph · on July 29, 2014

You go first.

ck2 · on July 29, 2014

Half the world doesn't have nuclear weapons, they seem to be doing just fine http://wikipedia.org/wiki/Nuclear-weapon-free_zone

sp332 · on July 29, 2014

Ukraine gave up their nukes on condition that their borders be respected, and look what happened.

gwern · on July 29, 2014

You can't use nukes on some separatists and rebels, so I don't think that has much of anything to do with it. Pakistan isn't exactly in great shape despite having nukes.