No more marking: Tinder for grading essays

No More Marking.

Has there ever been a more enticing name for a piece of education software?

The No More Marking comparative judgement software has been championed by Daisy Christodoulou. Her writing on education myths and assessment is well known.

Her critiques of the adverb soup that we create when we develop rubrics and criteria sheets, or use prose descriptors to define levels, are entertaining. She suggests that comparative judgement may provide an alternative.

Daisy now works for No More Marking. Recently I’ve used No More Marking with students and with staff for the first time. The results were intriguing.

No More Marking software allows you to rank students' work using a comparative judgement algorithm. We have been debating the relative merits of criteria sheets and rubrics for generating consistent teacher judgement. Consistent teacher judgement is a Holy Grail for many schools. Generating consistent and reliable marking of tasks like essays and assignments is not easy.
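
For readers curious how a set of "this one is better" decisions can become a ranking, here is a minimal sketch. It assumes a Bradley-Terry style model, which is a common choice for comparative judgement; I have not confirmed that this is what No More Marking runs internally. The function name, the (winner, loser) input format and the small pseudo-win are all my own illustrative choices.

```python
import math
from collections import defaultdict


def bradley_terry(judgements, iterations=200):
    """Estimate a quality score per script from (winner, loser) pairs.

    Assumption: a Bradley-Terry model fitted with a simple iterative
    update. This is an illustration, not No More Marking's actual code.
    """
    scripts = sorted({s for pair in judgements for s in pair})
    wins = defaultdict(float)
    pair_counts = defaultdict(int)
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        updated = {}
        for s in scripts:
            denom = 0.0
            for t in scripts:
                n = pair_counts.get(frozenset((s, t)), 0) if t != s else 0
                if n:
                    denom += n / (strength[s] + strength[t])
            # Ad hoc pseudo-win of 0.1 so a script that never wins
            # does not collapse to a strength of exactly zero.
            updated[s] = (wins[s] + 0.1) / denom if denom else strength[s]
        # Rescale so the geometric mean of the strengths stays at 1.
        log_mean = sum(math.log(v) for v in updated.values()) / len(updated)
        strength = {s: v / math.exp(log_mean) for s, v in updated.items()}

    # Report on a log scale: higher means judged better more often.
    return {s: math.log(v) for s, v in strength.items()}


# Hypothetical judgements: each tuple is (winner, loser).
example = ([("A", "B")] * 3 + [("B", "A")]
           + [("B", "C")] * 3 + [("C", "B")]
           + [("A", "C")] * 3 + [("C", "A")])
scores = bradley_terry(example)
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up judgements, script A wins most often and C least, so the ranking comes out A, B, C. No judge ever needs a criteria sheet; the scores come entirely from the pattern of wins and losses.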

Our English staff, in particular, spend many hours developing complicated assessment rubrics, continua, marking schemes and criteria sheets. They use these to assess student work, then spend more hours moderating it. This often ends in further modification of the criteria sheets, continua and rubrics, and in more and more arguments.

Maybe No More Marking could make that job easier.

For our professional development day, I set up a task on No More Marking. I scanned in 12 essays from the NAPLAN marking guide. I then asked a group of teachers from various faculties to make judgements on the scripts. The software displays two essays side by side on the same screen. The only instruction I gave was, “You need to pick the best piece of writing.”


The English teachers in the group immediately started asking questions about “audience and purpose” and “what are assessment criteria” and “how much I should weight technical writing vs ideas.”

The maths and science teachers just got on with judging.

I wanted to see whether judging with No More Marking, using no criteria at all, would generate the same rank order of scripts as the detailed NAPLAN writing criteria.

The NAPLAN writing criteria have been heavily criticized recently. The criticism has been that the criteria are too complicated, are too heavily weighted towards technical aspects of writing, and will not produce quality writing.

The teachers were asked to make between 15 and 20 judgements each.

Try it yourself here. The task is still set up.

After a little initial instruction, teachers found the program very easy to use and found the judging quite simple.

My favourite feedback from a teacher: “So it is kind of like Tinder for essay marking. Swipe right if you like it, left if you don’t.”

Once the English teachers got over the fact that we were not using any criteria sheets or marking schemes, they began to make judgements. Within minutes our reliability figure was over 0.9. After 100 judgements in total, I showed the staff their judging statistics on the data projector. They were captivated by this and became quite competitive about who had the best reliability figures. Side note: the English staff did have slightly higher reliability than other teachers.

So how did we do at ranking the scripts? Would we be able to generate the same rank order of scripts using no assessment criteria? We applied none of the complicated 10-criterion, 48-point marking scheme to do the judging. We simply responded to the prompt “Choose the better piece of writing.”

Here are the results after 340 judgements.

The Yellow Columns were the scores and ranking from the NAPLAN marking guide. The rest of the data is what No More Marking generates.

When I set up the task, I set the scaled scores to run from 0 to 48, as I was interested to see how close No More Marking (NMM) would get to generating a number like the one the marking guide generates.
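
To give a feel for what putting scores on a 0 to 48 scale could involve, here is a small sketch. It assumes a plain linear stretch between the lowest and highest script, which is my guess at the simplest possible approach and not a description of how the software actually scales. The function name and the sample numbers are hypothetical.

```python
def rescale(scores, low=0.0, high=48.0):
    """Linearly map raw scores onto the range low..high.

    Assumption: a simple min-max stretch. The real tool may scale
    differently; this only illustrates the idea.
    """
    smallest = min(scores.values())
    largest = max(scores.values())
    if largest == smallest:
        # Every script scored the same, so place them all mid-scale.
        return {name: (low + high) / 2 for name in scores}
    span = largest - smallest
    return {
        name: low + (value - smallest) * (high - low) / span
        for name, value in scores.items()
    }


# Hypothetical raw scores for three scripts.
raw = {"script_a": -2.0, "script_b": 0.0, "script_c": 2.0}
scaled = rescale(raw)
```

Under this assumption the weakest script always lands on 0 and the strongest on 48, so a scaled score says where a script sits within this particular batch. It is not the same thing as a mark awarded against the 48-point criteria.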

You can see the rank order was pretty close.
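
If you want to put a number on "pretty close", one option is a Spearman rank correlation between the two sets of scores. The sketch below is my own suggestion and not something the software reported to us. The two score lists are made up for illustration and are not our results.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) down, sharing ranks on ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the tied positions
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all ranks are equal")
    return cov / (spread_x * spread_y)


# Hypothetical scores for five scripts (not the data from our task).
guide_scores = [40, 36, 36, 22, 15]
judged_scores = [44.0, 35.5, 37.0, 18.0, 20.0]
agreement = spearman(guide_scores, judged_scores)
```

A value of 1 means the two methods put the scripts in exactly the same order, 0 means no relationship, and -1 means the orders are reversed.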

Marking guide scripts DS50-52 and DS54-55 were given the same score when marked against the 48-point criteria sheet. Both were given a score of 36/48 by the NAPLAN examiners. Our judges and NMM ranked them almost equal as well.

So a group of teachers, including some who are not expert in marking essays, were able, in the space of 15 minutes and using no criteria sheet, rubric or marking scheme, to come up with a ranking of scripts very similar to the one produced by very detailed criteria and scoring systems.

NMM has real potential to save time and effort and to increase reliability in the assessment of writing.
