By the time you read this, the school year will be coming to an end. Pupils and teachers will be looking forward to the long summer holiday.
Youngsters in Years 11 and 13 will be breathing a sigh of relief at having finished their exams and looking forward to the next stage of their education, and indeed their lives. They will have spent hours both revising for and sitting exams over the past three months. A typical GCSE student can face over twenty different assessments, from oral exams in languages to maths tests, and A Level students can sit well over ten different exams. Conscientious students will have spent hundreds of hours learning facts, drawing mind maps and practising past paper questions. A huge amount of effort has been put in by pupils, teachers and parents to make sure nothing can go wrong on the day and the boys and girls get the grades they deserve. The shocking and depressing fact is that much of this effort will be wasted, because in around one in four cases the grade they get in their exams will be wrong.
The reasons for this are not what you might expect. Yes, there is a shortage of high-quality examiners in some subjects, but exam boards are getting better all the time at quality control and at weeding out poor markers. Nor is it down to exam board errors or horror stories such as scripts being lost. The reason why up to 50% of grades in some subjects are wrong is much more fundamental than that and, under our current system of exam grades, is unavoidable.
It’s also quite technical, so bear with me. Grade reliability means, in an exam context, that a particular script submitted by a candidate should get the same grade no matter which examiner marks it; in other words, if it were marked twice by two separate examiners, their marks should agree.
In a subject like Maths this happens most of the time; answers are either right or wrong, so there is little room for dispute. In a subject like History, though, the quality of a candidate’s answer is much more subjective. Of course, the examination board produces marking guidelines, but it still falls to the examiner to distinguish between, say, a good analysis of a source and an excellent one. This isn’t easy. So one examiner might award a paper 55% while another would give the same paper 58%. Not much variation, but if the grade boundary for a B grade is 57% then we have a problem. The problem is even worse at GCSE, where the marks separating a grade 4 from a grade 5 may span only a few percentage points. This isn’t poor marking or bad examining; the fuzziness of the grade is inherent to long, discursive subjects like History or English.
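For readers who like to see the numbers, here is a minimal simulation sketch of the effect. The 57% boundary comes from the example above; the ±3 mark examiner-to-examiner spread and the range of candidate ability are illustrative assumptions for the sake of the demonstration, not real exam-board figures.

```python
import random

random.seed(42)

GRADE_BOUNDARY = 57   # hypothetical B-grade boundary, from the example above
MARKING_SPREAD = 3    # assumed examiner spread, in percentage points
TRIALS = 100_000

disagreements = 0
for _ in range(TRIALS):
    # A candidate's underlying quality, expressed as a mark out of 100
    true_mark = random.uniform(40, 75)
    # Two examiners each add their own honest-but-fuzzy marking error
    mark_a = true_mark + random.uniform(-MARKING_SPREAD, MARKING_SPREAD)
    mark_b = true_mark + random.uniform(-MARKING_SPREAD, MARKING_SPREAD)
    # Would the two examiners award different grades to the same script?
    if (mark_a >= GRADE_BOUNDARY) != (mark_b >= GRADE_BOUNDARY):
        disagreements += 1

print(f"Examiners disagree on the grade for {disagreements / TRIALS:.1%} of scripts")
```

With these toy numbers, a few per cent of all scripts land close enough to the boundary that two conscientious examiners would award different grades, and for a script sitting right on the boundary the chance of disagreement approaches 50:50.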
Does this really matter? Well, look at the stats: a pupil who needs AAA to read History at Durham University could be awarded AAB and be rejected, even though there is a 50:50 chance that, had another examiner marked their paper, they would have got the three As they needed. More seriously, thousands of youngsters each year score a 3 in English Language and have to resit, at their own and the country’s expense, to get that grade 4. With another examiner, they would have got the 4 first time round.
So, what’s the solution? The short answer is that there isn’t one. Or at least, there isn’t until we scrap the “cliff-edge” nature of grades and grade boundaries, which suited education in gentler times. Now, with university places hanging on such fine margins, we need to go back to the drawing board and find a fresh approach to pupil assessment.