There are several levels of significance in what has come to be known on Twitter and elsewhere as the ‘GCSE fiasco’.
Of greatest importance is the significance for the individual students whose futures may be adversely affected by the GCSE grade decisions.
Then there is the significance for schools and colleges, not only in terms of league table scores but also in the entry arrangements for post-16 courses.
There is significance for teachers, whose students achieved lower grades than they predicted. For many years, the more experienced teachers have been predicting their students’ results with a good degree of accuracy – and suddenly they learn that their predictions are too high. And teachers, having worked to raise student aspirations, care about what their students have achieved too.
The current debate is significant for the government, for Ofqual, for awarding bodies and for the thousands of people who work as examiners and markers, and whose work is now being picked over.
At the root of all this, however, lies an issue that affects the way that all GCSEs and A-levels are graded and is therefore of huge significance to the future of all qualifications in England. This is the question of whether we have a criterion-referenced system – in which work of a certain standard gets the same grade every year – or a norm-referenced system – in which the number of candidates gaining each grade is kept the same each year, even if their performance is improving.
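The distinction can be made concrete with a small sketch in Python. The marks, thresholds and quotas below are invented purely for illustration; they are not Ofqual's actual figures or method:

```python
# Illustrative only: the same set of marks graded two ways.
# All numbers here are invented for the example.

marks = [38, 45, 52, 58, 61, 67, 70, 74, 80, 88]

# Criterion referencing: fixed mark thresholds define each grade, so if
# the cohort improves, more candidates reach the higher grades.
def criterion_grade(mark, thresholds={"A": 80, "B": 70, "C": 60, "D": 50}):
    for grade, cutoff in thresholds.items():
        if mark >= cutoff:
            return grade
    return "U"

# Norm referencing: fixed proportions of the cohort get each grade,
# regardless of how well candidates actually performed.
def norm_grades(marks, quotas={"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.2}):
    ranked = sorted(marks, reverse=True)
    assigned, i = {}, 0
    for grade, share in quotas.items():
        n = round(share * len(marks))
        for m in ranked[i:i + n]:
            assigned.setdefault(m, grade)
        i += n
    for m in ranked[i:]:
        assigned.setdefault(m, "U")
    return [assigned[m] for m in marks]

print([criterion_grade(m) for m in marks])
# → ['U', 'U', 'D', 'D', 'C', 'C', 'B', 'B', 'A', 'A']
print(norm_grades(marks))
# → ['U', 'U', 'D', 'D', 'C', 'C', 'C', 'B', 'B', 'A']
```

Note that the candidate on 70 marks gets a B under the fixed thresholds but only a C under the fixed quotas: under norm referencing, a candidate's grade depends on how everyone else performed, not only on their own work.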
Until Keith Joseph courageously introduced criterion referencing of A-levels and O-levels in 1984, norm-referencing acted as an agent of social conservatism, keeping down the number of students at the higher grades and making it impossible to know whether schools and colleges were raising achievement year on year.
Since then we have been given to understand that the system had been operating on criterion referencing. That is to say, the number of candidates achieving each grade depended on the standard of their work, not on the proportion at each level.
Just as in 2002, when questions were raised about the grading of the then-new A-levels, we now know that statistical wizardry is involved in setting grade borderlines, using the prior attainment of the cohort of candidates, i.e. any previous exam results they had obtained. At A-level, this involves the grades achieved at GCSE two years earlier by each cohort – a not unreasonable proxy for the ability of the cohort, one might think, unless of course the GCSE grades themselves were in turn based on some other, less secure, previous measure.
Now we know that the prior measure in question turns out to be the key stage 2 results, achieved five years earlier by each GCSE cohort. I have never been a fan of key stage 2 results as an accurate indication of the ability of each 11 year old. As the foundation of the house of subsequent GCSE results, I would put them more into the category of sand than rock.
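The general shape of this kind of prior-attainment modelling can be sketched as follows. The bands and probabilities are invented for the example, and the real calculation is considerably more elaborate; the point is only the principle: each candidate's prior attainment maps to a historical likelihood of each grade, and summing those likelihoods gives the expected grade distribution against which the cohort's results are judged:

```python
# Illustrative sketch only: hypothetical historical GCSE outcome rates by
# KS2 prior-attainment band (all numbers invented).
historical = {
    "low":    {"A*-A": 0.02, "B-C": 0.38, "D-G": 0.60},
    "middle": {"A*-A": 0.10, "B-C": 0.60, "D-G": 0.30},
    "high":   {"A*-A": 0.45, "B-C": 0.50, "D-G": 0.05},
}

def predicted_distribution(cohort_bands):
    """Expected share of the cohort at each grade band, given each
    candidate's prior-attainment band."""
    totals = {grade: 0.0 for grade in ("A*-A", "B-C", "D-G")}
    for band in cohort_bands:
        for grade, p in historical[band].items():
            totals[grade] += p
    n = len(cohort_bands)
    return {grade: round(t / n, 3) for grade, t in totals.items()}

# A hypothetical cohort: 20 low, 50 middle, 30 high prior attainers.
cohort = ["low"] * 20 + ["middle"] * 50 + ["high"] * 30
print(predicted_distribution(cohort))
# → {'A*-A': 0.189, 'B-C': 0.526, 'D-G': 0.285}
```

The weakness the article identifies sits in the `historical` table: if the KS2 results feeding it are themselves unreliable, every prediction built on them inherits that unreliability.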
Key stage 2 tests – whatever view you might have on their robustness – do not cover speaking and listening, which is central to controlled assessment. It is therefore not sensible to use KS2 data as a predictor of performance in a GCSE that includes controlled assessment.
Regarding the letters between Ofqual and Edexcel, it is not clear how the enlarged Edexcel entry might have influenced the results. For example, the additional candidates could have been disproportionately from selective schools, which could have caused a rise in grades.
What we have at the moment is a strange mix of criterion- and norm-referenced methodology. A key outcome of the Tomlinson Inquiry in 2002 was to stress the ‘professional judgement’ of senior examiners. As such, statistics are but one of the pointers available to awarding committees.
As a nation we are surely unique in our preoccupation with year-on-year changes of fractions of a percentage point, resulting in an annual media scrum each August – examining is not the exact science that the public has been led to assume.
Given the investment in education, why are we surprised by improvement? We should expect it – even demand it; the Independent Panel chaired by Eva Baker in 2002 rightly questioned the extent of surprise every August.
As ASCL general secretary Brian Lightman said to the Select Committee on 11 September, “Let’s get back to what are the standards”. This will not be easy when, as Ofqual regulator Glenys Stacey commented, there is huge turbulence in the system of GCSEs: changes are continually being made to the way that assessment takes place, making Ofqual’s job much harder, but also much more important.
As chair of the Chartered Institute of Educational Assessors, I am working to develop the professionalism of those involved in the exams industry, through CIEA membership and accreditation to higher levels of professional recognition. Few workforces can have experienced more change than these people in the last 20 years, and especially in more recent times. The answer should lie not in manipulating grades, but in setting exams of comparable rigour year on year. Examiners need to know the standards to which they are working, as do the teachers whose skill in predicting outcomes and potential helps to motivate young people to raise their aspirations.
Following Tomlinson’s inquiry in 2002, each awarding body was required to look again at the awards. The accountable officer and his staff sat with the chairs of examiners to re-examine the grades awarded. Examiners were asked whether they were satisfied with the outcomes and whether they had any concerns about the awarding process. The same process, observed by outside experts, could be run again for GCSEs in 2012.
We owe it to the individual young people taking exams – much more than to the statistics behind each cohort – to make criterion referencing work properly, to be much clearer about what is a grade A and what is a grade C and to implement that consistently, through a highly professionalised assessment workforce, so that there is fairness across all subjects, all awarding bodies and over time.