Why Sir Michael Wilshaw is Wrong about More National Tests

In Debates and Polemics by Alex Quigley


“For every complex problem there is an answer that is clear, simple, and wrong.” H.L. Mencken

In the most recent annual review conducted by OFSTED, Michael Wilshaw has once more come out shooting from the lip. There is of course an irony that as Wilshaw damns the schools of the nation for “mediocrity”, he omits the one quango with the most power to facilitate change – for good or ill – his very own organisation. This irony was brilliantly put by Matthew Taylor, from the RSA, here. One crucial observation made by Wilshaw, with little evidence beyond anecdote that I can detect, was that there should be a return of national tests at 7 and 14. This will, apparently, usher in a rigour that can help us return to the top of international edu-league tables. Only, like many of Wilshaw’s headline-friendly utterances, once the dust has settled from his personal policy suggestions, his ideas appear to be all bombast and little ballast for our schools.

I have been researching curriculum models and testing a great deal over the last few months. We are due to usher in a new curriculum and many teachers on the ground are having to look beyond the short-termism engendered by quangos like OFSTED to create a curriculum fit for our students. As Wilshaw himself said: “Our education system should be run for the benefit of children, and no one else.” Crucially, such national tests become flawed because their purpose moves away from enhancing the quality of teaching and learning. In our school system, national tests become a tool for accountability systems, like league tables and our old friend OFSTED (Michael has unveiled another irony here), and not for learning and developing knowledge. The teaching becomes skewed and quality learning becomes diluted and is replaced by question spotting and teaching to the test. ‘Goodhart’s Law’, named after the economist Charles Goodhart, applies here. His application was for economics, but more generally, Goodhart’s Law reveals the truism that as soon as you turn a measure into a target – like aged 14 SATs becoming a national benchmark for schools – then it loses its value as a measure. In our accountability-obsessed culture, assessment becomes distorted.

National testing is always going to be a rather dissatisfying blunt tool. Whilst national tests are necessary at 11, 16 and 18, selecting more of them, rather than high quality internal school-based testing and formative assessments, would appear a mistake. Getting rid of vague, generic levels was a very positive move. We should instead help share high quality models of assessment across and within schools, and adapt them for our own context and curriculum. The latest DfE document on Primary assessment and accountability puts the matter clearly thus:

“1.6 As we have previously announced, the current system of national curriculum levels and level descriptions will be removed and not replaced. Our new national curriculum is designed to give schools genuine opportunities to take ownership of the curriculum. The new programmes of study set out what pupils should be taught by the end of each key stage. Teachers will be able to develop a school curriculum that delivers the core content in a way that is challenging and relevant for their pupils. Imposing a single system for ongoing assessment, in the way that national curriculum levels are built into the current curriculum and prescribe a detailed sequence for what pupils should be taught, is incompatible with this curriculum freedom. How schools teach their curriculum and track the progress pupils make against it will be for them to decide. Schools will be able to focus their teaching, assessment and reporting not on a set of opaque level descriptions, but on the essential knowledge that all pupils should learn. There will be a clear separation between ongoing, formative assessment (wholly owned by schools) and the statutory summative assessment which the government will prescribe to provide robust external accountability and national benchmarking. Ofsted will expect to see evidence of pupils’ progress, with inspections informed by the school’s chosen pupil tracking data.”


Of course, there is the essential point that each school is subject to the statutory requirements of external accountability. Only, with the reality of the enacted curriculum in the classroom, testing can distort the raising of standards – students get trained to beat tests and deeper learning is too often forsaken. We have moved too far away from assessment being for learning and instead it becomes a tool for school accountability. Tim Oates and Sylvia Green summarise the argument:

“Many have argued that the publication of national league tables in England and the pressure this places on teachers, schools and students has had a detrimental effect on teaching and learning because the accountability function impedes the ability to use assessment as an integral part of the learning process, thus placing the teacher in a context with strongly competing imperatives.”
Tim Oates and Sylvia Green – Cambridge Assessment http://www.cambridgeassessment.org.uk/Images/109749-how-to-promote-educational-quality-through-national-assessment-systems.pdf (p6)

They highlight further flaws in both the stability and impact of national testing:

“We investigated the stability of test standards at ages 7, 11 and 14 years of age in English, maths and science from 1996 to 2001 and reported varied findings across age groups and subjects with some tests appearing more lenient and some more severe over time. The extent of the variation immediately raised questions in respect of the claims being made for significant gains in national attainment.” (P4)

“There are a number of other negative impacts of national testing which are well-rehearsed in the literature. There is evidence to suggest that increases in performance are often found when high stakes tests are introduced because teachers and students become familiar with the test requirements rather than as a result of real improvements in learning. Negative effects occur when too much time is spent on memorisation, question spotting and test practice to the detriment of positive teaching and learning.” (P4-5)

In his role chairing the ‘Expert Panel for the National Curriculum review’ (also comprising Professor Mary James, University of Cambridge; Professor Andrew Pollard, University of Bristol and Institute of Education, University of London; Professor Dylan Wiliam, Institute of Education, University of London), Tim Oates also highlights further concerns about the blurring between assessment and accountability systems:


“4.21 In addition, we are concerned that an instrumental attitude, which values test and examination results and certificates as ends in themselves, has become increasingly evident in the English system. This diminishes the priority that should be given to ensuring that the underlying learning being accredited is deep and secure. In order to mitigate this narrow instrumentalism in learning, urgent attention will need to be given to relevant control factors, particularly assessment systems and accountability measures affecting all schools. If assessment and accountability systems are to be valid, they need to represent all valued learning outcomes not just a narrow subset of them. In this context, the role of Ofsted and school governors in ensuring that a school’s curriculum is broad, balanced and fit for purpose will be crucial.”


This research on the National Curriculum suggests that it is not incompatible with testing at 7 and 14, but if the recommendation to remove levels from KS3 in particular is to bear fruit, then a nationally standardised assessment will likely pull in the opposite direction. The enacted curriculum will be one of following the new levelling system proposed for the test, regardless of the local curriculum devised with the apparent freedoms being offered to teachers. It stems from an accountability system driven by fear.

On the surface, this headline-grabbing solution by Wilshaw seemingly appeals to the notion of standards, only the reality is that having more challenging tests will not improve learning or engender a system-wide improvement in classroom teaching, any more than furnishing my garage with a fleet of heavy weights will turn me into a strongman. The NC expert panel report recognises that feedback in the form of grades and levels is “too general to unlock parental support for learning, for effective targeting of learning support, or for genuine recognition of the strengths and weaknesses of schools’ programmes”. As Russell Hobby asks, why are we spending £100 million a year on Wilshaw’s quango to do lesson observations if by his words we can rely on test data to raise standards? Indeed, what is the point of OFSTED if not to ensure a school’s curriculum is fit for purpose?

Now, I’m not against testing per se. I think the ‘retrieval practice’ students undertake with a summative test can actually aid learning – see my post on the testing effect here, supported by the research here. Only, if the purpose of the test is confused with national accountability measures, then the high-stakes assessment changes its nature and purpose. Teachers will inevitably teach to the test in ways that can inhibit the deep learning proposed by Tim Oates and his expert panel.

In our English department we can set about creating our own ‘fit for purpose’ annual tests (see our KS3 draft curriculum design here and our proposed assessment model here). These tests can be designed to reinforce the knowledge we have focused upon across the school year. The degree of accuracy and the general efficacy of the test can prove much more successful than any national test. Ironically, there would be more tests, but they would be in purposeful conditions and give us useful assessment information. We can report to parents with accuracy and formative language, unimpeded by a generic national framework of numbers that then trumps all other forms of feedback.

In summing up my argument I return to the words of the true experts on assessment and their review document, Oates and Wiliam et al.:

“If assessment and accountability systems are to be valid, they need to represent all valued learning outcomes not just a narrow subset of them. In this context, the role of Ofsted and school governors in ensuring that a school’s curriculum is broad, balanced and fit for purpose will be crucial.”

Perhaps Michael Wilshaw should leave the DfE to focus upon devising a national curriculum and purposeful statutory testing, whilst he ensures the role of OFSTED is proving effective in helping our so-called “mediocre” schools improve. Otherwise, it may prove that his organisation is not fit for purpose.