Feedback has always been all the rage!
In all my years of teaching I think I know instinctively the things that make the difference to learning. One such tipping point of learning for me has been those small, nearly unremembered chats with students about their work. The subtle hint to change course; the exasperated plea to structure their writing; the nuanced tweak for an already well crafted piece of writing. It has been my ‘educated intuition’ that those conversations made a difference for my students. It was an intuition broadly shared by my English colleagues. So much so that we enshrined such ‘one-to-one’ feedback, to give it a rather more formal title, in our lessons and in our feedback policy.
We thought it worked. Our intuition and teacherly wisdom led us to create oral feedback weeks in each term. We began in earnest last year. Our path as a department is well charted in the following posts:
Strategies for improving written feedback
Strategies for improving oral feedback
Moving from a ‘marking policy’ to a ‘feedback policy’
A particular focus for the department has been not only to improve the quality of feedback, but also to make a concerted effort to improve the quality of student responses to feedback. For us it has culminated in an effort to embed DIRT in our English lessons: Dedicated Improvement and Reflection Time. We have been trialling DIRT strategies as a concerted departmental focus since last year, and DIRT was therefore embedded into the core of our feedback weeks.
It all sounds like a flawless plan, I think you’ll agree! Only, we found it wasn’t quite perfect!
We found our ‘one-to-one weeks’, where we endeavoured to have a feedback session with each individual student, were good opportunities for DIRT, but that students needed lots of scaffolds, targeted support and modelling. More than we expected. Cue hasty replanning. Also, some groups were trickier to manage than others. It meant our feedback weeks weren’t all smooth sailing. Some of us found them badly timed in the term and felt they needed tweaking at the very least. Perhaps getting rid of them altogether?
Our ‘educated intuition’ was tested. Frankly, we were unsure whether such ‘one-to-one’ conversations, one of the important cogs in our DIRT strategy, were making any impact. Could such a small unit of teaching make a worthwhile difference? Added to the complications of the new strategy, we needed to know whether the time invested was proving worth it.
Put simply, did it work?
Late last school year I was approached by my Head, John Tomsett, about the potential of doing some research (aided by the brilliantly helpful Jonathan Sharples). The opportunity was perfectly timed. We could go some way to investigating whether our feedback weeks worked in our school – and in our context.
We had a question and we wanted some tangible evidence to challenge or support our intuition. We used the Education Endowment Foundation guide to help us undertake a small but tightly matched trial that tested the impact of such feedback sessions. Let me take you through the steps:
We had a question we wanted to ask:
Here is the outline of the controlled trial:
Here is how we attempted to match the trial. We took similar groups and had two great teachers, Helen and Claire, plan and deliver a writing unit, assessed with our own criteria. Crucially, our criteria assess three writing strands: Writing 1 – purpose, audience and style; Writing 2 – structure of paragraphing and sentences; Writing 3 – spelling, punctuation and grammar. We wanted to see if ‘one-to-one’ oral feedback had an impact on any or all of these strands when students were asked to do equivalent review writing tasks.
One group, the ‘treatment group’, was given an extra ten-minute ‘one-to-one’ feedback session to supplement and explain the written feedback. The ‘control group’ was given just the written feedback. Both classes undertook the same DIRT strategies, and further supplementary tasks, during the feedback week:
Here is the evidence. On the right-hand side you can see the sub-level improvement for each writing strand. The conspicuously green column is Writing 1 – writing for purpose, audience and style:
And here is the summary of data (ably supported by our Subject leader of Maths, Mike Bruce):
So the ‘one-to-one’ oral feedback weeks did work. The success was not in spelling, punctuation or grammar (as you would perhaps expect, given that these are long-term issues); the significant gains came in how the teacher talk could illuminate for students how to write with a more fine-tuned style, crafted for the specific audience. It confirmed our educated intuition. One ten-minute strategy, seemingly not too significant, added a huge amount of progress.
We were quietly pleased! We shared our findings and we endeavoured to continue with ‘one-to-one’ feedback, enshrining it in our practice, but with some tweaks. Gone are the timetabled weeks, fixed throughout the year. Instead, we are retaining the ‘one-to-one’ feedback, with the recording of it in writing in their books and so on, but we are more flexible about when we undertake the feedback. DIRT too is something we are honing and modifying as we teach. The matched trial gave us evidence, but coupled with the experience of actually delivering the feedback, we still needed to make adaptations. The classroom certainly isn’t a control room for experimentation!
The whole process of testing what works has been really illuminating – shining a critical light on our intuitions and testing our biases. The evidence and the research is far from flawless, but the learning is the thing. I would encourage anyone to trial the Education Endowment Foundation Toolkit and test what works.
What we choose to teach and how we choose to teach matters. We should be mindful of what works in schools across the world and across our local region, but we should aim to reflect and test what works for our unique students in our unique school context.
Interesting post – thanks for sharing. I wholeheartedly agree that we should aim to examine “what works for our unique students in our unique school context”. I think this is a good example of an attempt to do just that, and this is where some of the larger and eye-wateringly expensive trials the EEF is carrying out fall down, IMO. That said, we should aim to scrutinise any attempt at education research, so I have a few questions, if I may?
1) Am I correct in thinking that the 2 groups received identical written feedback and undertook identical DIRT activities, with the only difference being that the experimental group each received an additional 10 minutes of oral feedback? If so, can this really be said to be a fair test? If the comparison is “one type of feedback versus two”, what’s the difference between this and “no feedback versus some”?
2) Did the oral feedback take place within DIRT lessons? If so, how many lessons did this take up? ie, does this mean 29×10 minutes = 5 hours in which 28 students received no instruction? What activities did the control group do in the equivalent 5 hours? Should you have done testing to compare differences between the two groups in this domain also, in the interests of combatting / addressing bias?
3) What stats did you run? How did you arrive at the 4 months figure? Does this mean to suggest that the students made 4 months worth of progress in 10 minutes?
4) What format did the oral feedback take? Did it primarily focus on ‘Writing 1’ type activities? – ie is there a clear relationship between the input and the output? Why do you think Writing 2 and Writing 3 didn’t increase to the same extent?
5) How was the work assessed? Was it blinded – ie did the markers know whether they were marking the control or experimental group?
1) Yes. It is effectively conflating the written and oral feedback. In many ways it is about what I would see as ‘ideal’ feedback: written comments communicated orally, to iron out the nuances and explain the details of the written comments, with the work there to talk through.
2) The DIRT week took up three hours in actuality, so each 1-2-1 wasn’t exactly ten minutes; many proved shorter. We didn’t want to enforce an artificial exact time on the oral feedback. They did DIRT activities in the first lesson, then they were given the same extension activities in the subsequent lessons to combat bias. I reckon this is the point where you simply cannot get exactly matched lesson provision.
3) We used our own sub-levels to get a sub-level differential. The effect size was calculated by John Tomsett, using the EEF guide I think. It doesn’t mean four months’ progress solely due to the ten minutes, but over the course of the month-long intervention – the ten minutes accelerating the progress. (A rough sketch of how such an effect size can be calculated is included at the end of this reply.)
4) The 1-2-1 feedback had the teacher, Helen, sit with the student, go through their work (pre-test) and talk through the details of targets etc. They talked through Writing 1, 2 and 3 in the one conversation. I think it proved for us that you can better explain style features (Writing 1) by showing students their work and unpicking the nuances of their writing. Writing 2 and 3 had lots of technical grammar points, such as issues with punctuation, which aren’t remedied in as short a space of time as a couple of weeks. Also, Writing 1 was quite domain specific to the genre they were writing, whereas structure and SPaG are more general, and I don’t think students are able to grasp their application in a similar piece of work as quickly as they grasp the features of the genre, such as style.
5) The assessment wasn’t blinded. The teachers marked five random pieces from each group to moderate their marking, then marked their own and moderated again, which was subsequently checked by me for even standards of marking.
I would say that there are many flaws in the process and the reporting. Isolating any one variable is of course very difficult. The calculation of an effect size is also problematic. That being said, we learnt a huge amount from undertaking the trial and I am looking forward to having a go at another one, on a larger scale, learning some lessons from what we have done here.
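To make that calculation a little more concrete, here is a minimal sketch, using made-up sub-level gains rather than our actual figures, of how a pooled-standard-deviation effect size of the sort the EEF guide describes can be worked out:

```python
# Illustrative sketch only: the gain scores below are made up, not our data.
# It shows how a pooled-standard-deviation effect size (Cohen's d), of the
# kind the EEF guide describes, can be worked out from sub-level gains.
import statistics

treatment_gains = [2, 1, 2, 3, 1, 2, 2, 1, 3, 2, 2, 1, 2, 2, 3, 1, 2, 2, 1, 2, 3, 2, 1, 2, 2, 1, 2, 2, 3]  # hypothetical
control_gains = [1, 1, 2, 1, 0, 1, 2, 1, 1, 2, 1, 1, 0, 1, 2, 1, 1, 1, 2, 1, 1, 0, 1, 2, 1, 1, 1, 2]  # hypothetical

mean_t, mean_c = statistics.mean(treatment_gains), statistics.mean(control_gains)
sd_t, sd_c = statistics.stdev(treatment_gains), statistics.stdev(control_gains)
n_t, n_c = len(treatment_gains), len(control_gains)

# Pooled standard deviation across both groups, then the effect size.
pooled_sd = (((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)) ** 0.5
effect_size = (mean_t - mean_c) / pooled_sd

print(f"Mean gain (treatment): {mean_t:.2f}")
print(f"Mean gain (control):   {mean_c:.2f}")
print(f"Effect size: {effect_size:.2f}")
# The EEF then converts an effect size like this into its 'months of
# additional progress' figure via a published lookup table.
```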
Thanks for clarifying. I agree, you learn a huge amount from doing research – often about things other than your intended research question, and in particular about the process of doing research itself. This is an interesting early attempt at a quasi-experimental study – however there are a number of significant methodological issues with the study as you describe it here. If you think giving oral feedback is a good idea based on ‘educated intuition’ then I think you should absolutely go for it. However if you want to subject it to the rigours of education research – to objectively evaluate whether and how it enhances student learning – then I think you need to address a number of issues.
In particular, the control vs treatment groups could and should be much more closely matched. For example, the very fact that the student had had a 10-minute one-to-one chat with their teacher – a pretty unusual occurrence in a lesson – may have made them feel more attended to, leading them to put more effort into their work that week, in an effort to pursue further positive individualised feedback for example. I know this may seem pedantic, but it’s important – I wouldn’t suggest that this would be solely responsible for the improvements you found, but it could have a sizeable impact in and of itself, and you need to rule out any extraneous variables like this if you are to make claims about the impact of an intervention. So for example the control group could have a 10-minute one-to-one chat about something else that would be useful to talk about, but does not centre on feedback about that particular piece of work.
In addition the two groups should really have the same teacher – a structural issue which makes things incredibly difficult I know, but in terms of a supposedly matched trial it’s huge. The nature and content of the preceding lessons and feedback should also be highly controlled and comparable.
There are also a number of features of high quality education research missing here. For example, statistical analysis – 29 kids is easily enough to run stats on, and easy to do. I know it’s an EEF thing, but measuring progress in terms of months doesn’t make sense to me at all. I do however put some faith in statistics – are these gains statistically significant or could they be attributable, at least in part, to chance? Also triangulation – quantitative analyses should always be subject to qualitative interrogation, to tease out the story behind the figures. For example you could have interviewed a sample of experimental group students afterwards – those for whom the intervention had the desired effect, and those for whom it didn’t. What reasons do students give for whether their writing improved or not? And the blinded marking – this is very simple to do, and is absolutely necessary if you are serious about overcoming your biases.
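By way of illustration, and using made-up gain scores rather than the study’s actual data, a sketch like the hypothetical one below shows how quickly a Welch’s t-test could answer that question:

```python
# Illustrative sketch only, again with made-up gain scores: Welch's t-test
# asks whether the gap between the two group means is larger than chance
# alone would plausibly produce.
from scipy import stats

treatment_gains = [2, 1, 2, 3, 1, 2, 2, 1, 3, 2, 2, 1]  # hypothetical
control_gains = [1, 1, 2, 1, 0, 1, 2, 1, 1, 2, 1, 1]  # hypothetical

t_stat, p_value = stats.ttest_ind(treatment_gains, control_gains, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value (conventionally below 0.05) would suggest the difference is
# unlikely to be chance alone; it still says nothing about why it exists.
```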
I think it’s great that teachers are starting to engage in research. This seems to be a noble attempt to do just that, and I would strongly encourage you to continue and learn from your experiences. But this is a serious road to be going down, and we need to be hard-headed about doing it right – and that includes things like sharing a detailed methodology, running stats, and subjecting your study to peer review.
You say “the evidence and the research is far from flawless, but the learning is the thing”. I think this is where teachers and education researchers would part company (I am both, which makes me internally conflicted =). In a blog post where you are sharing data – the methodology is the thing.
Thanks for posting, and for doing so openly and transparently, and, frankly, I admire anyone who takes their practice so seriously as to engage in meaningful research with it. It’s hard and demanding work to conduct research, and the more educators commit to gathering, assessing and being driven by evidence, the better our practices and communities will be.
As a function of changing how I taught, I started to engage in more structured one-to-one feedback sessions with students, focusing on work they had done and on aims and goals, with agreed roadmaps based on those feedback sessions, and so on.
Certainly, students seemed to value the experience, my evaluations got better, and student focus, persistence and attention seemed to increase, but my experience is purely anecdotal and hardly counts as evidence. So I’m extremely curious to get an evidence-eyed view of it. Thanks for filling in part of that picture.
A few things strike me about your research. I’m curious as to whether there was a qualitative aspect. More specifically, I think the focus on outcomes is critical, and foundational in terms of evidence, but I’m also curious as to what students felt about the process. What type of feedback did they value most? How did their practices, ideas and engagements change? Did they feel they deployed more effort and time in pursuit of learning outcomes? Did you collect any qualitative data? (I’m not saying you should have, I’m just curious.)
Thanks, once again, for doing the work and posting.