The knotty issue of assigning curriculum levels to assessment results

21 September 2016

This is the fifth in a series of posts about making progress in science. The series is based on a talk I gave at SCICON in July 2016 and responds to the pressure many science leaders are facing to report their students’ progress against curriculum levels. Today I want to get to the very heart of the issue. How do we equate different measures – for example test or assignment results – with levels in the curriculum? The four traditional subject strands of the science learning area are organised around concepts, which in turn are arranged under broad thematic streams (ecology, using physics etc.). How do we determine where students sit against curriculum levels, which basically constitute a way of organising science ‘content’?

We do have one standardised assessment measure in New Zealand that reports students’ achievement against the levels of NZC. The careful process followed is food for thought, given the somewhat arbitrary processes many teachers are forced to use. I’m talking about NMSSA – the National Monitoring Study of Student Achievement. This programme provides the government with information about students’ progress across the whole curriculum, in year 4 and year 8. Learning areas are assessed two at a time on a rotating cycle. The study uses a ‘light sampling’ approach and one of its innovative features – inherited from its predecessor NEMP – involves individual students or small groups completing assessment tasks face-to-face with a trained teacher assessor. You can read about the first round results for science here. One component of that first round involved the completion of a pencil and paper survey. One focus of this assessment was communicating in science. Another was the students’ levels of science knowledge. The scale below shows how the assessment developers modelled dynamic relationships between communication capabilities and knowledge of science concepts.

Student responses to the 57 assessment items were used to construct the summary scale shown above. A full-size version can be accessed on page 18 of the 2012 report.

Notice that students’ knowledge and their communication capabilities have been broadly grouped into three main bands, and that each band is aligned with a statistically determined range of scale scores. To illustrate the nature of the progression described in this scale, the image below pulls out a statement about the same specific aspect from each band.

This is the detail I pulled out from the scale for my talk at SCICON.* The heading reflects the focus of the whole test but I have selected just one thread to display here.

* The goalpost clip art is from Shutterstock and the increasing size is meant to signal that the goal is getting bigger/harder.

The scale descriptors here broadly reflect the achievement objectives from the ‘Communicating in Science’ NOS sub-strand of NZC. There is an evident sense of progress, but the quantitative measure is still expressed in terms of ‘scale scores’. These are not linked to the curriculum levels in any obvious way, so how did the assessment team carry out the equating exercise that let them produce the image below? The full answer to this question is included as Appendix 3 to the 2012 report.

In brief, a team of eight science education experts took a full day to do this. Working with booklets of selected items ranked in order of difficulty (as completed by students) they individually decided, and then collectively debated, the placing of ‘cut points’ at which one curriculum level ended and the next began. The image below, which shows overall progress of the whole assessed cohort at year 4 and at year 8, could only be completed once this equating exercise had been done by the experts.

Notice that there are no hard and fast distinctions between levels 1 and 2, or between levels 3 and 4. The curriculum does not differentiate between these levels, so the science education experts rightly felt they had no mandate to do so.
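
For readers who like to see the mechanics, here is a minimal Python sketch of what placing cut points on a scale amounts to. It is an illustration only, not the NMSSA procedure (Appendix 3 of the 2012 report describes that). All the numbers are invented, the third band label is hypothetical, and taking the median of the individual judgements is my stand-in for the panel's collective debate.

```python
from bisect import bisect_right
from statistics import median

# Paired levels are reported as one band, mirroring the curriculum.
# The third label is a hypothetical placeholder for anything higher.
BANDS = ["Levels 1 and 2", "Levels 3 and 4", "Above level 4"]


def agree_cut_points(panel_judgements):
    """Combine each expert's proposed cut points into one agreed set.

    Assumption: the median of the individual judgements stands in for
    the outcome of the panel's discussion.
    """
    return [median(boundary) for boundary in zip(*panel_judgements)]


def band_for(scale_score, cut_points):
    """Return the band whose range of scale scores contains this score.

    A score exactly on a cut point is placed in the higher band.
    """
    return BANDS[bisect_right(cut_points, scale_score)]


# Invented judgements from three hypothetical panellists. Each list holds
# the scale scores at which that panellist would place the two boundaries.
panel = [
    [95, 140],
    [100, 150],
    [105, 145],
]

cuts = agree_cut_points(panel)
for score in (80, 120, 160):
    print(score, band_for(score, cuts))
```

The point of the sketch is that the arithmetic is trivial once the cut points exist. The hard part is the expert judgement that produces them.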

You might be wondering whether the exercise was worth the considerable effort and cost involved. The real value becomes apparent when we compare the relative progress of the two cohorts. At year 4 students are broadly on track in making progress against the NZC levels, but by year 8 many are falling behind where they would ideally be. No doubt you can think of a range of reasons for this pattern. Because the assessment measure was so carefully designed, the data also have some illuminating things to tell us.

My colleague Chris Joyce investigated overall patterns of answers to see if she could find any specific areas where students were not making expected progress. One aspect stood out – it’s captured in the ‘goalpost’ graphic above. Many year 8 students were still creating everyday representations to convey their science ideas, even when they were specifically asked to read or create representations that require the use of science conventions (e.g. drawing a food chain, where it is important that the arrows follow the direction in which energy flows).

This type of insight can be useful information for teachers who are seeking to lift achievement so that students make meaningful progress. NMSSA has a clear influencing agenda (the government needs to know whether and how year 4 and year 8 student cohorts are making the progress indicated by the curriculum). But, as we see here, a well-designed measure can meet an influencing agenda and still give teachers useful information for formative assessment purposes. One key to reconciling these competing purposes is that the curriculum linking exercise is robust and defensible – and, as I’ve sketched here, that takes time, money and deep expertise. I think it is unreasonable to expect individual teachers and schools to try to reinvent this wheel. What do you think?

Next week I’ll stay with the primary/middle school years and look at how progress is modelled in the second of the international tests in which New Zealand invests. The spotlight will be on TIMSS (Trends in International Mathematics and Science Study).