Making progress in evidence-based reasoning: what part do contexts play?

16 September 2016

[Image: cover of the Science Thinking with Evidence test]

This is the fourth in a series of posts about making progress in science. I’m shaping some thoughts and questions around a dilemma I know that many science leaders are facing when they are put under pressure to report their students’ progress against curriculum levels. What sort of measures should they use? How should these be aggregated into one overall judgement? What tells them where and how that measure equates with the curriculum levels of NZC? If you’ve been following the series you’ll already be aware that these are not questions with easy answers. But I hope I have begun making some inroads into possible ways to address the measurement challenges they put under the spotlight.

My strategy so far has been to glean indicators of possible progressions from an international assessment programme that claims to measure what NZC says we should value in science (i.e. informed and active citizenship). In the second and third posts I’ve looked at two different aspects of critical thinking that can be teased out from the detail of the PISA science scale. These aspects are causal reasoning and argumentation. Today I add contexts to the mix and introduce one of NZCER’s own assessment tools: Science Thinking with Evidence.

I would broadly group both causal reasoning and argumentation under the general cross-curriculum umbrella of evidence-based reasoning. This idea raises an important contextual question – reasoning about what? If we take the citizenship challenge laid down by NZC seriously, we’d have to answer that students should be gaining practice in developing their critical reasoning capabilities in the messy, uncertain contexts of real-world dilemmas. These implicate science but also extend beyond its boundaries.

I try to keep an eye on the work of Dana Zeidler, who has a long track record of critical research about student exploration of socio-scientific issues. He has recently argued that we deprive students of opportunities to develop really important aspects of capabilities for citizenship if we only draw on science concepts to support them in reasoning practice. Doing this leaves out opportunities to explore values and ethics, to develop a personal sense of responsibility, and to wrestle with issues of fairness, for example how decisions might impact different groups. In short, students need to develop moral reasoning in tandem with science reasoning, and to do this they need to explore rich and complex contextual dilemmas.

Where does this leave us with the measurement/progress question? Should we seek out assessment tools that set evidence-based science reasoning in social contexts? The argumentation progressions I discussed last week don’t do this – their context is particle theory. I don’t think PISA does this very well either. Even though questions are set in context, each context has to be relevant in so many different nations that they tend to be rather bland. Does this matter?

Science Thinking with Evidence (STwE) does ask students to engage with contexts as they think and reason. Admittedly the contexts are comparatively easy to understand, not rich and messy like real socio-scientific dilemmas. This is one of the limitations of a test format. But even within these constraints we could see that students need to slow down and be disposed to think carefully before they answer, or they can fall into reasoning traps. Some of these traps were quite unintentional. For example, taking our cue from an earlier PISA scale, we wanted to see if students could weigh up two or more data sources and then choose the one most appropriate to answer a question. Here’s the example I used in my SCICON talk, abbreviated for presentation purposes:

These are the two data sources. The context is the annual garden bird survey – a citizen science project managed by Landcare NZ.

Notice that each data set tells a different story about the same group of birds.

The question and pattern of responses from the trial students are shown below. (Green is the correct answer.)

Which bird is seen in the largest numbers in gardens?

A. Silvereye

B. Sparrow

C. Starling

D. Blackbird

You should be able to spot what’s happened here. Students who read the graph (in haste?) arrived at answer D. To arrive at answer A they needed to read the table. Being disposed to slow down and think carefully is often important when citizenship issues are being considered. So although we didn’t intend this to be a trap, it actually assessed something important that we could so easily have missed if the questions had been about graph or table reading skills out of context.

Every STwE question is accompanied by formative feedback to help teachers think about why their students might have answered as they did – and then support them to develop stronger reasoning skills as needed. This was our influencing agenda. But can the STwE tool measure overall progress? The four STwE tests have been statistically modelled onto a common scale. This provides an opportunity to track students’ progress in developing evidence-based reasoning over time, as the graphic below illustrates. (NZCER produces this tool so you might prefer an independent view on whether the tests do what they claim to do.) One limitation is that the tests broadly cover years 7-10. However, a year 4-6 test series is in development and should be available in 2017.

The graphic below displays summary data from the large-scale trial when STwE was first developed. As a general rule of thumb, the higher the scale score (vertical axis) the stronger the student’s evidence-based reasoning skills.

We can see here that our students are making overall gains year on year. Of course this summary pattern tells us nothing about the progress trajectory of individual students but that can be worked out if progress is tracked over time. Now that STwE is on NZCER’s bespoke assessment platform, reports can be generated automatically.

This data was collected from a randomly selected group of trial schools. The overall pattern raises the question of what the apparent ‘progress’ should be attributed to. Is this just showing maturation of students’ thinking over time? How much more progress might they make if this type of evidence-based reasoning were an explicit focus of teaching and learning? Some teachers are using STwE to capture that shift, as we intended when we designed a set of four progressively more challenging assessments modelled onto one common scale.

The STwE resource shows us that it is possible to develop assessment tools that track progress over time. It also shows that, within the constraints of the assessment format, it is possible to focus on evidence-based reasoning for citizenship, in contexts with clear links to science. However, it is neither quick nor cheap to develop a tool like this. And it is still not linked explicitly to the curriculum levels of NZC. We had no defensible basis to do that using the development process we followed, and with a main focus on the rather generic achievement objectives in the NOS sub-strands. So that’s my challenge for next week, when I introduce some assessment processes used in the National Monitoring Study of Student Achievement (NMSSA).