Demonstrating their science capabilities: (how) are students making progress?

7 October 2016

This is the seventh in a series of posts about making progress in science. If you haven’t been following the series, it might help to know that I’m responding to a dilemma many science leaders in New Zealand are facing. At both primary and secondary levels teachers are being put under pressure to report their students’ progress against curriculum levels. Middle leaders are then required to report this progress to their senior leaders or the school Board of Trustees.

I’ve been running a dual agenda as I respond to this challenge. On the one hand I’ve tried to demonstrate why I think this is an unreasonable requirement. We simply don’t have the assessment tools and reporting processes that would allow teachers to meaningfully assess and report specifically against the levels of NZC. On the other hand, it is possible to glean some really interesting insights from the way progress is modelled in the assessment tools that are around – and in some cases from careful, detailed research about the nature of progress (e.g. in developing argumentation capabilities).

In my previous post I briefly raised the challenge of more participatory assessments – i.e. those that allow students to show what they can do with their learning. I also pointed out that it is not possible to create a single list of prescribed content to be ‘covered’ at any one curriculum level because NZC requires teachers to weave the NOS and content strands together. The aim is to design meaningful learning for their students in contexts that have meaning and challenge for them in their own lives. The capabilities resources on TKI were designed to help teachers achieve this weaving task with ‘citizenship’ purposes for learning science in mind. But can they tell us anything meaningful about what making progress in this sort of participatory learning might look like?

The short and not very helpful answer to this question is ‘no’ – at least if we value evidence-centred assessment design. The box below shows how I summarised this way of creating assessments in my SCICON talk. (If you follow the hyperlink you’ll see that this summary glosses over considerable complexities in the full process):

An evidence-centred assessment architecture

Select and develop tasks based on construct
Present task to learners
Learners generate evidence with respect to constructs
Evidence used to make inferences about construct of interest

Robert Mislevy was a key developer of evidence-centred assessment. I gave him the last word in the first post in the series. Writing for the Gordon Commission on Assessment in the USA he cautioned that this sort of work takes time and deep expertise, and can’t be done on the cheap.

All the candidates for progression introduced in the series so far have been based on students’ actual responses to assessment tasks – i.e. they were based on evidence. In every case a framework was developed to model relationships between the various pieces of the ‘construct’ in question before the assessment tasks were designed. We don’t yet have a substantive body of achievement evidence for the capabilities. And we are only just beginning to understand how the five capabilities interact with each other in a ‘whole’ task performance. The idea is too new and gaining these sorts of insights will take time (assuming that we even ask the right questions and gather the relevant evidence in a systematic, robust manner).

The capabilities resource developers responded to this challenge by pointing out that the design of assessment tasks can make for more or less demanding demonstrations of learning. For example, on this page they summarise features that make some learning contexts and tasks more demanding than others when students are gathering and interpreting data. The slide below summarises some detail I pulled out from this page. The developers compared tasks at the two ends of a L1/2-L5 continuum to create this task design guidance.

This is a snapshot of detail from information that contrasts features of tasks that allow for more straightforward demonstrations of capability and those that are more demanding.

The goal post clip art is from Shutterstock and the increasing size is meant to signal that the goal is getting bigger and harder.

In 2015 my colleague Ally Bull carried out a small exploratory research project to investigate students’ demonstrations of their science capabilities in a small number of specific learning/assessment tasks. The full report is here.

Rather than trying to develop detailed and specific scales of the sort I’ve outlined for PISA, TIMSS and NMSSA, Ally created a small set of descriptors of the sorts of capabilities students might demonstrate at L1/2, L3/4 and L5 of NZC. In the slide below I have pulled out one interesting set of ideas that has a distinct participatory spirit.

Notice that this set of criteria focuses on how students contribute to classroom discussions, when the focus is on exploring and improving ideas. Rather than assessing individuals in isolation from their peers, this type of judgement requires interaction and collaboration.

Notice also the references to the quality of open-mindedness. This is arguably really important for critical and active citizenship and it is unlikely to be captured in traditional assessments.

Making judgements in relation to a set of criteria like this can’t be done using a single, discrete, moment-in-time assessment task. Evidence would need to be gathered over time, perhaps in an e-portfolio. Students themselves could help choose, and could comment on, such evidence.

The process I’ve briefly sketched here fits much more closely with assessment for learning than with summative assessment. This was precisely the emphasis recommended by the UK Commission on Assessment Without Levels that I introduced in the first post in this series. I said then that you could consider using this report to push back against pressure to report against curriculum levels when we don’t (yet) have the tools to do so in science. I’m going to give them the last word:

"Levels were never designed to capture formative assessment, but they frequently came to be used this way, which often distorted the purpose of formative assessment and squeezed out certain valuable tasks which were not amenable to levelling."

Commission on Assessment Without Levels, UK, 2015