Potential of assessment resource banks as sources of information on student performance and for curriculum evaluation
Paper presented at the 26th International Association for Educational Assessment (IAEA) annual conference, Jerusalem, 14-19 May 2000.
Chief Research Officer
New Zealand Council for Educational Research
Assessment Resource Banks (ARBs) in mathematics, science and English have been developed at the New Zealand Council for Educational Research, mainly for school-based assessment purposes. The ARBs became available to New Zealand schools via the Internet in March 1997. Since that time the number of available assessment resources has increased and the style has broadened to include many performance assessment tasks. All resources in the ARBs are linked to curriculum statements and there is a search engine which enables teachers to retrieve resources that match their teaching.
In recent times the ARBs have been seen as a possible source of the type of information gained in some countries by national testing. This view of the ARBs was prompted by a proposal in 1998 to introduce national cohort testing in New Zealand primary schools. Using the ARBs as a source of national data becomes feasible as the banks expand and more data become available on students’ achievement. This paper considers the potential of ARBs to provide information on student performance in New Zealand and to contribute a national pool of data for curriculum evaluation purposes.
1. The assessment resource banks in mathematics, science, and English
The Assessment Resource Banks (ARBs) are collections of assessment resources located on the Internet. They are organised to match the structure and terminology of New Zealand curriculum statements in mathematics, science, and English, for levels 2 to 6. The ARBs have been designed and developed at the New Zealand Council for Educational Research (NZCER) under contract to the New Zealand Ministry of Education. Mathematics and science resources became available in relatively small numbers in March and May 1997 respectively, with English being added in September 1998. As at April 2000, there are 885 mathematics resources, 920 science resources, and 245 English resources available.
The ARBs contain a broad selection of assessment material, most of which emphasises the need for students to write answers and complete or construct graphs, tables, and diagrams. Practical resources in science and mathematics are also included. This is a departure from assessment materials published previously in item banks, where multiple-choice items (which were easy and quick to mark, and from which performance data could be readily calculated) were used almost exclusively.
The ARBs are not developed as alternatives to a school’s own assessment procedures but as a source of complementary material. They provide additional resources to help teachers and schools judge the relative performance of their students against the "typical" performance of national samples of students at given year levels. A strength is that the assessment materials prepared for publication in the ARBs are trialled co-operatively by assessment specialists and teachers, then chosen by individual teachers to represent their own teaching objectives. This development process assures acceptable levels of curriculum validity and test reliability. The process also provides classroom validity, given that teachers are free to select assessment resources that best match their curriculum objectives and teaching programmes.
To access ARB resources an Internet connection is necessary, along with a recent version of an Explorer or Netscape browser, and an attached printer. All resources are presented in a format that may be printed and photocopied. It is also possible to cut and paste resources electronically, and save ARB files to a word processing package so that the assessment tasks may be adapted to individual needs.
pp. 3–11 are adapted from Croft (1999) and Croft (December, 1999).
Access to the banks begins at the NZCER homepage at http://www.nzcer.org.nz, which leads to the ARB homepage. To use the banks it is necessary to hold a username and password, available from the on-line registration form on the ARB homepage. Staff from registered schools, registered teachers, and staff from teacher support services, the Ministry of Education, and tertiary institutions are eligible for a password. Access is available by arrangement to assessment staff internationally. The ARBs are not designed to be used by students or parents.
2. Development of the ARBs
The ARB project began in February 1993, when NZCER was contracted by the Ministry of Education to undertake a feasibility study. The second stage of the project was to implement the ARBs on a trial basis in 27 schools. The third stage was to improve the search engine developed during stage II, extend the number of mathematics and science resources in the banks, and make these available on the World Wide Web. The fourth stage, from July 1997 to June 1999, was to increase the number and range of mathematics and science resources, continue to develop the search strategy, and introduce English assessment resources. The present stage, which runs until July 2001, is to expand and broaden the ARBs.
Throughout, the ARBs have been developed within the philosophy of the New Zealand Curriculum Framework (Ministry of Education, 1993) and the assessment strategies outlined in Assessment: Policy to Practice (Ministry of Education, 1994).
Stage I: Feasibility study 1993–94
The focus of the feasibility study was to prepare and trial examples of assessment resources suitable for large-scale and school-based assessment in mathematics and science for curriculum levels 3 to 6. During the initial stages the ARBs were seen to have potential national and school-based uses. An investigation of some allied measurement issues, plus identification of features of a model for computer storage and retrieval, were other aspects of the contract.
The organisation of this first stage fell into four distinct components:
- organisation of the banks,
- content of the banks,
- functions and uses of the banks, and
- reporting performance.
Although the general feasibility of ARBs in mathematics and science was established during stage I, no "banking" of resources or operation of the banks in schools was carried out.
Stage I has been overviewed and discussed fully in Reid, Armstrong, Atmore, Boyd, Croft, and Livingstone (1994).
Stage II: Implementation trial 1994–95
The implementation trial followed five months after the feasibility study. During this period decisions were taken to continue with this work. In common with the feasibility study, the implementation trial concentrated on mathematics and science at levels 3 to 6, with a focus on years 7 and 9—two major transition points within the New Zealand education system. From the outset, the ARBs had been a response to a government policy of the day known as "transition point assessment".
The central features of the implementation trial were that the ARBs were to be introduced into 27 schools for school-based assessment and to simulate national assessment on a regional basis. For school-based purposes, the resources were to be stored, retrieved, and made available electronically to school sites via the Internet and also computer disk. By way of comparison, hard copy material was to be utilised in some schools in combination with the electronic material. Professional development for teachers in assessment was to be incorporated while the ARBs were being implemented in trial schools.
The national assessment regional simulation was to compare three different styles of test. This trial is described and discussed later. This second stage was set within a research design that would enable subsequent policy to be based on sound evidence.
During late 1994 the focus was on preparing and reviewing assessment material for inclusion in the ARBs. For early 1995 the focus was to conceptualise and plan computer software to classify, store and deliver the assessment material to schools. A database was created, and the materials were loaded to it. At that early stage of the ARBs, this proved to be a slow and complex task, particularly in respect of the graphic content of the resources.
Stage II has been covered fully in Croft, Gilbert, Boyd, Burgon, Dunn, Burgess, and Reid (1996).
Stage III: Introducing ARBs to schools 1996–97
The focus of this work, beginning in April 1996, was to continue developing mathematics and science assessment resources and make these available to individual schools via the Internet. This stage was guided by results and feedback from the stage II implementation trial. In particular, the search engine was modified to allow a more flexible approach to identifying and retrieving assessment resources. The style of assessment resources within the banks was to broaden, with a conscious decision to increase the number of resources needing a constructed-response form of answer.
As a short-term measure some ARB materials were made available in hard copy. This was shown by the implementation trial to be a poor substitute for the electronic versions, largely because of the loss of the searching and retrieval power of the search engine and the reduced flexibility for updating the resources or introducing new material.
How well New Zealand schools were placed to access the ARBs, given constraints on hardware and issues of training and support for teachers in information technology (IT), was an important consideration too. These factors remain crucial to school-based access to the banks today.
The introduction of the banks to schools at stage III has been described and discussed more fully in an earlier IAEA paper (Croft, 1996) and Croft, Boyd, Dunn, and Neill (1996).
Stage IV: Developing additional resources 1997–99
During early 1997, work on exploring the feasibility of expanding the ARBs to incorporate English assessment resources began in earnest. A general conclusion from a survey about this, as summarised from Brown and Strafford (1997), was that with modifications to the search engine and allowing for the particular nature of assessment of English, the system could be adapted to incorporate this third learning area. Accordingly, writing and development of assessment resources in English proceeded during 1997 and 1998, with English coming on-line in September 1998.
Much of this work continued to be guided by results obtained during the 1995–96 implementation trial and from feedback from many participating schools during 1997–98. In particular, the search strategy was modified and the style of assessment resources broadened further to include practical tasks, which require responses beyond paper-and-pencil. Work on this type of resource was undertaken in science first (particularly the curriculum strands of material world and physical world) with practical resources in the strands of geometry and measurement following.
Stage IV is overviewed more fully in a subsequent IAEA paper (Croft, 1996, May).
Stage V: Expanding and broadening resources in mathematics, science and English 1999–2001
For mathematics, a new focus during this period was to isolate diagnostic information from the trial material and develop a standard template for reporting this information.
More recently, writers of the science resources have begun to review students’ correct and incorrect responses to particular groups of resources, in order to uncover patterns of answers that may give insights about students’ achievement and mastery of certain curriculum areas.
3. The structure of the computerised ARBs
The structure and organisation of the ARBs mirrors the layout and terminology of New Zealand curriculum statements in mathematics, science and English.
The search strategy
At the "heart" of the ARBs is the search strategy that allows users to select assessment resources that are most appropriate to their current teaching. Previous experience with item banks (the forerunner of resource banks) has shown that the quality of the classification system determines the usefulness of the banks for teachers. Millman and Arter (1984) had noted earlier, "Classification is the key that unlocks the item bank." Unless a bank’s contents can be retrieved quickly and precisely, it will seldom be used. This is exactly where the ARBs’ search strategy comes into its own, as it allows a specific search with particular assessment purposes clearly in mind.
As the structure of the banks and the associated search strategy reflect New Zealand curriculum statements, resources may be found by searching under the following categories: strand, achievement objective or function, process or integrating skill, and curriculum level. In addition, to help users find the resources they require, two extra categories have been added: resource type and keywords. Details of these six categories follow.
Strand
In mathematics, the strands are number, measurement, geometry, algebra, and statistics. In science they are living world, material world, physical world, and planet earth and beyond. In English they are written language, visual language, and oral language.
Achievement objective or function
Each learning strand in mathematics and science has a number of associated achievement objectives, sometimes referred to as sub-strands. In some instances the achievement objectives are combined. Each objective or combination of objectives from mathematics and science is represented in the ARBs. The functions in the English curriculum are also represented in the ARBs.
Curriculum level
Resources presently cover curriculum levels 2 to 6, with most coverage in levels 3 to 5. Typically the ARBs are used in classes covering Years 4–10.
Process or integrating skill
Most mathematics resources are classified by a process skill—problem solving, developing logic and reasoning, or communicating mathematical ideas. The science resources are classified by an integrating strand—making sense of the nature of science and its relationship to technology, focusing and planning, information gathering, processing and interpreting, or reporting. English resources are increasingly classified by process—exploring language, processing information, or thinking critically. In view of the relatively undefined nature of process skills in the curriculum statements, the classification of some individual assessment resources by these fields may be open to debate. Resources that do not represent a predominant process skill are not classified.
Resource type
In order to help teachers locate appropriate assessment types the resources are classified by the style of response expected from students. The following four types are used in mathematics and science resources:
- Selected response (SR). The response is selected from a range of options incorporated in the resource. Two or three multiple-choice or matching items may be grouped to form one resource. Examples include:
  - multiple-choice items
  - matching items
  - true/false and other alternate-choice items.
- Brief constructed response (BCR). The student constructs the response. Short answers, such as a word or two, a number or two, a phrase, or brief sentence are the essence of a BCR. Correct brief responses will encapsulate a single main idea. Completing entries in tables, graphs, or diagrams constitutes a BCR. Examples include:
  - short-answer questions
  - enhanced multiple-choice items
  - completion items for tables, graphs, diagrams, plans, illustrations, etc.
  - problem-solving tasks requiring brief, generally structured responses.
- Longer constructed response (LCR). These have the same general characteristics as a BCR but require a more extended response. The LCR resource is generally more open-ended than the BCR, and inferences may be needed to determine relationships within the task. Producing tables, graphs, or diagrams constitutes an LCR. Examples include:
  - short essay-type question, structured or unstructured
  - preparing a written plan for an experiment, investigation, or practical task
  - conceptualising and/or producing tables, graphs, diagrams, plans, or geometric figures
  - interpreting in a broader sense diagrams, illustrations, graphs, or tables.
- Practical (PRA). These are based around a performance component involving responses including, but additional to, paper-and-pencil. An investigation may be undertaken, data may be analysed, conclusions drawn, or a product may be completed. Examples include:
  - carrying out simple investigations or experiments in science or mathematics
  - classifying tangible materials in science
  - undertaking measurement tasks in mathematics
  - constructing shapes or figures in mathematics.
In English there are six resource types:
- Selected response (SR).
- Short written response (SWR).
- Longer written response (LWR).
The main features of students’ responses to these three types of English resources are similar to those outlined for SR, BCR, and LCR resources in mathematics and science.
- Oral response (OR). The predominant response is oral, although a minor written component may be included. There is no provision to describe the nature of the resource or the type of oral response required. English resources in this category will come mostly from the oral strand.
- Student rating or assessment (SRA). The essence of these resources is that a rating or assessment is undertaken by students. This category makes provision for student self-assessment or peer assessment by way of rating scale, observation scale, or checklist. Resources of this type will be found predominantly, but not exclusively, in the written strand, and will focus on expressive skills in English.
- Teacher rating or observation (TRO). Resources of this type are to be included to assist teachers’ assessments of expressive skills, mostly in the written and oral strands. Multi-level marking guides come within this category, although some multi-level material will be included for LWR resources as well.
Keywords
Each resource has keywords or phrases designed to describe further the content and predominant skills tapped by the resource. Wherever possible the keywords are directly associated with New Zealand curriculum statements, but, because of variations in terminology for similar concepts, some alternative terms are required.
There is an on-line dictionary of keywords used to construct this type of search. As resources are added to the ARBs, dictionaries are expanded to include new keywords. This means that there is at least one resource in the bank for each entry in the dictionary. The keyword search is a very powerful aspect of the search engine. It is popular with users as it allows a search of the banks to be undertaken by topic.
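The property that every dictionary entry leads to at least one resource follows naturally if the dictionary is derived from the resources themselves. A minimal Python sketch of that idea is given here. The resource identifiers, keywords, and function name are hypothetical, since the ARBs' actual storage format is not described in this paper.

```python
from collections import defaultdict

# Hypothetical resource records: identifier -> keywords attached to that resource.
resources = {
    "MA0001": ["fractions", "number lines"],
    "SC0002": ["electric circuits"],
    "SC0003": ["electric circuits", "fair testing"],
}


def build_keyword_dictionary(resources):
    """Map each keyword to the sorted list of resource ids that carry it."""
    index = defaultdict(set)
    for resource_id, keywords in resources.items():
        for keyword in keywords:
            index[keyword.lower()].add(resource_id)
    return {keyword: sorted(ids) for keyword, ids in index.items()}


keyword_dictionary = build_keyword_dictionary(resources)
```

Because entries are created only when a resource carries the keyword, no entry can be empty, and adding a resource with a new keyword extends the dictionary automatically.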
Searching by Strand, Objective, Level, Process Skill, and Resource Type
A search may be undertaken by any single classification field or combination(s) of the six classification fields. Each classification field (except for keyword) is displayed on screen. Either all, some, or none of the classification fields may be specified. If a user does not wish to specify a particular field, the on-screen box is left as displayed (for example, All Process Types) and a broad general search will ensue. The results of the search may be viewed or printed at any point.
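The search behaviour described above can be pictured as a filter in which any field left unspecified matches every resource. The following Python sketch is illustrative only: the `Resource` record, the `search` function, and the sample entries (including the objective names) are assumptions made for this example, not the ARB software itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Resource:
    resource_id: str
    strand: str
    objective: str
    level: int
    process: Optional[str]      # None when no predominant process skill is classified
    resource_type: str          # for example "SR", "BCR", "LCR", or "PRA"
    keywords: Tuple[str, ...]


def search(bank, *, strand=None, objective=None, level=None,
           process=None, resource_type=None, keyword=None):
    """Return resources matching every specified field; None stands for 'All'."""
    exact = {"strand": strand, "objective": objective, "level": level,
             "process": process, "resource_type": resource_type}
    wanted = {name: value for name, value in exact.items() if value is not None}
    matches = []
    for resource in bank:
        if any(getattr(resource, name) != value for name, value in wanted.items()):
            continue
        if keyword is not None and keyword not in resource.keywords:
            continue
        matches.append(resource)
    return matches


# Hypothetical bank contents.
bank = [
    Resource("MA0001", "number", "fractions and decimals", 3, "problem solving", "BCR", ("fractions",)),
    Resource("MA0002", "geometry", "shape and space", 4, None, "PRA", ("nets",)),
    Resource("SC0001", "physical world", "electricity", 4, "reporting", "SR", ("electric circuits",)),
]

everything = search(bank)                                  # no field specified: broad search
practical_level_4 = search(bank, level=4, resource_type="PRA")
```

Calling `search(bank)` with no arguments corresponds to leaving every on-screen box at its "All" setting, while each extra argument narrows the result.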
Other special features of the information available on screen include:
- background information on the project;
- suggestions for school-based uses of ARB resources;
- instructions on how to search for resources;
- guidance on bookmarking, downloading, and printing resources;
- directions for cutting and pasting, and downloading resources to word processors;
- a registration page to apply for passwords;
- an electronic feedback form for users to complete;
- frequently asked questions; and
- "navigation buttons" and links to various parts of the search screens.
Further description of the ARBs and examples of resources are contained in Croft (1999).
4. Expanding the ARBs
The overall number of resources available in the ARBs is projected to increase to beyond 2650 in the next 12 months, as the following processes continue:
- identification of curriculum objectives for assessment within the ARBs;
- refinements to the search strategy and retrieval processes;
- writing and development of resources;
- trialling and revision of resources;
- calculation of performance data and identification of diagnostic information;
- editing and publication of resources;
- incorporation of selected published resources from, for example, the Third International Mathematics and Science Study (TIMSS) and the National Educational Monitoring Project (NEMP) into the ARBs.
As the banks expand and offer more extensive coverage of curriculum statements, alternative national uses of the banks become more feasible. Some alternative uses are discussed later.
In addition, a database of student results is being established on the basis of data from ARB trials. This will provide national information on achievement within most strands of mathematics, science and English. Some initial analysis of performance in some science objectives has been undertaken already by Marston and Croft (1999).
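As an illustration of how trial results might be summarised, the sketch below computes the percentage of students answering each resource correctly at each year level. The record layout, identifiers, and sample values are hypothetical; this paper does not specify how the database is structured.

```python
from collections import defaultdict

# Hypothetical trial records: (resource id, year level, 1 if correct else 0).
trial_records = [
    ("SC0001", 7, 1), ("SC0001", 7, 0), ("SC0001", 7, 1), ("SC0001", 7, 1),
    ("SC0001", 9, 1), ("SC0001", 9, 1),
]


def percent_correct(records):
    """Return {(resource id, year): percentage of correct responses}."""
    tallies = defaultdict(lambda: [0, 0])   # key -> [number correct, number attempted]
    for resource_id, year, score in records:
        tally = tallies[(resource_id, year)]
        tally[0] += score
        tally[1] += 1
    return {key: 100.0 * correct / attempted
            for key, (correct, attempted) in tallies.items()}


performance = percent_correct(trial_records)
```

With these invented records, three of the four Year 7 responses are correct, so the Year 7 figure for the resource is 75 percent.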
5. National uses of the ARBs
From the outset, national and school-based uses of the ARBs have been dual elements of the project. Indeed, history shows that the possibilities for using the ARBs on a national basis at the ‘transition points’ of Year 6/7 and Year 8/9 provided the main impetus for initial funding. Only latterly have the school-based uses of the ARBs been seen as their major function at a policy and funding level.
A trial national test
A simulated national testing trial using ARB resources was undertaken in October 1995. The exercise was a one-hour, whole cohort test of students in Years 7 and 9, in the 27 schools involved in the implementation trial. The test items were selected and compiled in one of three ways: Total NZCER Choice (schools had no choice in the test content and NZCER, simulating the role of a central agency, made the choice of content); Total School Choice (schools had total choice, selecting items and tasks from those in the ARBs, or nominating the strands and/or achievement objectives to make up a customised test); and 50:50 Choice (half the content was selected by NZCER, this portion being identical for all schools in this option, and half chosen by the schools from a range provided by NZCER). Total NZCER Choice and 50:50 Choice consisted exclusively of selected-response items.
Once the results of this national test were returned to schools, teachers were asked to complete a questionnaire. We were interested to find out whether changes in attitude had arisen since teachers had had some actual involvement with some national testing aspects of a working bank, as compared with the feasibility study in 1993, where teachers responded by questionnaire to a descriptive model of the resource banks.
The main findings presented in the report of the implementation trial, Croft et al. (1996), may be summarised as follows:
- In comparison with responses in 1993, there was weakening support for any results to be reported to, and/or used by the Ministry, or anyone outside of the teacher’s classroom. This may have reflected a general lack of support for national testing at that time.
- In comparison with responses in 1993, there was significantly increased support for the Ministry to use the ARB information to improve the allocation of resources nationally, and significantly less support for the Ministry to use ARB information to make public statements about national strengths and weaknesses of achievement.
- The results which were provided to teachers in the total NZCER choice condition received greater support than the others. As the students’ answers were computer marked and analysed in this condition, much more detailed and student-specific data were obtained. This more detailed information appears to be of more interest and use to the teachers than just class lists with a handful of summary statistics, such as means and standard deviations.
- Overall, teachers in the total NZCER choice condition appeared to be less satisfied with the whole concept of national testing. This could be attributed to their lack of control over any aspect of the testing process. That is, teachers in the other two conditions had at least partial input into item/task/objective selection, and therefore had some control over the test content. As a consequence of this, teachers may have helped construct a test which was seen as more relevant to their current classroom programme.
There is a paradox in these last two points which does seem irreconcilable.
National uses of the banks were included in a survey about the feasibility of expanding the ARBs to incorporate English by Brown and Strafford (1997). They noted as follows:
. . . a national test created out of ARB resources is feasible, but would be generally perceived as undesirable by the educational community. . . .
School-based uses, on the other hand, are highly valued . . . Resources which can guide teacher planning, diagnose student learning needs, sum up achievement on units of work, moderate teacher judgement by exemplifying widely accepted standards of performance . . . are wanted across the whole gamut of English teachers.
Outcome of the trial
Of the approaches to national testing trialled, the Total NZCER Choice option was considered the only feasible approach. The other two options did not seem viable because of the lack of common information provided, the lead time required to prepare customised tests, the time needed to mark and report them, and costs associated.
However, there are advantages and disadvantages of this model to consider, as outlined in Croft et al. (1996). The advantages are:
- It is the cheapest option;
- All test data obtained are identical, allowing for various kinds of score reporting;
- By careful choice of common items between years and over time, growth and change could be measured;
- Quality, discriminating items would ensure reliability in excess of 0.90;
- Administration of a selected-response type test is familiar to teachers and students and is less influenced by students’ writing skills;
- Machine-scoring allows for prompt reporting and makes sophisticated analysis viable.
The disadvantages are:
- Tests made up exclusively of selected-response items may not be a valid measure of some achievement objectives and learning strands, or in keeping with the NZ curriculum framework;
- Lowered levels of validity associated with selected-response items reduce some of the benefits of higher levels of reliability;
- Lowered levels of validity decrease the utility of the data and interpretation;
- If these tests become "high stakes" they may lead to narrowing classroom emphasis on curriculum areas amenable to selected-response testing;
- Diagnostic information would be limited;
- A single 50-item test cannot sample adequately the outcomes from the five mathematics strands, or six science strands;
- Using only selected-response items as a source of national data is not making optimum use of the resource banks.
The single issue of reduced validity of a selected-response test to assess current curricula is paramount. On balance, the disadvantages outweigh the advantages in terms of this model of national testing. Accordingly, there was no recommendation to proceed with national testing.
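The reliability figure quoted in the list above is not tied to a particular coefficient in this paper. For selected-response items scored right or wrong, one common choice is the Kuder-Richardson formula 20 (KR-20). The sketch below assumes that coefficient and uses invented scores purely to show the calculation; the function name `kr20` is hypothetical.

```python
def kr20(scores):
    """KR-20 reliability for a table of 0/1 item scores (one row per student).

    Assumes at least two items and some variation in total scores.
    """
    n_students = len(scores)
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_students
    # Population variance of total scores, matching the p * (1 - p) item variances below.
    total_variance = sum((t - mean_total) ** 2 for t in totals) / n_students
    item_variance_sum = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in scores) / n_students   # proportion correct on this item
        item_variance_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Invented scores for four students on three items.
scores = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
reliability = kr20(scores)
```

For this tiny invented table the coefficient works out to 0.75. A real 50-item test would be needed to approach the level of reliability cited, and a high coefficient says nothing about the validity concerns raised in the same list.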
Future possibilities for national assessment
The three approaches to national testing investigated during 1995 were found wanting on a number of counts, particularly in terms of validity. Accordingly, three alternative approaches to gaining national information were outlined in Croft et al. (1996). These were:
- Systematic review of performance data from banks. As the ARBs incorporate data on student performance, they are sources of comprehensive national information, as well as providing useful benchmarks on student performance for school-based uses of the banks. Providing that performance data are constantly updated, and as long as an accurate record is maintained of fluctuations in these data, both a cross-sectional and longitudinal picture may be constructed on a national basis, of achievement within the learning strands covered by the assessment material within the ARBs.
One aspect of the operation of the ARBs is to update performance data on a regular basis, so that users have an immediate national benchmark against which to judge the performance of students. Given that the data are there, it is the systematic review of such data that could provide information on national performance.
With strategic planning to identify the areas or combinations of areas to be reviewed, a comprehensive bank has the potential to provide valid, reliable, and robust data on national achievement. However, as indicated, this is dependent on the maintenance of a comprehensive bank and on a representative cross section of schools nationally making use of it. Performance data must incorporate a representative cross section of schools: if non-academic and academic schools had different patterns of selecting material from the banks, and performance data were calculated from selected samples, later review of the data could be confounded. In part, too, this is an issue for the robustness of the types of numerical scales that may be used to report performance. Nevertheless, this approach, or a modification of it, has strong potential to provide comprehensive national data. It illustrates also that some common data may serve school-based functions and national uses equally well.
- Administration by schools of selected materials. Schools could be requested to administer selections of items/tasks to their students. Assuming that the material to be administered is specified by a central authority, there are three main ways this could be undertaken.
- The actual items/tasks could be specified and notified to schools. The material would then be prepared by schools direct from the bank and administered at a set time and date. Scoring could be carried out locally or centrally. Standardised procedures for administration and marking would be required. This would be seen as very close to a national test, but more diverse items and tasks could be included.
- A range of items and tasks could be specified and schools invited to select a given number from within the range and administer these. Given a range of material, scoring may best be carried out locally. This approach could be seen as close to a national test, but the school choice option would give some flexibility. Again, standardised procedures for administration, marking, and moderation/reliability checks would be required.
- Selected strands and/or achievement objectives could be specified and schools asked to report on the attainment of these by selecting items and tasks from within an ARB for their students to undertake. Items and tasks selected could be reported along with student performance on them. By strategic selection of strands and an adaptation of multiple matrix-sampling techniques, coverage of a complete curriculum statement may be inferred.
Each of these approaches is now possible within the coverage of the curriculum achieved at this time.
- Reporting by schools of the levels achieved by students. Banked material could be used by schools for reporting the achievement of students by level and learning strand at a given point in the school year, or over a period of time. These could then be collated nationally, to form an overall picture of performance across the curriculum framework. This approach would provide the most general data nationally, and it appeals as being the one that could be implemented most readily. It would support the importance of the New Zealand Curriculum Framework, and it would illustrate distribution of achievement by level, between and within selected years. If there is an urgent need to collate national data on achievement in mathematics and science, this approach appears to have a number of advantages for a speedy implementation.
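The national collation envisaged in this third approach amounts to tallying reported levels by year and learning strand. A minimal Python sketch follows, assuming each school submits one (year, strand, level) record per student; the record format, function name, and sample values are hypothetical.

```python
from collections import Counter, defaultdict


def collate_levels(reports):
    """Return {(year, strand): {level: percentage of students reported at that level}}."""
    counts = defaultdict(Counter)
    for year, strand, level in reports:
        counts[(year, strand)][level] += 1
    distribution = {}
    for key, level_counts in counts.items():
        total = sum(level_counts.values())
        distribution[key] = {level: 100.0 * n / total
                             for level, n in sorted(level_counts.items())}
    return distribution


# Invented reports pooled from several schools.
reports = [
    (7, "number", 3), (7, "number", 3), (7, "number", 4), (7, "number", 2),
    (9, "number", 4), (9, "number", 5),
]
national = collate_levels(reports)
```

The result shows the distribution of achievement by level within each year, and comparing the entries for different years gives the between-year picture. How comparable such school-reported levels would be across schools is a separate question that this sketch does not address.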
These possible alternatives to national testing need further development, so that the positive and negative aspects of each may be clarified. There is a need, also, to move beyond theory to implementation, as it is not until this stage that problems and technical issues are fully appreciated and workable solutions found.
Another national proposal
The New Zealand Ministry of Education Green Paper, Assessment for Success in Primary Schools (Ministry of Education, 1998) proposed a package of assessment tools to help remedy perceived information gaps. The package included:
- "additional diagnostic tests that will provide teachers with detailed information on the learning needs of individual students in specified areas;
- more national exemplar material to provide examples of criteria for assessing student work and to help teachers to decide whether the judgments they are making about student achievement are consistent with national expectations;
- new externally referenced tests to enable teachers to identify how well their students are achieving compared with national and group levels of achievement; and
- more comprehensive national summary information that will enable Government to identify the achievement of groups of students in order to develop policy and monitor its effectiveness." (p. 20)
The proposal for ‘new externally referenced tests’ was for mandatory national testing at three year levels. Following a public consultation period and further analysis of the issues, decisions were taken by government to undertake a trial of proposed national testing during term two 2000. Following a change of government at the General Election on 27 November 1999, this initiative was abandoned.
However, on 7 April 2000, a new initiative for voluntary testing of numeracy and literacy at Years 5 and 7 was announced. At the time of writing, details were not available.
A more recent suggestion was put forward in 1998, in NZCER’s response to the Green Paper Assessment For Success in Primary Schools (New Zealand Council for Educational Research, 1998). This was to develop and present on the ARB Website a series of "intact tests" which would be complete achievement tests with normative data that schools could use, periodically, if there was a need to report progress or "benchmark" achievement against a wider group of students. By maintaining the choice of tests in schools’ hands, validity is likely to be enhanced and issues of mandatory testing may be settled.
Intact tests and the three possible approaches to national testing suggested after the simulated national testing trial in 1995 are all untried. Clearly, the three approaches mooted at that time are more feasible now, given the growth in the number of resources in the ARBs since then. These approaches have the potential to utilise the range of resources developed for the ARBs, and to address validity issues by better sampling the range of achievement within current New Zealand curriculum statements. It is to be regretted that more work was not undertaken on the possible approaches to national testing identified in 1996, as this may have facilitated resolution of some issues of national testing raised by the 1998 Green Paper.
Tests consisting of selected-response items to be computer-marked are the cheapest option for national testing and make the least demand on teachers’ time. All data obtained from this style of testing are identical in format, allowing for the greatest range of possible reports to be generated. By careful choice of common items between years and over time, growth and change may be measured on a full-cohort basis. Data of this nature could prove valuable for national planning purposes, as in traditional terms these objective tests achieve higher levels of reliability than newer forms of testing. Additionally, administration of objective tests is familiar to teachers and students, and responses to the items are less influenced by students’ writing skills.
A major disadvantage is that a single national test made up of mainly multiple-choice items is unlikely to be a valid measure of many achievement objectives, or be in keeping with major assessment principles of the New Zealand Curriculum Framework (Ministry of Education, 1993). A flow-on effect is that the lowered levels of validity associated with these items may reduce some of the benefits of higher levels of reliability of a traditional selected-response test. Indeed, the lowered levels of validity must reduce the utility of the data for planning and other purposes.
The single issue of reduced validity of a selected-response test to assess current curricula is paramount. Considering the range of constructed-response material being developed for the ARBs, using only selected-response items as a source of national data would not make optimum use of the resource banks.
In considering national testing and the provision of national information, it is essential to be clear about the purposes of national testing, including the nature of the data required and the uses to which these data will be put in both the short term and long term. The potential for this form of assessment to rise above a surface approach to accountability, and make a proper return to the education system for the education dollars to be spent on it, remains a crucial issue.
References
Brown, G., & Strafford, E. (1997, November). The feasibility of expanding the assessment resource banks to incorporate English. Unpublished project report. Wellington: New Zealand Council for Educational Research.
Croft, C. (1996, September). Resource banks in mathematics and science for school-based assessment. Paper presented at 22nd annual International Association for Educational Assessment conference, Beijing.
Croft, C. (1998, May). Computerised assessment resource banks in mathematics and science: A New Zealand initiative. Paper presented at the International Association for Educational Assessment annual conference, Barbados.
Croft, C. (1999). School wide assessment: Using the assessment resource banks. Wellington: New Zealand Council for Educational Research.
Croft, C., Boyd, S., Dunn, K., & Neill, A. (1996, December). Symposium: Resource banks in mathematics and science for school-based assessment. Symposium presented at 18th annual New Zealand Association for Research in Education conference, Nelson.
Croft, C., Gilbert, A., Boyd, S., Burgon, J., Dunn, K., Burgess, L., & Reid, N. (1996, March). Assessment resource banks in mathematics and science – implementation trial. Wellington: New Zealand Council for Educational Research.
Croft, C. (1999, December). Symposium: Resource banks for school-based and national assessment, Paper CRO 99553, Development and structure of the assessment resource banks. Paper presented at the joint Australian Association for Research in Education–New Zealand Association for Research in Education conference, Melbourne.
Marston, C., & Croft, C. (1999). What do students know in science? Analysis of data from the assessment resource banks. set: Research Information for Teachers, 2, item 12.
Millman, J., & Arter, J. (1984). Issues in item banking. Journal of Educational Measurement, 21(4), 315–330.
Ministry of Education. (1993). New Zealand curriculum framework: Te anga marautanga o Aotearoa. Wellington: Ministry of Education, Learning Media.
Ministry of Education. (1994). Assessment: Policy to practice. Wellington: Learning Media.
Ministry of Education. (1998). Assessment for success in primary schools (Green paper). Wellington: Ministry of Education, Learning and Evaluation Policy.
New Zealand Council for Educational Research. (1998). Assessment for success in primary schools: Green paper. A response from the New Zealand Council for Educational Research. Wellington: Author.
Reid, N., Armstrong, L., Atmore, D., Boyd, S., Croft, C., & Livingstone, I. (1994). Assessment resource banks feasibility study: Summary report. Wellington: New Zealand Council for Educational Research.