The black box of tertiary assessment
An impending revolution in the use of formative assessment lies ahead of the nation's tertiary institutions, writes Professor John Hattie.

John Hattie is Professor of Education at the University of Auckland.
This paper is drawn from his keynote chapter in Tertiary Assessment & Higher Education Student Outcomes: Policy, Practice & Research. This 300-page book is based on papers presented at the Symposium on Tertiary Assessment and Higher Education Student Outcomes in November 2008. There is also a summary document of the book, available for free download.
The black box of tertiary assessment: An impending revolution
The revolution of using assessment formatively as an integral part of teaching and learning is about to enter the halls of tertiary organisations. This assessment revolution has already swept through our primary and secondary schools: it arrived through the NCEA and its standards-based approach; it has led to an emphasis on reporting more than on scoring and to greater use of peer collaborative assessment, learning intentions and success criteria; and it has highlighted the power of feedback.
This revolution will place more emphasis on the role feedback plays in higher education assessment, particularly the feedback that assessments give the instructor about what they taught well, where the gaps and strengths lie, and whether their learning intentions were realized. This requires a change of mindset, from thinking that assessments are for and about the students to seeing assessment as a way to enhance teachers and teaching (and thence the outcomes for students). The revolution will involve computerized essay scoring, computerized peer critique, and the use of in-class feedback (e.g. clickers). Such computerized interactive tasks will free lecturers to spend more time and energy seeking feedback for themselves, and to devote more time to the arts of teaching, learning and giving feedback to students. The black box of tertiary assessment is about to be re-opened, re-engineered, and put back together in a totally different manner.
Over the next decade we will witness the greatest revolution in the role of assessment in tertiary education. Assessment will move from a device to sum up what we think students need to know to a source of feedback into the teaching and learning cycle; it will involve more than surface knowledge and also invoke deeper knowledge and understanding; it will involve peer assessment and computerized scoring; it will involve aspects of Second Life and interactivity; and it will see more use of computerized adaptive testing. The quality of these assessments will be set higher, their qualities will be more public, and students will be the major beneficiaries. The revolution will encompass “feedback from assessment” and the development of visible learning and visible teaching.
To illustrate this revolution, we can draw upon the example of Alexander Graham Bell, who invented the telephone from a tool he made for teaching the deaf. If he had seen the telephone in 1990 he would still have recognized it. If he saw it now, in 2008, he would not immediately see the connection between what he invented and Skype, the internet, faxing, texting, virtual marts, iTunes and so on. There would be nothing familiar to Bell. A similar punctuated equilibrium is close in tertiary assessment. Nineteenth-century professors would see many similarities between their assessments and today’s university assessment, and this is about to change radically.
Assessment for learning/Feedback from assessment
Of all the factors that make a difference to student outcomes, the power of feedback is paramount in any list. Interestingly, the power of feedback lies less in the information provided to students about their progress (although this does matter) than in the feedback received by the lecturer about their impact on the students. When lecturers seek, or at least are open to, feedback from students as to what students know, what they understand, where they make errors, when they have misconceptions, and when they are not engaged, then teaching and learning can be synchronized and powerful. Feedback to teachers helps make learning visible. The major feedback questions are “Where am I going?” (learning intentions, goals, success criteria), “How am I going?” (self-assessment and self-evaluation), and “Where to next?” (progression, new goals). An ideal learning environment or experience is one in which both teachers and students seek answers to each of these questions, preferably together.
Technology-induced revolutions in formative assessment
There is still an over-reliance on written work (essays, lab reports) in higher education. Most of these tasks are set with little information for students as to what success looks like, and we know from decades of research that essays are excellent for measuring organization, style and language but less successful for assessing content or higher-order thinking.
A most exciting development is the newer automated essay scoring programs, which do highlight content and understanding far more successfully than human scorers. The feedback they give on essays is also often superior: it is faster and more detailed, and the scoring is much more reliable, consistent and valid. When we introduce these methods into higher education we can restructure the academic’s job from summative evaluator to a more formative role. Further, students may be encouraged to score their own essays, and from that feedback they may learn more (and faster) than they do after the interminable wait between writing an essay and receiving the instructor’s feedback.
Measuring the multiple outcomes
Both society and students are demanding more from higher education: achieving competence, moving from autonomy to independence, establishing identity, purpose and integrity, and developing mature interpersonal relations. While we spend most of our time assessing the first, our funding, our public success and our continued appeal as a place for people to come to higher education depend just as much on the others. There have been attempts to assess such success, but perhaps the most important development on our horizon is the OECD’s “Assessing Higher Education Learning Outcomes” (AHELO) project, which explores the possibilities for developing comparative quantitative measures of graduate learning outcomes across universities and nations. AHELO aims to assess four strands: generic skills (e.g., critical thinking, problem-solving, generation of knowledge); discipline-specific skills (engineering and economics are currently being trialled); the ‘value-added’, or contribution of tertiary education institutions to students’ outcomes (with a view to assessing the quality of the teaching services); and contextual indicators (equipment, career orientation, satisfaction). Given the university sector’s desire to be internationally recognized, the potential power of AHELO is high, especially when added to the many current world rankings that depend mostly on research outcomes. We need to develop our own robust assessments of the effects of teaching in our tertiary system on these outcomes. The forthcoming international comparisons are most likely to drive the system to attend to these issues as core business, and thence to invest in enhancing our teaching and assessments in higher education.
What works best?
We need to move away from the question “What works?” and instead ask “What works best?” In the search for what works best I have synthesized over 800 meta-analyses, covering more than 50,000 studies, about 150,000 effect sizes and more than 240 million students, from early childhood through adult education (Hattie, 2009). As can be imagined, these effects cover most subject disciplines, all ages, and a myriad of comparisons. Almost everything works: more than 95 percent of all effect sizes in education are positive. When teachers claim that they are having a positive effect on achievement, this is a trivial claim. One only needs a pulse to improve achievement. Rather than setting the bar at zero we should set it at least at the average of all effects, which means that about half of what we currently do in teaching needs to be changed or discarded.
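The difference between a bar set at zero and a bar set at the average effect can be made concrete with a small calculation. The sketch below is illustrative only: the practice names and effect sizes are hypothetical and are not taken from the synthesis, and the helper `cohens_d` uses the common pooled-standard-deviation formula for a standardized mean difference, which is an assumption about how an individual effect size might be computed.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled


# Hypothetical effect sizes for four teaching practices (illustrative values only).
effects = {
    "practice_a": 0.10,
    "practice_b": 0.35,
    "practice_c": 0.60,
    "practice_d": 0.75,
}

# Bar at zero: every practice with a positive effect "works".
above_zero = [name for name, d in effects.items() if d > 0]

# Bar at the average of all effects: only the practices that beat the mean remain.
bar = mean(effects.values())
above_bar = [name for name, d in effects.items() if d > bar]

print(f"works (d > 0): {above_zero}")
print(f"average effect: {bar:.2f}")
print(f"works best (d > average): {above_bar}")
```

With these made-up numbers all four practices clear the zero bar, but only two clear the average, which is the sense in which roughly half of current practice would fall below the higher bar.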
Among the most powerful effects are those that come from lecturers seeking formative evaluation about their impact on learning; being clear in their instruction about what success looks like (e.g., providing worked examples); listening to student questions and engaging students in the learning process; seeking and listening to feedback about the effects of their teaching, and providing feedback to students about their closeness to the success criteria; providing multiple opportunities to learn; constructing assessments that value both surface and deeper understandings; teaching study skills in the discipline area (deeper study skills are only learnt in the content area, whereas surface skills can be learnt without content knowledge); promoting peer cooperative learning and teaching; setting clear benchmarks through the course of what success looks like; and constructing assessments that let students evaluate their mastery during as well as at the end of the course.
Originally published in the New Zealand Education Review, Vol 14 No.44, November 13th 2009
