Sunday, 6 April 2014

Reliability, Validity, and Evidence of Learning the Canadian Way – Classroom Assessment Basics

In Canada, for the past thirty years, provincial documents have treated evidence of student learning collected in classrooms by teachers very differently from evidence of learning collected for external purposes such as school or system-based data collection. For example, in 1989, the Primary Program in British Columbia used the social science research perspective as the model for collecting evidence of student learning. That is, teachers and students together collect products, conversations (records of student thinking), and observations of process. This way of looking at evidence of learning has been embedded in my work (for example, Davies, 1992; Davies, 2012) and in provincial curriculum documents across Canada, such as B.C.’s Primary Program (1989; 1990; 2000; 2010), Manitoba’s assessment policy (2010), Ontario’s Growing Success (2010), and Nova Scotia’s “Back iNSchool Again” guide for teachers (2013). It is based on the work of Lincoln and Guba (1985).

Given this powerful and long-term perspective on evidence of learning at the classroom level across Canada, it was interesting to read the pre-readings connected to this area of classroom assessment, particularly where they address reliability, validity, and evidence of student learning.

Reliability and Validity

The research by ARG (2007) and the findings of researchers in Scotland (2011) and Alberta (Burger et al., 2009) are worth examining, as all of these researchers found that ‘teachers' professional judgment is more reliable and valid than external testing...’ Parkes (2013), from the United States team’s collection of pre-readings, and Maxwell (2009), writing from an Australian perspective, take different perspectives from one another. And yet neither appears to have considered this emerging body of research.

If one considers reliability from a social sciences perspective, then one addresses issues related to reliability – repeatable, replicable – by looking at the evidence of student learning collected from multiple sources over time (Lincoln and Guba, 1985). Maxwell and Cumming (2011), delegates from Australia, come close when they state, “Concerning reliability, continuous assessment would lead to more stable judgments of student achievement (through collection of more extensive information over time and consultative judgements among teachers)” (p. 206).

Evidence of Learning

From a social sciences perspective, collecting evidence of learning is a qualitative task – and a messy one at that – because teachers are, potentially, collecting evidence of everything a student says, does, or creates. As teachers have deconstructed curriculum expectations/outcomes/standards, they have learned to be strategic about what they collect as evidence of student learning. Further, they are also strategic about what students collect as evidence of their own learning in relation to what needs to be learned. This process of triangulation of data (Lincoln and Guba, 1985) supports classroom teachers as they design next instructional steps and later, when they are called to make a summative judgement. Heritage (2013) discusses how teachers generate evidence of student learning, the range of sources of evidence, and the quality of evidence, as well as evidence of learning in the context of learning progressions. This paper reviews the variety of purposes for collecting evidence of learning (informing students’ and teachers’ ‘next steps,’ making learning visible along the way and over time, and helping teachers respond to student learning during the actual learning time itself).

Heritage (2013) notes that Patrick Griffin (2007) argues that humans can only provide evidence of learning “through four observable actions: what they say, write, make, or do” (p. 9). This is the definition of triangulation – everything a student says, does, or creates (B.C. Primary Program draft, 1989; Davies, 2000; 2012). Heritage (2013) goes on to discuss a variety of researchers who try in different ways to do exactly the same thing – that is, account for the vast possibilities of evidence of student learning.

In the end, I think everyone interested in classroom-based evidence of learning will find it more helpful to acknowledge that the ways students show evidence of their learning cannot be contained in definitions; rather, ‘everything a student creates, says, or does’ is potentially evidence of learning.

Theoretical papers discussing reliability, validity, and what counts as evidence of student learning in classrooms need to be revisited given:

1.    Classroom assessment is not a ‘mini’ version of large-scale assessment. Reliability and validity begin to be attended to when teachers plan assessment and instruction with the learning expectations in mind and plan to collect evidence of learning in relation to those learning expectations while attending to triangulation of evidence of student learning (products, conversations, and observations of process).
2.    Moderation isn’t only for large-scale assessment. When professionals are involved in both formal and informal processes of moderation with the purpose of coming to agreement about the quality of student evidence, their professional judgement is more reliable and valid than external tests and measures (ARG, 2007; Burger et al., 2009; Hutchison, 2011).
3.    Evidence of learning is messy. The collection of student evidence of learning from multiple sources including products, conversations, and observations – triangulation (Davies, 2012) – not only prepares teachers to design instruction minute-by-minute but also provides the evidence of student learning needed to support summative judgements about student learning in relation to curriculum expectations/outcomes/standards for reporting purposes.

As I reflect upon the definition of triangulation of evidence of student learning – collecting products, conversations over time, and observations of process – embedded in numerous curriculum and assessment documents across Canada, I think the way Parkes (2013) and Heritage (2013) consider reliability, validity, and evidence of student learning is not helpful in our Canadian context. 

When one considers the sheer number of Canadian classrooms and jurisdictions where teachers are expected to exercise their professional judgement for both formative and summative purposes, it is obvious that Parkes' (2013) and Heritage's (2013) research summaries reflect Canadian education’s past, not our present or our future.


BC Ministry of Education. (1989; 1990; 2000; 2010). The Primary Program: A Framework for Teaching. Victoria, BC: Queen’s Printer.
Davies, A. (2000). Making Classroom Assessment Work. Courtenay, BC: Connections Publishing.
Davies, A. (2011). Making Classroom Assessment Work, 3rd Ed. Courtenay, BC: Connections Publishing and Bloomington, IN: Solution Tree Press.
Heritage, M. (2013). Gathering evidence of student understanding. In J. H. McMillan (Ed.), SAGE Handbook of Research on Classroom Assessment, pp. 179-196. New York: SAGE.
Lincoln, Y. S. & Guba, E. G. (1985). Naturalistic Inquiry. Beverly Hills, CA: Sage Publications.
Manitoba Ministry of Education. (2010). Provincial Assessment Policy Kindergarten to Grade 12: Academic Responsibility, Honesty, and Promotion/Retention. Winnipeg, MB: Manitoba Education.
Maxwell, G. (2009). Dealing with inconsistency and uncertainty in assessment. Paper delivered at the 35th Annual Conference of the International Association of Educational Assessment, Brisbane (2009, September).
Maxwell, G. S. & Cumming, J. J. (2011). Managing without public examinations: Successful and sustained curriculum and assessment reform in Queensland. In L. Yates, C. Collins and K. O’Connor (Eds.), Australia’s Curriculum Dilemmas: State Cultures and the Big Issues, Chapter 11, pp. 202-222.
Nova Scotia Department of Education. (2013). Back iNSchool Again! Halifax, NS: Department of Education. https://www.ednet.ns.ca/
Parkes, J. (2013). Reliability in Classroom Assessment. In J. H. McMillan (Ed.), SAGE Handbook of Research on Classroom Assessment, Chapter 7, pp. 107-124. New York: SAGE.

Remember, if you want to keep track of some of the conversations, connect via these links:

Twitter: #AforLConversation
Twitter: @Anne_DaviesPhD
