Reimagining Assessment in Undergraduate Education: Progress, Resilience, Mastery

Barry Fishman, Faculty Innovator-In-Residence, Arthur F. Thurnau Professor of Information and Education
@barryfishman

Cynthia Finelli, Director, Engineering Education Research Program; Associate Professor, EECS and Education
@cindyfinelli

Melissa Gross, Associate Professor of Movement Science, School of Kinesiology, and Art and Design, Penny W. Stamps School of Art and Design
@MMelissaGross

Larry Gruppen, Director, Master of Health Professions Education program, Professor, Dept. of Learning Health Sciences

Leslie Rupert Herrenkohl, Professor, Educational Studies, School of Education

Tracy de Peralta, Director of Curriculum and Assessment Integration, Clinical Associate Professor, Dental School
@tracydeperalta4

Margaret Wooldridge, Director, Dow Sustainability Fellows Program, Arthur F. Thurnau Professor, Departments of Mechanical Engineering and Aerospace Engineering

Given the resources of a major public research university, how would you design undergraduate education if you were starting with a blank page? That is the question we explore in The Big Idea project, and our answer is that undergraduate education should reflect the true strengths of the institution. In the case of the University of Michigan, that is research, discovery, and real-world impact. The Big Idea undergraduate program is problem-focused, with students directly engaged in research and scholarship as they work towards ambitious learning goals. Learners who meet these goals will be prepared to address the world’s pressing—and often ambiguous—problems. The program is designed to emphasize progress towards mastery — thus, it does not use grades, courses, or credit hours to mark student progress or readiness-to-graduate. In this post, we first discuss some critical features of the current higher education landscape and explain why we think they are ripe for re-examination, and then we outline how assessment for and of learning will work in the Big Idea program.

How Business-as-Usual in Higher Education Works… Just Not for Learning

The traditional mechanisms for measuring progress and readiness-to-graduate in higher education are grades, grade-point averages (GPAs), and credit hours. If you mention these terms, almost anyone involved with higher education will know what you are talking about. These are combined with course sequences that define both (a) the general education or “lower division” phase of college, where students pursue the broad distribution requirements usually associated with the liberal arts, and (b) the “upper division” years of college, where students pursue a focused major. These mechanisms have served higher education for more than a century, and they offer a relatively straightforward way to direct student learning pathways and to rank the relative performance of learners within those programs. However, in providing structure, these mechanisms also limit learners and curtail the potential of educational programs in key ways. In this section, we explore these limitations to explain why we are taking The Big Idea in a new direction.

Grades and GPAs: What Do They Measure?

Assigning grades as a way to record or report students’ performance feels like a naturally occurring practice in formal education. But the use of grades, like many components of modern education practice, did not become widespread until the early to mid 1900s (Brookhart et al., 2016). Grades were part of a movement to make education more “scientific.” These efforts were led by Edward Thorndike, a prominent behavioral scientist who was instrumental in the spread of scientific management and measurement in education. The work of Thorndike and others led directly to the standardized testing movement that defines much of K-12 education and college admissions in the United States today (Rose, 2016). The history of 20th century education reform has been described as a struggle between the constructivist and project-based ideas of John Dewey and the instructivist and standardized approaches of Thorndike. The consensus view is that Thorndike won and Dewey lost (Lagemann, 2000). As a result, the current structure of undergraduate education is more about the sorting of students for purposes that come later (such as college, graduate school admissions, or employment) than it is about supporting the learning of the students involved.

The use of grades actually removes information about student learning from the academic record-keeping process. Even when a letter grade is based upon a strict set of criteria about what a student has learned, the letter itself communicates little to the world beyond the classroom where it was issued. If a student has earned an “A,” we know that they did well in the course, learning (hopefully) the majority of the material taught. But what does a “B” mean? We generally assume that it means the student has learned roughly 85-90% of the material, but what material was not learned? If the course is part of a sequence, later instructors are given little information about the understanding or capability of their incoming students. “Grading on the curve,” meant to normalize outcomes across students, further masks information about student learning. A grading curve may communicate comparative performance, but without reference to the goals of the course.

The problem becomes even more pointed when trying to gauge a student’s overall learning in college. Currently, the GPA is the primary tool for reporting overall accomplishment. High GPAs are seen as signs of academic accomplishment, and they are rewarded with honors and other recognitions. Employers and graduate schools often use GPA as a sorting mechanism. This is problematic for several reasons. One problem is that GPAs can be “gamed,” with students seeking to take courses solely for the purpose of boosting their overall GPA. Another problem is that a great deal of the variance in a student’s final GPA can result from a student’s first-year grades, which are earned as students adjust to the new environment of college. This places first-generation students or others facing transitional challenges in college at a long-term disadvantage that does not reflect their actual capability upon graduation. Some institutions, such as Swarthmore College, have declared that all classes in a student’s first semester will be recorded only as “Credit/No Credit” to reduce anxiety and encourage risk-taking during this important transitional period.
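
To make the point about first-year grades concrete, here is a minimal arithmetic sketch. The two students, their yearly GPAs, and the 30-credits-per-year load are hypothetical figures chosen for illustration, not data from any institution; the function name `cumulative_gpa` is likewise our own.

```python
def cumulative_gpa(yearly):
    """Credit-weighted GPA from a list of (year_gpa, credits) pairs."""
    total_points = sum(gpa * credits for gpa, credits in yearly)
    total_credits = sum(credits for _, credits in yearly)
    return total_points / total_credits


# Hypothetical figures: both students perform identically in years 2-4.
rough_start = [(2.4, 30), (3.8, 30), (3.8, 30), (3.8, 30)]
smooth_start = [(3.8, 30), (3.8, 30), (3.8, 30), (3.8, 30)]

print(f"{cumulative_gpa(rough_start):.2f}")   # 3.45
print(f"{cumulative_gpa(smooth_start):.2f}")  # 3.80
```

Under these assumed numbers, the two students show the same performance for three full years, yet their final averages differ by 0.35 points solely because of the first year.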

Another reason that GPAs are problematic involves the growing concern that grades (or, more accurately, a focus on grades as indicators of self-worth or future opportunity) are a contributor to the growing mental health crisis on college campuses. Many students, especially at selective universities, have never had an experience of academic “failure” in the form of a low grade or standardized test score. Yet learning scientists have long known that “failure” is a key component of successful learning. If a learner only ever gets the right answer or only ever performs at a high level, there is a good chance that they weren’t truly being challenged in the first place. Furthermore, knowing the “right” solution to a problem or challenge is often not as revelatory as working to understand why a solution is “wrong” and how to repair it. Deep learning is not so much about knowing answers as it is about understanding problem and solution spaces. The typical approach to grading in higher education imposes penalties on learners for taking on challenges and not succeeding on the first try, instead of encouraging continuous progress towards mastering the material. Think, for instance, of courses where the entire grade is based on a single exam or a small set of exams.

Healthier and more productive grading systems (such as the gameful learning approach pioneered at U-M) emphasize progress and “safe” or “productive” failure, encouraging students to work beyond their comfort zones with confidence that they are not being adjudicated on each result. In the Big Idea program, we choose mastery-based assessment over traditional grading systems to support continuous effort and progress, and to reflect true student accomplishment rather than averages or comparison.

Credit Hours: The Time Clock of Education

The credit hour, sometimes referred to as a “Carnegie Unit,” was created in the early twentieth century by the Carnegie Foundation for the Advancement of Teaching as part of their effort to create a pension system for college professors. To qualify for the Carnegie pension system, institutions were required to adopt a range of standards, including the newly-introduced Carnegie Unit (Silva, White, & Toch, 2015). From today’s vantage point, those who originally conceived of the credit hour might be surprised to see how its use has expanded to become the “time clock” for virtually all aspects of educational programs. The original metric was useful for determining the amount of time students spent in contact with instruction, but with expanding options for co-curricular, online, and other forms of learning, the question of what should “count” as learning or instruction has become muddied. Furthermore, the amount of time devoted to instruction does not necessarily translate to learning. As we’ve heard at least one critic observe, “If we’re measuring seat time, that’s the wrong end of the student.”

The Big Idea is based not on time, but on student accomplishment. We expect students to move at different paces and to approach their learning in a sequence that reflects personalized pathways. Therefore, we are designing a program that we believe will take students roughly the same amount of time to complete as a “traditional” bachelor’s degree, but that does not use time to regulate progress.

General Education, Majors, and Degrees

Lower-division undergraduates take a “distribution” of courses intended to expose them to different ways of thinking and expression, in order to develop intellectual breadth. Upper-division students engage in a series of courses defined by their major, designed to develop intellectual depth. Readiness for graduation from the university is measured by having earned at least a passing grade in all of the courses specified by the selected program of study and having accrued the required number of credit hours. This system is designed to provide both guidance to students in selecting courses and managerial efficiency in terms of university planning. Unfortunately, it also does little to promote learning and can suppress individual agency.

Within the system of distribution requirements and majors, it is unusual for a degree-granting program to employ program-level assessment of individual student learning as the qualification for graduation. Normally, assessment is conducted at the course level, with students being assigned grades based upon their performance within courses, and with those grades being compiled and reported as an average GPA across courses. If a student takes a predefined set of courses and maintains an acceptable GPA, that student may graduate with a degree in that area. In specialized cases, undergraduates may complete a capstone or summative project, such as a thesis. But for the majority of programs these are optional exercises, and the assessment criteria for summative projects are not usually aligned with any program-wide criteria for learning. In the Big Idea project, we have come to refer to business-as-usual as “faith-based” education, because one needs to have faith that students are learning something, even if the program’s assessment is not designed to record or report that learning.

Charles Muscatine, in his book “Fixing College Education,” argues that the current system of majors exists in no small part to create “peace among the departments” (Muscatine, 2009, p. 39), ensuring that students continue to take courses in different areas of the university, even if the divisions that are created are not grounded in reality. As Muscatine pointed out, there has never been research conducted on whether the division of learning into sectors like “humanities,” “science,” and “social science” has real benefit to learners, or even reflects a valid distinction among different ways of thinking. He recommends distinguishing between disciplines in terms of the methods employed for sensemaking—“logical, empirical, statistical, historical, critical (analytic, evaluative), creative” (Muscatine, 2009, p. 40)—and making sure students have practice in each. This is what might once have been described as a truly liberal education, meant to liberate the mind and broaden our ability to think in flexible ways.

Assessment in the Big Idea: Business-as-Unusual

As we introduce our plans for assessment in the Big Idea, we note that, as with other elements of our design, the ideas presented here are not necessarily new; similar approaches have existed in different educational contexts in the past, and similar ideas continue to be employed in specialized applications today. However, the use of these practices at the core of a degree-granting undergraduate program within the modern research university is unusual.

In contrast to the checklist approach to graduating with a major described in the previous section, the Big Idea program starts with an ambitious set of program-wide learning goals and has only one criterion for graduation: that a student has met all the goals at an acceptable level, as evidenced by their “mastery transcript,” a document designed to record accomplishment, provide evidence of accomplishment, and allow for tailoring to meet different student, program, employer, or other needs. (Our thinking about this kind of documentation is inspired by the work of the Mastery Transcript Consortium.) There are no required courses and no minimum GPA. In fact, we do not intend to record or report grades for students in the Big Idea program. Nor is progress measured by the number of credit hours completed, which is a metric of time rather than learning. What we are interested in is the progress a student makes towards mastery of the learning goals.

Our approach to assessment is designed to promote personalization of learning pathways, agency, and self-authorship. It is designed to promote resilience and personal growth. It is designed to support learners from diverse backgrounds. Where most current assessment paradigms are geared towards ranking and sorting students, the Big Idea assessment model is designed for transparency with respect to learning.

Assessment for Learning

There are two primary types of assessment: formative and summative. Summative assessment is a report of student knowledge, skill, or accomplishment at the end of some defined period or event, such as a course, high school, etc. Examples of summative assessment include final exams, which are typically used to measure end-of-course understanding and serve as the end of a student’s relationship with a course and instructor, and the SAT or ACT tests, which are meant to “sum up” a student’s academic potential at the end of secondary school and provide a measure of the student’s readiness for post-secondary education. Formative assessment, on the other hand, is meant to inform learning, to give feedback on student progress, or to serve as a milestone towards a larger goal. Assessment in the Big Idea program is, by design, intended to be almost entirely formative.

Students in the Big Idea are expected to always be making progress towards the learning goals. Feedback on learning will come from many different places in the program — research supervisors, faculty across different learning experiences (including courses), and members of various communities where students conduct research and learning. The learning goals themselves are not meant to be a “final” report of a student’s potential or accomplishment; rather, they represent a certain level of attainment that can continue to be built upon throughout a learner’s life and career.

Paths Towards Competence and Mastery

To emphasize the importance of progress and practice towards mastering the learning goals (no learning goal is a box to be checked), we describe paths towards mastery as encompassing several levels of proficiency: Awareness, Literacy, Competency, and Mastery. These terms are defined as follows in the Big Idea program:

Awareness

Students who achieve this level know that this goal, practice, or skill exists, and they understand how it fits within the larger field or profession.

Literacy

Students who achieve this level are in the middle of learning this goal, practice, or skill: they understand its dimensions, they can apply or demonstrate it in a basic manner, and they are able to learn more about it through additional work. Such a student could play a supporting role in a project, led by someone else, that employs the skills or practices inherent in this goal, and would understand what the project leader was doing.

Competency

Students who achieve this level are ready to begin professional work requiring this practice or skill. A competent student can do a task requiring this skill on their own, knows how to ask well-formed questions in the area or provide sound answers to others’ questions, and could advance their understanding through self-study.

Mastery

Students who achieve this level are ready to employ the practices or skills embodied in this learning goal in the real world. Such students could supervise, guide, or teach others with respect to this goal. We note that “Mastery” for the purposes of graduation from the Big Idea program is not necessarily lifelong mastery; our rubrics will emphasize areas for ongoing growth even beyond our program goals.

Note that while we expect all learners in the program to achieve competency in all the learning goals, we expect each learner to achieve mastery in only a subset of the overall learning goals. Which goals a learner masters will depend on their particular focus and choices made during undergraduate study.
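
The graduation criterion described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not a description of any actual Big Idea system: the goal names are hypothetical, and the `min_mastered` threshold is our own placeholder, since the program specifies only that mastery is expected in "a subset" of goals.

```python
from enum import IntEnum


class Level(IntEnum):
    """Proficiency levels, ordered as the program lists them."""
    AWARENESS = 1
    LITERACY = 2
    COMPETENCY = 3
    MASTERY = 4


def ready_to_graduate(record, goals, min_mastered=1):
    """Return True if every goal is at Competency or above and at least
    `min_mastered` goals are at Mastery.

    `record` maps goal name -> Level; a goal missing from the record is
    treated as not yet reached. `min_mastered` is an assumption: the
    program text says only "a subset".
    """
    levels = [record.get(goal) for goal in goals]
    if any(level is None or level < Level.COMPETENCY for level in levels):
        return False
    mastered = sum(1 for level in levels if level == Level.MASTERY)
    return mastered >= min_mastered


# Hypothetical goal names, used only for illustration.
GOALS = ["Communication", "Resilience", "Statistical reasoning"]

student = {
    "Communication": Level.MASTERY,
    "Resilience": Level.COMPETENCY,
    "Statistical reasoning": Level.LITERACY,
}
print(ready_to_graduate(student, GOALS))  # False

student["Statistical reasoning"] = Level.COMPETENCY
print(ready_to_graduate(student, GOALS))  # True
```

Treating the levels as an ordered scale, rather than as pass/fail flags, keeps the emphasis on where a learner currently stands on each goal and what the next step would be.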

Getting Started (even before the “start”)

We expect that students in the Big Idea will arrive with some initial proficiency in many of the learning goals for the program, and we will invite students to present evidence of that learning. This would be a natural outgrowth of the kinds of experiences one might have had in secondary education (including extra-curricular experiences) that might lead a student to be interested in the Big Idea in the first place. We note, however, that traditional markers of student “accomplishment,” such as Advanced Placement scores, will not be considered as evidence of learning in and of themselves. We will invite students to present a case for their current level of learning that might include information about courses taken or tests passed, but that also requires demonstration of their actual knowledge or skills.

One of the first formalized activities of the Big Idea curriculum, to be conducted as part of a “Forum” experience (this is meant as a multi-year home base for students, similar to homeroom in secondary education, with access to more experienced peers and an advising faculty member), involves helping students become familiar with the learning goals, understand why the goals were chosen and how they are to be assessed, and gain exposure to a range of examples (perhaps provided by the more advanced students in the program). We will also engage learners in self-assessment activities to allow them to calibrate their current level of proficiency in each of the learning goals.

Demonstrating Achievement

We will develop detailed rubrics for each learning goal, providing descriptions and examples at each level of learning towards each goal. The rubric is meant as a guideline for learning and for assessment, not an “answer key” or template. We expect broad individual variation in the way each learner expresses their current levels of learning, to reflect individualized interests, choices, and pathways.

Students will use an electronic portfolio as a tool to record their work and accomplishments, to share/present those accomplishments for evaluation and feedback, and (following graduation) to offer evidence of learning in the future. We intend the portfolio to be a “mastery transcript,” as it will serve as a replacement for traditional transcripts, among other uses. Portfolios have long been used as a means to encourage self-authorship by students, allowing them to shape the narrative of their accomplishments, and even customize that narrative for different audiences and purposes. Electronic portfolios also allow for the inclusion of many different forms of evidence in support of claims of learning, including links to assessment evidence, and they can be assembled in different ways for different needs and audiences. Understanding how to communicate one’s own abilities using the portfolio (and underlying data) will be an important component of the Big Idea program, related to the “Communication” learning goal.

Digital badges, also known as micro-credentials, are one mechanism for tracking and reporting accomplishment that can be employed in an electronic portfolio. Digital badges have many useful affordances. They can be used to establish pathways towards complex learning, to record progress, and (when used as credentials) to signal accomplishment. Digital badges can also be used to expand assessment beyond our formal processes by encouraging students to make progress towards the learning goals in all areas of their life, whether part of the formal activities of the Big Idea program or elsewhere.

In addition to the portfolio-based mastery transcript, assessment in the Big Idea will include in-person interviews or performances. The sessions will be personalized to each student, allowing individual cases to demonstrate achievement in individual ways.

Formal (and Formative) Assessment in the Big Idea

Students in the Big Idea program are responsible for making an evidence-supported case for their progress towards or accomplishment of learning goals. There will be two levels of panel review for students to demonstrate achievement, both of which include review of the mastery transcript and in-person interviews or performances. As part of the overall learning process within the Big Idea, these two levels of review allow personalized assessment and feedback at a large scale through a manageable workload. The process is managed by the Forum/homeroom instructor, who (together with more advanced students in the Forum) can give advice to students about how to assemble their materials for review and when they are ready to start the process.

The first level involves review by a panel comprised of more advanced, “arm’s length” students, working under the supervision of a faculty member. This arrangement allows for frequent assessment of student progress at a large scale and is a key part of the learning process for student panelists, as they learn to give constructive feedback to more junior students with respect to each of the learning goals. The goals of this first level review are both to mark progress and, more importantly, to provide feedback to the student. This review level is expected to be employed regularly for students at the awareness and literacy stages of proficiency. While the actual number of reviews might vary for each student according to their needs and pace, we would expect students to engage in this first-level review several times a semester with respect to different learning goals. We also anticipate that not all students will succeed in each review step (though we hope that feedback from Forum instructors and peers helps mitigate this). The Big Idea is designed to support progress towards meeting the learning goals, so we would not necessarily consider this a “failure,” but rather progress towards eventual success.

As an example of how the first-level review might work, consider a student who believes she is making good progress on the statistical and computational learning goals (Ways of Knowing) and also the resilience goal (Personal Good), because of various difficulties she encountered while working towards these goals. The student discusses her readiness to be reviewed with her advisor and other students in Forum, collecting input on what evidence to include in her mastery transcript and how to assemble it. This evidence could include examples of the work done in statistics and computation, including a computer program that could be used to display statistical analyses related to a public health dataset about water quality (the dataset comes from a project the student is working on led by a faculty member in the School of Public Health) and a reflection statement about what was learned and how it represents progress towards achieving competency on those learning goals. To present evidence of resilience, the student writes a narrative discussing various challenges faced as she worked with this data and how she worked through those challenges. The panel reviews the material and provides feedback and questions to the student, and the student is invited to respond in writing. An in-person conversation could be scheduled for the student to meet with the panel for further discussion, if needed. Finally, the panel would issue feedback and a decision about what level of proficiency the student had reached for each learning goal. Students who believe they are ready to be reviewed for competence or mastery in particular learning goals would also use this student-run panel, and the panel would “approve” portfolios for review at the second level.

The second, more advanced level of assessment involves a panel of faculty, advanced students, community members, alumni, etc. who will review and give feedback to students. This panel will primarily hear cases at the competency or mastery level of learning goals. Students will be encouraged to present cases that combine multiple learning goals (though we do not expect any single case to contain all learning goals). Students would be expected to engage in this second level of review for each learning goal, with each review covering multiple goals so that, taken together, the reviews represent the full range of learning goals. This stage will also include a public presentation and discussion, similar to a doctoral thesis defense. As part of the Forum activities, we would work to prepare students to be proficient in a range of presentation modalities (again, the Communication learning goal), recognizing that this is another area where students will vary. We hope to make these community events. Once the program is operating at scale and the scheduling of such events becomes difficult, we envision an annual (or semi-annual) public celebration involving a poster fair, talks, and panels of students and others involved in the research.

Summary

The use of grades, credit hours, and majors has served to help the modern university “manage” the processes of undergraduate education, but these tools were not designed to support learning or make it more transparent. Eventually, these structures became the tail that wags the dog. The assessment infrastructure of higher education has evolved to emphasize efficiency and to simplify the management of learners, courses, and programs, but without a focus on learning or support for student individual differences or independence. Periodic attempts at reform often focus on the re-introduction of learner-focused ideas such as project- or problem-based learning, but these efforts strain against the boundaries of the existing infrastructure. We need “infrastructuring” work (Star & Ruhleder, 1996), a conscious reshaping of the practices, technological supports, and cultural norms that guide our thinking about assessment and support for learning in education. Our proposal for assessment in The Big Idea requires us to re-engineer the infrastructure for higher education assessment to emphasize progress, resilience, and eventually mastery of ambitious learning goals. Re-shaping these structures is a key component of our plan to design undergraduate education to take best advantage of the resources and opportunities of a major public research university.

References

Brookhart, S. M., Guskey, T. R., Bowers, A. J., McMillan, J. H., Smith, J. K., Smith, L. F., … Welsh, M. E. (2016). A century of grading research: Meaning and value in the most common educational measure. Review of Educational Research, 86(4), 803–848. https://doi.org/10.3102/0034654316672069

Lagemann, E. C. (2000). An elusive science: The troubling history of education research. Chicago: University of Chicago Press.

Muscatine, C. (2009). Fixing college education: A new curriculum for the twenty-first century. Charlottesville: University of Virginia Press.

Rose, T. (2016). The end of average: How we succeed in a world that values sameness. New York: HarperOne.

Silva, E., White, T., & Toch, T. (2015). The Carnegie Unit: A century-old standard in a changing education landscape. Retrieved from http://www.carnegiefoundation.org/resources/publications/carnegie-unit/

Star, S. L., & Ruhleder, K. (1996). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111–134. https://doi.org/10.1287/isre.7.1.111