Applying the Rubric Outside Carleton
Study Overview
The feasibility studies mirror the assessment process at Carleton. A representative from the partner institution visits Carleton to experience the process here. Over the course of the following year, we work with that representative to prepare for a similar assessment on their own campus. In addition to discussing how the rubric content may be adapted to better reflect that institution's needs, we develop a strategy for collecting a sample of student work.
At Iowa State, this latter problem is relatively trivial because the university already collects student portfolios to assess student communication. We work with the other institutions to create a sample in a manner that fits their needs. This may be done by targeting assignments from a sample of courses, by paying randomly selected students to submit papers showing analysis, interpretation, or observation, or in some other way.
Once a sample of student writing is collected, Nathan Grawe joins the institutional representative in running an assessment session at which teams of faculty readers evaluate student QR. As at Carleton, this reading spans three days and concludes with a discussion of what faculty readers saw in the papers and how the institution might respond to foster professional development and student growth.
Having applied the Carleton rubric to student work and discussed possible curricular implications, faculty readers then discuss the feasibility of regularly repeating the process as a means of assessing QR effectiveness on their campus. (Note: assessment need not take place annually.) This discussion includes:
- Are there items that should be added to the scoring rubric?
- Are there rubric items that should be deleted?
- What are the alternative ways to generate a sample of student work at this campus?
- Who should be involved in the reading/scoring of papers?
Reports from these studies will be made available on this site as they are prepared.
Study Results
Wellesley College Study Results (Microsoft Word 2007 (.docx), 20kB, May 4, 2010)
As our first feasibility study, this was a learning experience. Three lessons emerge in the report. First, QuIRK's rubric is designed for a study of general education work. The science-heavy sample of papers we read at Wellesley posed some challenges when assessing QR quality. Second, when reading at a different institution, two factors are changing: the identity of the readers and the character of the paper sample. To tease out the source of reliability challenges, we need to have the other institutions' readers read Carleton papers in addition to their own--something we implemented in subsequent studies. Finally, most participants thought the assessment exercise valuable and worth repeating.
Morehouse College Study Results (Microsoft Word, 68kB, May 4, 2010)
Morehouse readers read a sample of papers from Carleton and a second sample of Morehouse work. In both samples, readers were able to achieve reasonable levels of inter-rater reliability. Interestingly, Morehouse readers achieved greater levels of agreement when reading Morehouse papers than when reading Carleton work. The fact that readers were better able to set common standards when looking at work from their own community supports claims that evaluators understand assessment as a localized construct. As at Wellesley, most participants found the exercise worth repeating.
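For readers unfamiliar with inter-rater reliability, the following is a minimal sketch of two common agreement measures, simple percent agreement and Cohen's kappa, for a pair of readers. The study reports do not state here which statistic was used, so the choice of measures is an assumption, and the scores, the 0-2 scale, and the function names are hypothetical illustrations rather than study data.

```python
from collections import Counter


def percent_agreement(scores_a, scores_b):
    """Share of papers on which two readers gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Agreement between two readers, corrected for chance agreement."""
    n = len(scores_a)
    observed = percent_agreement(scores_a, scores_b)
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    # Chance agreement: probability both readers pick the same category
    # if each scored independently at their own observed rates.
    expected = sum(
        counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1:
        return 1.0  # both readers used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical scores from two readers on eight papers (0-2 scale).
reader_a = [0, 1, 2, 2, 1, 0, 1, 2]
reader_b = [0, 1, 2, 1, 1, 0, 2, 2]

print(percent_agreement(reader_a, reader_b))
print(round(cohens_kappa(reader_a, reader_b), 3))
```

Comparing such a statistic for the home-campus sample against the Carleton sample is one way to express the kind of difference described above. Kappa is usually preferred over raw agreement when a few score categories dominate, since raw agreement can look high by chance alone.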
Iowa State University Study Results (Microsoft Word, 66kB, Jul 19, 2010)
Like Morehouse readers, Iowa State readers read samples both from Carleton and from Iowa State. Unlike the Morehouse experiment, all of the Iowa State papers responded to a common prompt. Scorers achieved a reasonable level of agreement in both relevance and extent, but showed somewhat less agreement in terms of quality.
Edmonds Community College Study Results (Microsoft Word, 69kB, Aug 19, 2010)
Like Morehouse and Iowa State readers, Edmonds readers read samples both from Carleton and from Edmonds. Papers in the Edmonds sample were written in response to four different assignments. Reliability in relevance and extent proved strong. Scorers achieved useful reliability in QR quality as well, although it was a bit lower for the Edmonds papers than for the Carleton papers.