High Stakes Testing

Instruction That Measures Up: Successful Teaching in the Age of Accountability
by W. James Popham
(ASCD, 2009)

Reviewed by Kenneth J. Bernstein, NBCT
High School Government & Social Studies (MD)



Teacher Leaders Network

This is a book by one of America’s acknowledged experts on assessment: now emeritus from UCLA, Popham has been a leading figure in research (having served as President of the American Educational Research Association), in publication (of his many books and articles, and as editor of a major journal on evaluation), and as a person whose opinions on matters educational are always worth considering. For those interested, you may read a professional bio here.


Popham has in recent years been critical of how our educational policies have approached the matter of tests, assessment and evaluation. This book therefore may catch some off-base, because Popham now moves beyond criticism to try to help those in the classroom deal with the reality of test-based accountability, something to which the current national administration has made clear its commitment.

The purpose of Instruction That Measures Up can be clearly understood from one paragraph in the preface, which appears on p. 2:
I believe the best way for teachers to deal with test-based accountability pressures — the way that benefits students — is to accept those pressures as a given, then plan and carry out instruction knowing that it will take place on an accountability-spotlighted stage. What teachers must do is focus on providing instruction that measures up: to the expectations of administrators, parents, and taxpayers; to their own professional standards; and, most essentially, to the needs of their students.
The book is divided into 7 chapters:
1. Teaching Through an Assessment Lens
2. A Quick Dip in the Assessment Pool
3. Curriculum Determination
4. Instructional Design
5. Monitoring Instruction and Learning
6. Evaluating Instruction
7. Playing the New Game
There are also a two-page list of resources, an index, and some background on the author, for a total of 174 pages.

For those who are not all that knowledgeable about matters of assessment (a group that includes not only many of those in the classroom but also far too great a percentage of those involved with making educational policy), the second chapter by itself justifies the book. Popham divides the “Assessment Pool” into four broad categories: Testing as Score-Based Inference Making; The Core Concepts of Assessment; The Categories of Educational Tests; and The Summative and Formative Functions of Educational Assessments.

He provides clear explanations of the meaning of terms. Where necessary, he offers a great deal of detail with easy-to-comprehend explanations. A reader who pays attention will begin to grasp the importance of how psychometricians understand key terms.

The four core concepts Popham addresses are the key ideas of Reliability, Validity, Assessment Bias, and Instructional Sensitivity. Any assessment that fails to take into account these core concepts, whether it is designed by a classroom teacher for instructional purposes or by outside organizations and imposed from above for purposes of accountability, will by definition be at risk of being unable to provide sufficiently accurate information to allow one to draw appropriate inferences from the data.

Popham offers a number of blunt statements about problems with our current schemes of assessment, and a number of key warnings, of which one on p. 29 caught my attention: “It serves no one to ascribe unwarranted precision to educational tests.” Unfortunately, our national obsession with numbers and our desire to rank and compare means that it is precisely ascribing too much precision to the data we obtain from tests that has been distorting much of our educational policy for the past several decades.

Each of the chapters has some important material at the end. We have a “Chapter Check-Back” in which key concepts are repeated in summary form, as well as a list of several “Suggestions for Further Reading” on the topic of the chapter. The chapter on Monitoring Instruction and Learning has four key concepts, among which is this:
Feedback to students is most effective when it is task focused, directive, timely and simple; it is least effective when it comes in the form of grades. (p. 125)
Among the suggested readings are works by notable names such as Rick Stiggins, Robert Marzano, Grant Wiggins and Jay McTighe, and Popham himself, as well as some valuable works by lesser-known lights. Each suggestion is accompanied by a brief explanation by Popham as to why the work is included. For example, about the address offered by D. A. Frisbie as outgoing president of the National Council on Measurement in Education, Popham tells us:
Frisbie lays out a set of basics in educational assessment – concepts that he feels have been distorted in recent years. The article gives teachers a list of important misconceptions to avoid. (p. 51, italics in original)
Many of the chapters also contain political cartoons. Through these Popham pokes fun at a lot of rhetoric commonly encountered in current discussion on educational policy. The final cartoon on p. 160 portrays a pre-test pep rally, with a sign in the background reading:

Tomorrow’s Accountability Test
-- Cost teachers their jobs.
-- Close our school.
-- Destroy your future!


The speaker, apparently the principal, is urging the assembled to “Try harder, and harder, and harder!” while one teacher in the audience comments, “I can see why Confucius said, ‘High stakes are for string-beans.’”

Popham is for PROPER use of assessment. He quotes three sentences from Dylan Wiliam of Britain. The middle sentence reads:
It is only through assessment that we can find out whether what has been taught has been learned.
That, Popham says, is “one I’d like to see in neon lights above the entrance to every school in the world.” (p. 101)

This is a book Popham intends to be of practical use to teachers. One may not agree with all his formulations -- this reader had some questions about the approach Popham offers as the structure of an effective lesson. Nevertheless there is a great deal of insight and practical advice. If nothing else, readers should come away with a deepened understanding of the terminology, and of the appropriate uses and inappropriate misuses of assessment of various kinds.

Before some final remarks, allow me to share a number of very brief selections which will give you a real sense of Popham and his approach:
...rarely can anyone look at a planned instructional activity and say for certain that it’s going to be effective. (p. 18)

Still, few educators, though seemingly awash in an ocean of test-based accountability, currently recognize how few accountability tests are even mildly sensitive to the quality of a teacher’s instruction. (p. 38)

It is far better for students to master a modest number of truly potent, large-grain curricular aims than it is for them to superficially touch on a galaxy of smaller-grain curricular aims. (p. 61)

Teachers must always concern themselves with what’s best for their students. (p. 70)


Remember, instructionally insensitive accountability tests are essentially insensitive to instruction, meaning what a teacher emphasizes in class is probably not going to make a substantial difference in students’ scores on an instructionally insensitive accountability test. (p. 71)

Self-reflection is a teacher’s ally. (p. 114)

...it is fundamentally wrongheaded to try to use a test to help students monitor their own learning while, at the same time, using the results of that test to grade or rank those students. (p. 119)

...formative assessment’s focus is on getting students to learn, not outperform other students. (p. 120)

...although many of those earlier researchers set out on a quest for a silver bullet that would permit the definitive appraisal of a teacher’s competence, such a bullet was never found. It still hasn’t been.

The insuperable obstacle to the creation of a sure-fire, cookie-cutter approach to teacher evaluation is teaching’s profound particularism. (p. 145)

And one final quotation, that may help summarize Popham’s thinking:
  So, as someone who’s been dipping in and out of the teacher evaluation research literature for more than 50 years, I’ve come to a conclusion about the only truly defensible way to evaluate a teacher’s skill. Because of the inherent particularism enveloping a teacher’s endeavors, I believe the evaluation of teaching must fundamentally rest on the professional judgment of well-trained colleagues. (p. 146)
Ultimately this is a book about teaching. It is presented through the lens of a deep understanding of what assessment and evaluation can contribute to the improvement of teaching practice, as well as some serious cautions, offered throughout the work, about the dangers of pushing the instruments we have beyond the limits of the valid information they can provide us. Or rather, the reliable information from which we are able to draw valid -- even if often limited -- inferences.

I come away from the book agreeing with Popham that teachers should insist on getting a better grounding in assessment and evaluation (which is about far more than testing, but which should thoroughly cover matters of testing) as part of their professional development. Ideally that grounding comes during teacher preparation, but it is not too late to provide it as continuing education for those already in the classroom.

This is a valuable book. It is for and about teachers, but can be profitably read by anyone interested in improving teaching by the proper understanding and application of assessment. That should mean everyone, for we are all affected by educational policy, even if only through decisions made on how to spend the taxes we all pay.

I highly recommend this book.

Kenneth J. Bernstein is a National Board-certified teacher of social studies at Eleanor Roosevelt High School in Greenbelt, Md., and a member of the Teacher Leaders Network. He is nationally known as a blogger on education and other issues under his online name of teacherken. Bernstein is also a 2010 recipient of The Washington Post’s Agnes Meyer Outstanding Teacher Award.

Fires in the Mind: What Kids Can Tell Us about Motivation and Mastery
by Kathleen Cushman and the students of What Kids Can Do
(Jossey-Bass, 2010)

Reviewed by Kathie Marshall
Middle Grades Literacy (CA)
Teacher Leaders Network

When I first picked up my copy of Fires in the Mind, the latest of several books written by Kathleen Cushman to bring more transparency to adolescent thinking, I looked first at the appendix where the author lists books by other authors as resources for the reader.

I was delighted to find the names of many authors I’ve learned from, including Howard Gardner, Alfie Kohn, Mel Levine, Robert J. Marzano, Carol Ann Tomlinson, and Rick Wormeli. I was especially excited that she mentioned Carol S. Dweck’s work on mindset and Brainology. I had the privilege of hearing Dweck speak three years ago, and ever since I have incorporated into my first-week activities Brainology and other exercises about a growth mindset that values effort.

As I dug into Cushman’s new work, I had in the back of my mind the fact that over the summer I wanted to reflect on some of my lessons and revise/improve them — especially my yearlong magazine publishing project. I was hoping that Cushman’s book would support that intention, and I was not disappointed. It’s all about discovering what motivates students to seek mastery of something.

Cushman worked closely with students through the “Practice Project,” a program of the What Kids Can Do organization funded by MetLife Foundation. It began with an investigation of whatever skills students felt they possessed, as the first step in finding out “how you get good at something.” Cushman writes:
A simple question, it reverberates at many levels. It matters equally to youth and adults, rich and poor, professional, artist, and tradesperson. Its answers have the potential to transform our schools and communities. And exciting research on the question of developing expertise has emerged in recent decades from the field of cognitive psychology. Powerful new evidence shows that opportunity and practice have far more impact on high performance than does innate talent.

Cushman’s student co-authors helped her identify several key pieces of the puzzle by examining:

• how they got started (it looked fun, others they liked were doing it, they were encouraged by someone to try it),

• why they kept going when the effort was challenging,

• and what setbacks and satisfactions they experienced.
Along the way, students also interviewed other experts to help them more fully understand the process of deliberate practice in order to get good at something. Eventually they were able to compile a list of experts’ habits that are worthwhile in any learning situation, including, but not limited to, asking good questions, considering other perspectives, revising repeatedly, persisting, and knowing your own best work styles. These are important concepts for students to understand and apply to new learning situations both inside and outside of the classroom, and we teachers can certainly use them to make our classrooms better communities of practice.

One key statement I made note of was the following: “The most compelling school experiences involved hands-on projects in which they could work in teams toward an outcome that mattered to them.” (p. 9) I couldn’t help but think about how, at least in my school system, the total focus on accountability mechanisms is driving teachers away from critical but time-intensive learning experiences for students and toward constant drill and preparation for high-stakes testing.

However, I soldiered on, looking for express ways in which I might improve the magazine project I reinstated last year with my sixth-grade English students. In this inquiry learning activity, students choose an area of study and can work alone or with a partner; the contents of the magazine must be tied to sixth-grade English standards. It was very helpful to think about our magazine project in the light of Fires in the Mind. I saw clearly what I had done right as well as ways I might revise the project to help students better reach mastery by drawing on the wisdom of Cushman’s kids.

Throughout the book, the author provides a number of questionnaires and checklists, which are also downloadable at this Resources page. These include questionnaires for students to think about as well as for teachers to use in their planning. Later chapters in the book use the lens of deliberate practice to explore homework, interdisciplinary projects, and performance as evidence of mastery.

No matter what stage we’re at as educators, I believe every teacher can mine this book for many helpful nuggets to support student mastery. As a student named Avelina told Kathleen Cushman: “If teachers knew what gave us that driving force to do better, they could apply that, so that everyone can do things to the best of their ability.”

We can help ignite “fires in the minds” of our kids, and this wonderful book makes excellent fire starter.

Kathie Marshall teaches middle grades language arts in the Los Angeles Unified School District. A former school-based literacy coach, she writes frequently about instructional practice and the teaching life. To engage more deeply in the work of the Practice Project, visit the Fires in the Mind website.

Rigorous Schools and Classrooms: Leading the Way
by Ronald Williamson and Barbara Blackburn
(Eye on Education, 2010)

Reviewed by Renee Moore, NBCT
English Teacher (MS)
Teacher Leaders Network


Rigorous Schools and Classrooms: Leading the Way is a follow-up to Barbara Blackburn’s 2008 book, Rigor is Not a Four-Letter Word (see Karen Molter’s review here), and to fully appreciate the points, the books should be studied together. Both authors are former teachers (Williamson is also a former principal) whose educational careers run the gamut from K-12 classroom teaching to respected university research.

While Blackburn’s first book in this set was aimed at teachers and the classroom level, this book is designed primarily to show school leaders how they can navigate an entire school into a more rigorous culture and support teachers as they increase the level of rigor in their classrooms. The authors acknowledge there are many differing definitions of academic rigor in use today, and give a brief summary of those definitions and the many recent reports calling for more rigor in our schools. For Williamson and Blackburn, the preferred definition of rigor, which came from a practicing school principal, is:

creating an environment in which each student is expected to learn at high levels, each student is supported so he or she can learn at high levels, and each student demonstrates learning at high levels (p. 28).

As you might guess from that definition, a great portion of the book is aimed at the role of expectations. Years ago, my teacher-researcher friend Joan Cone did a powerful study entitled “The Gap is in Our Expectations.” In it, she examined what happened when a high school chose to end its ability-based tracking program. The hardest part of their process was getting the formerly lower-tracked students and the faculty to believe those students could do or would even attempt the same level of work as their peers. This same struggle with mindset, according to the authors, is at the heart of today’s push for academic rigor.

Among the many statistics cited in Rigorous Schools is one that reflects student expectations, or rather how those expectations are not being met. It comes from the 2006 report on high school dropouts, The Silent Epidemic, and notes that “66% of dropouts [said they] would have worked harder [in school] had more been demanded of them.”  The authors also reference a 2009 study of “low performing schools in Newark, NJ…where it was found that allowing students to struggle with challenging math problems led to improved achievement and results on standardized tests.” Williamson and Blackburn go on to openly challenge several myths about rigor, as well as teacher and student responses to it.

The book is full of charts and other tools for administrators to use as they both develop and evaluate rigor within their schools. Much of this material is taken from the first book, to which readers are frequently referred. In this second volume, the focus remains on the role of leadership. Quoting another principal, the authors assert, “The school leader is most influential in creating and maintaining a rigorous culture. Without leadership, expectations will wane and outcomes will be mixed at best.”

One key chapter addresses “Ownership and Shared Vision” as a requirement for increasing rigor, as the authors rightly acknowledge that to be effective and lasting, such a schoolwide shift can be neither a top-down decision nor a technique practiced by a few teachers scattered around the building. Another chapter addresses the role of the school leader as an advocate for the institution, building support from outside the school for a more rigorous culture within it.

The discussion around increasing rigor, particularly at the secondary level, takes on even greater significance as the Obama Administration pushes its goal of every U.S. student graduating from high school "college and career-ready." Many wonder how this can be accomplished given the glaring inconsistencies and inequities in American education today. The authors include a sample advocacy chart of facts from one high school that could be applied to schools and states around the nation; for instance:

“The fastest growing part of the high school curriculum at the moment are AP or college level courses. The fastest growing part of the college curriculum is remedial or high school classes.”

“High school tests [state standardized tests] address content that does not exceed the 9th or 10th grade.”

“15% of our students lose their scholarships at the end of their freshman year [of college], due to low GPA.”

These facts could easily have come from my own community in the Mississippi Delta or hundreds of others across America.

Most of the charts and tools in the book are downloadable from the website, including the PRESS Forward action plan and template to facilitate moving towards a more rigorous school. The book also includes very practical discussions of some of the most challenging details of such a school transformation including grading, scheduling (especially to provide time for teacher collaboration), student support, resistance from stakeholders, and suggestions for shared leadership.

There are several things I like about this book. The ideas that the authors are promoting contrast sharply with the test-prep driven malpractices that are being forced on teachers and students in lower performing schools — practices which we know will ultimately short-circuit true learning and sabotage long term academic accomplishment. Another potential benefit is that open, school-wide discussions about rigor do force us educators to examine our own deeply-rooted prejudices about expectations for different types of students.

I have seen educators hide behind a well-intentioned wall of paternalism towards some groups of students until the possibility of rigor for all is broached. I recommend this book, if not as a guide, certainly as an important discussion starter, for school improvement.

Renee Moore teaches in the rural Mississippi Delta. A former state teacher of the year and Milken Award winner, she serves on the boards of the Carnegie Foundation for the Advancement of Teaching and the National Board for Professional Teaching Standards. Her blog TeachMoore is featured at the Teacher Leaders Network.

Making the Grades: My Misadventures in the Standardized Testing Industry
by Todd Farley
(PoliPoint Press, 2009)

Reviewed by Kenneth Bernstein, NBCT
High School Government & Social Studies (MD)
Teacher Leaders Network

As the use of tests created external to schools and classrooms has exploded, one issue has always been the question of whether to rely merely upon selected response (a.k.a. multiple choice) items, or to also include constructed response items (paragraphs and essays).

Selected responses are cheap to administer; they can be scored solely by machine and the results obtained quickly. It is even possible, utilizing item response theory, to administer the test on a computer and use early responses to vary the items offered to the test taker, thereby determining the level of performance more quickly and accurately.
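The adaptive idea can be made concrete with a rough sketch. It assumes the simplest item response model, the Rasch (one-parameter logistic) model, and replaces real ability estimation with a crude step-halving update; the item bank, the function names, and the simulated test taker are all hypothetical and are not drawn from any actual testing program.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    """Rasch (one-parameter logistic) probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def next_item(ability, remaining):
    """Pick the unused item whose difficulty is closest to the current estimate.

    Under the Rasch model that item has a success probability nearest 0.5,
    which makes it the most informative one to ask next.
    """
    return min(remaining, key=lambda d: abs(d - ability))


def run_adaptive_test(item_difficulties, answers_correctly, n_items=5):
    """Administer up to n_items, moving the estimate up or down after each answer.

    The step-halving update is a simplification chosen for readability;
    operational tests use statistical estimation instead.
    """
    ability, step = 0.0, 1.0
    remaining = list(item_difficulties)
    administered = []
    for _ in range(min(n_items, len(remaining))):
        item = next_item(ability, remaining)
        remaining.remove(item)
        administered.append(item)
        ability += step if answers_correctly(item) else -step
        step /= 2
    return ability, administered


# Hypothetical item bank (difficulties on an arbitrary scale) and a simulated
# test taker who answers correctly whenever the difficulty is at most 1.5.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
estimate, asked = run_adaptive_test(bank, lambda d: d <= 1.5)
print(asked, estimate)
```

Each early answer changes which item is offered next, which is how such a test can zero in on a performance level with fewer items than a fixed form.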

But many think we need more: after all, life does not ask us to choose one out of four or five pre-selected choices. Thus many colleges and universities, employers, and classroom teachers prefer that the tests include constructed items — “essays” if you will.

While it is possible to machine-score such items, that technology is still in its relative infancy, which is why companies that produce tests have need of human scorers. And it is because of this need that we get Todd Farley’s book Making the Grades.

Farley spent 15 years in a variety of positions involved with the scoring of such constructed responses. He worked for a number of America’s most important assessment companies, often doing the work on contract for various states, including Virginia, where I live. 

I am not a trained psychometrician, although during my now-abandoned doctoral studies I did seriously study issues of assessment. I am a school teacher today, in my 15th year of teaching. Each year but one, I have had to prepare students to sit for external tests (which may or may not have met the criteria to properly be labeled “standardized”) that included constructed responses. These tests have included the Maryland School Performance Assessment Program, the Maryland High School Assessments, and The College Board’s Advanced Placement examinations. During my one year of teaching in Virginia, known for its Standards of Learning (SOL) assessments, the middle school American History test was made up entirely of selected response items. I also bring to this review some experience that parallels Farley’s: in 2009, I served as a Reader for the Advanced Placement examination in U.S. Government and Politics and scored one of the four Free Response Questions on that year’s examination.

As I glance at my copy of the book, I count more than 40 sticky notes affixed to pages containing passages I thought might be worth quoting. Some I obviously will forgo. Farley offers explanations of terms like reliability and validity, and explains how, in the case of reliability, the term was often misused by those supervising the scoring process. Simply put, scoring companies are often satisfied if those scoring agree 80% of the time, even if that to which they agree is erroneous. It is like a scale that consistently reports your weight as 20 pounds less than reality. The information you obtain is reliable, but it is NOT valid.
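A small numerical sketch may make the distinction easier to see. Everything in it is invented for illustration: the true weight, the scale readings, the two scorers, and the “correct” scores are hypothetical, and the helper names are my own.

```python
from statistics import mean, pstdev


def summarize(readings, true_value):
    """Return (spread, bias): low spread suggests reliability, low bias suggests validity."""
    spread = pstdev(readings)
    bias = mean(readings) - true_value
    return spread, bias


def exact_agreement(scores_a, scores_b):
    """Fraction of responses on which two lists of scores match exactly."""
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


# Hypothetical scale that reads about 20 pounds light every time.
TRUE_WEIGHT = 180.0
scale_readings = [160.0, 160.5, 159.5, 160.0]
spread, bias = summarize(scale_readings, TRUE_WEIGHT)

# Hypothetical scorers who agree with each other 80% of the time,
# yet almost never match the score the essay deserved.
correct_scores = [4, 4, 3, 2, 4, 3, 2, 1, 3, 4]
scorer_a = [3, 3, 2, 1, 3, 2, 1, 1, 2, 3]
scorer_b = [3, 3, 2, 1, 3, 2, 1, 2, 3, 3]
agreement = exact_agreement(scorer_a, scorer_b)
accuracy_a = exact_agreement(scorer_a, correct_scores)

print(f"scale spread {spread:.2f} lb, bias {bias:+.1f} lb")
print(f"scorer agreement {agreement:.0%}, scorer A accuracy {accuracy_a:.0%}")
```

In this made-up data the scale readings barely vary and the scorers hit the 80% agreement mark, yet the scale is 20 pounds off and scorer A matches the deserved score only once in ten essays. Consistency alone says nothing about whether the inference is sound.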

What educational measurement should provide is the ability to draw valid inferences from the information analyzed. If nothing else, reading this book will raise questions in your mind about whether many of the tests being used to evaluate students, teachers and schools meet that standard.

Farley demonstrates that reliability is not necessarily something we can rely on. Allow me to quote an entire paragraph from pp. 55-56 to illustrate:
But you want to talk about a sliding scale? The scale we used to score writing flopped about like a puppy on a frozen pond, going every which way, keeling over and standing up and falling down. In scoring writing, for instance, an essay that had a good development of ideas could earn a 6, a 5, a 4, maybe even a 3. An essay that was troubled on the sentence level in terms of grammar, usage, and mechanics could earn a 1, a 2, a 3, perhaps even a 4, 5, or 6. (I don’t dispute the idea: Gertrude Stein said of F. Scott Fitzgerald that she’d never met anyone who was such a poor speller, yet he still managed to produce a decent text or two.) The point is that essays with identical levels of ability in certain areas could end up (due to other considerations on the rubric) with significantly different scores. In scoring writing, we were far from having hard and fast rules to live by. It all seemed a little untenable, rather mystifying, and the easiest thing to do was to hand your essay off to your neighbor or plead with your supervisor for help.
That passage references the idea of the rubric, the standard by which the grader is supposed to evaluate the essay. If a rubric is sufficiently clear to give guidance, it also may be too rigid for the occasional creatively written paper. A rigid application of the rubric might, as Farley illustrates on more than one occasion, result in a good piece of writing being undervalued and a poor piece of writing receiving a high score.

And the scoring companies often have little control over the rubric and how it is applied. They are scoring under contracts issued by states that may leave them little flexibility. Allow me to illustrate using an example of a scoring team examining an anchor paper.

Anchor papers are supplied to scorers to give examples of the work expected at each scoring level of a rubric. Farley provides the four-point rubric for an 8th grade writing assessment (descriptive mode). The rubric, provided by the state in question, expects the student to use a five-paragraph format. According to the rubric, for the scorer to assign all 4 points, the organization, focus and development, style and sentence fluency, and grammar-usage-mechanics should be considered “excellent.” For 3 points, they should be considered “good.”

Farley describes how table leaders were trained to lead the scoring of this 8th grade assessment. After reviewing the anchor paper for a score of 3 (which Farley reprints in the book), all of the table leaders were scratching their heads, describing the paper as “lame.” One seventh grade teacher in the group argued that she would not consider this essay good work by her students. Another pointed out that it consisted of only simple sentences, and Farley (also a table leader) noted it had no voice and no style. The response of their trainer Maria is telling:
Maria looked down at the essay. “I’m not saying I’d give this a 3 in my classroom, either, but that’s how we have to score it based on this ‘focused holistic’ rubric. Most importantly, in this state’s Department of Education, the essay has a five-paragraph format, with introductory, body, and concluding paragraphs, and an introductory sentence in all five of them.”
As shocking as that is, the reader might not be prepared for what comes next. It’s an anchor paper that contains simply brilliant writing (and would be so for a high school student) but earns a score of 2. Here is Farley’s account of the conversation that ensued among the scorers:
   Greg scoffed. “This kid needs a publisher, not a score from us.”
   Maria looked guilty. “I know,” she said. “I certainly wouldn’t give this a 2, either. The writing may be sentimental, but it’s first-draft work from an eighth grader. It’s a damn good response, I agree.”
   “So?” Harlan said.
   “Well, what’s the important person or thing in the essay? It’s her favorite spot, a fact we don’t know until the last sentence. That’s not five-paragraph format, is it? There’s no introductory paragraph, no introductory sentence --”
   “No,” Greg said, “it’s way more artful than that, building up the suspense nicely and using some beautiful descriptive language.”
   “Yup,” Maria agreed shrugging. “I know. But this is how they want us to score them.”
   “Really?” I asked. “Rather a tedious five-paragraph essay than a beautifully done three or four paragraphs?”
   “It seems that way,” Maria answered. She looked at us, resigned. We looked back at her defeated.
   “All we care about is the formatting?” Pete asked.
   “That’s not the only thing,” Maria answered, “but it is the first thing.”
   “Wow,” I said, “it almost seems a kid could get a 3 for turning in an outline.”
   Maria thought about it. “Not quite,” she said.
The book is a good read, such a good read that I hesitate to go into too much detail, so that I don’t spoil the enjoyment – and the shock – you will experience as you read it. But I’ll share a few other samples.

At the time of the passage cited above, the scorers were earning $10/hour or less. They were not required to be knowledgeable about the subject matter, something that was a persistent issue in the experiences Farley cites. Scorers were trained, and had to meet a certain standard of scoring accuracy in order to be allowed to score. But the need for scorers was so great that the standards of accuracy were often bent, and scores were changed and manipulated to maintain acceptable levels of accuracy.

Please note that term: acceptable. Farley cites examples of where too great a level of accuracy could cause problems, and this was truly scary, because the examples involved the scoring of the National Assessment of Educational Progress (NAEP), which is supposed to be the “gold standard” by which all other educational assessment in this nation is measured. Farley was told he could not have a higher degree of accuracy than was recorded in previous scoring cycles lest the comparability of scores from cycle to cycle be lost. Ponder that for a while.

I do want to offer some cautions. Farley paints with a very broad brush. What he says is certainly widely applicable, but not universally so.

I teach in Maryland, which until May 2009 included two kinds of Constructed Responses (Brief and Extended) as part of the High School Assessments required in four subjects (Biology, Algebra, English, and Government) for graduation from high school. Each constructed response was scored by the same 4-point rubric, a copy of which students had during the exam. In the scoring process, each response was read by two scorers. Inter-rater reliability required only that the scorers give adjacent scores, not identical scores. If the scores were not identical, the student received the higher score. I am not sure how accurate a measurement that was, but at least the students got the benefit of the doubt, unlike the scenario above that Farley described.
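The two-scorer rule just described is simple enough to write down as a sketch. The function name is hypothetical, and the handling of scores more than one point apart is an assumption: I do not know what the actual procedure was in that case, so the sketch only flags it.

```python
from typing import Optional


def resolve_score(first: int, second: int) -> Optional[int]:
    """Combine two scorers' marks under the rule described above.

    Identical or adjacent scores count as acceptable agreement, and the
    student receives the higher of the two. Scores more than one point
    apart return None as a placeholder; what happened in practice for
    such cases is not stated here, so this branch is an assumption.
    """
    if abs(first - second) <= 1:
        return max(first, second)
    return None


print(resolve_score(3, 3))  # identical scores
print(resolve_score(2, 3))  # adjacent scores, student gets the higher one
print(resolve_score(1, 3))  # not adjacent, flagged rather than resolved
```

Note that under such a rule two scorers can disagree on every single response and still count as consistent, so long as they never differ by more than one point.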

Cost control is another factor that influences the quality of the scoring process. Farley’s account focuses on scoring companies that paid relatively low wages to individuals who often lacked the necessary professional background to make accurate, independent judgments about the work they were scoring. As a result, a highly controlled system of scoring was imposed. But this method of assessing writing samples is not universal.

Here I speak from my experience as a reader of free response questions on the Advanced Placement exam for US Government and Politics. To score this exam, an individual must teach the subject in a post-secondary institution or have at least three years’ experience teaching the AP course in a high school. We were certainly qualified as to content. We were also paid substantially more than the $10 an hour Farley cites for the incident above, plus expenses for transportation, food and lodging.

We were thoroughly trained. We had our work closely examined at first, until we demonstrated our competence. We were spot-checked regularly by table leaders and by question leaders. The range of our scores was monitored by computer, and if we showed any scoring patterns that raised questions, our work would be reexamined. But once we got going, the scoring was limited to a single reader because we had over 100,000 exams (four questions each) to be scored in less than a week after training.

I know how seriously the Advanced Placement officials took this process because of my own experience. I read very quickly, and I was so much faster than others that, in the beginning, my work was checked very closely until the question leader determined that I was scoring accurately. When I had any doubt about a response, I would on my own initiative check with one of my fellows and/or with the table leader. That pattern was widespread among my fellow scorers.

I would argue that the AP people have demonstrated that reasonably accurate and consistent scoring of constructed responses by properly trained people is possible, if one is willing to accept the concomitant costs.

Still, despite the caveat I offer based on my AP experience, I think Farley’s book is a valuable read with much to tell us about the often poorly understood processes and implications of large-scale high stakes testing. He ends the book with these blunt words:
If I had to take any standardized test today that was important to my future and would be assessed by the scoring processes I have long been a part of, I promise you I would protest; I would fight; I would sue; I would go on a hunger strike or march on Washington. I might even punch someone in the nose, but I would never allow that massive and ridiculous business to have any say in my future without battling it to the bitter, better end.

Do what you want, America, but at least you have been warned.

David B. Cohen, NBCT
Teacher Leaders Network 

With Thanksgiving arriving, we begin to hear and read commentaries about all that we are thankful for. As a teacher, I am mindful of the role teachers have played in my own life. Of course, the climate and values in education have changed somewhat since the 1970s, and what I once may have believed about schools and teachers when I was a naïve child must be tested against the rigorous standards of today.

 
I attended two different American military schools in Germany for kindergarten through second grade. Perhaps due to moving around a fair amount as a child, I find it hard to reconstruct memories of the teachers from those years. I’m told I was a skilled reader at an early age. Had anyone asked at the time, we would have likely attributed my facility for reading to my parents’ constant encouragement and the many high-interest books in our home. I now have to assume that although I can’t remember my earliest primary-grade teachers, they must have been brilliant educators. As we are frequently reminded in education these days, “the teacher is the most important factor in student success.” Thank you to those Armed Services teachers, whoever and wherever you may be.
 
In third and fourth grade, I was living in Colorado Springs. My dominant educational recollection is that my fourth-grade teacher punished me harshly whenever I held my pencil the wrong way. I was confused and humiliated by the insistence that I meet the highest standard of pencil-gripping and meet it immediately. But now, immersed in the field of education myself, I’ve learned that we have to be tough, we have to produce results and keep students progressing on schedule. And this teacher certainly produced results: I ended up in a program for gifted students when I reached fifth grade. The fact that I felt no connection to my education or my peers is far less important than the fact that my skills were far above average and growing. My tears and alienation were inconsequential, non-measurable and non-educational outcomes, so let me now give thanks for that fourth-grade teacher. I’m only sorry that my pencil grip never changed.
 
For fifth and sixth grade, I moved to Los Angeles and attended my fourth elementary school. I thoroughly enjoyed those years and imagined I was having a wonderful life. By today’s standards, however, I suffered from a conspicuous lack of measurable progress. I am sorry to say it, but my fifth and sixth grade teachers did not fully embrace accountability. The gifted program involved exciting enrichment activities but no summative assessments. Though I didn’t mind at the time, I can see now that those years represent a wasted opportunity. Where was value-added measurement when I needed it most?
 
Instead of helping me continue to make more than a year’s progress in each grade level, these teachers tried to promote “love of learning” and “personal responsibility.” Not really the school’s job, was it? In the absence of clear and rigorous state or district mandates, they had me reading Ray Bradbury in fifth grade and J.R.R. Tolkien in sixth. To make matters worse, my sixth grade teacher had us discuss the books without a single worksheet or objective assessment tool in sight. Open-ended conversation about The Hobbit made for pleasant class time, but did little to guarantee that I could have passed a rigorous, standards-based assessment. From a contemporary educational perspective, I have to ask, what was the point?
 
There were other signs of trouble, too. We memorized and recited poetry in class, and sang songs for no apparent reason. My fifth grade teacher even came out to the playground and taught us new games, organized a class Olympiad, and fostered friendly competition and sportsmanship. What exactly were the learning objectives and intended outcomes here? Where was the accountability? When I think of all the better ways that time could have been spent, I know now that some benchmarks must have gone unmarked. My inclination to give thanks is lessened somewhat at that realization.
 
I must concede that in fifth and sixth grade I thoroughly enjoyed school -- but not in any measurable sense. I made many friends, and even still keep in touch with a few of them, but what of that? As a professional educator in the 21st century, I am often reminded that I need to be data-driven. I don’t have time to waste on ephemeral pleasures or concerns about social and emotional matters. Performance counts.
 
Thanksgiving this year has made me realize that some of the blessings in our lives are obvious, but some warrant deeper examination. Looking back on my early education, I am thankful for those teachers who put academics (and pencil-gripping) first -- and thankful that those teachers who tried to reach the “whole child” didn’t harm me irrevocably. They may have shirked their accountability and neglected rigor now and then in their reckless pursuit of the joy of learning, but somehow it all worked out in the end.
 
Happy Thanksgiving, everyone!
 
David B. Cohen teaches English and serves as an academic advisor at Palo Alto High School in California. He is a National Board Certified Teacher and a co-founder of the Accomplished California Teachers (ACT) organization.

Making Learning Whole by David Perkins, Jossey-Bass (2008)

Reviewed by Mary Tedrow

Disclaimer: I read (incrementally) the entire text of Making Learning Whole: How Seven Principles of Teaching Can Transform Education while sitting in front of my 11th-grade students in Reading Workshop, second block of the day. When I read under those circumstances, I find my mind runs on three tracks: a focus on the reading; a jump to what my students will need from me in the next 25 minutes; and, for professional titles like this book, a jump to how what I am reading will affect this particular classroom.

In this particular reading experience, I did not focus very well. I offer this confession because it led me to believe that, as a classroom teacher, I was not the audience David Perkins was looking for. Otherwise, my brain would have been firing more on the third track: making direct application to my classroom work.

That rarely happened.

Making Learning Whole is Perkins’ argument that education is damaged by breaking knowledge down into discrete parts, which he calls elementitis, a phenomenon that leaves students spitting out fractured facts that have little cohesiveness or application to the subject under study.

He advocates designing course instruction around a “junior version” of the ultimate goals in a course so students can learn the subject as a whole concept -- and then returning to aspects of the “game” that prove difficult to master. Perkins calls this working on the hard parts.

His ongoing analogy refers to how he and so many others learned baseball – first in the junior version of Little League. The complexities of the game are revealed over time and the hard parts are the skill drills any athlete will remember from their days of training.

He continues the baseball analogy through seven chapters that present each of his seven principles (my take on each chapter is in parentheses): Play the Whole Game (reduce elementitis); Make the Game Worth Playing (motivation); Work on the Hard Parts (practice, practice, practice); Play Out of Town (show that knowledge transfers); Uncover the Hidden Game (explore the underpinning structure of a subject; in the case of baseball it is the strategies that lead to winning); Learn from the Team (learning as a social rather than individual activity); Learn the Game of Learning (the old chestnut of growing lifelong-learners).

I agree with Dr. Perkins. I think John Dewey would also agree. I think most vocational teachers, or Yearbook, Art or Music teachers – or any number of others who have been lucky enough to teach a course like the Journalism course I taught for years – would agree. We have all worked with students in a junior version of grown-up games and seen the power it has on student achievement. Students learn through the process of doing over and over, working on the hard parts, uncovering – as Perkins would say – the hidden game concealed within the grown-up games.

I think where I’d quibble with Perkins is over his subtitle: How Seven Principles of Teaching Can Transform Education. This book offers little practical advice that might help instructors who resist leaving their safe academic curricula to make the uncomfortable leap to teaching the whole game. Little is provided to bridge that gap through visualization or application of his theories. No helping hand is offered to teachers who struggle against a tide of mandates that encourage ever more elementitis, grasping for any means to offer substantive learning experiences for their students.

After explaining his take on what I read as constructivism, Perkins offers a one-pager at the end of each of his seven principle chapters titled “Wonders of Learning.” All of the five or six paragraphs at the end of each chapter begin with “I wonder” as in “I wonder how to develop a good theory of difficulty for what I’m teaching?” Questions, I assume, designed to set readers on the course to transformation.

Frankly, I wondered when I’d have the time to re-think my entire theory of teaching while most of my waking hours are consumed by mandated state tests, data collection, grade reports, increasing student enrollment, committee meetings, grading and responding to student work. I need a little more help to go through my transformation. I suspect those who don’t spend leisurely hours in a think tank environment might as well. The theory is great, but without more step by step assistance, Perkins’ ideas are unlikely to revolutionize the practice of teachers too busy to look beyond next week’s plans.

As I thumb back through this book, I don’t see the usual evidence of my high engagement with a professional title – the highlighting, post-it notes and scribbled plans in the margin. On a couple of occasions, however, what Perkins said did lead me into thoughts about my own students and their next 25 minutes.

I like the graphic on page 104. It shows three ways to respond to student confusion. It might be helpful for reaching those teachers who are stuck in “blame the student” mode. I also made note of questions he offered to get students thinking. The ones I wrote down: What’s going on here? What makes you say that? and What makes this hard? I’m always looking for good inquiry questions.

The sports analogy works; the theory is acceptable. But in all honesty, David Perkins did not offer enough to overcome my distraction, and I won’t be recommending this book to my teaching peers. I suspect it will be required reading in his Harvard Graduate School of Education courses.

Mary Tedrow teaches high school English and journalism and co-directs the Northern Virginia Writing Project. She is a National Board Certified Teacher and a former Fellow of the Teacher Leaders Network.
