Bride of VAMenstein: No Bad Idea Gets Left Behind

When I was much younger, my grandfather, a carpenter and engineer, had an expression he was fond of saying whenever we drove through a particularly poorly designed intersection or highway interchange.  He’d grunt in disgust and comment, “Whoever built this should do the world a favor.  Design ONE more and then drop dead.”

There are times when I’d like the economists who keep insisting they can design value-added models of teacher effectiveness to consider following the same advice.

On November 25th, the U.S. Department of Education released newly proposed regulations for teacher preparation in the over 1200 programs that exist across the country.  The press release stated:

“It has long been clear that as a nation, we could do a far better job of preparing teachers for the classroom. It’s not just something that studies show – I hear it in my conversations with teachers, principals and parents,” U.S. Education Secretary Arne Duncan said. “New teachers want to do a great job for their kids, but often, they struggle at the beginning of their careers and have to figure out too much for themselves. Teachers deserve better, and our students do too. This proposal, along with our other key initiatives in supporting flexibility, equity and leadership, will help get us closer to President Obama’s goal of putting a great teacher in every classroom, and especially in our high-need schools.”

This is not a new subject for research and policy speculation.  In 1984, Judith Lanier of Michigan State University contributed a comprehensive chapter on teacher education for the 3rd Handbook of Research on Teaching.  Dr. Lanier concluded that while many spoke of the importance of teacher preparation, no entity was willing to take real responsibility for making sure its many parts worked, and that its quality remained highly spotty and often quite poor.  Since then, proposals to change and improve teacher preparation have come from the Holmes Group Reports, the Carnegie report on teacher preparation, John Goodlad’s proposals for preparing teachers, and the original report of the National Commission on Teaching and America’s Future.  In the 30 years since Dr. Lanier wrote her chapter, numerous proposals, programs, and practices have tried to reshape teacher preparation in the United States.

Now it is the turn of the Data Junkies.

The DOE announcement says states will be required to report on the performance of teacher preparation programs based upon the following:

  • Employment outcomes: New teacher placement and three-year retention rates in high-need schools and in all schools.
  • New teacher and employer feedback: Surveys on the effectiveness of preparation.
  • Student learning outcomes: Impact of new teachers as measured by student growth, teacher evaluation, or both.
  • Assurance of specialized accreditation or evidence that a program produces high-quality candidates.

Some of this is benign, some of it is deceptive, and some of it is rank foolishness.  The fact that Secretary Duncan’s statement specifically held up Relay “Graduate School of Education” as an example of innovation in teacher preparation does not give me a great deal of confidence.  Relay, for those who do not know, is a teacher training “graduate school” that has no actual professors of education and is not attached to an institution of higher learning.  Rather, it is an alternative program housed in North Star Academy Charter School in Newark, NJ, that uses the school’s own teachers to train new hires in the methods of teaching used at North Star, allowing them both to be credentialed and to “earn” graduate degrees.  Relay and its supporters defend this because the charter school has outwardly impressive scores on standardized tests, but those scores come, as Dr. Bruce Baker of Rutgers University demonstrates, at the expense of more than half of the students who enroll at North Star, because they never make it to graduation.  North Star enrolls over 14% fewer students on free lunch than Newark Public Schools in general and less than half as many students with disabilities. The students with disabilities it does enroll are vastly more likely to have mild disabilities that cost the school little: North Star has no students with autism, no emotionally disturbed students, no intellectually disabled students, and no students with multiple disabilities.  Between 5th grade and 12th grade, half of the students attending North Star leave the school, and 60% of African American boys leave.

Just to be clear: The Secretary of Education for the United States of America announced new teacher preparation regulations by praising the “innovation” of a “Graduate School of Education” that does no serious graduate study, has no qualified educational researchers, and that prepares its graduates to teach the methods espoused by a charter school where an African American male student only has a 40% chance of reaching his senior year of high school.

Components of these regulations are puzzling.  The DOE wants states to keep track of teacher retention rates, presumably because of the long known problem of early career teachers leaving either their assignments or the profession in high numbers.  Such a requirement raises staggering logistical challenges, as states do not readily have ways to track the careers of teachers certified in their states who teach in other states, teachers who leave a public school for a position in a private or parochial school, and teachers who take up full time graduate studies, all of which are very different from leaving because of feeling overwhelmed and under-prepared.

More troubling, such data would be largely indicative of the professional cultures and environments of the schools in which teacher preparation graduates teach.  While teacher education has worked in the past three decades to provide prospective teachers with quality experiences to reduce the long recognized “reality shock” experienced by novice teachers, such work is frequently difficult, time and resource intensive, and requires significant rethinking of the relationship between universities and schools where prospective teachers are prepared.  Significant research also demonstrates that teacher turnover is deeply tied to school factors in initial job placements that are entirely outside university control.  Nowhere in these regulations on preparing teachers do I see anything related to how states and communities support the local schools to promote collaborative environments that support early career educators. What I do see is a potentially perverse incentive for teacher preparation programs to steer their graduates as far away from struggling schools as possible.

Worse than this provision by far, however, is the proposal to take the already invalid concept of Value Added Measures (VAM) of teacher performance and to use the VAMs of teachers to evaluate their teacher preparation programs.  A VAM is a statistical model based on student standardized test performance that takes a student’s previous year’s test scores, claims to predict how that student will perform given a year of effective teaching, and then generates the teacher’s “value added” from how well students actually do relative to those predictions.  The American Statistical Association issued a clearly worded statement this year detailing the problems with VAMs, citing both the lack of tests that are valid for the purpose and the very limited impact that teachers have on student variability on standardized test performance.  Research generally agrees that teachers are a very important, if not the most important, in-school factor for students, but research also agrees that whatever teachers’ impact is, standardized tests are an exceedingly poor measure of it: teachers account for only 1-14% of student variability on the tests.
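To make the mechanics concrete, here is a deliberately tiny sketch of the idea, not the formula any state or city actually uses. It assumes the simplest possible model: predict this year's score from last year's score with a single straight line fitted across all students, then call each teacher's "value added" the average gap between what their students scored and what the line predicted. The teacher names, the scores, and the helper functions `fit_line` and `value_added` are all invented for illustration; real VAMs pile many more variables and adjustments on top of this.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Returns each teacher's mean residual: actual score minus the score
    predicted from the prior year's score alone.
    """
    a, b = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = {}
    for teacher, prior, current in records:
        predicted = a + b * prior
        residuals.setdefault(teacher, []).append(current - predicted)
    return {teacher: mean(gaps) for teacher, gaps in residuals.items()}

# Hypothetical data: two teachers, three students each.
records = [
    ("Teacher A", 60, 66), ("Teacher A", 70, 75), ("Teacher A", 80, 86),
    ("Teacher B", 60, 62), ("Teacher B", 70, 71), ("Teacher B", 80, 80),
]
scores = value_added(records)
for teacher, score in scores.items():
    print(f"{teacher}: {score:+.2f}")
```

In this toy data Teacher A comes out positive and Teacher B negative by the same amount, because the prediction line is fitted to the pooled students. Notice what the model never sees: anything about the students other than last year's score.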

Despite these inherent flaws, VAMs remain highly popular with the federal DOE, which has been influenced by the Gates Foundation funded “Measures of Effective Teaching” study, which claims that VAMs can be used as a component of teacher evaluation.  Jesse Rothstein of University of California at Berkeley, however, notes that the data used to justify that claim is strikingly weak, and that teachers who are effective by some measures show up as ineffective by others and vice versa.  Dr. Baker of Rutgers illustrates that teachers whose students score high in one year (called “Irreplaceables” by Michelle Rhee’s New Teacher Project “thought leaders”) are not all “irreplaceable” in subsequent years (and in fact most drift all over the map), making it absolutely necessary to consider that factors outside of the classroom play significant roles in student test performance. VAMs also potentially damage teachers whose students, far from being low performers, work at an accelerated curriculum that is several years past the material directly tested on the exams used to generate VAMs.  The New York Times reported in 2011 on the tribulations of Ms. Stacy Isaacson, who was universally regarded as an outstanding mathematics teacher, whose students got excellent scores on state examinations, and over two dozen of whose students went on to New York City’s highly selective high schools.  She was nonetheless ranked in the 7th percentile of teachers in the city by the VAM formula used that year:

NYC VAM

Ms. Isaacson’s low percentile could not be explained to her by anyone in her administration, and the fault lay with the opaque statistical formula used to rank her based on students’ tests.  Given the inherent flaws with VAMs, my explanation is as follows.  In the New York City Value Added Model, what is circled in this picture is a real number:

NYC VAMreal

Everything circled here is the result of misapplying statistical tools used to model entire national economies to a single teacher’s classrooms:

NYC VAMfake

Anyone who knows children and their development should be troubled by VAMs because in order to believe that they work with such small samples as a single teacher’s classroom, we have to believe that the VAM can adequately account for every factor outside of a teacher’s instruction that can impact how students do on a test.  Did Johnny get an Individualized Education Plan this year that finally provides support for his dyslexia?  Are Johnny’s parents reconciling after a period of separation, so that his home life is stabilizing?  Has Johnny’s cognitive development reached a point where he is ready for more complex learning and will outpace previous years of instruction because children do not actually develop in straight lines?  All of these are factors that can boost a teacher’s value added score without the teacher actually having done anything especially different for Johnny.  There are just as many factors not directly related to a single teacher that can negatively impact a value added score.
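The small-sample problem can be illustrated with a toy simulation. This is my own sketch under loudly stated assumptions, not a reproduction of any real VAM: every one of 100 hypothetical teachers has exactly the same true effect (zero), each has a class of 25, and the only thing that moves a student's score relative to prediction is random noise standing in for all the outside factors above. The class size, the noise level, and the helper names `simulate_year`, `percentile_ranks`, and `correlation` are all invented for the example.

```python
import random
from statistics import mean

def simulate_year(rng, n_teachers=100, class_size=25, noise_sd=10.0):
    """Mean score gap per teacher when every teacher's true effect is zero.

    Each student's gap from prediction is pure outside-factor noise.
    """
    return [
        mean(rng.gauss(0.0, noise_sd) for _ in range(class_size))
        for _ in range(n_teachers)
    ]

def percentile_ranks(values):
    """Rank each teacher from 0 (lowest class mean) up to 99 (highest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = 100 * position // len(values)
    return ranks

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)  # fixed seed so the run is repeatable
year1 = percentile_ranks(simulate_year(rng))
year2 = percentile_ranks(simulate_year(rng))
year_to_year = correlation(year1, year2)

print("Teacher 0 percentile, year 1 vs year 2:", year1[0], year2[0])
print("Year-to-year correlation of ranks:", round(year_to_year, 2))
```

Even though these imaginary teachers are identical by construction, the ranking still has to put someone in the bottom percentiles and someone at the top each year, and one year's rank tells you essentially nothing about the next. A real VAM has some signal mixed into the noise, but the sketch shows why a percentile built on one classroom deserves suspicion.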

So let’s review: Research supporting VAMs ignores its own contradictory findings.  No current standardized test is sufficiently well designed for the purpose of generating VAMs. VAMs try to measure a teacher contribution that accounts for as little as 1% and at most 14% of student variability in standardized test scores.  Teachers whose students score in very high percentiles in one year can have students who score far differently in subsequent years. Teachers who are effective by every other measure possible can be placed in the very bottom tier of teachers using VAMs.  This is not the kind of stuff that inspires much confidence, but the federal DOE is going to push ahead anyway.

Have really terrible measures of teacher effectiveness on your hands?  Never mind!  If you are Secretary Duncan, you have Bill Gates-backed research and advocacy, and seriously flawed “research” from Michelle Rhee’s pet group, to tell you otherwise.  Full speed ahead.

Of course, if you are going to blatantly ignore what a growing body of genuine research tells you about your favored reforms, it stands to reason that you will double down on them and try to push them even further into the system by measuring teacher preparation programs by the VAMs their graduates generate.  There is a lesson here that Secretary Duncan, Bill Gates, Michelle Rhee, and an entire platoon of corporate reformers seem incapable of learning, and it has to do with learning humility when beloved projects turn out to be far more complicated and fraught with failure than anticipated.

In the 1935 sequel “Bride of Frankenstein,” the badly wounded but recovering Henry Frankenstein initially renounces his creation but is forced by his former mentor, Dr. Septimus Pretorius, to assist in a project to create a “bride” for the monster.  The monster is excited by the chance to have a companion like himself, but is quickly devastated by her immediate, terrified rejection of him and destroys himself, Henry’s laboratory, Dr. Pretorius, and the bride, proving again that the power of life and death is not a toy to be trifled with.

I could save Secretary Duncan quite a lot of trouble if he’d just ask.

Well, that didn’t go as planned, did it?

