This is part 4 of a series on how we use value-added data in Tennessee and across the nation. The entire series can be found below:
Part 1: what the research says about value-added data
Part 2: how value-added data impacts teachers in the classroom
Part 3: how to use value-added data constructively
I’ve written a lot in the last couple weeks about value-added data, about how we use it now and how I think we should use it. That said, there were a few questions about TVAAS that didn’t fit in any of the pieces I’ve already written but which I want to elaborate on in more detail. This final piece offers some of my thoughts on these questions.
Question: What About Non-Tested Teachers? How Can We Use Value-Added Data When It Comes to Them? This is a huge concern, because right now almost two-thirds of Tennessee teachers are being evaluated based on school-wide data. This is because not every subject is tested, even within the core subjects (geometry, my subject, is not tested). The result is that in some cases, math teachers are evaluated by English scores and music teachers by writing scores.
My simple recommendation is that we need to stop using value-added data to evaluate non-tested teachers, even if it’s just for professional development. I understand the idea that a school culture contributes to all test scores, but if the reliability of an individual teacher’s scores is questionable, then how can we use those same scores to evaluate a teacher who doesn’t even teach the subjects being tested? That objection holds regardless of how we use these scores.
But how do we evaluate these teachers based on data? The good news is that alternatives to using school-wide value-added data for non-tested teachers do exist. For example, here in Memphis, art teachers can submit a portfolio to a committee of their peers and use that score as the value-added component of their evaluation score. This program could be expanded to include other subjects as well, offering an alternative measure for non-tested teachers.
Question: What If Teachers Receive Support and Still Don’t Improve? This is the fly in the ointment of my recommendation. I personally believe that if we reform the way we deliver teacher PD, we will wind up lifting many of our ineffective teachers into the effective category. However, some teachers will inevitably still not improve, either because of lack of effort or lack of ability. In that case, I’m OK with making employment decisions using evaluations that contain value-added data, because the decision wouldn’t just be based on an evaluation score utilizing value-added data. The decision would instead be based in part on evaluations and in part on the reality that we have done our due diligence by that teacher and done everything we can to help them improve. The decision is the result of a collective effort to help that teacher improve, not the result of one or two years of evaluation scores in and of themselves.
Question: Isn’t TVAAS a Better (More Rigorous) System Than Other Value-Added Measures? This is the one question in my whole struggle with TVAAS data that I simply can’t get a straight answer to. The short answer is: I don’t know, and I don’t know if we’ll ever get a definitive answer from a neutral party.
It is true that not all value-added measures are created equal. The State of Tennessee and SAS acknowledge this fact in their document “Misconceptions About TVAAS.” But how reliable is TVAAS itself? SAS claims that TVAAS yields “repeatability estimates of 0.7 to 0.8 for three year estimates.” Other research I’ve read suggests that a repeatability estimate of 0.8 is the goal, but that statisticians prefer it to be higher. In support of this measure, they cite an upcoming study, which was slated to be published in 2012. However, I spent an hour searching the web and could not find this study. I found a reference to it again in another document created for the state of South Carolina, but again with no citation.
My concern is that until I see a peer-reviewed study comparing value-added systems side by side and evaluating their rigor, we only have the word of the folks at TVAAS that their system is indeed more rigorous. I personally feel that we can’t just take them at their word, especially when so many questions exist around how TVAAS is being used. The State Board of Education or the Legislature should take action to ensure that this equation is independently evaluated and compared to other systems to ensure its rigor before being used in any high stakes decisions.
As a side note, if anyone knows of such a study please post a link in the comments.
Question: Decades of Professional Development Haven’t Yielded Improved Results. Isn’t This Just More of the Same? No, not if we do it correctly. Right now, most (but not all) ed schools and teacher prep programs aren’t truly preparing teachers for the rigor of the classroom. The result is that PD at the district level isn’t truly meeting the needs of individual teachers. In my experience, district-led PD is typically delivered in a lecture-style format supplemented with some handouts once every couple of months. Sometimes the PD is online and is so easy that you don’t even have to do the required readings before taking the test to show you’ve completed the PD.
Instead, we need personalized and targeted PD for new and old teachers alike that recognizes that most of our teachers are entering the classroom woefully unequipped to be successful right away. There are a few examples of this occurring that we could adopt. One example is the teacher mentorship program run through the PIT crew here in Shelby County. Made up of master teachers, this program pulls in highly effective coaches to work with master teachers and evaluators, who are then better equipped to support developing teachers.
Another example is the model employed by Teach for America and other alternative certification programs, in which new teachers are split into small groups and paired with a teacher coach, who then supports them throughout their first two years in the classroom through observations and targeted PD. I went through TFA, and both of my Teacher Coaches were very effective and very helpful. I’m very glad that I had them and I would not be the teacher I am today without them.
Question: Even if Just Used for Development, What Percentage of Teacher Evaluations Should Come From Value-Added Data? Some districts go as low as 10 percent (in Ohio), and in some cases value-added data would become the sole teacher evaluation measure, as almost happened here in Tennessee with the now-defunct changes to teacher licensing law. I think that as a baseline, we shouldn’t go over 25 percent, as this would cut down on teachers being falsely identified as ineffective.
Here’s why. Let’s assume a teacher scores a “5” on all evaluation measures (the highest they can score) other than value-added data. Let’s then assume that they score a “1” on growth (the lowest), and let’s assume that this score is due to error in measurement, as is often the concern regarding value-added measures. This teacher would then be assigned a score of a “3” under the current 65/35 split, or be considered as simply “meeting expectations.” By lowering the percentage of the evaluation from value-added data to 25 percent, this teacher could receive no lower than a “4,” and would still be considered above average in effectiveness. Teachers with a “4” couldn’t be misidentified as “ineffective” even if their value-added data was misidentified as a “1” or a “2.”
There’s still the concern that teachers on the border might be misidentified, but that number should get smaller as less of the score comes from value-added data.
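For readers who want to check the arithmetic in the example above, here is a minimal sketch in Python. It assumes the overall score is a simple weighted average of the two components and that the result is rounded down to a whole level. That rounding rule is my assumption, chosen because it reproduces the “3” and “4” in the example; the state’s actual scoring rules may differ. The function names are hypothetical.

```python
import math


def composite_score(other_score, growth_score, growth_pct):
    """Weighted average of the non-growth and growth components.

    growth_pct is the share of the evaluation that comes from
    value-added data, as a whole-number percentage (35 for a 65/35 split).
    """
    other_pct = 100 - growth_pct
    return (other_score * other_pct + growth_score * growth_pct) / 100


def reported_level(score):
    # Assumption for illustration: the composite is rounded down
    # to a whole level. The real cut scores may be different.
    return math.floor(score)


# A teacher with a 5 on everything except growth, and a 1 on growth.
for growth_pct in (35, 25):
    raw = composite_score(5, 1, growth_pct)
    print(growth_pct, raw, reported_level(raw))
```

Under these assumptions, the 35 percent weighting gives a raw average of 3.6 and the 25 percent weighting gives 4.0. The same function can be used to try other combinations, such as a teacher with a 4 on the other measures and a 1 on growth.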
In summary, there are many more questions that need answering when it comes to value-added data, some of which are unanswerable. But I think that we should err on the side of caution when it comes to this data given what is at stake.
The views expressed in this piece are solely those of the author and do not represent those of any affiliated organizations.
I have read all four pieces of your paper, and I remain unsure as to whether or not you support the use of value-added for high stakes purposes. At more than one point, you indicate that you do not. However, in the new “paradigm” that you propose, if teachers don’t respond to more doses of PD that result from low VAM scores, then you are okay with using VAM to push them out.
Another issue I have with your presentation is that there is no reading list for finding out more about TVAAS and VAM. There are a number of recent books since 2009 devoted to the history, explication, and/or critique of VAM/TVAAS, but I see no links to them or any list of suggested readings.
Entirely absent, too, are any historical links to help the reader contextualize the whole accountability argument that has developed over the past 50 years or so.
Thank you for the comment, Jim. A couple of thoughts in response. First, I realize I take a very nuanced position that is likely to satisfy very few: that value-added data in evaluations shouldn’t be used solely for high-stakes decisions. I think that it should only be a factor when all other attempts at improvement have failed. In that sense, I don’t think of it as “pushing teachers out” since we’ve gone the extra mile to help them increase their effectiveness.
This piece was also not intended to be a comprehensive overview of TVAAS and VAM from a literary standpoint. Rather, it represents my own thought progression around value-added data. Since these papers exist already and my purpose is to express what I’m thinking, I see no reason to rehash what is already there. However, if you feel you have a comprehensive list that represents a good overview of VAM and TVAAS, feel free to post it in the comments. If you want to email it to me, I can also include it as an addendum to the final piece.