## Seeking Educational Alpha

Nothing, however, was neglected by the anxious father, and by the men of virtue and learning whom he summoned to his assistance, to expand the narrow mind of young Commodus, to correct his growing vices, and to render him worthy of the throne for which he was designed. But the power of instruction is seldom of much efficacy, except in those happy dispositions where it is almost superfluous.

-Edward Gibbon, The History of the Decline and Fall of the Roman Empire, 1776.

There are plenty of professions in which individual human performance matters a great deal but is difficult to measure directly in a simple or obvious way.  However, when that level of performance becomes important – which usually means there is a huge amount of popular interest and/or money at stake – that creates the incentive for statistics and quant types to plunge into the numbers and develop those metrics.

Team sports provide a good example of a case in which an individual player’s contributions to the desired end state – victory – can be difficult to assess.  But there is interest and money, and so, for example, in the sport of baseball, we have the famous Bill James with his Sabermetrics and Win Shares, and Michael Lewis’s sketch of Oakland A’s general manager Billy Beane’s analytical magic in Moneyball.  And you can bet that all other sports now have their own metrics cults and prophets.

But when it comes to the recent fad of measuring individual teacher or school performance, or the efficacy of alternative pedagogical styles, it seems to me that governments are using a completely misguided approach in simply looking at student test scores.

This is because the fair and accurate way to assess teachers is not PC, but whatever way we use must be PC, so PC makes us dumb yet again, and unfortunate and innocent teachers are the victims of the collateral damage who bear the brunt of our collective insanity and unwillingness to come to grips with reality.

And we really need that fair and accurate way to measure teacher performance, because (1) teachers are particularly vulnerable to getting tarnished with an unjustified bad reputation courtesy of teenage angst, self-perpetuating warped perceptions, and the school rumor mill, and (2) the knee-jerk reaction is to go get smarter teachers based on test scores, certification exams, and the place they received their diploma under the assumption that these characteristics will also make them ‘better’ teachers, but (2)(A) there is little evidence to support that assumption (see Education Realist on the subject) and (2)(B) bizarrely, and most un-PC of all, it will definitely mean the unjustifiable replacement of lots of perfectly adequate black and hispanic teachers by whites and asians but without any likely gains in student achievement.  The value and power of a sane and reasonable metric to prevent this inequitable nonsense cannot be overstated.

How should we assess teachers then if we’re going to go beyond end of year standardized test scores?  We should borrow a page from finance.  We don’t care about a fund manager’s gains alone.  We care whether he can consistently beat the market at the same level of risk.  That’s called ‘Alpha’.  And we aren’t looking for a teacher’s test scores either, we are looking for his Educational Alpha.  How do we find it?  We move there in a sequence of steps.

1. From Scores to Yield

First, we should recognize that it’s not a teacher’s fault if a student arrives in his class with knowledge well below the standard expectation, and neither should it be to that teacher’s credit if his student arrives knowing 50% of the class material on day one.  That leads us to the concept of ‘value-added’, which is nothing really new.  You generate two standardized exams which are distinct but test the same range of knowledge, and you give the kid one at the beginning of the term and one at the end and measure the difference in scores.  That’s analogous to ‘yield’ in finance.  If I tell you the price of a stock at the end of the year when you sold it, that means nothing to you unless you also know the price at the beginning of the year when you bought it.
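As a minimal sketch of the ‘yield’ idea, assuming each student simply has a start-of-term and an end-of-term score on parallel exams (the class name, field names, and numbers below are made up for illustration):

```python
from dataclasses import dataclass


@dataclass
class StudentScores:
    pre_test: float   # score on the exam given at the start of the term
    post_test: float  # score on the parallel exam given at the end of the term


def score_yield(scores: StudentScores) -> float:
    """Value added over the term: end-of-term score minus start-of-term score."""
    return scores.post_test - scores.pre_test


# Hypothetical classroom of three students.
classroom = [
    StudentScores(pre_test=40, post_test=55),
    StudentScores(pre_test=70, post_test=78),
    StudentScores(pre_test=50, post_test=50),
]

yields = [score_yield(s) for s in classroom]
print(yields)
```

The end-of-term scores alone would rank the second student first, while the yields show that the first student gained the most over the term.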

2. From Yield to Expected Yield

But yield isn’t enough.  Some horses are thoroughbreds and will get a lot faster in their first racing season.  Others are draft-horses and won’t.  That’s not the fault of the jockey, that’s the fault of the quality of the material being worked and the hand the trainer is dealt.  Kids are the same; some are quick and smart, and some aren’t.  We are used to hearing about the awful teachers in America’s urban public schools, but it’s possible that some of them are doing the best that anyone can do with the kids they’ve got and should be getting medals and applause instead of criticism and disdain.  On the basis of yield alone, if you give a teacher a bunch of Einsteins, that instructor is going to look great even if they do nothing, while another teacher given a class full of Beavis and Butthead clones is going to look awful.

So one needs to figure out to what degree one expects a particular child to improve during the course of a term, and then compare the student’s actual performance to that forecast.  The expected yield is like the market benchmark in the finance analogy.

3. Generating Expected Yield

But how do we calculate such a forecast for a child?  The best we can do is continuously collect a vast amount of data on a wide variety of variables from a large number of students and perform some kind of statistical regression analysis.  This regression analysis will show us which variables have the strongest correlation coefficients and are most explanatory and predictive of yields, and a subsequent factor analysis will help tell us which of those variables are strongly correlated with each other so we can reduce the forecasting model to the absolute minimum number of factors while retaining the accuracy of the prediction.

This model isn’t going to be perfect, but it will probably get us in the right ballpark most of the time for most kids.
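To make the forecasting step concrete, here is a deliberately tiny sketch: an ordinary least-squares line with a single predictor, fitted by hand. This is an assumption-laden toy, not the full model described above, which would use many variables and a factor analysis. The predictor (`aptitude`, some prior test score) and all the numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical records: prior aptitude score -> observed yield.
aptitude = [80, 90, 100, 110, 120]
observed_yield = [4, 6, 8, 10, 12]

slope, intercept = fit_line(aptitude, observed_yield)


def expected_yield(aptitude_score):
    """Forecast yield for a new student from the fitted line."""
    return intercept + slope * aptitude_score


print(expected_yield(105))
```

With real data the points would not sit on a line, and the scatter around the fit is exactly the noise that the later averaging steps are meant to wash out.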

The problem is the question of which variables one will be throwing into the statistical sausage factory.  If I had to guess, the strongest predictor would be IQ, but good luck giving every kid an IQ test in this climate.  Fortunately, there are reasonably good proxies for IQ in the results of certain standardized tests that are given to young students, so that’s one way to get around the politics.

But what about other factors?  Some things – like peer groups and life circumstances – are just hard to capture.  Other things are easy to capture, but are politically sensitive and tend to give rise to controversy: race, gender, socioeconomic status, family situation, height and weight, etc.

For the latter group of factors, one faces two main problems.  The first is that this kind of regression analysis is going to produce some very unpopular and taboo results that contradict some of society’s most important pretty lies in a way that will threaten the careers of anyone involved in the process of producing them, and the second is that using those results to generate different profiles and expectations for different students is going to drive the usual suspects completely crazy when they notice certain patterns.

But this is the minimum of what you have to do if you are genuinely interested in measuring teacher quality and performance.  The fact that no one is doing it is evidence that, despite all the signalling to the contrary, no one is really interested in measuring teachers if it means we have to look squarely in the face of the part of the problem which lies in the students themselves.  I don’t completely agree with Robin Hanson’s quip that “School Isn’t About Learning” but advocating for school quality isn’t about teacher performance if one isn’t willing to adjust completely accurately for the composition of the class that teacher has to manage, and based on the sometimes ugly truths of reality instead of utopian fantasies.

So, the profile and the model are the hard part, but let’s assume we get it done anyway, and for any student we can plug his vitals into the computer and out pops his expected yield.  That is like an individualized, custom ‘Beta’ of 1.

4. From β to Δ

So, for any particular student in a teacher’s class, we have an expected change in subject test score and, at the end of the year, the actual change.  The difference is Delta – Δ, and we would expect a lot of statistical noise, and small positive and negative deltas amongst the various students.  But we aren’t measuring students, we’re measuring the teacher’s performance, so we need to add up all the student deltas and take an average, $\bar{\Delta}$.  And you would want to normalize the deltas to measure them in terms of the standard deviation of the normal student distribution of test scores for that subject.
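Here is a rough sketch of that step, assuming we already have each student’s actual and expected yield plus the standard deviation of scores for the subject (the function names and figures are illustrative only):

```python
from statistics import mean


def normalized_delta(actual_yield, expected_yield, score_sd):
    """One student's delta, in units of the subject's score standard deviation."""
    return (actual_yield - expected_yield) / score_sd


def class_mean_delta(records, score_sd):
    """Average normalized delta over a class.

    `records` is a list of (actual_yield, expected_yield) pairs, one per student.
    """
    return mean([normalized_delta(a, e, score_sd) for a, e in records])


# Hypothetical class of three students; subject score SD assumed to be 10 points.
records = [(12, 10), (6, 8), (11, 8)]
delta_bar = class_mean_delta(records, score_sd=10)
print(round(delta_bar, 3))
```

Expressing the deltas in standard-deviation units is what lets you compare a teacher’s classes across subjects and tests with different scales.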

5. From $\bar{\Delta}$ to α

One expects a teacher to have good classes and bad classes, and good years and bad years.  But if you take all the $\bar{\Delta}$’s and average them as well, then the ups and downs should cancel out, and what you have left is the sustained ability to impact students above or below what would have been expected with a merely ‘average’ teacher.  That’s Educational Alpha, that’s fair and accurate, and that’s what we should be measuring.  But we’re not.
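A sketch of the final averaging, assuming we have one $\bar{\Delta}$ per class a teacher has taught. The standard-error helper is my own addition, not part of the scheme above: it is just one conventional way to gauge whether an alpha is distinguishable from noise. All numbers are invented.

```python
from math import sqrt
from statistics import mean, stdev


def educational_alpha(class_mean_deltas):
    """Average of a teacher's per-class mean deltas across classes and years."""
    return mean(class_mean_deltas)


def alpha_standard_error(class_mean_deltas):
    """Rough standard error of the alpha estimate (needs at least two classes)."""
    return stdev(class_mean_deltas) / sqrt(len(class_mean_deltas))


# Hypothetical teacher with four classes on record.
teacher_deltas = [0.1, -0.1, 0.2, 0.0]
alpha = educational_alpha(teacher_deltas)
stderr = alpha_standard_error(teacher_deltas)
print(round(alpha, 3), round(stderr, 3))
```

In this made-up case the alpha is smaller than its standard error, which is the sort of result you would not want to hang a firing or a bonus on.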

And there are definitely some political reasons why we’re not, and why we probably won’t be doing it anytime in the future either.  However, since the No Child Left Behind era we have been collecting oodles of data on students and teachers alike (here’s an example from LA), and while they are still doing this wrong, I’m sure some enterprising statistician among you can extract the Alpha scores through a little clever manipulation of the existing dataset.  What would we see then?

Some Predictions.

1. The Null Hypothesis In Education == The Efficient Market Hypothesis

Bryan Caplan has his signalling model of and case against education and Arnold Kling has what he calls the null hypothesis in education (see here: 1, 2, and 3).  The basic idea from both concepts is that, on average, school quality, teacher performance, pedagogical style, teacher test scores, and dozens of other usual suspect considerations in fact make very little difference for test scores and life outcomes, and the primary driver of those outcomes is the cognitive talent and character of the student himself, on which the educational system – really any educational system – can only have the smallest of impacts, if any.  Mostly, the kids are born bright or dull, and unless you stunt them, they’re going to develop their minds and mental skill at their innate rates, no matter what you do.

In other words, it’s really hard for a teacher to beat the student market.

What that means is that we would predict most Alpha scores to be close to zero, with just a few slightly negative or slightly positive, and I’d guess a bias to the negative since one would reasonably expect it’s easier to skunk an entire class than to bring everyone up above their expected level of performance.

And as with repeatedly successful fund managers, there will be a few teachers with sustained and consistently high alpha scores, and it will be very difficult to explain why, what they are doing that is so special, or whether in fact their cases are mere statistical flukes.  In either case, whatever the secret sauce is to their magic, it will prove impossible to replicate and scale across the educator population.

If this is true, then the frame of our entire education debate and all our over-politicized discourse is completely wrong.  And this is something we could, conceivably, discover right now.

Teachers are right to push back against unfair evaluations and obsession with test scores, but they should be agitating for this kind of evaluation program so they can prove their case instead of constantly appearing like they have something embarrassing to hide and are just trying to avoid scrutiny.

2. Losing The Alibis

One of the terrific shames of our age is that PC makes it impossible for most people to speak forthrightly about their core interests lest in the course of conversation they accidentally step on one of a multiplying number of taboo land-mines. That gives rise to an insatiable demand for alibi-frames, or cover stories that allow us to ‘justify’ our actions and desires in the modes our society currently tolerates, whether or not they make any sense or correspond to reality.

But if people invent these alibis out of whole cloth, they’ll just be accused of using racist code-words and dog-whistles and such, and so they have little choice but to ride the wave generated by the influential people who control the bounds of respectable discourse and the direction of political policy, and use rhetorical judo to leverage those ‘acceptable concerns’ into a rationale that will also allow them to get a little of what they want too.

Here’s what happened.  Education reform advocates, social scientists, and progressive policy makers have been facing down the full standard deviation racial gap in test scores for generations under the assumption of the neurological uniformity of all population groups and the corollary belief that they could close the gap through ‘resources’ (i.e. money) and ‘the best teachers’ and pedagogical methods.

It hasn’t happened.  Nothing seems to work.  But that hasn’t stopped the reformers, who can’t be convinced to pull the plug and thus keep trying increasingly desperate interventions to save their patient.  But all of those efforts rely on keeping a certain seductive myth alive: that the explanation for the gap is not genes but a certain kind of ‘privilege’, namely that all the smartest teachers with all the positive alpha are locked up in the nice white and Asian suburban schools.  And, if only we could get Harvard’s finest to do a single tour in the ghetto before predictably burning out and bailing for jobs in administration or academia, we could solve this problem once and for all.

It’s a fairy tale.  But if you keep the myth of untapped alpha alive, don’t be surprised when other people start using it in ways you don’t appreciate.  That’s practically the only way to get a non-progressive initiative accomplished in this political environment.

There is a lot of dissatisfaction with the current public school system and a lot of people want out and the ability to pursue alternatives, but without having to pay for private school on their own, which they can’t afford, or to buy a house in an elite school district, which they also can’t afford.  What do these parents really want for their children out of the educational system?  Who knows – lots of different things.  Some want out from under the government’s thumb so they can choose their own curriculum and disciplinary rules.  Others want their kids to have the highest quality peer group. There’s a thousand different desires.  But the one thing these parents are allowed to say they want is better quality teachers and better quality schools, relying on the assumption that these things are meaningful concepts and, you know, exist.

That is, they are allowed to say they want to go to a place where the teachers have more alpha.  How can you tell them no when you’ve been running a massive ‘get more alpha’ campaign for generations?  Hence charters and vouchers and so on.  And a brain-dead never-ending education policy debate.

However, when we actually start measuring teacher alphas, and if we fail to reject the null hypothesis in education, then the legitimacy of the frame of all these arguments and alibis and cover stories will suddenly evaporate.

On the one hand, that’s an unfortunate result for someone like me who supports the maximum amount of educational variety, freedom, and entrepreneurship.  A genuine free market in education won’t produce a company that can magically make Johnny smarter, but it would satisfy what his parents want, instead of some school board bureaucrat.  But progressives will use the result to shut down charters and vouchers as ‘unjustifiable’ based on performance, and thus force everyone into identical public schools for the sake of their collectivist and egalitarian principles and for propagating narratives most compatible with their own ideological perspective.  They’ll also stop anything the unions don’t like, such as the evaluations themselves, and experiments like performance pay.

On the other hand, they might just stop obsessing about ‘the gap’ and let schools go back to tracking students by ability so that teachers can have more cognitively homogenous classes, which are easier and more efficient to teach.  If we could even catch up with 50 years ago, we’ll move far ahead of where we’re at today.

1. Thanks for the links–I changed the date on my Mr Singh post which changed the URL. I’m insanely weird, and I wanted the post to be dated in May. Here’s the correct link: http://educationrealist.wordpress.com/2014/05/31/learning-from-mr-singh/

• Handle says:

Thanks, I’ll fix it tomorrow.

2. I don’t think we can evaluate teachers based on student performance at all. It’s really something we’ve only considered in the last couple decades. Before then, we understood that student performance was a combination of cognitive ability and incentive. The only reason we can’t do that anymore is that it requires we acknowledge the achievement gap.

• Handle says:

In Bad Students, Not Bad Schools Robert Weissberg has two alternative “academic accomplishment formulas” with a combined weight of 15 distributed amongst 5 factors.

There is his “traditional understanding” version:

Achievement = 8 Intelligence * 4 Motivation * 1 Resources * 1 Pedagogy * 1 Instruction

And what he calls, “The Liberal Alternative” version:

Achievement = 1 Intelligence * 1 Motivation * 5 Resources * 4 Pedagogy * 4 Instruction

I think he’s spot on in two ways. First, his achievement formula is correct and implies that most teachers won’t make much difference, but it leaves room for exceptional (positive or negative) teachers and methods. When you write about your efforts to get through to your math students, I detect the signal of a teacher where that 2 out of 15 impact can come through and make a difference. I think that signal would show up in something like my ‘seeking alpha’ analysis too. On the other hand, it’s hard to capture ‘motivation’ in my model, but I think it would reveal itself in the regression analysis because when you see smart kids with crap grades you have a good proxy for poor motivation.

Second, he’s right about what the progressives think (or what they pretend to think). They claim that differences in innate intelligence and motivation are small and not determinative. What really matters is money, the latest thinking on technique, and good teachers with lots of that magical alpha. They are bitter clingers to this lousy theory.

So, I agree with you mostly. I think we can evaluate teachers, but that all we would discover is that only a few teachers make a difference, positive or negative, and that even they still have a pretty small impact.

My question for you is whether you think it is in the interests of most teachers to have this result verified. Do teachers benefit more on net from the current erroneous narrative and the ‘hollywood transformatic mystique’ that people imagine about their profession, or would they be better off if everyone just knew the plain truth, like everyone used to know a few generations ago? What are their interests, and what are their perceptions of their own interests?

• I would say that liberals think motivation is extremely important. Motivation and curriculum (pedagogy). But yes, I now see what you’re saying.

I think a lot of my success (if it is success) comes from a combination of curriculum and instruction. I am, to put it in ed school terms, very good at Vygotsky. I know how to find the zone of proximal development. Progressives tend to put it in the wrong place. Traditionalists and reformers set it too high.

So if you evaluated teachers for that 2 out of 15 impact, the value would come not to the teacher, but in possibly finding information we could replicate. That’s what everyone’s trying now, of course, but they’re looking for love in all the wrong places.

• Handle says:

That’s a good point, and I like that looking for love line.

Lately, I’ve been thinking about the problem of nudging progressives away from their unrealistic vision of humanity and closer to reality. How do you nudge them from the root problem of their crazy achievement formula to a more accurate one, as a new basis for policy and debate?

• I think progressives are coming to that viewpoint as a result of their argument with reform. They’re still going to ask for money, but less for schools and more for services. Which isn’t necessarily a bad thing, so long as we quit pretending that services will improve scores.

Another interesting viewpoint is that of the critical pedagogists, which is as far from us as it is possible to get. Paul Thomas is a good guy to read on this. They are way out there radical, but one of their positions is look, stop judging black, Hispanic, and poor people by “white” standards. Give them empowerment, give them agency, give them the ability to define their own world. You can see in there the glimmerings of what is needed, coming at it from exactly the opposite direction.

• Handle says:

That is encouraging. I feel like I should do a full interview with you.

You often give charters and voucher schools a hard time because, I assume, they make claims that they achieve results because of factors aside from what is really population management.

But if we’re going to break out of the one-size-fits-all mold, allow people to go their own way, and get a genuinely diverse and competitive free market in schooling, then what other initiatives would be better as a thin edge of the wedge?

• I give charters a hard time for a lot of reasons, but most of all because we should not be allowing choice with public dollars. Full stop. Here are some articles spelling out why:

If we break out of “one size fits all”, we can do it within the purview of public education. If we are going to succeed, we have to change some of the disparate impact requirements hurting public schools. Charters aren’t bound by them.

• Handle says:

I respect your position, and agree with you completely on disparate impact, but I disagree about public money. What makes public school such an exception? We give people vouchers for all kinds of public benefits and allow them to use the money to choose whatever arrangement they think works best for them. For an educational example, the GI Bill, which has worked out great for lots of my friends on the ‘the public pays, the student chooses’ model.

Parents who are frustrated with their local public school feel exasperated by a sense of futility and like they are Sisyphus pushing a boulder when confronting massive, sclerotic, and change-averse governmental bureaucracies. Consider that example Charles Murray gave about the hard time Maryland was giving to that Classics school. It’s hard to imagine openness to parent desires, inputs and change in general from such institutions. Many of my friends with kids tell different versions of the same horror story.

Finally, and this is just a hunch about the politics of the matter, building a constituency to take on the education establishment is probably better accomplished by letting as many people flee the public system as possible. My own children are in a public school, and I am satisfied with it, but I already know that if something happened and I wasn’t, then I’d be stuck.

Speaking of Murray, have you read / do you have any thoughts on his Real Education?

• “We give people vouchers for all kinds of public benefits and allow them to use the money to choose whatever arrangement they think works best for them.”

No, we don’t. We attach strings. I doubt very much a GI could go to a school that openly preached anti-Semitism or promoted a racist agenda. Welfare dollars have all sorts of strings.

More importantly, school isn’t an individual benefit provided at public expense. The rationale for public education is its benefit to society.

And if you disagree with that, fine, but I don’t further debate charters with anyone who seriously thinks public school is a benefit for parents provided at taxpayer expense. It’s historically incorrect and there’s no point in the discussion.

• Handle says:

Yes, you’re right, there are strings attached. I tend to prefer fewer strings.

I think public education is a mixed bag of private and public benefits, and that these are very hard to define and tease out. There are things the parents get out of it, the students get out of it, and the public at large gets out of it. I agree with you completely about the history and the public benefit rationale, so I hope I didn’t leave the wrong impression.

I am unsure whether the same public benefit can be achieved in non-public schools. I would prefer a more decentralized system with only a few strings attached, but I concede that I don’t know, and can’t prove, whether it would achieve the same public benefit.

Is there any good research on that subject? Or maybe just essays, since it is something that seems that it would be difficult to assess empirically.

One line of argument I’ve encountered is that the charter system will just create schooling apartheid, and a very segregated and stratified structure where students are isolated in homogenous groups of class and race, and that this isolation will be harmful to social cohesion and mutual understanding. I think that’s probably valid.

• “I think public education is a mixed bag of private and public benefits, and that these are very hard to define and tease out. There are things the parents get out of it, the students get out of it, and the public at large gets out of it. I agree with you completely about the history and the public benefit rationale, so I hope I didn’t leave the wrong impression.”

Fair enough. I wouldn’t disagree with this. I’d also say the courts have been annoyingly imprecise on this. Courts are really bad on public ed generally.

• asdf says:

Going to a charter got me out of my shitty public school. Instead of probably being miserable, doing drugs, and dropping out I went to one of the best schools in the country. I definitely ended up on a way better life path because of charters.

“One line of argument I’ve encountered is that the charter system will just create schooling apartheid, and a very segregated and stratified structure where students are isolated in homogenous groups of class and race, and that this isolation will be harmful to social cohesion and mutual understanding. I think that’s probably valid.”

This is a good thing. People across too big of a divide can’t interact, it just breeds conflict and dysfunction. Also, I was way more tolerant of minorities when I didn’t actually interact with any of them. Once you actually start doing that you become a racist fast.

NAMs are basically a parasitic class we want to isolate and keep from doing any more damage to the system then is unavoidable. Keeping them away from our children is a good start.

All this talk about stratification would have some more substance to it if we were talking about white Christian upper class people interacting with white Christian working class people who shared many of the same core values but just happened to have a bit different economic and cognitive context. That’s the good kind of diversity. You can actually grow from that because the fundamental structure is sound and you have a common vocabulary, genetics, etc to draw on. Unfortunately even this is becoming out of reach as the white working class NAMifies itself.

• anon says:

> I don’t further debate charters with anyone who seriously thinks public school is a benefit for parents provided at taxpayer expense. It’s historically incorrect

In linguistics that’s called the etymological fallacy. Why do we care about why public school originated?

In a debate about affirmative action in college admissions, would you ignore anyone who thought it was a benefit for URMs provided at the expense of Asians and whites? That’s legally incorrect today — affirmative action in education is illegal if “intended” to benefit its beneficiaries, and legal if “intended” to benefit the whites who do get in by exposing them to the URMs who otherwise wouldn’t be around. As I understand it, that’s the whole reason we call URMs “diverse” instead of, say, black.

Saying it’s not worthwhile to discuss an issue with someone because their view of it is “historically incorrect” also amounts to saying that in any baptists-and-bootleggers scenario, the bootleggers don’t exist.

• giganigga says:

Multiplication is commutative.

Achievement = 8 Intelligence * 4 Motivation * 1 Resources * 1 Pedagogy * 1 Instruction

= 32 * Intelligence * Motivation * Resources * Pedagogy * Instruction = 32 IMRPI

I thought I might be misunderstanding him and that he didn’t actually publish such an egregious mistake, but a quick search turns up the page where he confirms he is talking about multiplication:
“Elements are multiplied so if any term is ‘0,’ the final result is ‘0.’”

I assume he is trying to say intelligence is 8 times more important than other factors, but it appears he is lacking the combination of these factors that allows understanding of multiplication.

• Handle says:

You are correct, he should have used an additive formula (where the weights are correlation coefficients), all multiplied by the normalized product of IMRPI, if he wants it to zero out for zero on any factor.

• giganigga says:

I agree that ability is very important, but I think it acts as a constraint, not as a factor. Therefore I believe that alpha is possible where expectations are less than constraints. This normally occurs at the very low end and the very high end.

Half of Detroit is functionally illiterate. It would be fairly easy to correctly predict who is going to remain illiterate- the expectation is illiteracy for Joe. While Joe’s low IQ will prevent him from ever learning calculus, he has the mental capability to learn to read. If a teacher can get him off the block and into a book, he can rise to the level of his ability as someone who can read.

On the other hand, smart people can learn calculus from wikipedia and essentially max out their test scores.** A good or bad teacher will not change that, so their expectation is a 5 on the AP test. The max expectation is a pretty low bar for a bright kid. However, good teachers can take students much further than this, hopefully up to the constraints of his ability.

**While I am not in general anti-SAT, I think most standardized tests do an awful job at sorting the high end of the curve, with the focus being on not making mistakes on easy problems. It is also very difficult to measure how much value a teacher added when smart people’s scores remain essentially perfect. This is especially frustrating given that there are tests that are widely used that DO distinguish well (AMC, AIME, other academic Olympiad qualifying tests).

• Handle says:

I tend to think of anybody two standard deviations or more, above or below the mean, as a special case who warrants a different kind of approach.

3. VXXC says:

The Fallacy of Bona Fides; assuming good will of the other party.

If this fallacy isn’t categorized yet, it should be.

Perhaps we can call it the American Fallacy.

4. Toddy Cat says:

“If we could even catch up with 50 years ago, we’ll move far ahead of where we’re at today.”

This applies to a lot of areas other than just education. I’ve often had this discussion with Reactionaries – they want to go back to the High Middle Ages or the Congress of Vienna, and I’m thinking, “Hell, I’d settle for the freakin’ Kennedy Administration!”.

• Handle says:

I’ve seen the usual retort as, “Well, the 1950’s just gave birth to the present day, so it would all just happen again,” with a typical rejoinder of, “But we’d have the knowledge and experience of how it all went wrong, and we’d try to prevent that and keep things stable and call people agitating for change dunces who can’t learn from history and suppress them mercilessly.”

In my own case – and in the case of almost every stable, healthy, and happy family I know, whatever their politics and ideology and thoughts about life outside their house – we tend to create a little 1950’s bubble within our home, and accommodate modernity as best we can, and exclude it when we can’t.

Joshua 24:15, “And if it seem evil unto you to serve the Lord, choose you this day whom ye will serve; whether the gods which your fathers served that were on the other side of the flood, or the gods of the Amorites, in whose land ye dwell: but as for me and my house, we will serve the Lord.”

New Version: “And if it seem evil unto you to live like they did in the 1950’s, then you should choose how you want to run your family life, whether the way your ancient pagan ancestors lived, or as polyamorists do today, in whose land ye dwell: but as for me and my house, we will choose traditional nuclear monogamy.”

5. VXXC says:

The common practice and concept of restore or roll back to the last known good point or configuration is one which the moderns cannot conceive of in terms of society. Of course 50 years ago was merely the last time we were normal. Before every vice and disorder, fringe and freak became the follow-on exploitation phase to Civil Rights, the Camp Followers commandeering the Army. To us, social and moral chaos with all its attendant evils is what seems normal, but of course it’s not. Normality will resume, as it did in Russia for instance, and posterity will wonder why we stood the Queer Terror so long.

Of course we could call it many “Terrors”; I’m just using a catchy and current one.

6. Z says:

I fully admit to not having read through the entire post. I can’t get past the idea of measuring teachers. If it is possible, it is unacceptable. If it is impossible, then there is no point. This is probably why magical thinking is so dominant in pedagogy. Anyone the least bit empirically minded throws up their hands and walks away, leaving the field open to hucksters and grifters.

• Handle says:

“If it is possible, it is unacceptable.”

Unacceptable to you, or to whom?

• Z says:

I’m indifferent. I’m also not the one deciding what is and what is not acceptable. That would be the teacher unions and their allies in state government, as well as a large chunk of the parents.

• Handle says:

That doesn’t make sense. Sure, teachers unions are usually against it, but that’s because they don’t want teachers to get unjustly blamed for poor students by a bad assessment method, and they still like the seniority and credentials approach to pay, the justification for the disparities of which would be undermined by accurate performance evaluation.

Unions have a need to push back without having to rely on the ugly truth as the basis for their argument.

But parents and ed reform policy wonks and local governments are usually gung-ho on teacher evaluation. Look at the LAUSD.

• Anthony says:

Teachers’ unions are also ideological, and they hold an ideology that is *explicitly* anti-evaluation, not just because of the possible unfairness, but because they don’t believe that teachers are substantially different. Their ideology also recognizes the “racist” effect of actually factoring in relevant factors, and therefore is against it.

…but they should be agitating for this kind of evaluation program so they can prove their case.

Besides the reasons above, the other problem is that if all teachers are pretty much the same in effectiveness, that removes a major reason for paying them more. Or, even worse (from the union perspective) they might find that effectiveness correlates negatively with seniority. There’s already a lot of pushback on class size because the research showed it didn’t make any real difference.

• Handle says:

I mentioned the seniority thing in another comment, but class size impact is a good point.

7. Anthony says:

Rating individual teachers, even after you’ve made all the necessary adjustments, is going to be a crapshoot. You’ve got maybe 30 kids (or kid FTEs) where you have an influence on their scores – random statistical variation is going to be brutal. And getting enough data for something statistically valid will take a significant portion of a teacher’s career.
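To put rough numbers on the sample-size problem, here is a minimal sketch. It assumes test scores are standardized so that student-level scatter has a standard deviation of 1, that the true spread between teachers is 0.1 of a student standard deviation, and that each year brings a fresh, independent class of 30. All three figures, and the function names, are illustrative assumptions rather than anything measured in this thread.

```python
import math


def class_mean_standard_error(student_sd, class_size, years=1):
    """Standard error of a teacher's average class score over `years` of classes."""
    return student_sd / math.sqrt(class_size * years)


def reliability(teacher_sd, student_sd, class_size, years=1):
    """Share of the variance in observed class averages that reflects true
    teacher differences rather than student-level noise."""
    signal = teacher_sd ** 2
    noise = class_mean_standard_error(student_sd, class_size, years) ** 2
    return signal / (signal + noise)


def years_for_reliability(target, teacher_sd, student_sd, class_size):
    """Smallest whole number of years needed to reach the target reliability."""
    # signal / (signal + noise) >= target, with noise = student_sd^2 / (n * years)
    years = (student_sd ** 2 * target) / (
        class_size * teacher_sd ** 2 * (1 - target)
    )
    return math.ceil(years)


# Hypothetical inputs: 30 students, teacher spread of 0.1 student SDs.
ONE_YEAR_RELIABILITY = reliability(teacher_sd=0.1, student_sd=1.0, class_size=30)
YEARS_NEEDED = years_for_reliability(
    target=0.8, teacher_sd=0.1, student_sd=1.0, class_size=30
)
```

Under these assumed figures, a single year's class average is mostly noise, and reaching a reliability of 0.8 takes on the order of a decade or more of classes. Adjusting for prior-year scores would shrink the student-level scatter and shorten that, so treat the output as an order-of-magnitude illustration only.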

• Handle says:

The same thing applies to fund managers.

And the result of a lot of random variation itself has a meaningful interpretation, that most teachers are less important to achievement than other factors.

• Anthony says:

If what you want to do is find out whether there are any teacher characteristics which actually make a difference, that’s great. And that data is largely available for mining. But if you want to actually evaluate individual teacher performance, the small sample size makes the evaluation inherently unfair because it’s too random.

• Yes, the required caveat to this marvellous idea is more or less the entire contents of Taleb’s ‘Fooled by Randomness’ on the subject of identifying high-performing investors, in direct proportion to how much importance is attached to the final data on the individual teachers themselves, as opposed to the knowledge gained in the intermediate qualification and regression steps.

This caveat itself has the caveat that the input data are, by their nature, dramatically more moderate and regular than market variables, being restricted to finite ranges of outcomes, with no scope for alteration of the ‘portfolio’.

• Handle says:

I think that people are missing the forest for the trees.

The point is that there is a kind of politically acceptable and culturally dominant implicit model of the factors which influence student achievement and account for various disparities and this model is erroneous and pernicious. It is not just about beliefs; teachers are being unjustly evaluated and criticized on the basis of this bad model right now, and it dominates our discourse and politics and ruins any chance of genuinely productive reform.

The point is to try and nudge the conventional wisdom on this topic in the direction of reality. One needs to put the bad model to the test, demonstrate it is irremediably faulty, and shift both frames of the Overton window to the right.

A single year of doing a single massive study which uses this technique would be remarkably beneficial in providing the necessary evidence, and even a result that showed that evaluative measurement was hopeless would be a positive result.

The reason it is not done is because we already have a good idea of what we would find, and it’s too terrible to look at, and the illusion that is only plausible in a state of ignorance is too useful to certain political actors.

So yes, by all means criticize the possibility of evaluation even under an accurate model, but understand that right now the policy wonks believe in both the meaningfulness of evaluation, and their false paradigm. Good data might just help to break them off their dual addiction to both species of junk.

• asdf says:

“Good data might just…”

What data would you submit to prove the idea that good data would cause some kind of change? I think people are being very empirical in realizing the uselessness of empirical study.

8. Steve Sailer says:

It would be interesting to know what percentage of the two sigma alpha teachers cheated on testing their students. Similarly, it would be interesting to know what percentage of two sigma alpha hedge fund managers cheat in some fashion.

• Handle says:

Yes, extraordinary performance in an efficient market is inherently suspect and more likely to be the result of cheating than talent or ‘economies of scale’.

A good way to test for that in teachers is to see how the kids do at the beginning of the next term and look for inconsistency. But then the next chess move is to make the whole school a 2 sigma cheating school. But, as we’ve seen, that requires a lot of people to keep a secret, which is hard to sustain for long.
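The consistency check described above could be sketched roughly as follows. The idea is to compare each class's end-of-term average with the same students' average at the start of the next term, and to flag classes whose drop is far out of line with the rest. The function name, the class data, and the cutoff of 3 are all hypothetical, and the median-based outlier rule is just one simple choice of method.

```python
from statistics import median


def flag_suspicious_classes(classes, cutoff=3.0):
    """Return names of classes whose score drop between end of term and the
    start of the next term is an outlier relative to the other classes.

    `classes` maps a class name to (end_of_term_mean, next_term_start_mean).
    Outliers are judged against the median drop, scaled by the median
    absolute deviation (MAD), which a single extreme class cannot inflate.
    """
    drops = {name: end - start for name, (end, start) in classes.items()}
    typical_drop = median(drops.values())
    mad = median(abs(d - typical_drop) for d in drops.values())
    if mad == 0:
        # No spread among typical classes: flag anything above the median drop.
        return sorted(name for name, d in drops.items() if d > typical_drop)
    return sorted(
        name for name, d in drops.items() if (d - typical_drop) / mad > cutoff
    )


# Hypothetical class averages: (end of term, start of next term).
EXAMPLE_CLASSES = {
    "A": (72, 70),
    "B": (68, 65),
    "C": (75, 74),
    "D": (70, 68),
    "E": (66, 63),
    "F": (90, 75),  # large gain that mostly vanishes by the next term
}
FLAGGED = flag_suspicious_classes(EXAMPLE_CLASSES)
```

A flag here is only a prompt for a closer look, since ordinary summer learning loss or a change in test form could produce the same pattern. It also does nothing against the school-wide case, where every class shows the same inflated drop and none stands out.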

9. Hadn’t realized this conversation went on.

From above: Anthony on unions.

“because they don’t believe that teachers are substantially different.”

Well, yes. That is, largely, the point of unions, that all workers are roughly the same level.

“, the other problem is that if all teachers are pretty much the same in effectiveness, that removes a major reason for paying them more. Or, even worse (from the union perspective) they might find that effectiveness correlates negatively with seniority. There’s already a lot of pushback on class size because the research showed it didn’t make any real difference.”

On the first, sez who? As I wrote in a followup post, all sorts of professions get paid more without regard to effectiveness. As to the second, unions objected to class size reductions originally.

On cheating: a teacher could be telling kids the answers to the test while it’s going on. Or the teacher could be keeping the tests after they’re over and fixing them.

Pretty much every teacher I know (a small group, I grant you) can not see how cheating goes on without the active cooperation of administration or kids. We have to get those tests back fast.

• Handle says:

“Pretty much every teacher I know (a small group, I grant you) can not see how cheating goes on without the active cooperation of administration or kids.”

This sounds right. And it’s not like the press is champing at the bit to investigate whether certain urban feel-good ‘success stories’ have a worm in the shiny apple. The secret conspiracies make sense though, because everyone has aligned incentives and gains in their own way from higher scores. Who gains by stopping or exposing it? Is there a million-dollar whistle-blower payout I don’t know about?

11. P. Gauge says:

Z Blog has this one summed up nicely as: “A well reasoned technocratic reform plan that has no chance of adoption.”

• Handle says: