We’ve all been there: You’ve had a couple drinks, you’re having fun talking with someone, then you blurt out a controversial opinion and everything goes belly up. Maybe your interlocutor scolds you, maybe they just walk away, or maybe nothing happens but there’s gossip a week later…
If you have controversial opinions, what you need is a method for knowing — in advance — whether your conversation partner can handle them. It needs to be simple and quick enough to be practical, but it needs to be scientific enough to offer real predictive validity.
It recently occurred to me that there exists a statistical technique that solves exactly this problem. It’s called recursive partitioning, and the practical tool it produces is called a decision tree. If you have data on public opinion and demographic variables, you can use statistics to determine which chain of questions will give you the best guess about someone’s position on any given issue. If we create a decision tree to predict a person’s position on suppressing naughty opinions, then we have a simple, practical, and scientifically valid “life hack” for avoiding IRL flame wars.
I did this last week and the results are very interesting. If you’re interested in the statistical details, or you’d like to run the code yourself (perhaps on a different outcome variable), you can find all of that here. In this post, I’ll focus on the social and practical implications.
Here’s all you need to know about the stats. In this analysis, “being a wokescold” is proxied by whether someone thinks racist speakers should be disallowed. As possible predictors, I included a handful of variables that are reasonable to ask someone about or easy to observe yourself:
- sex/gender = variable named sex
- race = variable named race
- left/right identification = variable named pol
- family income = variable named realinc
- college attendance = variable named college
- word knowledge or verbal skill (proxy for IQ) = variable named wordsum
I then conducted recursive partitioning, which breaks the data down into the sequence of branches giving the most predictive traction over the outcome variable.
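To make "recursive partitioning" concrete, here is a minimal sketch in Python. It is not the code behind Figure 1: the eight rows are made up for illustration, the predictors are simplified to yes/no answers, and Gini impurity is one common splitting criterion that I am assuming here. The idea is to pick the question whose answers leave the purest groups, then repeat inside each group.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a perfectly pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, predictors, outcome):
    """Return (predictor, weighted impurity) for the yes/no question that
    leaves the purest pair of child groups, or None if nothing splits the rows."""
    best = None
    for p in predictors:
        yes = [r[outcome] for r in rows if r[p]]
        no = [r[outcome] for r in rows if not r[p]]
        if not yes or not no:
            continue  # this question does not separate anyone
        weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or weighted < best[1]:
            best = (p, weighted)
    return best

def grow(rows, predictors, outcome, max_depth=2):
    """Recursively partition rows into a nested dict of questions and leaves."""
    labels = [r[outcome] for r in rows]
    split = best_split(rows, predictors, outcome) if max_depth > 0 else None
    if split is None or split[1] >= gini(labels):
        # Leaf: report the share of positive outcomes in this group.
        return {"share": sum(labels) / len(labels), "n": len(labels)}
    p = split[0]
    rest = [q for q in predictors if q != p]
    return {
        "ask": p,
        "yes": grow([r for r in rows if r[p]], rest, outcome, max_depth - 1),
        "no": grow([r for r in rows if not r[p]], rest, outcome, max_depth - 1),
    }

# Hypothetical toy data, NOT the survey data used in this post.
rows = [
    {"college": 1, "male": 1, "scold": 0},
    {"college": 1, "male": 0, "scold": 0},
    {"college": 1, "male": 1, "scold": 0},
    {"college": 1, "male": 0, "scold": 1},
    {"college": 0, "male": 1, "scold": 1},
    {"college": 0, "male": 0, "scold": 1},
    {"college": 0, "male": 1, "scold": 1},
    {"college": 0, "male": 0, "scold": 0},
]

tree = grow(rows, ["college", "male"], "scold")
print(tree)
```

On this toy data the first question chosen is college, because splitting on it leaves purer groups than splitting on sex. Real implementations also handle numeric predictors like income, and prune branches that do not help enough.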
Figure 1 plots the resulting decision tree.
The graph is fairly intuitive, and if you’d like to understand the numbers better, see my more technical post over at jmrphy.net. Here I will give you a more concise and practical translation, resulting in a simple heuristic you can memorize.
If you meet a random person, there’s a 38% chance they’re a wokescold (defined as wanting to suppress racist speakers; one can debate this, but whatever, it’s a decent proxy).
The very first and most important question you can ask someone, to avoid a flame war, is: “Did you ever go to college?” If they say yes, the probability of them being a wokescold drops to 29% and that’s your best guess: They are probably not a wokescold. Nothing else will improve your guess from this point (at least not among the variables we selected).
Now, many of you will say: But it’s the college-educated wokescolds one should be most afraid of! True. The limited utility of this analysis is also its primary social-scientific value: It reminds us that college-educated wokescolds remain a relatively minor anomaly, quantitatively speaking. Being educated still means you’re much more likely to support unsavory expression. It’s true that educated wokescolds are often the most dangerous landmines we’d like to tiptoe around, and unfortunately my particular analysis this week will not help you on this front. Fortunately, I have an alternative algorithm custom made for this use case: If they went to college and they’re also a female with dyed hair, hold fire on your nuclear takes: They are probably a wokescold. Unless they’re Amber Frost.
If they never went to college, the next question you have to ask yourself is whether they're smart. You probably don't want to give them a vocabulary test, but conversation is pretty revealing. If they are smart, you infer they are not a wokescold (40% chance). If they are dumb, it's now a coin flip (50%).
If they are dumb, the next question is their race. This you can probably guess yourself. If white, this bumps them very slightly toward not being wokescolds (48%). If non-white, this bumps them toward being wokescolds (57%). From here:
If they are white and male, there's a 45% chance they’re a wokescold so you infer they are not — and that’s your final guess. If they are white and female, you should see if their family is rich or not. If rich, they are slightly less likely than a coin flip to be a wokescold (46%); if poor, they're slightly more likely than a coin flip to be a wokescold (54%).
If they are dumb and non-white, there is a 57% chance they’re a wokescold and that’s your best guess.
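The walk through the tree above can be written as a small Python function. This is only a sketch: the percentages are the ones quoted in this post, but the function names and the reduction of "smart" and "rich" to yes/no answers are my simplifications here, since the cutoffs the model actually used are in the technical post, not this one.

```python
def wokescold_probability(college, smart=False, white=False, male=False, rich=False):
    """Leaf probabilities as quoted in the post. Arguments are yes/no
    simplifications (assumption); later questions are ignored once a leaf is reached."""
    if college:
        return 0.29  # went to college: final guess
    if smart:
        return 0.40  # no college, but smart
    if not white:
        return 0.57  # no college, not smart, non-white
    if male:
        return 0.45  # no college, not smart, white, male
    # No college, not smart, white, female: family income decides.
    return 0.46 if rich else 0.54

def best_guess(probability):
    """Guess 'wokescold' only when it is more likely than not."""
    return probability > 0.5

p = wokescold_probability(college=False, smart=False, white=True, male=False, rich=False)
print(p, best_guess(p))
```

Note how few questions each path needs: one for anyone who went to college, and at most five for everyone else.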
A heuristic you can memorize
(This only applies in America, mind you, the land of the free.)
- If they’re a female who signals creativity or virtue (e.g., dyed hair, bumper stickers), don’t share any edgy takes (this is post-hoc to the model, just a precaution in light of data limitations and researcher experience).
- If they went to college, they’re probably not a wokescold. You may gradually begin to share your edgy takes.
- If they did not go to college, but speak more intelligently than average, they are probably not a wokescold. You may gradually begin to share your edgy takes.
For all others, the safest decision rule is to not share edgy takes.
Bonus rule, only if you have mastered the 3-step algorithm above and have an appetite for risk:
- If they are rich white people, you may gradually begin to share your edgy takes.
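Because the order of the rules matters (the precaution comes before the college rule), here is the heuristic as a hedged Python sketch. The function and argument names are made up for illustration, and every input is your own yes/no judgment call about the person in front of you.

```python
def may_share_edgy_takes(signals_virtue_female, college, speaks_intelligently,
                         rich_white=False, risk_appetite=False):
    """Return True if the heuristic says you may gradually begin sharing edgy takes.
    Rules are checked in the order given in the post; names are hypothetical."""
    if signals_virtue_female:
        return False  # step 1: post-hoc precaution, overrides everything else
    if college:
        return True   # step 2: went to college
    if speaks_intelligently:
        return True   # step 3: no college, but speaks more intelligently than average
    if risk_appetite and rich_white:
        return True   # bonus rule, only for those with an appetite for risk
    return False      # everyone else: safest not to share

print(may_share_edgy_takes(signals_virtue_female=False, college=True,
                           speaks_intelligently=False))
```

Even a True here only licenses mild takes at first; it is a starting guess, not a guarantee.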
What about ideological identification?
The most intriguing result here, to my mind, is that ideological identification totally drops out — it appears to have no predictive power! As I wrote in my technical post:
[That ideological identification has no predictive power] is fascinating, given that many people today tend to think of speech suppression as a fashion on the educated Left! And it is, but that's only a highly visible minority. Political scientists would not be surprised by this result: We've long known that leftists and educated people are always more supportive of free expression (you just don't hear about those people in the media right now).
Please note that the model here does not provide especially satisfying statistical discrimination. It’s better than nothing, but one must still proceed carefully. Always begin by sharing mildly provocative takes and watching your interlocutor’s reactions. Do not advance to nuclear takes until several acts of mild edgelording produce only smiles, laughter, or excited edgy reciprocity. With additional data and more sophisticated modeling, we may hope to derive more confident predictions for more ambitious social maneuvering. Until then, be careful.