By Facebook's Logic, Who Is Protected From Hate Speech?

The social media company's rules for determining what constitutes a protected class don't always allow for nuance.

For months now, social media companies have been grappling with how to minimize or eradicate hate speech on their platforms. YouTube has been working to make sure advertisers' content doesn't show up on hateful videos. Instagram is using AI to delete unsavory comments. And earlier this week, ProPublica reported on the internal training materials Facebook gives the content moderators who review comments and posts on its platform, teaching them how to calculate what is and isn't hate speech.

According to the report, the rules use a deliberate, if strange, logic to determine which classes of people are protected from hate speech and which are not. ProPublica points to a specific example from the training materials: Facebook's rules dictate that “white men” is a protected class, whereas “black children” is not.

How the Rules Work

According to Facebook's rules, there are protected categories (like sex, gender identity, race, and religious affiliation) and unprotected categories (like social class, occupation, appearance, and age). If an attack targets the former, it's hate speech; if it targets the latter, it's not. So, “we should murder all the Muslims” is hate speech. “We should murder all the poor people” is not.

This binary designation might make some uncomfortable, but it’s when protected and unprotected classes get linked together in a sentence---a compound category---that Facebook’s policies become extra strange. Facebook’s logic dictates the following:

Protected category + Protected category = Protected category

Protected category + Unprotected category = Unprotected category

To illustrate this, Facebook's training materials provide three examples (“white men,” “female drivers,” and “black children”) and state that only the first is protected from hate speech. Why? Because “white” + “male” = protected class + protected class, and thus the resulting class of people is protected. Counterintuitively, because “black” (a protected class) modifies “children” (not protected), that group is unprotected.
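To make that arithmetic concrete, here is a minimal sketch of the compound-category rule in Python. It is purely illustrative: the attribute set, term lists, and function name are hypothetical, not Facebook's actual moderation code.

```python
# Illustrative sketch of the compound-category rule ProPublica describes.
# The attribute set and function here are hypothetical, not Facebook's real system.

PROTECTED_ATTRIBUTES = {"white", "black", "male", "female", "muslim"}  # race, sex, religion, etc.
# Terms like "children", "drivers", or "poor" (age, occupation, class) are unprotected.

def is_protected(group_terms):
    """A compound group is protected only if every term names a protected attribute."""
    return all(term in PROTECTED_ATTRIBUTES for term in group_terms)

print(is_protected(["white", "male"]))      # True:  "white men" is protected
print(is_protected(["female", "drivers"]))  # False: "female drivers" is not
print(is_protected(["black", "children"]))  # False: "black children" is not
```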

Math + Language = Murky

In math, this kind of rule-setting is the province of symbolic logic, which comes with well-understood rules for combining categories. The discipline was pioneered in the nineteenth century by mathematician George Boole and has since become essential to the development of everything from computer processors to linguistics. But you don't need a PhD in logic or the philosophy of language to recognize when its basic rules are being violated. “Where did @facebook’s engineers take their math classes? Members of subset C of set A are still members of A,” tweets Chanda Prescod-Weinstein, an astrophysicist at the University of Washington.

Philosophers of language think a lot about how modifying a category alters the logic of a sentence. Sometimes when you take a statement about a category (like white people) and narrow it to a subset of that category (like white murderers), the inference holds; sometimes it doesn't. For instance, take the sentence “All birds have feathers” and replace it with “All white birds have feathers.” The second sentence still follows from the first; it's a good inference. But take “Some bird likes nectar” and replace it with “Some white bird likes nectar,” and the new sentence may no longer be true (maybe only green birds like nectar). It's a bad inference.

Facebook's rules appear to assume that whenever a protected category is modified by an unprotected one, the inference is bad. So even though “black people” is a protected class, under Facebook's rules it doesn't follow that “black children” is one too, despite the fact that the average person looking at that example would say black children are a subset of black people.
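For contrast, the subset-based reading that Prescod-Weinstein's tweet points to would go the other way: a modifier only narrows a protected class, and the narrowed group's members still belong to it. Again, this is a hypothetical sketch, not any platform's actual policy code.

```python
# Hypothetical contrast: a subset-based reading, in which a modifier merely
# narrows a protected class, so the narrowed group keeps its protection.

PROTECTED_ATTRIBUTES = {"white", "black", "male", "female", "muslim"}

def is_protected_subset_reading(group_terms):
    """Protected if any term names a protected class; 'black children' are
    still members of the protected class 'black people'."""
    return any(term in PROTECTED_ATTRIBUTES for term in group_terms)

print(is_protected_subset_reading(["black", "children"]))  # True under this reading
```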

The fact is, there isn't a way to know systematically whether replacing a category with a subcategory will lead to a good or bad inference. “You have to plug in the different examples,” says Matt Teichman, a philosopher of language at the University of Chicago. “You have to just look at the complexity of what’s happening to see for sure.”

Teichman muses over one example that might support Facebook's rule: “White murderers should all die.” “Whenever I come across wacky policies like this I try to think, is there any conceivable way to justify it?” he says. There, the modifying subset, murderers, has done something genuinely bad, so maybe it makes sense to be able to direct hateful speech at them. But a murderer's race is, or at least should be, completely irrelevant to that badness, and including race in the sentiment seems problematic at best.

Knowing the Rules Changes the Game

Now that people know what Facebook's rules are, there are plenty of ways to game them. For instance, under the rules, an attack on “radicalized Muslims” isn't hate speech, because the modifier “radicalized” makes that group unprotected. By simply applying a modifier to a protected class, a person can perpetuate a stereotype and disparage a subcategory while dog-whistling about the whole group, all without breaking Facebook's rules.
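Run through the hypothetical all-terms-protected sketch from earlier, the modifier flips the result:

```python
# Hypothetical: the earlier all-terms-protected test, applied to this example.
PROTECTED_ATTRIBUTES = {"muslim"}  # religious affiliation is a protected category

def is_protected(group_terms):
    return all(term in PROTECTED_ATTRIBUTES for term in group_terms)

print(is_protected(["muslim"]))                 # True:  "Muslims" is protected
print(is_protected(["radicalized", "muslim"]))  # False: "radicalized Muslims" is not
```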

“There’s an interesting legal difference between literal meaning and implied meaning. You're on the hook for what you literally say, but you can often kind of weasel out of really being committed to the stuff you just implied,” says Teichman.

Following the rules can very quickly become an exercise in absurdity. One could modify a protected group in ways that cover larger and larger swaths of it. Take, for example, posting “Black children shouldn't be allowed in our town” alongside “Black adults shouldn't be allowed in our town.” Together, the two statements target the entire black community, and neither breaks Facebook's rules. And the gaming doesn't end there. Just by modifying a protected group's name with a description of appearance (“ugly,” “fat”), one can add insult to demeaning injury.

Looking at the rules as a whole, ProPublica reports that Facebook developed them in reaction to complaints from specific actors, such as governments and users. At one point the rules were open-ended, including a general instruction that said, “‘Take down anything else that makes you feel uncomfortable,’” says Dave Willner, a former Facebook employee, in the ProPublica report. Willner revised the rules to make them more rigorous. The result, Teichman says, appears to be a patchwork constructed not from some top-down ethical determination but from a list slapped together over time. “Categories get this hodgepodge when they're just the result of being stitched together out of complaints people made,” says Teichman.

When asked about its stance on these issues, Facebook pointed WIRED to a statement on hate speech released the day before the ProPublica report. Facebook intends to be “an open platform for all ideas,” according to the statement.

Some people see these rules in a more sinister light, suspecting that the policy is designed to protect people as minimally as possible. “Maybe what they're trying to do with this rule is allow a little bit of problematic speech through so the users who want to engage in that will use their site,” says Teichman.

Ultimately Facebook’s rules don’t account for the subtlety of language or the nuanced historical and social issues that made the categories protected in the first place. “On one hand you have to love the elegance of a policy that explicitly protects white men and not black children because that’s exactly what race and inequality scholars argue colorblind policy and diversity policies do,” tweets Tressie McMillan Cottom, a sociologist at Virginia Commonwealth University. “But on the other hand it is depressing, because it means dozens if not hundreds of people saw that framing and thought it was perfectly fine. They thought it was legal AND smart.”

Some think that hate speech, like pornography, can't be systematically defined. Instead, as Supreme Court Justice Potter Stewart wrote of pornography in Jacobellis v. Ohio, “I know it when I see it.” And that's where bias rears its ugly head, especially now that artificial intelligence is being trained by the very human content moderators it is meant to replace. It's not that the moderators were perfect, but an AI like DeepText, which Facebook-owned Instagram recently rolled out to eliminate mean comments, likely carries the same logical assumptions and power structures, embedded, unseen, in the decisions it makes.