
Instagram's Bold Plan to Block Hateful Comments Using AI

WIRED editor-in-chief Nick Thompson sits down with Instagram CEO Kevin Systrom to talk about the platform's bold plan to use AI to block hateful comments posted by trolls.

Released on 08/17/2017

Transcript

So what I want to do in this story is I want

to get into the specifics of the new product launch

and the new things you are doing and the stuff

that's coming out right now in the machine learning.

But I also want to tie it into a broader story

about Instagram and how you decided to prioritize niceness

and how it became such a big thing for you

and how you reoriented the whole company.

So I'm gonna ask you some questions about the specific

product and then some bigger questions.

I'm down.

All right, so let's start at the beginning.

I know that from the very beginning you cared

a lot about comments.

You cared a lot about niceness, and in fact,

you and your co-founder, Mike Krieger would go in

early on and delete comments yourself.

Yep.

[Nicholas] Tell me about that.

Yeah, not only would we delete comments,

but we did the unthinkable.

We actually removed accounts that were being

not so nice to people.

So, for example, whom?

Well, I don't remember exactly whom,

but the back story is my wife is one of the nicest

people you'll ever meet.

And that bleeds over to me.

And I try to model it.

So when we were starting the app, we watched this video

about like, basically, like how to start a company.

And it was by this guy who started the Lolcats company

or meme, and he basically said to form community

you need to do something.

And he called it prune the trolls.

Nicole would always joke with me.

She's like, hey, listen, when your community's getting rough

you gotta prune the trolls.

And that's something she still says to me today

to remind me of the importance of community

but also how important it is to be nice.

So back in the day, we would go in,

and if people were mistreating people,

we'd just remove their accounts.

And I think that set an early tone

for the community to be nice and be welcoming.

But what's interesting is that this is 2010, right.

Yeah.

And 2010 is the moment where a lot of people

are talking about free speech and the internet,

and Twitter's role in the Iranian Revolution.

So it was a moment where free speech was actually

valued on the internet probably even more than it is now.

How did you end up being more in the prune the trolls camp?

Well, there's an age-old debate between free speech

and like, what is the limit of free speech?

And is it free speech to just be mean to someone?

And I think if you look at the history of the law

around free speech, et cetera, you'll find that generally

there's a line that, like, you don't want to cross

because you're starting to be aggressive or be mean

or racist, and you get to a point where you want

to make sure that in a closed community

that's trying to grow and thrive, you make sure that you

actually optimize for overall free speech.

So if I don't feel like I can be myself,

if I don't feel like I can express myself,

because if I do that, I will get attacked,

it's not a community we want to create.

We just decided to be on the side of making sure

that we optimized for speech that was expressive

and felt like you had the freedom to be yourself.

So one of the foundational decisions at Instagram

that helped make it nicer than some of your peers

was the decision to not allow resharing, right,

and to not allow something that I put out there

to be kind of appropriated by someone else

and sent out into the world by someone else.

How was that decision made, and were there other

foundational design and product decisions that were made

because of niceness?

Well, we debate the reshare thing a lot

because obviously people love the idea of resharing

content that they find.

Instagram is full of awesome stuff.

In fact, one of the main ways people communicate

over Instagram Direct now is actually they share content

that they find on Instagram.

So that's been a debate over and over again.

But really that decision is about keeping your feed

focused on the people you know rather than the people

you know finding other stuff for you to see.

And I think that is more of a testament of our focus

on authenticity and on the connections you actually have

than about anything else.

So after you went to VidCon, you posted an image on your Instagram feed of you and a bunch of the celebrities.

[Kevin] Totally.

I'm gonna read some of the comments.

[Kevin] In fact it was a Boomerang.

It was a Boomerang, right, which got you in a little bit of trouble. Let's get technical here.

I'm gonna read some of the comments on @Kevin's post.

Sure.

These are the comments.

Suck.

Suck.

Suck me.

Suck.

Can you make Instagram have auto scroll feature?

That would be awesome and expand Instagram

as an app that could grow even more.

Meme lives matter.

You suck.

You can delete memes but not cancer patients.

I love meme lives matter.

All memes matter.

Suck.

MLM.

Meme revolution.

Cuck.

Meme.

Stop the meme genocide.

Make Instagram great again.

Meme lives matter.

Meme lives matter.

Meme lives matter.

Mmm, gang.

MLM gang.

I'm not quite sure what all this means.

Is this typical?

It was typical, but I'd encourage you to go to my last post, which I posted for Father's Day.

[Nicholas] Your last post is all nice.

It's all nice.

[Nicholas] It's all about how handsome your father is.

Right.

There are a lot, listen.

He is taken.

My mom is wonderful.

(laughs)

But there are a lot of really wonderful comments there.

So why is this post from a year ago full of cuck and meme lives matter, and the most recent post is full of how handsome Kevin Systrom's dad is?

Yeah, well, that's a good question.

I would love to be able to explain it.

But the first thing I think is back then, there were a bunch of people who, I think, were unhappy about the way Instagram was managing accounts.

And there are groups of people that like to get together

and band up and bully people.

But it's a good example of how someone

can get bullied, right?

The good news is I run the company.

I have a thick skin, and I can deal with it.

But imagine you're someone who's trying to express yourself

about depression or anxiety or body image issues,

and you get that.

Does that make you want to come back

and post on the platform?

[Nicholas] Certainly not.

And if you're seeing that, does that make you want

to be open about those issues as well?

No.

So a year ago, I think we had much more of a problem, but the focus over that year has been on comment filtering, so now, you can go in and enter your own words that basically filter out comments that include those words.

We have spam filtering that actually works really well.

So probably a bunch of those would have been caught up

in the spam filter that we have

because they were repeated comments.
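For concreteness, here is a minimal sketch of the two mechanisms he describes, a user-defined word filter and a repeated-comment spam check. The function names, tokenization, and the three-repeat threshold are illustrative assumptions, not Instagram's implementation:

```python
from collections import Counter

def build_keyword_filter(blocked_words):
    """Return a check that hides comments containing any user-chosen word."""
    blocked = {w.lower() for w in blocked_words}
    def is_blocked(comment):
        return any(tok.strip(".,!?") in blocked for tok in comment.lower().split())
    return is_blocked

def find_repeated_spam(comments, min_repeats=3):
    """Flag comment texts posted many times verbatim (threshold is invented)."""
    counts = Counter(c.strip().lower() for c in comments)
    return {text for text, n in counts.items() if n >= min_repeats}

# Filter a comment stream with both checks.
is_blocked = build_keyword_filter(["cuck"])
stream = ["Suck.", "Suck.", "Suck.", "Lovely photo!", "Cuck."]
spam = find_repeated_spam(stream)
visible = [c for c in stream if not is_blocked(c) and c.strip().lower() not in spam]
print(visible)  # ['Lovely photo!']
```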

And also, just a general awareness of kind comments.

We have this awesome campaign that we started

called #KindComments.

I don't know if you've seen, you know, the late-night shows, they read off mean comments on another social platform.

We started Kind Comments to basically set a standard

in the community that it was better and cooler

to actually leave kind comments, and now, there's

this amazing meme that has spread throughout Instagram

about leaving kind comments.

But you can see the marked difference between

the post about Father's Day and that post a year ago

on what technology can do to create a kinder community.

And I think we're making progress,

which is the important part.

Tell me about sort of steps one, two, three, four, five.

Like, how do you, you don't automatically decide

to launch the 17 things you've launched since then.

No.

[Nicholas] Tell me about the early conversations.

Well, the early conversations were really

about what problem are we solving?

We looked to the community for stories.

We talked to community members.

We have a giant community team here at Instagram, which I think is pretty unique among technology companies, whose job is literally to interface with the community and get feedback and highlight members who are doing amazing things on the platform.

So getting that type of feedback from the community about what types of problems they were experiencing in their comments then led us to brainstorm about all the different

things we could build, and what we realized

was that there was a giant wave of machine learning

and artificial intelligence, and Facebook had developed

this thing that basically, it's called DeepText,

which basically--

Which launches in June of 2016, right?

So it's right there.

Yep, so they have this technology,

and we put two and two together, and we said,

you know what?

I think if we get a bunch of people to look at comments

and rate them good or bad, like you go on Pandora.

And you listen to a song.

Is it good or is it bad?

[Nicholas] Yeah.

Get a bunch of people to do that.

That's your training set.

And then what you do is you basically feed it to the machine

learning system, and you let it go through 80% of it.

And then you hold out 20% of the other comments.

And then you say, okay, machine.

Go and rate these comments for us based on the training set.

And then we see how well it does.

And we tweak it over time.

And now, we're at a point where basically,

this machine learning can detect a bad comment

or a mean comment with almost amazing accuracy.

Basically a 1% false positive rate.
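The train-and-holdout loop he walks through might look like this sketch, with scikit-learn standing in for whatever stack Instagram actually uses. The 80/20 split and the thumbs-up/down labels come from the conversation; the toy data, features, and model are assumptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for the rater-labeled comments: 1 = bad, 0 = fine.
comments = ["you suck", "great shot", "suck", "love this photo",
            "what a loser", "so beautiful", "nobody likes you", "nice work"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Hold out 20% of the labeled comments; train on the other 80%.
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.2, random_state=0)

vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(X_train), y_train)

# "Okay, machine, go and rate these comments": score the held-out 20%
# against the human ratings, then tweak and repeat over time.
print("holdout accuracy:", model.score(vectorizer.transform(X_test), y_test))
```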

So throughout that process of both brainstorming,

looking at the technology available

and training this filter over time with real humans

who are deciding this stuff, gathering feedback

from our community, and gathering feedback from our team

about how it works, we are able to create something

we're really proud of.

So when you launch it, you make a very important decision.

Do you want it to be aggressive in which case it'll

probably kick out some stuff it shouldn't,

or do you want it to be less aggressive in which case

a lot of bad stuff'll get through?

Yeah, this is the classic problem.

If you go for accuracy, you will misclassify a bunch

of stuff that actually was pretty good,

so, you know, say you're my friend. And I go onto your photo, and I'm just joking around with you and giving you a hard time. Instagram should let that through 'cause we're friends, and

[Nicholas] Right.

you know, I'm just giving you a hard time, and that's a funny banter back and forth.

Whereas if you don't know me, and I come on,

and I make fun of your photo, that feels very different.

Understanding the nuance between those two is super

important, and the thing we don't want to do is have any

instances where we block something

that shouldn't be blocked.

Now, the reality is, it's going to happen.

[Nicholas] Definitely.

The reality is it's going to happen.

So the question is, is that margin of error worth it

for all the really bad stuff that gets blocked?

And that's a fine balance to figure out.

That's something we're working on.

We trained the filter basically to have

a 1% false positive rate.

That means 1% of things that get marked as bad

are actually good.

And that was a top priority for us because we're not here

to curb free speech.

We're not here to curb fun conversations between friends,

but we want to make sure we are largely attacking

the problem of bad comments on Instagram.

And so you go, and every comment that goes in

gets sort of run through an algorithm, and the algorithm

gives it a score from zero to one on whether it's likely

a comment that should be filtered or a comment

that should not be filtered.

Right.

And then that score combined with the relationship

of the two people?

No, the score actually is influenced by the relationship.

So the original score is influenced by it,

and Instagram, I believe I have this correct,

has something like a karma score for every user

where the number of times they've been flagged

or the number of critiques made of them is added in

to something on the back end, and that goes into this, too?

So without getting into the magic sauce,

you're asking like, Coca-Cola to give up its recipe.

I'm gonna tell you that there's a lot of complicated stuff

that goes into it, but, basically, it looks at the words.

It looks at our relationship.

And it looks at a bunch of other signals including

account age and account history, that kind of stuff.

And it combines all the signals, and then it spits

out a score of zero to one for how bad this comment is likely to be.

And then, basically, you set a threshold that optimizes

for 1% false positive rate.
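A sketch of that scoring step as he outlines it: text, relationship, and account signals get combined into a single zero-to-one score, and a threshold is chosen on held-out data so that roughly 1% of known-good comments would be misflagged. The signal weights and the logistic combination are invented for illustration; the real "magic sauce" is unspecified:

```python
import numpy as np

def comment_score(text_score, is_friend, account_age_days, prior_flags):
    """Combine text, relationship, and account signals into one 0-1 score.
    The weights here are illustrative assumptions."""
    z = (3.0 * text_score                           # what the words themselves look like
         - 1.5 * float(is_friend)                   # friendly banter is judged leniently
         - 0.2 * np.log1p(account_age_days / 30.0)  # older accounts get benefit of the doubt
         + 0.5 * np.log1p(prior_flags))             # history of being flagged counts against
    return 1.0 / (1.0 + np.exp(-z))                 # squash to (0, 1)

def threshold_for_fpr(scores_of_known_good, target_fpr=0.01):
    """Pick the cutoff so only ~1% of known-good comments would be blocked."""
    return float(np.quantile(scores_of_known_good, 1.0 - target_fpr))

# Calibrate on held-out good comments, then apply to a new comment.
good_scores = np.random.beta(2, 8, size=10_000)  # stand-in validation scores
cutoff = threshold_for_fpr(good_scores)
print(comment_score(0.9, is_friend=False, account_age_days=3, prior_flags=5) > cutoff)
```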

When do you decide it's ready to go?

(sighs) I think at the point where the accuracy gets to a level that internally we're happy with.

So one of the things we do here at Instagram is we do

this thing called dog-fooding, and not a lot of people

know this term, but in the tech industry,

it means, you know, eat your own dog food.

So what we do is we take the products,

and we always apply them to ourselves

before we go out to the community.

And there are these amazing groups at Instagram, and I'd love to take you through them, but they're actually all confidential.

But it's employees giving feedback about how they feel

about specific features, and--

So this is live on the phones of a bunch of Instagram

employees right now?

There are always features that are not launched that are live on Instagram employees' phones, including things like this.

So there's a critique of a lot of the advances in machine

learning that the corpus on which it's based

has biases built into it.

So DeepText analyzed all Facebook comments, right.

It analyzed some massive corpus of words

that people have typed into the internet,

but when you analyze those, you get certain biases

built into them.

So for example, I was reading a paper, and somebody had taken a massive corpus of text and created a machine learning algorithm to rank restaurants, looking at the comments that people had left under restaurants and then trying to guess the quality of the restaurants.

He went through, and he ran it.

And he was like, interesting because all of the Mexican

restaurants were ranked badly.

So why is that?

Well, it turns out, as he dug deeper into the algorithm, it's because in the massive corpus of text, the word Mexican is associated with illegal, as in illegal Mexican immigrant, because that phrase is used so frequently.

So there are lots of slurs attached to the word Mexican, so the word Mexican has negative connotations in the machine learning corpus, which then affects the rankings of Mexican restaurants.
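A contrived demonstration of that failure mode: if a word co-occurs mostly with negative language in the training text, a bag-of-words model learns a negative weight for the word itself, and an innocuous sentence containing it inherits the penalty. The data below is invented to make the point; it is not the study's corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Corpus where "mexican" happens to co-occur with negative contexts (1 = negative).
corpus = ["illegal mexican immigrant", "mexican border problem",
          "wonderful dinner", "great tacos", "mexican crime story",
          "lovely evening"]
labels = [1, 1, 0, 0, 1, 0]

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(corpus), labels)

# An innocuous restaurant review now inherits the word's learned negativity.
review = ["authentic mexican restaurant"]
print(clf.predict_proba(vec.transform(review))[0, 1])  # high P(negative)
```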

It sounds awful.

[Nicholas] So how do you deal with that?

Yeah, uh, well, good news is we're not in the business

of ranking restaurants.

But you are ranking sentences based on this huge corpus of text that Facebook has analyzed as part of DeepText.

Well, it's a little bit more complicated than that.

So all of our training actually

comes from Instagram comments.

So we have hundreds of raters, and it's actually

pretty interesting what we've done with this set of raters.

Basically, human beings that sit there, and by the way,

human beings are not unbiased.

That's not what I'm claiming.

But you have human beings.

Each of those raters is bilingual.

So they speak two languages.

They have a diverse perspective.

They're from all over the world.

And they rank those comments basically

thumbs up or thumbs down.

Basically, the Instagram corpus, right?

So you feed it the thumbs up, thumbs down

based on an individual, and you might say, but wait.

Isn't a single individual biased in some way?

Which is why we make sure every comment is actually seen and given a rating by at least two people, to make sure there's as minimal an amount of bias in the system as possible.

And then on top of that, we also gain feedback

from not only our team but also the community.

And then we're able to tweak things on the margin

to make sure that things like that don't happen.
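A sketch of that double-rating safeguard: every comment gets independent thumbs-up/down judgments from at least two raters, and only labels they agree on go into the training set. Escalating disagreements for another look is an assumption about the tie-break, not something stated here:

```python
def resolve_label(ratings):
    """ratings: 0/1 judgments from independent raters (at least two).
    Return the agreed label, or None when raters disagree and the
    comment needs another look (assumed escalation path)."""
    if len(ratings) < 2:
        raise ValueError("every comment must be rated by at least two people")
    return ratings[0] if len(set(ratings)) == 1 else None

# Build the training set only from comments the raters agree on.
rated = {"you suck": [1, 1], "nice pic": [0, 0], "ok then": [1, 0]}
training = {}
for comment, ratings in rated.items():
    label = resolve_label(ratings)
    if label is not None:
        training[comment] = label
print(training)  # {'you suck': 1, 'nice pic': 0}
```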

I'm not claiming that it won't happen.

That's, of course, a risk, but the biggest risk of all

is doing nothing because we're afraid of these things

happening, and I think it's more important that we are

A, aware of them, and B, monitoring them actively,

and C, making sure that we have a diverse group

of raters that not only speak two languages

but are from all over the world and represent

different perspectives to make sure we have

an unbiased classifier.

So let's take a sentence like, these hoes ain't loyal,

which is a phrase that I believe a previous study

on Twitter had a lot of trouble with.

Your theory is that some people will say,

oh, that's a lyric.

Therefore, it's okay.

Some people won't know it, and it won't get through,

but enough raters looking at enough comments over time

will allow lyrics to get through,

and these hoes ain't loyal, I can't post that

on your Instagram feed if you post a picture

which deserves that comment.

Well, I think what I would counter is if you post

that sentence to any person watching this,

not a single one of them would say that's a mean-spirited

comment to any of us, right.

So I think that's pretty easy to get to.

I think there are more nuanced examples, and I think that's in the spirit of your question.

Yeah.

Which is that there are gray areas.

The whole idea of machine learning is that it's far better at understanding those nuances than any algorithm has been in the past or any single human being could be.

And I think what we have to do over time is figure out

how to get into that gray area, and like judge

the performance of this algorithm over time

to see if it actually improves things.

Because by the way, if it causes trouble, and it doesn't

work, we'll scrap it and start over with something new.

But the whole idea here is that we're trying something,

and I think a lot of the fears that you're bringing up

are warranted, but it's exactly what keeps most companies from even trying in the first place.

Right, and so first you're gonna

launch this filtering bad comments.

The second thing you're gonna do is the elevation

of positive comments.

Tell me about how that is gonna work

and why that is a priority.

Um, the elevation of positive comments is more about

modeling in the system.

We've seen a bunch of times in the system

where we have this thing called the mimicry effect.

So if you actually, if you raise kind comments,

you actually see more kind comments.

You see more people giving kind comments.

It's not that we ever ran this test,

but I'm sure if you raised a bunch of mean comments,

you'd see more mean comments.

Part of this is the piling on effect, and I think

what we can do is by modeling what great conversations

are, more people will see Instagram as a place for that

and less for the bad stuff.

And it's got this interesting psychological effect

that people want to fit in, and people want to do

what they're seeing, and that means that people

are more positive over time.

And are you at all worried that you're gonna turn

Instagram into the equivalent of

an East Coast liberal arts college where people are--

(laughs) I think those of us who grew up

on the East Coast might take offense to that.

(laughs)

I'm not sure what you mean exactly.

I mean, a place where there are trigger warnings everywhere, where people feel like they can't have certain opinions or can't say things, where you put this sheen over all your conversations as though everything in the world is rosy, and that bad stuff, we're just gonna sweep it under the rug.

Yeah, that would be bad.

That's not something we want.

So I think in the range of bad, we're talking about

like the lower 5%, like the really, really bad stuff.

I don't think we're trying to play anywhere

in the area of gray.

Although I realize there's no black or white,

and we're gonna have to play at some level.

But the idea here is to take out, I don't know,

the bottom 5% of nasty stuff.

I don't think anyone would argue that that makes

Instagram a rosy place.

It just doesn't make it a hateful place.

And you wouldn't want all the comments

on your, you know, on your VidCon post.

It's a mix of sort of jokes and nastiness and vapidity

and useful product feedback.

And you're getting rid of the nasty stuff.

But would it be better if you raised, like, the best product feedback and then the funny jokes to the top?

Maybe.

And maybe that's a problem we'll decide to solve

at some point, but right now, we're just focused

on making sure that people don't feel hate, you know.

And I think that's a valid thing to go after,

and I'm excited to do it.

So the thing that interests me the most

is that it's like Instagram is a world

with 700 million people, and you're writing

the constitution for the world.

When you get up in the morning, and you think about

that power, that responsibility, how does it affect you?

Doing nothing felt like the worst option in the world,

so starting to tackle it means

that we can improve the world.

We can improve the lives of many young people

around the world that live on social media.

I don't have kids yet.

I will someday.

And I hope that kid, boy or girl, grows up

in a world where they feel safe online,

where I, as a parent, feel like they are safe online.

And, you know, the cheesy saying, with great power

comes great responsibility?

Like we take on that responsibility,

and we're gonna go after it, but that doesn't mean

that not acting is the correct option.

There are all sorts of issues that come with acting.

You've highlighted a number of them today,

but that doesn't mean we shouldn't act.

It just means we should be aware of them,

and we should be monitoring them over time.

One of the critiques is that Instagram, particularly

for young people, is very addictive.

And in fact, there's a critique being made by Tristan Harris, who is a classmate of yours

[Kevin] Classmate of mine.

and a classmate of Mike's, and he says that the design of Instagram deliberately addicts you.

For example, when you open it up--

Sorry, I'm laughing just because I think the idea

that anyone inside here tries to design something

that is maliciously addictive is just like so far-fetched.

We try to solve problems for people,

and if by solving those problems for people, they like

to use the product, I think we've done our job well.

This is not a casino.

We are not trying to eke money out of people

in a malicious way.

The idea of Instagram is that we create something

that allows them to connect with their friends

and their family and their interests

through positive experiences, and I think any criticism

of building that system is unfounded.

And so all of this is aimed at making Instagram better,

and it sounds like changes so far

have made Instagram better.

Is any of it aimed at making people better,

or is there any chance that the changes

that happen on Instagram will seep into the real world

and maybe just a little bit conversations in this country

will be more positive than they've been?

I sure hope we can stem any negativity in the world.

I'm not sure we would sign up for that day one.

But I actually want to challenge the initial premise,

which is this is about making Instagram better.

I actually think it's about making the internet better.

I hope some day the technology that we develop

and the trainings that we develop and the things

we learn, we can pass on to start-ups.

We can pass on to our peers in technology.

And that we actually together build a kinder,

safer, more inclusive community online.

[Nicholas] Will you open source the software

you've built for this?

I'm not sure.

I'm not sure.

I think a lot of it comes back to how well it performs and the willingness of our partners to adopt it.

But what if this fails?

What if actually people kind of get turned off

by Instagram?

They say, Instagram's becoming like Disneyland.

I don't want to be there, and they share less.

(laughs) The thing I love about Silicon Valley

is that we bear hug failure.

Like, failure is what we all start with, what we go through, and hopefully what we don't end on, on the way to success.

I mean, Instagram wasn't Instagram initially.

It was a failed startup before.

I turned down a bunch of job offers that would have been

really awesome along the way.

That was failure.

I've had numerous product ideas at Instagram that were total flops, total failures.

And that's okay.

We bear hug it because when you fail

at least you're trying, and I think that's actually

what makes Silicon Valley different

from traditional business is that our tolerance

for failure here is so much higher.

And that's why you see bigger risks

and also bigger payoffs.