Search Engine Breakdown
04/14/21 | 20m 37s | Rating: NR
Why does a widely used internet search engine deliver results that can be blatantly racist and sexist? Two leading information researchers investigate their discoveries of hidden biases in the search technology we rely on every day.
REPORTER
As misinformation and so-called fake news continue to be rapidly distributed on the internet, our reality has become increasingly shaped by false information. Many people don't know the difference between something real and something created to deceive them.
SAFIYA NOBLE
I spent about 15 years in advertising and marketing, and while I was there, Google arrived on the scene. I understood the transformative effect that this search engine was having in helping us curate all kinds of information. But I was surprised, having just left advertising, that everybody was thinking about Google as this new public trusted resource, because I thought of it as an advertising platform. Most people who use search engines believe that search engine results are fair and unbiased. The public, and especially kids and young people, use search engines to tell them the facts about the world.
One weekend, my nieces were coming over to hang out, and I was thinking, "Oh, let me pull my laptop out and see if I can find some cool things for us to do this weekend." I just thought to type in "Black girls," and the whole first page of search results was almost exclusively pornography or hyper-sexualized content. In 2012, I started to see some of the results changing. Google had started to suppress the pornography around Black girls. Unfortunately, still today, we see pornography and a kind of hyper-sexualized content as the primary way in which Latina and Asian girls are represented. "What makes Asian girls so attractive," "Asian fetish," "hot ladies from Asians," "see who we rank number one in 2020," "tender Asian girls," "meet world beauties."
This is the study done by The Markup that replicated my study from ten years ago. They found that Black girls, Latina girls, and Asian girls, those phrases were so profoundly linked with this kind of adult content. Zero for white girls, zero for white boys. There are so many racial stereotypes and gender stereotypes that show up in search results. What about actual girls and children who go and look for themselves in these spaces? It's very disheartening.
When women become sex objects in a space like this, it's really profound, because the public generally relates to search engines as kind of fact checkers. Before we were so heavily reliant upon a database, we used something like a card catalogue. We didn't rank content; it was alphabetical, or it might be by subject. That's a summary of the organization system we call the Dewey Decimal System.
NOBLE
Now when we're in a subject, we know there is a lot in relationship to that one item that we might be looking for. We might go look for a book in the stacks, for example, and find that there are hundreds of books around that one that tell us something about that book, and we might serendipitously find all kinds of other bits of information that are amazing. But we can see a little bit more about the logics of that. We don't understand the logics of how certain things make it to the first page in a search. Google has a very complicated and nuanced algorithm for search. Over 200 different factors go into how they decide what we see. Of course, they're indexing about half of all of the information that is on the web, and even that is trillions of pages.
AD ANNOUNCER
Billions of times a day, Google software locates all the potentially relevant results on the web, removes all the spam, and ranks them based on hundreds of factors, like keywords, links, location, and freshness-- all in, oh, 0.81 seconds.
NOBLE
The whole premise of a search engine is to categorize and classify information. A lot of the content that comes back to us on the internet, it's in a cultural context of ranking. We know very early what it means to be number one, so ranking logic signals to us that the classification is accurate, from one being the best to whatever is on page 48 of search, which nobody ever looks at. (keyboard clacking) Part of what it's doing is picking up signals from things that we've clicked on in the past, that a lot of other people have clicked on, things that are popular. So an algorithm is, in essence, a decision tree. If these conditions are present, then this decision should be made. And the decision tree gets automated so that it becomes like a sorting mechanism. Google's very reliable for certain types of information. If you're using it in this kind of phone book fashion, it's fairly reliable. But when you start asking a search engine more complex questions, or you start looking for knowledge, the evidence isn't there that it's capable of doing that. It's this combination of hyperlinking, it's a combination of advertising and capital, and also what people click on that really drives what we find on the web. This is where we start falling into trickier situations, because those who have the most money are really able to optimize their content better than anyone else. There have been great studies about the disparate impact of what a profile online says about who you are.
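Noble's picture of an algorithm as an automated decision tree that becomes a sorting mechanism can be made concrete in a few lines. This is a toy illustration only: the signals, weights, and page names below are invented for the sketch and are not Google's actual ranking factors.

```python
# Toy sketch of "if these conditions are present, then this decision
# should be made" turned into an automated sorting mechanism.
# All signals and weights are invented for illustration.
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    keyword_match: bool   # does the page contain the query terms?
    inbound_links: int    # hyperlinking signal (rough popularity proxy)
    past_clicks: int      # how often users clicked this result before

def score(page: Page) -> float:
    """Each branch of the decision tree adds to, or zeroes out, the rank score."""
    if not page.keyword_match:
        return 0.0                     # condition fails: drop to the bottom
    s = 1.0
    s += 0.1 * page.inbound_links      # more links, higher rank
    s += 0.5 * page.past_clicks        # past clicks feed back into ranking
    return s

pages = [
    Page("a.example", True, inbound_links=3, past_clicks=0),
    Page("b.example", True, inbound_links=1, past_clicks=10),
    Page("c.example", False, inbound_links=50, past_clicks=50),
]

# The automated "sorting mechanism": rank all candidates by score.
ranked = sorted(pages, key=score, reverse=True)
print([p.url for p in ranked])  # b.example first: clicks dominate the score
```

Note how the heavily clicked page outranks the better-linked one: popularity signals compound, which is exactly the dynamic Noble describes.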
LATANYA SWEENEY
I was the first African American woman to get a PhD in computer science at M.I.T. So, I visit Harvard. I'm being interviewed there by a reporter, and he wants to see a particular paper that I had done before. So, I go over to my computer, I type in my name into Google's search bar, and up pops this ad implying I had an arrest record. He says, "Ah, forget that article. Tell me about the time you were arrested." I said, "Well, I have never been arrested." And he says, "Then why does your computer say you've been arrested?" So I click on the ad, I go to the company to show him not only did I not have an arrest record, but nobody with the name "Latanya Sweeney" had an arrest record. And he says, "Yeah, but why did it say that?" If you type the name "Latanya" into the Google image search, you can see a lot of Black faces staring back. Whereas if I type "Tanya," I see a lot of white faces staring back. So we get the idea that there are some first names given more often to Black babies than white babies. So, I then took a month and I researched almost 150,000 ad deliveries around the country, and I found that if your name was given more often to white babies than Black babies, the ad would be neutral. And if your first name was given more often to Black babies than white babies, you were 80% likely to get an ad implying you had an arrest record, even if no one with your name had any arrest record in their database.
NOBLE
One specific way that algorithms discriminate is that they just are too crude. The idea of if x, then y, if you have this type of name, it means you're automatically associated with criminality. That blunt, crude kind of association, that is the staple logic of how algorithms work. The types of bias that we find on the internet are often blunt. We are being profiled into similar groups of people who do the kinds of things that we might be doing, and we're clustered and sold as a cluster to advertisers. And so there's certainly a commercial bias. But we also have the bias of the people who design the technologies. To think that technologies will be neutral or never have bias is really an improper framing. Of course there will always be a point of view in our technologies. The question is, is the point of view in service of oppression? Is it sexist? Is it racist?
SWEENEY
Here I was, a passionate believer in the future of equitable technology, and if the people, when they were hiring me at Harvard, had typed my name into the Google search bar and paid attention to this ad, it would have put me at a disadvantage. And not just me, but a whole group of Black people would be placed at a disadvantage. How could these biases of society be invading the technology that I really had grown to love? And now civil rights was up for grabs by what technology design allowed or didn't allow.
Google's ad delivery system is really quite amazing. You click on a web page, and that web page has a slot into which an ad is going to be delivered. And in that fraction of a second, while the page is being delivered, Google runs a fast digital auction. And in that digital auction, they decide which of the competing ads is going to be the ad they place right there. At first, the Google algorithm will choose one of them randomly, but if somebody clicks on one, then that one becomes weighted more often to be delivered. So, one way the discrimination in online ads could happen would've been that society was biased in which ads they clicked most often, and that this would've represented the bias of society itself.
Our technology and our data sharing are so powerful that they are kind of like the new policy maker. We don't have oversight over these designs, and yet how the technology is designed dictates the rules we live by. And this meant that we were moving from a democracy to a new kind of technocracy. I became the chief technology officer at the Federal Trade Commission. They're sort of the de facto police department of the internet. One of the experiments that I had done while I was at the FTC showed that everyone's online experience is not the same. What we lose with our hyper-reliance upon search technologies and social media is, the criteria for surfacing what's most important can be deeply, highly manipulated.
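Sweeney's description of the auction's feedback loop, where ads start out chosen at random but each click makes an ad more likely to be delivered, can be sketched as a small simulation. Every name, weight, and click probability here is invented; this is a minimal model of how click bias can compound into delivery bias, not Google's actual auction.

```python
# Minimal sketch of the click-feedback loop Sweeney describes: both ads
# begin with equal weight, and every click raises an ad's future delivery
# weight. The ad names and probabilities are invented for illustration.
import random

random.seed(0)

ads = {"neutral": 1.0, "arrest": 1.0}   # initial, equal delivery weights

def deliver() -> str:
    """Pick an ad with probability proportional to its current weight."""
    names = list(ads)
    return random.choices(names, weights=[ads[n] for n in names])[0]

def simulate(rounds: int, p_click_arrest: float, p_click_neutral: float) -> dict:
    shown = {"neutral": 0, "arrest": 0}
    for _ in range(rounds):
        ad = deliver()
        shown[ad] += 1
        p = p_click_arrest if ad == "arrest" else p_click_neutral
        if random.random() < p:
            ads[ad] += 1.0              # a click boosts future delivery
    return shown

# If users click the arrest-implying ad even slightly more often,
# the feedback loop can make it dominate what everyone later sees.
shown = simulate(rounds=2000, p_click_arrest=0.10, p_click_neutral=0.05)
print(shown)
```

With click rates like these, the more-clicked ad tends to pull ahead over time: society's click bias gets baked into the delivery weights, which is the mechanism Sweeney identified.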
One of the hardest case studies to write in my book was about Dylann Roof. He went online and he was trying to make sense of the trial of George Zimmerman.
DYLANN ROOF (recording)
And the first thing that I guess I can say, I would say woke me up, you know, would be the Trayvon Martin case.
REPORTER
Trayvon Martin, an unarmed Black teenager, was shot down by a white neighborhood watchman who claimed self-defense.
DYLANN ROOF (recording)
Eventually I decided to, you know, look his name up, just type him into Google, you know what I'm saying? For some reason, it made me type in the words "Black on white crime."
NOBLE
We know from Dylann Roof's own words that the first site that he comes to is the Council of Conservative Citizens. The CCC is an organization that the Southern Poverty Law Center calls vehemently racist.
DYLANN ROOF (recording)
And that's, that was it, ever since then.
NOBLE
Let's say he had been my student. I could've just immediately said, "Did you know that that phrase is kind of a racist red herring?" The FBI statistics show us that the majority of white people are actually killed by other white people. But instead, he goes to the internet and he finds the CCC, and he goes down a rabbit hole of white supremacist websites.
INTERROGATOR (recording)
Did you read a lot? Did you read books, or watch videos, or watch movies or YouTube, or anything like that specifically about that subject matter?
ROOF
No, it was pretty much just reading articles.
INTERROGATOR
Reading articles?
ROOF
Yeah.
NOBLE
And we know that shortly thereafter, he goes into a church, murders nine African Americans, and says his intent is to start a race war. This is not an atypical possibility. When you don't get a counterpoint to the query, you don't get Black studies scholarship, or FBI statistics, or anything that would reframe the very question that you're asking. This is an extreme case of acting upon white power radicalization, but this is not unlike things that are happening right now every day in search engines, on Facebook, on Twitter, in Gab. People are being targeted and radicalized in very dangerous ways. This is what is at stake when people are so susceptible to disinformation, hate speech, hate propaganda in our society.
SWEENEY
Racism itself can't be solved by technology. The question is, to what extent can we make sure technology doesn't perpetuate it, doesn't allow harms to be done because of it? We need a diverse and inclusive community in the design stage, in the marketing and business stage, and in the regulatory and journalism stages as well.
NOBLE
I am really interested in solutions. It's easy to talk about the problems, and it's painful, also, to talk about the problems. But that pain and that struggle should lead us to thinking about alternatives. Those are the kinds of things that I like to talk to other information professionals and researchers and librarians about. As a person who has a name that doesn't sound like Jennifer, right? Or Sarah, or something. That paper made the difference for me, because I was just this grad student, and you were this esteemed Harvard professor, and you were having these experiences, too. When I think about the foundations of something like ethical A.I., I go back to you and that early paper. I think what I feel most hopeful about is that there's this new cottage industry called ethical A.I., and I know that our work is profoundly tied to that. But on another level, I feel like these predictive technologies are so much more ubiquitous than they were ten years ago. You know, what I find really painful is that as we move forward, it's harder to track. One thing that becomes clear is, we could use a heck of a lot more transparency. As a computer scientist, my vision is, I want society to enjoy the benefits of all these new technologies without these problems. Technology doesn't have to be made this way. That's right, that's right. I see so many more women and girls of color interested in these conversations, and one of the things that I also see is how we see things differently because we ask different questions based on our lived experiences. Just the fact that the questions are being raised means that the space is less hostile, means there's an opportunity for your voice. And the other thing that's really important about this work, it means that it's a new kind of way of thinking about computer science. It's in this conversation with you that I see a future.
I'm hopeful because it's not one isolated paper, but in fact, it's a movement toward asking the right questions, exposing the right unforeseen consequences, and pushing this forward towards a solution. Some questions cannot be answered instantly. Some issues we're dealing with in society, we need time and we need discussion. How can we look for new logics and new metaphors and new ways to get a bigger picture? Maybe we can see when we do that query that that's just nothing but propaganda, and we can even see the sources of the disinformation farms; maybe we can see the financial backers. There's a lot of ways that we can reimagine our information landscape. So, I do feel like there is some hope.