
Why AI Moderators Will Be Biased

Updated on January 8, 2018

Tamara Wilhite is a technical writer, an industrial engineer, a mother of two, and a published sci-fi and horror author.

Introduction

Microsoft’s chatbot Tay showed how an AI can learn to be biased as a result of human effort. Within 24 hours, Tay, modeled on a teenage girl, was spouting racist and other bigoted statements. Tay mostly learned from the worst of the trolls on the internet and was shaped by their coordinated input of “bad” data, but the incident triggered a wave of papers on how AIs learn to share human biases. Most of those papers focused on how AIs learn sexism and racism. This article is concerned with the other biases AIs learn from humans, which are likely to have a far greater impact on discourse.

The censorship seems logical when it is automatically done by the AI. And when done automatically, it is also hidden to most.

How AIs Learn Bias by Design and How It Impacts the Internet

As an engineer with more than a decade of experience in IT, I have come to realize that AIs and computer models in general can end up with built-in biases as a result of human decisions. Those biases are then reinforced by the humans who select which AI programs to use and by the human moderators who provide further “tuning” of the models. Let’s look at how the demonstrated biases at large tech companies are already affecting dialogue, and how relying on artificial intelligences they think are unbiased will actually increase the bias and reduce the ability to communicate for many.

Google’s program “Perspective” is an AI trained to flag and moderate comments in an attempt to tame the trolls. Instagram set up an AI to do the same, though unlike Google’s project, it relies entirely on its own data set. Who wouldn’t want to shut down trolls? The problem is that the supposedly neutral algorithms are biased by design. When an AI is trained on data from left-wing sites like Huffington Post and New York Times, moderate opinions are classified as on the right and barely tolerated, while conservative opinions are readily flagged as unacceptable.

Publishers can define what toxicity threshold they’ll tolerate on their site with Google Perspective, but the toxicity rating is based on the liberally biased data set and is further reinforced by the comments these sites flag as unacceptable. Observe sites from Reddit to Facebook purging conservative, libertarian, ex-Muslim and other groups based on the biases of the site administration. Given that the left-wing bias is obviously dominant across many social media platforms, it is logical to assume they’ll only implement moderating AIs that moderate based on their biases. Continued training will reinforce the AI with similar or increasingly biased examples as these sites continue to censor content they don’t like.
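To make the threshold idea concrete, here is a minimal sketch in Python. It does not call Google Perspective or any real service. The `ScoredComment` type, the `moderate` function and the scores themselves are hypothetical, made up for illustration. The point is only that the publisher chooses the cutoff, while the score being cut on comes from whatever the model was trained on.

```python
from dataclasses import dataclass


@dataclass
class ScoredComment:
    text: str
    toxicity: float  # hypothetical score in [0, 1] produced by some upstream model


def moderate(comments, threshold):
    """Split comments into (published, hidden) using a publisher-chosen threshold."""
    published = [c for c in comments if c.toxicity < threshold]
    hidden = [c for c in comments if c.toxicity >= threshold]
    return published, hidden


# Made-up scores; a real model would assign these based on its training data.
comments = [
    ScoredComment("comment A", 0.10),
    ScoredComment("comment B", 0.55),
    ScoredComment("comment C", 0.90),
]

strict_pub, strict_hidden = moderate(comments, threshold=0.5)
lenient_pub, lenient_hidden = moderate(comments, threshold=0.8)
```

A stricter publisher hides more and a lenient one hides less, but neither can change how a given comment was scored in the first place.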

The new data could end up not only locking in the bias but shifting it further as the definition of “unacceptable” expands to include ever more opinions, topics and sources.

Fact checking doesn’t prevent this bias; it is an example of it. Checking claims against liberally biased sites, and relying on liberally biased fact checkers like Snopes and Politifact, builds the bias in. The first problem is that whatever these sites don’t cover is flagged as untrue or unverified when it is covered by the remaining independent outlets or conservative sources. The second problem is that sites like Snopes and Politifact share a bias and rate reporting on conservative sites as partially true instead of as conservatively biased. A third issue is that liberals are selective in what they fact check, rarely fact checking verifiably untrue statements by liberals. As a result, untrue statements by liberal figures are given weight by virtue of publication, while conservatives’ statements are minced, challenged, reported as biased and otherwise delegitimized. When sites like Facebook and Twitter limit original sources like Wikileaks, they limit the ability of humans to vet this information as legitimate, and when the biggest social media sites censor these sources, they teach AIs not to let others share them and to downgrade comments that refer to them.

You end up with the AI reinforcing ever-narrowing acceptability standards as it weeds out balancing counter-opinions and the user community itself shifts to the left. After all, it is already known to the social sciences that a community that never hears a contrary opinion in any degree becomes more extreme. Algorithms reinforced by like-minded humans will do the same with greater speed and efficiency.
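The narrowing described above can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a model of any real moderation system and not evidence for the claim: opinions are reduced to made-up numbers on a single axis from -1 to 1, the starting window of accepted opinions is an arbitrary off-center choice, and each “retraining” round simply redefines acceptable as anything within `k` standard deviations of the comments that survived the previous round. The function name `simulate_narrowing` and all of its parameters are hypothetical.

```python
from statistics import mean, pstdev


def simulate_narrowing(opinions, lo, hi, rounds, k=1.5):
    """Toy feedback loop: each round relearns 'acceptable' from what was accepted.

    opinions: positions on a made-up one-dimensional axis.
    lo, hi:   the initial window of accepted opinions (an assumption).
    k:        how many standard deviations from the accepted mean are tolerated.
    Returns the list of (lo, hi) windows, starting with the initial one.
    """
    history = [(lo, hi)]
    for _ in range(rounds):
        accepted = [x for x in opinions if lo <= x <= hi]
        if len(accepted) < 2:
            break  # nothing left to learn from
        center = mean(accepted)
        spread = pstdev(accepted)
        # The next round only sees surviving comments, so the window is rebuilt from them.
        lo, hi = center - k * spread, center + k * spread
        history.append((lo, hi))
    return history


# Evenly spread opinions from -1.00 to 1.00; the initial window is off-center.
opinions = [i / 100 for i in range(-100, 101)]
history = simulate_narrowing(opinions, lo=-1.0, hi=0.5, rounds=5)
widths = [hi - lo for lo, hi in history]

for round_number, (lo, hi) in enumerate(history):
    print(f"round {round_number}: accepts [{lo:+.2f}, {hi:+.2f}]")
```

In this toy setup the accepted window gets narrower every round while staying around its original off-center position, because each round is trained only on what the previous round let through. Different choices of `k` or of the starting window would behave differently.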

How Can This Issue Be Solved?

More data isn’t the solution. The issue isn’t resolved by using large data sets unless they come from sources with diverse opinions. For this reason, using multiple data sets from liberal sites isn’t a solution, but mixing in sources from conservative and international sites that have different biases is a solution. However, that’s not what has happened during development, and it isn’t likely to occur as a correction.
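As a rough sketch of what mixing sources could look like in practice, the Python below draws an equal number of training examples from each source group instead of taking whatever the largest source supplies. The group labels, the example texts and the `balanced_sample` helper are all hypothetical, and deciding which group a real source belongs to is itself a human judgment that this sketch does not address.

```python
import random
from collections import defaultdict


def balanced_sample(examples, per_group, seed=0):
    """Draw the same number of examples from every source group.

    examples:  list of (group_label, text) pairs; the labels are assumed inputs.
    per_group: desired count per group, capped by the smallest group's size.
    """
    by_group = defaultdict(list)
    for group, text in examples:
        by_group[group].append(text)
    if not by_group:
        return []

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    n = min(per_group, min(len(texts) for texts in by_group.values()))

    sample = []
    for group in sorted(by_group):
        for text in rng.sample(by_group[group], n):
            sample.append((group, text))
    return sample


# Made-up corpus in which one group supplies more examples than the others.
examples = [
    ("left", "l1"), ("left", "l2"), ("left", "l3"), ("left", "l4"),
    ("right", "r1"), ("right", "r2"),
    ("international", "i1"), ("international", "i2"), ("international", "i3"),
]
balanced = balanced_sample(examples, per_group=3)
```

Here the smallest group has only two examples, so every group contributes two. Simply adding more examples from the largest group would not change that balance, which is the sense in which more data alone isn’t the solution.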

Conservatives, libertarians and others who question the political correctness baked into the data sets used to train AIs on social media sites will be censored across the internet as these tools are rolled out. They may find freedom on other platforms, but that leads to greater bias in the AIs as every contrary opinion disappears from the major platforms.

For the time being, using more facts and detail in an online comment or post reduces the odds that you will be censored by these AIs. But as they learn more context, anything other than the PC checklist will end up censored by an AI that the moderators think is unbiased, because they don’t check their own biases. It is rather self-unaware for liberals to say that others have biases while having their own, but Dennis Prager pointed that problem out a decade ago.

It is possible that deliberate campaigns to promote and discuss conservative, libertarian and classical liberal content will keep public discussion and AI moderation data sets moderately left instead of evolving into a narrow, far-left corner.

One solution would be tech companies deliberately hiring more conservatives, libertarians and others who do not share their liberal biases and putting them in positions of influence over technology. Think about the Facebook trending news feed where they censored conservative stories and injected liberal ones, the majority of the team thinking this was moral and good. However, I don’t expect that to happen.

Why Does This Matter?

The book “The Wisdom of Crowds” explains how diversity of opinion and the free exchange of ideas lead to correct democratic solutions. When we silence opinions, facts based on their sources, and even facts from neutral sources because they are contrary to an ever more extreme set of political biases, public debate by its very nature leads to wrong outcomes. We also risk cultivating extremism, because people’s dialogue shifts farther and farther to the extremes when their views are never countered by the other side. This is a logical result of everyone retreating to a safe space: the AI won’t let liberals see anything else, and conservatives have to retreat to corners the AIs don’t reach.

And our social divide becomes an impassable chasm, made worse by supposedly neutral artificial intelligences whose creators don’t understand the biases their creations cause and reinforce.

References

Snopes, Which Will Be Fact-Checking For Facebook, Employs Leftists Almost Exclusively
http://dailycaller.com/2016/12/16/snopes-facebooks-new-fact-checker-employs-leftists-almost-exclusively/

Snopes, Politifact, & Other Fact Checkers are Liberal Mouthpieces
http://canadafreepress.com/article/snopes-politifact-other-fact-checkers-are-liberal-mouthpieces

Who’s Checking the Fact Checkers?
A new study sheds some light on what facts the press most likes to check.
https://www.usnews.com/opinion/blogs/peter-roff/2013/05/28/study-finds-fact-checkers-biased-against-republicans
