Google to downgrade pirate sites in search results
Search Engine as Social Arbiter
Everyone with a browser uses a search engine. It's your TV Guide and Yellow Pages and Rand McNally Road Atlas in a convenient digital form. Try to spend one 24-hour period without searching for something online: all you have is your bookmarks and your friends. It can't be done.
A publicly held company, Google is traded on the Nasdaq under the symbol GOOG. Anyone can buy it, but in increments of $642: individual shares are relatively expensive. High-dollar shares are not indicative of company value, but they do present a daunting entry point for individual investors.
On paper the company is worth over 200 billion dollars. They don't have a storefront, like Apple, or a product line, like Microsoft. There's no inventory: were they to file for bankruptcy tomorrow, they would begin to sell off massive numbers of computers and related technology used to manage their search engine service. They also own real estate dedicated to office buildings and data centers throughout the world. Their intellectual property has a very limited shelf life and would bring little value were it offered for sale.
Is Google the best search engine? Yahoo and Microsoft don't think so, but those companies cannot compete with the number of search results returned by Google systems. No one outside of these companies knows how search results are computed, so it is impossible to compare one algorithm against another. It will always be this way. Google's competitive advantage comes from its secrecy. Their servers aren't faster or cooler or Made in America. If we knew how they ranked sites and returned search results, we'd have the knowledge to evaluate the algorithms, and to circumvent them. All we can do now is judge the results, like eating a hot dog.
The secrecy is ironic because Google is a champion of Open Source software, outside of their search engine. The company sponsors programming competitions, tutorials, training, and academic programs designed to nurture the next generation of coders. Throughout these programs runs the common thread of transparency: if you code something, everyone should be able to access it. Perversely, don't bother to ask Google for their algorithms: they'll tell you to pick better keywords and to write better content.
The company employs people to hint at the algorithms and mete out helpful(?) bits of information when revisions take place. Matt Cutts is widely cited as a 'source' for understanding how Google searches and sorts and ranks. Don't mistake his generosity for altruism: he draws a paycheck from the company whose secrets he partially reveals. I immediately like the guy because he and I share a very similar academic pedigree. We've probably attended some of the same conferences and perhaps sat in the same breakout sessions.
No search engine can avoid becoming a social arbiter. Search engine users implicitly trust the results they are given. Search for "Toms shoes" and you get the top 10 results as decided by Google algorithms. It's probably good enough, but you'll never know if result #11 would have met your needs more effectively. Google will never know, either. They can measure which result you select and how long you tarry on a site, but they can't measure your satisfaction.
Recently Google admitted to adjusting their sorting algorithm to penalize web pages deemed 'pirate sites.' The implication is that these sites are offering downloads that violate copyright laws in one or more countries.
Actually, the implication is that Google computers have decided that these sites are such offenders.
Keep in mind that humans are rarely involved in this process. Far too many web sites exist to be evaluated by eyeballs. This is not the exercise of a focus group. When a site is tagged as a pirate site, it's almost certainly because a computer algorithm added up some numbers and came up with a number in a range. The number, the range, and the algorithm are all under the control of Google.
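The "add up some numbers and check the range" process might look something like the sketch below. It is purely illustrative: the signal names, the weights, and the range are all invented here, since Google publishes none of them.

```python
# Hypothetical signals and weights; none of these come from Google.
WEIGHTS = {
    "removal_notices": 0.6,
    "download_links": 0.3,
    "reported_by_users": 0.1,
}

# Hypothetical range: a score that lands inside it gets the site tagged.
FLAG_RANGE = (0.5, 1.0)


def piracy_score(signals):
    """Weighted sum of signals, each assumed to be normalized to 0..1."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)


def is_flagged(signals):
    """True when the score falls inside the (made-up) flagging range."""
    low, high = FLAG_RANGE
    return low <= piracy_score(signals) <= high
```

Change a weight or nudge the range and a different set of sites gets tagged, with no human ever looking at them. That is the point: whoever owns the numbers owns the verdict.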
You get to vote
Google hints that their pirate-flagging algorithms will consider the volume of "valid copyright removal notices" (their words) reported against individual sites. We cannot say if Google plans to interface with government agencies to validate complaints. Certainly they plan to use their own copyright violation reporting system, but that has obvious limitations.
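One plausible way for notice volume to feed into ranking is as a penalty applied to a site's ordinary relevance score. The sketch below assumes exactly that; the formula, the `penalty` knob, and the example tuple layout are guesses made for illustration, not anything Google has described.

```python
import math


def adjusted_score(relevance, valid_notices, penalty=0.1):
    """Shrink a relevance score as valid removal notices pile up.

    The logarithm keeps a handful of notices from sinking a site
    while still pushing heavy offenders down. `penalty` is a made-up knob.
    """
    return relevance / (1.0 + penalty * math.log1p(valid_notices))


def rank(results):
    """Sort (url, relevance, valid_notices) tuples, best first."""
    return sorted(
        results,
        key=lambda r: adjusted_score(r[1], r[2]),
        reverse=True,
    )
```

Under these assumptions a site with no notices keeps its score untouched, while a slightly more relevant site carrying thousands of notices drops below it.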
Worst case, Google will evaluate third-party sites based on complaints received through Google. As a frequent user of Google's reporting system, I can confirm that it does work, except when it doesn't. Google holds no sway over sites that don't publish Google advertising (AdSense). A site in Nigeria proudly violating copyrights probably won't care if Google indexes them or not. I recently published a page that was soon cloned on a European server: for several keywords, that pirated page ranks higher than my original content. I have reported it to Google, but the clone persists.
What if you want to search for a pirate site?
Google has decided that you probably won't have a pirate site in your search results. Keep in mind that the company is under fire on several fronts for, allegedly, facilitating copyright violations. They are being sued for copying and publishing millions of books that were supposedly out of print and not subject to copyright regulations. They are constantly harangued by movie, music, and book publishers who are hysterical over loss of revenue due to pirate sites. Adding new filtering logic to their search engine may appease some lawyers and judges.
Some folks would say that suing Google is akin to suing General Motors for making the car used to haul pirated DVDs. Those folks don't make their living producing digital media. Whether right or wrong, Google is permanently inserted into the debate.