Searching for a Better Search Engine

Crede Vaccam

What do you think this picture is about? Do you think it's about vacuum cleaners?

Why do we need search engines at all? Why don't we just search the entire internet ourselves for what we are trying to find? I always thought that the reason we didn't was because it would take us so much more time than it would for a computer to do it for us, and that's why we assigned this routine work to a search engine.

I was already searching legal databases using primitive search engines back in the 1980s when I was in law school. At the time, the search algorithm was not a big, proprietary secret. It was access to the database that was paid for by subscription. The search algorithm was spelled right out for us.

Here is how I understood the explanations:

  • If one search term was submitted with no special symbols, all documents containing that term would be listed in the search results, either chronologically ordered or alphabetically ordered, depending on the default of that particular search engine or a default that the user selected.
  • If two or more search terms were submitted in a sequence (without quotes or other symbols), such as Vacuum County or crede vaccam or Nabal Cabeza de Vaca, then the search engine would first return all those documents that had the exact words in that exact sequence with nothing between them, followed by all other documents where the search terms appeared in that order, but possibly with other intervening terms. (They would be ordered so that the ones with the fewest intervening terms would appear first on the list.) After that, all documents that had all terms in whatever order would be returned. After this would be listed all the documents that had at least one of the search terms.
  • If two or more search terms were submitted between quotes, like this "Vacuum County" or "crede vaccam" or "Nabal Cabeza de Vaca", then only those documents that contained the exact search terms in exactly that sequence, without intervening items, would be returned.
  • If one of the search terms was submitted with a plus sign in front of it, such as +vaccam, then no document would be returned that did not contain it.
  • If one of the search terms was submitted with a minus sign in front of it, such as -cleaner, then no document would be returned that did contain it.
  • There was no magic formula and no secret algorithm, and the whole purpose of cluing us into these rules was the better to help us find what we were looking for.
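
The rules above can be sketched in a few lines of Python. This is only a minimal sketch of the rules as listed here, not the code of any actual legal database: the function names (tokenize, parse_query, search and so on), the sample documents in DOCS, and the choice of alphabetical order within each tier are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())

def parse_query(query):
    """Split a query into quoted phrases, +required, -excluded and plain terms."""
    phrases = [tokenize(p) for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]+"', " ", query).split()
    required = [w for t in rest if t.startswith("+") for w in tokenize(t)]
    excluded = [w for t in rest if t.startswith("-") for w in tokenize(t)]
    plain = tokenize(" ".join(t for t in rest if t[0] not in "+-"))
    return phrases, required, excluded, plain

def has_phrase(words, phrase):
    """True if the phrase appears as an exact, uninterrupted sequence."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

def min_gap_in_order(words, terms):
    """Fewest intervening words with all terms in order, or None if never in order."""
    best = None
    for start, word in enumerate(words):
        if word != terms[0]:
            continue
        pos = start
        try:
            for term in terms[1:]:
                pos = words.index(term, pos + 1)
        except ValueError:
            continue
        gap = (pos - start + 1) - len(terms)
        best = gap if best is None else min(best, gap)
    return best

def rank(words, plain):
    """Return a sort key (tier, gap), or None if no plain term matches."""
    if not plain:
        return (0, 0)
    gap = min_gap_in_order(words, plain)
    if gap is not None:
        return (0, 0) if gap == 0 else (1, gap)  # exact sequence, then in order
    present = set(words)
    if all(t in present for t in plain):
        return (2, 0)                            # all terms, any order
    if any(t in present for t in plain):
        return (3, 0)                            # at least one term
    return None

def search(query, docs):
    """docs maps a title to its text; returns titles, best match first."""
    phrases, required, excluded, plain = parse_query(query)
    hits = []
    for title, text in docs.items():
        words = tokenize(text)
        present = set(words)
        if any(t not in present for t in required):
            continue
        if any(t in present for t in excluded):
            continue
        if not all(has_phrase(words, p) for p in phrases):
            continue
        key = rank(words, plain)
        if key is not None:
            hits.append((key, title))
    return [title for key, title in sorted(hits)]  # ties broken alphabetically

# Made-up sample documents, for illustration only.
DOCS = {
    "Chapter Twenty": "From the Vacuum County files: a progress report.",
    "County Fair": "The county fair gave away a vacuum cleaner.",
    "North County Vacuum": "North County Vacuum sells every vacuum cleaner brand.",
    "Sheriff": "The vacuum left in our county government was filled by the sheriff.",
    "Weather": "Rain is expected across the county.",
}

if __name__ == "__main__":
    for q in ['Vacuum County', '"Vacuum County"', 'vacuum -cleaner']:
        print(q, "->", search(q, DOCS))
```

Under these rules, an unquoted search for Vacuum County puts "Chapter Twenty" first, because it is the only sample document with the two words side by side in that order, and "North County Vacuum" lands below it, because there the words never appear in the order the searcher typed them.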

Nobody supposed at the time that the documents we were searching for would have a partisan interest in being found first, or that the people who had written the documents would bribe the search engine into tricking us to overlook competing documents. The purpose was really to find the best possible match for the exact search term entered. If you typed in Vacuum you got documents that had the term Vacuum. If you typed in vaccam you got documents that contained the word vaccam. It was inconceivable that somebody could type in Nabal and the search engine would decide all by itself and without consulting us that a document with the word naval was a better match.

Those were the days!

What Google finds when we search for Vacuum County

Vacuum County is my second novel. I finished writing it in 1993, and then I spent a couple of years trying to interest an agent or a publisher in the manuscript. And then I figured, why not just let people read it for free? After all, there's this newfangled thing called the internet, and anybody can find anything just by looking it up on AltaVista. All they have to do is put in a couple of search terms, and voila! -- the closest match will be returned. And since I'm the only person who has ever written a novel called Vacuum County about a man named Nabal Cabeza de Vaca, and since the phrase crede vaccam is exceedingly rare, anybody who even just by accident juxtaposes those words will get a glimpse of my novel. Sure, I'd love to have been able to sell it, but more importantly I want it to be read. So there it sat, and pretty much nobody read it for the past fifteen years.

For all you novelists out there, here's some information that might come in handy. The worst thing that can happen to your novel is not that someone will steal it. The worst possible thing that can happen is that nobody will ever read it.

But that's okay, right? People didn't read it, because they weren't interested. That's fair enough. Every time they looked up "Vacuum County" and found my book, they quickly left the site. Is that what happened? Or somewhere in the past ten years, when Google became the "best search engine in the world", might it not be possible that my search results didn't even come up, because the new algorithm made sure that they wouldn't?

Google lets us make little videos of search results. See what happened when I searched for Vacuum County.

Vacuum County is hard to find if you don't know what it is

Without the quotes, the sequence Vacuum County yielded a top result that didn't even have the words in that order: "North County Vacuum". This was followed by lots of other information about vacuum cleaners in which the word county appeared somewhere in the text. Wouldn't my pages that had the words in the correct order get priority in an old-fashioned search?

When the words were placed within quotes, the top search item was "Wholesale Vacuum County buy Vaccum County lots." Now what is that? It doesn't even make sense. If you click on it at aliex.press.com > Wholesale Product you will find there is no "Vacuum County" there. It's some kind of scam: whatever you happen to be looking for, they insert the words you used into the search result.

This is followed by my own hub "The Problem with Genre" and a CreateSpace blog of mine that briefly mentioned Vacuum County and then more listings involving vacuum cleaners. Should I be happy that my hub and my blog surfaced at all? Fine, I'm happy, but don't you think that the novel is a better match?

For the search term Nabal Cabeza de Vaca, the top returns were about Alvar Nunez Cabeza de Vaca, and the word Nabal doesn't even appear in the entire document anywhere. The word naval is in bold in the Google listing, leading me to believe that Google overlooked pages that had the word Nabal in them in favor of pages with the word naval. Is that what the best search engine in the world does? Even a decent librarian wouldn't do that.

But now see what happens when we try to look for the phrase crede vaccam. The top two results substitute creed for crede. Meanwhile Google tries to tempt the searcher to go look at more vacuum cleaners by asking "Did you mean crede vacuum?"

If you agree with Google and say, "Yeah, that's it, I believe in the vacuum, not the cow," here's what you'd see: "Joe Crede is a f****ng vacuum", followed by more information about vacuum cleaners.

What's in a typo?

Why the Google Algorithm Allows Popularity to Affect Page Rank

What was the idea behind the Google algorithm? The idea was to save time on searches by prioritizing based on popularity. This makes a certain amount of sense if you are looking for a sequence that comes up very, very often. If somebody is looking for the phrase vacuum cleaner, then because it is such a common phrase, maybe it would make sense to allow the strength of the linking to the site to play a part in deciding what search item appears first. But even here, you wouldn't reverse the order of the words. You wouldn't put a site that had the sequence cleaner vacuum above one that had vacuum cleaner, no matter how popular the cleaner vacuum site was. You'd go with the sequence the searcher gave you first.

When a search sequence is rare, then there is no competition, and no reason to look at popularity. If on the entire web there are only three documents with the sequence crede vaccam, those documents should rank first in a search for crede vaccam. If they don't, then something is wrong with the search engine. It's as simple as that.
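
One way to picture this priority is a ranking in which the text match always comes first and popularity only breaks ties among equally good matches. The sketch below is a hypothetical illustration of that idea, not a description of how Google or any other search engine actually works; the Page type, the inbound_links field standing in for "popularity", and the sample pages are all assumptions.

```python
from typing import NamedTuple

class Page(NamedTuple):
    title: str
    text: str
    inbound_links: int  # hypothetical stand-in for popularity

def match_tier(text, query):
    """0 = exact sequence, 1 = all terms in any order, 2 = some terms, None = no match."""
    words = text.lower().split()
    terms = query.lower().split()
    n = len(terms)
    if any(words[i:i + n] == terms for i in range(len(words) - n + 1)):
        return 0
    if all(t in words for t in terms):
        return 1
    if any(t in words for t in terms):
        return 2
    return None

def rank_pages(pages, query):
    """Sort by match quality first; popularity only breaks ties within a tier."""
    scored = [(match_tier(p.text, query), -p.inbound_links, p.title) for p in pages]
    return [title for tier, _, title in sorted(s for s in scored if s[0] is not None)]

# Made-up pages, for illustration only.
PAGES = [
    Page("Mega Vacuum Store", "cleaner vacuum deals and vacuum bags", 90000),
    Page("Small Shop", "a vacuum cleaner repair shop", 12),
    Page("Big Shop", "vacuum cleaner sales", 500),
    Page("Weather", "rain tomorrow", 7000),
]

if __name__ == "__main__":
    print(rank_pages(PAGES, "vacuum cleaner"))
```

In this sketch the hugely popular "Mega Vacuum Store" still ranks below both shops for vacuum cleaner, because it only has the words in the reverse order. Popularity decides nothing except which of the two exact matches comes first.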

Competing Search Engines: Yahoo

If I look up Vacuum County on http://www.yahoo.com, my top two results today are:

  1. VACUUM COUNTY, Chapter Twenty

    VACUUM COUNTY. PART THREE, Chapter Twenty. Copyright 1991 Aya Katz. Chapter 20 PROMISES KEPT. FROM VACUUM COUNTY FILES. PROGRESS REPORT. VACUUM COUNTY ADULT PROBATION
    www.well.com/user/amnfn/vac20.html - Cached
  2. VACUUM COUNTY, Chapter Twenty-Seven

    VACUUM COUNTY. PART THREE, Chapter Twenty-Seven. Copyright 1991 Aya Katz. Chapter 27 THE SHEEP AND THE SHEPHERD. THE NEW YORK TIMES. Tax-Evading Rancher Wonders Why He Doesn't ...
    www.well.com/user/amnfn/vac27.html - Cached


Now I don't know why they chose those particular chapters, or why they went into vacuum cleaner wholesalers immediately after those two entries. This is not an unqualified endorsement of Yahoo. I think they've been bribed, too. But they're a lot more decent about it, don't you think?

If I look up crede vaccam on Yahoo today, the first thing that comes up is something about vacuum cleaners, but then the second and third entries are two chapters from Vacuum County that contain the sequence crede vaccam.

If I look up Nabal Cabeza de Vaca on Yahoo today, I get two chapters from Vacuum County, followed by these two sites:

  1. Películas gratis de Nabal | Filmografia Nabal | Cartelera ...

    - Translate con la filmografía de Nabal Presentamos tráilers de cine gratis online para ... Cabeza de Vaca | 1990; Fugitivos Rebeldes | 1954; La mujer milagro | 1931; Calígula | 1979
    pejino.com/cine/nabal - Cached
  2. Personaje bíblico | cristianismo | Nabal | Laredo Cantabria

    - Translate 25:14 Pero uno de los criados dio aviso a Abigail mujer de Nabal, diciendo: He ... Cabeza de Vaca | 1990; Fugitivos Rebeldes | 1954; La mujer milagro | 1931; Calígula | 1979
    pejino.com/pelicula/cristianismo/nabal - Cached

The sites in Spanish actually contain all the words in the sequence Nabal Cabeza de Vaca, though not in that order. Yahoo can help the searcher identify the Biblical character Nabal, on whom my novel is based. The fact that in Spanish Cabeza de Vaca isn't just a name, but also three independent words, allows searchers to identify the semantic relationship between the name of the famous explorer and the occupation of the Biblical character Nabal. So even though I think Yahoo violated the ordinary rules of priority in search, I can't feel very upset about it, because they contribute a better understanding of the background of my novel to anyone who might care to know what it is really about.

While we're thinking about those Spanish listings, don't you think it's interesting that all the top Google listings about Alvar Nunez Cabeza de Vaca weren't even in Spanish? How are those the best results, even if I were looking for the famous explorer? Wouldn't his own book Naufragios y Comentarios be a better, more primary source?

Competing Search Engines: DuckDuckGo

At duckduckgo.com, the first result for Vacuum County is chapter twenty of my novel. The rest of the results are vacuum cleaner sites. For crede vaccam, at duckduckgo.com, the top result is chapter eighteen of my novel, followed by documents in Latin that contain both words, though either not in that order or with intervening words. Nabal Cabeza de Vaca at DuckDuckGo yields chapter nine of my novel, followed by the site in Spanish about the Biblical character, followed by sites that contain long lists of names.

So? DuckDuckGo is less corrupt than Google, but not as generous as Yahoo.

Competing Search Engines: Bing

At Bing, Vacuum County yields chapters twenty-seven and eleven of my novel as the two top results, followed by vacuum cleaner listings. Crede vaccam on Bing gets us a vacuum cleaner listing in the top spot, followed by two of my chapters, followed by more vacuum cleaners. Nabal Cabeza de Vaca at Bing gets us chapters twenty-seven and seven of my novel, followed by the Spanish language biblical listing on Nabal, followed by the list of names, followed by Spanish texts.

I'd say Bing is not as good as Yahoo, but possibly equal to DuckDuckGo in the value of the results, though by no means identical.

Is it paranoid to conclude that Google is corrupt?

In discussing the recent algorithm change, there are many opinions. Some are angry with Google and others think this is just a settling down period. Some even say that the bad results are getting top billing in order to find the "bad guys" and punish them.

Me? I don't think there are any bad guys among the listings. The listings are inanimate. They are just information. Information is neither black hat nor white hat. It is what it is. The readers get to decide what they want to read. It should be up to the search engines to arrange the pages according to comprehensible rules. The algorithm should not be a proprietary secret. It should be known to all -- especially the people who are searching, so that they can know what terms to input in order to get the best results for them.

Google claims its algorithm exists to help the searcher find the best results. But the best results are different depending on who you are. In fact, how much weight popularity gets in ordering the search results should be something that a searcher can select for himself, and each searcher should be able to use his own private algorithm the better to help him find what he is looking for. If Google really cared about us, that is what they would let us do.

When someone asserts that Google wouldn't dare give slanted search results, for fear of losing its market share, I have to laugh. They've been doing it for years. All the major search engines are doing it, to a greater or lesser extent. They do it because they don't get paid by the searchers. They get paid by advertisers. The algorithm is all about exactly how many vacuum cleaner sales sites will get higher priority in a search for Vacuum County, sites that would never have gotten into the list in the first place under a simple Boolean search.

How to get around this? Write your own search engine. You won't get rich doing it, because nobody will pay you. But if the results are slanted, they'll be slanted to your bias and nobody else's!

© 2011 Aya Katz

Comments (40)

nhkatz 5 years ago from Bloomington, Indiana

Aya,

I don't know much about the recent change in Google's algorithm.

I think it should be emphasized that even prior to the change, Google's ranking involved popularity of the site.

Whether this is better or not depends of course on who the user is. If it works better for most users then clearly it serves Google's purposes.

Most users do not write their own algorithms because they are unable to construct a consistent algorithm. Others who can, construct an algorithm that is so simplistic that it does not meet their needs.

I always understood that the difference between putting a search term in quotes and not doing so has precisely to do with the importance of the order of the words.

When you search for Vacuum County, you are searching for Vacuum AND County

When you search for "Vacuum County", you are searching for Vacuum County in that order.

Nets


Aya Katz 5 years ago from The Ozarks Author

Nets, it's true that the Google algorithm has been based on popularity for a long time now. I did not mean to imply otherwise. In fact, the change in the values allotted to certain links may have actually limited the effect of popularity in current searches.

My understanding about the difference between a string in quotes and one without quotes was that the order in the quotes was obligatory, but the order without quotes was prioritizing. By this I mean that "Vacuum County" requires the exact terms in the exact order. But when they are entered in sequence without quotes, then the ones in order appear first, then those not in order.

If it were as you describe, wouldn't +Vacuum +County give you exactly the same results as Vacuum County?


Paradise7 5 years ago from Upstate New York

There never will be much traffic for unusual titles or words, so I think Google treats most of them as misspellings, and adapts accordingly. It does make sense for most users.

I wonder if you'd have better luck with marketing your book or getting it read by piggybacking onto an Amazon or eBay site.


Aya Katz 5 years ago from The Ozarks Author

Paradise7, thanks for your comment. Of course, you are right, and I do plan to publish my book and sell it on Amazon this year. But here's the amazing thing about it: the reason it will get more traffic on Amazon is because Amazon will stand to make money by selling it. The keywords will be the same. The traffic will come from the sales potential. So even if we are just interested in getting people to read something, it pays to sell it rather than give it away.


hot dorkage 5 years ago from Oregon, USA

Peeps understand money better than anything. If you just put something out there to share the SE's don't get it cuz they're all geared to sites trying to make $$.

I wrote a SE for my own site but I was only searching data in my own DB.


Peter Dickinson 5 years ago from South East Asia

Interesting article. You seem to have done the same experiments as me and then some. What I have never been able to understand is how some things I search for can drop from Google page one to Google page four overnight. Thank you for sharing. Good luck with the book.


Aya Katz 5 years ago from The Ozarks Author

Hot Dorkage, agreed. Money is easy to understand for all, whereas other things, if not translated into currency, are harder to grasp.

I think that it's great you wrote your own database search engine. How hard would it be to convert that to an internet wide search? People have told me you need a lot of servers, so that it would actually require an investment of considerable resources to compete with Google now. Is that true? Any way around it?


Aya Katz 5 years ago from The Ozarks Author

Peter Dickinson, thanks! The results on Google are constantly shifting, even without a major algorithm change. Some of that may be legitimate, as a result of the publication of new data on new pages, which would naturally cause a re-ordering. However, a lot of it is not based on the data at all, but has more to do with what site is in favor with the Google gods today.


hot dorkage 5 years ago from Oregon, USA

To answer your question, Aya, it would be a job for Google, not me. My database was just my own content, organized and optimized the way I wanted. Google indexes and categorizes the entire public internet (or tries to). That is their database. They need a massive worldwide distributed system of massive server farms, and another massive army of crawlers that work 24/7 just to keep it up to date, then clever algorithms to optimize the search so it runs in a sane amount of time. No, not something I could do in my garage.


Aya Katz 5 years ago from The Ozarks Author

Hot Dorkage, that's what I've been told. But when Google started out, surely they didn't have all that?


nhkatz 5 years ago from Bloomington, Indiana

Aya,

I tried it and indeed +vacuum +county gives me the same results as vacuum county.

Nets


nhkatz 5 years ago from Bloomington, Indiana

About Google's hardware, I read that when they were starting out they had access to a strong university computer lab and then they got kicked out when it was found they were monopolizing its resources.

You don't need the servers of course, but you do need the crawlers. Perhaps you could restrict what part of the web you are willing to crawl in some way ...


hot dorkage 5 years ago from Oregon, USA

nhkatz, if you don't have the servers, how would you host your gargantuan database? And if you don't have the servers, where would the crawlers originate from?

The rumour about the basement Google team co-opting a University computer is true. I know a guy who was in on Google at the beginning. He's probably sipping a mai tai on a beach somewhere now.

Google didn't have all those resources when they started, of course not. The web was probably 5% then of what it is now, and when they launched they didn't cover everything. It was their clever algo that made them eat the other existing SE's of the time (Yahoo, AltaVista, Lycos) for breakfast.


Aya Katz 5 years ago from The Ozarks Author

Nets, try hot dog and compare with dog hot, without quotes. They yield different results for me. Someone more knowledgeable than I am explained the difference to me this way: they are weighted as to importance based on order of entry.


Aya Katz 5 years ago from The Ozarks Author

Hot Dorkage, do we need exclusive use of a server in order to crawl effectively?

How much of a database does Google have to keep of search results? It doesn't seem to recognize content it encountered before as being word for word the same, which makes me think it does not keep a mirror of the web in its database.


Aya Katz 5 years ago from The Ozarks Author

BTW, I get different results for +crede +vaccam as opposed to crede vaccam.


hot dorkage 5 years ago from Oregon, USA

You can crawl all you like. What do you do with the data you find?


Aya Katz 5 years ago from The Ozarks Author

Presumably, order it, report it, and dump it. So you're saying that to order it, even just momentarily in response to a query, you would need as much memory as the entire web?


Barbara Kay 5 years ago from USA

The last time I used Google Search, I ended up getting disgusted and going over to Yahoo. I did come up with better results. I am biased against Google right now, so maybe that was part of the problem. I'm kind of angry.


Aya Katz 5 years ago from The Ozarks Author

Barbara Kay, I understand how you feel. I do think they are not giving the best results and that a simple search, without unusual operators, yields some fairly strange results.

Since I wrote this hub, it has been pointed out to me that using "advanced search" will bring better results. Very knowledgeable people don't get stuck with the bad search results from Google, because they use advanced search. However, for those of us who want to talk to all the readers out there, and who know that the majority of searchers won't bother with advanced search, this is not much consolation.

I think that Yahoo does give more pertinent results for the average searcher who doesn't have any special expertise.


hot dorkage 5 years ago from Oregon, USA

No, you don't need as much storage as the entire web; that's absurd. But crawling is time-consuming. Your search would take days if you had to go crawl each document on each search. It would be like reading every book in the library to see if it had what you want. Google builds a "card catalog" of the web, if you will, and then it's there for people to use to search. It takes time to build it, and in order for it to be useful you have to keep the results of your crawl; otherwise you'd be building and rebuilding it on every search query.


crystolite 5 years ago from Houston TX

Good write-up, thanks.


Aya Katz 5 years ago from The Ozarks Author

Hot Dorkage, okay. That seems reasonable. But the card for each page in the card catalog can't contain all the information in the page, right? So if an obscure term didn't make it into the card catalog, how will the crawler know where to look for it? In the old library system, things were catalogued by subject, title and author. Today, we can find individual words and phrases.


Aya Katz 5 years ago from The Ozarks Author

Thanks, Crystolite.


nhkatz 5 years ago from Bloomington, Indiana

Aya,

I didn't say that +vacuum +county gives the same result as +county +vacuum. The difference may be that more emphasis is placed on the first word. So you get more vacuum cleaner ads.

Nets


Aya Katz 5 years ago from The Ozarks Author

Nets, I also am not able to find proof on the first page of Google using the keywords vacuum county to show there's a difference between putting a plus in front of each keyword as opposed to simply typing them in in the normal sequence.

However, using crede vaccam I am able to find a difference between the first page results of +crede +vaccam as opposed to crede vaccam.

http://www.google.com/#sclient=psy&hl=en&q=crede+v...

and

http://www.google.com/#sclient=psy&hl=en&q=%2Bcred...

Since the search results are not identical in this case, I submit that there is a semantic difference between using the pluses or just the sequence without pluses. (This semantic difference might not always yield different results on the first page, but it is no less real because of that.)


hot dorkage 5 years ago from Oregon, USA

That's why they have key words and stuff. But the crawler flags unusual or distinctive terms in the text of your page, so the info Google keeps on a web page is more detailed than a card catalog. They have heuristics about keyword density and all sorts of stuff. Read an article on SEO if you would like to optimize your pages more for Google, and search engines in general. Cuz even if you like some other searcher, most people use Google. Personally, any time I tried to really trick out a page for Google, the quality of the prose suffered.


Aya Katz 5 years ago from The Ozarks Author

Hot Dorkage, thanks for the tips. I'm feeling ambivalent about adapting to the Google dominated environment. I don't want my prose to suffer, and I would rather change the market than adapt to it in a completely passive way. How to do this, of course, is a difficult question. If it can't be done by building a competing search engine, then it must be necessary to support an existing search engine that sanctions better prose.


OpinionDuck 5 years ago

Aya

I like the name of duckduckgo

I will have to try it.

The problem with the Search Engines is that they are paid for by advertisements, either directly or indirectly.

Thanks


Aya Katz 5 years ago from The Ozarks Author

OpinionDuck, it makes sense to me that you like the name DuckDuckGo! ;-> I agree: the problem with all the major search engines is that they get paid by advertisers. The question is: is there any other way to do it that would be fiscally sound? Would you pay for a subscription to a really honest search engine? Would most people?

Cable TV was supposed to be an alternative to broadcast TV where subscriptions, instead of advertisements, paid for the programming. But now, if you want to watch TV at all, you pretty much need a subscription, and there are still ads on TV when you switch it on.


OpinionDuck 5 years ago

Aya

Thanks

You make some pretty good points.

Although, Westlaw for legal research was a subscription service. Can you imagine what the search results would be if you relied on Google? You could pretty much count on precedents being unreliable, if someone wanted to pay Google to distort the results.

Pay TV for Over the Air Network stations is really just for transmission, and not for the shows themselves. So we pay to have TV shows with commercials and annoying flash pop ups.

With the expansion of the Internet, cable and satellite TV could be changed dramatically.

Technologically, I don't think anyone can out do Google without spending as much money on it as they did, but much of their technology was to make it cash driven. So maybe there is an opportunity for a pay subscription service that is unbiased.

What I just said sounds confusing, but there are two aspects of a search engine. One aspect is actually finding the content that the user is looking for, while the other aspect is the business model of the company that built the search engine.

The business model funds the company but it does so at the expense of making the search results biased towards the paid-in clients.

So in essence the advertisers are really paying into the search engine to get themselves a preference.

I never thought that people would pay for network television, so there might be a way for a search engine company to get paid solely by the user.

Offhand, I can't think of any scenarios but I am sure it is possible.

It would have to start with a niche of some kind, where objectivity was worth paying for. The algorithms for the search would have to have more human involvement than pure technological routines. Possibly a central clearing house type of setup: rather than searching the web, the websites would go to the clearing house and submit their site.

Just a thought


Aya Katz 5 years ago from The Ozarks Author

OpinionDuck, thanks for your thoughtful response. I agree that the search engine problem is made of two separate but interlocking issues: 1) the technical aspects of how to build an effective search engine that can give its results in a matter of seconds and 2) the business problem of how to pay for this service.

I noticed that among the people who comment on this hub, those who have a lot of expertise on the technical side tend to attribute Google's success to technical know-how and these are also the people who downplay any bias in the results. They say that if used properly, Google comes up with useful results. On the other hand, people who understand the business end of the problem tend to see the bias, but have no idea how to make a search engine that would compete.

What we need is to find someone equally knowledgeable about both issues, who is interested in beating the bias. But that's kind of like hiring a pirate to help you battle the Federal government. How can you trust the pirate?


OpinionDuck 5 years ago

Aya

Great comment, I can't add anything to it.

Thanks


nhkatz 5 years ago from Bloomington, Indiana

http://www.google.com/corporate/tenthings.html

This is what Google thinks. Notice how it deals with both the business side and the technical side. It differs from what your "technically knowledgeable" readers think because it recognizes that search is a non-trivial problem. (The first paragraphs of this hub assert that search is a trivial problem. Your "technically knowledgeable" readers don't disagree. They just think you need really big computers.)

On the business side, Google claims that the aim of their business is to provide the best results to their users and that everything else will follow. Of course, you have lots of complaints. But it isn't as if there were no complaints the other way. See e.g.:

http://techcrunch.com/2011/03/16/danny-sullivan-on...


Aya Katz 5 years ago from The Ozarks Author

Nets, thanks for the links. I don't think I said that search is a trivial problem. I said that the algorithm should be known to all, so that they could manipulate their queries to get the results that were best for them as individuals.

Google says in the link you posted that it wants to help the user -- "you", it calls it -- to find what the user means to find, as opposed to finding something else. But every user is different. Like Humpty Dumpty, we pay our words to mean whatever we want them to mean. So Google proposes to solve this democratically -- their word for it, not mine. By using links, it proposes to find what is best for more people, by leaving out the people whose preferences and language use are atypical.

Of course, we all know what happens in a true, unfettered democracy. And this is exactly what is happening with Google.

I haven't read your second link yet. I may comment on that later.


Aya Katz 5 years ago from The Ozarks Author

Nets, I read your techcrunch link and saw the video. I can't agree that Hubpages results on Google were spammy merely because HP was gaming the system. Without this gaming, no new information could ever come to light. We'd only get the established experts and people who had many friends showing up in search results.

Hubpages allows friendless, unconnected people to share with the world astounding discoveries about reality, discoveries that would otherwise be suppressed.

My hubs, which were written for the purpose of getting fresh, new information to searchers, were discovered on Google thanks to HP. Who had even heard of Project Bow before I went public with it on HP? Then I got 60,000 hits in the course of two days. Now, people may not agree with my claims, but at least they know about them and can judge for themselves.

I'm afraid that under the new system, the establishment wins, and rogue researchers lose.


nhkatz 5 years ago from Bloomington, Indiana

Aya,

You point out very clearly the reason your hubs probably ought not be on hubpages. But certainly the point of hubpages, in general, is to be spammy.

Now, the real issue is, if people were paying for a search engine, what kind of search engine would they pay for? Are you sure it wouldn't be one that is difficult to spam and hence would filter you out?

If a search engine comes with an instruction manual on how to spam it, it will be spammed. Even if it doesn't, I think it can be spammed because the spamming problem is a problem in search.

If the algorithm is made public, is it a problem for you that the algorithm is too hard for you to understand? (What if it involves eigenvalues and eigenvectors?) (Hint: it does!)


Aya Katz 5 years ago from The Ozarks Author

Nets, this is very intriguing! Do you know the Google algorithm? Has it been published somewhere where only mathematicians can read it? How do you know what it involves?

Sure, it might be a problem for me that I am not sophisticated enough to understand it, but that's a personal problem. As long as we all have access to the information, at least those who are smart enough might have a chance.

As for what you said about spam, I'm not sure how to take it. Are you saying that I was spamming those who had a genuine interest in ape language studies? Or are you saying that I alone of all the users on Hubpages am not a spammer? Because I would take issue with either interpretation.


Aya Katz 5 years ago from The Ozarks Author

For those of you who are following the comments in this hub, Nets has shared with me a technical article about the Google algorithm.

Here it is:

http://www.rose-hulman.edu/~bryan/googleFinalVersi...

After you read the article, come back and tell me what you think!


Aya Katz 5 years ago from The Ozarks Author

The math is over my head, but here is my layman's understanding of what the problem is supposed to be: it's not enough to calculate how many other pages are linking to a particular page in order to determine its page rank. Google wants to assign a greater value to a link coming from a popular page than one coming from an unpopular page. But... this is hard to calculate when it's all changing all the time, and if we assume that everyone is equal at the beginning of the calculation.

So, assuming my interpretation of the problem is correct, then the claim that it's all a "democracy" is based on the idea that in a democracy, people don't just decide by majority rule. They first have to figure out who is more popular, and the popular people get more votes.
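
For readers who want to see the idea in the comment above in concrete form, here is a minimal sketch of link-weighted ranking computed by repeated averaging (power iteration). It is not Google's actual algorithm, only an illustration of the principle that a link from a popular page counts for more than a link from an unpopular one. The page names, the damping value of 0.85, and the function name pagerank are all assumptions made for this example. Every page starts out equal, and the scores are recomputed until they stop changing.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Toy link-weighted ranking by power iteration (illustrative only).

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key. damping is an assumed value.
    """
    pages = sorted(links)
    n = len(pages)
    # Everyone is equal at the beginning of the calculation.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Small baseline score that every page receives regardless of links.
        new = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links[src]
            if targets:
                # A page splits its own score evenly among the pages it links to.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[src] / n
                for p in pages:
                    new[p] += share
        change = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if change < tol:
            break
    return rank


# Hypothetical web: X and Y each have exactly one incoming link.
# X is linked from P (which three pages link to); Y is linked from U
# (which nobody links to).
links = {
    "A": ["P"], "B": ["P"], "C": ["P"],
    "P": ["X"],
    "U": ["Y"],
    "X": ["A", "B", "C"],
    "Y": ["A", "B", "C"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy web, X ends up ranked above Y even though each has a single incoming link, because X's link comes from the better-connected page P. The final scores are the entries of an eigenvector of the link matrix, which is where the eigenvalues and eigenvectors mentioned earlier in the thread come in.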
