LSI: The New High In CyberSearch?

Latent Semantic Indexing, the new buzzword in ‘Search’, has been quietly and insidiously spreading its tentacles through the great big Google www world. Webmasters still can’t decide whether it counts towards their rankings or not. Here’s what could be good about it from the other side: for the millions who see that Google Search Bar as their entry into a world of information.

What exactly is LSI?

It is, pure and simple, yet another information retrieval technique. However, it goes beyond plain keyword matching in that it gives you more than just the millions of pages that contain the word you just typed into the Search Bar. Have you ever been totally frustrated when you’ve wanted information on a particular subject and had to dig through hundreds of pages to find it? What LSI is trying to do is make search more relevant. When you type in a keyword, the search engine looks up the indexed pages that contain that particular word and also weighs pages that use semantically similar words. So what you get as a result are pages that are more relevant to the subject you are interested in, rather than just the ones that contain your search word. While you may not notice much of a difference just yet when you search for simple terms, you most definitely will when you are looking for something more complex.

For webmasters, however, how this affects rankings is not yet clear. SEO and all the other complex factors that go into getting your web site up there on Page 1 are too important to discard, and it might be quite a while before LSI kicks in and becomes the yardstick. Most of the webmasters I know and work with might look for a middle ground combining both, but they are not about to throw out all their SEO manoeuvrings just yet. They can’t afford to, not if they are playing the rankings game in real earnest.

A magic wand it isn’t

If you think LSI will get you into some kind of realm of artificial intelligence, banish the thought. It is, and was designed to be, a mathematical technique. However, given the way it functions, one can be forgiven for thinking otherwise. As one expert puts it, it takes the whole search operation from a common accountant mentality to a new level of matrix algebra. It’s a powerful algorithm that computes similarity values between terms and documents and arranges the results by them. What you get is page indexing that goes beyond merely searching for a term: a stage of analysis comes before the search begins.


Adding another dimension

It’s like moving from 2-D to 3-D. LSI retrieves documents based on similar content, and those similarities are determined by the content of all the relevant pages. So what sometimes took you hours to plough through, with numerous permutations and combinations of search words or phrases, will be done behind the scenes of that search bar and presented to you. What it does is correlate semantically similar words over thousands or maybe millions of related documents and then come up with a set of content words that are likely to be relevant.
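To make the matrix algebra a little more concrete, here is a minimal sketch in Python, assuming a five-page toy corpus that is made up purely for illustration. It is not how Google does it; it only shows the general idea of building a term-document matrix, keeping the two strongest latent dimensions, and comparing pages in that reduced space. A real system would normally call a library SVD routine; to stay dependency-free, this sketch recovers the same factors by power iteration on the document-by-document matrix, and all the function names are hypothetical.

```python
import math

# Toy corpus (hypothetical): three music pages and two tax pages.
DOCS = [
    "song lyrics",
    "music lyrics",
    "music mp3",
    "tax accountant",
    "tax audit",
]


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[t][d] counts term t in document d."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    matrix = [[doc.split().count(term) for doc in docs] for term in vocab]
    return vocab, matrix


def gram(matrix):
    """Document-by-document matrix A^T A (dot products of document columns)."""
    n = len(matrix[0])
    return [[sum(row[i] * row[j] for row in matrix) for j in range(n)]
            for i in range(n)]


def top_eigenpairs(sym, k, iterations=500):
    """Top-k eigenpairs of a symmetric matrix via power iteration and deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        vec = [1.0 / math.sqrt(n)] * n
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(n))
                    for i in range(n))
        pairs.append((value, vec))
        # Deflate: remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def latent_document_vectors(docs, k=2):
    """Coordinates of each document in a k-dimensional latent space."""
    _, matrix = term_document_matrix(docs)
    pairs = top_eigenpairs(gram(matrix), k)
    # Eigenvalues of A^T A are squared singular values, so scale by their roots.
    return [[math.sqrt(max(value, 0.0)) * vec[d] for value, vec in pairs]
            for d in range(len(docs))]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


_, counts = term_document_matrix(DOCS)
raw = [[row[d] for row in counts] for d in range(len(DOCS))]
latent = latent_document_vectors(DOCS, k=2)

print("raw similarity, 'song lyrics' vs 'music mp3':", cosine(raw[0], raw[2]))
print("latent similarity, same pair:", cosine(latent[0], latent[2]))
```

In the raw word counts, ‘song lyrics’ and ‘music mp3’ share no words at all, so plain keyword matching sees nothing in common. In the reduced space they land close together, because the ‘music lyrics’ page links their vocabularies, while the tax pages stay far away.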

When Google acquired Applied Semantics, it was a foregone conclusion that it would only be a matter of time before that company’s CIRCA software would be put to use in the retrieval of information. This application extracts and organises information and almost mimics human thought. What it has done for cyber search is to go beyond keywords to keyword themes.

Tilde Tactics

How do you get access to this new dimension of searching? It’s easy. Look at the little-used key on your keyboard to the left of your ‘1’. That squiggly symbol on the top is called a ‘tilde’. That’s the magic key to get you there. All you need to do is put that little symbol in front of your search word, like so: ~song. Search both ways and see the difference. The first time, search without the tilde, and what you get are pages that contain the word ‘song’. Then add the tilde before it. See the difference? Now you might just have pages listed that don’t contain the word ‘song’ at all. The results could include documents that have the words ‘music’, ‘lyrics’, ‘MP3’, etc. (Look at the words in bold and you’ll see the keywords that are being picked up.)
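For the curious, the effect of the tilde can be imitated with a few lines of Python. This is only a rough sketch: the related-word table and the three pages are hypothetical, written by hand for the example, whereas a real search engine would work out related words from its own data. The function names are made up too.

```python
# Hypothetical related-word table; a real engine would derive this from data.
RELATED = {"song": {"music", "lyrics", "mp3"}}

# Hypothetical pages, keyed by a made-up page name.
PAGES = {
    "page-a": "download the song here",
    "page-b": "free music and lyrics",
    "page-c": "tax advice for freelancers",
}


def expand(query):
    """Turn '~word' into the word plus its related words; leave plain words alone."""
    if query.startswith("~"):
        word = query[1:].lower()
        return {word} | RELATED.get(word, set())
    return {query.lower()}


def search(query, pages=PAGES):
    """Return the sorted names of pages containing any of the expanded terms."""
    terms = expand(query)
    return sorted(name for name, text in pages.items()
                  if terms & set(text.lower().split()))


print(search("song"))
print(search("~song"))
```

Searching for song finds only the page that uses that exact word, while ~song also brings up the page about music and lyrics, which never mentions ‘song’.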

Bringing back the joy of writing

What does this mean for someone who is an online writer? HOPE. At the present moment, probably nothing more. However, the fact that more and more people will search in a more focussed way means that even if your sites are way down in the rankings, chances are that if readers are doing an LSI search, you will get read. And what is most welcome is that you don’t need to stuff all those keywords into the copy. As long as the relevant words and phrases occur naturally, those invisible spiders will find them and present them to the person who is looking for them. So Content might emerge from the clutches of SEO to remain king, making it easier to search and easier and more satisfying to write. Will the webmasters welcome it? That’s something we’ll have to wait to find out.


24 comments

sumosalesman 8 years ago from Somersworth New Hampshire

Fascinating piece... it's great to be brought up to speed on what may be the cutting edge of search in the months to come.

Shalini Kagal 8 years ago from India Author

Thank you sumosalesman - it would be interesting to see how it goes - what is fascinating is how it seems to work!

vitaeb 8 years ago from Shenandoah Valley, Virginia

Wow! This is good news to me! Thanks for bringing this information to our attention. I've tried some searches using the ~ and saw some of my writing come up on the 1st page. Awesome.

Shalini Kagal 8 years ago from India Author

That's because your writing's great vitaeb - I should know :)

ColdWarBaby 8 years ago

Very interesting and possibly very potent! Thank you very much!

Shalini Kagal 8 years ago from India Author

Thanks for stopping by CWB - yes, it could well be!

Seabastian 8 years ago from Raleigh

I am surprised that Google has not made more progress in coming up with ways to describe similarity searches versus absolute, keyword-only searches. There is some capability to do that with boolean searches, but that is beyond most users' ability, and even those like myself who can figure it out don't want to go through the effort.

It would seem that Google could make it a lot easier by letting you pick from the initial results and tell Google that, out of the first 20 results they served up, what you are looking for is similar to three of the results, and that it should totally exclude similarities to two or three of the other results.

With Google's computing capabilities they should then be able to construct a refined boolean search to provide what you are looking for, even though you have not been able to provide a concise, explicit search.

Steve Little 8 years ago

Shalini - this is a great hub. Theme based search has been a long time in coming and its impact for legit businesses is welcome. The implications are for site transparency and visibility based on valuable, original, on topic content. Keep us posted on any new information. Thanks. Steve

Shalini Kagal 8 years ago from India Author

Seabastian: Thanks for reading. Yes, much as I love Google, I do agree that there is room to make their search better - right now, it seems to be an implied boolean search. Similarity searches with the option of exclusion, as you suggest, would be a great way to go in order to make it much more refined and focussed - and a lot less frustrating! Let's hope LSI shines the light in the right direction.

Steve Little: Thanks for stopping by - yes, it's been a long time coming. One wonders though how soon it will be before keywords move towards keyword themes.

Christoph Reilly 8 years ago from St. Louis

Another remarkably informative hub. Whenever I read your work, I learn something that is relevant to me. You also have a knack for writing about complicated subjects in a very understandable and accessible way. Very good work! Now I'm going to go play with my tilde~.

Shalini Kagal 8 years ago from India Author

Now that, Christoph, sounds awfully naughty LOL!

Thanks for reading and for the kind comments :)

countrywomen 8 years ago from Washington, USA

First SEO and now LSI. I have always learnt something new from your hubs. You have so much to offer and I am glad through hub pages I am receiving so much from you. Great hub.

Shalini Kagal 8 years ago from India Author

CW - thanks as always for the kind words - you always have something nice to say, bless you! I'm just a newbie with a curious mind :)

cgull8m 8 years ago from North Carolina

Me also like countrywomen, thanks for sharing this. We are missing out a lot on Google search by just using the simple stuff but not the advanced ones. Thanks Shalini.

quicksand 8 years ago

Hi Shalini ...

Albert Einstein, Thomas Edison, Isaac Newton, Michael Faraday, Stephen Hawking ... all these men had something in common with you. A "curious mind." :)

Shalini Kagal 8 years ago from India Author

cgull - thanks for reading!

quicksand - thanks - you're awfully good for my ego :)

newcapo 8 years ago

Wow, Shalini, had no idea this existed. You've sure opened my eyes, my friend. Great hub!

Shalini Kagal 8 years ago from India Author

Thanks for stopping by and reading newcapo!

RobertHassey 7 years ago from United Kingdom

Your last paragraph, on hope for the online writer, is intriguing. One does not have to stuff all those keywords? Maybe. But one has to stick to one's guns in terms of what works. And so far LSI to me means several categories of keywords, from main to peripheral, to be used in a set-length piece with an eye on overall keyword density. But just because there are more keywords one could use, that does not always translate into natural phrasing. Good hub though.

Shalini Kagal 7 years ago from India Author

As a writer, Robert, one lives in hope that there can be a happy medium between keyword stuffing and a more moderate use of keywords. I do agree that overall keyword density is essential today to get your work read - and you're right - they don't often translate into natural phrasing! Thanks for reading.

sixtyorso 7 years ago from South Africa

Shalini, great, interesting information. Well put, so that the layman can easily understand it. Well, I guess I must try sear~ch~ing!

Shalini Kagal 7 years ago from India Author

sear~ch~ing!!! good one! Thanks for reading sixtyorso!

temp_inizer 6 years ago from Noida (India)

Thanks Shalini...

I just want to know how to implement LSI or how to use LSI for my site, so that my site is more searchable...


Shalini Kagal 6 years ago from India Author

Hi temp_inizer - thanks for reading - I've just added a capsule up there on how it works. Hope it helps!
