The wonders of GooHackle

After stumbling onto an absolutely fascinating exposé of the most-used Internet words, courtesy of GooHackle.com, we simply had to know more. GooHackle possibly represents the epitome of the Google parsing and scraping services available to modern man.

What is Google Scraping?

Scraping Google involves the extraction and consolidation of useful information from Google.com web pages for supplemental applications and decision support. At least we envision that it does. Also encompassed in the process of scraping might be the intentional avoidance of special 'roadblocks' such as captcha technology and IP address screening. Google, along with many other internet-based web sites, implements mechanisms that some might describe as extreme. The captcha form, for example, attempts to limit access to some Google information by obligating users to visually process a specially engineered text string and correctly type that string back to the Google server before access is granted. Should a user prove incapable of accurately echoing the information, access is denied. Extreme data acquisition requires that captcha processing be automated.

GooHackle claims to possess technology to evade Google's captcha security. The home page of GooHackle.com provides a link to another GooHackle page explaining their breakthrough. Evidently the programmers at GooHackle realized that they could get any human to enter the captcha code, not just the human who actually wanted the information. Every time they need to get past a captcha, they capture it from the Google page and present it to some other Internet user, who obligingly translates it.

What can Goo do for you?

GooHackle also offers an extremely cool Perl script that processes a Google search results page into a simple list of URLs (Uniform Resource Locators). This URL list can be funneled into other information processing tools for expedited SEO and other useful applications. Google only offers the first 1000 URL results for any search, spread across 100 pages of 10 links each. Instead of enduring the tedium of manually processing 100 pages of information, consider this script. Try a demo of the script output on the GooHackle site: simply enter a search string, click a button, and sit back in amazement as a raw list of URLs is almost immediately returned.
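We have not seen the GooHackle script itself, so the following is only a minimal sketch of the general idea in Python: pull the link targets out of a page of HTML and return them as a plain list. The names LinkCollector, extract_urls, and SAMPLE_PAGE are our own inventions, and the sample markup is a made-up stand-in, not real Google output. The sketch reads a string that is already in memory and makes no network requests.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute http(s) link targets from anchor tags, in page order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links we care about.
        if tag != "a":
            return
        for name, value in attrs:
            # Keep absolute links only; skip relative links and duplicates.
            if name == "href" and value and value.startswith(("http://", "https://")):
                if value not in self.urls:
                    self.urls.append(value)


def extract_urls(html):
    """Return the de-duplicated absolute URLs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical stand-in for a saved results page; not real Google markup.
SAMPLE_PAGE = """
<html><body>
  <a href="https://example.com/cars">Cars</a>
  <a href="/preferences">Settings</a>
  <a href="http://example.org/used-cars">Used cars</a>
  <a href="https://example.com/cars">Cars again</a>
</body></html>
"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE_PAGE):
        print(url)
```

A real results page would need extra filtering to separate genuine result links from navigation and advertising links. That part depends on markup we can only guess at, so it is left out.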

Expect some resistance from Google. The type of scraping exploited by this tool is certainly frowned upon by the Google engineers and lawyers. Google possesses an extreme affinity for the information it provides, despite claiming as its company motto the self-incriminating phrase "Don't be evil."

Should you get involved with Goo?

Enraptured with GooHackle yet? Some folks find themselves enthralled with the concept of 'defeating' Google in order to obtain useful Internet metrics. Everyone needs a hobby. We enlivened our otherwise dull day by testing the GooHackle Keyword Popularity Tool. In order to exercise the tool, we entered the single keyword 'cars' and clicked a button on the form. In a few moments we observed that this keyword appears on over 509,000,000 different web pages.

That's a lotta web pages.

GooHackle obviously didn't scan 509,000,000 web pages in order to extract this information. Evidently an efficient little Perl script extracted the number from a Google results page. Regardless of the provenance of the number, it's an interesting parlor trick.
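For the curious, the parlor trick might look something like the Python sketch below. This is our guess at the technique, not GooHackle's code. It assumes the results page contains a phrase such as "About 509,000,000 results"; that wording, the pattern, and the function name extract_result_count are all assumptions made for illustration.

```python
import re

# Assumed wording: a comma-grouped number followed by the word "results".
COUNT_PATTERN = re.compile(r"(\d[\d,]*)\s+results", re.IGNORECASE)


def extract_result_count(page_text):
    """Return the first result count found in the text as an int, or None."""
    match = COUNT_PATTERN.search(page_text)
    if match is None:
        return None
    # Strip the thousands separators before converting to an integer.
    return int(match.group(1).replace(",", ""))


if __name__ == "__main__":
    print(extract_result_count("About 509,000,000 results (0.42 seconds)"))
```

One pattern match against one page of text is all it takes, which squares with the number showing up in a few moments.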

6 comments

Wayne Brown 5 years ago from Texas

I'm one of those guys who believes there is no reason to go to the back of the caves to see the bats if there are a few hanging around the entry. So far that has served me well so I imagine Goohackle will do little for me but thanks for serving it up in an informative and understandable way! WB


Tom Whitworth 5 years ago from Moundsville, WV

nicomp,

The only possible drawback I can see to a gob of Goo from GooHackle is that it turns out to be like Forrest Gump's box of chocolates. You never know what you're getting until you bite into it.


Stan Fletcher 5 years ago from Nashville, TN

This was fascinating, enthralling, enlightening and well-written. Definitely the best info on GooHackle I've heard. And all this time I thought it was a new kind of fishing bait. "Jimmy Don, pass me some more of that GooHackle. That last fish that got away managed to get all the GooHackle off my hook before I lost him." As you can see, I'm on the cutting edge of all things technical.


nicomp 5 years ago from Ohio, USA Author

@Stan Fletcher: Nice glasses.


drbj 5 years ago from south Florida

GooHackle, huh? I am enamored of that name.


nicomp 5 years ago from Ohio, USA Author

@drbj: You got it! yay! It was fun for me, let's see if anyone else notices.
