Guide to the different search engines

by Pooka

Created on: May 12, 2008

The Search Engine

Search engines, so aptly named, are the programs that carry you across the World Wide Web. Thanks to this interesting technology and millions of painful hours of programming and reprogramming, finding information on anything in the world can be as easy as a few keystrokes.

There comes a point in every man's life when it occurs to him: I want to know about ninjas! I know it happened to me, and thankfully search engines were there to help! So, as any enterprising young man would do, I went to the famous http://www.google.com and performed a search for ninjas! Only a few seconds later I was provided with the official ninja website, http://www.realultimatepower.net, and all my needs were fulfilled. But how did Google know how to find such a marvelous website? It's all thanks to the World Wide Web and the wonderful creatures that live there.



Long, long ago, search engines such as Google built an extensive archive of websites using robotic spiders! Not the eight-foot-tall giant metal death machines you may think of, but interesting creatures nonetheless. Specially coded software robots called spiders are sent crawling over the web in all directions. Their purpose is to build lists of every website they can find and to nab as many words as they can straight from the text. They start on the most popular sites and follow every link on each page until, in theory, every page on the World Wide Web has been cataloged and indexed. Of course, new pages are being created every day, so it is unlikely the work of a spider will ever be done.
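
To make the spider's job concrete, here is a minimal sketch in Python. It is not how Google's spiders are actually written; it only illustrates the idea of starting from a popular page, following every link, and collecting the words along the way. The tiny in-memory "web" (TOY_WEB) and the crawl function are made up for this example, and no real network requests are made.

```python
from collections import deque

# Hypothetical toy "web": each address maps to (page text, outgoing links).
TOY_WEB = {
    "a.example": ("ninjas are awesome", ["b.example", "c.example"]),
    "b.example": ("pirates are fine", ["a.example"]),
    "c.example": ("real ninjas flip out", ["d.example", "missing.example"]),
    "d.example": ("nothing to see", []),
}

def crawl(seeds, web):
    """Visit every page reachable from the seed pages, breadth first."""
    index = {}            # address -> list of words found on that page
    queue = deque(seeds)  # pages waiting to be visited
    seen = set(seeds)     # pages already queued, so none is visited twice
    while queue:
        url = queue.popleft()
        if url not in web:
            continue      # dead link: nothing to read here
        text, links = web[url]
        index[url] = text.split()
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

pages = crawl(["a.example"], TOY_WEB)
print(sorted(pages))
```

The "seen" set is what keeps the spider from circling forever between pages that link to each other, such as a.example and b.example above.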

Sergey Brin and Lawrence Page of Google gave an example of how quickly their spiders work. They built their initial system to use multiple spiders, usually three at once. Each spider could keep about three hundred connections to web pages open at a time. At peak performance, using four spiders, their system could crawl over a hundred pages per second, pulling in around six hundred kilobytes of data every second.
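
For a rough sense of scale, the figures above can be run through some back-of-the-envelope arithmetic. The inputs are the rounded numbers quoted in this article ("over a hundred" pages and "around six hundred" kilobytes per second), so the results are estimates only, and the variable names are just for illustration.

```python
# Rounded peak figures quoted in the paragraph above.
pages_per_second = 100
kilobytes_per_second = 600

# Average amount of data per crawled page at peak speed.
kb_per_page = kilobytes_per_second / pages_per_second

# Pages crawled in a day if the peak rate could be sustained.
seconds_per_day = 60 * 60 * 24
pages_per_day = pages_per_second * seconds_per_day

print(kb_per_page, "KB per page")
print(pages_per_day, "pages per day")
```

In other words, roughly six kilobytes per page, and several million pages a day if the spiders never slowed down.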

A simple database of websites would often yield many irrelevant results in a search. When searching Google for ninjas, http://www.realultimatepower.net comes up first. This is because of a ranking system that brings up the most relevant websites first. Search engines use different metrics to determine how relevant a website is to the query. A search engine might store the number of times a word appears on the page, give a page more credit for having the word closer to the top, or give credit for having the word in the HTML meta tags. Each search engine has a different method of determining the best results, which is one reason that doing the same search on multiple search engines can yield entirely different results.
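
Here is a minimal sketch of how those three signals could be combined into one score. The weights, the sample pages, and the function names are all invented for illustration; real search engines keep their formulas to themselves, and this is not any particular engine's method.

```python
def score(query, body, meta_keywords=()):
    """Toy relevance score built from three signals (weights are made up)."""
    word = query.lower()
    words = body.lower().split()
    total = float(words.count(word))               # 1. how often the word appears
    if word in words:
        total += 1.0 / (1 + words.index(word))     # 2. bonus for appearing early
    if word in [k.lower() for k in meta_keywords]:
        total += 2.0                               # 3. bonus for a meta tag match
    return total

def rank(query, pages):
    """Return page names with a nonzero score, best match first."""
    scored = [(score(query, body, meta), name)
              for name, (body, meta) in pages.items()]
    hits = [(s, name) for s, name in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [name for s, name in hits]

# Hypothetical pages: name -> (body text, meta keywords).
PAGES = {
    "ninja-site": ("ninjas are mammals and ninjas fight all the time", ["ninjas"]),
    "history-site": ("a long history of feudal japan that mentions ninjas once", []),
    "cooking-site": ("how to cook rice", []),
}

ranked = rank("ninjas", PAGES)
print(ranked)
```

Change the weights and the order of results can change too, which is the same reason two search engines can disagree about the best page for the same query.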

Building a search generally uses formal Boolean logic. You can search for anything using the operators AND, OR, and NOT, as well as quotation marks to indicate that the entire phrase must be present, and often other operators. Fuzzy logic is really neat but, as of yet, unpopular for search engines.
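
One simple way to picture Boolean searching is as set arithmetic over a word index: AND keeps the pages two words share, OR combines them, and NOT subtracts. The sketch below assumes a tiny made-up collection of documents (DOCS) and hypothetical helper names; it does not reproduce any real search engine's query syntax.

```python
# Hypothetical mini collection: document id -> text.
DOCS = {
    1: "real ninjas flip out",
    2: "pirates fight ninjas",
    3: "pirates sail ships",
}

def build_index(docs):
    """Map each word to the set of documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)

def term(word):
    """Documents containing a single word."""
    return INDEX.get(word.lower(), set())

def phrase(text):
    """Documents containing the words next to each other, in order."""
    wanted = text.lower().split()
    n = len(wanted)
    matches = set()
    for doc_id, body in DOCS.items():
        words = body.lower().split()
        if any(words[i:i + n] == wanted for i in range(len(words) - n + 1)):
            matches.add(doc_id)
    return matches

both = term("ninjas") & term("pirates")      # ninjas AND pirates
either = term("ninjas") | term("pirates")    # ninjas OR pirates
only = term("ninjas") - term("pirates")      # ninjas NOT pirates
quoted = phrase("flip out")                  # "flip out"

print(sorted(both), sorted(either), sorted(only), sorted(quoted))
```

The quoted phrase is stricter than AND: a page must contain the words side by side and in order, so searching for "out flip" here finds nothing.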

So my simple search for "ninjas" gave nearly three hundred thousand results. But thanks to the hard work of robot spiders and the database tools of search engines, I swiftly and easily found my way to http://www.realultimatepower.net, the official ninja web page, where I found all the information I ever wanted to know about real ninjas.
