Created on: May 12, 2008
The Search Engine
Search engines, aptly named, are the programs that help you find your way across the World Wide Web. Thanks to this interesting technology and millions of painful hours of programming and reprogramming, finding information on almost anything in the world can be as easy as typing a few keystrokes.
There comes a point in every man's life when it occurs to him: I want to know about ninjas! I know it happened to me, and thankfully search engines were there to help! So, as any enterprising young man would do, I went to the famous http://www.google.com and performed a search for ninjas! Only a few seconds later I was provided with the official ninja website, http://www.realultimatepower.net, and all my needs were fulfilled. But how did Google know how to find such a marvelous website? It's all thanks to the World Wide Web and the wonderful creatures that live there.
Long, long ago, search engines such as Google built an extensive archive of websites using robotic spiders! Not the eight-foot-tall giant metal death machines you may be thinking of, but interesting creatures nonetheless. Specially coded software robots called spiders are sent crawling over the Web in all directions. Their purpose is to build lists of every website they can find and to nab as many words as they can straight from the text. They start on the most popular sites and follow every link on each page until, in theory, every page on the World Wide Web has been cataloged and indexed. Of course, new pages are being created every day, so it is unlikely the work of a spider will ever be done.
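To make the crawling idea concrete, here is a minimal sketch of a spider in Python. It is not how Google's spiders are written; it only illustrates the "start somewhere popular, follow every link, record the words" loop described above. The toy web (TOY_WEB), its example URLs, and the function name crawl are all made up for illustration, and nothing is fetched from the real Internet.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "popular.example": ("ninjas are sweet", ["a.example", "b.example"]),
    "a.example": ("ninjas flip out", ["b.example"]),
    "b.example": ("pirates sail ships", ["popular.example", "c.example"]),
    "c.example": ("robots and spiders", []),
}

def crawl(seeds, web):
    """Breadth-first crawl: visit each reachable page once and record its words."""
    index = {}              # url -> list of words found on that page
    queue = deque(seeds)    # pages waiting to be visited
    seen = set(seeds)       # pages already queued, so none is visited twice
    while queue:
        url = queue.popleft()
        if url not in web:  # dead link: nothing to read
            continue
        text, links = web[url]
        index[url] = text.split()
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

pages = crawl(["popular.example"], TOY_WEB)
```

Starting from the one "popular" page, the spider reaches all four pages by following links, and the seen set keeps it from going around in circles when pages link back to each other.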
Sergey Brin and Lawrence Page of Google gave an example of how quickly their spiders work. They built their initial system to use multiple spiders, usually three at once. Each spider could keep about three hundred connections to Web pages open at a time. At its peak performance, their system could crawl over a hundred pages per second, generating around six hundred kilobytes of data every second.
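A little arithmetic puts those figures in perspective. The sketch below simply plugs in the rounded numbers quoted above; the last line assumes the peak rate could be held for a whole day, which is only an assumption for illustration.

```python
# Rounded figures quoted in the paragraph above.
spiders = 3
connections_per_spider = 300
pages_per_second = 100
kilobytes_per_second = 600

# Connections held open across all spiders at once.
open_connections = spiders * connections_per_spider

# Average amount of data pulled down per page.
avg_kb_per_page = kilobytes_per_second / pages_per_second

# Assumption: peak speed sustained for a full 24 hours.
pages_per_day = pages_per_second * 60 * 60 * 24
```

That works out to about nine hundred open connections, roughly six kilobytes per page, and, if the peak rate held all day, several million pages every twenty-four hours.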
A simple database of websites would often yield many irrelevant results in a search. When I searched Google for ninjas, http://www.realultimatepower.net came up first. This is because of a ranking system that brings up the most relevant websites first. Search engines use different metrics to determine how relevant a website is to the query. A search engine might store the number of times the search word appears on the page; it might give a page more credit for having the word closer to the top, or for containing it in its HTML meta tags. Each search engine has its own method of determining the best results, which is one reason that doing the same search on multiple search engines can yield entirely different results.
Building a search query generally relies on formal Boolean logic. You can combine search terms with the operators AND, OR, and NOT, use quotation marks to indicate that an entire phrase must be present, and often use other operators as well. Fuzzy logic is really neat but, as of yet, unpopular for search engines.
So my simple search for "ninjas" gave nearly three hundred thousand results. But thanks to the hard work of robot spiders and the database tools of search engines, I swiftly and easily found my way to http://www.realultimatepower.net, the official ninja web pages, where I found all the information I ever wanted to know about real ninjas.
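Here is a rough sketch of how such a ranking might be scored in Python. No real search engine publishes its exact formula, so the weights (top_bonus, meta_bonus), the "first ten words" cutoff, and the sample pages are all invented for illustration. The sketch only combines the three signals mentioned above: how often the word appears, whether it appears near the top, and whether it is in the meta tags.

```python
# Hypothetical pages: plain text plus the keywords from their HTML meta tags.
PAGES = {
    "ninja-facts.example": {
        "text": "ninjas are mammals ninjas fight all the time",
        "meta_keywords": ["ninjas"],
    },
    "history.example": {
        "text": "a long history of feudal japan and its castles and roads "
                "with one mention of ninjas",
        "meta_keywords": [],
    },
    "pirates.example": {
        "text": "pirates sail ships",
        "meta_keywords": ["pirates"],
    },
}

def score(page, word, top_n=10, top_bonus=2.0, meta_bonus=3.0):
    """Score one page for one word; the weights are arbitrary assumptions."""
    word = word.lower()
    words = page["text"].lower().split()
    total = float(words.count(word))        # one point per occurrence
    if word in words[:top_n]:               # extra credit near the top
        total += top_bonus
    keywords = [k.lower() for k in page.get("meta_keywords", [])]
    if word in keywords:                    # extra credit for meta tags
        total += meta_bonus
    return total

def rank(pages, word):
    """Return URLs of matching pages, best score first (ties broken by URL)."""
    scored = [(score(page, word), url) for url, page in pages.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for points, url in scored if points > 0]

results = rank(PAGES, "ninjas")
```

Change the weights and the order of the results can change too, which is exactly why two search engines can disagree about the same query.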
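Boolean searching maps neatly onto set operations, as the following Python sketch shows. The three sample documents and the helper names (build_index, phrase) are made up for illustration; a real search engine's index is far larger and more elaborate. AND becomes set intersection, OR becomes union, NOT becomes difference, and a quoted phrase is checked by looking for the words side by side.

```python
# Hypothetical mini-collection of documents, keyed by id.
DOCS = {
    1: "real ninjas flip out",
    2: "pirates and ninjas",
    3: "real pirates sail ships",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def phrase(docs, text):
    """Ids of documents containing the words of `text` consecutively."""
    target = text.lower().split()
    n = len(target)
    hits = set()
    for doc_id, body in docs.items():
        words = body.lower().split()
        if any(words[i:i + n] == target for i in range(len(words) - n + 1)):
            hits.add(doc_id)
    return hits

INDEX = build_index(DOCS)
ninjas = INDEX.get("ninjas", set())
pirates = INDEX.get("pirates", set())

both = ninjas & pirates           # ninjas AND pirates
either = ninjas | pirates         # ninjas OR pirates
only_ninjas = ninjas - pirates    # ninjas NOT pirates
exact = phrase(DOCS, "real ninjas")   # "real ninjas" in quotation marks
```

Note that the quoted search matches only document 1: document 3 contains "real" but not immediately followed by "ninjas".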
by Pooka