How Do Search Engine Spiders Work?

Search engines are what ultimately bring your website to the notice of prospective customers. It therefore helps to know how search engines actually work and how they present information to the person initiating a search.

There are basically two types of search engines. The first type is powered by robots called crawlers or spiders.

Search engines use spiders to index websites. When you submit your website pages to a search engine by completing its submission page, the search engine's spider will visit and index your site. A ‘spider’ is an automated program run by the search engine system. The spider visits a website, reads the content on the actual site and the site's meta tags, and follows the links the site contains. The spider then returns all of that information to a central depository, where the data is indexed. It will visit each link you have on your website and index those sites as well. Some spiders will only index a certain number of pages on your site, so don’t create a site with a billion unfocused pages; create topically relevant web pages instead.
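
The visit, read, follow and store cycle described above can be sketched in a few lines of Python. This is only an illustration and not how any particular search engine is built. The names PageParser, crawl, PAGES and max_pages are made up for the example, and the "web" is a small in-memory dictionary so that the sketch runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects text content, meta tags and outgoing links from one page."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(fetch, start_url, max_pages=100):
    """Breadth-first crawl. `fetch(url)` returns HTML, or None if unavailable."""
    index = {}                  # stands in for the "central depository"
    queue = deque([start_url])
    seen = {start_url}
    # max_pages is a hypothetical cap, mirroring spiders that index
    # only a certain number of pages per site.
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        index[url] = {
            "text": " ".join(parser.text),
            "meta": parser.meta,
            "links": parser.links,
        }
        for link in parser.links:   # follow every link found on the page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical two-page site used in place of real HTTP requests.
PAGES = {
    "/home": '<html><head><meta name="description" content="Home page">'
             '</head><body><p>Welcome</p><a href="/about">About</a></body></html>',
    "/about": '<html><body><p>About us</p><a href="/home">Home</a></body></html>',
}

site_index = crawl(PAGES.get, "/home")
print(sorted(site_index))
```

A real spider would fetch pages over HTTP, resolve relative links, and respect each site's crawling rules; those parts are left out to keep the sketch short.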

A search engine spider will periodically return to websites to check for any information that has changed. How often this happens is determined by the operators of the search engine.
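
One simple way a returning spider could tell whether a page has changed is to compare a fingerprint of the page against the one stored on the previous visit. The sketch below assumes this approach purely for illustration; the function names fingerprint and has_changed and the stored dictionary are hypothetical, and real search engines may use quite different signals.

```python
import hashlib


def fingerprint(html):
    """Return a fixed-length digest of the page content."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(stored, url, html):
    """Compare the page with the last visit and record the new fingerprint.

    A page that has never been seen before counts as changed.
    """
    new = fingerprint(html)
    changed = stored.get(url) != new
    stored[url] = new
    return changed


stored = {}  # hypothetical record of fingerprints from earlier visits
first_visit = has_changed(stored, "/home", "<p>Opening hours: 9 to 5</p>")
second_visit = has_changed(stored, "/home", "<p>Opening hours: 9 to 5</p>")
after_edit = has_changed(stored, "/home", "<p>Opening hours: 10 to 6</p>")
print(first_visit, second_visit, after_edit)
```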

A spider's index is almost like a book: it contains the table of contents, the actual content, and the links and references for all the websites the spider finds during its search. A spider may index more than a million web pages a day.

Example Search Engines: All The Web, Excite, Lycos, AltaVista, Yahoo and Google.

When you perform a keyword search, instructing a search engine to locate and return relevant information, the search engine searches through the index its spider has created; it is not actually searching the World Wide Web in real time to produce the displayed search results. Different search engines produce different rankings because they do not all use the same algorithms to search their indices.
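
To make the point about searching an index rather than the live web concrete, here is a minimal sketch of an inverted index, a common structure that maps each word to the pages containing it. The page texts and the names tokenize, build_index and search are assumptions made for this example; it returns matching pages without ranking them.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs containing every word of the query (no ranking)."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical page texts already gathered by a spider.
pages = {
    "/spiders": "How search engine spiders crawl the web",
    "/recipes": "Quick pasta recipes for the week",
    "/seo": "Search engine ranking basics",
}

index = build_index(pages)
print(sorted(search(index, "search engine")))
```

The query is answered entirely from the index dictionary; none of the pages is fetched again at search time.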

Algorithms in Practice:

Search Engine Example 1

Search engine engineers develop and implement an algorithm that instructs the search engine spider to scan for keyword frequency and keyword relevancy, as well as keyword placement, in the content of website pages. The spider algorithm includes rules that detect artificial marketing methods in web page content, used by rogue SEO practitioners in the hope of fooling the algorithm into treating the web page as highly relevant and returning it as a top-ranking result. Such artificial SEO techniques, or "Black Hat" search marketing methods, include keyword stuffing and spamdexing.
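
As a rough illustration of a keyword frequency check, the sketch below computes how large a share of a page's words a given keyword takes up and flags pages where that share looks unnaturally high. The 0.3 threshold and the names keyword_density and looks_stuffed are invented for the example; actual search engines do not publish their rules, and those rules are certainly more sophisticated than this.

```python
import re
from collections import Counter


def keyword_density(text, keyword):
    """Fraction of the words in `text` that are exactly `keyword`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


def looks_stuffed(text, keyword, threshold=0.3):
    """Flag text whose keyword density exceeds a hypothetical threshold."""
    return keyword_density(text, keyword) > threshold


natural = "Our shop sells handmade leather shoes and repairs old boots"
stuffed = "shoes shoes cheap shoes buy shoes shoes online shoes"

print(looks_stuffed(natural, "shoes"), looks_stuffed(stuffed, "shoes"))
```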

The algorithm then analyses the way in which website pages link to each other, internally and externally. By checking how web pages link to each other, a search engine can determine both what a page is about and what it is relevant to, and can compare the keyword similarity of the linked pages' content against the original page.
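
The link analysis step can be sketched in the same spirit. The example below counts how many other pages link to each page and compares the keywords of a page with those of the pages it links to, using Jaccard similarity (shared words divided by all distinct words) as a stand-in measure. Both the measure and the names keywords, jaccard, linked_similarity and inbound_counts are assumptions for illustration, not a description of any real ranking algorithm.

```python
import re


def keywords(text):
    """Return the set of distinct lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all distinct words across both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def linked_similarity(pages, url):
    """Keyword similarity between `url` and each known page it links to."""
    source = keywords(pages[url]["text"])
    return {
        target: jaccard(source, keywords(pages[target]["text"]))
        for target in pages[url]["links"]
        if target in pages
    }


def inbound_counts(pages):
    """Count how many other known pages link to each page."""
    counts = {url: 0 for url in pages}
    for url, page in pages.items():
        for target in page["links"]:
            if target in counts and target != url:
                counts[target] += 1
    return counts


# Hypothetical indexed pages with their text and outgoing links.
pages = {
    "/shoes": {"text": "leather shoes and boots", "links": ["/boots", "/pasta"]},
    "/boots": {"text": "leather boots", "links": ["/shoes"]},
    "/pasta": {"text": "pasta recipes", "links": []},
}

similarity = linked_similarity(pages, "/shoes")
inbound = inbound_counts(pages)
print(similarity, inbound)
```

In this toy data the shoes page shares half of its vocabulary with the boots page it links to and nothing with the pasta page, which is the kind of signal that could suggest which links are topically related.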