November 3, 1997
Site offers all you need to know about Web search engines
If you deal with Web sites in your company -- and what company isn't designing Web sites of some kind these days? -- you may want to know how people are finding your site. Even if you don't maintain a Web site, you may want to know some tricks for finding sites you're interested in.
In either case, you would benefit from reading Danny Sullivan's Search Engine Watch site. The site (http://www.searchenginewatch.com) is packed with information on making your Web site as visible as possible to the world and on locating obscure Web sites.
Search Engine Watch currently follows seven major Internet indexes. Roughly in order from those that index the greatest number of pages to those that handle the fewest, the seven are: AltaVista, Excite, HotBot, Northern Light, Infoseek, Lycos, and WebCrawler. I don't have room to list the URLs of all these sites, but they're available on Sullivan's site.
Sullivan reports that AltaVista indexes 100 million Web pages, but he advises that numbers like these should be taken with a grain of salt. Depending on how you count, he says, Excite and HotBot are probably larger. Cyberspace is hard to measure.
A more important criterion than size is how fresh the information is. If a search engine has a lot of information but it's months old, it's possible your site will be missing, or the site you're looking for won't show up.
Because no search engine can store or search the entire contents of the Web, an index on a particular topic may contain information that has not been re-checked for three months. For Web authors, this underlines the importance of submitting fresh Web pages to search engines on your own initiative, a process Sullivan describes in detail. And although search engines claim to index secondary pages that link to a home page, Sullivan notes that this may take from one week to three months, depending on the index.
Casual users of search engines are more interested in how often indexes update the pages they've searched. Sullivan says HotBot, Lycos, and Northern Light, in that order, update their indexes every two weeks or so. The other engines, because of their sheer size or other factors, can be three months out of date on many Web pages.
To make their Web searches more efficient, several of the indexes have learned to track how often sites change. Sites with a low turnover can then be skipped more often during indexing. AltaVista, HotBot, Infoseek, and Northern Light do this.
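To make the idea concrete, here is a minimal sketch in Python of how an index might stretch or shrink the gap between visits to a page. It is not how any of the engines named above actually work; the function name, the halve-or-double rule, and the 14-day and 90-day limits (borrowed from the two-week and three-month figures Sullivan cites) are all assumptions made for illustration.

```python
def next_revisit_days(current_days, page_changed, min_days=14, max_days=90):
    """Return how many days to wait before checking a page again.

    Hypothetical rule: halve the wait if the page changed since the
    last visit, double it if the page was unchanged, and keep the
    result between min_days and max_days.
    """
    if page_changed:
        proposed = current_days // 2   # busy page: come back sooner
    else:
        proposed = current_days * 2    # quiet page: skip it more often
    return max(min_days, min(max_days, proposed))


# A page checked every 28 days that has not changed gets a longer wait.
print(next_revisit_days(28, page_changed=False))
```

Under this rule, a page that rarely changes drifts toward the three-month ceiling, while a page that changes on every visit settles at the two-week floor.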
Another trick is to rate sites more highly if they have a high link frequency. Indexes count how many times a Web page is linked to from other sites. Search Engine Watch explains how to determine this count for your site using each of the major search engines.
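The counting itself is simple to sketch. The Python below assumes you already have a list of (linking page, linked page) pairs, which is the hard part a real index has to gather by crawling. The function name and the example.com-style addresses are hypothetical, and skipping a site's links to itself is my assumption about what "from other sites" means, not a description of any engine's actual method.

```python
from collections import Counter
from urllib.parse import urlparse


def inbound_link_counts(links):
    """Count, for each target page, the links that come from other hosts.

    `links` is an iterable of (source_url, target_url) pairs.
    """
    counts = Counter()
    for source, target in links:
        if urlparse(source).netloc == urlparse(target).netloc:
            continue  # a site linking to itself does not count
        counts[target] += 1
    return counts


# Hypothetical link data for illustration only.
links = [
    ("http://a.example/x", "http://b.example/"),
    ("http://c.example/", "http://b.example/"),
    ("http://b.example/about", "http://b.example/"),  # same host, ignored
    ("http://a.example/x", "http://c.example/"),
]
for page, count in inbound_link_counts(links).most_common():
    print(page, count)
```

Sorting by the count, as `most_common()` does here, puts the most linked-to pages first, which is the ordering a link-frequency boost would favor.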
One of the most fascinating services Sullivan's site offers is a screenful of links to search-engine pages that show the Boolean searches Internet users are typing in at that very minute. Reportedly, there is no way to trace a particular search to an individual, so there's no privacy concern. The parade of oxymorons that people search for -- "vital statistics," "free trade," and "advanced civilization" -- is hilarious to watch.
Search Engine Watch is free, but Sullivan offers an annual subscription for $25, which entitles the buyer to a deeper level of research. Send a check payable to Danny Sullivan, 2836 Judah St., San Francisco, CA 94122.
Brian Livingston is the co-author of several best-selling Windows books, including the most recent Windows 95 Secrets (IDG Books). Send comments to brian_livingston@infoworld.com. Unfortunately, he cannot answer individual questions.
Copyright © 1997 InfoWorld Publishing Company