Prevent Googlewashing From Harming You
November 8, 2005
By Brian Livingston

It's bad enough when people copy articles you've written for the Web and post them as though the material were their own. But it's downright awful when the thieves' Web pages get higher rankings in search engines than your own pages do for the same search terms.

The fact that people who steal your work often outrank you is a problem called "Googlewashing," as I wrote last week. The problem is getting so bad that your Web page may slip out of Google's Top 10 for particular search terms even if your friends innocently quote an excerpt of your work and legitimately link to you as the source.

Today, I'll explain how you can keep people from copying your writing wholesale -- and how to keep search engines from penalizing you when your work is legally excerpted elsewhere.

Do As I Say, Not As I Link

Many companies unknowingly help other sites rank higher on specific searches than their own pages do.

I reported last week that some specialists in the field of "search-engine optimization" have found ways to steal your content and then artificially raise their rankings to PageRank 10, Google's highest score. I'll describe in my next column how to combat that trick. In today's piece, I focus solely on ordinary search-engine results, where no fakery is involved in the scoring.

Here's an overview of the situation your Web site may be suffering from:

You publish articles or blogs. To attract traffic, you may invest considerable time and money in developing fresh and interesting content. Unfortunately, almost any Web page is easy for ethically challenged people to copy and post as if it were their own. If your content is distributed via RSS (Really Simple Syndication), the process is trivial to automate.

People comment on your work. Apart from outright theft, it's very common for other sites to excerpt a paragraph or two from your work. This is considered "fair use" under most countries' copyright laws. It's perfectly acceptable, assuming the commenters include a link to your original article.

You link back or "trackback" to the commenters. Some Web sites are programmed to automatically include a fragment of any comments on the original article that may be found at other Web sites. These cross references are called "trackbacks" and usually include links to those sites. Unfortunately, by linking to other sites' excerpts of your work, you may be contributing extra "points" to those sites in search engines, pushing their rankings above your original article's.

Trackbacks are admirable as courtesies. But you don't necessarily need to help those sites rank above yours in searches on your particular topic.

To Follow Or To Nofollow, That Is The Question

In an interview, Andy Edmonds, a relevance data analyst for MSN Search, suggested that companies posting trackbacks and other related-comment links use a feature called nofollow.

To use this trick, you add the attribute rel="nofollow" when linking to sites that comment on your content. Search engines largely consider such links to be "unapproved" by your site. These links, therefore, don't lend additional weight to the pages you're linking to. Google, Yahoo, MSN Search, and other indexes started recognizing the nofollow attribute in January 2005.
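As a minimal sketch of what such a link looks like in practice, the Python snippet below builds an anchor tag and adds rel="nofollow" unless the link is marked as trusted. The function name build_link, the trusted flag, and the example URLs are illustrative assumptions, not part of any particular blogging tool.

```python
from html import escape


def build_link(url: str, text: str, trusted: bool = False) -> str:
    """Return an HTML anchor; links not marked trusted carry rel="nofollow"."""
    rel = "" if trusted else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


if __name__ == "__main__":
    # A trackback link to a site that excerpted your article (hypothetical URL).
    print(build_link("http://example.com/their-comment", "Their comment"))
    # A link you do want to endorse, so nofollow is omitted.
    print(build_link("http://example.com/partner", "Partner site", trusted=True))
```

Keeping the decision in a single flag makes it easy to turn the attribute on or off for individual links.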

Adding nofollow to every link in a site's comments section has become fairly easy to do without the need for you to hand-code it. Methods to add the attribute to comments automatically (or omit it) are already available for most major blogging tools, such as Movable Type, Blogger, and WordPress.
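If your publishing system has no such option, the same effect can be approximated with a small filter. The sketch below is an assumption about how one might do it with Python's standard html.parser module, not the method any of the tools named above actually use. It re-emits an HTML fragment and forces rel="nofollow" onto every anchor tag. The class and function names are made up for illustration, and HTML comments and declarations are simply dropped.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit an HTML fragment, forcing rel="nofollow" on every <a> tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # Pass other tags through exactly as they were written.
            self.parts.append(self.get_starttag_text())
            return
        # Drop any existing rel value, then append our own.
        kept = [(k, v) for k, v in attrs if k != "rel"]
        kept.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        self.parts.append(f"<a{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> pass through unchanged.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # The parser decoded character references, so re-escape the text.
        self.parts.append(escape(data, quote=False))


def add_nofollow(fragment: str) -> str:
    """Return the fragment with rel="nofollow" on all of its links."""
    parser = NofollowRewriter()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


if __name__ == "__main__":
    comment = '<p>See <a href="http://example.com/x">this reply</a>.</p>'
    print(add_nofollow(comment))
```

A filter like this would run over the comment or trackback section only, leaving links in your own article text untouched.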

Using this feature on your site's trackbacks (as well as your comment area) is a logical extension. To be sure, there are heartfelt opinions against using nofollow, as expressed at sites such as IO Error. The strongest criticism of the attribute is that it hasn't stopped comment spam, the problem that nofollow supposedly was invented to combat.

But nofollow does seem to me to be a useful way to help search engines recognize content that wasn't originated by your site. It's also a feature you can turn on or off for individual links, with some extra work, if you wish.

I Never Meta Tag I Didn't Like

Ideally, search engines would automatically recognize a new article posted by you as the "original article." Your work would then rank higher than Web pages that were merely references to your original article.

MSN's Edmonds suggests a way Web sites can help search engines determine which page was the first to carry some particular content. This is to use a "meta tag" containing the date an article was published. If several Web pages contained links to each other, the page with the earliest date would be considered by search engines to be the "original content." It should, therefore, rank higher on a given search that also matches the follow-up pages.
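The earliest-date rule can be sketched in a few lines of Python. This is purely hypothetical: nothing here reflects how any search engine actually works, and the function name and URLs are invented for the example. Given a group of pages that link to each other, each with a self-reported publication date, it picks the page with the earliest date as the presumed original.

```python
from datetime import date


def pick_original(pages: dict[str, date]) -> str:
    """Return the URL whose self-reported publication date is earliest."""
    if not pages:
        raise ValueError("no pages to compare")
    return min(pages, key=lambda url: pages[url])


if __name__ == "__main__":
    # Hypothetical cluster of pages that reference one another.
    cluster = {
        "http://example.com/original-article": date(2005, 11, 1),
        "http://example.org/excerpt-and-comment": date(2005, 11, 3),
        "http://example.net/another-excerpt": date(2005, 11, 8),
    }
    print(pick_original(cluster))
```

Because the dates are supplied by the pages themselves, the rule works only to the extent that publishers report them honestly.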

One effort to popularize tags of this kind is the Date element of the Dublin Core, a group that proposes standards for meta data. Unfortunately, only a very few meta tags are noticed by most search engines, and date isn't one of them, according to Search Engine Watch.

Although date meta tags aren't widely supported today, search engines may recognize them someday, so it wouldn't hurt to use them. "That's a community practice that would really help with this problem," Edmonds says.
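For sites that want to try it, the sketch below generates a pair of tags for a page's head section. It assumes the usual Dublin Core convention for HTML: a link element that declares the DC schema, plus a meta element named DC.date that holds an ISO-formatted date. The function name is invented for the example, and whether any search engine reads the result is a separate question.

```python
from datetime import date
from html import escape


def dublin_core_date_tags(published: date) -> str:
    """Return <link> and <meta> lines declaring a Dublin Core date."""
    iso = published.isoformat()  # ISO 8601 form, e.g. 2005-11-08
    return "\n".join([
        '<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">',
        f'<meta name="DC.date" content="{escape(iso, quote=True)}">',
    ])


if __name__ == "__main__":
    print(dublin_core_date_tags(date(2005, 11, 8)))
```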

Stopping Outright Rip-Offs Of Your Content

Perhaps your biggest problem isn't legitimate excerpts outranking your stuff but dishonest people copying your material verbatim and posting it as their own. There's a big incentive for this these days. Now that text ads from Google and other services pay Web sites for each individual click, many promoters try to build as many pages as possible -- using whatever articles they find -- merely to attract visitors who might click advertising links.

One way to find sites that are blatantly copying from you is to subscribe to a service such as Copyscape's Copysentry. For $9.95 to $19.95 per month (or free for small, manual searches), this service reports to you every week or every day on Web pages that contain sentences matching those on 10 or 20 pages you specify. Specifying additional pages costs $0.25 to $1.00 per month.
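To give a rough idea of what sentence matching involves, here is a small Python sketch. It is not Copyscape's algorithm, which the service does not describe here. It is only an assumed, simplified approach that normalizes two texts into sentences and reports the ones that appear verbatim in both. The function names and the five-word minimum are arbitrary choices for the example.

```python
import re


def split_sentences(text: str) -> set[str]:
    """Split text into normalized sentences (lowercase, single spaces)."""
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    normalized = (re.sub(r"\s+", " ", p).strip().lower() for p in pieces)
    # Ignore very short sentences, which match too many unrelated pages.
    return {s for s in normalized if len(s.split()) >= 5}


def shared_sentences(original: str, candidate: str) -> set[str]:
    """Return the sentences that appear verbatim in both texts."""
    return split_sentences(original) & split_sentences(candidate)


if __name__ == "__main__":
    mine = "Short one. Copycats often rank above the original author."
    theirs = "Welcome to my page about search. Copycats often rank above the original author."
    print(shared_sentences(mine, theirs))
```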

If you find copycat sites, it may be useless to complain to the offender directly -- if you can even find a way to contact him or her. But you might get results by complaining to the copyist's ad network or Web hosting provider.

Copysentry isn't a panacea. Although it was created by some of the developers of the highly regarded Google Alert service, it doesn't seem to find every instance of duplicate content on the Web. In fact, some reviewers, such as David Mattison of The Ten Thousand Year Blog, report that using plain old Google reveals more duplication of your content than using Copysentry does.

If you decide to use Google or any other search engine to look for unauthorized duplication of your content, follow a few simple rules:

Wait 30 days, since Google and many other search engines update their master index files only once a month. (Some sites are indexed far more often than this, but your site and the copycat sites you're looking for may not be.)

Look for copies of your last paragraph because many sites legitimately reprint the first one or two paragraphs of your content in the course of linking to your site.

Make up little-used phrases at the end of your articles to help you zoom in on copycats. Common words and phrases will appear on so many Web pages that copies of your particular work will be hard to find.
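The three rules above can be sketched as a small Python helper. It assumes paragraphs are separated by blank lines and that wrapping a phrase in double quotes asks the search engine for an exact match. The function names and the eight-word default are illustrative choices, not recommendations from any search engine.

```python
from datetime import date, timedelta


def ready_to_search(published: date, today: date, wait_days: int = 30) -> bool:
    """True once the waiting period after publication has passed."""
    return today - published >= timedelta(days=wait_days)


def last_paragraph(article: str) -> str:
    """Return the final non-empty paragraph of a blank-line-separated text."""
    paragraphs = [p.strip() for p in article.split("\n\n") if p.strip()]
    if not paragraphs:
        raise ValueError("article has no text")
    return paragraphs[-1]


def exact_phrase_query(article: str, max_words: int = 8) -> str:
    """Build a quoted query from the closing words of the last paragraph."""
    words = last_paragraph(article).replace('"', "").split()
    return '"' + " ".join(words[-max_words:]) + '"'


if __name__ == "__main__":
    article = (
        "Opening paragraph that other sites may legitimately quote.\n\n"
        "People who copy entire articles are being impermissibly impertinent."
    )
    if ready_to_search(date(2005, 11, 8), date(2005, 12, 8)):
        print(exact_phrase_query(article))
```

The more unusual the closing phrase, the fewer unrelated pages the resulting query will return.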

Remember, search services such as Google and Copysentry are no substitute for serious digital rights management, if DRM technology (which is beyond the scope of this column) is really the level of protection your content needs.

Conclusion

Unauthorized copying will never completely go away, but you should at least be able to glean some satisfaction by catching the worst offenders. In my opinion, people who copy entire articles without permission are being impermissibly impertinent. (Copy that, suckas.)

Brian Livingston is the editor of WindowsSecrets.com and the coauthor of "Windows Me Secrets" and nine other books.

