How To Audit and Remove Bad Links – Part 1: Identifying Bad Links
July 16th, 2013 by Serps Invaders
This started off planned as a single blog post but quickly spiralled into a 6,000+ word manual for identifying, removing and disavowing bad links. We’ve split it up into 3 parts to make it more digestible and we’ll publish them over the next few weeks. Today we’ll look at the ‘why’ of link removal and how to find and identify bad links.
Link removal. Link detoxing. Some people hoped that this would be a temporary madness and normal link-building business would resume momentarily. Unfortunately that doesn’t look like it’s going to be the case. Google are continuing to refine and update the Penguin algorithm, that particular part of the algorithm which specifically deals with manipulative behaviour including bad links. Link warning messages from Google Webmaster Tools have only increased since they started. Even big brands have suffered humiliating penalties as a result of manipulative link practices.
Clearly the Ostrich Gambit isn’t going to work and more and more webmasters are coming round to the realisation that they’re going to have to pull their head out of the sand and actually start dealing with the problem rather than ignoring it.
But where to start and how to proceed? There’s clearly a lot of confusion about what sort of links should be removed and how removal requests should be made, and there are some startling examples of people demanding the removal of perfectly good links or loading their requests with aggressive legal threats which only do more harm to their brand. Crashing into link removals without a rational, well thought-out strategy is likely to do just as much harm as leaving the bad links where they are. What follows is an outline for a rational approach to link removal.
Assessing Possible Damage
It’s a good idea to review your backlink profile on a regular basis as a matter of course to head off any possible damage before it occurs. If you haven’t been doing this, however, and if your site has been around for a while then it’s worth doing an initial disaster check to see if it has already incurred any possible damage.
Webmaster Tools Warnings
The first thing to check here is for any warning messages from Google Webmaster Tools. Specifically you want to look for an ‘unnatural link’ warning which would indicate that they’ve discovered suspicious links pointing to your domain and that action may be taken against it if it hasn’t already. If you have received an unnatural link warning then you definitely want to take action.
Secondly you should look at your rankings (if you are monitoring them) and your organic search traffic. There are two things to look for here. The first is a large drop in rankings across all the keywords you are tracking, or a sudden and large drop in organic search referrals. Either sign would indicate that your site may have received a site-wide penalty which has drastically affected your organic search traffic. Less severe, but of equal concern, are isolated drops in rankings or traffic for specific keywords. This could indicate a keyword-specific penalty, which suggests that you have too many links to your site using the same anchor text (usually a high-value keyword).
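As a rough illustration of the two patterns described above, here is a minimal Python sketch which compares two ranking snapshots and labels the result as site-wide or keyword-specific. The function name, the ten-position drop threshold and the 80% share used to call something "site-wide" are all our own assumptions for the example, not figures from Google or from any ranking tool.

```python
def classify_drops(before, after, drop_threshold=10, sitewide_share=0.8):
    """Compare two ranking snapshots (keyword -> position, 1 is best).

    drop_threshold and sitewide_share are illustrative assumptions:
    tune them to your own data.
    """
    tracked = [kw for kw in before if kw in after]
    if not tracked:
        return "no data", []
    # A keyword counts as dropped if it lost at least drop_threshold positions.
    dropped = sorted(kw for kw in tracked
                     if after[kw] - before[kw] >= drop_threshold)
    if len(dropped) / len(tracked) >= sitewide_share:
        return "site-wide", dropped
    if dropped:
        return "keyword-specific", dropped
    return "none", []
```

The label is only a prompt for further investigation. It says nothing about the cause of the drop.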
Diagnosing Root Causes
Bear in mind that both of the ranking/traffic volume issues could be a result of on-page changes. For example, if your site-wide rankings and traffic disappear overnight, it could be a result of technical changes to the website preventing search engines from being able to index the site or the analytics code being removed. Alternatively, it could be a result of a non-link related algorithm update such as Panda. Any issues with organic traffic volumes (or noticeable drops in traffic from any sources for that matter) should be checked to see if they coincide with changes made to the website and known algorithm updates to help diagnose possible causes.
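One way to make this check systematic is to keep a dated log of site changes and known algorithm updates and look up anything which falls close to the date of the drop. The sketch below assumes a seven-day window, which is an arbitrary choice on our part, and the event entries in the usage note are examples only: check any dates you use against a changelog you trust.

```python
from datetime import date, timedelta

def nearby_events(drop_date, events, window_days=7):
    """Return labels of events within window_days of the traffic drop.

    events is a list of (date, label) pairs which you maintain yourself.
    window_days=7 is an illustrative assumption.
    """
    window = timedelta(days=window_days)
    return [label for when, label in events
            if abs(drop_date - when) <= window]
```

For example, with a log such as `[(date(2013, 5, 22), "Penguin 2.0"), (date(2013, 6, 3), "site redesign launched")]` (the redesign entry is hypothetical), a drop on `date(2013, 5, 25)` would be matched to the Penguin entry only.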
Gathering Your Links
The next step is to collect and analyse your backlinks. Whether or not you have identified likely signs of a link-related penalty, this is worth doing on a regular basis or if you’re starting out with a new site. Just because the site hasn’t received a penalty yet doesn’t mean there isn’t something lying dormant in the link profile which could cause you problems at a later date. Bear in mind at this stage that, if your site has a lot of backlinks, it’s unlikely you’ll find them all. Links could be too deep for some tools to discover, or exist on networks which a given tool hasn’t yet crawled. The important thing at this stage is to be thorough. Using only one source to find your backlinks could result in missing batches of backlinks, so it’s best to use a variety of link discovery tools to find as many of your site’s backlinks as possible.
Link Discovery Tools
There are several tools on the market which will provide you with backlink data on your own site or other people’s websites. They are provided by companies who have essentially built their own web crawler that behaves in a similar way to search engine web crawlers. Starting from seed sites, they will crawl all the links they find on the web to create an index of domains and URLs and where they link to. Because they all have far more limited resources than the big search engines, there is a limit to how much data they can collect. That said, these tools can be very effective in discovering most of your site’s backlinks. For a thorough job, we would recommend using more than one, in order to get as complete a picture as possible. Some of the best backlink discovery tools include:
Other tools, such as Link Research Tools, don’t actually crawl the web themselves but aggregate link data from other sources. In the case of LRT, it will also remove duplicates from the different sources and check if the links are still active as part of the link report service.
Both Google Webmaster Tools and Bing Webmaster Tools can also give you valuable information on your website’s referring backlinks. Don’t forget to include this data as part of your process. It’s easy to overlook but it is invaluable backlink data (even though it won’t be the complete picture) and should certainly be considered a priority if you’re concerned about cleaning up links to keep the search engines happy. This data should be downloaded from both services and added to your backlink data from the backlink discovery tools.
Historic Link Development Reports
Don’t forget to include any existing link development reports you have on hand. Whether you’ve been building links yourself or another agency has been doing this work for you, if there are records of the links built then retrieve them and include them in the analysis as well.
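Combining all of these sources will leave you with a lot of duplicates. A minimal Python sketch of the merge step is shown below, assuming each source has already been reduced to a plain list of linking-page URLs which include the http:// or https:// scheme. The normalisation rules (ignoring the scheme, a leading "www." and a trailing slash) are our own simplifying assumptions and may be too aggressive for some sites.

```python
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to host + path (+ query) so near-duplicates match."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query

def merge_sources(sources):
    """sources maps a source name to an iterable of linking-page URLs.

    Returns a dict of normalised URL -> sorted list of the sources
    which reported it, so you can see which tool found which link.
    """
    merged = {}
    for name, urls in sources.items():
        for url in urls:
            if url.strip():
                merged.setdefault(normalise(url), set()).add(name)
    return {url: sorted(names) for url, names in merged.items()}
```

Keeping track of which source reported each link is a design choice rather than a necessity, but it is handy later for seeing which links only the search engines’ own reports know about.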
Analysing the Links
Using Automated Link Detox Tools
Now that you have all of the backlinks you can discover for your site, you need to start analysing them to determine how healthy they are and whether any are potentially problematic. A good place to start with this is the Link Detox tool from Link Research Tools.
It uses over 93 different ‘rules’ to automatically classify links as Healthy, Suspicious or Toxic. Essentially it looks at a wide variety of potential ‘tells’ which a machine can automatically identify and which might indicate spammy, low quality or manipulative links. These include sitewide links, links from websites which have been deindexed (indicating that the site has possibly been penalised), links from websites or pages which have low volumes of inbound links etc.
The Link Detox tool has three different modes including one which will automatically check for backlinks to your site and provide data on them and another which allows you to upload links for it to check manually. The former mode can be useful for running quick health checks on domains. However, as we’ve gathered backlinks already from multiple sources, we would want to use the upload option to import our backlinks into Link Detox and run an analysis on them to get a more complete picture.
Manually Review Your Links
Whilst LRT’s method of classifying links is a useful starting point, it is prone to creating false positives and we would not recommend blindly acting on its recommendations. Manual review of the links is important and will usually uncover links which have been classified incorrectly – both links classified as Toxic or Suspicious which are actually OK and links classified as Healthy which might actually be problematic.
If you have a particularly large number of backlinks to sift through, it may be tempting to simply go with the automated recommendations of LRT or similar link detox tools. However, we would always strongly recommend against that and point out that it is often automation and corner cutting which leads to bad links in the first place. Link detox should be part of a process of turning over a new leaf and that should include putting away automated processes and bad practices in favour of doing the job properly.
That said, manually reviewing huge amounts of links is a daunting task and approaching it systematically is entirely sensible. Experienced SEOs will be able to spot a lot of the bad links simply by looking at things like the domain name the link comes from and the anchor text used to quickly categorise links for removal. Here are just some of the things to look for and a few shortcuts for quickly identifying batches of links to mark for removal.
Manipulative Anchor Text
No matter what you or your SEO think, it just isn’t that normal for people to link to your site with the anchor text ‘Free Poker Credit’ or any other high-value search term. You should filter your backlinks for anything that looks like deliberately manipulative anchor text and check through those links, because odds are they weren’t ‘organically acquired’ and they are the most likely place to discover spammy link patterns.
If you have noticed indications of penalties for specific keywords, then pay special attention to any links which use these keywords or variations of them, or which link to the page that used to rank for the term, as these are probably the links you will want to prioritise for removal and disavowal.
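The anchor text filter can be done in a spreadsheet, but as a minimal sketch, here is one way it might look in Python. It assumes your merged link list is available as (linking URL, anchor text) pairs and that you supply your own lists of high-value terms and brand names. The function name and the simple substring matching are illustrative assumptions.

```python
from collections import Counter

def anchor_report(links, money_terms, brand_terms=()):
    """List non-brand anchors containing a high-value term.

    links is an iterable of (source_url, anchor_text) pairs.
    Returns (anchor, count, share of all links), most common first.
    """
    counts = Counter(anchor.strip().lower() for _, anchor in links)
    total = sum(counts.values())
    report = []
    for anchor, n in counts.most_common():
        is_brand = any(b.lower() in anchor for b in brand_terms)
        is_money = any(m.lower() in anchor for m in money_terms)
        if is_money and not is_brand:
            report.append((anchor, n, round(n / total, 2)))
    return report
```

Anchors which make up a large share of the profile are the ones to look at first. Everything the function returns is a candidate for manual review, not an automatic removal.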
A link from a directory isn’t automatically a bad thing. There are some good directories on the web, even though their use as a way for people to find sites has greatly diminished since search engines became reliable. However, some quick tells to identify directories which you probably don’t want links from include:
- Any directory with SEO in the domain name (unless it’s a directory of SEO websites) and most directories with words like ‘link’ in the domain name.
- Any directory with no topical focus – ie directories which have categories for every possible type of website.
- Any directory which includes a graphical representation of a site’s Toolbar PageRank next to its listing.
- Any directory which uses little or no editorial discretion for sites it lists.
- Any directory which has multiple identical clones on different domains.
- Any directory which isn’t indexed by Google.
- Any directory with a spammy backlink profile.
OK, that list excludes almost every single directory on the web, including some of the ‘good’ ones. But to be fair, there are only a few good directories worth being included on. Exceptions to these rules would be some of the top general directories (like DMOZ), local directories which you can only get listed on with a genuine business address and niche directories which only focus on one particular topic (eg blog directories or a specific industry directory). Even in these cases you should still manually review them for quality. For example, there are plenty of junk Injury Lawyer directories which we would still classify as spam even though they are technically focused on a niche industry topic.
Again, forum profiles are not automatically a bad thing. If someone who works for the site has included a link back to the site in a profile on a forum which they regularly use, that’s perfectly acceptable. What to look for here are patterns of links from forums where accounts have been created simply to add a link to the account profile with no intention of participating in the forum. These are very often created using automated software which exploits known vulnerabilities in open source forum software, making it possible to automatically create profiles on large numbers of forums. As it’s automated, there will also be thousands of other spam profiles on the same site, making these links an easy target for search engines.
Social bookmarks can be a great indicator that people like your website and are sharing it with their peers. They can also be worthless spam, created by automated tools. If your site has earned links from sites like Reddit, great. But as with forums, there are a lot of zombie ‘social bookmarking’ sites around, built on open source software which is prone to automated spamming. Once you’ve found a few of these you will usually see a pattern emerging, such as the same keywords or target URL being used consistently (due to the automated nature of the links) or the same URL structure for all the sites (due to them all being run on the same software), which makes it easy to filter your list of links to quickly identify all of the spam bookmarks.
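The domain name tells from the list above are the easiest ones to automate. The Python sketch below flags any linking domain containing ‘seo’ or ‘link’, with an allowlist for directories you have already reviewed and trust. The word list and the function name are assumptions for illustration, and a plain substring match will also catch innocent domains, so treat the output as a shortlist for manual review rather than a verdict.

```python
from urllib.parse import urlsplit

# Words taken from the tells listed above; extend with your own.
SUSPECT_WORDS = ("seo", "link")

def flag_directory_domains(urls, allowlist=()):
    """Return linking domains whose name contains a suspect word.

    allowlist holds domains you have manually reviewed and trust.
    Substring matching is crude: 'seo' also matches 'seoul'.
    """
    flagged = set()
    for url in urls:
        host = urlsplit(url).netloc.lower()
        if host and host not in allowlist:
            if any(word in host for word in SUSPECT_WORDS):
                flagged.add(host)
    return sorted(flagged)
```

The other tells in the list (editorial discretion, clones, indexation, backlink profile) need data this sketch does not have, which is part of why the manual review still matters.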
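To illustrate the ‘same URL structure’ pattern, here is a small Python sketch which reduces each linking URL to a rough shape and reports shapes shared by several different domains. The shape rules (numbers become {n}, a hyphenated final segment becomes {slug}) and the three-domain minimum are our own assumptions and are deliberately simple.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def path_shape(url):
    """Reduce a URL path to a rough template, e.g. /story/{n}/{slug}."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    shaped = []
    for i, seg in enumerate(segments):
        if seg.isdigit():
            shaped.append("{n}")
        elif i == len(segments) - 1 and "-" in seg:
            shaped.append("{slug}")
        else:
            shaped.append(seg.lower())
    return "/" + "/".join(shaped)

def shared_shapes(urls, min_domains=3):
    """Return path shapes which appear on at least min_domains domains."""
    domains = defaultdict(set)
    for url in urls:
        domains[path_shape(url)].add(urlsplit(url).netloc.lower())
    return {shape: sorted(hosts) for shape, hosts in domains.items()
            if len(hosts) >= min_domains}
```

A shape shared by many unrelated domains suggests they run the same software, which is a good reason to open a few of them and see whether they are the zombie sites described above.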
User Generated Content
UGC is a pretty broad term and can of course include social bookmarks and forum profiles. In addition to these you should also keep an eye open for links in forum comments and links on free blogging platforms such as Blogspot or Squidoo. Again, determining whether these are spam or not really requires manual assessment.
A spam comment on a forum will usually take the form of an off-topic comment or a new thread which starts off with a post with a link to your site in it. Often the comment will contain multiple links, possibly with unnatural use of anchor text.
Free blogging platforms can be used in multiple ways. Sometimes it’s just someone running a legitimate blog on a free platform (in which case you may want to look into the paid posts section below). In other cases it can be a blog set up just to link to your site. Often, these are used as part of a ‘link wheel’, where multiple blog posts are set up on different platforms and link to each other as well as your own site.
Reciprocal Link Schemes
Reciprocal link schemes have been around for a long time and you’ll often see links from these to smaller business sites. Once upon a time, reciprocal links were a fairly legitimate way for businesses to recommend or provide support to other businesses they considered useful. Of course, that purpose quickly became corrupted and now they are almost exclusively low-value ways of building lots of links. Usually these take the form of websites having a ‘links’ page of some sort, and an easy tell that they are linking purely to gain more reciprocal links is if they are linking to other sites completely unrelated to their own, or linking to hundreds of sites.
There are, of course, automated services which run massive reciprocal link schemes. When you join, you add a page to your site which the owners of the link scheme control and add links to automatically. Your site automatically gets linked to from the links page of everybody else involved in the scheme and likewise your links page automatically links out to everybody else who has signed up for the scheme, with new links being added as new members join. Because it’s automated, it is again very easy to spot. The links page will usually have a consistent template and it will usually link to the same websites on each links page. Identify the pattern and add these to your link removal list.
Paid links can take a wide variety of formats. Some of the most common include:
- Links in the sidebar of blogs or websites
- Links in paid blog posts, sponsored content etc. on legitimate blogs, news sites etc.
- Links inserted into existing content
- Adverts which have links which aren’t passed through redirects and aren’t nofollowed.
- Articles on paid blog networks.
These can be more difficult to spot and assess as they are the ones most likely to have had some effort made to make them look natural.
- Sidebar links are probably the easiest to spot because you will have hundreds or thousands of links from a single domain (depending on just how many pages that site has).
- Some paid content will actually be marked as sponsored/paid content etc. depending on how legitimate the site is and how much they are trying to stick to the letter of the law.
- In most cases they will use some form of high-value anchor text as the link.
- There may be no logical reason for the site to have linked to the content on your site other than monetary compensation.
- Paid blog networks typically have very poor quality content (eg bad English, signs of spun content), will not have any consistent topical relevance and often reuse the same content on multiple sites. You can also spot patterns in how they link out – a company will typically buy articles on these sites in bulk so they will all link to the same companies at the same time, making patterns easy to spot if you look at multiple posts on each site. Also, their whois data is almost always anonymous.
If you can’t find a link which has been identified in a backlink report it may be because that link no longer exists. It could also be an indication that it’s a hidden link, so if you have any problems spotting a link, it is worth checking the source code of the page and doing a search for your domain name. This can often turn up links which have a negative position set in CSS or which are hidden in some other way. Be aware that these links may have been placed without the site owner being aware of it, either through external hacking or by someone with admin access, and as such could have been acquired through illegal means. If you come across these, we would advise treading especially carefully as it may come as a shock to the site owner that they exist!
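The sidebar tell in particular is easy to check from your merged link list: count how many linking pages each domain contributes and look at the domains at the top. A minimal Python sketch follows. The threshold of 100 links is an arbitrary assumption, and a high count only tells you where to look, since some legitimate sites will also link from many pages.

```python
from collections import Counter
from urllib.parse import urlsplit

def links_per_domain(urls, threshold=100):
    """Return (domain, link count) for domains at or above threshold.

    threshold=100 is an illustrative assumption, not a known cut-off.
    """
    counts = Counter(urlsplit(url).netloc.lower() for url in urls)
    return [(domain, n) for domain, n in counts.most_common()
            if n >= threshold]
```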
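The source code search described above can be sketched in Python using only the standard library. The example below finds links to your domain in a page’s HTML and notes whether the link’s own inline style contains an obvious hiding hint. The regular expression is our own guess at a few common patterns. Styles applied by a parent element or an external stylesheet are not checked, so a link reported as not hidden may still be invisible on the page.

```python
import re
from html.parser import HTMLParser

# A few common inline-style hints; an assumption, far from exhaustive.
HIDING_HINTS = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden"
    r"|(left|top|text-indent)\s*:\s*-\d",
    re.IGNORECASE,
)

class LinkFinder(HTMLParser):
    """Collect links to a domain and whether they look hidden."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.found = []  # list of (href, looks_hidden)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        if self.domain in href.lower():
            style = attr.get("style") or ""
            self.found.append((href, bool(HIDING_HINTS.search(style))))

def find_links(html, domain):
    """Return (href, looks_hidden) for every link to domain in html."""
    finder = LinkFinder(domain)
    finder.feed(html)
    return finder.found
```

If a backlink report lists a page but this finds no link at all, the link has probably been removed already, or it is being added by JavaScript, which this sketch does not execute.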
Don’t Throw Out The Baby
This might seem obvious, but since link removal became a thing we’ve seen a lot of agencies seemingly sending out blanket link removal requests to every single site that links to them, or using ineffective methods of filtering out the sites to target. Receiving a link penalty does not mean you need to remove all your links. It means you need to remove any links that may be deemed manipulative or not editorially given, especially from low quality or untrustworthy sites. If you remove all your good links as well then you’re going to look silly, potentially insult some great contacts who used to think your site was worth linking to and do even more harm to your site’s ability to rank. Don’t throw out the baby with the bathwater and be sure that the links you’ve earmarked for removal really are bad links.