Post Penguin 2.0 (4) Link Building – What Links Will Work?

The following post discusses which links might still pass value when the Penguin 2.0 update (or Penguin 4, as some have dubbed it) hits the SERPs.

Google is a system that is still unofficially in beta, albeit one that may properly come of age soon, when it finally tunes its algorithm to show the fairest results. In 10 years’ time we will look back and laugh (or cry) at how a private business could get away with controlling the world’s internet with an incomplete algorithm.

The PageRank aspect of the algorithm is admittedly well intentioned and is based on the offline citations model used by academics and writers.

There is talk in high places about Google applying this model strictly in the near future. In this post I’m going to explore why dialling up the algorithm so that all links must adhere to this model could be a disaster for Google, and which links I think Penguin 2.0 will reward. But first I want to look at how things stand now.

Google left the door open…but it’s closing fast

Many people in the SEO world talk about links being classed as votes from one page to another, but what does that really mean, and does Google really see links like that too?

Let’s look at how easy it was to game Google just a year or so ago (and I’m sure some Black Hatters will say that is still the case now). By taking a piece of content, spinning it into a thousand unique but often less legible versions, and then submitting them to multiple article directories, blog networks and other sources, you could get a page ranked in many verticals. Other requirements included relevant anchor text and a smattering of related keywords in titles and body copy. For many verticals, as long as the site matched up in other areas (too many factors to list), it was job done.

What does that say about the effectiveness of the pre-2012 PageRank algorithm as a voting system? It says that although not all links were equal, any link was counted as a vote regardless of where it came from. Hence it was so easy to game: link volume and keyword-rich anchor text all the way!
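To make the "any link is a vote" point concrete, here is a minimal sketch of the textbook PageRank power iteration in Python. It is not Google's production algorithm. The toy link graph and page names are made up for illustration, and the damping value of 0.85 is simply the figure commonly quoted from the original PageRank paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" equally between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: one honest link versus three spun articles.
links = {
    "honest_site": ["useful_guide"],
    "useful_guide": [],
    "spun_article_1": ["spam_target"],
    "spun_article_2": ["spam_target"],
    "spun_article_3": ["spam_target"],
    "spam_target": [],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page with three spun articles pointing at it ends up with more rank than the guide with one honest link, purely on volume, because nothing in the plain model asks where a vote came from.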

At this point, are you starting to see how irresponsible Google could be considered when it comes to looking after the interests of web sites that play fair?

Fortunately things have been changing, and fast. Since Penguin 1.0 (April 2012), the quality parameters a link must meet to pass value to a page on another site have changed. There have certainly been unnatural link warnings flying about, so Google has demonstrated that it recognises crap links and blog networks. This has (understandably) prompted mass tidying up of link profiles and exoduses from blog networks.

However, here’s an argument that Google might not have link quality nailed quite yet. Many SEOs talk about a percentage of anchor text that triggers a penalty (for example, 60%). If this is true then it shows that even the much-feared, quality-focused Penguin was or is flawed. Why? Because what if I had 54% dodgy anchor text: does that mean my pile of directory links with keyword anchor text is all OK? In some cases you would still get the message of link death, as the 60% isn’t a fixed cut-off point, but it still raises the question.
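The 54% versus 60% problem is easy to show with a few lines of Python. This is only a sketch of the kind of hard threshold check SEOs speculate about, not anything Google has confirmed. The function name, the sample anchors and the 60% constant are all assumptions for illustration.

```python
def anchor_text_share(anchors, money_keywords):
    """Fraction of backlinks whose anchor text exactly matches a target keyword."""
    if not anchors:
        return 0.0
    keywords = {keyword.strip().lower() for keyword in money_keywords}
    hits = sum(1 for anchor in anchors if anchor.strip().lower() in keywords)
    return hits / len(anchors)


# The rumoured trigger point discussed by SEOs; an assumption, not a known value.
RUMOURED_THRESHOLD = 0.60

# Hypothetical link profile of 100 backlinks.
anchors = (
    ["cheap insurance"] * 54
    + ["Example Insurance Ltd"] * 30
    + ["click here"] * 16
)

share = anchor_text_share(anchors, ["cheap insurance"])
flagged = share >= RUMOURED_THRESHOLD
print(f"keyword anchor share: {share:.0%}, flagged: {flagged}")
```

With a fixed cut-off, a profile that is 54% keyword anchors sails through even though the links themselves are no better than those in a profile at 60%, which is exactly why a simple ratio test can't be the whole story.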

The next wave of link murder?

Clearly Google is getting its act together now so let’s assume Penguin 2.0 (or Penguin 4) is going to be more ruthless.

Eric Enge, in a recent post, talks about the possibility of Google getting much closer to the citations model it originally introduced in its PageRank thesis, with any deviation from that model being a problem. I’d like to explore that idea a bit more.

Let’s look at what a citation is. The definitions are:

A quotation from or reference to a book, paper, or author, esp. in a scholarly work.

A mention of a praiseworthy act or achievement in an official report, esp. that of a member of the armed forces in wartime.

Citations also have several purposes:

To acknowledge the relevance of the works of others to the topic of discussion

To uphold intellectual honesty

To attribute prior or unoriginal work and ideas to the correct sources

To allow the reader to determine independently whether the referenced material supports the author’s argument in the claimed way

To help the reader gauge the strength and validity of the material the author has used

So if citation style links are pure gold what about the rest?

How do the above definitions apply to the many millions of hyperlinks all over the internet? How many of them are, strictly speaking, ‘citations’?

If a webpage about Alsatian dogs links to a local kennels it happens to like, how does that fit into the above definitions? Is that link still a vote? Will it still pass any power to the kennels’ web site? Will it be considered a Penguin friendly link?

Based on this thinking it seems not. That is why I think Google will deal with links like this, which make up a vast proportion of the links online, in a different way (let’s call them ‘non-citation’ links) and still assign them some value. Here’s why I think Google can’t just dump millions of links into the spam bucket.

If Google pulls the plug on anything that is not a citation then how will the search results look?

The answer to this question really depends on how much Google allows the PageRank algorithm to affect its search engine results pages. There are of course many other factors Google must take into account when ranking web pages. However, consider the following example scenario, which I’m going to use to illustrate my point.

A guy called Tim works for an insurance firm in a tiny town in Yorkshire, England. Tim is insane about all things insurance and decides to write the ultimate guide to insurance. Jill is a doctor lecturing in finance at Stanford University and decides to link to Tim’s article from the Faculty’s blog. This blog post gets shared socially in the intellectual community, is linked to from a few more university web pages that discuss insurance and then finally, due to this, gets a link from the insurance page on Wikipedia.

Tim’s web page has amassed genuine citations. In comparison, over the first ten years of Google’s life the insurance sites ranking for the keyword ‘insurance’ have amassed thousands of crap links (that have flown under the radar and helped them rank). Google then introduces its new strict ‘citation’ based Penguin 2.0 PageRank algorithm update in 2013 and boom, Tim’s site ranks number one for insurance (happy boss).

In reality there are too many other factors that affect search engine results for this to happen, but if hyperlinks are the foundation then changing things drastically could rock the boat a bit too much for the end user. Ultimately my Nan doesn’t care about link manipulation; she just wants to get the best result when she searches for bingo offers, not a ‘well linked to’ scientific paper about bingo playing statistics!

So how could Google evaluate our more commonly found ‘non-citation’ style hyperlinks?

The question to ask is why would a link from one site to another, that is not a ‘citation’, be of any use to Google in determining the value of a linked to page?

Let’s consider: are these valid reasons to assign a value to a link?

1. The owner of the site recommends the service/product

Google’s answer: So what. Unless your site is trusted we don’t care what product/services you recommend

2. The owner of the site thinks that the article is useful further reading

Google’s answer: So what. Unless we trust your opinion we don’t care what you think is further reading

3. The site/page/link is relevant to the one it is linking to

Google’s answer: So what. Unless the site is trusted we don’t care if it is relevant

Basically the theme running through the above reasons for adding value to a link is trust. So how can trust be generated to create ‘non-citation’ based link value?

If you have considered the above Google responses you might already be thinking about how content validation and site authority are going to play an important role in assigning value to links. This is why the Google+ authorship strategy is so important to Google and was even openly discussed by former Google CEO Eric Schmidt.

This means that even if a link is in content on an unknown blog, it can be verified via the author’s G+ profile and assigned a fair value. Maybe even some of the author’s ‘juice’ (sounds iffy) could be allocated to the domain authority/PR? This would make authors a valuable commercial resource in more ways than they might have imagined, but that’s a blog post for another time.

Below is my shot at how links might be rated


Relevance relevance relevance…

As for relevance, everyone is bashing on about it and of course link relevance is important but it is understanding ‘relevance’ that is important. Relevance is defined as “the condition of being relevant, or connected with the matter at hand”.

Take this example scenario. How would a machine based algorithm assign relevance based on the above definition?

Site name:

Article title: 20 things to do before you hit 20

Links to:

Is the site relevant to the linked site? No.

Is the article relevant to the linked site? No (or at least on the surface appears not to be).

Is the link relevant and useful to the reader? Yes! The article lists learning to drive as a thing to do before hitting 20. A teenager could find the linked page useful so it has relevance.

My point here is that Google is smart and is able to spot relevance that humans might not see on the surface. Bill Slawski’s excellent article does a great job of outlining how this might work.

One of Google’s search quality team was quoted as saying…

“…getting a link from a high PR page used to always be valuable, today it’s more the relevance of the site’s theme in regards to yours, relevance is the new PR.”

I have to agree that a link from a high authority, on-topic site within a tightly related article is going to be powerful, but that doesn’t mean you should write off the rest. For example, how about a link from a PageRank 9 page that is totally off topic, where the link is relevant within the context of the article? I’ll take that over a PR0 relevant link from a relevant article all day long.

So to round up

Citations are likely to be officially crowned as the most powerful links on the web when the Penguin pecks again (no surprises there, as they probably already are).

Our regular ‘non-citation’ based hyperlinks will continue to pass power but with contextual relevance, authority of site and author and social visibility used as their primary weighting factors.

Links with no authority and no author authority will pass nothing regardless of being relevant or not.

I think the only saviour for small sites with lesser known authors and no authority will be social visibility and engagement metrics but ultimately well engaged sites will build the other metrics anyway.
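For anyone who likes to see this sort of thing written down, below is a rough Python sketch of the round-up above. It is my guesswork turned into code, not a known Google formula: the `Link` class, the 0 to 1 scores and the weights are all hypothetical. It models only the weighting factors and the "no authority passes nothing" rule, and it does not attempt the social visibility rescue for small sites.

```python
from dataclasses import dataclass


@dataclass
class Link:
    # All scores are hypothetical values between 0.0 and 1.0.
    contextual_relevance: float
    site_authority: float
    author_authority: float
    social_visibility: float


# Made-up weights for illustration only; they sum to 1.0.
WEIGHTS = {
    "contextual_relevance": 0.40,
    "site_authority": 0.25,
    "author_authority": 0.20,
    "social_visibility": 0.15,
}


def link_value(link: Link) -> float:
    """Score a 'non-citation' link from 0.0 to 1.0 using the four factors."""
    # No site authority and no author authority: nothing is passed,
    # however relevant the link is.
    if link.site_authority == 0 and link.author_authority == 0:
        return 0.0
    return sum(getattr(link, name) * weight for name, weight in WEIGHTS.items())


relevant_but_unknown = Link(0.9, 0.0, 0.0, 0.5)
trusted_in_context = Link(0.8, 0.9, 0.5, 0.3)

print(link_value(relevant_but_unknown))
print(round(link_value(trusted_in_context), 2))
```

The point of the gate at the top of `link_value` is that relevance alone buys nothing: trust, through the site or the author, has to be there before any of the other factors count.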

Jason Brooks

Managing Director at UK Linkology
Life is a gift and today is the present...totally agree with that ...however, as a digital marketer I spend most of my life second guessing the future and quite enjoy it! Feel free to get in touch and share your vision, gripes, loves, hates and general marketing thoughts. I'm all ears with grey hair and a bit of a pot belly thrown in.
  • Hi Jason,
    Great insight on post penguin link building strategy. Relevancy mattered in the past and will continue to matter in future as well. More focus would be given on author rank and author associations. Sites having good reputation, high trustrank, high domain authority and high page rank would be the best bet for getting a backlink. Citations plus backlinks with proper anchor text ratio would be the key to link building. Thanks for sharing this awesome stuff.


  • First of all well done for being the first page out of the Penguin results to actually tell me something new. Other links I have been reading have just been telling me to be careful with anchor text as if we didn’t know that. Anyway I think the author thing is for real and is something everyone should be getting sorted. It may even save your site from being hit if you have a few author verified links. Real links from real people is what Google want and this is their best idea so far of getting this.

  • Jez

    Hi Jason,

    Very interesting post, can see the logic in it but I’m not sure things will go this way.

    A couple of years back relevance was more important, at that time people suggested G should lean more into PR because it was harder to create PR than relevance. For that matter it is a lot harder to build PR than it is to get a google account and set your site up with “authorship”.

    For example, there are lots of Indian tech blogs. Almost all have Google authorship, facebook and twitter profiles. They are all “on topic” and relevant, but the quality is poor. They are mostly re-hashed posts they have read elsewhere, and typically hover around PR2.

    Also consider who is going to go to the trouble of creating a “citation”. The “I don’t care about SEO I just write about what I love” brigade who Matt Cutts says we should all become aren’t going to care about citation, they are just going to link out to sites they like. The only people who are going to go to the trouble of creating a citation will be those wanting to pass value through their links, exactly the people Google do not want to trust.

    This is similar to the no-follow debacle. WooThemes, one of the largest premium theme providers, no-follow all outbound links by default in their themes. A “non SEO” site owner running a WooTheme will most likely leave it that way; a WooTheme used in a blognet on the other hand will make sure to turn that option off in the theme settings.

    Do you think the BBC will go to the trouble of creating citations for the benefit of Giggle, or do you think they will carry on writing for their readership?

    Do you think all those low grade Indian bloggers with authorship will start using citations if they become popular?

    In my opinion the majority of citations would end up being paid. Again, Matt Cutts is always saying things like “forget about google”, “pretend google doesn’t exist”… people doing that won’t cite, people in the SEO game will.

    Now I know you said High PR + Citation, not Low PR as in the case of the Indian Bloggers I mentioned, but how many sites does that leave?

    If google shrink the pool of sites they trust and allow to pass value, or weight them so highly that they become the dominant factor in their rankings then what Google does becomes a lot easier to replicate, and the variety of sites in the SERP will continue to suffer.

    I increasingly use Bing to find answers to problems because, although Google may list better sites, they list the same sites for every permutation of the query. If you don’t find the answer you need and re-phrase it, you get the same results again and again. I started using Bing after struggling to find the answer to a problem in Google on three separate occasions, around 15 minutes each time, 45 minutes in total (at least). When I eventually tried Bing the answer was #2, because Bing leans more into relevance and less into trust, domain authority etc. So whilst Google’s results are “cleaner” and less easy to manipulate, they are (in my opinion) increasingly bland and lack the diversity I want from a search engine.

    Moving to this kind of model would make that worse (in my opinion) as they would concentrate so much influence into a small core of trusted value passing sites.


  • Thanks for the in-depth comment Jez. I agree, citations cannot be the only model Google uses, or algorithm transparency will ensue and open them up to massive abuse. My summary points are that I expect citations will become more valuable and that other types of links will be judged on author visibility through G+ verification and other data they’ve collected. Google has to use multiple techniques to assign link weight or, as you suggest, risk alienating the non ‘SEO’ focused web. The problem they have is sorting the honest crap from the SEO crap. Good luck!

  • Good article Jason and I thank you for sharing. There are a lot of changes that will have to be made when it comes to link building from now on, and I believe that Google is just getting started with changing the game. There are just too many who want to cut corners and not do things organically. For myself and my clients, I’d personally rather have 10 quality back links that are original and relevant than 50 that are just bogus hyperlinks in a worthless blog that was paid to include them.


    • Thanks Bradley,
      That’s our thinking too. The time for building links en masse is long over, although some would have you believe otherwise! This article is more about the perfect link than what I think is working ‘right now’ and it’s what we are aiming for in the next year or so. The whole G+ content/author verification thing could be very powerful for Google but getting enough people on board is their immediate problem. For now quality links in quality content, whether G+ verified or not, are the only way forward, and at least for now those that follow this rule know that many of the web spammers are being kept out of the SERPs.