Trust is a popular metric for determining the authority level of a website or page. The concept of trust is based on a research paper by Yahoo and Stanford University researchers that introduced TrustRank. TrustRank is based on the idea that good sites tend to link to other good sites. It begins with a seed set of high-trust sites and maps the outbound links from them. Sites closest to the seed set are rated higher than sites linked further away. TrustRank makes sense. However, there are several problems with it. Subsequent research has determined that TrustRank is unreliable because it is biased toward certain kinds of sites, and inaccurate because other methods were able to identify spam by a margin as high as 43% better. A further shadow on TrustRank is the fact that neither Google nor Bing is on record as using any form of TrustRank. So if the concept of TrustRank itself is unreliable for determining the authority of a site, why is Trust one of the leading metrics in use by the search marketing industry?
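To make the seed set idea concrete, here is a minimal Python sketch of trust propagation written as a seed-biased PageRank. The function name `trustrank`, the damping value of 0.85, the iteration count and the toy link graph are all assumptions made for illustration. They are not taken from the Yahoo/Stanford paper, from any search engine, or from any third party tool.

```python
def trustrank(links, seeds, damping=0.85, iterations=20):
    """Propagate trust outward from a seed set.

    links: dict mapping each page to the list of pages it links to.
    seeds: set of pages assumed to be trustworthy.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # All starting trust sits on the seeds, split evenly between them.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Each round, part of the trust is re-injected at the seeds only.
        nxt = {p: (1 - damping) * seed_score[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # simplification: pages without outlinks pass nothing on
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical graph: "seed" links to a and b, a links to c, c links to d.
links = {"seed": ["a", "b"], "a": ["c"], "b": [], "c": ["d"], "d": []}
scores = trustrank(links, {"seed"})
for page in ["seed", "a", "b", "c", "d"]:
    print(page, round(scores[page], 4))
```

In this toy graph, page a (one hop from the seed) ends up with more trust than c, which in turn ends up with more than d, which matches the idea that sites closer to the seed set are rated higher.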
Does Google use TrustRank?
Over the years, employees from Google and Bing have used words and phrases like trust, trustworthy, and trusted sites, but never within the context of starting with a seed set of trusted sites and mapping the outbound links (the Seed Set Methodology of calculating TrustRank). More importantly, neither Bing nor Google is on record as confirming the use of any variation of TrustRank. In fact, Google employees are on record multiple times affirming that they do not use TrustRank. Yes, Google has a patent on something called Trust Rank, but Bill Slawski wrote an article that explains why that patent is unrelated to Seed Set TrustRank as described in the Yahoo/Stanford research paper. Google engineer Matt Cutts is on record as stating that Google does not use TrustRank, and he explained that their anti-phishing patent, coincidentally named Trust Rank, has nothing to do with the Seed Set method of calculating TrustRank. It is fair to say that it is highly likely that Google and Bing do not use TrustRank. Some may argue that even if Google is not using TrustRank, a TrustRank metric can still be useful for making business decisions. Is TrustRank still useful?
TrustRank teardown
Not long after the TrustRank paper was published, researchers put it to the test and found flaws in the method. Here are the flaws:
1. TrustRank is biased in favor of sites in popular topics, that is, topics with more seeds in the set. This is a bias against niche topics, which have fewer sites in the seed set simply because there aren’t that many sites about those niche topics. In a research paper on Topical Trust Rank, it was noted that “TrustRank has a bias towards larger communities… TrustRank has a bias towards communities with more seeds in the seed set.” This means that TrustRank is biased against sites in niche topics for which the seed set contains few if any seeds. Conversely, TrustRank is biased in favor of popular topics like sports and entertainment, which can accumulate a higher level of “trust” because there are more sites from those topics in the seed set. This creates the situation where an untrustworthy site can accumulate links from sites in popular topics and gain a higher level of TrustRank than it deserves. This is in fact what happens with TrustRank. Subsequent research determined that alternate methods were 19% to 43% better than TrustRank at discovering spam.
2. TrustRank contains a bias against sites with inbound links from pages containing many other links
In another research paper it was noted that TrustRank is partially dampened by the number of outgoing links, creating yet another negative bias. The dampening means that each link from a trusted page with twenty outgoing links passes less trust than it would if that page had ten outgoing links. Of course, if an authority page links out to twenty pages you would think it trusts all twenty pages equally. That makes sense, but it’s not how TrustRank works. TrustRank contains a bias against sites that accumulate links from pages containing many links. Thus, the trust metric calculation becomes less accurate. This may or may not apply to third party metrics. But if you are basing business decisions on trust metrics, then you deserve to know whether trust is dampened by the number of outgoing links. If it is, the trust metrics you are using carry this flaw.
3. TrustRank is only as good as the seed set.
A seed set can consist of sites listed in DMOZ, a list of outbound links from Wikipedia, or any combination of handpicked sites. The providers of third party trust metrics do not divulge the source of the seed set, and with good reason. Once the seed set is known, the metrics can be gamed. But in a way, it doesn’t matter what the third party seed set consists of, because the set represents the opinion and preconceived notions of the third party. For example, some metric providers believe in a quality called “inherent trust.” They believe that government or university links are inherently trustworthy. This creates a bias that may not in fact exist at Google or Bing, especially because search engineers are already on record as stating there is no trust inherent in a dot edu link that makes it better than any other link. So if a TrustRank metric contains a bias based on something like the belief that .gov and .edu links contain inherent trust, then its seed set may not reflect the reality of any algorithm in use at Google or Bing, and that bias may cause that third party TrustRank metric to become inaccurate and unreliable.
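The first two flaws in the list above come down to simple arithmetic, sketched here in Python. Everything in this block is an assumption for illustration: the 0.85 damping factor, the even split of starting trust across seeds, the function names and the example domains. It is not the calculation used by any search engine or third party metric.

```python
from collections import Counter

DAMPING = 0.85  # assumed damping factor, chosen only for illustration


def starting_trust_by_topic(seed_topics):
    """Share of the starting trust held by each topic when every seed
    receives an equal slice (1 / number of seeds)."""
    total = len(seed_topics)
    counts = Counter(seed_topics.values())
    return {topic: n / total for topic, n in counts.items()}


def trust_per_link(page_trust, outgoing_links, damping=DAMPING):
    """Trust carried by each individual link when a page splits the trust
    it passes on evenly across all of its outgoing links."""
    return damping * page_trust / outgoing_links


# Hypothetical seed set: nine sports sites and one niche hobby site.
seed_topics = {f"sports-{i}.example": "sports" for i in range(9)}
seed_topics["beekeeping.example"] = "beekeeping"

# Flaw 1: the popular topic starts with nine times the trust of the niche.
shares = starting_trust_by_topic(seed_topics)
print(shares)

# Flaw 2: the same trusted page passes half as much per link with 20 links
# as it does with 10 links.
ten = trust_per_link(1.0, 10)
twenty = trust_per_link(1.0, 20)
print(round(ten, 4), round(twenty, 4))
```

Under these assumptions, a site linked from the sports seeds can draw on far more starting trust than a site linked from the single niche seed, and a link from a page with twenty outgoing links is worth half of a link from an equally trusted page with ten.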
In relevance we trust
While third party metrics will always reflect the professional opinions of those creating them and can never represent what is actually in use at Google or Bing, I was intrigued by Majestic’s Topical Trust Flow metric because previous research has shown that segmenting the seed sets by topic helped overcome the flaws in the original TrustRank algorithm. So I asked Majestic for permission to trial it, a courtesy Majestic extended to me. I am acquainted with Dixon Jones of Majestic but I do not have any business relationship with the company. What intrigued me about their Topical Trust Flow is the existence of a research paper about Topical Trust Flow that found that ordering the seed sets by topic improved spam detection by 19% to as much as 43%. I asked Dixon about that paper and he replied that their Topical Trust Flow calculations are proprietary and not based on any other work. Regardless of the methodology, the results are impressive. For example, a site with few inbound links could be considered low trust. However, Majestic’s Topical Trust Flow shows which topic categories that site’s inbound links are coming from. This is important because relevance is one of the most important ranking factors. With Topical Trust Flow I can quickly tell the difference between a site with few but relevant links and a site with few but irrelevant links. Clearly the site with few but relevant links is very important and should not be overlooked. In my opinion a link from it would be more important than a link from an irrelevant site. The decision of whether a link is relevant or not is easier with Majestic’s tool. From my experience, the relevance factor is important to ranking. Thus, while the trust score can be said to be a measure of the quantity of inbound links, topical categorization of the seed set identifies the relevance qualities of the inbound links. This makes Majestic’s Topical Trust Flow a useful search marketing tool.
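The distinction between few but relevant links and few but irrelevant links can be sketched in a few lines of Python. This is not Majestic’s method, which is proprietary. The topic labels, the two example sites and the function names are hypothetical, and in practice the topic of each linking site would come from whatever tool you use.

```python
from collections import Counter


def topic_breakdown(inbound_link_topics):
    """Fraction of inbound links that come from each topic category."""
    counts = Counter(inbound_link_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def relevant_share(inbound_link_topics, site_topic):
    """Fraction of inbound links whose topic matches the site's own topic."""
    if not inbound_link_topics:
        return 0.0
    return topic_breakdown(inbound_link_topics).get(site_topic, 0.0)


# Two hypothetical gardening sites, each with only four inbound links.
site_a_links = ["gardening", "gardening", "home", "gardening"]
site_b_links = ["casino", "pharmacy", "gardening", "casino"]

share_a = relevant_share(site_a_links, "gardening")
share_b = relevant_share(site_b_links, "gardening")
print(share_a, share_b)
```

Both sites have the same number of inbound links, so a count alone cannot separate them. The topical breakdown shows that most of the first site’s links come from its own topic while most of the second site’s links do not.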
Who do you trust?
Ultimately, there could still be a form of trust that’s being used by the search engines. For example, the Panda algorithm examines on-page factors to weed out low quality pages. What’s left could be said to be a seed set of trusted sites. A massive seed set of trusted sites to be sure, culled from the entire Internet and not just from a small handpicked seed set. A seed set created by Panda could thus represent a cleaner link signal from which to calculate what is shown in the SERPs. But that’s just speculation. Nevertheless, while the original seed set methodology is unreliable, by using multiple metrics such as Majestic’s Topical Trust Flow combined with various metrics provided by LinkResearchTools, one could come up with fairly accurate data for link acquisition purposes.
The key points are that the method described in the original TrustRank paper has been shown to be unreliable and that TrustRank should not be confused with Google’s anti-phishing patent called Trust Rank. It must be understood that search engines are on record as stating that there is no ranking factor called Trust in use. There are ranking factors referred to as Trust Factors that demonstrate that a site is trustworthy, but those Trust Factors are generally aspects of a site that describe how spammy or non-spammy it is. Typical trust factors may involve characteristics such as keyword use, outbound links and inbound links, among many other points of examination. Lastly, Topical Trust Flow as offered by Majestic, in combination with other third party metrics (and your own good judgement), is useful for determining whether a link can be trusted for a link building project.
Roger Montti
- TrustRank Teardown – Is Trust a Useful Metric? - January 7, 2015
Dixon Jones says
Hi Roger,
Brilliant post and I am glad that in 2015 the industry is starting to come back to this question of topics/categorisation. One “typo” above, is that your link to the ryansossi research paper about “Topical Trust Flow” is actually a research paper about “Topical Trust Rank” and whilst there are similarities in the concept – as you mention, Majestic’s methodology is our own as is the trade mark “Topical Trust Flow”. (Sorry for being a pedant.)
Whilst there has not been a clear signal from the engines that they are using these ideas, MC did talk about “Topical Page Rank” in a video in March last year and at the time suggested that using these as part of the algo was on the horizon – but of course he then went on extended leave. A quite well-known French SEO, @512Banque, has done some interesting research which suggests there is a strong correlation already between SERPs and Categories – so whether or not Topical {Word of your choice Here} is directly causing this or is simply a by-product of other algorithmic elements, the correlation does seem to exist.