What is PageRank?
PageRank is one of Google's algorithms responsible for determining the ranking of a web page in search engine results pages.
The public PageRank score that Google displayed in its toolbar ranged from 0 to 10, with 0 representing low-quality pages and 10 representing high-quality pages.
The algorithm evaluates the ranking of a web page by analyzing the links pointing to it. It considers not only the quantity of these links but also their quality.
These links, known as backlinks or inbound links, are clickable links on other web pages that point to a specific page on your website.
While backlinks can support the information available on your website's page, this support may not be the only factor Google uses to determine your page’s ranking.
This is because Google's PageRank algorithm conducts both qualitative and quantitative analyses.
For instance, if we consider links as "votes" and a web page receives five backlinks, PageRank will examine the authority of each of the five pages from which these backlinks originate.
Authority is the value or weight a search engine assigns to a web page - higher authority generally correlates with a better position in search results.
History of PageRank
In 1997, Larry Page and Sergey Brin developed the PageRank algorithm at Stanford University as part of a research project.
Prior to PageRank, search engines ranked pages based on the presence of keywords in various parts of the content, such as the title, H1, subheadings, and URLs.
This approach led to keyword stuffing, a tactic used by webmasters to inflate their content's keyword density to achieve higher rankings. As a result, many low-quality web pages ranked high in search engines.
In 1996, Robin Li created RankDex, the first search engine to determine page rankings based on hyperlinks.
PageRank was developed after RankDex, and many believe it was inspired by it.
With the introduction of the PageRank algorithm, Google started considering not only keywords and other signals but also link analysis to evaluate web pages.
Timeline of PageRank
This table provides a detailed timeline of the key events related to PageRank and its evolution in combating link spam.
Date | Event
January 9, 1998 | First PageRank patent filed
March 15, 2000 | Google Directory launched
December 11, 2000 | PageRank launched in the Google toolbar
January 18, 2005 | Introduction of rel="nofollow" attribute
November 17, 2005 | PageRank integrated into Google Sitemaps
October 15, 2009 | PageRank was removed from Google Sitemaps
July 25, 2011 | Google Directory shut down
April 24, 2012 | Launch of the original Penguin algorithm
October 16, 2012 | Introduction of the disavow tool
December 6, 2013 | Last update of PageRank in the toolbar
March 7, 2016 | Removal of PageRank from the Google toolbar
September 23, 2016 | Launch of Penguin 4.0
January 9, 2018 | PageRank patent expired
September 10, 2019 | Addition of rel=ugc and rel=sponsored attributes
July 26, 2021 | Launch of the first Link Spam Update
December 14, 2022 | Launch of the Link Spam Update with an AI-based detection system called SpamBrain
The origins of PageRank can be traced to January 9, 1998, when its first patent was filed under the name "Method for node ranking in a linked database."
This patent expired on January 9, 2018, and was not renewed. On March 15, 2000, Google publicly introduced PageRank through the launch of the Google Directory, a version of the Open Directory Project sorted by PageRank. However, the directory was eventually closed on July 25, 2011.
Following the launch of PageRank in the Google Directory, Google introduced PageRank in the Google toolbar on December 11, 2000, which became the version that most SEOs focused on.
However, the toolbar version of PageRank was last updated on December 6, 2013, and was eventually removed on March 7, 2016.
PageRank was also incorporated into Google Sitemaps, now known as Google Search Console, on November 17, 2005. However, it was removed on October 15, 2009.
Over time, SEOs began exploiting the system to gain more PageRank and achieve better rankings, resulting in the development of various link schemes.
In response, Google, in collaboration with other major search engines, introduced the rel="nofollow" attribute on January 18, 2005, which was intended to counter spam.
However, SEOs began misusing nofollow for PageRank sculpting. Consequently, Google modified the way nofollow worked in 2009 to address this issue.
On September 10, 2019, Google introduced additional link attributes, such as rel=ugc and rel=sponsored, to provide more specific options.
To fight link spam, Google released the Penguin algorithm on April 24, 2012, causing penalties for many sites.
Google then introduced the disavow tool on October 16, 2012, to aid site owners in recovery.
With Penguin 4.0, launched on September 23, 2016, Google began devaluing spam links instead of punishing sites.
More recently, Google launched the first Link Spam Update on July 26, 2021, and followed up with another update on December 14, 2022.
The 2022 update used SpamBrain, an AI-based detection system designed to devalue unnatural links.
In short, Google has consistently improved PageRank over time to combat various link spam techniques and maintain the integrity of search results.
How does it work?
The PageRank algorithm assigns a numerical score to each web page, which helps determine its position in search engine results.
Here's a simplified explanation of how PageRank works:
Inbound links: PageRank views links from one page to another as votes. When a page links to another, it is seen as endorsing or recommending the target page. The more inbound links (backlinks) your page has, the more important the algorithm deems it.
Link quality: Not all links are equal. PageRank considers the quality of the linking page when determining the value of a link.
A link from an authoritative, relevant, and high-quality page carries more weight than a link from a low-quality or irrelevant page.
Damping factor: The damping factor, usually set to 0.85, represents the probability that a user will keep clicking on links while browsing.
With the remaining probability (0.15 in that case), the user is assumed to stop following links and jump to a random page, which keeps PageRank from piling up in closed loops of pages that link only to each other.
Probability distribution: PageRank uses a probability distribution based on the random surfer model to estimate the likelihood of a user visiting a particular page.
This probability is spread across all the pages that are connected to one another by links.
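The random surfer model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three-page link graph, the function name simulate_random_surfer, and the step count are hypothetical and chosen for illustration, and the code is not how Google computes anything.

```python
import random

def simulate_random_surfer(links, steps=100_000, d=0.85, seed=0):
    """Estimate how often a random surfer lands on each page.

    links maps each page to the list of pages it links to
    (hypothetical data, not taken from any real site).
    """
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        if outgoing and rng.random() < d:
            # With probability d, follow one of the links on the current page.
            current = rng.choice(outgoing)
        else:
            # Otherwise (or on a page without links), jump to a random page.
            current = rng.choice(pages)
    return {page: count / steps for page, count in visits.items()}

# Hypothetical site: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
frequencies = simulate_random_surfer(example_links)
for page, share in sorted(frequencies.items()):
    print(page, round(share, 3))
```

In this toy graph, page B receives only one of A's two links, so the surfer visits it least often, while A and C end up with similar shares.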
The PageRank equation is as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
PR(A): The PageRank of page A.
PR(T1), PR(T2), ..., PR(Tn): The PageRank scores of pages linking to page A.
C(T1), C(T2), ..., C(Tn): The number of outbound links on pages T1, T2, ..., Tn.
d: The damping factor, which is usually set to 0.85.
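As a rough illustration, the equation above can be applied repeatedly to a tiny link graph until the scores settle. The graph, the function name pagerank, and the iteration count below are assumptions made for this sketch, and it assumes that every page has at least one outbound link.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) on a small graph.

    links maps each page to the pages it links to. Every page is assumed
    to have at least one outbound link (a simplification for this sketch).
    """
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Sum PR(T) / C(T) over every page T that links to this page.
            incoming = sum(
                ranks[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks

# Hypothetical site: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
for page, score in sorted(ranks.items()):
    print(page, round(score, 3))
```

Page C collects links from both A and B, so it ends with the highest score in this example, and B, which receives only half of A's vote, ends with the lowest.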
By calculating the PageRank scores for all web pages, Google can rank pages in search results based on their importance and relevance, ultimately providing more accurate and useful search results to users.
FYI: The equation above comes from the original PageRank paper and is not what Google uses today. The way PageRank is calculated has changed considerably since then.
How has it changed over time?
Several changes have been made to the PageRank system over the years:
Link Value Variation
Rather than equally distributing PageRank among all links on a page, some links are now valued more than others.
Google may have switched from a random surfer model (where a user might visit any link) to a reasonable surfer model (where some links are more likely to be clicked and thus carry more weight).
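The difference between the two models can be sketched as a change in how a page's vote is split among its links. The weights below are purely hypothetical (Google has not published how it values individual links), and distribute_rank is an illustrative name, not a real API.

```python
def distribute_rank(page_rank, link_weights, d=0.85):
    """Split the rank a page passes on in proportion to per-link weights."""
    total = sum(link_weights.values())
    return {
        target: d * page_rank * weight / total
        for target, weight in link_weights.items()
    }

# Random surfer: every link on the page is equally likely to be clicked.
equal = distribute_rank(1.0, {"/nav-link": 1, "/body-link": 1, "/footer-link": 1})

# Reasonable surfer: hypothetical weights that favour a prominent in-content link.
weighted = distribute_rank(1.0, {"/nav-link": 2, "/body-link": 5, "/footer-link": 1})

print(equal)
print(weighted)
```

The total passed on is the same in both cases. Only the split changes, so a prominent link gains value at the expense of less prominent ones.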
Ignoring Certain Links
Google now ignores the value of certain links through various mechanisms, such as:
a. Nofollow, UGC, and sponsored attributes.
b. Google's Penguin algorithm.
c. The disavow tool.
d. Link Spam updates.
Google also does not count links on pages blocked by robots.txt, as it cannot crawl these pages to see the links.
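As an illustration of the first mechanism in the list above, the sketch below separates links that carry a nofollow, ugc, or sponsored value from ordinary links. It is a simplification: the sample HTML and the LinkCollector class are hypothetical, and Google has said that it treats these attributes as hints rather than strict rules.

```python
from html.parser import HTMLParser

IGNORED_REL_VALUES = {"nofollow", "ugc", "sponsored"}

class LinkCollector(HTMLParser):
    """Sort the links on a page into counted and ignored lists."""

    def __init__(self):
        super().__init__()
        self.counted = []
        self.ignored = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may hold several space-separated values, e.g. "ugc nofollow".
        rel_values = set((attributes.get("rel") or "").lower().split())
        if rel_values & IGNORED_REL_VALUES:
            self.ignored.append(href)
        else:
            self.counted.append(href)

sample_html = """
<a href="https://example.com/guide">Guide</a>
<a href="https://example.com/ad" rel="sponsored">Ad</a>
<a href="https://example.com/comment" rel="ugc nofollow">Comment</a>
"""

collector = LinkCollector()
collector.feed(sample_html)
print("counted:", collector.counted)
print("ignored:", collector.ignored)
```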
Google has a canonicalization system that helps it determine which version of a page should be indexed and consolidates signals from duplicate pages to the main version.
Canonical link elements were introduced in 2009, allowing users to specify their preferred version of a page.
Redirects were initially said to lose about the same amount of PageRank as a link, but this has changed, and no PageRank is currently lost through a redirect.
The treatment of links on noindex pages remains uncertain. According to John Mueller, pages marked as noindex will eventually be treated as noindex, nofollow, meaning that the links will stop passing any value.
However, Gary Illyes suggests that Googlebot will discover and follow the links as long as a page still has links to it.
These statements aren't necessarily contradictory, but if Gary's statement holds, it could be a long time before Google stops crawling and counting links, perhaps never.
Which factors do not affect it?
PageRank sculpting is the practice of using the nofollow attribute to stop links from sharing the link power or, in other words, stop voting for certain pages to provide more link power to other web pages.
The nofollow attribute is a value you add to a link's rel attribute in a web page's HTML code to tell the search engine not to count that link. In other words, the search engine will not pass link power to the linked page.
Until 2009, SEOs utilised sculpting to benefit some of their web pages. For instance, suppose five pages each get an internal link from a page of high authority, and you add a nofollow attribute to one of those links.
This way, 100 per cent of the link power was divided among the remaining four pages.
This practice does not work today because you can no longer control PageRank flow by using nofollow attributes.
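The arithmetic behind this example can be sketched as follows. The sketch assumes the behaviour Google described when it changed nofollow in 2009: link power is divided across all links on the page, and the share assigned to a nofollowed link is dropped rather than redistributed. The function name is hypothetical and the damping factor is left out for simplicity.

```python
def share_per_followed_link(total_links, nofollowed_links, pre_2009):
    """Return the fraction of a page's link power each followed link receives."""
    followed_links = total_links - nofollowed_links
    # Before 2009 only followed links shared the link power; afterwards all
    # links share it, and the nofollowed portion is simply lost.
    divisor = followed_links if pre_2009 else total_links
    return 1.0 / divisor

before = share_per_followed_link(5, 1, pre_2009=True)
after = share_per_followed_link(5, 1, pre_2009=False)
print(before)  # 0.25: four followed links split 100 per cent
print(after)   # 0.2: five links split it, and one share is discarded
```

Under this assumption, adding nofollow to one of the five links no longer raises the share of the other four, which is why sculpting stopped paying off.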
Outbound links are links going from your web page to another page. They do not reduce the PageRank of the page they are placed on.
Is it still important?
Although Google now relies on many other ranking signals, the PageRank algorithm is still in use, as Google's Search Advocate, John Mueller, has confirmed:
"Yes, we do use PageRank internally, among many, many other signals. It's not quite the same as the original paper, there are lots of quirks (eg, disavowed links, ignored links, etc.), and, again, we use a lot of other signals that can be much stronger."
So, it is still important because the ranking of a page relative to other pages on the web reflects the authority of both the page and the website.
However, Google removed it from the toolbar in 2016 because of the bad practices SEOs had adopted. They manipulated this factor by buying links from pages with a high PageRank.
Google did not want SEOs to be able to check the PageRank score each time they tried to manipulate it. It is therefore hidden now, but it is still at work.
That does not mean you should stop considering it; keep working on it by strengthening your link building.
Why is it crucial to SEO?
It is crucial to SEO because a high PageRank improves a web page's chances of a good position in search results. It also reflects how pages outside your website acknowledge your page's authority when they link to it.
In addition, its removal from Google's toolbar makes it more important, because the score can no longer be checked and gamed as easily.
Should you really bother about it?
Consider it one of the factors that Google takes into account, but do not treat it as the only criterion for ranking.
In fact, many websites had a toolbar PageRank of 10 and still did not hold the top position. For instance, Adobe had a PageRank of 10, and it still did not always rank first.
This is because many other factors contribute to ranking along with it, such as quality content, keywords, and user experience.
PageRank is an algorithm developed by Google to evaluate a web page's quality and determine its ranking in search engine results pages.
PageRank considers the quantity and quality of backlinks or inbound links.
Although Google removed PageRank from the toolbar in 2016, it still plays a crucial role in SEO because it contributes to a web page's authority and its position in search results.
However, it is only one of many factors that Google considers, and quality content, user experience, and keywords also contribute to a web page's ranking.