The effects of the Panda update have been profound. It was far from perfect, and a number of perfectly good content sites were knocked out along with the bad. However, Panda ultimately improved many SERPs, returning more trustworthy information from highly reputable sources that is more relevant to the query. It also forced many content networks to introduce more stringent checks on the quality of articles allowed on their sites.
When Panda was initially released, the first sites to be heavily affected were article submission sites such as ezinearticles.com, squidoo.com and hubpages.com, press release sites, and content aggregator sites. The initial rollout impacted about 12% of all searches, and over the past few years more than 30 further updates have hit many low-quality content-farming sites and the sites linked to them.
Conversely, trusted brands with lots of high-quality, unique content appeared to score high and were largely unaffected by the update. In fact, many of those types of sites saw their rankings rise.
Soon after Panda was released, Google published a list of 23 questions that human raters were asked to consider when distinguishing higher quality sites from lower quality ones. This was intended to help webmasters “step into Google’s mindset”.
The Panda algorithm works by grading sites through advanced algorithms that search for patterns indicating whether a site is of high or low quality relative to a given search query. These patterns are based on data from human search quality raters employed by Google.
It is therefore important to understand that Panda starts off from a human point of view, not a machine’s. These raters were asked to rate sites in three areas: quality, design, and trustworthiness (trustworthiness being the biggest factor). The raters looked at hundreds of websites and grouped them into good and bad, and Google then tried to work out mathematically what separated the two groups.
The raters were asked questions such as:
- Would you be comfortable giving this site your credit card?
- Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
- Is this the sort of page you’d want to bookmark, share with a friend, or recommend?
- Does the article provide original content or information, original reporting, original research, or original analysis?
- Is this article written by an expert or enthusiast who knows the topic well, or is it shallower in nature than top ranking articles that address the same topic?
- Do you consider this site to be authoritative?
- Would you expect to see this article in a printed magazine?
- Does this article have an excessive amount of ads that distract from or interfere with the main content?
- Does the page provide substantial value when compared to other pages in search results?
For example, a website secured with the HTTPS protocol is more likely to earn an affirmative answer to the question “Would you be comfortable giving this site your credit card?”. From all of the data gathered by these human raters, Google built a profile of what a high quality site looks like. Computers, using machine learning, are then brought in to mimic the human raters.
When the algorithm becomes accurate enough at predicting what the humans scored, it’s then unleashed on millions of sites across the Internet, filtering out sites that are likely to perform poorly on the questionnaire. Every site that shows up in the Google SERP is evaluated and given a score based on the above factors.
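The rater-mimicking step described above can be sketched in miniature. The following is purely illustrative, not Google's actual model: it trains a tiny logistic-regression classifier on invented, human-labeled "site quality" examples, then scores unseen sites. The features (share of original content, ad density, article length) are assumptions chosen to echo the rater questions, not known Panda inputs.

```python
import math

# Each example: (fraction of original content, ad density 0-1,
# avg words per article / 1000), label 1 = rated high quality.
# All data here is invented for demonstration.
TRAIN = [
    ((0.95, 0.10, 1.8), 1),
    ((0.90, 0.20, 1.2), 1),
    ((0.85, 0.15, 2.5), 1),
    ((0.20, 0.70, 0.2), 0),
    ((0.10, 0.80, 0.3), 0),
    ((0.30, 0.60, 0.1), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with plain gradient descent."""
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Model's probability that a site would be rated high quality."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

w, b = train(TRAIN)
# A content farm: little original content, heavy ads, very short articles.
farm = predict(w, b, (0.15, 0.75, 0.2))
# A reputable publisher: original, lightly monetized, substantial articles.
brand = predict(w, b, (0.92, 0.12, 1.5))
print(f"content farm: {farm:.2f}, trusted brand: {brand:.2f}")
```

Once such a model agrees closely enough with the human labels, it can be applied cheaply to millions of sites the raters never saw, which is the scaling trick the paragraph above describes.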
Most of these questions, however, are open to interpretation, and webmasters were left to infer what Panda actually measures.
Over time, the SEO community has come together to analyze websites that were hit by Panda and arrived at the following conclusions about pages that get penalized:
- The content is poorly written (such as content that has been “spun” using software)
- The content is very short (“shallow” content that is too brief to be valuable)
- The content is mostly duplicate content (copied from another page)
- The content adds no real value
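The "mostly duplicate content" trait above can be detected mechanically. This is not Google's actual method, but a standard textbook technique (w-shingling plus Jaccard similarity) that flags pages copied with only cosmetic edits; the sample texts are invented:

```python
def shingles(text, w=3):
    """Set of overlapping w-word shingles (word n-grams) from a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

original = ("Panda rewards sites that publish original reporting "
            "and in-depth analysis written by genuine experts")
copied = ("Panda rewards sites that publish original reporting "
          "and in-depth analysis written by real subject experts")
unrelated = "price comparison pages with thin templated descriptions"

sim_copy = jaccard(shingles(original), shingles(copied))
sim_diff = jaccard(shingles(original), shingles(unrelated))
print(f"near-duplicate: {sim_copy:.2f}, unrelated: {sim_diff:.2f}")
```

A lightly reworded copy still shares most of its shingles with the source and scores high, while genuinely different text scores near zero, which is why "spun" content remains detectable.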
Google acknowledged that the launch of “Google Caffeine” (Google’s previous update, designed to build a faster and more comprehensive search engine) had actually played a role in increasing the visibility of ‘shallow’ content. Panda was thus a post-Caffeine solution to ensure that the quality of search results was not jeopardized. According to Google, the initial Panda change affected about 12% of queries to a significant degree.
Essentially, the following types of sites were hit hardest by the Panda update:
- E-commerce sites that use identical product descriptions across a number of sites. To meet the “high quality” and “unique” content guideline, every product needs its own unique description and listing.
- High volume content farms with low quality content. (A content farm produces large amounts of content specifically to attract traffic from search engines and use those page views to generate easy advertising revenues.)
- Thin affiliate sites that use stock product descriptions shared across many other sites.
- Businesses with multiple sites that contain nearly identical content on each site.
- Sites with poor user-experience signals. User experience had always mattered to Google before Panda, but after the update it became a significant ranking factor. A high bounce rate for a particular keyword might suggest that your site did not satisfy the searcher and does not offer a good answer for that query.
- Sites stuffed with ads, built specifically to host AdSense units.
- Large content networks with lots of low quality or duplicated content such as Squidoo and Hubpages.
- Price comparison sites with thin content.
- Travel sites with poor or duplicated reviews.
- Websites with poor usability and branding.
- Sites that don’t seem professional or trustworthy.
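The bounce-rate signal mentioned in the list above is easy to compute yourself from your own analytics data. The sketch below is illustrative only: the session records, field names, and the "one pageview = bounce" definition are assumptions (matching the classic Google Analytics definition), not anything Panda-specific.

```python
from collections import defaultdict

# Hypothetical session logs: which search keyword a visitor landed
# on, and how many pages they viewed before leaving.
sessions = [
    {"keyword": "buy widgets", "pageviews": 1},
    {"keyword": "buy widgets", "pageviews": 1},
    {"keyword": "buy widgets", "pageviews": 4},
    {"keyword": "widget reviews", "pageviews": 3},
    {"keyword": "widget reviews", "pageviews": 2},
]

def bounce_rates(sessions):
    """Map each keyword to its share of bounced (single-page) sessions."""
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for s in sessions:
        totals[s["keyword"]] += 1
        if s["pageviews"] == 1:
            bounces[s["keyword"]] += 1
    return {k: bounces[k] / totals[k] for k in totals}

rates = bounce_rates(sessions)
print(rates)  # "buy widgets" bounces in 2 of its 3 sessions
```

A keyword whose landing page bounces most of its visitors, as "buy widgets" does here, is exactly the kind of page worth rewriting or consolidating before an algorithm draws the same conclusion.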