Probably no one wants the Panda extinct as much as webmaster and small publisher Patrick Jordan.
Jordan runs a blog called justanotheripadblog.com. While we're about to highlight an individual case, it's worth noting that plenty more small publishers are affected. A simple scan through the support forums will show you as much.
We saw Jordan's sorry tale outlined on SEOBook.com, an SEO blog which is aghast at Google's odd algorithm rejigging.
Like many webmasters, Jordan noticed a severe dip in organic traffic. A Google search reviewer told him the drop could be explained by a quick, simple test. The reviewer typed a sentence from Jordan's website into Google and found it reproduced all over the web.
Which is a strange reason to cite, since that is precisely the problem Jordan was trying to point out anyway. He tested the same sentence himself and lists the higher-ranking results, all of them 'scraper' sites.
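For readers who want to repeat the test, here is a minimal sketch in Python of how such an exact-phrase search can be put together. The function name is our own, and it only builds the search address to open in a browser; it does not fetch or rank anything.

```python
from urllib.parse import urlencode


def exact_phrase_query_url(sentence: str) -> str:
    """Build a Google search URL that looks for the sentence as an exact phrase.

    Hypothetical helper for illustration only.
    """
    # Collapse runs of whitespace and drop any quotes already in the text,
    # so the whole sentence can be wrapped in one pair of double quotes.
    cleaned = " ".join(sentence.replace('"', "").split())
    phrase = '"' + cleaned + '"'
    return "https://www.google.com/search?" + urlencode({"q": phrase})


if __name__ == "__main__":
    print(exact_phrase_query_url("A sentence copied from your own article"))
```

Paste in a distinctive sentence from one of your own articles, open the resulting address, and see which sites appear above yours.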
Scraper sites automatically re-post original content without consent. Currently they are ranking higher than established publications in the UK, including Pocket-Lint, Reg Hardware and Computer Weekly. This fellow's blog fell into the same category and was automatically penalised while the content farms the update was supposed to kill continued to rank higher. It's a quality thing, is it?
A serious problem with Panda is that Google, as a web search outfit and self-confessed organiser of information, cannot write any code that can detect good or bad "content". The idea is admirable but in practice it reeks. Microsoft before it attempted to curb what it didn't like, and fortunately, the company's efforts fell flat on their keister. Bing as a possible saviour to Google's arrogant monopoly feels like Dada in the 21st century.
Likewise, by culling its credibility with start-ups there is a chance Google is giving the upper hand to its disdainful rival in Rupert Murdoch. Google hasn't figured out publishing and when engineers are at the helm telling people what they should and shouldn't read, it never will.
A comms spokesperson for Google could not answer us when we asked what the heck was going on. The spokesperson said he was not able to talk to us about the technical side and repeatedly directed us to that vague Google Panda blog post which we reported on earlier. What it does not answer, and what the internal Google spokesperson could not answer, is how copyright-violating scraper sites rank above their sources.
The UK's Gareth Evans, Global Comms & Public Affairs in Google Search, Commerce & Social reluctantly replied to TechEye, taking three days to fob us off: "Apologies for the delayed response. As you probably guessed, I'm not going to be able to give you much more info than you've already received.
"But on your question about what constitutes a 'low quality' site, we published a new blogpost on Friday to address that very issue in as much detail as we can.
"I hope this is of some use."
It wasn't.
To add insult to injury, SEO Book reports that Jordan, sick of being ignored, resorted to AdWords to bring some further revenue to his site - which Google denied. Meanwhile, scraper sites that rip off Jordan's content, much like with other small publishing houses, continue to profit from stolen articles full of Google's own adverts.
We have asked Mountain View several more questions and wait for a response.
You fell victim to one of the classic blunders -- The most famous of which is "never get involved in a land war in Asia" - but only slightly less well-known is this: "Don't substitute your idea of 'quality' for Google's idea of 'quality'".
You don't have to agree with their idea but you do have to accept it if you want to participate in Google's search experience.
The increasingly silly analyses coming out of SEO Book are not helping anyone understand the algorithm or the process or the system. Disagreeing with the questions, trying to ridicule them away -- that's no more effective than any other form of denial from the Kübler-Ross model of the five stages of grief.
Google's definition of quality doesn't have to be as good as or any better than anyone else's definition. It just has to be something they are happy with.
And everyone else can either decide to live with the limitations of that definition or they can choose NOT to rely on Google for traffic to their Websites. It's not like cutting Google out of the equation has to be a death-knell for any Website.
In fact, truly valuable sites get a majority of their traffic from non-Google sources, so maybe Google's definition of "quality" isn't quite so bad after all.
I'd have to disagree with that last statement - "truly valuable sites get a majority of their traffic from non-Google sources". TechEye receives its traffic from a relatively diverse set of sources, but the other site I run, developer Fusion - targeted at software developers - receives 68% of its traffic from Google (out of the 70% that comes from search engines). The same is true of many other successful developer sites, such as Stack Overflow. We're in the Google game whether we like it or not.
And sure, whilst you can debate the merits of Google's quality guidelines, it's been shown time and time again that the reality bears no correlation. Sites that fit those guidelines perfectly have been badly hit, and others that are completely outside of them are benefiting. That's the issue I see here, personally.
James
It's a narrow-focus niche site and you're not mentioning what you did to grow traffic for it outside of Google.
I've been involved with search engine optimization for over 12 years. I have always seen a strong correlation between high value sites and search referral traffic representing less than 50% of their traffic -- except in technical niches.
"Sites that fit those guidelines perfectly have been badly hit"
I challenge you to name ONE because I haven't seen any. I've been rereading those guidelines on an almost daily basis for two months and so far every site I have reviewed that has complained about losing traffic from Panda has issues that are referred to in the Google guidelines.
The guidelines are vague on many points but there have been plenty of incidents and question-and-answer sessions through the years where Googlers have expanded upon and clarified those guidelines.
People are in a state of denial about being compliant with the guidelines. Their sites were borderline and the previous algorithm couldn't detect the borderline cases -- Google even said so in the WIRED interview.
What Google tolerated before, it is apparently not willing to tolerate now. The fact that some scraper sites are now ranking above downgraded sites doesn't mean that the downgraded sites are good or that Google wants to see scraper sites ranking highly -- it just means the scraper sites aren't as far out of the current implementation of Google's guidelines as the downgraded sites.
I expect Google to continue working on the scraper problem. Meanwhile, all the arguing and justifying and defensiveness isn't going to change reality: Google laid down a new set of rules and people either agree to play by them or they need to find other sources of traffic.
Oh dear Michael, either you don't understand Google's guidelines, or you are blindly faithful to their way of thinking.
"Google laid down a new set of rules and people either agree to play by them or they need to find other sources of traffic."
The problem with this is that there is a vast array of websites that have staked their business on a particular way of marketing, only to have been stung. Google is more than just an anonymous method of getting traffic; it is a whole revenue stream for people's businesses.
We have listed a range of search queries, unrelated to our business, that have decreased in quality because of the latest change. The issue here is that Google might start to see its admittedly huge market share slide as people look elsewhere.
www.computerweekly.co.uk
www.techeye.net
www.pocket-lint.com
www.ehealthmd.com
I've spent the last month trying to figure out what Google has decided they don't like about our content - and have failed so far.
On the subject of guidelines - and yes, this is another post from SEO book - http://www.seobook.com/ignore-seo. Matt Cutts himself said no to hiding duplicate content pages (from the same domain) using robots.txt, telling webmasters to instead allow the Googlebot to figure it out and promote the right one. And now, suddenly and unannounced, they're penalizing the whole site because of it. Not particularly consistent.
Keep in mind that I am looking for borderline behavior that most people would consider -- based solely on past experience -- to be "acceptable" because Google never had a problem with it in the past. I cannot in any way guarantee that whatever I find is causing these sites to be downgraded by Panda or other algorithmic changes.
SEO Book's quote of Matt Cutts is taken out of context. Matt wrote that comment ages ago in a world that did not include nor envision the need for Panda technology.
COMPUTER WEEKLY
* Some of the content is replicated on other sites.
* The site's internal navigation does not present a clear hierarchical structure.
* Many of the pages (root and category roots) lack substantive content (mostly links)
* Even when viewing articles in a Lynx simulator, the user has to work to get to the actual content.
POCKET LINT
* Some of the content is replicated on other sites
* Negative placements in stylesheets (Googler Maile Ohye announced this practice was no longer acceptable in 2010)
* A random selection of articles revealed relatively little indexable text compared to embedded videos and ads (could just be a poor set of choices on my part)
* The site's internal navigation does not present a clear hierarchical structure
EHEALTHMD
* Some of the content is replicated on other sites.
* The site's internal navigation does not present a clear hierarchical structure.
* Negative placements in stylesheets
TECH EYE
* There appears to be a canonical URL issue creating what seem to be false positives for indexable duplicate content within the site.
* The site's internal navigation does not present a clear hierarchical structure.
* Negative placements in stylesheets
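For anyone who wants to check their own site for the kind of canonical URL issue I mention under TECH EYE, here is a rough sketch in Python. The normalization rules (dropping "www.", trailing slashes, fragments, and the tracking parameters listed) are my own assumptions for illustration; they are not Google's rules, and I cannot say this is what Panda looks at.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of query parameters that do not change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url: str) -> str:
    """Reduce a URL to one comparable form (hypothetical rules, see above)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat "/news/" and "/news" as the same page; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS)
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def find_duplicates(urls):
    """Group crawled URLs that collapse to the same normalized form."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Feed it the list of URLs from a crawl of your own site. Any group with more than one member is a candidate for a canonical tag or a redirect, though whether the pages really serve the same content still has to be checked by hand.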
I should note that SEOmoz's latest Linkscape update announcement includes the random statistic that they found an average of 60+ links per page. That statistic is too raw to be really usable but as a rule of thumb I advise people to aim to include no more than 50 links on a page.
It's okay to have 100 links on a page but their purpose should be clear to the random visitor (such as archive links for blogs). If the links are "just there for SEO" then they are probably providing a negative user experience.
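If you want to see where your own pages stand against that rule of thumb, a small Python sketch like the following will count the links in a page's HTML. The class and function names are mine, and the budget of 50 is just my rule of thumb from above, not a Google limit.

```python
from html.parser import HTMLParser

# Rule-of-thumb budget from the comment above; not an official threshold.
LINK_BUDGET = 50


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Named anchors without an href are not links, so they are skipped.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html: str) -> int:
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


def over_budget(html: str, budget: int = LINK_BUDGET) -> bool:
    return count_links(html) > budget
```

The count says nothing about whether the links serve the visitor, so treat a high number as a prompt to look at the page, not as a verdict.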
Mr. Steers, with all due respect, the Google index remains Google's property and no one is justified in reserving special privilege for themselves within Google's index.