On November 5th, Baidu released a ranking of 30 websites with “original content”. The aim was to encourage website owners to create original content instead of copying articles from other sites, and to adopt its new flagship product, the Bear Paw Account, which is supposed to help original content rank properly on Baidu.
However, according to our research, the websites on the list don’t seem to be producing original content. Instead, they are copying or rewriting articles from other sources, like most other Chinese websites.
For example, on the No. 1 website, “Apple Green Health Site” (Pingguolv), we randomly selected an article, searched for part of one sentence on Baidu, and found multiple articles with almost the same body text and only the titles changed. Most of them were published before the “Apple Green” version.
We also tried the No. 4 website on the list, “Twelfth Grade Site” (Gaosanwang), and the results were the same: we randomly selected a recent post on the site and easily found several sources with the same content but an earlier index date in Baidu’s search results.
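The manual check described above (take a passage, find other pages, compare the body text) can be roughly approximated in code once you have the article bodies as plain strings. The following is a minimal sketch under our own assumptions, not Baidu’s method: the function names `shingles` and `jaccard_similarity` are hypothetical, and the choice of character shingles with Jaccard similarity is just one common way to measure near-duplicate text. Character shingles are used because Chinese text has no spaces between words.

```python
def shingles(text: str, k: int = 5) -> set[str]:
    """Return the set of k-character shingles of text, ignoring whitespace."""
    cleaned = "".join(text.split())
    if len(cleaned) < k:
        # Text shorter than k becomes a single shingle (or none if empty).
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + k] for i in range(len(cleaned) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int = 5) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # Two empty texts are treated as identical.
    return len(sa & sb) / len(sa | sb)
```

A score close to 1.0 between two article bodies suggests one was copied from the other with light edits. The score alone does not tell you which came first, so the index or publication date still has to be checked separately.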
The irony is plain. Baidu’s algorithm updates have not been good at identifying original content, and now its manual approval process for Bear Paw accounts is not verifying such content well either.
Maybe Baidu thinks it’s tolerable for these websites to contain some duplicate articles, but I really doubt whether they have any original content at all. None of us had ever heard of most of the sites on the list.
Take the No. 4 site, “Twelfth Grade Site”, for example. Although its domain was registered as early as May 2004 and it has as many as 231,142 search results on Baidu, most of its posts were indexed by Baidu in the past two years. According to archive.org, the domain was for sale until April 2015, when relevant content began to appear on the site. It’s basically impossible for an ordinary Chinese website to create more than one hundred thousand original posts in only three years.
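To put that number in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the helper name `posts_per_day` is ours, and it assumes posts were published evenly across three 365-day years.

```python
def posts_per_day(total_posts: int, years: float, days_per_year: int = 365) -> float:
    """Average number of posts per day needed to reach total_posts in the given years."""
    return total_posts / (years * days_per_year)


# Lower bound from the text: one hundred thousand posts in three years.
print(round(posts_per_day(100_000, 3)))  # 91

# Upper figure, assuming every one of the 231,142 Baidu results is a separate post.
print(round(posts_per_day(231_142, 3)))  # 211
```

Even the lower figure means roughly 91 new original articles every single day for three years, which is far beyond what a typical editorial team could write.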
As good content aggregators, these websites might be very popular with certain audiences in China, but when it comes to originality, Baidu is just fooling itself as well as others. If Baidu cannot substantially protect original content from being duplicated, it will not be able to win back content creators, many of whom have been migrating to other platforms like WeChat and Toutiao in recent years. It could also be easily defeated by potential competitors like Google, which is planning its return to China.
So what can SEOs and website owners learn from these contradictions on Baidu’s part? On the one hand, it’s widely known that Baidu isn’t good at identifying original content, but we are sure that original content is still better than duplicate content, especially for new websites. So if you have a lot of articles in English, you should certainly translate them into Chinese and optimize them from SEO and cultural perspectives, which we can help with.
On the other hand, the websites recognized by Baidu may have copied a lot of articles from other websites, but presumably they did so in a tactical way. Once a site’s number of posts reaches a certain scale and the site becomes more useful and reliable than other similar sites, Baidu may come to value it.
However, we cannot rule out the possibility that Baidu simply made mistakes in creating the list: its content or branding team might not have received the right data from its technical team. Some of these sites could also be penalized by Baidu’s algorithm updates in the future. This is possible because Baidu has not historically been a consistent company, whether in search engine technology or in product adoption and maintenance.