Tuesday, 10 March 2015

Google Ranks Websites according to Facts, not just Links. Here's Why


Google has recently been developing a way to rank websites according to the accuracy of their information, in addition to things like links. It compares a page's factual accuracy against its Knowledge Vault. I think this is a great idea.

When I first heard that Google ranked websites in its search results largely on backlinks, with no regard for facts, I thought, "Wha? That's incredible!" You can't expect net users themselves to judge which facts are good and which aren't. Not knowing the facts is what led them to search for them in the first place.

We could be asked to use our own judgement. But what if we're too young, too ignorant, or too gullible to judge? What if the facts are too complicated? Before the internet (and still now), publishing companies did the fact checking. But nobody checks the facts on the internet, where anyone can be a publisher.

It seems Google is finally mending this shortcoming, in a way. Well, it fact checks us, but it doesn't correct us, because that's an impossible task, and it isn't Google's place to do so.

Google search for Knowledge Vault


Not All Information Is Created Equal

As far as information goes (on the net or in real life), it falls into 3 categories: facts, opinions, and lies (disinformation or misinformation or both).

There are sites that simply spread lies knowingly (or unknowingly through sloppy research), and are quite popular (judging by their links). There MUST be tons of sites like that among the hundreds of millions of sites (664 million according to Netcraft). I'm sure you've come across some. This is what Google would like to weed out. Of course, there are sites that are totally honest but still contain errors on some of their pages (I'm guilty of that. We're only human. Even Wikipedia contains flaws). Google looks at the other pages of a site overall when ranking a specific page. It won't punish you just for making a few mistakes now and then. But doing it consistently and extensively is another matter.
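To make the "few mistakes versus consistent mistakes" idea concrete, here's a minimal sketch. It assumes each page has already been checked and boiled down to a pair of counts (claims checked, claims found wrong). The function name, the numbers, and the site-wide ratio are all my own illustration, not Google's actual method.

```python
def site_accuracy(pages):
    """Return the share of checked claims across the whole site that were correct.

    `pages` is a list of (claims_checked, claims_wrong) pairs, one per page.
    Both the input shape and the scoring rule are hypothetical.
    """
    total_checked = sum(checked for checked, _ in pages)
    total_wrong = sum(wrong for _, wrong in pages)
    if total_checked == 0:
        return None  # nothing to judge, e.g. a site with no checkable claims
    return (total_checked - total_wrong) / total_checked


# An honest site with an occasional slip on one page (made-up numbers).
honest_site = [(20, 0), (15, 1), (25, 0)]
# A site that gets things wrong consistently and extensively (made-up numbers).
sloppy_site = [(20, 12), (15, 9), (25, 15)]

print(site_accuracy(honest_site))
print(site_accuracy(sloppy_site))
```

Because the score is taken over the whole site, one slip on one page barely moves it, while errors on every page drag it down.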

What is a fact? Zeus is an ancient Greek god, and Zeozobebo isn't. Even though we know Zeus doesn't exist in real life, it's still a fact! This is because it's in the Knowledge Vault (I presume. I'm 95% certain). But Zeozobebo isn't a fact because I made it up just for this article. I'd be shocked if it's in the Knowledge Vault (I'm 99.99999% certain it isn't). Simply put, what is considered a fact has nothing to do with reality; it has everything to do with the Knowledge Vault. It all comes down to matching your content against what's in the Knowledge Vault. It's the Bible for this fact checking algorithm.
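The matching idea can be sketched in a few lines. This is only a toy: I'm assuming the Knowledge Vault can be pictured as a set of (subject, relation, object) triples, and the single entry below is made up for illustration. The real thing is surely far bigger and far more sophisticated.

```python
# A toy stand-in for the Knowledge Vault: a set of (subject, relation, object) triples.
# The contents and the triple format are assumptions made for illustration.
KNOWLEDGE_VAULT = {
    ("zeus", "is a", "ancient greek god"),
}


def is_fact(subject, relation, obj, vault=KNOWLEDGE_VAULT):
    """A claim counts as a 'fact' only if the vault contains a matching triple."""
    claim = (subject.lower(), relation.lower(), obj.lower())
    return claim in vault


print(is_fact("Zeus", "is a", "ancient Greek god"))       # True
print(is_fact("Zeozobebo", "is a", "ancient Greek god"))  # False
```

Notice the check never asks whether Zeus really exists. It only asks whether the claim matches something in the vault.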

One would think that some sites can't be judged by factual accuracy. For example, my own blog, which gives opinions on food, movies, etc. In other words, sites that review, say, food, movies, or travel. These sites contain mostly subjective statements, not facts. That doesn't mean these sites aren't useful. In short, I don't think all sites are, or should be, subject to fact checking.


Well, remember that sites are ranked against similar sites. All food blogs, for example, will be subjected to the same factual accuracy treatment. So they're on an equal footing.

Having said that, I suspect Google won't apply this fact checking algorithm to all sites. It's pointless to apply it to, say, a dating site, or to forums, which are supposedly self-correcting. It would probably apply only to sites where factual accuracy is of the utmost importance. For example, engineering, science, religion, geography, history or law related sites that pose as authorities disseminating information to the public. In other words, the serious subjects. Yes, I've come across science YouTube channels that disseminate total garbage. And many sites give totally "false facts" about religions. In fact, this is one area where disinformation or misinformation thrives, because the claims can't be checked against reality. You could call them heresy. This is bad for us and for Google. It wastes our time, at best.

Another perfect example of sites where fact checking can't be applied is news network sites, because they generate facts. They in fact call facts into existence. For example, a well-known person dies today, and the media report it. In other words, they create facts by the mere act of reporting to the public. They're always ahead of the Knowledge Vault, so such checking isn't possible. You could say "news" can be defined as those facts that aren't yet in the Knowledge Vault.

A good example of a site where fact checking is a good idea would be a site like Wikipedia. At the moment, Wikipedia almost always appears at the top of search results, so I'm very interested to know how well it passes this litmus test. If it passes with flying colours, this implies Wikipedia is accurate, according to Google's Knowledge Vault anyway. But then the Knowledge Vault also relies on Wikipedia, so I think it will stay on top. Whether Wikipedia is a good source for Google's Knowledge Vault is a different issue. But Google doesn't rely on Wikipedia alone. It also uses sources like the CIA World Factbook, and so forth. The more sources the better, but more sources also make Google's job harder and may therefore introduce additional errors.



Better Late Than Never

Remember that Google isn't saying it's getting rid of all search ranking based on links. That would be ridiculous: Google has already spent an enormous amount of resources on it, and throwing it all away would be crazy.

I saw headings like, "Google Ranks Websites according to Facts, not Links", when it should be, "Google Ranks Websites according to Facts, not JUST Links".

The page with the title, "Google Ranks Websites according to Facts, not Links" should be ranked lower than the one with the title, "Google Ranks Websites according to Facts, not JUST Links" because (all else being equal) it's less accurate, and misleading.

Google will use this fact checking in addition to its existing ranking algorithm, on the sites where it is applicable. In other words, Google isn't removing any of these sites from its search engine. It's just that if everything else (such as backlinks) is equal, the site with more factual accuracy ranks higher.
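Here's a minimal sketch of that "everything else being equal" rule. The two scores, the site names, and the strict tie-break are all my own assumptions for illustration. How Google actually combines its signals isn't something I know.

```python
from typing import NamedTuple


class Site(NamedTuple):
    name: str
    link_score: float  # hypothetical stand-in for the existing link-based ranking
    accuracy: float    # hypothetical factual accuracy score between 0 and 1


def rank(sites):
    """Sort by link score first, then by accuracy when link scores are equal."""
    return sorted(sites, key=lambda s: (s.link_score, s.accuracy), reverse=True)


sites = [
    Site("a.example", link_score=0.8, accuracy=0.6),
    Site("b.example", link_score=0.8, accuracy=0.9),
    Site("c.example", link_score=0.5, accuracy=1.0),
]

for site in rank(sites):
    print(site.name)
```

The two sites tied on links are separated by accuracy, while the third site stays last despite a perfect accuracy score, because in this sketch links still come first. A real system might blend the two signals instead; this only shows the tie-breaking case.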

Bank Vault
Photo: Wikipedia By Jonathunder (Own work)
[CC BY-SA 3.0 or GFDL], via Wikimedia Commons

If it's such a good idea, why didn't Google do it from the beginning? I suspect Google, like me, has always wanted to rank sites with more factual accuracy higher, but had no ability to do so until now.

In order to apply this fact checking algorithm, Google needed to build up the Knowledge Vault. This takes time, and only now has it accumulated enough data ammo to pursue this goal. Also, the tooling for the (one would imagine) very complex algorithm, and its ramifications, needed to be worked out. And Google also needs lightning-speed hardware for such a mammoth task, which wasn't available until now. I'm sure it would be easier for Google not to do it. This is one more big task, and it costs them more resources. They do it to stay competitive as the best search engine.

This is all part of the evolution of the Google search engine as its capability is upgraded. This is the next logical step. And netizens are better for it. We're all becoming more informed and less misinformed or disinformed.


