Three Approaches to Anti-Cheating in Search Engines

In the previous article, Viker covered content farms and link farms, which are relatively traditional cheating methods. The others will be covered later, as time allows. Wherever there is cheating, a search engine needs anti-cheating measures in order to keep its results good.

So how does a search engine go about anti-cheating?

In general, there are three approaches:

1. Trust propagation model

Some time ago a colleague told me a story. His sister B was studying abroad. One day a message from B's QQ account told A that her phone and bank card had all gone through the wash with her clothes, and asked A to call B's mother and have her send money. A did so at once, and B's mother was convinced too. Just as the money was about to be sent, B phoned home, which averted the imminent loss.

It is worth thinking through carefully.

In fact, this is also a trust propagation model. Had the request been made to B's mother directly over QQ, she would certainly not have fully believed it. Relayed through A, it was a different matter, because A is on B's mother's trust list.

This leads to the first anti-cheating idea.

From the massive volume of web page data, use technical or manual means to pick out a set of trustworthy pages as a whitelist. The trust value of the whitelisted pages then spreads outward along their links, diminishing or decaying as it goes. Finally, set a threshold: a page whose trust value is above it is fine, and a page below it is, sorry, treated as cheating.
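The propagation described above can be sketched roughly as follows. This is only an illustration, not how any particular search engine implements it: the function name `propagate_trust`, the `decay` value of 0.85, the number of rounds, the tiny link graph, and the threshold are all assumptions made for the example.

```python
def propagate_trust(out_links, whitelist, decay=0.85, rounds=20):
    """Spread trust from whitelisted seed pages along outgoing links.

    out_links maps each page to the list of pages it links to.
    decay and rounds are illustrative values, not tuned constants.
    """
    pages = set(out_links) | set(whitelist)
    for targets in out_links.values():
        pages.update(targets)

    # Only whitelisted pages start with trust; they share it equally.
    seed = {p: (1.0 / len(whitelist) if p in whitelist else 0.0) for p in pages}
    trust = dict(seed)

    for _ in range(rounds):
        # Each round, seeds are topped up with a fraction of their initial trust.
        nxt = {p: (1.0 - decay) * seed[p] for p in pages}
        for page, targets in out_links.items():
            if not targets:
                continue
            # A page's trust is decayed and divided among the pages it links to.
            share = decay * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical link graph: "seed" is trusted; "spam" and "spam2" only link to each other.
out_links = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": [],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
trust = propagate_trust(out_links, whitelist={"seed"})
threshold = 0.01  # assumed cutoff for this toy graph
suspects = sorted(p for p, score in trust.items() if score < threshold)
print(suspects)
```

In this toy graph, trust shrinks with every hop away from the seed, and pages that no trusted page reaches end up with no trust at all.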

2. Distrust propagation model

This is in fact similar to the first idea: find a set of cheating pages, then compute distrust scores by analyzing link relationships.

The one thing to note is that trust scores are passed forward along links, while distrust scores are passed backward along links. For example:

Suppose A is a spam page. A page B that links to A is more likely to be spam than a page C that A links to.
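A minimal sketch of the backward direction, using the same A, B, C example: distrust starts at known spam pages and flows to the pages that link to them. The function name `propagate_distrust`, the `decay` value, and the number of rounds are assumptions for illustration only.

```python
def propagate_distrust(out_links, blacklist, decay=0.85, rounds=20):
    """Spread distrust from known spam pages backward along links.

    out_links maps each page to the list of pages it links to.
    decay and rounds are illustrative values, not tuned constants.
    """
    # Build the reverse graph: for each page, the pages that link to it.
    pages = set(blacklist)
    in_links = {}
    for source, targets in out_links.items():
        pages.add(source)
        for target in targets:
            pages.add(target)
            in_links.setdefault(target, []).append(source)

    seed = {p: (1.0 / len(blacklist) if p in blacklist else 0.0) for p in pages}
    distrust = dict(seed)

    for _ in range(rounds):
        nxt = {p: (1.0 - decay) * seed[p] for p in pages}
        for page, sources in in_links.items():
            # A page's distrust is decayed and divided among the pages linking to it.
            share = decay * distrust[page] / len(sources)
            for source in sources:
                nxt[source] += share
        distrust = nxt
    return distrust


# B links to A, and A links to C; A is the known spam page.
out_links = {"B": ["A"], "A": ["C"], "C": []}
distrust = propagate_distrust(out_links, blacklist={"A"})
print(distrust["B"] > distrust["C"])
```

B picks up distrust because it chose to link to a spam page, while C receives none: being linked to by spam says little about a page.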

3. Anomaly detection model

The features that define an anomaly can be characteristics of cheating pages, or they can be characteristics of normal pages.

Once these features are established, deciding whether a page is cheating or normal becomes a fairly straightforward matter.

A simple example, using link farms:

The link relationships in a link farm are carefully arranged by its designer, so they show some characteristics that run contrary to what occurs naturally, such as:

1) Statistical distribution of out-links and in-links: the out-links and in-links of normal pages follow a power-law distribution, while cheating links violate it;

2) The URLs of cheating links are often overly long and contain many dots, dashes, and digits;

3) Growth rates of out-links and in-links, and so on: normal pages and cheating pages change differently in these patterns.
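As one illustration of the feature-based idea, the URL characteristic in item 2) could be turned into a simple check like the one below. The function names, the thresholds, and the rule of requiring at least two exceeded thresholds are all arbitrary assumptions for this sketch; a real system would derive them from data, and the example URLs are made up.

```python
from urllib.parse import urlparse


def url_features(url):
    """Count the URL traits mentioned above: length, dots, dashes, digits."""
    parsed = urlparse(url)
    host_and_path = parsed.netloc + parsed.path
    return {
        "length": len(url),
        "dots": host_and_path.count("."),
        "dashes": host_and_path.count("-"),
        "digits": sum(ch.isdigit() for ch in host_and_path),
    }


def looks_suspicious(url, max_length=80, max_dots=4, max_dashes=4, max_digits=8):
    """Flag a URL when at least two features exceed their (assumed) thresholds."""
    f = url_features(url)
    hits = sum([
        f["length"] > max_length,
        f["dots"] > max_dots,
        f["dashes"] > max_dashes,
        f["digits"] > max_digits,
    ])
    return hits >= 2


normal = "https://example.com/about"
odd = "http://cheap-pills-4-u.buy-now-2024.best-deals-99.example.com/a-1-b-2-c-3/index.html"
print(looks_suspicious(normal), looks_suspicious(odd))
```

A single feature like this is weak evidence by itself; it would normally be combined with the link-distribution and growth-rate features from items 1) and 3).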
