An introduction to how ad blocking on a router (soft router) works

Many users find the advertising on many websites out of control: half of a page may be ads, along with splash-screen ads, ads that follow the scroll, pop-ups, forced viewing, privacy theft and other malicious ads. This article briefly introduces how to block ads directly on the router (soft router), so that computers, mobile phones, tablets and other devices at home are all effectively rid of ads and users get a clean reading and browsing experience.

PS: Friendly, well-designed and relevant advertising does not inconvenience users; on the contrary, advertising is the main source of revenue for most websites and apps.

Common technical approaches to ad removal

DNS filtering

Typical software: AdGuardHome
How it works: blocks DNS requests for advertising domains and lets only non-advertising requests through
When interception occurs: before the network request starts
Where it applies: routers and similar devices; once configured, it takes effect for the whole home intranet
Client setup: none needed for devices connected to the home intranet
Limitations: ads can only be identified by domain name; when the ad and the content share a domain it is powerless, and page content cannot be filtered

Browser plug-ins

Typical software: AdBlock
How it works: filters based on the request addresses the browser sends and the page content that comes back
When interception occurs: before the network request starts and after it ends
Where it applies: web browsers only; it has no effect on non-browser applications
Client setup: a plug-in must be installed on every device, in every browser
What it can identify: domain name, request, and page content

Traffic filtering

Typical software: KoolProxyR
How it works: acts like a global filter through which all network traffic passes; encrypted requests (such as HTTPS) have to be intercepted with a man-in-the-middle (MITM) technique
When interception occurs: before the request starts and after it ends
Where it applies: routers and similar devices; once configured, it takes effect for the whole home intranet
Client setup: a forged certificate must be installed on each client (so that MITM can decrypt the traffic)
What it can identify: domain name, request, and page content

With all of the above approaches, the more interception points and identification points there are, the better the theoretical effect. But the most important thing has not been mentioned yet:

Rule base

Just as the effectiveness of antivirus software depends on the quality of its core virus database, the effectiveness of ad removal depends on the quality of its rule base.

An ad-removal rule base is much harder to maintain than a virus database. Viruses appear at a relatively stable rate, but ads do not. With tens of millions of sites around the world, tens of billions of ad impressions per day and frequently changing ad placement methods, maintaining the rule base is very difficult.

In addition, a core idea of the Internet business model is that "the wool comes from the pig": someone other than the user pays. Running a site costs money; if the site does not charge its users, it has to earn revenue some other way, and advertising is the main source. As a result, almost all sites resist ad-removal technology (typically, the KoolShare forum prohibits discussion of such technology).

To protect their revenue, sites keep changing how ads are inserted as ad-removal technology develops, so as to get around ad-removal software. Advertising and ad removal have therefore always evolved in constant struggle with each other. Because ad implementations change, a blocking rule that works today may well fail overnight, and some sites (typically video sites) make you wait a fixed amount of time before showing the content even if the ad itself is blocked.

In the long run, advertising helps today's Internet sites survive and so benefits users (if no site can make money, sites will eventually have to shut down, and it is the end users whose interests are harmed). Personally, I think what needs blocking is malicious, coercive and deceptive ads, trojan and phishing links, and privacy-harvesting junk. For sites that are a real help in daily life, entertainment and learning, supporting them financially, or deliberately clicking their ads to show support, is a win-win that helps them develop.

How ad removal works

Technically, browsing the web is the process of a browser fetching content from a remote server over HTTP and displaying it. The browser has to establish TCP connections to the server, usually many of them. In terms of transport, ad requests are no different from ordinary requests and are mixed in among them.

How are ads blocked? In two steps:

Identify

Intercept

Identification means picking out which HTTP requests are ad requests. At present there is no easy way to tell from the returned content whether a request is an ad (this should become possible as AI matures). Current techniques can only work from the request address. The URL of a request contains the domain name and the path of the resource being accessed, and it is comparatively easy to judge the nature of a request from those two parts; the existing approaches all follow this idea.
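As a rough illustration of identifying a request from its address, here is a minimal Python sketch. The rule sets (`AD_DOMAINS`, `AD_PATH_PREFIXES`) and the example URLs are made up for this illustration; real rule bases are far larger and use richer matching.

```python
from urllib.parse import urlsplit

# Hypothetical rules, made up for this illustration.
AD_DOMAINS = {"ads.example.net"}
AD_PATH_PREFIXES = ("/ads/", "/banner/")


def is_ad_request(url: str) -> bool:
    """Guess whether a request is an ad from its address alone."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in AD_DOMAINS:
        # The domain name alone is enough to identify the request.
        return True
    # Otherwise fall back to the path of the requested resource.
    return parts.path.startswith(AD_PATH_PREFIXES)


print(is_ad_request("https://ads.example.net/show.js"))       # True
print(is_ad_request("https://news.example.com/ads/top.png"))  # True
print(is_ad_request("https://news.example.com/article/1"))    # False
```

The second case is matched only because the path is visible; a filter that sees nothing but the domain name would let it through.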

Once identified, the next step is to intercept. There are two types of interception:

(1) Domain name method. A web request actually happens in two steps: first the domain name is resolved through DNS, which returns the IP of the target server; then a TCP connection is established to that IP and the content is transferred over HTTP. The domain name method acts at the DNS step: when a lookup is found to be for an ad, it directly answers that the domain does not exist, or returns an invalid IP, so the TCP connection is never established. The advantage of this method is that it is simple, fast, and does not interfere with normal TCP connections. The disadvantage is that DNS sees only the domain name, not the full URL, so filtering on URL features is impossible. Fortunately, ads today are generally embedded: an ordinary site includes a link to the ad, and the ad usually has a fairly distinct domain name of its own (usually unrelated to the domain of the site being visited), so good results can still be achieved. This is also the principle behind most current schemes.

(2) Traffic filtering method. The typical example is the well-known koolProxyR. It is equivalent to setting up a TCP proxy: all traffic goes through the proxy, which can see the URL of every request and decides by its rules whether to let each request through. The advantage is that the complete request is available, so filtering can go beyond the domain name and could even examine the content once AI matures. But there are drawbacks. Performance loss is one, though definitely not a big problem. The biggest problem is that websites now use HTTPS almost everywhere, and the browser verifies the target website's certificate to make sure it is connected to the real site. This is also why traditional HTTP hijacking can hardly survive: even if the browser is forced to connect to a fake site, the fake site cannot produce the real site's certificate, so the browser raises an alarm. The growing popularity of HTTPS has severely affected filtering schemes like koolProxyR. The filter only relays the browser's requests to the real server, and because the content of the connection is fully encrypted, it cannot read the content or even obtain the URL, so it cannot block anything.
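To make the HTTPS limitation concrete, here is a minimal Python sketch of what a proxy can read from the first line of a request. It assumes an explicit HTTP proxy for simplicity, which is not necessarily how koolProxyR is deployed; the function name `visible_target` and the example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def visible_target(request_head: bytes) -> dict:
    """Return what a proxy can read from the first line of a request."""
    request_line = request_head.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = request_line.split(" ", 2)
    if method == "CONNECT":
        # HTTPS through a proxy: only "host:port" is sent in the clear;
        # the path travels inside the encrypted tunnel.
        host, _, port = target.rpartition(":")
        return {"host": host, "port": int(port), "path": None}
    # Plain HTTP through a proxy: the full URL is readable.
    parts = urlsplit(target)
    return {"host": parts.hostname, "port": parts.port or 80, "path": parts.path}


https_head = b"CONNECT news.example.com:443 HTTP/1.1\r\nHost: news.example.com:443\r\n\r\n"
http_head = b"GET http://news.example.com/ads/top.png HTTP/1.1\r\nHost: news.example.com\r\n\r\n"
print(visible_target(https_head))
print(visible_target(http_head))
```

For the plain HTTP request the path `/ads/top.png` is available to match against rules; for the HTTPS tunnel only the host name is, which is no more than the DNS method already sees.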

Of course, koolProxyR-style schemes can filter URLs and content by forging a certificate for each target website and posing as the real website the browser wants to visit, but this creates additional problems.

Certificates have to be forged for every website, and the root certificate used to sign the forged certificates has to be added to the trusted roots of every browser. A home usually has many devices online. PCs are manageable, but tablets, mobile phones, home cameras, PS4, XBOX and so on are troublesome. You can of course whitelist such devices so that their traffic is not filtered, but that still makes maintenance harder.

Some apps do not accept forged certificates. Typical examples are financial applications, payment apps, online banking and some e-commerce apps: they strictly verify the server's certificate chain and reject forged certificates outright. Such applications do not work in an environment like koolProxyR.

Therefore, traffic filtering is no longer the preferred way to remove ads, and domain-based solutions have become mainstream.
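The domain name method described above can be sketched in a few lines of Python. This is only an illustration of the decision a filtering DNS server makes, not a working DNS server: `BLOCKED` and `UPSTREAM` are hypothetical stand-ins for a rule base and a real upstream resolver, and the addresses come from ranges reserved for documentation.

```python
# Hypothetical rule base and upstream answers, for illustration only.
BLOCKED = {"adserver.example"}
UPSTREAM = {"news.example": "203.0.113.10"}


def is_blocked(name: str) -> bool:
    """True if the name or any parent domain is in the rule base."""
    labels = name.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in BLOCKED for i in range(len(labels)))


def resolve(name: str):
    """Answer a lookup: an invalid IP for ads, the upstream answer otherwise."""
    if is_blocked(name):
        # No usable address is returned, so no TCP connection is made.
        return "0.0.0.0"
    return UPSTREAM.get(name.lower().rstrip("."))


print(resolve("cdn.adserver.example"))  # 0.0.0.0
print(resolve("news.example"))          # 203.0.113.10
```

Note that the decision uses the domain name only. If `news.example` served its ads from its own domain, this filter would have nothing to distinguish them from the content.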

There are also solutions based on browser plug-ins, which are effective too: a plug-in can filter at a fine grain on domain names, URLs, and even page content. The problem is that it applies only to browsers, usually on PCs, and cannot cover the entire home network the way a DNS solution does.

The HomeLeader firmware uses AdGuardHome as its ad-removal solution, which is implemented on DNS technology. The technical characteristic of DNS is that it filters on the domain name of a request, and the core of the filtering is the rule file. Like the virus database of antivirus software, AdGuardHome can load various ad rule bases, and its accuracy depends on them.
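To give a feel for what a rule file contains, here is a minimal Python sketch that reads a small subset of the adblock-style syntax AdGuardHome accepts: `||domain^` to block a domain and its subdomains, `@@||domain^` to make an exception, and `!` or `#` for comments. The function names and sample rules are hypothetical, and every other rule form is simply ignored here.

```python
def parse_rules(text: str):
    """Split rule lines into (blocked, allowed) domain sets."""
    blocked, allowed = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("!", "#")):
            continue  # blank line or comment
        target = blocked
        if line.startswith("@@"):
            target, line = allowed, line[2:]  # exception rule
        if line.startswith("||") and line.endswith("^"):
            target.add(line[2:-1].lower())
        # Any other rule form is ignored in this sketch.
    return blocked, allowed


def is_filtered(name: str, blocked: set, allowed: set) -> bool:
    """Match a queried name and its parent domains against the rules."""
    labels = name.lower().rstrip(".").split(".")
    suffixes = {".".join(labels[i:]) for i in range(len(labels))}
    if suffixes & allowed:
        return False  # an exception wins over a block
    return bool(suffixes & blocked)


SAMPLE = """
! sample rule file (made up)
||ads.example^
@@||ok.ads.example^
"""
blocked, allowed = parse_rules(SAMPLE)
print(is_filtered("x.ads.example", blocked, allowed))   # True
print(is_filtered("ok.ads.example", blocked, allowed))  # False
print(is_filtered("news.example", blocked, allowed))    # False
```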

Now let us return to the opening questions.

Why can I still see ads after ad removal is enabled?

The ad you see is not covered by the rule base, or the ad is served from the same domain as the site's own content, in which case AdGuardHome cannot remove it at all.

The ad has been changed, and the new form is not yet in the rule base, so it cannot be blocked.

Why do some sites fail to open?

The rule base has wrongly caught some "normal sites", or the rule base considers what you are visiting to be an ad while you do not (in fact, what you have always taken for normal content is actually an ad link).

Everything is HTTPS now, so isn't ad removal useless?

A DNS-based ad-removal scheme is independent of the protocol in use. It works with other protocols too, as long as they involve DNS resolution.

What to do when a site does not load properly

If a site behaves abnormally and you want to confirm whether ad removal is the cause, what should you do?

Open the AdGuardHome management interface at http://192.168.1.1:3000 (replace the IP with your router's actual address).

The username and password are both root.

On the home page, near the logo in the upper left corner, there is a "disable protection" button. Click it to disable ad filtering; click it again to turn filtering back on.

In addition, DNS caching may also affect site access. Any of the following clears it: disconnect and reconnect the network interface, restart the computer, or restart the router.
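As an extra check, you can look at what a suspect domain resolves to from a computer on the home network. The Python sketch below assumes the filter answers blocked names with an unspecified address (0.0.0.0 or ::); a filter configured to answer "domain does not exist" instead would make `lookup` return an empty list. The function names and the example domain are hypothetical.

```python
import ipaddress
import socket


def looks_blocked(addresses) -> bool:
    """True if every answer is an unspecified address (0.0.0.0 or ::)."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    return bool(parsed) and all(ip.is_unspecified for ip in parsed)


def lookup(name: str) -> list:
    """Resolve a name with the system resolver; returns [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


# Needs network access, so it is left commented out:
# print(looks_blocked(lookup("ads.example.net")))
```

Because the system resolver may answer from its own cache, clear the DNS cache as described above before relying on the result.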