URL Blocking Rules

This article explains how to write URL blocking rules in XBrowser. If you are already familiar with the ABP (Adblock Plus) blocking rule syntax, you can use it directly, because XBrowser is compatible with it. This article covers only XBrowser's own simplified, more accessible rule syntax.

Rules for matching domains

A domain rule hits any resource URL whose domain name matches it.

Example 1

In the simplest case, the complete domain name is used as the blocking rule, as in the following rule:

www.example.com

The following resource URL will be hit by the rule:

https://www.example.com/path/of/banner.js

Example 2

A rule can also name a parent domain so that its subdomains are matched, with or without a wildcard, as in the rules below.

example.com
*.example.com
.example.com

The three rules above have the same effect, so use whichever form you prefer. The following resource URLs will be hit:

https://a.example.com/path/of/banner.js
https://b.example.com/path/of/banner.js
https://en.ad.example.com/path/of/banner.js
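
One way to see why the three spellings behave the same is to reduce each of them to a bare domain before comparing. The Python sketch below is only an illustration of that idea, not XBrowser's actual implementation; the function names are hypothetical, and treating the bare domain itself as a match is an assumption.

```python
def normalize_domain_rule(rule: str) -> str:
    """Reduce 'example.com', '*.example.com' and '.example.com' to 'example.com'."""
    rule = rule.strip().lower()
    if rule.startswith("*."):
        rule = rule[2:]
    elif rule.startswith("."):
        rule = rule[1:]
    return rule


def host_matches(rule: str, host: str) -> bool:
    """Return True if host is the rule's domain or one of its subdomains."""
    base = normalize_domain_rule(rule)
    host = host.lower()
    # Assumption: the bare domain counts as a match, as well as any subdomain.
    return host == base or host.endswith("." + base)


for rule in ("example.com", "*.example.com", ".example.com"):
    print(rule, host_matches(rule, "en.ad.example.com"))
```

Requiring a dot before the suffix keeps a host such as notexample.com from matching by accident.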

Example 3

Wildcards are used for fuzzy matching and can simplify the writing of rules.

ad.*.example.com

The following resource URLs will be hit:

https://ad.img.example.com/path/of/banner.js
https://ad.js.example.com/path/of/banner.js

Example 4

s*.example.com

The following resource URLs will be hit:

https://s1.example.com/path/of/banner.js
https://s2.example.com/path/of/banner.js
https://s3.example.com/path/of/banner.js
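
The wildcard behaviour in Examples 3 and 4 can be approximated with shell-style pattern matching. The sketch below uses Python's standard fnmatch module as a stand-in; it is an assumption that XBrowser's "*" behaves the same way, and wildcard_host_matches is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def wildcard_host_matches(rule: str, host: str) -> bool:
    """Match a host name against a rule in which '*' stands for any run of characters."""
    # Host names are case-insensitive, so compare lowercase forms.
    return fnmatchcase(host.lower(), rule.lower())


print(wildcard_host_matches("ad.*.example.com", "ad.img.example.com"))
print(wildcard_host_matches("s*.example.com", "s2.example.com"))
print(wildcard_host_matches("s*.example.com", "www.example.com"))
```

Note that fnmatch also gives "?" and "[...]" a special meaning. That does not matter for host names, but it is one reason this is only an approximation.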

Rules for matching paths

A rule can use only the path as the match condition. Any resource URL whose path matches will be hit, as in the following examples.

Example 1

/path/of/banner.js
*/path/of/banner.js

These two rules are equivalent and hit the following resource URLs:

https://www.example.com/path/of/banner.js
https://mydomain.com/path/of/banner.js
https://www.example.com/en/path/of/banner.js
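
The third URL above shows that the rule does not have to start at the beginning of the path, so a path rule without inner wildcards behaves like a "contains" test. A minimal sketch of that reading follows; path_rule_matches is a hypothetical name, and the substring semantics are inferred from the examples rather than taken from XBrowser's source.

```python
from urllib.parse import urlsplit


def path_rule_matches(rule: str, url: str) -> bool:
    """Return True if the URL's path contains the rule text."""
    # A leading '*' adds nothing to a "contains" test, so drop it.
    needle = rule.lstrip("*")
    return needle in urlsplit(url).path


urls = [
    "https://www.example.com/path/of/banner.js",
    "https://mydomain.com/path/of/banner.js",
    "https://www.example.com/en/path/of/banner.js",
]
for url in urls:
    print(path_rule_matches("/path/of/banner.js", url),
          path_rule_matches("*/path/of/banner.js", url))
```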

Example 2

/path/*/banner.js

Wildcards can also be used, and resource URLs like the ones below will be hit.

https://www.example.com/path/of/banner.js
https://www.example.com/path/of/first/banner.js

Example 3

/path/of/banner.*

Resource URLs like the following will be hit

https://www.example.com/path/of/banner.js
https://www.example.com/path/of/banner.png
https://www.example.com/path/of/banner.jpg

Rules for matching query parameters

Query parameters can also serve as the match condition. Any resource URL whose query string matches will be hit, as in the following examples.

Example 1

&ct=bj&dit=

Resource URLs like the following will be hit

https://www.example.com/path/of/banner.js?lang=en&ct=bj&dit=100060

Example 2

Using wildcards

?frm=*&ct=*&dit=

Resource URLs like the following will be hit

https://www.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://www.example.com/path/of/banner.js?frm=jp&ct=tokyo&dit=100083
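
A wildcard rule like this can be read as "these literal pieces must appear in this order, with anything in between". The sketch below checks that directly with string searches; wildcard_contains is a hypothetical helper that illustrates the matching idea and makes no claim about how XBrowser implements it.

```python
def wildcard_contains(rule: str, url: str) -> bool:
    """Return True if the literal pieces of rule occur in url in the same order."""
    pos = 0
    for piece in rule.split("*"):
        found = url.find(piece, pos)
        if found == -1:
            return False
        # Continue searching after the piece that was just found.
        pos = found + len(piece)
    return True


rule = "?frm=*&ct=*&dit="
print(wildcard_contains(rule, "https://www.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080"))
print(wildcard_contains(rule, "https://www.example.com/path/of/banner.js?ct=bj&frm=cn"))
```

The second URL is rejected because its parameters appear in a different order from the rule, which is worth keeping in mind when writing query rules.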

Combination usage

As the examples above show, blocking rules can match the domain name, the path, or the query parameters separately. These parts can also be combined in a single rule for more precise matching.

Example 1

example.com/path/of/banner.js?frm=

Resource URLs like the following will be hit

https://www.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://s1.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100081
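
To illustrate how a combined rule narrows the match, the sketch below splits a wildcard-free rule at its first "/" into a domain part and a path-and-query part, and requires both to match. The helper name and the splitting strategy are assumptions made for this example only.

```python
from urllib.parse import urlsplit


def combined_rule_matches(rule: str, url: str) -> bool:
    """Match a wildcard-free rule of the form 'domain/path?query' against a URL."""
    host_part, slash, rest = rule.partition("/")
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    host_part = host_part.lower()
    # Domain part: the host is the rule's domain or one of its subdomains.
    host_ok = host == host_part or host.endswith("." + host_part)
    # Remaining part: must occur in the URL's path and query string.
    target = parts.path + ("?" + parts.query if parts.query else "")
    return host_ok and (slash + rest) in target


rule = "example.com/path/of/banner.js?frm="
print(combined_rule_matches(rule, "https://s1.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100081"))
print(combined_rule_matches(rule, "https://mydomain.com/path/of/banner.js?frm=cn"))
```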

Example 2

Using wildcards

example.com/*?frm=cn&ct=*&dit=

Resource URLs like the following will be hit

https://www.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://s1.example.com/service/ad/banner?frm=cn&ct=sz&dit=100024

Advanced Usage

Use the control parameter “$3p”

3p is short for “third-party”. Sometimes we want a blocking rule to take effect only for off-site resources, that is, only for resource URLs whose domain differs from that of the current website.

/path/of/banner.js$3p

Assuming the website we are currently visiting is www.example.com, the rule will hit the following resource URL:

https://mydomain.com/path/of/banner.js?frm=cn&ct=bj&dit=100080

but allow the following one

https://www.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
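
The third-party test itself can be sketched as a comparison between the site of the resource and the site of the page. The code below is a simplified illustration: site_of and is_third_party are hypothetical names, and taking the last two labels of the host name as the "site" is a naive assumption. Real browsers usually consult the Public Suffix List, and this article does not say how XBrowser decides.

```python
from urllib.parse import urlsplit


def site_of(host: str) -> str:
    """Naive assumption: the site is the last two labels of the host name."""
    # This is wrong for suffixes such as 'co.uk'; it is good enough for a sketch.
    return ".".join(host.lower().split(".")[-2:])


def is_third_party(resource_url: str, page_url: str) -> bool:
    """Return True if the resource is served from a different site than the page."""
    resource_host = urlsplit(resource_url).hostname or ""
    page_host = urlsplit(page_url).hostname or ""
    return site_of(resource_host) != site_of(page_host)


page = "https://www.example.com/"
print(is_third_party("https://mydomain.com/path/of/banner.js", page))
print(is_third_party("https://www.example.com/path/of/banner.js", page))
```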

Use regular expressions

If you are familiar with regular expressions, you can match resource URLs with them directly.
A regular expression rule starts with “--” followed by the regular expression, as in the following example.

--ad(\d{1,2})?\.example\.com

Resource URLs like the following will be hit

https://ad.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://ad01.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://ad02.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080

The following resource URLs look similar but do not match the regular expression, so they will be allowed:

https://ads.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
https://ad123.example.com/path/of/banner.js?frm=cn&ct=bj&dit=100080
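
The behaviour above can be checked by stripping the “--” prefix and searching the URL with the remaining expression. The sketch below uses Python's re module, whose syntax may differ in details from the regular expression engine XBrowser uses; regex_rule_matches is a hypothetical helper.

```python
import re

REGEX_PREFIX = "--"


def regex_rule_matches(rule: str, url: str) -> bool:
    """Apply a '--'-prefixed regular expression rule to a URL."""
    if not rule.startswith(REGEX_PREFIX):
        raise ValueError("not a regular expression rule")
    pattern = rule[len(REGEX_PREFIX):]
    # search() finds the pattern anywhere in the URL, not only at the start.
    return re.search(pattern, url) is not None


rule = r"--ad(\d{1,2})?\.example\.com"
for host in ("ad", "ad01", "ads", "ad123"):
    url = "https://" + host + ".example.com/path/of/banner.js"
    print(host, regex_rule_matches(rule, url))
```

ads and ad123 are rejected because the expression allows at most two digits between "ad" and the following dot.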

Adding domain scopes to rules

To make a rule more precise and avoid accidentally blocking resources on other sites, we can add a domain scope to it. The scope restricts the rule to the listed domains and is written in the format “rule@domain list”, as in the following examples.

/path/of/banner.js@my.example.com

This rule takes effect only on sites with the domain name my.example.com

/path/of/banner.js@example.com

This rule takes effect on sites whose first-level domain is example.com

/path/of/banner.js@my.example.com,mysite.com,myspace.com

A rule can take effect on several domains, separated by commas
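
A scope check along these lines could look like the sketch below. It takes the text after the "@" and the host name of the page being visited; in_scope is a hypothetical name, and treating subdomains of a listed domain as in scope is an assumption based on the example.com case above.

```python
def in_scope(scope_list: str, page_host: str) -> bool:
    """Return True if the visited site falls under one of the comma-separated domains."""
    page_host = page_host.lower()
    for domain in scope_list.split(","):
        domain = domain.strip().lower()
        if not domain:
            continue
        # Assumption: a listed domain also covers its subdomains.
        if page_host == domain or page_host.endswith("." + domain):
            return True
    return False


scopes = "my.example.com,mysite.com,myspace.com"
print(in_scope(scopes, "mysite.com"))
print(in_scope(scopes, "www.example.com"))
print(in_scope("example.com", "my.example.com"))
```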

Combination usage

Ordinary rules, regular expression rules, control parameters, and domain scopes can be combined. Here are some examples of valid rules.

/path/of/banner.js$3p@example.com
--ad(\d{1,2})?\.example\.com$3p
/path/of/banner.js$~3p@example.com
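
To make the structure of such combined rules explicit, the sketch below splits a rule into its pattern, its control parameter, and its domain scope. ParsedRule and parse_rule are hypothetical names; the parser only records “3p” or “~3p” without interpreting them, and it assumes the pattern itself contains no "@", which a regular expression rule could violate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ParsedRule:
    pattern: str             # rule text without prefix, parameter, or scope
    is_regex: bool           # True if the rule started with '--'
    option: str              # '', '3p' or '~3p'
    scopes: Tuple[str, ...]  # domains listed after '@'


def parse_rule(rule: str) -> ParsedRule:
    # Assumption: the first '@' starts the domain scope.
    body, _, scope_text = rule.partition("@")
    scopes = tuple(d.strip() for d in scope_text.split(",") if d.strip())
    option = ""
    # Only the known parameters are stripped, so a '$' used as a regex anchor survives.
    for candidate in ("~3p", "3p"):
        suffix = "$" + candidate
        if body.endswith(suffix):
            body = body[: -len(suffix)]
            option = candidate
            break
    is_regex = body.startswith("--")
    if is_regex:
        body = body[2:]
    return ParsedRule(body, is_regex, option, scopes)


print(parse_rule("/path/of/banner.js$3p@example.com"))
print(parse_rule(r"--ad(\d{1,2})?\.example\.com$3p"))
```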

Performance recommendations

Prefer rules that do not contain wildcards. Plain matching of domain names, paths, query parameters, or a combination of them is very fast and does not require traversing the rule list, so even hundreds of thousands of such rules will not affect performance.

Rules with wildcards are internally converted to greedy regular expressions, which are slower to evaluate than plain matching. Whenever possible, write rules as combinations of domain names, paths, and query parameters without wildcards.
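
The difference between the two kinds of rules can be pictured as follows. This is a simplified sketch, not XBrowser's code: wildcard_to_regex and rule_matches are hypothetical names, and matching the rule anywhere in the full URL is an assumption.

```python
import re


def wildcard_to_regex(rule: str) -> str:
    """Escape the literal pieces of a rule and join them with a greedy '.*'."""
    return ".*".join(re.escape(piece) for piece in rule.split("*"))


def rule_matches(rule: str, url: str) -> bool:
    if "*" not in rule:
        # Fast path: a plain substring test, no regular expression involved.
        return rule in url
    # Slow path: the wildcard rule becomes a regular expression.
    return re.search(wildcard_to_regex(rule), url) is not None


print(rule_matches("/path/*/banner.js", "https://www.example.com/path/of/first/banner.js"))
print(rule_matches("example.com/ads/", "https://example.com/ads/banner.js"))
```

A rule that ends or begins with "*" gains nothing from the wildcard in this model, because the substring test already allows arbitrary text before and after the rule.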

For example, the following rules

*/path/of/banner.js
example.com/ads/*
?frm=ch&ct=bj&dit=*

are better written without the wildcards, as follows:

/path/of/banner.js
example.com/ads/
?frm=ch&ct=bj&dit=