Module: matching

This module provides utilities for matching URLs against criteria.

Matching Criteria

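The module supports two types of criteria:

  • Match Patterns - strings in the WebExtensions match pattern syntax, such as "*://example.com/*".
  • Domains - arrays of domain names, which the module converts to match patterns.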

Matching Output

The module supports three types of output for matching URLs:

  • Match Pattern Sets (preferred) - optimized objects that compare a URL against the criteria.
  • Regular Expressions - RegExp objects that compare a URL against the criteria.
  • Regular Expression Strings - strings expressing regular expressions for comparing a URL against the criteria.

Implementation Notes

We use Rollup pure annotations (@__PURE__ comments) because Rollup assumes that iterators might have side effects (including subtle cases of iteration like Array.map and Array.join). Without the annotations, Rollup would mark arguments for many of this module's functions (which might be large string arrays) as tainted by side effects and always include those arguments in bundled output. The pure annotations are associated with either iteration functions or class instantiation to provide clarity about why they're needed.
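
As a rough illustration of where such annotations sit, the sketch below marks an Array.map call and a class instantiation as pure. The helper name toWildcardPatterns and the emptySet constant are hypothetical and are not part of this module; the comments have no effect at runtime and only matter to a bundler that honors them.

```javascript
// Hypothetical helper, not part of the module. The annotation sits directly
// before the call expression it applies to (here, the Array.map call).
function toWildcardPatterns(domains) {
    return /*@__PURE__*/ domains.map((domain) => `*://*.${domain}/*`);
}

// A class instantiation can be annotated the same way.
const emptySet = /*@__PURE__*/ new Set();

console.log(toWildcardPatterns([ "example.com" ]));
```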

Classes

MatchPatternSet

Methods

(static) createMatchPatternSet(matchPatterns) → {MatchPatternSet}

Creates a new MatchPatternSet for matching URLs against a set of match patterns.

Parameters:
Name           Type            Description
matchPatterns  Array.<string>  An array of match pattern strings.

Returns:
  • The new MatchPatternSet.
Type
MatchPatternSet

(static) domainsToMatchPatterns(domains, [matchSubdomains]) → {Array.<string>}

Generates a set of match patterns for a set of domains. The match patterns will use the special "*" wildcard scheme (matching "http", "https", "ws", and "wss") and the special "/*" wildcard path (matching any path).

Parameters:
Name             Type            Attributes  Default  Description
domains          Array.<string>                       The set of domains to match against.
matchSubdomains  boolean         <optional>  true     Whether to match subdomains of domains in the set.

Returns:

Match patterns for the domains in the set.

Type
Array.<string>
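Example

A minimal sketch of how the described output could be produced, not the module's actual source. It assumes that subdomain matching is expressed with the match pattern "*." host prefix, which in match pattern syntax matches the domain itself and any of its subdomains. The function name domainsToMatchPatternsSketch is hypothetical.

```javascript
// Hypothetical stand-in for domainsToMatchPatterns, for illustration only.
function domainsToMatchPatternsSketch(domains, matchSubdomains = true) {
    // "*" scheme and "/*" path are the wildcards described above.
    // The "*." host prefix (an assumption here) covers the domain and its subdomains.
    return domains.map((domain) => `*://${matchSubdomains ? "*." : ""}${domain}/*`);
}

console.log(domainsToMatchPatternsSketch([ "example.com" ]));
// → [ "*://*.example.com/*" ]
console.log(domainsToMatchPatternsSketch([ "example.com" ], false));
// → [ "*://example.com/*" ]
```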

(static) domainsToRegExp(domains, [matchSubdomains]) → {RegExp}

Generates a RegExp object for matching a URL against a set of domains. The regular expression is based on match patterns generated by domainsToMatchPatterns and has the same matching properties.

Parameters:
Name             Type            Attributes  Default  Description
domains          Array.<string>                       The set of domains to match against.
matchSubdomains  boolean         <optional>  true     Whether to match subdomains of domains in the set.

Returns:

A RegExp object for matching a URL against the set of domains.

Type
RegExp

(static) domainsToRegExpString(domains, [matchSubdomains]) → {string}

Generates a regular expression string for a set of domains. The regular expression is based on match patterns generated by domainsToMatchPatterns and has the same matching properties.

Parameters:
Name             Type            Attributes  Default  Description
domains          Array.<string>                       The set of domains to match against.
matchSubdomains  boolean         <optional>  true     Whether to match subdomains of domains in the set.

Returns:

A regular expression string for matching a URL against the set of domains.

Type
string

(static) escapeRegExpString(string) → {string}

Escapes regular expression special characters in a string.

Parameters:
Name    Type    Description
string  string  The input string.

Returns:

The input string with regular expression special characters escaped.

Type
string
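Example

A minimal sketch of regular expression escaping using the widely used String.replace idiom. The module's own implementation may differ, and the function name escapeRegExpStringSketch is hypothetical.

```javascript
// Hypothetical stand-in for escapeRegExpString, for illustration only.
function escapeRegExpStringSketch(string) {
    // Prefix each regular expression special character with a backslash.
    // "$&" in the replacement refers to the matched character.
    return string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = escapeRegExpStringSketch("example.com/a?b");
console.log(escaped); // prints example\.com/a\?b

// The escaped string matches only the literal input when used in a RegExp.
console.log(new RegExp(`^${escaped}$`).test("example.com/a?b")); // prints true
console.log(new RegExp(`^${escaped}$`).test("exampleXcom/a?b")); // prints false
```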

(static) importMatchPatternSet(exportedMatchPatternSet) → {MatchPatternSet}

Restores a MatchPatternSet that was serialized to an object with the export method.

Parameters:
Name                     Type    Description
exportedMatchPatternSet  Object  A serialized MatchPatternSet.

Returns:
  • The new MatchPatternSet.
Type
MatchPatternSet
Example

Example usage of import.

const matchPatternSet1 = webScience.matching.createMatchPatternSet([ "*://example.com/*" ]);
const exportedMatchPatternSet = matchPatternSet1.export();
const matchPatternSet2 = webScience.matching.importMatchPatternSet(exportedMatchPatternSet);

(static) matchPatternsToRegExp(matchPatterns) → {RegExp}

Converts an array of match patterns into a RegExp object.

Parameters:
Name           Type            Description
matchPatterns  Array.<string>  The match patterns.

Throws:

Throws an error if any of the match patterns is not valid.

Returns:

The RegExp object for the match patterns.

Type
RegExp

(static) matchPatternsToRegExpString(matchPatterns) → {string}

Converts an array of match patterns into a regular expression string.

Parameters:
Name           Type            Description
matchPatterns  Array.<string>  The match patterns.

Throws:

Throws an error if any of the match patterns is not valid.

Returns:

The regular expression string.

Type
string
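Example

A simplified sketch of how match patterns could be turned into a regular expression string. It is not the module's implementation: it handles only "scheme://host/path" patterns with the "http", "https", "ws", "wss", or "*" schemes, and all function names below are hypothetical.

```javascript
// Escape regular expression special characters (common String.replace idiom).
function escapeForRegExp(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical converter for a single match pattern (simplified subset only).
function matchPatternToRegExpStringSketch(matchPattern) {
    const parsed = /^(\*|https?|wss?):\/\/(\*|(?:\*\.)?[^/*]+)(\/.*)$/.exec(matchPattern);
    if (parsed === null) {
        throw new Error(`Invalid match pattern: ${matchPattern}`);
    }
    const [ , scheme, host, path ] = parsed;
    // The "*" scheme matches http, https, ws, and wss.
    const schemeRegExp = scheme === "*" ? "(?:https?|wss?)" : scheme;
    let hostRegExp;
    if (host === "*") {
        hostRegExp = "[^/]+"; // any host
    } else if (host.startsWith("*.")) {
        // The domain itself or any subdomain of it.
        hostRegExp = "(?:[^/]+\\.)?" + escapeForRegExp(host.slice(2));
    } else {
        hostRegExp = escapeForRegExp(host);
    }
    // Escape the path, then turn each escaped "*" wildcard into ".*".
    const pathRegExp = escapeForRegExp(path).replace(/\\\*/g, ".*");
    return `^${schemeRegExp}://${hostRegExp}${pathRegExp}$`;
}

// Hypothetical stand-in for matchPatternsToRegExpString: one alternative per pattern.
function matchPatternsToRegExpStringSketch(matchPatterns) {
    return matchPatterns
        .map((matchPattern) => `(?:${matchPatternToRegExpStringSketch(matchPattern)})`)
        .join("|");
}

const regExp = new RegExp(matchPatternsToRegExpStringSketch([ "*://*.example.com/*" ]));
console.log(regExp.test("https://www.example.com/page")); // prints true
console.log(regExp.test("https://badexample.com/")); // prints false
```

Wrapping each pattern in a non-capturing group keeps the "^" and "$" anchors scoped to a single alternative when the patterns are joined with "|".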

(static) normalizeUrl(url) → {string}

Normalizes a URL string for subsequent comparison. Normalization includes the following steps:

  • Parse the string as a URL object, which will (among other normalization) lowercase the scheme and hostname.
  • Remove the port number, if any. For example, https://www.mozilla.org:443/ becomes https://www.mozilla.org/.
  • Remove query parameters, if any. For example, https://www.mozilla.org/?foo becomes https://www.mozilla.org/.
  • Remove the fragment identifier, if any. For example, https://www.mozilla.org/#foo becomes https://www.mozilla.org/.

Parameters:
Name  Type    Description
url   string  The URL string to normalize.

Throws:

Throws an error if the URL string is not a valid, absolute URL.

Returns:

The normalized URL string.

Type
string
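Example

The normalization steps above can be sketched with the standard URL API. This is an illustration under that assumption rather than the module's source, and the function name normalizeUrlSketch is hypothetical.

```javascript
// Hypothetical stand-in for normalizeUrl, for illustration only.
function normalizeUrlSketch(url) {
    // Parsing lowercases the scheme and hostname and throws a TypeError
    // if the string is not a valid, absolute URL.
    const urlObj = new URL(url);
    urlObj.port = "";   // remove the port number, if any
    urlObj.search = ""; // remove query parameters, if any
    urlObj.hash = "";   // remove the fragment identifier, if any
    return urlObj.href;
}

console.log(normalizeUrlSketch("HTTPS://WWW.Mozilla.org:8443/path?foo#bar"));
// → "https://www.mozilla.org/path"
```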