This module provides utilities for matching URLs against criteria.
Matching Criteria
The module supports two types of criteria:
- Match Patterns (preferred) - a syntax used in the WebExtensions API for expressing possible URL matches. See: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns.
- Domains - a simple list of domain names, which are converted into match patterns.
Matching Output
The module supports three types of output for matching URLs:
- Match Pattern Sets (preferred) - optimized objects that compare a URL against the criteria.
- Regular Expressions - RegExp objects that compare a URL against the criteria.
- Regular Expression Strings - strings expressing regular expressions for comparing a URL against the criteria.
Implementation Notes
We use Rollup pure annotations (@__PURE__ comments) because Rollup assumes that iterators might have side effects (including subtle cases of iteration like Array.map and Array.join). Without the annotations, Rollup would mark arguments for many of this module's functions (which might be large string arrays) as tainted by side effects and always include those arguments in bundled output. The pure annotations are associated with either iteration functions or class instantiation to provide clarity about why they're needed.
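As a rough illustration of the annotations described above, the sketch below marks one iteration call and one class instantiation as pure. The function buildHostSet and its input are hypothetical and not part of this module; only the @__PURE__ comment syntax comes from Rollup.

```javascript
// Hypothetical helper, not part of this module: builds a Set of lowercase hosts.
function buildHostSet(hosts) {
  // Array.prototype.map iterates, so Rollup conservatively assumes side effects.
  // The annotation tells Rollup that this call is free of side effects.
  const lowercased = /* @__PURE__ */ hosts.map((host) => host.toLowerCase());
  // Class instantiation is annotated in the same way.
  return /* @__PURE__ */ new Set(lowercased);
}

const hostSet = buildHostSet(["Example.com", "MOZILLA.org"]);
console.log([...hostSet]);
```

The annotations are ordinary comments, so they have no effect at runtime; they only inform Rollup's tree shaking.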
Classes
Methods
(static) createMatchPatternSet(matchPatterns) → {MatchPatternSet}
Create a new MatchPatternSet for matching URLs against a set of match patterns.
Parameters:
Name | Type | Description |
---|---|---|
matchPatterns | Array.<string> | An array of match pattern strings. |
Returns:
- The new MatchPatternSet.
- Type
- MatchPatternSet
(static) domainsToMatchPatterns(domains, [matchSubdomains]) → {Array.<string>}
Generates a set of match patterns for a set of domains. The match patterns will use the special "*" wildcard scheme (matching "http", "https", "ws", and "wss") and the special "/*" wildcard path (matching any path).
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
domains | Array.<string> | | | The set of domains to match against. |
matchSubdomains | boolean | <optional> | true | Whether to match subdomains of domains in the set. |
Returns:
Match patterns for the domains in the set.
- Type
- Array.<string>
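The conversion described for domainsToMatchPatterns can be sketched as follows. This is a minimal sketch and not the module's actual source: the name sketchDomainsToMatchPatterns is hypothetical, and it assumes subdomain matching is expressed with the "*." host prefix from the match pattern syntax, which matches the domain itself as well as its subdomains.

```javascript
// Hypothetical sketch, not the module's actual source.
function sketchDomainsToMatchPatterns(domains, matchSubdomains = true) {
  // "*" scheme wildcard, optional "*." subdomain wildcard, "/*" path wildcard.
  const hostPrefix = matchSubdomains ? "*." : "";
  return domains.map((domain) => `*://${hostPrefix}${domain}/*`);
}

const withSubdomains = sketchDomainsToMatchPatterns(["example.com", "mozilla.org"]);
const withoutSubdomains = sketchDomainsToMatchPatterns(["example.com"], false);
console.log(withSubdomains);    // [ '*://*.example.com/*', '*://*.mozilla.org/*' ]
console.log(withoutSubdomains); // [ '*://example.com/*' ]
```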
(static) domainsToRegExp(domains, [matchSubdomains]) → {RegExp}
Generates a RegExp object for matching a URL against a set of domains. The regular expression is based on match patterns generated by domainsToMatchPatterns and has the same matching properties.
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
domains | Array.<string> | | | The set of domains to match against. |
matchSubdomains | boolean | <optional> | true | Whether to match subdomains of domains in the set. |
Returns:
A RegExp object for matching a URL against the set of domains.
- Type
- RegExp
(static) domainsToRegExpString(domains, [matchSubdomains]) → {string}
Generates a regular expression string for a set of domains. The regular expression is based on match patterns generated by domainsToMatchPatterns and has the same matching properties.
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
domains | Array.<string> | | | The set of domains to match against. |
matchSubdomains | boolean | <optional> | true | Whether to match subdomains of domains in the set. |
Returns:
A regular expression string for matching a URL against the set of domains.
- Type
- string
(static) escapeRegExpString(string) → {string}
Escapes regular expression special characters in a string.
Parameters:
Name | Type | Description |
---|---|---|
string | string | The input string. |
Returns:
The input string with regular expression special characters escaped.
- Type
- string
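To show why escaping matters when a domain is embedded in a regular expression, here is a minimal sketch. The function sketchEscapeRegExpString is hypothetical and uses a commonly used escaping expression; the module's own implementation may differ.

```javascript
// Hypothetical sketch, not the module's actual source.
function sketchEscapeRegExpString(string) {
  // "$&" in the replacement refers to the matched character, so each special
  // character is prefixed with a backslash.
  return string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = sketchEscapeRegExpString("www.example.com");
const unescapedRegExp = new RegExp("^www.example.com$");
const escapedRegExp = new RegExp(`^${escaped}$`);

// Without escaping, "." matches any character, so a different host slips through.
console.log(unescapedRegExp.test("wwwxexample.com")); // true
console.log(escapedRegExp.test("wwwxexample.com"));   // false
console.log(escapedRegExp.test("www.example.com"));   // true
```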
(static) importMatchPatternSet(exportedMatchPatternSet) → {MatchPatternSet}
Restore a MatchPatternSet that was serialized to an object with the export function.
Parameters:
Name | Type | Description |
---|---|---|
exportedMatchPatternSet | Object | A serialized MatchPatternSet. |
Returns:
- The new MatchPatternSet.
- Type
- MatchPatternSet
Example
Example usage of import.
const matchPatternSet1 = webScience.matching.createMatchPatternSet([ "*://example.com/*" ]);
const exportedMatchPatternSet = matchPatternSet1.export();
const matchPatternSet2 = webScience.matching.importMatchPatternSet(exportedMatchPatternSet);
(static) matchPatternsToRegExp(matchPatterns) → {RegExp}
Converts an array of match patterns into a RegExp object.
Parameters:
Name | Type | Description |
---|---|---|
matchPatterns | Array.<string> | The match patterns. |
Throws:
Throws an error if any of the match patterns is not valid.
Returns:
The regular expression RegExp object.
- Type
- RegExp
(static) matchPatternsToRegExpString(matchPatterns) → {string}
Converts an array of match patterns into a regular expression string.
Parameters:
Name | Type | Description |
---|---|---|
matchPatterns | Array.<string> | The match patterns. |
Throws:
Throws an error if any of the match patterns is not valid.
Returns:
The regular expression string.
- Type
- string
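The conversion from match patterns to a regular expression string can be sketched roughly as below. This is a simplified illustration and not the module's actual source: all function names are hypothetical, the sketch handles only patterns of the form scheme://host/path (no "<all_urls>" or file patterns), and it assumes the URLs being tested carry no port.

```javascript
// Hypothetical helper: escape regular expression special characters.
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical, simplified converter for one "scheme://host/path" match pattern.
function sketchMatchPatternToRegExpString(matchPattern) {
  const parsed = /^(\*|https?|wss?|ftp):\/\/(\*|(?:\*\.)?[^/*]+)(\/.*)$/.exec(matchPattern);
  if (parsed === null) {
    throw new Error(`Invalid match pattern: ${matchPattern}`);
  }
  const [, scheme, host, path] = parsed;
  // The "*" scheme matches "http", "https", "ws", and "wss".
  const schemePart = scheme === "*" ? "(?:https?|wss?)" : scheme;
  let hostPart;
  if (host === "*") {
    hostPart = "[^/]+";
  } else if (host.startsWith("*.")) {
    // "*.example.com" matches example.com and any of its subdomains.
    hostPart = "(?:[^/]+\\.)?" + escapeForRegExp(host.slice(2));
  } else {
    hostPart = escapeForRegExp(host);
  }
  // Each "*" in the path matches any sequence of characters.
  const pathPart = path.split("*").map((piece) => escapeForRegExp(piece)).join(".*");
  return `${schemePart}://${hostPart}${pathPart}`;
}

// Combine the per-pattern expressions into one anchored alternation.
function sketchMatchPatternsToRegExpString(matchPatterns) {
  const alternatives = matchPatterns.map((pattern) => sketchMatchPatternToRegExpString(pattern));
  return `^(?:${alternatives.join("|")})$`;
}

const sketchRegExp = new RegExp(sketchMatchPatternsToRegExpString([
  "*://*.example.com/*",
  "https://mozilla.org/firefox/*"
]));
console.log(sketchRegExp.test("https://www.example.com/page")); // true
console.log(sketchRegExp.test("https://mozilla.org/about/"));   // false
```

Joining the patterns into a single alternation means one RegExp test covers the whole set, at the cost of a longer expression for large pattern lists.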
(static) normalizeUrl(url) → {string}
Normalizes a URL string for subsequent comparison. Normalization includes the following steps:
- Parse the string as a URL object, which will (among other normalization) lowercase the scheme and hostname.
- Remove the port number, if any. For example, https://www.mozilla.org:443/ becomes https://www.mozilla.org/.
- Remove query parameters, if any. For example, https://www.mozilla.org/?foo becomes https://www.mozilla.org/.
- Remove the fragment identifier, if any. For example, https://www.mozilla.org/#foo becomes https://www.mozilla.org/.
Parameters:
Name | Type | Description |
---|---|---|
url | string | The URL string to normalize. |
Throws:
Throws an error if the URL string is not a valid, absolute URL.
Returns:
The normalized URL string.
- Type
- string
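The normalization steps listed for normalizeUrl can be sketched with the standard URL API. This is a minimal sketch under the assumption that the steps map directly onto URL property setters; sketchNormalizeUrl is a hypothetical name and the module's own implementation may differ.

```javascript
// Hypothetical sketch, not the module's actual source.
function sketchNormalizeUrl(url) {
  // The URL constructor throws a TypeError if the string is not a valid,
  // absolute URL. Parsing also lowercases the scheme and hostname.
  const urlObject = new URL(url);
  urlObject.port = "";   // remove the port number
  urlObject.search = ""; // remove query parameters
  urlObject.hash = "";   // remove the fragment identifier
  return urlObject.href;
}

const normalizedUrl = sketchNormalizeUrl("HTTPS://WWW.Mozilla.org:8443/path?foo=1#bar");
console.log(normalizedUrl); // https://www.mozilla.org/path
```

Normalizing both sides of a comparison in this way means two URLs that differ only by port, query, or fragment compare as equal strings.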