This module provides functionality for resolving shortened and shimmed URLs.
Members
(static, constant) ampMatchPatternSet :matching.MatchPatternSet
A MatchPatternSet for AMP caches and viewers.
Type:
- matching.MatchPatternSet
(static, constant) ampRegExp :RegExp
A RegExp that matches and parses AMP cache and viewer URLs. If there is a match, the RegExp provides several named capture groups.
- AMP Cache Matches
  - ampCacheSubdomain - The subdomain, which should be either a reformatted version of the URL domain or a hash of the domain. If there is no subdomain, this capture group is undefined.
  - ampCacheDomain - The domain for the AMP cache.
  - ampCacheContentType - The content type, which is either "c" for an HTML document, "i" for an image, or "r" for another resource.
  - ampCacheIsSecure - Whether the AMP cache loads the resource via HTTPS. If it does, this capture group has the value "s/". If it doesn't, this capture group is undefined.
  - ampCacheUrl - The underlying URL, without a specified scheme (i.e., http:// or https://).
- AMP Viewer Matches
  - ampViewerDomainAndPath - The domain and path for the AMP viewer.
  - ampViewerUrl - The underlying URL, without a specified scheme (i.e., http:// or https://).
Type:
- RegExp
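The named capture groups described above can be read from a match's groups property. The following is a minimal sketch that uses a simplified, hypothetical pattern covering only cdn.ampproject.org cache URLs; it is not the module's actual ampRegExp, and the helper name describeAmpCacheUrl is an assumption made for illustration.

```javascript
// Hypothetical, simplified pattern for illustration only. The module's real
// ampRegExp covers more AMP caches and also matches AMP viewer URLs.
const simplifiedAmpCacheRegExp =
  /^https?:\/\/(?:(?<ampCacheSubdomain>[a-z0-9-]+)\.)?(?<ampCacheDomain>cdn\.ampproject\.org)\/(?<ampCacheContentType>[cir])\/(?<ampCacheIsSecure>s\/)?(?<ampCacheUrl>.+)$/i;

// Hypothetical helper: summarize the capture groups for an AMP cache URL,
// or return null if the URL does not match the simplified pattern.
function describeAmpCacheUrl(url) {
  const match = simplifiedAmpCacheRegExp.exec(url);
  if (match === null) {
    return null;
  }
  const groups = match.groups;
  return {
    subdomain: groups.ampCacheSubdomain, // undefined when there is no subdomain
    domain: groups.ampCacheDomain,
    contentType: groups.ampCacheContentType, // "c", "i", or "r"
    isSecure: groups.ampCacheIsSecure !== undefined, // "s/" when HTTPS
    underlyingUrl: groups.ampCacheUrl // no scheme
  };
}
```

Optional groups that do not participate in a match are undefined, which is why the sketch compares ampCacheIsSecure against undefined instead of testing for an empty string.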
(static, constant) facebookLinkShimRegExp :RegExp
A RegExp for matching URLs that have had Facebook's link shim applied.
Type:
- RegExp
(static, constant) urlShortenerMatchPatternSet :matching.MatchPatternSet
A matching.MatchPatternSet for known URL shorteners, based on the match patterns loaded from urlShortenerMatchPatterns.js.
Type:
- matching.MatchPatternSet
(static, constant) urlShortenerRegExp :RegExp
A RegExp for known URL shorteners, based on the match patterns loaded from urlShortenerMatchPatterns.js.
Type:
- RegExp
Methods
(static) initialize()
Initialize the module, registering event listeners for resolveUrl and built-in content scripts for parsing
and registering URL mappings (currently Twitter and Google News). Runs only once. This function is automatically
called by resolveUrl, but you can call it separately if you want to use registered URL mappings without
resolveUrl.
(static) parseAmpUrl(url) → {string}
Parse the underlying URL from an AMP cache or viewer URL, if the URL is an AMP cache or viewer URL.
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | A URL that may be an AMP cache or viewer URL. |
Returns:
If the URL is an AMP cache or viewer URL, the parsed underlying URL. Otherwise, just the URL.
- Type
- string
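The return contract (the underlying URL on a match, otherwise the input unchanged) might be sketched as follows. The pattern is a simplified, hypothetical one for cdn.ampproject.org cache URLs only, the name parseAmpUrlSketch is an assumption, and restoring the scheme from the "s/" marker is an assumption about how a scheme-less capture becomes a usable URL.

```javascript
// Hypothetical, simplified AMP cache pattern (illustration only, no viewer support).
const sketchAmpCacheRegExp =
  /^https?:\/\/(?:[a-z0-9-]+\.)?cdn\.ampproject\.org\/[cir]\/(?<isSecure>s\/)?(?<underlyingUrl>.+)$/i;

// Hypothetical sketch of the documented contract, not the module's implementation.
function parseAmpUrlSketch(url) {
  const match = sketchAmpCacheRegExp.exec(url);
  if (match === null) {
    // Not an AMP cache URL under this simplified pattern: return the input as-is.
    return url;
  }
  // Assumption: the "s/" marker means the underlying resource is served via HTTPS.
  const scheme = match.groups.isSecure !== undefined ? "https://" : "http://";
  return scheme + match.groups.underlyingUrl;
}
```

Because non-matching input is returned unchanged, a caller can apply the function to every URL without checking first.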
(static) parseFacebookLinkShim(url) → {string}
Parse a URL from Facebook's link shim, if the shim was applied to the URL.
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | A URL that may have Facebook's link shim applied. |
Returns:
If Facebook's link shim was applied to the URL, the unshimmed URL. Otherwise, just the URL.
- Type
- string
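As a rough illustration of unshimming, the sketch below assumes that a shimmed link has the form https://l.facebook.com/l.php?u=<encoded destination>&h=... and reads the destination from the u parameter. The URL shape, the accepted hostnames, and the name parseFacebookLinkShimSketch are assumptions; the module itself relies on facebookLinkShimRegExp.

```javascript
// Hypothetical sketch: extract the destination from a Facebook link shim URL.
// Assumes the shim looks like https://l.facebook.com/l.php?u=<encoded URL>&h=...
function parseFacebookLinkShimSketch(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    // Not a parseable absolute URL: return the input unchanged.
    return url;
  }
  const isShimHost = /^(?:l|lm)\.facebook\.com$/i.test(parsed.hostname);
  if (!isShimHost || parsed.pathname !== "/l.php") {
    return url;
  }
  // URLSearchParams.get() returns the percent-decoded value, or null if absent.
  const destination = parsed.searchParams.get("u");
  return destination === null ? url : destination;
}
```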
(static) registerUrlMappings(urlMappings, [pageId]) → {RegisteredUrlMappings}
Register known URL mappings for use in link resolution. This functionality allows studies to minimize HTTP requests for link resolution when a URL mapping can be parsed from page content.
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| urlMappings | Array.<UrlMapping> | | | The URL mappings to register. |
| pageId | string | <optional> | null | An optional page ID for the page that the URL mappings were parsed from. If a page ID is provided, the mappings will be automatically removed shortly after the page visit ends. |
Returns:
An object that allows unregistering the URL mappings.
- Type
- RegisteredUrlMappings
Example
// A content script parses URL mappings from a Twitter page, then in the background script:
webScience.linkResolution.registerUrlMappings([
    {
        sourceUrl: "https://t.co/djogkKUD5y?amp=1",
        destinationUrl: "https://researchday.princeton.edu/",
        ignoreSourceUrlParameters: true
    },
    // Note that the following mapping involves a known URL shortener and would require further resolution
    {
        sourceUrl: "https://t.co/qQTRITLZKP?amp=1",
        destinationUrl: "https://mzl.la/3jh1VgZ",
        ignoreSourceUrlParameters: true
    }
]);
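To make the roles of ignoreSourceUrlParameters and the returned unregister function concrete, here is a minimal in-memory sketch of a mapping registry. It is an illustration under assumptions, not the module's implementation: the names registerUrlMappingsSketch, lookupMappingSketch, and stripParameters are hypothetical, "parameters" is taken to mean the query string, and page ID handling is omitted.

```javascript
// Hypothetical in-memory registry of UrlMapping objects.
const registeredMappings = new Set();

// Assumption: "parameters" means the query string of the URL.
function stripParameters(url) {
  const parsed = new URL(url);
  parsed.search = "";
  return parsed.toString();
}

// Register mappings and return an object that can unregister exactly those mappings.
function registerUrlMappingsSketch(urlMappings) {
  const entries = urlMappings.map((mapping) => ({ ...mapping }));
  for (const entry of entries) {
    registeredMappings.add(entry);
  }
  return {
    unregister() {
      for (const entry of entries) {
        registeredMappings.delete(entry);
      }
    }
  };
}

// Return the destination URL for a registered source URL, or null if none matches.
function lookupMappingSketch(url) {
  for (const mapping of registeredMappings) {
    const matches = mapping.ignoreSourceUrlParameters
      ? stripParameters(url) === stripParameters(mapping.sourceUrl)
      : url === mapping.sourceUrl;
    if (matches) {
      return mapping.destinationUrl;
    }
  }
  return null;
}
```

With ignoreSourceUrlParameters set to true, a lookup for https://t.co/djogkKUD5y succeeds even though the registered source URL carries ?amp=1. A lookup that hits a registered mapping needs no HTTP request, which is the saving the description refers to.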
(static) removeFacebookLinkDecoration(url) → {string}
Remove Facebook link decoration (the fbclid parameter) from a URL, if present.
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | A URL that may have Facebook link decoration. |
Returns:
The URL without Facebook link decoration.
- Type
- string
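One way to strip the fbclid parameter is with the standard URL and URLSearchParams APIs, as in the sketch below. The function name removeFbclidSketch is hypothetical, and the module's own implementation may work differently (for example, with a regular expression).

```javascript
// Hypothetical sketch: remove the fbclid query parameter using the URL API.
function removeFbclidSketch(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    // Not a parseable absolute URL: return the input unchanged.
    return url;
  }
  if (!parsed.searchParams.has("fbclid")) {
    // Return the original string so undecorated URLs are not re-serialized.
    return url;
  }
  parsed.searchParams.delete("fbclid");
  // Note: serialization may normalize the encoding of the remaining parameters.
  return parsed.toString();
}
```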
(static) resolveUrl(url, [options]) → {Promise.<string>}
Resolve a shortened or shimmed URL to an original URL, by recursively resolving the URL and removing shims.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| url | string | | The URL to resolve. |
| options | Object | <optional> | Options for resolving the URL. |
Returns:
- A Promise that either resolves to the original URL or is rejected with an error.
- Type
- Promise.<string>
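The repeated "remove shims, then expand shorteners" process can be pictured with the sketch below, which expresses the recursion as a bounded loop. Everything in it is an assumption for illustration: the name resolveUrlSketch, the injected unwrap, isShortener, and expand callbacks, and the maxSteps limit are hypothetical and do not describe the module's actual options or internals.

```javascript
// Hypothetical sketch of iterative link resolution with injected helpers:
// - unwrap(url): returns the URL with any shim removed (or the same URL)
// - isShortener(url): whether the URL belongs to a known URL shortener
// - expand(url): async, returns the URL that the shortener redirects to
async function resolveUrlSketch(url, { unwrap, isShortener, expand, maxSteps = 5 }) {
  let current = url;
  for (let step = 0; step < maxSteps; step++) {
    const unwrapped = unwrap(current);
    if (unwrapped !== current) {
      // A shim was removed; the result may itself be shimmed or shortened.
      current = unwrapped;
      continue;
    }
    if (isShortener(current)) {
      // Follow the shortener, then examine the result again.
      current = await expand(current);
      continue;
    }
    // Nothing left to remove or expand: treat this as the original URL.
    return current;
  }
  // Assumption: give up (reject) instead of looping forever on redirect chains.
  throw new Error("Too many resolution steps for " + url);
}
```

Because the function is async, the thrown error surfaces as a rejected Promise, matching the documented behavior of either resolving to the original URL or rejecting with an error.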
(static) urlToPS1(url) → {string}
Extracts the public suffix + 1 from a URL.
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | The URL. |
Returns:
The public suffix + 1.
- Type
- string
Example
Example usage of urlToPS1.
// returns "mozilla.org"
urlToPS1("https://www.mozilla.org/");
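The idea behind "public suffix + 1" is to find the longest public suffix of the hostname and keep one more label to its left. The sketch below shows that logic with a tiny, hypothetical suffix set; a real implementation needs the full Public Suffix List, and the names sampleSuffixes and urlToPS1Sketch are assumptions.

```javascript
// Hypothetical, tiny suffix set for illustration only.
// A real implementation needs the full Public Suffix List.
const sampleSuffixes = new Set(["org", "com", "co.uk"]);

function urlToPS1Sketch(url) {
  const labels = new URL(url).hostname.split(".");
  // Scan from the left so that the longest matching suffix is found first.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (sampleSuffixes.has(candidate)) {
      // Keep one extra label to the left of the suffix, if there is one.
      return i === 0 ? candidate : labels.slice(i - 1).join(".");
    }
  }
  // No known suffix: fall back to the full hostname.
  return labels.join(".");
}
```

Multi-label suffixes such as co.uk are the reason a simple "last two labels" rule is not enough: news.bbc.co.uk should yield bbc.co.uk, not co.uk.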
Type Definitions
RegisteredUrlMappings
Type:
- Object
Properties:
| Name | Type | Description |
|---|---|---|
| unregister | function | Unregister the URL mappings. |
UrlMapping
Type:
- Object
Properties:
| Name | Type | Description |
|---|---|---|
| sourceUrl | string | The source URL for the mapping. |
| destinationUrl | string | The destination URL for the mapping. |
| ignoreSourceUrlParameters | boolean | Whether to ignore parameters when matching URLs against the source URL. |