This module provides functionality for resolving shortened and shimmed URLs.
Members
(static, constant) ampMatchPatternSet :matching.MatchPatternSet
A MatchPatternSet for AMP caches and viewers.
Type:
- matching.MatchPatternSet
(static, constant) ampRegExp :RegExp
A RegExp that matches and parses AMP cache and viewer URLs. If there is a match, the RegExp provides several named capture groups.
- AMP Cache Matches
  - ampCacheSubdomain: The subdomain, which should be either a reformatted version of the URL domain or a hash of the domain. If there is no subdomain, this capture group is undefined.
  - ampCacheDomain: The domain for the AMP cache.
  - ampCacheContentType: The content type, which is either c for an HTML document, i for an image, or r for another resource.
  - ampCacheIsSecure: Whether the AMP cache loads the resource via HTTPS. If it does, this capture group has the value s/. If it doesn't, this capture group is undefined.
  - ampCacheUrl: The underlying URL, without a specified scheme (i.e., http:// or https://).
- AMP Viewer Matches
  - ampViewerDomainAndPath: The domain and path for the AMP viewer.
  - ampViewerUrl: The underlying URL, without a specified scheme (i.e., http:// or https://).
Type:
- RegExp
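The named capture groups are easiest to see on a concrete URL. The following is a minimal sketch using a simplified, hypothetical pattern (simplifiedAmpCacheRegExp) that only recognizes the cdn.ampproject.org cache. It is not the module's actual ampRegExp, which may cover more caches as well as the viewer formats.

```javascript
// Hypothetical, simplified stand-in for ampRegExp (assumption: only the
// cdn.ampproject.org cache is handled; viewer URLs are not).
const simplifiedAmpCacheRegExp = new RegExp(
  "^https?://" +
  "(?:(?<ampCacheSubdomain>[a-zA-Z0-9-]+)\\.)?" + // optional subdomain
  "(?<ampCacheDomain>cdn\\.ampproject\\.org)/" +  // cache domain
  "(?<ampCacheContentType>[cir])/" +              // c, i, or r
  "(?<ampCacheIsSecure>s/)?" +                    // present only for HTTPS
  "(?<ampCacheUrl>.*)$"                           // underlying URL, no scheme
);

const ampCacheMatch = simplifiedAmpCacheRegExp.exec(
  "https://example-com.cdn.ampproject.org/c/s/example.com/page.html"
);

// Copy the named groups into a plain object for inspection.
const ampCacheGroups = { ...ampCacheMatch.groups };
console.log(ampCacheGroups);
```

With this input, ampCacheIsSecure holds s/ and ampCacheUrl holds example.com/page.html, so the scheme has to be added back by the caller.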
(static, constant) facebookLinkShimRegExp :RegExp
A RegExp for matching URLs that have had Facebook's link shim applied.
Type:
- RegExp
(static, constant) urlShortenerMatchPatternSet :matching.MatchPatternSet
A matching.MatchPatternSet for known URL shorteners, based on the match patterns loaded from urlShortenerMatchPatterns.js.
Type:
- matching.MatchPatternSet
(static, constant) urlShortenerRegExp :RegExp
A RegExp for known URL shorteners, based on the match patterns loaded from urlShortenerMatchPatterns.js.
Type:
- RegExp
Methods
(static) initialize()
Initialize the module, registering event listeners for resolveUrl and built-in content scripts for parsing and registering URL mappings (currently Twitter and Google News). Runs only once. This function is automatically called by resolveUrl, but you can call it separately if you want to use registered URL mappings without resolveUrl.
(static) parseAmpUrl(url) → {string}
Parse the underlying URL from an AMP cache or viewer URL, if the URL is an AMP cache or viewer URL.
Parameters:
Name | Type | Description |
---|---|---|
url | string | A URL that may be an AMP cache or viewer URL. |
Returns:
If the URL is an AMP cache or viewer URL, the parsed underlying URL. Otherwise, just the URL.
- Type
- string
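As a rough illustration of the return contract (underlying URL on a match, the input unchanged otherwise), here is a minimal sketch. The pattern sketchAmpCacheRegExp and the function parseAmpUrlSketch are hypothetical names; the sketch handles only cdn.ampproject.org cache URLs and assumes the scheme is inferred from the ampCacheIsSecure group.

```javascript
// Hypothetical simplified pattern (assumption: cdn.ampproject.org only).
const sketchAmpCacheRegExp =
  /^https?:\/\/(?:[a-zA-Z0-9-]+\.)?cdn\.ampproject\.org\/[cir]\/(?<ampCacheIsSecure>s\/)?(?<ampCacheUrl>.*)$/;

function parseAmpUrlSketch(url) {
  const match = sketchAmpCacheRegExp.exec(url);
  if (match === null) {
    // Not an AMP cache URL under this simplified pattern: return it as-is.
    return url;
  }
  // The captured URL has no scheme, so restore one from the secure marker.
  const scheme =
    match.groups.ampCacheIsSecure !== undefined ? "https://" : "http://";
  return scheme + match.groups.ampCacheUrl;
}

console.log(
  parseAmpUrlSketch("https://example-com.cdn.ampproject.org/c/s/example.com/page.html")
);
console.log(parseAmpUrlSketch("https://example.com/"));
```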
(static) parseFacebookLinkShim(url) → {string}
Parse a URL from Facebook's link shim, if the shim was applied to the URL.
Parameters:
Name | Type | Description |
---|---|---|
url | string | A URL that may have Facebook's link shim applied. |
Returns:
If Facebook's link shim was applied to the URL, the unshimmed URL. Otherwise, just the URL.
- Type
- string
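The unshimming step can be sketched with the standard URL API. This is a hedged illustration, not the module's implementation: it assumes the shim has the form l.facebook.com/l.php (or lm.facebook.com) with the destination carried in a u query parameter, and parseFacebookLinkShimSketch is a hypothetical name.

```javascript
function parseFacebookLinkShimSketch(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return url; // not a parseable absolute URL, so leave it untouched
  }
  // Assumption: shimmed links use these hosts and the /l.php path.
  const isShim =
    /^lm?\.facebook\.com$/.test(parsed.hostname) &&
    parsed.pathname === "/l.php";
  // searchParams.get() percent-decodes the value for us.
  const destination = parsed.searchParams.get("u");
  return isShim && destination ? destination : url;
}

console.log(
  parseFacebookLinkShimSketch(
    "https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com%2Fpage&h=abc"
  )
);
```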
(static) registerUrlMappings(urlMappings, pageIdopt) → {RegisteredUrlMappings}
Register known URL mappings for use in link resolution. This functionality allows studies to minimize HTTP requests for link resolution when a URL mapping can be parsed from page content.
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
urlMappings | Array.<UrlMapping> | | | The URL mappings to register. |
pageId | string | <optional> | null | An optional page ID for the page that the URL mappings were parsed from. If a page ID is provided, the mappings will be automatically removed shortly after the page visit ends. |
Returns:
An object that allows unregistering the URL mappings.
- Type
- RegisteredUrlMappings
Example
// A content script parses URL mappings from a Twitter page, then in the background script:
webScience.linkResolution.registerUrlMappings([
  {
    sourceUrl: "https://t.co/djogkKUD5y?amp=1",
    destinationUrl: "https://researchday.princeton.edu/",
    ignoreSourceUrlParameters: true
  },
  // Note that the following mapping involves a known URL shortener and would require further resolution
  {
    sourceUrl: "https://t.co/qQTRITLZKP?amp=1",
    destinationUrl: "https://mzl.la/3jh1VgZ",
    ignoreSourceUrlParameters: true
  }
]);
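To make the register/unregister handle and the ignoreSourceUrlParameters flag more concrete, here is a minimal in-memory sketch. All names (registeredMappings, registerUrlMappingsSketch, lookupDestinationSketch) are hypothetical, and how the module actually stores and matches mappings is an assumption; the sketch treats "parameters" as the URL's query string only.

```javascript
// Hypothetical in-memory registry; not the module's implementation.
const registeredMappings = new Set();

// Drop the query string so URLs can be compared without parameters.
function stripParameters(url) {
  const parsed = new URL(url);
  parsed.search = "";
  return parsed.href;
}

function registerUrlMappingsSketch(urlMappings) {
  for (const mapping of urlMappings) {
    registeredMappings.add(mapping);
  }
  // Return a handle shaped like RegisteredUrlMappings.
  return {
    unregister() {
      for (const mapping of urlMappings) {
        registeredMappings.delete(mapping);
      }
    }
  };
}

function lookupDestinationSketch(url) {
  for (const mapping of registeredMappings) {
    const matches = mapping.ignoreSourceUrlParameters
      ? stripParameters(url) === stripParameters(mapping.sourceUrl)
      : url === mapping.sourceUrl;
    if (matches) {
      return mapping.destinationUrl;
    }
  }
  return null; // no registered mapping applies
}

const registered = registerUrlMappingsSketch([
  {
    sourceUrl: "https://t.co/djogkKUD5y?amp=1",
    destinationUrl: "https://researchday.princeton.edu/",
    ignoreSourceUrlParameters: true
  }
]);
const beforeUnregister = lookupDestinationSketch("https://t.co/djogkKUD5y?s=20");
registered.unregister();
const afterUnregister = lookupDestinationSketch("https://t.co/djogkKUD5y?s=20");
```

Because the mapping ignores parameters, the lookup succeeds even though the query string differs from the registered source URL; after unregister() the same lookup finds nothing.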
(static) removeFacebookLinkDecoration(url) → {string}
Remove Facebook link decoration (the fbclid parameter) from a URL, if present.
Parameters:
Name | Type | Description |
---|---|---|
url | string | A URL that may have Facebook link decoration. |
Returns:
The URL without Facebook link decoration.
- Type
- string
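Removing the fbclid parameter can be sketched with the standard URL and URLSearchParams APIs. This is an illustration under assumptions rather than the module's code: removeFbclidSketch is a hypothetical name, and because it re-serializes the URL, other parts of the URL may be normalized along the way.

```javascript
function removeFbclidSketch(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return url; // not a parseable absolute URL, so leave it untouched
  }
  // Deleting a parameter that is absent is a no-op.
  parsed.searchParams.delete("fbclid");
  return parsed.href;
}

console.log(removeFbclidSketch("https://example.com/page?a=1&fbclid=abc123"));
console.log(removeFbclidSketch("https://example.com/page?fbclid=abc123"));
```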
(static) resolveUrl(url, optionsopt) → {Promise.<string>}
Resolve a shortened or shimmed URL to an original URL, by recursively resolving the URL and removing shims.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
url | string | | The URL to resolve. |
options | Object | <optional> | Options for resolving the URL. |
Returns:
A Promise that either resolves to the original URL or is rejected with an error.
- Type
- Promise.<string>
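The idea of resolving repeatedly until nothing more applies can be shown with a small synchronous sketch over a lookup table. This is a deliberate simplification: resolveUrl itself is asynchronous and returns a Promise, whereas followKnownMappingsSketch, sampleMappings, the maxHops limit, and the final destination URL below are all hypothetical.

```javascript
function followKnownMappingsSketch(url, knownMappings, maxHops = 10) {
  let current = url;
  for (let hop = 0; hop < maxHops; hop++) {
    if (!knownMappings.has(current)) {
      return current; // nothing left to resolve
    }
    current = knownMappings.get(current);
  }
  // Guard against mappings that point at each other.
  throw new Error("Exceeded " + maxHops + " hops; possible loop");
}

// Hypothetical table: one shortener pointing at another shortener.
const sampleMappings = new Map([
  ["https://t.co/qQTRITLZKP", "https://mzl.la/3jh1VgZ"],
  ["https://mzl.la/3jh1VgZ", "https://example.org/final-page"]
]);

const resolved = followKnownMappingsSketch(
  "https://t.co/qQTRITLZKP",
  sampleMappings
);
console.log(resolved);
```

The hop limit is a design choice for the sketch: without it, two mappings that reference each other would loop forever.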
(static) urlToPS1(url) → {string}
Extracts the public suffix + 1 from a URL.
Parameters:
Name | Type | Description |
---|---|---|
url | string | The URL. |
Returns:
The public suffix + 1.
- Type
- string
Example
Example usage of urlToPS1.
// returns "mozilla.org"
urlToPS1("https://www.mozilla.org/");
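"Public suffix + 1" means the public suffix (such as org or co.uk) plus the one label immediately to its left. The following minimal sketch shows the idea with a tiny hard-coded suffix set; samplePublicSuffixes and urlToPS1Sketch are hypothetical, and a real implementation would presumably consult the full Public Suffix List rather than three entries.

```javascript
// Assumption: a toy suffix set standing in for the Public Suffix List.
const samplePublicSuffixes = new Set(["com", "org", "co.uk"]);

function urlToPS1Sketch(url) {
  const labels = new URL(url).hostname.split(".");
  // Scan from the longest candidate suffix to the shortest.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (samplePublicSuffixes.has(candidate)) {
      // Keep one extra label to the left of the suffix, if there is one.
      return i === 0 ? candidate : labels.slice(i - 1).join(".");
    }
  }
  return labels.join("."); // no known suffix: fall back to the hostname
}

console.log(urlToPS1Sketch("https://www.mozilla.org/"));
console.log(urlToPS1Sketch("https://news.bbc.co.uk/"));
```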
Type Definitions
RegisteredUrlMappings
Type:
- Object
Properties:
Name | Type | Description |
---|---|---|
unregister | function | Unregister the URL mappings. |
UrlMapping
Type:
- Object
Properties:
Name | Type | Description |
---|---|---|
sourceUrl | string | The source URL for the mapping. |
destinationUrl | string | The destination URL for the mapping. |
ignoreSourceUrlParameters | boolean | Whether to ignore parameters when matching URLs against the source URL. |