Overview
This module addresses several challenges for studying user engagement with web content.
- Syncing Measurements and Interventions. A study that uses WebScience will often involve multiple measurements or interventions on a webpage. The pageManager module enables studies to sync these measurements and interventions by assigning a random unique identifier to each webpage.
- Generating Page Lifecycle Events. Measurements and interventions are often linked to specific events in the webpage lifecycle. The pageManager module standardizes a set of webpage lifecycle events.
- Tracking User Attention. Measurements and interventions often depend on user attention to web content. The pageManager module provides a standardized attention model that incorporates tab switching, window switching, application switching, locked screens, and user mouse and keyboard input.
- Generating Audio Events. This module provides events for webpage audio, enabling measurements and interventions based on media playback.
- Bridging the Background and Content Script Environments. WebExtensions includes two distinct execution environments: background scripts and content scripts. These execution environments are, unfortunately, only loosely bound together by tab IDs. As a result, there can be race conditions: the background and content environments can have mismatched states, such that messages arrive at the wrong webpage or are attributed to the wrong webpage. This module provides page lifecycle, user attention, and audio events that are bound to specific webpages.
Pages
This module creates an abstraction over webpages as perceived by users (i.e., when content loads with a new HTTP(S) URL in the browser bar or the page visibly reloads). Note that the History API enables web content to modify the URL without loading a new HTML document via HTTP(S) or creating a new Document object. This module treats a URL change via the History API as equivalent to traditional webpage navigation, because (by design) it appears identical to the user. Accounting for the History API is important, because it is used on some exceptionally popular websites (e.g., YouTube).
Page IDs
Each page ID is a random (version 4) UUID, consistent with RFC 4122.
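As a rough illustration of what a version 4 UUID looks like, the sketch below formats 16 bytes into the RFC 4122 layout and checks a string against that layout. The names formatUuidV4, randomPageId, and isUuidV4 are hypothetical helpers for this example; they are not part of pageManager, and the module may generate its IDs differently.

```javascript
// Format 16 bytes as an RFC 4122 version 4 UUID string.
// Hypothetical helper for illustration; not the module's implementation.
function formatUuidV4(bytes) {
    if (bytes.length !== 16) {
        throw new RangeError("expected exactly 16 bytes");
    }
    const b = Uint8Array.from(bytes);
    b[6] = (b[6] & 0x0f) | 0x40; // version 4 in the high nibble of byte 6
    b[8] = (b[8] & 0x3f) | 0x80; // RFC 4122 variant (binary 10xxxxxx) in byte 8
    const hex = Array.from(b, (x) => x.toString(16).padStart(2, "0")).join("");
    return [
        hex.slice(0, 8),
        hex.slice(8, 12),
        hex.slice(12, 16),
        hex.slice(16, 20),
        hex.slice(20)
    ].join("-");
}

// Draw the 16 bytes from the Web Crypto API (assumed to be available,
// as it is in browser extension environments).
function randomPageId() {
    return formatUuidV4(globalThis.crypto.getRandomValues(new Uint8Array(16)));
}

// Check that a string has the version 4 layout (lowercase hex).
const UUID_V4_PATTERN =
    /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

function isUuidV4(value) {
    return typeof value === "string" && UUID_V4_PATTERN.test(value);
}
```

A study could use a check like isUuidV4 to sanity-check page IDs before joining measurements on them.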
Page Lifecycle
Each webpage has the following lifecycle events, which fire in both the background page and content script environments.
- Page Visit Start - The browser has started to load a webpage in a tab. This event is fired early in content script execution (i.e., soon after document_start). For a webpage with a new Document, the event is timestamped with the time the window object was created (the time origin from the High Resolution Time Level 2 API, in ms). For a webpage that does not have a new Document (i.e., resulting from the History API), the event is timestamped with the URL change in the WebNavigation API.
- Page Visit Stop - The browser is unloading the webpage. Ordinarily this event fires and is timestamped with the window unload event. When the page changes via the History API, this event fires and is timestamped with the URL change in the WebNavigation API.
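The timestamp rules above can be sketched with a small per-tab tracker. This is a minimal sketch under stated assumptions: PageVisitTracker, its method names, and the event object shape are hypothetical and do not reflect the module's actual API or internals. The point it illustrates is that a History API URL change ends one page visit and begins the next at the same WebNavigation timestamp.

```javascript
// Hypothetical per-tab tracker; names and event shape are illustrative only.
class PageVisitTracker {
    constructor(onEvent) {
        this.onEvent = onEvent; // callback receiving { type, url, timeStamp }
        this.currentUrl = null; // URL of the page currently being visited
    }

    // A new Document loaded: the visit starts at the document's time origin.
    documentLoaded(url, timeOrigin) {
        this.currentUrl = url;
        this.onEvent({ type: "pageVisitStart", url, timeStamp: timeOrigin });
    }

    // The URL changed via the History API: stop the old page and start the
    // new one, both stamped with the WebNavigation URL-change time.
    historyStateUpdated(url, navigationTimeStamp) {
        if (this.currentUrl !== null) {
            this.onEvent({
                type: "pageVisitStop",
                url: this.currentUrl,
                timeStamp: navigationTimeStamp
            });
        }
        this.currentUrl = url;
        this.onEvent({ type: "pageVisitStart", url, timeStamp: navigationTimeStamp });
    }

    // The window is unloading: the visit stops at the unload event time.
    windowUnloaded(unloadTimeStamp) {
        if (this.currentUrl === null) {
            return;
        }
        this.onEvent({
            type: "pageVisitStop",
            url: this.currentUrl,
            timeStamp: unloadTimeStamp
        });
        this.currentUrl = null;
    }
}
```

In this sketch, a document load followed by one History API navigation and an unload yields four events: start, stop, start, stop.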
Attention Tracking
Attention to a page is defined as satisfying all of the following conditions.
- The tab is the active tab in its browser window.
- The window containing the tab is the current browser window.
- The current browser window has focus in the operating system.
- The operating system is not displaying a lock screen or screen saver.
- Optional: The user has provided mouse or keyboard input within a specified time interval.
In the content script environment, each page has an attention status, and an event fires when that status changes. Attention update events are timestamped with events from the WebExtensions tabs, windows, and idle APIs.
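The attention definition above amounts to a conjunction of conditions, which can be sketched as a pure function. The function name, the field names on the state object, and the 15000 ms default threshold are all assumptions made for this example; the module's own state representation and configuration may differ.

```javascript
// Hypothetical attention predicate; field names and defaults are assumptions.
function hasAttention(state, options = {}) {
    const { requireRecentInput = false, inputThresholdMs = 15000 } = options;

    // The four required conditions from the definition.
    const required =
        Boolean(state.tabIsActiveInWindow) &&
        Boolean(state.windowIsCurrentBrowserWindow) &&
        Boolean(state.browserHasOsFocus) &&
        !state.screenLockedOrSaver;
    if (!required) {
        return false;
    }

    // The optional fifth condition: recent mouse or keyboard input.
    if (!requireRecentInput) {
        return true;
    }
    return state.msSinceLastInput <= inputThresholdMs;
}
```

Failing any one required condition is enough to lose attention, which is why a window switch or a lock screen alone can trigger an attention update.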
Audio Events
In the content script environment, each page has an audio status, and an event fires when that status changes. Audio update events fire and are timestamped with events from the WebExtensions tabs API.
Event Ordering
This module guarantees the ordering of page lifecycle, attention, and audio events.
- Page visit start and page visit stop each fire exactly once for each page, in that order.
- Page attention and audio update events will only occur between page visit start and stop events.
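One way to read the ordering guarantees is as a small state machine over the events for a single page. The checker below is a sketch for use in a study's own tests; the event type strings and the function name are hypothetical labels chosen for this example, not identifiers from the module.

```javascript
// Check that a sequence of event types for ONE page respects the ordering
// guarantees. Event type strings are hypothetical labels for this sketch.
function isValidEventOrder(eventTypes) {
    let state = "beforeStart"; // beforeStart -> visiting -> stopped
    for (const type of eventTypes) {
        if (type === "pageVisitStart") {
            if (state !== "beforeStart") {
                return false; // start may fire only once, and first
            }
            state = "visiting";
        } else if (type === "pageVisitStop") {
            if (state !== "visiting") {
                return false; // stop may fire only once, after start
            }
            state = "stopped";
        } else if (type === "attentionUpdate" || type === "audioUpdate") {
            if (state !== "visiting") {
                return false; // updates occur only between start and stop
            }
        } else {
            return false; // unknown event type
        }
    }
    return true;
}
```

The checker accepts a prefix of a valid sequence, so it can also be applied to a page visit that has not stopped yet.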
Additional Implementation Notes
This module depends on the idle API, which has a couple of quirks in Firefox:
- There is a five-second interval when polling idle status from the operating system.
- Depending on the platform, the idle API reports either time since user input to the browser or time since user input to the operating system.
The polling interval coarsens the timing of page attention events related to idle state. As long as the polling interval is relatively short in comparison to the idle threshold, that should not be an issue.
The platform-specific meaning of idle state should also not be an issue. There is only a difference between the two meanings of idle state when the user is providing input to another application; if the user is providing input to the browser, or is not providing input at all, the two meanings are identical. In the scenario where the user is providing input to another application, the browser will lose focus in the operating system; this module will detect that with the windows API and fire a page attention event (if needed).
Some implementation quirks to be aware of for future development on this module:
- Non-browser windows do not appear in the results of windows.getAll(), and calling windows.get() on a non-browser window throws an error. Switching focus to a non-browser window will, however, fire the windows.onFocusChanged event. The module assumes that if windows.onFocusChanged fires with an unknown window, that window is a non-browser window.
- The module assumes that valid tab IDs and window IDs are always >= 0.
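The two assumptions above can be sketched as a classifier for the window ID delivered to a windows.onFocusChanged listener. The function name and return labels are hypothetical, and the set of known browser window IDs is assumed to be maintained elsewhere (for example, from windows.getAll() and window creation and removal events); this is not the module's actual code.

```javascript
// Mirrors browser.windows.WINDOW_ID_NONE, which is -1.
const NO_WINDOW_ID = -1;

// Classify the window ID from a windows.onFocusChanged event.
// Hypothetical helper; return labels are illustrative only.
function classifyFocusedWindow(windowId, knownBrowserWindowIds) {
    // Assumption from the notes above: valid window IDs are always >= 0.
    if (windowId < 0) {
        return "none";
    }
    // A window the extension already knows about is a browser window.
    if (knownBrowserWindowIds.has(windowId)) {
        return "browser";
    }
    // Assumption from the notes above: an unknown window is a
    // non-browser window.
    return "non-browser";
}
```

Under these assumptions, only a "browser" result can leave a page with attention; both other results mean no browser window holds focus.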
Known Issues
- The background script sends update messages to tabs regardless of whether they are ordinary tabs or have the pageManager content script running, because the background script does not track window types or tab content. The errors generated by this issue are caught in messaging.sendMessageToTab, and the issue should not cause any problems for studies.
Possible Improvements
- Rebuild a page attention update event in the background page environment.
- Rebuild the capability to fire events for pages that are already open when the module loads.
- Add logic to handle the situation where the content script execution environment crashes, so the page visit stop message doesn't fire from the associated content script.
- Add an event in the content script for detecting when content has lazily loaded into the DOM after the various DOM loading events (e.g., on Twitter).
Namespaces
Methods
(static) initialize()
Initialize pageManager in the background and content script environments. If you are using pageManager events in content scripts but not background scripts, you must call this function. If you are using pageManager events in background scripts, this function is automatically called when adding a listener for an event. This function configures message passing between the pageManager background script and content script, registers browser event handlers, caches initial state, and registers the pageManager content script. It runs only once.