Precisely observing structural page changes

When integrating with a large, complex webapp, proceed with caution

This blog post is part of the
Mixmax 2017 Advent Calendar. The previous post on
December 6th was about
Database-backed job processing.

Mixmax is built on Gmail. Our product, and its convenience and power, depends on tight
user-interface integration with Gmail. In order to add features to Gmail for our users, we need to
track the structure of Gmail’s DOM and be able to manipulate it. For years, we achieved this by
crafting query selectors to identify important elements within Gmail’s DOM, and continuously
re-applying these selectors as the page changed. As we did this, we attached our own
content to the page.

Sadly, many users previously reported performance problems with Gmail when Mixmax was installed. Our
performance analysis showed significant processing time being spent in the code that observed the
page for changes. This code was thus a candidate for optimization.

Problem

Our performance analysis revealed that our existing implementation configured
MutationObserver instances to watch for changes anywhere within the document
tree, causing major page slowdown. We were using this approach to observe page changes within Gmail
without undue risk of breakage from changes in Gmail’s page structure. To combat the expected
performance issues endemic to responding to every little change in the DOM, we wrapped our handler
in _.throttle, thereby only responding to a small fraction of changes without losing correctness.
Underscore’s throttle function returns a closure function which checks the elapsed time since the
last call, and delegates to the provided function only when sufficient time has passed. In our case,
we discovered that simply running this check was enough to induce low performance, as our mutation
observers fired their handlers with such intensity that it overwhelmed poor V8.
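To make the cost concrete, here is a minimal sketch of a throttle wrapper. It is a simplification and not Underscore's implementation: it ignores the leading/trailing options and timers, and `simpleThrottle`, the injectable `now` clock, and the `wrapperCalls` counter are names invented for illustration. It shows that the wrapper's bookkeeping still runs on every call, even when the wrapped handler rarely does.

```javascript
// Simplified throttle: returns a closure that checks elapsed time on every call.
// `now` is injectable so the simulation below is deterministic.
function simpleThrottle(fn, waitMs, now = Date.now) {
  let lastRun = -Infinity;
  let wrapperCalls = 0;
  function throttled(...args) {
    wrapperCalls += 1; // this bookkeeping runs for every single call
    const t = now();
    if (t - lastRun >= waitMs) {
      lastRun = t;
      fn(...args);
    }
  }
  throttled.wrapperCalls = () => wrapperCalls;
  return throttled;
}

// Simulate 10,000 mutation callbacks, one per millisecond of fake time,
// throttled to at most one handler call per 1,000 ms.
let fakeTime = 0;
let handlerCalls = 0;
const throttled = simpleThrottle(() => { handlerCalls += 1; }, 1000, () => fakeTime);
for (let i = 0; i < 10000; i++) {
  fakeTime = i;
  throttled();
}
console.log(throttled.wrapperCalls(), handlerCalls);
```

In this simulation the handler runs only a handful of times, but the closure itself is entered once per mutation callback. Throttling reduces the handler work, not the number of callbacks the observer schedules.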

We wanted to maintain functionality without the massive performance penalty. Optimally, we would
(instead of watching all changes) identify nodes we expect to change and observe them directly.

We realized early on that a good solution must balance performance and reliability. If we made our
declarations too specific, we would risk significant maintenance work whenever Gmail changed,
but if we made them too indirect, we would risk missing the crucial page updates that allow us to
introduce our own controls into Gmail.

Solution

Instead of registering global MutationObserver instances, or leaning heavily on fragile,
specialized code to watch the right elements for changes, we now make heavy use of
page-parser-tree, a module written by Chris Cowan at Streak. page-parser-tree observes subsets
of a dynamic webpage, using declarations provided by the developer that identify important sections
of the page. The declarations consist of realtime watchers and polling finders, which pick out
specific elements and add them to tracked sets of elements dubbed “tags.”

The watchers declare parts of the page structure that we know will change—things like the
thread and message list containers, and the compose and reply buttons. Watchers are generally
hierarchical, and reference other tags to explore smaller and smaller DOM subtrees.

The finders define functions that run periodically (usually every five seconds) to detect important
elements the watchers might have missed. They do so by running query selectors against the entire
document. The finders will discover the same elements as our old approach, but without the
performance penalties associated with the global mutation observers. The finders thus serve two
purposes: to ensure that our integration doesn’t break entirely if we miss an edge-case, and to
provide fallback behavior in case Gmail’s page structure changes in a way that our watchers can’t accommodate.

Should the watchers and finders identify different sets of elements, we log the reported
inconsistency and some context. These reports give us a channel to proactively identify and fix
regressions related to Gmail updates.
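The comparison between watcher and finder results can be pictured as a set difference. The sketch below is an illustration only, not page-parser-tree's internals: `diffTag` and `reportInconsistency` are hypothetical helpers, and plain objects stand in for DOM elements.

```javascript
// Hypothetical helper: compare what the watchers are tracking for a tag against
// what a finder's document-wide query returned, and describe any disagreement.
function diffTag(tagName, watcherElements, finderElements) {
  const watched = new Set(watcherElements);
  const found = new Set(finderElements);
  return {
    tag: tagName,
    // Present on the page but never picked up by a watcher.
    missedByWatcher: [...found].filter((el) => !watched.has(el)),
    // Tracked by a watcher but not matched by the finder.
    missedByFinder: [...watched].filter((el) => !found.has(el)),
  };
}

// Hypothetical reporter: logs only when the two sources disagree, and returns
// the number of mismatched elements.
function reportInconsistency(report, log = console.warn) {
  const total = report.missedByWatcher.length + report.missedByFinder.length;
  if (total > 0) {
    log(`tag "${report.tag}" is inconsistent`, report);
  }
  return total;
}

// Plain objects stand in for DOM elements.
const a = {id: 'a'};
const b = {id: 'b'};
const c = {id: 'c'};
const report = diffTag('thread', [a, b], [b, c]);
const mismatches = reportInconsistency(report);
```

Here the watcher missed `c` and the finder did not match `a`, so the report is logged with both lists as context.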

Under the hood, page-parser-tree uses live-set, essentially a set with a subscribe method that
tracks changes to some group of objects. These live-sets can be converted to
Observable objects, which have a slightly different subscribe method suited to a
different use-case. The next section shows an Observable in use.
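As a rough mental model of "a set with a subscribe method," consider the toy class below. It is not live-set's API: `TinyLiveSet`, `onAdd`, and `onRemove` are names made up for this sketch, and replaying existing items to new subscribers is a design choice of the sketch.

```javascript
// A toy "live set": a Set that notifies subscribers about additions and removals.
class TinyLiveSet {
  constructor() {
    this.items = new Set();
    this.subscribers = new Set();
  }

  subscribe({onAdd = () => {}, onRemove = () => {}}) {
    const subscriber = {onAdd, onRemove};
    this.subscribers.add(subscriber);
    // Replay current contents so late subscribers still see existing items.
    for (const item of this.items) onAdd(item);
    return {unsubscribe: () => this.subscribers.delete(subscriber)};
  }

  add(item) {
    if (this.items.has(item)) return; // sets hold each item at most once
    this.items.add(item);
    for (const s of this.subscribers) s.onAdd(item);
  }

  remove(item) {
    if (!this.items.delete(item)) return; // nothing to remove
    for (const s of this.subscribers) s.onRemove(item);
  }
}

const events = [];
const threads = new TinyLiveSet();
threads.add('thread-1');
const sub = threads.subscribe({
  onAdd: (t) => events.push(`+${t}`),
  onRemove: (t) => events.push(`-${t}`),
});
threads.add('thread-2');
threads.remove('thread-1');
sub.unsubscribe();
threads.add('thread-3'); // not recorded: we unsubscribed
console.log(events); // [ '+thread-1', '+thread-2', '-thread-1' ]
```

The subscriber sees the item that existed before it subscribed, then each later change, and nothing after it unsubscribes.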

In addition to page-parser-tree, we have some specialized code to handle poorly supported
edge-cases, and to achieve reliability unattainable by using page-parser-tree’s watchers. One
example is a thread navigation watcher that emits events when the user changes the current
thread, and that works whether or not the preview pane (a Gmail lab) is enabled. The next
section includes a few other examples of this.

The new approach avoids the performance issue with global mutation observers by not using them.
Profiling shows that, with page-parser-tree, watching the page is no longer a significant
performance cost.

Implementation

We used to monitor changes to Gmail’s structure using global subtree MutationObserver instances.
We had an ElementUtils module that provided onElementAdded and ensureElementExists functions
to detect elements given a query selector. The onElementAdded utility called the given handler
function when it noticed any element matching the query selector, whereas the ensureElementExists
function returned a promise that resolved to the first matching element, including existing
elements. ensureElementExists built on onElementAdded, unsubscribing after the first element.
onElementAdded used a _.throttle call to reduce the frequency with which it called the handler.
The following code, for example, would detect the compose button:

const selector = GmailSelectors.COMPOSE_BUTTON;
// Delay watching for the compose button for long enough so the loading performance
// is quick, but also short enough so the button doesn't noticeably flash the default
// gmail color.
const wait = 300;
ElementUtils.ensureElementExists(selector, wait).then((origComposeButton) => {
  // replace the compose button
});

Internally, ElementUtils used a slightly more complicated variation of the following code:

// Throttle element queries and handler calls.
const wrappedOnMutation = _.throttle(onMutation, throttleDuration, {leading: false});
const observer = new MutationObserver(wrappedOnMutation);
observer.observe(document, {
  childList: true,
  subtree: true
});
// Find existing elements. This runs after `observer` is initialized, because
// onMutation passes the observer through to the handler.
onMutation();
function onMutation() {
  const elems = $(selector);
  if (elems.length) {
    handler(elems, observer);
  }
}

The new code that detects the compose button proxies through a new common interface, called UI:

UI.get('originalComposeButton').then((origComposeButton) => {
  // replace the compose button
});

Under the hood, UI uses a getFirstFromTag utility function to get the first node from the
originalComposeButton tag. The getFirstFromTag function asks the page-parser-tree instance for
an Observable corresponding to that tag, and unsubscribes as soon as the
observable produces a compose button. This code is roughly analogous to the following, but handles
numerous edge-cases:

import PageParserTree from 'page-parser-tree';
import toValueObservable from 'live-set/toValueObservable';
const Watcher = new PageParserTree(definitions);
function getFirstFromTag(tag) {
  const deferred = $.Deferred();
  // Get an observable for the given tag.
  const observable = toValueObservable(Watcher.tree.getAllByTag(tag));
  // Subscribe to elements in the tag - include elements already in the tag. The
  // value parameter is unpacked from an object that also contains the removal
  // Promise, which we don't need for this use-case.
  const subscription = observable.subscribe(({value}) => {
    subscription.unsubscribe();
    deferred.resolve($(value.getValue()));
  });
  return deferred.promise();
}
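One edge case worth illustrating: the snippet above calls `subscription.unsubscribe()` from inside the callback. If a source delivers an existing element synchronously during `subscribe` (an assumption here, not something verified for `toValueObservable`), `subscription` would not be assigned yet. The sketch below shows one way to take the first value safely in both cases. `firstValue` and `makeSource` are hypothetical names, and the fake source only mimics the `subscribe`/`unsubscribe` shape.

```javascript
// Resolve with the first value a source emits, then stop listening.
// Assumes source.subscribe(fn) returns {unsubscribe()} and may call fn
// synchronously for values that already exist.
function firstValue(source) {
  return new Promise((resolve) => {
    let subscription = null;
    let done = false;
    subscription = source.subscribe((value) => {
      if (done) return;
      done = true;
      resolve(value);
      // When called synchronously, `subscription` is still null here.
      if (subscription) subscription.unsubscribe();
    });
    // Covers the synchronous case: unsubscribe once subscribe() has returned.
    if (done) subscription.unsubscribe();
  });
}

// Fake source for demonstration: replays `initial` values synchronously on subscribe.
function makeSource(initial = []) {
  const listeners = new Set();
  return {
    listeners,
    subscribe(fn) {
      listeners.add(fn);
      initial.forEach((v) => fn(v));
      return {unsubscribe: () => listeners.delete(fn)};
    },
    emit(value) {
      [...listeners].forEach((fn) => fn(value));
    },
  };
}

// The element already exists: delivered synchronously during subscribe().
const eager = makeSource(['compose-button']);
const eagerPromise = firstValue(eager);

// The element shows up later: delivered after subscribe() has returned.
const lazy = makeSource();
const lazyPromise = firstValue(lazy);
lazy.emit('compose-button');

Promise.all([eagerPromise, lazyPromise]).then((values) => console.log(values));
```

In both paths the listener is removed as soon as the first value arrives, so later emissions are ignored.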

This new detection requires a deeper understanding of how the page changes. When the user simply
loads their inbox, the compose button will be reachable once the loading view disappears. However,
if they navigate to contacts, Gmail removes the compose button from the DOM. The compose button
watcher must therefore rediscover the button when the page changes. To avoid an overly specific set
of child selectors, we “jump” between well-known points in the DOM. The tag is defined as a watcher
and associated finder:

watchers: [
  // The 'pageContent' source references another tag that finds a defined DOM node
  // that wraps the entire page content (minus things like top-level scripts and our
  // compose windows).
  {sources: ['pageContent'], tag: 'originalComposeButton', selectors: [
    // Use the $map operator to hop from the "pageContent" container to the left
    // sidebar container, which includes the dropdown that navigates between Mail
    // and Contacts, and is seven levels down from the "pageContent" container. Here,
    // the mapping function will be called with each element from the pageContent tag
    // (which should be a single element).
    {$map: (e) => $(e).find('.Ls77Lb')[0]},
    // We define this immediate-child selector for the .aj9 Mail sidebar container,
    // which Gmail replaces with the .aXo Contacts sidebar container.
    // page-parser-tree watches the immediate children of the sidebar, declared by
    // the previous selector, for when the .aj9 container is added and removed.
    // By declaring this element as a direct child of its parent, we discover the new
    // button when the user returns to the primary Mail view.
    '.aj9',
    // Under the .aj9 container, we again use the $map operator to jump to the
    // compose button itself.
    {$map: (e) => $(e).find('div[gh="cm"]:not(.mixmax-compose-button)')[0]}
  ]}
],
finders: {
  originalComposeButton: {
    // page-parser-tree calls this (by default) every five seconds to ensure
    // we haven't missed the compose button due to a change that impacts the above
    // selectors.
    fn: (doc) => $(doc).find(GmailSelectors.COMPOSE_BUTTON_ORIGINAL).toArray()
  }
}

Using this formulation for the watcher, we avoid specifying selectors for each element between the
pageContent container and the sidebar container, and between the Mail sidebar container and the
compose button itself. As such, Gmail is free to change the exact structure it uses within that DOM subtree.

The new approach is a dramatic shift in how we observe page changes. Instead of watching the entire
page for any change, and then rediscovering all elements that match a given selector, we define the
structural relationships between key page elements, and have page-parser-tree watch only those
elements for relevant changes. The new approach is faster, reasonably reliable, and more responsive
to page changes than our old method.

Limitations

Do note that page-parser-tree isn’t a silver bullet. It doesn’t support tagging the same DOM node
with the same tag from multiple watchers, nor does it support unrestricted subtree fanout/deep
selectors. It does have first-class support for identifying immediate children, and provides
operators to watch for attribute changes, apply arbitrary filters, map from one element to
another, and more.

Another important caveat is that watchers aren’t smart enough to watch for attribute changes based
solely on the selector. If you ask a watcher to identify elements that match .nH.id, and Gmail
later removes the id class from such an element, the element will remain in the tag. To correctly track these
changes, we need the $watch operator. We do this when we watch for messages being opened and
closed by the user:

watchers: [
  // Filter message containers by whether they are open, updating the tag when the
  // user opens/closes one of the messages.
  {sources: ['message'], tag: 'openMessage', selectors: [
    // page-parser-tree calls cond to determine whether a given element should be in
    // the openMessage tag, and re-evaluates a given element when any of the
    // attributes on that element change. When the attributeFilter array is provided,
    // page-parser-tree only re-evaluates when any of those attributes change.
    {$watch: {attributeFilter: ['class'], cond: (e) => $(e).hasClass('h7')}}
  ]}
]

In another case, we identify zero or more form elements within email message bodies, and disable
Gmail’s form submission warning. The old code looked like this:

ElementUtils.onElementAdded('form[action*="mixmax.com"]', (forms) => {
  forms.removeAttr('onsubmit');
  forms.on('submit', (e) => e.stopPropagation());
});

The above approach is super robust to changes within Gmail, but introduces performance issues as in
every other case. We thus now use page-parser-tree. It identifies the open message container, but we
can’t use it to find the actual form elements. The form elements might be anywhere within the DOM
subtree, and page-parser-tree does not support deep selectors as they violate the premise of not
watching for global changes. Moreover, a message may contain more than one form, so we can’t use
the $map trick from above to identify all the forms: $map only maps one element to
another element, with no “fanout.” As such, we do not expose the form elements from our watchers or
finders at all, but instead prefer to discover them on top of the open message container tag, and
provide an interface for discovering them in the UI abstraction.

UI uses the knowledge that these forms will only be present within open message containers, and
available in the DOM as soon as the message has been opened, to find them directly with jQuery:

// The actual implementation abstracts this line as subscribe('openMessage', ...)
toValueObservable(Watcher.tree.getAllByTag('openMessage')).subscribe(({value}) => {
  const message = $(value.getValue());
  // Find the forms that have an action that submits data to a mixmax domain.
  message.find('form[action*="mixmax.com"]').each(function() {
    handler($(this));
  });
});

Which is then used to disable the warning:

UI.added('mixmaxForm', (form) => {
  form.removeAttr('onsubmit');
  form.on('submit', (e) => e.stopPropagation());
});

A final limitation is that tags cannot receive the same element multiple times. We ran into this
limitation when adding functionality to support the preview pane. It means we must be
careful when sharing watchers between multiple tags. For example, it means we cannot use the
following to detect when an element is empty:

// Do not do this!
watchers: [
  {sources: ['replyContainer'], tag: 'nonEmptyReplyContainer', selectors: [
    // Grab all immediate children of the replyContainer.
    '*',
    // Return to the replyContainer - in theory, page-parser-tree would track which
    // child elements it received the replyContainer element from, and would
    // appropriately remove the replyContainer from nonEmptyReplyContainer when all
    // the child elements are gone. page-parser-tree does not currently support this
    // use-case.
    {$map: (e) => e.parentNode}
  ]}
]

A potential solution to the above would be to replace the '*' with '*:first-child'. We also
encountered this issue when attempting to reuse watchers between different preview pane states and
modes, and had to wholly restructure our selectors to correctly match the message list view in all
view configurations.

Observables

We use Observable objects to simplify some of our interactions with page-parser-tree. A
subscription to an Observable immediately receives any elements that already reside in the
Observable (and in the tag), and receives subsequent elements as they’re discovered by
page-parser-tree. The Observable is provided by zen-observable and
instantiable via live-set’s toValueObservable (demonstrated in getFirstFromTag in the
Implementation section).

Observable subscriptions also expose handy removal Promise objects, which are resolved when the
associated element has been removed from its tag. We use removal Promises in a couple of places to
correctly manage the lifecycle of our own code.

For example, when Gmail is in preview pane mode (a Gmail lab), it re-renders the recipient cell in
the thread row into which we inject our reminder button. To reattach the button when the row
changes, our UI abstraction watches thread rows, and subscribes to an observable that observes the
thread tag. The observable provides both the added element and a removal Promise. UI registers
its own MutationObserver when it detects a new row, and disconnects it when the promise resolves:

const observable = toValueObservable(Watcher.tree.getAllByTag('thread'));
observable.subscribe(({value, removal}) => {
  const row = $(value.getValue());
  // Find the parent of the node of interest: the node of interest will exist now, but may be replaced.
  const recipientContainer = row.find(GmailSelectors.RECIPIENT_WRAPPER).parent()[0];
  const observer = new MutationObserver(_.throttle(() => handler(row), 10));
  // Watch the recipient container node for child list changes, so we discover the new recipient
  // wrapper when it's been replaced.
  observer.observe(recipientContainer, {
    childList: true
  });
  // When the row is removed (or we switch views), disconnect the observer.
  removal.then(() => observer.disconnect());
  // Initially call the handler with the row.
  handler(row);
});

Our old approach had deadly performance issues. Our new approach has many nuances and complexities.
This reflects the nature of the problem—it’s not easy, it has real user impact, and a solution
must balance many factors.

Have a knack for engineering solutions to software problems that prioritize user experience?
We’re hiring!

Written By

Eli Skeggs
