Adventures in the Gmail PubSub API
Or, ‘wait, that’s not what the docs say…’
Thursday, Dec 8th, 2016
Mixmax is a communications platform that brings professional communication & email into the 21st century.
The Gmail PubSub API pushes notifications to an endpoint in our system whenever a user’s inbox changes. This includes new messages arriving, as well as other events, such as a message being read or moved to a different folder.
Our experience with it has been largely positive, but we did encounter a couple of gotchas that we haven’t seen documented anywhere yet.
1. It’s possible to subscribe to a user’s notifications more than once
The watch method (which subscribes to notifications regarding a user’s inbox) is supposed to be idempotent. While that’s broadly the case, we found that if you send multiple watch requests simultaneously, you end up with multiple subscriptions, meaning every event for that user gets pushed to your endpoint multiple times!
Our system was swamped by this a few days after deploy, reaching over 1000 requests per second before we shut it off.
(We were calling watch simultaneously in some cases due to the way events propagated through our internal queueing system - duplicates sometimes occur, but we didn’t protect against them because the method was documented as idempotent. We've since added a user-level lock to prevent this.)
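A user-level lock like the one mentioned above could look roughly like the following. This is a minimal in-process sketch, not our production code: `PerUserLocks`, `watch_serialized`, and the `call_watch` callable are hypothetical names, and `call_watch` stands in for whatever function actually issues the Gmail watch request.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class PerUserLocks:
    """Hands out one lock per user id (hypothetical helper, in-process only)."""

    def __init__(self) -> None:
        self._registry_lock = threading.Lock()
        self._locks: Dict[str, threading.Lock] = {}

    def for_user(self, user_id: str) -> threading.Lock:
        # Guard the registry so two threads never create two locks for one user.
        with self._registry_lock:
            return self._locks.setdefault(user_id, threading.Lock())


def watch_serialized(locks: PerUserLocks, user_id: str,
                     call_watch: Callable[[str], T]) -> T:
    """Run call_watch for a user while holding that user's lock.

    call_watch is an assumed stand-in for the real watch request; requests
    for the same user run one at a time, while different users do not block
    each other.
    """
    with locks.for_user(user_id):
        return call_watch(user_id)
```

This only serializes calls within a single process. If watch requests can originate from several workers or machines, the same idea would presumably need a lock held in a shared store instead.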
2. The messagesAdded collection in history.list isn’t reliable
The push notifications from Gmail contain a historyId - you then need to query the API to get all changes which have occurred between the last historyId you saw for that user, and this new historyId. The history.list method returns an array of changes, broken down by change type. For example, labelsAdded contains details of labels being applied to messages. We were mostly interested in the messagesAdded array, which, according to the documentation, represents:
Messages added to the mailbox in this history record
We built our code assuming that all new messages would be listed within messagesAdded. However, inexplicably, that’s not always the case. We had multiple instances where new messages simply never appeared in the messagesAdded array.
We were never able to identify why this occurred. In our tests (moving a message out of the spam folder, messages sent from the user to themselves, messages skipping the inbox etc.) we could not reproduce the issue.
We did, however, find that when messages were missing from messagesAdded, they did appear in the messages array, which holds the ids of all messages modified in any way in the history record.
We now check the messages array exclusively, which is inefficient, because it means we’re querying messages which aren’t new, but were simply changed in some way (read, moved to a different folder, archived, etc.).
But at least we can now guarantee we’re picking up all new messages.
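To make the flow in this section concrete, here is a rough sketch of decoding a push notification and then collecting message ids from the messages arrays rather than from messagesAdded. The function names are hypothetical. The payload and record shapes (a base64url-encoded `message.data` field, and `messages` / `messagesAdded` entries on each history record) are assumptions based on our reading of the Pub/Sub push format and the history.list response. The history records are assumed to have been fetched already.

```python
import base64
import json
from typing import Any, Dict, List, Set


def decode_notification(push_body: Dict[str, Any]) -> Dict[str, Any]:
    """Decode the base64url JSON payload from a Pub/Sub push request body."""
    data = push_body["message"]["data"]
    # Restore any padding that was stripped, so the decoder accepts the input.
    padded = data + "=" * (-len(data) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def changed_message_ids(history: List[Dict[str, Any]]) -> List[str]:
    """Ids from every record's messages array, de-duplicated, first-seen order."""
    seen: Set[str] = set()
    ordered: List[str] = []
    for record in history:
        for message in record.get("messages", []):
            if message["id"] not in seen:
                seen.add(message["id"])
                ordered.append(message["id"])
    return ordered


def added_message_ids(history: List[Dict[str, Any]]) -> Set[str]:
    """Ids listed under messagesAdded, kept here only for comparison."""
    return {
        entry["message"]["id"]
        for record in history
        for entry in record.get("messagesAdded", [])
    }
```

Comparing `changed_message_ids(history)` against `added_message_ids(history)` is one way to spot records where a message shows up in messages but not in messagesAdded. Each id from the first list still has to be fetched and checked to see whether it is actually new, which is the inefficiency described above.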
3. If you subscribe to push notifications for all your users, then sending out a user email blast causes some serious spikes
We noticed a weird spike in push notifications (around 10x higher than normal volume), a few days after launch.
It took a few minutes to figure out it was due to a newsletter we had sent out to our entire user base, triggering inbox events for every user simultaneously :)
Incidentally, we've also noticed that inbox events spike on the hour and half-hour. Our theory is that this is due to bulk marketing emails, which are often scheduled to go out at these times.
Enjoy discovering API quirks like these? Drop us a line.