Add-on recommendations for Firefox users: a prototype recommender system leveraging existing data sources

By: Alessio Placitelli, Ben Miroglio, Jason Thomas, Shell Escalante and Martin Lopatka.
With special recognition of the development efforts of Roberto Vitillo, who kickstarted this project; Mauro Doglio, for massive contributions to the code base during his time at Mozilla; Florian Hartmann, who contributed to prototyping the ensemble linear combiner; and Stuart Colville, for coordinating integration with AMO. Last, but not least, thanks to Anthony Miyaguchi, who helped shape the current code through his reviewing efforts.

What’s TAAR?
Firefox has a robust ecosystem of add-ons that can enhance the browsing experience, but all users are different and not all add-ons are right for everyone. The TAAR project (Telemetry Aware Addon Recommender) is an experimental product developed over the course of 2017 to provide a personalized experience for Firefox users seeking to install add-ons, based on information already available in Mozilla’s telemetry data. Our aim is to suggest potentially interesting add-ons, or useful replacements for add-ons built on legacy technology, without the need for additional data collection. Add-ons created with the new standard (after legacy) are safer, more secure, and won’t break in new Firefox releases.
Unlike conventional recommender systems, we designed TAAR to provide interesting add-on recommendations based on the Telemetry data Firefox collects in accordance with Mozilla’s Data Privacy Principles and privacy policy. The data contains, among other things, browser performance data and a hardware overview. This information is collected from Firefox desktop, and users can disable the collection if they choose to do so. Retrofitting an existing data source to a new application is no easy task, but we are really happy with how TAAR leverages different information sources, depending on their availability, to provide a personalized add-on recommendation list.



Recording Telemetry scalars from add-ons

The Go Faster initiative is important as it enables us to ship code faster, using special add-ons, without being strictly tied to the Firefox train schedule. As Georg Fritzsche pointed out in his article, we have two options for instrumenting these add-ons: having probe definitions ride the trains (waiting a few weeks!) or implementing and sending a new custom ping (doing some pipeline work!).

Neither solution is very appealing when shipping code faster. But hey... we have a plan!

Our current work is focused on extending Telemetry to fill this gap. The first step consisted of enabling add-on event recording in Firefox 56 (bug), and we recently enabled add-on scalar recording as well (bug)!



Getting Firefox data faster: introducing the ‘new-profile’ ping

Let me state this clearly, again: data latency sucks. This is especially true when working on Firefox: a nicely crafted piece of software that ships worldwide to many people. When something affects the experience of our users we need to know and react fast.

The story so far…

Over the previous quarters, we started improving the latency of the data coming from Firefox and got to the point where the majority of pings reach our servers within 1 hour, instead of days (latest Beta only): there’s an extremely satisfying plot by :chutten about that!

However, this change does not help much with the data latency of users who just installed Firefox (or created a new profile), don’t trigger a subsession split, and usually suspend their computer instead of shutting Firefox down. Their first chunk of data would arrive either at their local midnight or after they wake their computer again, which could take hours or even days (on weekends).