Add-on recommendations for Firefox users: a prototype recommender system leveraging existing data sources

By: Alessio Placitelli, Ben Miroglio, Jason Thomas, Shell Escalante and Martin Lopatka.
With special recognition of the development efforts of Roberto Vitillo, who kickstarted this project; Mauro Doglio, for massive contributions to the code base during his time at Mozilla; Florian Hartmann, who contributed efforts towards prototyping the ensemble linear combiner; and Stuart Colville, for coordinating integration with AMO. Last, but not least, thanks to Anthony Miyaguchi, who helped shape the current code through his reviewing efforts.

What’s TAAR?
Firefox has a robust ecosystem of add-ons that can enhance the browsing experience, but all users are different and not all add-ons are right for everyone. The TAAR project (Telemetry Aware Addon Recommender) is an experimental product developed over the course of 2017 to provide a personalized experience for Firefox users seeking to install add-ons based on available information already in Mozilla’s telemetry data. Our aim is to provide potentially interesting add-ons or useful replacements to add-ons built on legacy technology, without the need for additional data collection. Add-ons created with the new standard (after legacy) are safer, more secure, and won’t break in new Firefox releases.
Unlike conventional recommender systems, we designed TAAR to provide interesting add-on recommendations based on the Telemetry data Firefox collects in accordance with Mozilla’s Data Privacy Principles and privacy policy. The data contains, among other things, browser performance data and a hardware overview. This information is collected from Firefox desktop, and users can disable the collection if they choose to do so. Retrofitting an existing data source to a new application is no easy task, but we are really happy with TAAR’s functionality in leveraging different information sources based on availability to provide a personalized add-on recommendation list.

… 

 

Getting Firefox data faster: introducing the ‘new-profile’ ping

Let me state this clearly, again: data latency sucks. This is especially true when working on Firefox: a nicely crafted piece of software that ships worldwide to many people. When something affects the experience of our users we need to know and react fast.

The story so far…

Over the previous quarters, we started improving the latency of the data coming from Firefox and got to the point where the majority of pings reach our servers within 1 hour instead of days (latest Beta only): there’s an extremely satisfying plot by :chutten about that!

However, this change does not help too much with the data latency of users who just installed Firefox (or created a new profile), don’t trigger a subsession split and usually suspend their computer instead of shutting Firefox down. Their first chunk of data would come either at their local midnight or after they wake their computer again. And this could take hours or days (on weekends).

… 

 

Getting Firefox data faster: the shutdown pingsender

The data our Firefox users share with us is the key to identifying and fixing performance issues that lead to a poor browsing experience. Collecting it is not enough if we don’t manage to receive the data in an acceptable time-frame. My esteemed colleague Chris already wrote about this a couple of times: data latency sucks. But we can fix that.

Why is there latency, anyway?

The bulk of measurements we collect (histograms, scalars, events, …) are sent through the main-ping. This ping is generated at different times during the browsing session, including shutdown. The “shutdown” main-ping, which accounts for about 80% of all the pings we receive, is not sent to our servers once it is generated: it waits until the next Firefox restart. Depending on the user habits and the day of the week, this could be anything from a few minutes to a few days (see the CDF plot below): way too much! One of my team’s goals for this year is to reduce this latency, allowing developers to take decisions and iterate quickly.
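To make the idea of a latency CDF concrete, here is a minimal sketch of how the fraction of pings received within a given delay could be computed. The function name `empirical_cdf` and the sample delays are made up for illustration; they are not real Telemetry data and this is not the code used to produce the plot.

```python
from bisect import bisect_right


def empirical_cdf(delays_hours, thresholds_hours):
    """Return, for each threshold, the fraction of pings whose
    submission delay is less than or equal to that threshold."""
    if not delays_hours:
        raise ValueError("at least one delay is required")
    ordered = sorted(delays_hours)
    total = len(ordered)
    # bisect_right counts how many sorted delays are <= the threshold.
    return [bisect_right(ordered, t) / total for t in thresholds_hours]


# Hypothetical delays (in hours) between ping creation and reception.
sample_delays = [0.1, 0.5, 2.0, 8.0, 12.0, 26.0, 50.0, 72.0]

# Fraction of pings received within 1 hour, 1 day and 2 days.
fractions = empirical_cdf(sample_delays, [1, 24, 48])
for hours, fraction in zip([1, 24, 48], fractions):
    print(f"within {hours:>2}h: {fraction:.1%}")
```

A curve that climbs slowly in such a plot means a large share of pings arrives late, which is exactly the situation described above for the “shutdown” main-ping.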

…