Mozilla, Perfherder

Perfherder Quarter of Contribution Summer 2016: Results

Aug 10th, 2016

Following in the footsteps of Mike Ling’s amazing work on Perfherder in 2015 (he’s gone on to do a GSOC project), I had two excellent contributors continue working on the project for a few weeks this summer as part of our quarter of contribution program: Shruti Jasoria and Roy Chiang.

Shruti started by adding a feature to the treeherder/perfherder backend (the ability to enable or disable a new performance framework on a tentative basis), then went on to make all sorts of improvements to the Treeherder / Perfherder frontend: fixing bugs in the performance sheriffing interface and updating code to use more modern standards (including a gigantic patch to enable a bunch of eslint rules and fix the corresponding problems).

Roy worked all over the codebase, starting with some simple frontend fixes to Treeherder, moving on to fix a large number of nits in Perfherder’s alerts view. My personal favorite is the fact that we now paginate the list of alerts inside this view, which makes navigation waaaaay back into history possible:

[screenshot: alert pagination]

You can see a summary of their work at these links:

Thank you Shruti and Roy! You’ve helped to make sure Firefox (and Servo!) performance remains top-notch.

Mozilla

Quarter of Contribution: June / July 2016 edition

May 27th, 2016

Just wanted to announce that, once again, my team (Mozilla Engineering Productivity) is about to start running another quarter of contribution — a great opportunity for newer community members to dive deep on some of the projects we’re working on, brush up on their programming and problem solving skills, and work with experienced mentors. You can find more information on this program here.

I’ve found this program to be a great experience on both sides — it’s an opportunity for contributors to go beyond the “good first bug” style of patches to having a substantial impact on some of the projects that we’re working on, while gaining lots of software development skills that are useful in the real world.

Once again, I’m going to be mentoring one or two people on the Perfherder project, a tool we use to measure and sheriff Firefox performance. If you’re inclined to work on some really interesting data analysis and user interface problems in Python and JavaScript, please have a look at the project page and get in touch. :)

Mozilla, Perfherder

Are We Fast Yet and Perfherder

Mar 30th, 2016

Historically at Mozilla, we’ve had a bunch of different systems running to benchmark Firefox’s performance. The two most broadly-scoped are Talos (which runs as part of our build process, and emphasizes common real-world use cases, like page loading) and Are We Fast Yet (which runs separately, and emphasizes JavaScript performance and benchmarks).

As many of you know, most of my focus over the last year-and-a-bit has been developing a system called Perfherder, which aims to make monitoring and acting on performance data easier. A great introduction to Perfherder is my project of the month post.

The initial focus of Perfherder has been Talos, which is deeply integrated into our automation and also maintained by Engineering Productivity (my group). However, the intention was always to allow anyone in the Mozilla community to submit performance data for Firefox and sheriff it, much like Treeherder has supported the submission of test result data from third parties (e.g. autophone, Firefox UI tests). There are more commonalities than differences in how we do performance sheriffing with Are We Fast Yet (which currently has its own web interface) and Perfherder, so it made sense to see if we could pool resources.

So, over the last couple of months, Joel Maher and I have been in discussions with Hannes Verschore, current maintainer of Are We Fast Yet (AWFY), to see what could be done. It looks like it is possible for Perfherder to provide most of what AWFY needs, though there are a few exceptions. For the benefit of others, I thought it might be useful to outline what’s done, what’s coming next, and what might not be implemented (at least not any time soon).

What’s done

  • Get AWFY submitting data to Perfherder and allow it to be sheriffed separately from Talos. This is working on treeherder stage, and you can already examine the alert data.

What’s in progress (or in the near-term pipeline)

  • Allow custom alerting behaviour (bug 1254595). For example, we want to alert on subtests for AWFY while still summarizing the results. This is something we don’t currently support.
  • Allow creating an alert manually (bug 1260791). Sadly, our regression detection algorithm is not perfect. AWFY already supports this; we should too. This is something we also want for Talos.
  • Make regression-filing templates non-talos-specific (bug 1260805). We have a convenience template for filing bugs for performance regressions, but it is currently specific to various things about Talos (job running instructions, links to documentation, etc.). We should make it configurable so other projects like AWFY can take advantage of this functionality.

Under consideration

  • Some kind of support for bisecting a push to figure out which patch caused a regression. AWFY currently supports this, but it’s a fairly difficult thing to add to Perfherder (much of which is built upon Treeherder’s per-push result model). Maybe this is something we should do, but it would be a significant amount of effort.
  • Proprietary benchmarks: AWFY runs one benchmark whose results we can’t make public. Adding “private” jobs or results to Treeherder is likely a big can of worms, but it might be something we want to do eventually.

Probably won’t fix

  • Supporting comparative measurements between Firefox and other browsers. This is an important task, but it doesn’t really fit into the model of Perfherder, which is intimately tied to the revision data associated with Firefox. To do this would require detailed tracking of Chrome on the same basis, and I don’t think that’s really a place where we want to go. We should definitely monitor for general trends, but I think that is best done with a separate system.

Mozilla, Perfherder

Platform engineering project of the month: Perfherder

Mar 14th, 2016

[ originally posted on mozilla.dev.platform ]

Hello from Platform Engineering Operations! Once a month we highlight one of our projects to help the Mozilla community discover a useful tool or an interesting contribution opportunity.

This month’s project is Perfherder!

What is Perfherder?

Perfherder is a generic system for visualizing and analyzing performance data produced by the many automated tests we run here at Mozilla (such as Talos, “Are we fast yet?” or “Are we slim yet?”). The chief goal of the project is to make sure that the performance of Firefox gets better, not worse, over time. It does this by:

  • Tracking the performance data generated by our automated tests, allowing it to be visualized on a graph.
  • Providing a sheriffing dashboard which allows incoming alerts of performance regressions to be annotated and triaged: bugs can be filed based on a template and their resolution status can be tracked.

In addition to its own user interface, Perfherder also provides an API on the backend that other people can use to build custom performance visualizations and dashboards. For example, the metrics group has been working on a set of release quality indices for performance based on Perfherder data:

https://metrics.mozilla.com/quality-indices/
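For a rough idea of what using that API looks like, here’s a minimal Python sketch using the requests library. Note that the endpoint paths and parameter names below are assumptions for illustration only; see the wiki page linked later in this post for the real API documentation.

```python
# Hedged sketch of querying the Perfherder API; the endpoint paths and
# parameter names here are assumptions, not documented API.
import requests

BASE = "https://treeherder.mozilla.org/api/project/mozilla-central"

# Look up the series signatures available for a repository
# (hypothetical endpoint).
signatures = requests.get(BASE + "/performance/signatures/").json()

# Fetch the data points behind one signature hash (also hypothetical),
# e.g. to feed a custom dashboard or graph.
signature_hash = next(iter(signatures))
data = requests.get(
    BASE + "/performance/data/",
    params={"signatures": signature_hash},
).json()
print(len(data[signature_hash]), "data points for", signature_hash)
```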

How it works

Perfherder is part of Treeherder, building on that project’s existing support for tracking revision and test job information. Like the rest of Treeherder, Perfherder’s backend is written in Python, using the Django web framework. The user interface is written as an AngularJS application.

Learning more

For more information on Perfherder than you ever wanted to know, please see the wiki page:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Perfherder

Can I contribute?

Yes! We have had some fantastic contributions from the community to Perfherder, and are always looking for more. This is a great way to help developers make Firefox faster (or use less memory). The core of Perfherder is relatively small, so this is a great chance to learn either Django or AngularJS if you have a small amount of Python and/or JavaScript experience.

We have set aside a number of bugs that are suitable for getting started here:

https://bugzilla.mozilla.org/buglist.cgi?list_id=12722722&resolution=---&status_whiteboard_type=allwordssubstr&query_format=advanced&status_whiteboard=perfherder-starter-bug

For more information on contributing to Perfherder, please see the contribution section of the above wiki page:

https://wiki.mozilla.org/EngineeringProductivity/Projects/Perfherder#Contribution

Mozilla, Talos

Talos suites now visible from trychooser

Feb 13th, 2016

It’s a small thing, but I submitted a patch to trychooser last week which adds a tooltip indicating the actual Talos tests that are run by the various jobs you can schedule in a try push. It’s in production as of now.

Previously, the only way to find this out was to dig into the actual buildbot code, which was more than a little annoying.

If you think your patch might have a good chance of regressing performance, please do run the Talos tests before you check in. It’s much less work for all of us when these things are caught before integration, and backouts are no fun for anyone. We really need better documentation for this stuff, but meanwhile if you need help, please ask in the #perf channel on irc.mozilla.org.

zen

Albert Low

Feb 7th, 2016

I was saddened to find out last week that the person who introduced me to Zen practice three years ago, Albert Low, has passed away. Albert was the teacher of the Montreal Zen Center, which I was a member of for a brief period (6 months) in 2014 before I moved to Toronto and started practicing at the center here.

Albert’s instruction was the gateway to a practice that has had a profound impact on my life. More than anything, he helped me understand Zen as something that one could incorporate directly into daily life. I will remain forever grateful.

BIXI, Nixi

NIXI is moving too

Jan 8th, 2016

As my blog goes to github pages, so do my other side projects. I just moved nixi, my bike station finder project, to github pages. Its new location:

http://wlach.github.io/nixi

I opted not to move over the domain: it would have cost extra money, time and hassle, and I couldn’t justify it for the very, very small number of people that still use this site (yes, there are a few, including myself!). For now, nixi.ca will redirect to the github pages site until I decommission my linode server, probably at the end of January (end of February at the latest).

This transition brings some other changes with it:

  • Now using the citybik.es API directly, instead of proxying through an intermediary server (see the sketch after this list). This was necessitated by the switch to github pages, but I suspect this will be more reliable than what we were doing before. Thanks citybik.es!
  • Removed all analytics and facebook integration. As with the domain, it didn’t seem worth bringing over. Also, it’s nice to give people at least marginally more privacy than they had before where possible.
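For the curious, here’s roughly what talking to the citybik.es API directly looks like. This is a sketch under assumptions: the v2 endpoint layout and field names are from my reading of their documentation, and the network id is just an example.

```python
# Sketch of fetching live station data straight from the citybik.es v2 API.
# The network id ("bixi-montreal") and field names are illustrative.
import requests

resp = requests.get("http://api.citybik.es/v2/networks/bixi-montreal")
stations = resp.json()["network"]["stations"]

# Print the first few stations with their bike/dock availability.
for station in stations[:5]:
    print(station["name"], station["free_bikes"], station["empty_slots"])
```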

I still think nixi is worlds more usable than most bikesharing maps, even if it’s not an actively maintained project of mine any more. Here’s hoping it lasts many more years in its new incarnation.

Meta

New year, new blog

Jan 2nd, 2016

After thinking about doing it for longer than I’d like to admit, I finally bit the bullet and decided to migrate away from WordPress, towards a markdown-based blog generator (Frog in this case). All the content from the old blog is coming with me (thanks mostly to WordPress’s jekyll exporter plugin).

While WordPress is a pretty impressive piece of software, it isn’t the ideal platform for the sorts of things I want to express. It’s a reasonable tool for publishing straight longform essays, but my more interesting posts tend to also include images, code and examples, and sometimes even math. Making those look reasonable involved a bunch of manual effort and the end result wasn’t particularly great. I was especially disappointed in its (lack of) support for inline code snippets.

Perhaps this set of problems is resolvable by installing the right set of plugins. Perhaps. But therein lies my second problem with WordPress: it’s a big, complex piece of software written in PHP, and I’m frankly tired of figuring out how to (barely) make it do the things I need it to do, while half-worrying that the new fancy WPAwesome plugin I’m installing is malware.

As I’ve grown older, I’m increasingly realizing the limits to what I have the time (and energy) to accomplish. While “Making WordPress do the things I want” is something I could continue working on, it would come at the expense of other things that I find more rewarding, whether that be meditating, brushing up on deep learning, or even just writing more stuff here. I don’t expect this new blog to be maintenance free, but it should be an order of magnitude simpler using Frog, which is narrowly focused on my rather technical use case and specifically has great support for inline code, images, and math.

Along the same lines, I’m completely tired of maintaining the Linux server that my blog ran on. Registering domains and setting up my own HTTP server seemed like an interesting diversion in 2009, when cheap Linux VPSes were first starting to appear on the market. These days… well, not so much. It’s a minor, though not completely trivial, expense ($10 USD/mo.), but more importantly it’s a sink of my time to install security patches, make sure things are up to date, etc. It feels like I’m solving the same (boring) set of problems over and over, with no real payoff. Time to move on.

Thus, this blog (along with my other hosted projects, like NIXI and meditation) will be moving to github pages. Initially I had the worry that this move would mean that I wouldn’t be “in control of my own destiny”, but on reflection I don’t think that’s true. The fact that my blog is basically a giant git repository should make switching hosting providers quite easy if Github becomes unsatisfactory for whatever reason.

Indeed, even the custom domain (wrla.ch) seems unnecessary at this point. Although github pages does support them, I’m just not seeing the value in keeping it around. What purpose does it really serve? All a custom personal domain really says to me is that the person had the time/money to register it. Is that something that someone in my position really needs to communicate? And if I don’t need it, why continue with the unnecessary expense and hassle?

Perhaps the only legitimate reason to keep the domain would be continuity for readers (i.e. there’s a link or two in their browser history), but I don’t think that’s a big deal in my case. Yes, people might occasionally be thrown off and have to use Yahoo/Google to re-find something… but for the type of content I host, I don’t think that will take too much collective time. In the grand stream of things, I’m pretty small potatoes. Most of my traffic just comes through planet.mozilla.org, and that’s easy to redirect automatically.

So though I’ll be keeping around wrla.ch for a little bit to give people time to migrate their links (it doesn’t expire until the end of February 2016), it will also be going away. Please redirect your feed readers to wlach.github.io.

Now, onto more interesting things!

Mozilla, Perfherder

Perfherder: Onward!

Nov 4th, 2015

In addition to the database refactoring I mentioned a few weeks ago, some cool stuff has been going into Perfherder lately.

Tracking installer size

Perfherder is now tracking the size of the Firefox installer for the various platforms we support (bug 1149164). I originally only intended to track Android .APK size (on request from the mobile team), but installer sizes for other platforms came along for the ride. I don’t think anyone will complain.

[screenshot: installer size graph in Perfherder]

Just as exciting to me as the feature itself is how it’s implemented: I added a log parser to treeherder which just picks up lines starting with “PERFHERDER_DATA” in the logs, followed by specially formatted JSON data, and then automatically stores whatever metrics are in there in the database (platform, options, etc. are determined automatically). For example, on Linux:

PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"subtests": [{"name": "libxul.so", "value": 99030741}], "name": "installer size", "value": 55555785}]}

This should make it super easy for people to add their own metrics to Perfherder for build and test jobs. We’ll have to be somewhat careful about how we do this (we don’t want to add thousands of new series with irrelevant / inconsistent data) but I think there’s lots of potential here to be able to track things we care about on a per-commit basis. Maybe build times?
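To make the mechanism concrete, here’s a minimal sketch of what a build or test script could do to emit such a line; the suite and subtest values below just mirror the example above.

```python
# Sketch: emitting a PERFHERDER_DATA line from a job script so that the
# treeherder log parser picks it up. Values mirror the example above.
import json

perf_data = {
    "framework": {"name": "build_metrics"},
    "suites": [
        {
            "name": "installer size",
            "value": 55555785,  # summary value for the suite
            "subtests": [
                {"name": "libxul.so", "value": 99030741},
            ],
        }
    ],
}

# The log parser keys on this exact prefix in the job log.
print("PERFHERDER_DATA: " + json.dumps(perf_data))
```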

More compare view improvements

I added filtering to the Perfherder compare view and added links back to the graphs view. Filtering should make it easier to highlight particular problematic tests in bug reports, etc. The links to the graphs view shouldn’t really be necessary, but unfortunately are due to the unreliability of our data — sometimes you can only see whether a particular difference between two revisions is worth paying attention to in the context of the numbers over the last several weeks.

[screenshot: compare view with filtering]

Miscellaneous

Even after the summer of contribution has ended, Mike Ling continues to do great work. Looking at the commit log over the past few weeks, he’s been responsible for the following fixes and improvements:

  • Bug 1218825: Can zoom in on perfherder graphs by selecting the main view
  • Bug 1207309: Disable ‘<’ button in test chooser if no test selected
  • Bug 1210503: Include non-summary tests in main comparison view
  • Bug 1153956: Persist the selected revision in the url on perfherder (based on earlier work by Akhilesh Pillai)

Next up

My main goal for this quarter is to create a fully functional interface for actually sheriffing performance regressions, to replace alertmanager. Work on this has been going well. More soon.

[screenshot: work-in-progress performance sheriffing interface]

Mozilla, Perfherder, SQL

The new old Perfherder data model

Oct 23rd, 2015

I spent a good chunk of time last quarter redesigning how Perfherder stores its data internally. Here are some notes on this change, for posterity.

Perfherder’s data model is based around two concepts:

  1. Series signatures: A unique set of properties (platform, test name, suite name, options) that identifies a performance test.
  2. Series data: A set of measurements for a series signature, indexed by treeherder push and job information.
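To illustrate the two concepts (with approximate property names, not Perfherder’s exact schema), you can think of a signature as a stable hash over the identifying properties, with the measurements stored against that hash:

```python
# Illustrative sketch only; property names are approximate rather than
# Perfherder's exact schema.
import hashlib
import json

# A unique set of properties identifying one performance test...
signature_properties = {
    "platform": "linux64",
    "suite": "tp5o",
    "test": "summary",
    "options": "opt",
}
signature = hashlib.sha1(
    json.dumps(signature_properties, sort_keys=True).encode("utf-8")
).hexdigest()

# ...and a set of measurements for that signature, indexed by
# treeherder push and job information.
series_data = [
    {"push_id": 1001, "job_id": 555001, "value": 323.5},
    {"push_id": 1002, "job_id": 555129, "value": 321.2},
]
print(signature, len(series_data))
```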

When it was first written, Perfherder stored the second type of data as a JSON-encoded series in a relational (MySQL) database. That is, instead of storing each datum as a row in the database, we would store sequences of them. The assumption was that for the common case (getting a bunch of data to plot on a graph), this would be faster than fetching a bunch of rows and then encoding them as JSON. Unfortunately this wasn’t really true, and it had some serious drawbacks besides.

First, the approach’s performance was awful when it came time to add new data. To avoid needing to decode or download the full stored series when you wanted to render only a small subset of it, we stored the same series multiple times over various time intervals. For example, we stored the series data for one day, one week… all the way up to one year. You can probably see the problem already: you had to decode and re-encode the same data structure once per time interval for every new performance datum inserted into the database. The pseudocode looked something like this for each push:

for each platform we're testing talos on:
  for each talos job for the platform:
    for each test suite in the talos job:
      for each subtest in the test suite:
        for each time interval in one year, 90 days, 60 days, ...:
          fetch and decode json series for that time interval from db
          add datapoint to end of series
          re-encode series as json and store in db

Consider that we have some 6 platforms (android, linux64, osx, winxp, win7, win8), 20ish test suites with potentially dozens of subtests… and you can see where the problems begin.

In addition to being slow to write, this was also a pig in terms of disk space consumption. The overhead of JSON (“{, }” characters, object properties) really starts to add up when you’re storing millions of performance measurements. We got around this (sort of) by gzipping the contents of these series, but that still left us with gigantic mysql replay logs as we stored the complete “transaction” of replacing each of these series rows thousands of times per day. At one point, we completely ran out of disk space on the treeherder staging instance due to this issue.

Read performance was also often terrible for many common use cases. The original assumption I mentioned above was wrong: rendering points on a graph is only one use case a system like Perfherder has to handle. We also want to be able to get the set of series values associated with two result sets (to render comparison views) or to look up the data associated with a particular job. We were essentially indexing the performance data on only a single dimension (time), which made these other types of operations unnecessarily complex and slow — especially as the data you wanted to look up aged. For example, to look up a two-week-old comparison between two pushes, you’d also have to fetch the data for every subsequent push. That’s a lot of unnecessary overhead when you’re rendering a comparison view with 100 or so different performance tests:

[screenshot: compare view]

So what’s the alternative? It’s actually the most obvious thing: just encode one database row per performance series value and create indexes on each of the properties that we might want to search on (repository, timestamp, job id, push id). Yes, this is a lot of rows (the new database stands at 48 million rows of performance data, and counting) but you know what? MySQL is designed to handle that sort of load. The current performance data table looks like this:

+----------------+------------------+
| Field          | Type             |
+----------------+------------------+
| id             | int(11)          |
| job_id         | int(10) unsigned |
| result_set_id  | int(10) unsigned |
| value          | double           |
| push_timestamp | datetime(6)      |
| repository_id  | int(11)          |
| signature_id   | int(11)          |
+----------------+------------------+

MySQL can store each of these rows very efficiently. I haven’t done exact calculations, but a back-of-the-envelope count (five 4-byte INTs, an 8-byte DOUBLE, and an 8-byte DATETIME(6)) puts the field data at roughly 36 bytes, well under 50 bytes per row. Including indexes, the complete set of performance data going back to last year clocks in at 15 gigs. Not bad. And we can examine this data structure across any combination of dimensions we like (push, job, timestamp, repository), making common queries to Perfherder very fast.
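Since Perfherder’s backend is Django, that table corresponds conceptually to a model like the following. This is a sketch with simplified names and relations, not the actual Treeherder source:

```python
# Conceptual Django model for the per-datum table above; field and model
# names are simplified, not the actual Treeherder source.
from django.db import models

class PerformanceDatum(models.Model):
    repository = models.ForeignKey("Repository", on_delete=models.CASCADE)
    signature = models.ForeignKey("PerformanceSignature",
                                  on_delete=models.CASCADE)
    job_id = models.PositiveIntegerField(db_index=True)
    result_set_id = models.PositiveIntegerField(db_index=True)
    value = models.FloatField()
    push_timestamp = models.DateTimeField(db_index=True)

    class Meta:
        # Index the dimensions we commonly query across, so lookups by
        # push, job, or time range stay fast as the table grows.
        index_together = [
            ("repository", "signature", "push_timestamp"),
        ]
```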

What about the initial assumption, that it would be faster to get a series out of the database if it’s already pre-encoded? Nope, not really. If you have a good index and you’re only fetching the data you need, the overhead of encoding a bunch of database rows to JSON is pretty minor. From my (remote) location in Toronto, I can fetch 30 days of tcheck2 data in 250 ms. Almost certainly most of that is network latency. If the original implementation was faster, it’s not by a significant amount.

[screenshot: 30 days of tcheck2 data in the graphs view]

Lesson: Sometimes using ancient technologies (SQL) in the most obvious way is the right thing to do. DoTheSimplestThingThatCouldPossiblyWork