[ For more information on the Eideticker software I’m referring to, see this entry ]
I just put up a proof of concept Eideticker dashboard for FirefoxOS here. Right now it has two days worth of data, manually sampled from an Unagi device running b2g18. Right now there are two tests: one the measures the “speed” of the contacts application scrolling, another that measures the amount of time it takes for the contacts application to be fully loaded.
For those not already familiar with it, Eideticker is a benchmarking suite which captures live video data coming from a device and analyzes it to determine performance. This lets us get data which is more representative of actual user experience (as opposed to an oft artificial benchmark). For example, Eideticker measures contacts startup as taking anywhere between 3.5 seconds and 4.5 seconds, versus than the 0.5 to 1 seconds that the existing datazilla benchmarks show. What accounts for the difference? If you step through an eideticker-captured video, you can see that even though something appears very quickly, not all the contacts are displayed until the 3.5 second mark. There is a gap between an app being reported as “loaded” and it being fully available for use, which we had not been measuring until now.
At this point, I am most interested in hearing from FirefoxOS developers on new tests that would be interesting and useful to track performance of the system on an ongoing basis. I’d obviously prefer to focus on things which have been difficult to measure accurately through other means. My setup is rather fiddly right now, but hopefully soon we can get some useful numbers going on an ongoing basis, as we do already for Firefox for Android.
Okay, remember last time when I said I was going to continue my “sham of a human existence” and not commit to a Zen practice? Well, I came back to the idea sooner than I thought: the experience was just too compelling for me not to do some further exploration. In some strange coincidence, Hacker News had a great thread on meditation just after I wrote my last blog entry, where a few people recommended a book called Mindfulness in Plain English. I figured doing meditation at home didn’t involve any kind of huge commitment (don’t like it? just stop!), so I decided to order it online and give it a try.
Mindfulness in Plain English is really fascinating stuff. It describes how to do a type of Vipassana (insight) meditation, which is practiced with a great deal of ritual in places like Thailand, India, and Sri Lanka. The book however, strips out most of the ritual and just gives you a set of techniques that is quite accessible for a (presumably) western audience. From what I can gather though, it seems like the goal of Vipassana is quite similar to that of Zen (enlightenment; release from attachment and dualism), though the methods and rituals around it are quite different (e.g. there are no koans). Perhaps it’s akin to the difference between GIMP and Photoshop: as those two programs are both aimed at the manipulation of images, both Vipassana and Zen are aimed at the manipulation of the mind. There are differences in the script of how to do so, but the overarching purpose is the same.
Regardless, the portion of the C method that the book describes is almost exactly that which I tried at the Zen workshop: sit still and pay attention to your breathing. There’s a few minor distinctions in terms of the suggested posture (the book recommends either sitting cross legged or in a lotus position vs. the kneeling posture I learnt at the workshop) and the focal point (Mindfulness recommends the tip of the nostrils). But essentially it’s the same stuff. Focus on the breath — counting it if necessary, rince, repeat.
As I mentioned before, this is actually really hard to do properly for being simple in concept. The mind keeps wandering and wandering on all sorts of tangents: plans, daydreams, even thoughts about the meditation itself. Where I found Mindfulness in Plain English helpful was in the advice it gave for dealing with this “monkey mind” phenomenon. The subject is dealt with throughout the book (with two chapters on it and nothing else), but all the advice boils down to “treat it as part of the meditation”. Don’t try to avoid it, just treat it as something to be aware of in the same way as breathing. Then once you have acknowledged it, move the attention back to the breath.
Mindfulness, as far as I can gather, is simply non-judgemental awareness of what we are doing (and what we are supposed to be doing). Every time a distraction is noticed, felt, and understood, you’ve just experienced some approximation of the end goal of the meditation. Like it is with other things (an exercise regimen, learning to play a musical instrument), every small victory should push you further and the path to where you want to go. With enough practice, it might just become part of your day-to-day experience.
Or so I’m told by the book. Up to now, I haven’t enjoyed any longlasting effects from meditation aside from (possibly?) a bit more mental clarity in my day-to-day tasks. But I’ve found the practice to be extremely interesting both from the point of view of understanding my own thought, as well as being rather relaxing in and of itself. So while I’m curious as to what comes next, I am happy enough with things as they are in the present. I’m planning to continue to meditate (20–30 minutes a day, 6 days a week), but also delve a bit deeper into the details and history of Zen and Vipasanna. More updates as appropriate.
Another update on getting Eideticker working with FirefoxOS. Once again this is sort of high-level, looking forward to writing something more in-depth soon now that we have the basics working.
I finally got the last kinks out of the rig I was using to capture live video from FirefoxOS phones using the Point Grey devices last week. In order to make things reasonable I had to write some custom code to isolate the actual device screen from the rest of capture and a few other things. The setup looks interesting (reminds me a bit of something out of the War of the Worlds):
Here’s some example video of a test I wrote up to measure the performance of contacts scrolling performance (measured at a very respectable 44 frames per second, in case you wondering):
Surprisingly enough, I didn’t wind up having to write up any code to compensate for a noisy image. Of course there’s a certain amount of variance in every frame depending on how much light is hitting the camera sensor at any particular moment, but apparently not enough to interfere with getting useful results in the tests I’ve been running.
Likely next step: Create some kind of chassis for mounting both the camera and device on a permanent basis (instead of an adhoc one on my desk) so we can start running these sorts of tests on a daily basis, much like we currently do with Android on the Eideticker Dashboard.
As an aside, I’ve been really impressed with both the Marionette framework and the gaiatests python module that was written up for FirefoxOS. Writing the above test took just 5 minutes — and the code is quite straightforward. Quite the pleasant change from my various efforts in Android automation.
One of my frustrations with the Linux desktop is the lack of an email client that’s in the same league as GMail or Apple’s mail.app. Thunderbird is ok as far as it goes (I use it for my day-to-day Mozilla correspondence) but I miss having a decent conversation view of email (yes, I tried the conversation view extension — while impressive in some ways, it ultimately didn’t work particularly well for me) and the search functionality is rather slow and cumbersome. I’d like to be optimistic about these problems being fixed at some point… but after nearly 2 years of using the product without much visible improvement my expectation of that happening is rather low.
The Yorba non-profit recently started a fundraiser to work on the next edition of Geary, an email client which I hope will fill the niche that I’m talking about. It’s pretty rough around the edges still, but even at this early stage the conversation view is beautiful and more or less exactly what I want. The example of Shotwell (their photo management application) suggests that they know a thing or two about creating robust and useable software, not a common thing in this day and age. In any case, their pitch was compelling enough for me to donate a few dollars to the cause. If you care about having a great email experience that is completely under your control (and not that of an advertising or product company with their own agenda), then maybe you could too?
So for a bit of a departure from the usual technical content, a personal anecdote. I went to the Montreal Zen Center today for a workshop, which was a most illuminating experience. I’d been pretty fascinated with the idea of zen for a while (see this post of mine from 2006, for example) but was pretty stuck on how to put it into practice (aside from being sure it was something you had to live). So, this was a step in that direction. After having gone to it, I wouldn’t say I’ve figured anything out (in fact I’m more confused than ever), but I would say one thing with conviction: this is the way to learn more.
It was pretty simple stuff: exactly how they describe on the web page I linked to. A short verbal introduction on some of the ideas of zen, then a tea break, then instruction on how to begin practising meditation, another tea break (this time with biscuits), then actually practising meditation, then question & answer about the meditation. It doesn’t really sound like much, and it wasn’t. But nonetheless I can’t stop thinking about the experience.
As far as I can gather, the “revelation” offered by Zen Buddhism is simple: our existence as separate, unique beings is an illusion of the mind. This illusion makes us suffer. However, it is possible with practice to overcome this illusion and realize your true nature as being one with the world. I’m probably butchering it a little bit by writing about it in this way, to a certain extent that’s me, but in another way it’s rather unavoidable since in a way the concepts are beyond words (since words imply a dualism). Regardless, the important thing isn’t to grasp zen intellectually, but to come to a natural understanding through the practice of meditation (aka “the practice”).
And on that note, the meditation is austere and almost certainly less than you’d expect. There is no prayer and very little ritual. Just a very minimal breath counting exercise conducted in a seated posture for 20 minutes, followed by a short walking exercise that lasts 5 minutes, then repeating the breath counting exercise for another 20 minutes. For its utter simplicity, I found it incredibly difficult. I imagine like anything with weeks, months, years of practice it (and the variations of it that experienced practitioners use where they meditate on koans) it would become easier.
I’m still giving thought on whether I want to take the next steps with them and begin a regular meditation practice. It sounds like really hard work (self meditation practice 6 days a week by yourself, plus regular visits to the zen center), which brings up the question: why do you want to do this? There’s a weird contradiction between realizing that you as a self don’t really exist and committing yourself radically to this kind of practice. The only thing I can call it would be a “leap of faith”. My current thinking is that I’m not quite ready for that right now, but maybe in a while. For now I think I’m pretty happy going to yoga a few times a week and living my sham of a human existence. 😉
Last summer I wrote a bit about using Eideticker to measure the relative performance of Firefox for Android versus other browsers (Chrome, stock, etc.). At the time I was pretty optimistic about Eideticker’s usefulness as a truly “objective” measure of user experience that would give us a more accurate view of how we compared against the competition than traditional benchmarking suites (which more often than not, measure things that a user will never see normally when browsing the web). Since then, there’s been some things that I’ve discovered, as well as some developments in terms of the “state of the art” in mobile browsing that have caused me to reconsider that view — while I haven’t given up entirely on this concept (and I’m still very much convinced of eideticker’s utility as an internal benchmarking tool), there’s definitely some limitations in terms of what we can do that I’m not sure how to overcome.
Essentially, there are currently three different types of Eideticker performance tests:
- Animation tests: Measure the smoothness of an animation by comparing frames and seeing how many are different. Currently the only example of this is the canvas “clock” test, but many others are possible.
- Startup tests: Measure the amount of time it takes from when the application is launched to when the browser is fully running/available. There are currently two variants of this test in the dashboard, both measure the amount of time taken to fully render Firefox’s home screen (the only difference between the two is whether the browser profile is fully initialized). The dirty profile benchmark probably most closely resembles what a user would usually experience.
- Scrolling tests: Measure the amount of undrawn areas when the user is panning a website. Most of the current eideticker tests are of this kind. A good example of this is the taskjs benchmark.
As it turns out, it’s unfortunately been rather difficult to create truly objective tests which measure the difference between browsers in these two categories. I’ll go over them in order.
There are essentially two types of startup tests: one where you measure the amount of time to get to the browser’s home screen when you explicitly launch the app (e.g. by pressing the Firefox icon in the app chooser), another where you load a web page in a browser from another app (e.g. by clicking on a link in the Twitter application).
The first is actually fairly easy to test across browsers, although we are not currently. There’s not really a good reason for that, it was just an oversight, so I filed bug 852744 to add something like this.
The second case (startup to the browser’s homescreen) is a bit more difficult. The problem here is that, in a nutshell, an apples to apples comparison is very difficult if not impossible simply because different browsers do different things when the user presses the application icon. Here’s what we see with Firefox:
And here’s what we see with Chrome:
And here’s what we see with the stock browser:
As you can see Chrome and the stock browser are totally different: they try to “restore” the browser back to its state from the last time (in Chrome’s case, I was last visiting taskjs.org, in Stock’s case, I was just on the homepage).
Personally I prefer Firefox’s behaviour (generally I want to browse somewhere new when I press the icon on my phone), but that’s really beside the point. It’s possible to hack around what chrome is doing by restoring the profile between sessions to some sort of clean “new tab” state, but at that point you’re not really reproducing a realistic user scenario. Sure, we can draw a comparison, but how valid is it really? It seems to me that the comparison is mostly only useful in a very broad “how quickly does the user see something useful” sense.
At the time that I wrote these tests, most browsers displayed regions that were not yet drawn while panning using the page background. We figured that it would thus be possible to detect regions of the page which were not yet drawn by looking for the background color while initiating a panning action. I thus hacked up existing web pages to have a magenta background, then wrote some image analysis code to detect regions that were that color (under the assumption that magenta is only rarely seen in webpages). It worked pretty well.
The world’s moved on a bit since I wrote that: modern browsers like Chrome and Firefox use something like progressive drawing to display a lower resolution “tile” where possible in this case, so the user at least sees something resembling the actual page while panning on a slower device. To see what I mean, try visiting a slow-to-render site like taskjs.org and try panning down quickly. You should see something like this (click to expand):
Unfortunately, while this is certainly a better user experience, it is not so easy to detect and measure. For Firefox, we’ve disabled this behaviour so that we see the old checkerboard pattern. This is useful for our internal measurements (we can see both if our drawing code as well as our heuristics about when to draw are getting better or worse over time) but it only works for us.
If anyone has any suggestions on what to do here, let me know as I’m a bit stuck. There are other metrics we could still compare against (i.e. how smooth is the panning animation aka frames per second?) but these aren’t nearly as interesting.
Just wanted to give a quick heads up that as part of the ateam’s ongoing effort to improve the documentation of our automated testing infrastructure, we now have online documentation for mozdevice, the python library we use for interacting with Android- and FirefoxOS-based devices in automated testing.
Mozdevice is used in pretty much every one of our testing frameworks that has mobile support, including mochitest, reftest, talos, autophone, and eideticker. Additionally, mozdevice is used by release engineering to clean up, monitor, and otherwise manage
our hundred-odd the 1200*** tegra and panda development boards that we use in tbpl. See sut_tools (old, buildbot-based, what we currently use) and mozpool (the new and shiny future).
- Thanks to Dustin Mitchell for the correction.
Quick update on my last post about finding some kind of camera suitable for use in automated performance testing of fennec and b2g with eideticker. Shortly after I wrote that, I found out about a company called Point Grey Research which manufactures custom web cameras intended for exactly the sorts of things we’re doing. Initial results with the Flea3 camera I ordered from them are quite promising:
(the actual capture is even higher quality than that would suggest… some detail is lost in the conversion to webm)
There seems to be some sort of issue with the white balance in that capture which is causing a flashing effect (I suspect this just involves flipping some kind of software setting… still trying to grok their SDK), and we’ll need to create some sort of enclosure for the setup so ambient light doesn’t interfere with the capture… but overall I’m pretty optimistic about this baby. 60 frames per second, very high resolution (1280×960), no issues with HDMI (since it’s just a USB camera), relatively inexpensive.
[ For more information on the Eideticker software I’m referring to, see this entry ]
Ok, so as I mentioned last time I’ve been looking into making Eideticker work for devices without native HDMI output by capturing their output with some kind of camera. So far I’ve tried four different DSLRs for this task, which have all been inadequate for different reasons. I was originally just going to write an email about this to a few concerned parties, but then figured I may as well structure it into a blog post. Maybe someone will find it useful (or better yet, have some ideas.)
This is the device I mentioned last time. Easy to set up, plays nicely with the Decklink capture card we’re using for Eideticker. It seemed almost perfect, except for that:
- The image quality is really bad (beaten even by $200 standard digital camera). Tons of noise, picture quality really bad. Not *necessarily* a deal breaker, but it still sucks.
- More importantly, there seems to be no way of turning off the auto white balance adjustment. This makes automated image analysis impossible if the picture changes, as is highlighted in this video:
Canon Rebel T4i
This is the first camera that was recommended to me at the camera shop I’ve been going to. It does have an HDMI output signal, but it’s not “clean”. This blog post explains the details better than I could. Next.
Supposedly does native clean 720p output, but unfortunately the output is in a “box” so isn’t recognized by the Decklink cards that we’re using (which insist on a full 1280×720 HDMI signal to work). Next.
It is possible to configure this one to not put a box around the output, so the Decklink card does recognize it. Except that the camera shuts off the HDMI signal whenever the input parameters change on the card or the signal input is turned on, which essentially makes it useless for Eideticker (this happens every time we start the Eideticker harness). Quite a shame, as the HDMI signal is quite nice otherwise.
To be clear, with the exception of the Elmo all the devices above seem like fine cameras, and should more than do for manual captures of B2G or Android phones (which is something we probably want to do anyway). But for Eideticker, we need something that works in automation, and none of the above fit the bill. I guess I could explore using a “real” video camera as opposed to a DSLR acting like one, though I suspect I might run into some of the same sorts of issues depending on how the HDMI output of those devices behaves.
Part of me wonders whether a custom solution wouldn’t work better. How complicated could it be to construct your own digital camera anyway? 😉 Hook up a fancy camera sensor to a pandaboard, get it to output through the HDMI port, and then we’re set? Or better yet, maybe just get a fancy webcam like the Playstation Eye and hook it up directly to a computer? That would eliminate the need for our expensive video capture card setup altogether.
[ For more information on the Eideticker software I’m referring to, see this entry ]
Here’s a long overdue update on where we’re at with Eideticker for FirefoxOS. While we’ve had a good amount of success getting useful, actionable data out of Eideticker for Android, so far we haven’t been able to replicate that success for FirefoxOS. This is not for lack of trying: first, Malini Das and then me have been working at it since summer 2012.
When it comes right down to it, instrumenting Eideticker for B2G is just a whole lot more complex. On Android, we could take the operating system (including support for all the things we needed, like HDMI capture) as a given. The only tricky part was instrumenting the capture so the right things happened at the right moment. With FirefoxOS, we need to run these tests on entire builds of a whole operating system which was constantly changing. Not nearly as simple. That said, I’m starting to see light at the end of the tunnel.
We initially selected the pandaboard as the main device to use for eideticker testing, for two reasons. First, it’s the same hardware platform we’re targeting for other b2g testing in tbpl (mochitest, reftest, etc.), and is the platform we’re using for running Gaia UI tests. Second, unlike every other device that we’re prototyping FirefoxOS on (to my knowledge), it has HDMI-out capability, so we can directly interface it with the Eideticker video capture setup.
However, the panda also has some serious shortcomings. First, it’s obviously not a platform we’re shipping, so the performance we’re seeing from it is subject to different factors that we might not see with a phone actually shipped to users. For the same reason, we’ve had many problems getting B2G running reliably on it, as it’s not something most developers have been hacking on a day to day basis. Thanks to the heroic efforts of Thomas Zimmerman, we’ve mostly got things working ok now, but it was a fairly long road to get here (several months last fall).
More recently, we became aware of something called an Elmo which might let us combine
the best of both worlds. An elmo is really just a tiny mounted video camera with a bunch of outputs, and is I understand most commonly used to project documents in a classroom/presentation setting. However, it seems to do a great job of capturing mobile phones in action as well.
The nice thing about using an external camera for the video capture part of eideticker is that we are no longer limited to devices with HDMI out — we can run the standard set of automated tests on ANYTHING. We’ve already used this to some success in getting some videos of FirefoxOS startup times versus Android on the Unagi (a development phone that we’re using internally) for manual analysis. Automating this process may be trickier because of the fact that the video capture is no longer “perfect”, but we may be able to work around that (more discussion about this later).
FirefoxOS web page tests
These are the same tests we run on Android. They should give us an idea of roughly where our performance when browsing / panning web sites like CNN. So far, I’ve only run these tests on the Pandaboard and they are INCREDIBLY slow (like 1–3 frames per second when scrolling). So much so that I have to think there is something broken about our hardware acceleration on this platform.
FirefoxOS application tests
These are some new tests written in a framework that allows you to script arbitrary interactions in FirefoxOS, like launching applications or opening the task switcher.
I’m pretty happy with this. It seems to work well. The only problems I’m seeing with this is with the platform we’re running these tests on. With the pandaboard, applications look weird (since the screen resolution doesn’t remotely resemble the 320×480 resolution on our current devices) and performance is abysmal. Take, for example, this capture of application switching performance, which operates only at roughly 3–4 fps:
So what now?
I’m not 100% sure yet (partly it will depend on what others say as well as my own investigation), but I have a feeling that capturing video of real devices running FirefoxOS using the Elmo is the way forward. First, the hardware and driver situation will be much more representative of what we’ll actually be shipping to users. Second, we can flash new builds of FirefoxOS onto them automatically, unlike the pandaboards where you currently either need to manually flash and reset (a time consuming and error prone process) or set up an instance of mozpool (which I understand is quite complicated).
The main use case I see with eideticker-on-panda would be where we wanted to run a suite of tests on checkin (in tbpl-like fashion) and we’d need to scale to many devices. While cool, this sounds like an expensive project (both in terms of time and hardware) and I think we’d do better with getting something slightly smaller-scale running first.
So, the real question is whether or not the capture produced by the Elmo is amenable to the same analysis that we do on the raw HDMI output. At the very least, some of eideticker’s image analysis code will have to be adapted to handle a much “noisier” capture. As opposed to capturing the raw HDMI signal, we now have to deal with the real world and its irritating fluctuations in ambient light levels and all that the rest. I have no doubt it is *possible* to compensate for this (after all this is what the human eye/brain does all the time), but the question is how much work it will be. Can’t speak for anyone else at Mozilla, but I’m not sure if I really have the time to start a Ph.D-level research project in computational vision. 😉 I’m optimistic that won’t be necessary, but we’ll just have to wait and see.