Introducing run-perf-tests and Adding Performance Bots
Cited from my webkit-dev post
Executive Summary
- I've added Tools/Scripts/run-perf-tests; try it out
- Please add --no-timeout and --timeout options to your DRT to support run-perf-tests
- Perf-o-matic coming on webkit-perf.appspot.com, a clone of graphs.mozilla.org
- Chromium Mac perf bots coming on build.webkit.org
- Use PerformanceTests/Parser/resources/runner.js to write new performance tests
Background
We have some performance tests in PerformanceTests, but they're not run by any bots. In fact, there are no performance bots at all on build.webkit.org. While Chromium has perf bots, we can only see progressions and regressions triggered by WebKit changes when Chromium gets a WebKit roll (pulling a newer version of WebKit), which happens only a handful of times a day. That doesn't scale to the rate at which we're making changes to WebKit, and the visibility and usability of those bots are not great for non-Chromium WebKit contributors. Furthermore, Chromium perf bots will not catch JSC progressions and regressions at all.
Means to Run Performance Tests
I've added Tools/Scripts/run-perf-tests to run PerformanceTests in DRT, based on the work Ilya Tikhonovsky (loislo) has done for run-inspector-perf-tests.py.
The script aims to run performance tests both locally and on bots, similar to the way run-webkit-tests works, and it runs on the Mac (WebKit1) and Chromium ports. Please try it out and give me feedback (you can file a bug with "run-perf-tests:" in the summary and cc me).
I didn't merge it into run-webkit-tests because performance tests don't pass or fail; instead, they give us values that fluctuate over time. While Chromium takes the approach of hard-coding the range of acceptable values, that approach has a high maintenance cost and is prone to problems such as having to widen the range periodically as the score slowly degrades over time. Also, as you can see on the Chromium perf bots, test results tend to fluctuate a lot, so hard-coding a tight range of acceptable values is tricky.
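To make the problem with hard-coded ranges concrete, here is a minimal Python sketch. It is not taken from Chromium's or WebKit's tooling: the range, the 2% per-build drift, and the helper names are all made-up assumptions for illustration. Every individual result stays inside the accepted range, yet the cumulative slowdown is large.

```python
# Hypothetical accepted range for a test's run time in milliseconds (assumption).
ACCEPTED_RANGE = (90.0, 130.0)


def within_range(value, bounds):
    """Return True when a single result falls inside the hard-coded bounds."""
    low, high = bounds
    return low <= value <= high


def relative_change(history):
    """Return the fractional change between the first and last results."""
    return (history[-1] - history[0]) / history[0]


# Made-up history: the test gets 2% slower with every build.
history = [100.0 * 1.02 ** build for build in range(10)]

all_pass = all(within_range(value, ACCEPTED_RANGE) for value in history)
print("every build passes the range check:", all_pass)
print("total slowdown: {:.1%}".format(relative_change(history)))
```

A check based on a fixed range reports nothing here, which is why the range ends up being adjusted by hand instead of flagging the drift.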
Unlike run-webkit-tests, run-perf-tests doesn't generate any HTML or JSON files to summarize the results by default, since the only output you get from performance tests is the time taken to run them or their scores, which are already reported on stdout. The output of run-perf-tests is designed to be compatible with Chromium perf bots, but we can easily change that to something more human-friendly if people are so inclined. The script optionally generates a JSON file to be used by perf bots.
In order for other ports (e.g. Windows, Qt, GTK) to support run-perf-tests, their respective DRTs simply need to support a --no-timeout option that disables the watchdog timer. This is necessary as some performance tests take a long time to run. Also, we'd appreciate it if you could add a --timeout option, per WebKit bug 76662, for code sanity.
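The intended behavior of the two options can be sketched in Python as follows. This is not DRT code (DRT is a native program, and each port parses its own arguments); the function names, the millisecond unit for --timeout, and the 30000 ms default are assumptions made purely for illustration.

```python
import argparse
import subprocess
import sys


def parse_args(argv):
    """Parse the two watchdog-related options of a hypothetical test driver."""
    parser = argparse.ArgumentParser(description="Hypothetical test driver")
    parser.add_argument("--no-timeout", action="store_true",
                        help="disable the watchdog timer entirely")
    parser.add_argument("--timeout", type=int, default=30000,
                        help="watchdog timeout in milliseconds (assumed unit and default)")
    return parser.parse_args(argv)


def watchdog_seconds(options):
    """Return the watchdog interval in seconds, or None when it is disabled."""
    if options.no_timeout:
        return None
    return options.timeout / 1000.0


def run_test(command, options):
    """Run one test command and report TIMEOUT if the watchdog fires."""
    try:
        completed = subprocess.run(command, timeout=watchdog_seconds(options),
                                   capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        return "TIMEOUT"
    return completed.stdout


if __name__ == "__main__":
    options = parse_args(["--no-timeout"])
    print(run_test([sys.executable, "-c", "print('done')"], options))
```

With --no-timeout the watchdog never fires, so a long-running performance test is allowed to finish; with --timeout the limit becomes configurable instead of being fixed in the driver.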
Adding Performance Bots
In the next couple of days, I'm going to post a patch to add a Chromium Mac Perf bot to build.webkit.org (of course, upon appropriate reviews) that runs run-perf-tests and uploads a JSON file to webkit-perf.appspot.com, a clone of graphs.mozilla.org.
While we could have adopted Chromium's perf bot output, where each slave generates a JSON file with an HTML front end that loads the JSON, that approach didn't scale well for Chromium when the number of historical values stored on each slave soared and the size of the JSON increased proportionally over time. Furthermore, it's hard to compare values between different bots or tests. On the other hand, creating a new front end seemed like too much work. As such, I've decided to port Mozilla's Graph Server to WebKit after consulting with tony^work, ojan, and evmar.
While we could have added a dedicated Apache server with all the nice features that Graph Server's native backend provides, the cost of maintaining such a server seemed too high. Also, Robert Helmer (rhelmer), a Mozilla contributor who is actively working on the Graph Server, told me that Mozilla is planning to replace the backend with a key-value database. Given these circumstances and some experimentation, I wrote our own backend using Google App Engine for its low maintenance cost and ease of use; note that App Engine is already used by the commit-queue and the flakiness dashboard.
My work to port the Graph Server is nearing completion, and I expect it to be working in the next couple of days, just as I add a Chromium Mac Perf bot. If you're interested in adding new perf bots for your port, please contact me directly and I'll give you detailed instructions on what needs to happen (it's super trivial but involves giving out or receiving a password).
How to Write Performance Tests
If you're interested in adding more performance tests (you should be!), then use html-parser.html as an example. It uses runner.js, which automatically aggregates results over multiple runs and outputs the results in the preferred format run-perf-tests understands.
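The kind of aggregation described above can be sketched in Python. This is not runner.js itself (which is JavaScript and runs inside the test page); the choice of statistics, the line format, and the function names are assumptions for illustration only, and the format run-perf-tests actually parses may differ.

```python
import statistics


def summarize(samples):
    """Aggregate the timings of repeated runs into summary statistics."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "avg": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.pstdev(samples),  # population standard deviation
        "min": min(samples),
        "max": max(samples),
    }


def format_summary(summary, unit="ms"):
    """Render one 'name value unit' line per statistic (hypothetical format)."""
    return "\n".join("{} {:.2f} {}".format(name, value, unit)
                     for name, value in summary.items())


if __name__ == "__main__":
    # Made-up timings, in milliseconds, from five runs of one test.
    print(format_summary(summarize([10.0, 12.0, 14.0, 11.0, 13.0])))
```

Reporting the spread (stdev, min, max) alongside the average matters because, as noted earlier, performance results fluctuate from run to run.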
Since there hadn't been any script to run performance tests, the tests in PerformanceTests don't have a uniform output format. As a result, run-perf-tests only supports running tests in Bindings, Parser, and inspector at the moment. I'd really appreciate your help in converting the existing tests to use runner.js, or in modifying run-perf-tests so that it can run more tests. Obviously, our goal is to be able to run all tests in PerformanceTests with run-perf-tests.
Note that Hajime Morita (morrita) has taken the initiative on the effort to run Dromaeo in DRT.