Why isn’t Autoland working?

This question comes up enough that I figured a quick blog post/status update would be helpful.

What does Autoland do?

Poll individual Bugzilla bugs for an autoland token (currently using whiteboard tags, future Autoland has an extension & webservice that makes polling the entirety of bugzilla no longer needed). When an autoland request is found, the serviced does an automated landing to try for you of all non-obsolete patches attached to the bug if they can be landed cleanly on tip of mozilla-central and either the patch author or a feedback provider have appropriate hg permissions, otherwise it reports back to the bug what the issues were. Upon completion of the try run, a comment is left in the bug stating the results and if a final repo destination (or destinations) was specified (the hg permissions must match up between requester/reviewer and the destination repo(s)) the service can continue on to autolanding the patch(es) to the destination repo.  A comment would be left on the bug when the push to final destination(s) is done.  There would be no reporting back of final build results, that would be handed back over to human eyes on TBPL.

What is Autoland’s status now?

In April of 2012, right before Marc’s second internship ended, we launched a very experimental public-facing version of Autoland and announced it a bit so we could get more people testing it.  This had varying degrees of success.  We got more bugs ironed out but also discovered that Autoland’s daemons for hgpushing and bugzilla polling tend to fall over a bit too often.  When we moved Autoland off it’s staging VM to a more permanent home we lost the status page that would tell us (and developers) what the modules were doing and that really made the workings of Autoland quite opaque.  With Marc leaving at the end of April and my switch over to help with Release Management in Feb 2012  I had kept meeting regularly with Marc and driving the project to completion as much as possible but hadn’t been able to pull my weight on coding for the last 3 months of our time together. This left us with an Autoland that stopped working and no one available to continue to massage it into being the robust system we needed it to be.  I took mention of Autoland out of our trychooser page, try server docs, and have generally tried to downplay it’s existence while still keeping a plan on the back burner for how we will resurrect it as soon as there is some time.

What does Autoland need to be publicly usable again?

  • A status page that can show what modules are running or down, display what’s in the queue, and give a quick visual to users if Autoland is up or down as a result.
  • Nagios alerts on the Autoland modules that let me (and other people interested in helping to maintain Autoland) know when things fall over.
  • At least one person, if not several, who can access the Autoland master VM and ‘kick’ it as needed

This is what’s needed for a short term solution. I know that we have some bugs with our hg pusher module as well as some trickiness in our message queue that, once fixed, would make the overall system more robust.  We need people using the system to be able to catch more of those bugs though, so in the meantime having as close to on-click restoring of the system would be a huge win here.

What does Autoland need to be truly ‘production’ ready?

  • Security review on the code and the BMO extension so that we can move away from whiteboard tags and let people use the BMO extension instead — this gives much cleaner input to Autoland of what is being requested.
  • More VMs to run hgpusher modules on so that Autoland can handle a larger load. Each VM can run 2 hgpushers max so we’d want to be able to grow our pushing farm as the usage of the system increases.
  • Being able to push to repos other than Try.

There is no clear plan to be able to get the system beyond Try landings, but I see automated try landings as still being a huge help so I’d be super happy just to see that part get back to a working state.  This project is no longer a RelEng priority now that I’ve permanently moved to Release Management and Marc has gone on to other internships and more schooling. I can’t promise anything time-wise, but I wanted to provide some clarity into what is needed and put out the “patches welcome” call.  I see Autoland as being a great option for a community-managed project and I want to keep working on it when time permits. If you are looking to become a Mozilla contributor and are interested in automation and web APIs – this might be a good starter project for you. Please get in touch.

 

 

Autolanding your patch(es) to Try via Bugzilla

We’re ready for a soft release of the first step in our very experimental autolanding system. Experimental meaning: we reserve the right to pull the plug, take it down for tweaks, and some information may be lost when bugs arise.  You can check the if the autoland system is up and running by going to http://bit.ly/autoland_status

This is the Try branch only portion of what will become a system to automate and more easily manage multiple branch landings.  Marc Jessome, our returning Release Engineering intern, and myself have this as a Q1 goal.  There are several issues to be resolved with this system and the link above will keep you appraised of what the goals and known issues are. We will be working hard over the next two months to make this a secure, reliable system for landing patches with minimal developer time spent pulling, pushing, and watching tbpl.  The hope is that it will also become a useful tool for Release Coordinators to track that a fix lands across several branches as needed.  Future features will include a bugzilla extension to securely interact with this system and remove the need for whiteboard tag changes and a RESTful API so the whole process could be initiated through the command line.  We appreciate constructive feedback but please hold off on scope creep suggestions 🙂 

Here’s how it works right now:

  1. Attach patches to your bug Note: if you (the attacher) do not have permission to push to Try, get an r+ from someone who does
  2. Use this documentation to determine the appropriate whiteboard tag and enter it in the bug so our autolanding poller will pick up on your request
  3. The autoland system, which tracks all bugs with autolanding requests, will queue up your patchset and then pull it from the queue, apply the list of patches in your bug (or those in the whiteboard tag) to a clean pull of mozilla-central tip, commit each patch as the patch author, and finally push to try as Autoland User
  4. A comment will be posted in your bug with the results (hopefully successful) of the push attempt, and the whiteboard tag will read [autoland-in-queue]
  5. After all builds complete, their results get posted back to your bug (just like with the current ‘–post-to-bugzilla Bug #’ flag in trychooser) and the [autoland-in-queue] whiteboard tag will be removed

There is a threshold on how many bugs the autoland system will push and track at a given time.  We are curious to learn about usage and load on this system so please give it a try and inform us of any issues in IRC: #build or #developers channels are the best places to find myself (lsblakk) or Marc (mjessome) and chat us up about this initiative.