Why isn’t Autoland working?

This question comes up enough that I figured a quick blog post/status update would be helpful.

What does Autoland do?

Poll individual Bugzilla bugs for an autoland token (currently using whiteboard tags, future Autoland has an extension & webservice that makes polling the entirety of bugzilla no longer needed). When an autoland request is found, the serviced does an automated landing to try for you of all non-obsolete patches attached to the bug if they can be landed cleanly on tip of mozilla-central and either the patch author or a feedback provider have appropriate hg permissions, otherwise it reports back to the bug what the issues were. Upon completion of the try run, a comment is left in the bug stating the results and if a final repo destination (or destinations) was specified (the hg permissions must match up between requester/reviewer and the destination repo(s)) the service can continue on to autolanding the patch(es) to the destination repo.  A comment would be left on the bug when the push to final destination(s) is done.  There would be no reporting back of final build results, that would be handed back over to human eyes on TBPL.

What is Autoland’s status now?

In April of 2012, right before Marc’s second internship ended, we launched a very experimental public-facing version of Autoland and announced it a bit so we could get more people testing it.  This had varying degrees of success.  We got more bugs ironed out but also discovered that Autoland’s daemons for hgpushing and bugzilla polling tend to fall over a bit too often.  When we moved Autoland off it’s staging VM to a more permanent home we lost the status page that would tell us (and developers) what the modules were doing and that really made the workings of Autoland quite opaque.  With Marc leaving at the end of April and my switch over to help with Release Management in Feb 2012  I had kept meeting regularly with Marc and driving the project to completion as much as possible but hadn’t been able to pull my weight on coding for the last 3 months of our time together. This left us with an Autoland that stopped working and no one available to continue to massage it into being the robust system we needed it to be.  I took mention of Autoland out of our trychooser page, try server docs, and have generally tried to downplay it’s existence while still keeping a plan on the back burner for how we will resurrect it as soon as there is some time.

What does Autoland need to be publicly usable again?

  • A status page that can show what modules are running or down, display what’s in the queue, and give a quick visual to users if Autoland is up or down as a result.
  • Nagios alerts on the Autoland modules that let me (and other people interested in helping to maintain Autoland) know when things fall over.
  • At least one person, if not several, who can access the Autoland master VM and ‘kick’ it as needed

This is what’s needed for a short term solution. I know that we have some bugs with our hg pusher module as well as some trickiness in our message queue that, once fixed, would make the overall system more robust.  We need people using the system to be able to catch more of those bugs though, so in the meantime having as close to on-click restoring of the system would be a huge win here.

What does Autoland need to be truly ‘production’ ready?

  • Security review on the code and the BMO extension so that we can move away from whiteboard tags and let people use the BMO extension instead — this gives much cleaner input to Autoland of what is being requested.
  • More VMs to run hgpusher modules on so that Autoland can handle a larger load. Each VM can run 2 hgpushers max so we’d want to be able to grow our pushing farm as the usage of the system increases.
  • Being able to push to repos other than Try.

There is no clear plan to be able to get the system beyond Try landings, but I see automated try landings as still being a huge help so I’d be super happy just to see that part get back to a working state.  This project is no longer a RelEng priority now that I’ve permanently moved to Release Management and Marc has gone on to other internships and more schooling. I can’t promise anything time-wise, but I wanted to provide some clarity into what is needed and put out the “patches welcome” call.  I see Autoland as being a great option for a community-managed project and I want to keep working on it when time permits. If you are looking to become a Mozilla contributor and are interested in automation and web APIs – this might be a good starter project for you. Please get in touch.

 

 

3 thoughts on “Why isn’t Autoland working?

  1. What is the motivation for Autoland? Writing a special string in a bug whiteboard doesn’t seem any easier than writing a special string in a commit message and pushing to try. Am I overlooking a use case(s)? Maybe it’s good for people who don’t have try permissions?

  2. Yes, there’s definitely that use case quite frequently. There’s another goal though which would be to automatically land things across multiple branches if it passes try. This would be very handy for release tracking – set the patch(es) and get comments back in the bug that those changes (if successfully run on try) have landed across central/aurora/beta/esr.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.