OpenStack QA/Infrastructure Meetup in Darmstadt

I spent this week at the QA/Infrastructure Meetup in Darmstadt, Germany.

Our host was Marc Koderer of Deutsche Telekom, who sorted out all the logistics for holding our event at their office in Darmstadt. Aside from the summer heat (the conference room lacked air conditioning), it all worked out very well: we had a lot of space to work, the food was great, and we had plenty of water. It was also nice that the hotel most of us stayed at was an easy walk away.

The first day kicked off with an introduction by Deutsche Telekom that covered what they’re using OpenStack for in their company. Since they’re a network provider, networking support was a huge component, but they use other components as well to build their infrastructure, since they plan to have a quicker software development cycle that’s less tied to the hardware lifetime. We also got a quick tour of one of their data centers and a demo of some of the running prototypes for quicker provisioning and changing of service levels for their customers.

Monday afternoon was spent on an on-boarding tutorial for newcomers to contributing to OpenStack, and on Tuesday we transitioned into an overview of the OpenStack Infrastructure and QA systems that we’d be working on for the rest of the week. Beyond the overview of the infrastructure presented by James E. Blair, key topics included jeepyb presented by Jeremy Stanley, devstack-gate and Grenade presented by Sean Dague, Tempest presented by Matthew Treinish (including the very useful Tempest Field Guide) and our Elasticsearch, Logstash and Kibana (ELK) stack presented by Clark Boylan.

On Wednesday we began the hacking/sprint portion of the event, where we moved to another conference room and rearranged tables so we could get into our respective working groups. Anita Kuno presented the Infrastructure User Manual, which we’re looking to flesh out, and gave attendees the task of helping to write a section to guide users of our CI system. This ended up being a great way for newcomers to get their feet wet, and I hope to have this kind of entry-level task at every infrastructure sprint moving forward. Some folks worked on getting support for uploading log files to Swift, some on getting multinode testing architected, and others worked on Tempest. In the early afternoon we had some discussions covering recheck language, the next steps I’d be taking in the evaluation of translation tools, a “Gerrit wishlist” of items that developers are looking for as Khai Do prepares to attend a Gerrit hack event, and more. I also took time on Wednesday to dive into some documentation that I had noticed needed updating after the tutorial the day before.

On Thursday the work continued: I did some reviews, helped out a couple of new contributors and wrote my own patch for the Infra Manual. It was also great to learn about and collaborate on some of the aspects of the systems we use that I’m less familiar with, and to explain to others the portions that I was familiar with.


Zuul supervised my work

Friday was a full day of discussions, which were great but a bit overwhelming (might have been nice to have had more on Thursday). Discussions kicked off with strategies for handling the continued publishing of OpenStack Documentation, which is currently just being published to a proprietary web platform donated by one of the project sponsors.

We then had a very long discussion about managing the growth of gate runtime. Managing developer and user expectations for our gating system (thorough, accurate testing) while balancing the human and compute resources that we have available on the project is a tough thing to do. Some technical solutions to ease the pain of some failures were floated and may end up being used, but my key takeaway from this discussion was that we’d really like the community to be more engaged with us and each other (particularly when patches impact projects or functionality that you might not feel is central to your patch). We also want to stress that the infrastructure is a living entity that evolves, and that we welcome ideas and solutions to the problems we’re encountering, since right now the team is quite small for what we’re doing. Finally, there were some comments about how we run tests in the process of reviewing, how scalable the growth of tests is over time, and how we might lighten that load (start doing some “traditional CI” post-merge jobs? have some periodic jobs? leverage experimental jobs more?).

The discussion I was most keen on was around the refactoring of our infrastructure to make it more easily consumable by 3rd parties. Our vision early on was that we were an open source project ourselves, but that all of our customizations were a kind of example for others to use, not that they’d want to use them directly, so we hard coded a lot into our special openstack_projects module. As the project has grown and more organizations are starting to use the infrastructure, we’ve discovered that many want to use one largely identical to ours and that making this easier is important to them. To this end, we’re developing a Specification to outline the key steps we need to go through to achieve this goal, including splitting out our puppet modules, developing a separate infra system repo (what you need to run an infrastructure) and project stuff repo (data we load into our infrastructure) and then finally looking toward a way to “productize” the infrastructure to make it as easily consumable by others as possible.

The afternoon finished up with discussions about vetting and signing of release artifacts, ideas for possible adjustment of the job definition language and how teams can effectively manage their current patch queues now that the auto-abandon feature has been turned off.

And with that – our sprint concluded! And given the rise in temperature on Friday and how worn out we all were from discussions and work, it was well-timed.

Huge thanks to Deutsche Telekom for hosting this event. Being able to meet like this is really valuable to the work we’re all doing in the infrastructure and QA for OpenStack.

Full (read-only) notes from our time spent throughout the week are available here: https://etherpad.openstack.org/p/r.OsxMMUDUOYJFKgkE