• Archives

  • Categories:

OpenStack QA/Infrastructure Meetup in Walldorf

I spent this week in the lovely town of Walldorf, Germany with about 25 of my OpenStack Quality Assurance and Infrastructure colleagues. We were there for a late-cycle sprint, where we all huddled in a couple of rooms for three days to talk, script and code our way through some challenges that are much easier to tackle when all the key players are in a room together. QA and Infra have always been a good match for an event like this since we’re so tightly linked as things QA works on are supported by and tested in the Continuous Integration system we run.

Our venue this time around were the SAP offices in Walldorf. They graciously donated the space to us for this event, and kept us blessedly fed, hydrated and caffeinated throughout the day.

Each day we enjoyed a lovely walk from and to the hotel many of us stayed at. We lucked out and there wasn’t any rain while we were there so we got to take in the best of late summer weather in Germany. Our walk took us through a corn field, past flowers, gave us a nice glimpse at the town of Walldorf on the other side of the highway and then began in on the approach to the SAP buildings of which there are many.

The first day began with an opening from our host at the SAP offices, Marc Koderer and by the QA project lead Ken’ichi Ohmichi. From there we went through the etherpad for the event to figure out where to begin. A big chunk of the Infrastructure team went to their own room to chat about Zuulv3 and some of the work on Ansible, and a couple of us hung back with the QA team to move some of their work along.

Spending time with the QA folks I learned about future plans for a more useful series of bugday graphs. I also worked with Spencer Krum and Matt Treinish to land a few patches related to the new Firehose service. Firehose is a MQTT-based unified message bus that seeks to encompass all the developer-facing infra alerts and updates in a single stream. This includes job results from Gerrit, updates on bugs from Launchpad, specific logs that are processed by logstash and more. At the beginning of the sprint only Gerrit was feeding into it using germqtt, but by the end of Monday we had Launchpad bugs submitting events over email via lpmqtt. The work was mostly centered around setting up Cyrus with Exim and then configuring the accounts and MX records, and trying to do this all in a way that the rest of the team would be happy with. All seems to have worked out, and at the end of the day Matt sent out an email announcing it: Announcing firehose.openstack.org.

That evening we gathered in the little town of Walldorf to have a couple beers, dinner, and relax in a lovely beer garden for a few hours as the sun went down. It was really nice to catch up with some of my colleagues that I have less day to day contact with. I especially enjoyed catching up with Yolanda and Gema, both of whom I’ve known for years through their past work at Canonical on Ubuntu. The three of us also were walk buddies back to the hotel, before which I demanded a quick photo together.

Tuesday morning we started off by inviting Khai Do over to give a quick demo of the Gerrit verify plugin. Now, Khai is one of us, so what do I mean by “come over”? Of all the places and times in the world, Khai was also at the SAP offices in Walldorf, Germany, but he was there for a Gerrit Hackathon. He brought along another Gerrit contributor and showed us how the verify plugin would replace our somewhat hacked into place Javascript that we currently have on our review pages to give a quick view into the test results. It also offers the ability in the web UI to run rechecks on tests, and will provide a page including history of all results through all the patchsets and queues. They’ve done a great job on it, and I was thrilled to see upstream Gerrit working with us to solve some of our problems.


Khai demos the Gerrit verify plugin

After Khai’s little presentation, I plugged my laptop into the projector and brought up the etherpad so we could spend a few minutes going over work that was done on Monday. A Zuulv3 etherpad had been worked on to capture a lot of the work from the Infrastructure team on Monday. Updates were added to our main etherpad about things other people worked on and reviews that were now pending to complete the work.

Groups then split off again, this time I followed most of the rest of the Infrastructure team into a room where we worked on infra-cloud, our infra-spun, fully open source OpenStack deployment that we started running a chunk of our CI tests on a few weeks ago. The key folks working on it gave a quick introduction and then we dove right into debugging some performance problems that were causing failed initial launches. This took us through poking at the Glance image service, rules in Neutron and defaults in the Puppet modules. A fair amount of multi-player (using screen) debugging was done up on the projector as we shifted around options, took the cloud out of the pool of servers for some time, and spent some time debugging individual compute nodes and instances as we watched what they did when they came up for the first time. In addition to our “vanilla” region, Ricardo Carrillo Cruz also made progress that day on getting our “chocolate” region working (next up: strawberry!).

I also was able to take some time on Tuesday to finally get notice and alert notifications going to our new @openstackinfra Twitter account. Monty Taylor had added support for this months ago, but I had just set up the account and written the patches to land it a few days before. We ran into one snafu, but a quick patch (thanks Andreas Jaeger!) got us on our way to automatically sending out our first Tweet. This will be fun, and I can stop being the unofficial Twitter status bot.

That evening we all piled into cars to head over to the nearby city of Heidelberg for dinner and drinks at Zum Weissen Schwanen (The White Swan). This ended up being our big team dinner. Lots of beers, great conversation and catching up on some discussions we didn’t have during the day. I had a really nice time and during our walk back to the car I got to see Heidelberg Castle light up at night as it looms over the city.

Friday kicked off once again at 9AM. For me this day was a lot of talking and chasing down loose ends while I had key people in the room. I also worked on some more Firehose stuff, this time working our way down the path to get logstash also sending data to Firehose. In the midst of which, we embarrassingly brought down our cluster due to failure to quote strings in the config file, but we did get it back online and then more progress was made after everyone got home on Friday. Still, it was good to get part of the way there during the sprint, and we all learned about the amount of logging (in this case, not much!) our tooling for all this MQTT stuff was providing for us to debug. Never hurts to get a bit more familiar with logstash either.

The final evening was spent once again in Walldorf, this time at the restaurant just across the road from the one we went to on Monday. We weren’t there long enough to grow tired of the limited selection, so we all had a lovely time. My early morning to catch a train meant I stuck to a single beer and left shortly after 8PM with a colleague, but that was plenty late for me.


Photo courtesy of Chris Hoge (source)

Huge thanks to Marc and SAP for hosting us. The spaces worked out really well for everything we needed to get done. I also have to say I really enjoyed my time. I work with some amazing people, and come Thursday morning all I could think was “What a great week! But I better get home so I can get back to work.” Hey! This all was work! Also thanks to Jeremy Stanley, our fearless Infrastructure Project Team Leader who sat this sprint out and kept things going on the home front while we were all focused on the sprint.

A few more photos from our sprint here: https://www.flickr.com/photos/pleia2/albums/72157674174936355