Day 2 of the OpenStack summit here in Hong Kong began with a series of really great keynotes. First up were three major Chinese companies, iQIYI (online video), Qihoo 360 (Internet platform company) and Ctrip (travel services), talking about how they each use OpenStack (video here). We also learned several OpenStack statistics, including that there are more Active Technical Contributors (ATCs) in Beijing than in any other city in the world, with Shanghai not far behind. This introduction also fed a theme of passion for Open Source and OpenStack that was evident throughout the keynotes that followed.
I was then really impressed with the Red Hat keynote, particularly Mark McLoughlin’s segment. Having been actively working on Open Source for over 10 years myself, his words about the success we’ve had, from Linux to OpenStack, really resonated with me. For years all of us passionate Open Source folks have been talking about (and giving presentations on) the benefits of open solutions, so seeing the success and growth today really does feel like validation: we had it right, and that feeds us to continue (literally, many of us are getting paid for this now; I love open source and used to do it all for free). He also talked about TripleO, yeah! (video here)
Next up was Monty Taylor’s keynote for HP, where he got to announce the formal release of Horizon Dashboard for use with HP Cloud at horizon.hpcloud.com. It was great to hear Monty echo some of Mark’s words about the success of OpenStack before diving into the hybrid testing infrastructure we now have spanning the public HP Cloud and Rackspace clouds and the new “private” TripleO clouds we’re deploying (admittedly, of course I enjoyed this, it’s what I’m working on!). He also discussed much of what customers ask about when approaching OpenStack, including questions around openness (is it really open?), maturity, security, complexity and upgrade strategies. (video here)
– Neutron QA and Testing –
Neutron is tested much less than other portions of OpenStack, and the team has recognized that this is a problem, so the session began by reviewing the current state of testing and the limitations they’ve run into. One of the concerns raised early in the session was recruiting contributors to work on testing. From there they dug into some of the specific test cases that are failing, in order to find solutions and assign tasks, including an in-depth discussion of Tenant Isolation & Parallel Testing, which is one of their major projects. There were also several testing concerns that there wasn’t time to address and that will have to be tackled in a later meeting, including: Full Tempest Test Runs, Grenade Support, API Tests and Scenario Tests.
Copy of notes from this session available here: icehouse-summit-qa-neutron.txt
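Tenant isolation is worth a quick illustration, since it is what makes running tests in parallel workable: each test class gets its own throwaway keystone tenant and user, so tests running at the same time never fight over shared resources. Here is a rough sketch of the concept, not Tempest's actual implementation, with made-up names and credentials:

```python
# Rough sketch of the tenant isolation idea: give each test class its own
# keystone tenant/user so parallel test runs don't collide on resources.
# The admin credentials, endpoint and naming scheme here are hypothetical.
import uuid

from keystoneclient.v2_0 import client as keystone_client


def create_isolated_creds(admin_user, admin_password, admin_tenant, auth_url):
    """Create a fresh tenant and user for a single test class to use."""
    admin = keystone_client.Client(username=admin_user,
                                   password=admin_password,
                                   tenant_name=admin_tenant,
                                   auth_url=auth_url)
    suffix = uuid.uuid4().hex[:8]
    tenant = admin.tenants.create(tenant_name='iso-tenant-%s' % suffix,
                                  enabled=True)
    password = uuid.uuid4().hex
    user = admin.users.create(name='iso-user-%s' % suffix,
                              password=password,
                              email=None,
                              tenant_id=tenant.id)
    # The test class then uses these throwaway credentials for all of its
    # API calls and deletes them again in its teardown.
    return tenant, user, password
```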
It’s interesting to learn in these QA sessions how many companies do their own testing. It seems that this is partially an artifact of Open Source projects historically being poor at public, automated testing and largely being beholden to companies to do it behind the scenes and submit bugs and patches. I’m sure companies will always have internal needs to run their own testing infrastructures, but I do look forward to a time when more companies become interested in testing the common things in the shared community space.
– Tempest Policy in Icehouse –
This was a retrospective of successes and failures from work this past cycle. It kicked off by mentioning that the Tempest Design Principles are now documented so all contributors are on the same page, and a suggestion was made to add a time budget and the scope of negative tests to the principles. Successes included the large ops and parallel testing, and the use of Elasticsearch to help with bug triaging. The weaker parts included use of (and follow-up on) blueprints, onboarding new contributors (need more documentation!), prioritizing reviews (perhaps leverage reviewday more) and in general encouraging all reviewers.
Copy of notes from this session available here: icehouse-summit-qa-tempest-policy.txt
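For anyone who hasn't run into the term, "negative tests" are the ones that assert an API fails correctly when given bad input, and part of the scoping question is that you could write an unbounded number of them. A tiny illustrative sketch of the pattern, using a fake client rather than actual Tempest code:

```python
# Illustrative "negative test" pattern: call an API with bad input and assert
# it fails with the expected error. The fake client below stands in for a
# real Tempest service client.
import testtools


class NotFound(Exception):
    """Stand-in for the API client's 404 exception."""


class FakeServersClient(object):
    def __init__(self, servers):
        self._servers = servers

    def get_server(self, server_id):
        try:
            return self._servers[server_id]
        except KeyError:
            raise NotFound(server_id)


class ServersNegativeSketch(testtools.TestCase):

    def test_get_nonexistent_server_returns_not_found(self):
        client = FakeServersClient(servers={})
        self.assertRaises(NotFound, client.get_server, 'no-such-id')
```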
After lunch I did some wandering around the expo hall, where I had a wonderful chat with Stephen Spector at the HP booth. I also got to chat with Robbie Williamson of Canonical and totally cheated to get my Ubuntu ice cream by just asking him for banana with brownies instead of checking out their juju demo.
– Moving Trove integration tests to Tempest –
Trove is currently being tested independently of the core OpenStack CI system, and the team has been working to bring it in, so this session walked through the plans to do this. One step identified was moving the Trove diskimage-elements into a different repository; the pros and cons of adding them to tripleo-image-elements were discussed, and the pros won. Images built by the job will then be pushed to tarballs.openstack.org for caching. The session then covered more of what Trove integration testing does today and what needs to be done to update Tempest to run the tests on the devstack-gate using said cached images.
Copy of notes from this session available here: TroveTempestTesting.txt
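To make the caching step a bit more concrete, the idea is that the gate job only rebuilds the Trove guest image when it has to, and otherwise pulls the last published build from tarballs.openstack.org. A rough sketch of that fetch-or-fall-back logic; the URL, local path and file name are all hypothetical:

```python
# Rough sketch of the image-caching idea: the gate job fetches a pre-built
# Trove guest image from tarballs.openstack.org when one is available instead
# of rebuilding it with diskimage-builder on every run. The URL and local
# path here are hypothetical.
import os

import requests

CACHE_URL = 'http://tarballs.openstack.org/trove/images/trove-guest.qcow2'
LOCAL_PATH = '/opt/stack/cache/trove-guest.qcow2'


def fetch_cached_image():
    """Download the cached guest image; return False to fall back to a build."""
    if os.path.exists(LOCAL_PATH):
        return True
    response = requests.get(CACHE_URL, stream=True, timeout=60)
    if response.status_code != 200:
        return False  # no cached image published yet; build locally instead
    with open(LOCAL_PATH, 'wb') as image_file:
        for chunk in response.iter_content(chunk_size=1024 * 1024):
            image_file.write(chunk)
    return True
```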
– Tempest Stress Test – Overview and Outlook –
The overall goal of Tempest stress testing is to find race conditions and simulate real-life load. The session walked through the current status of the tests and began outlining some of the steps to move forward, including defining and writing more stress tests. Beyond that, using stress tests in the gate was also reviewed: the time tests take was considered (can valuable tests be done in under 45 minutes?) and some of the timing-related pain points were noted. There was also discussion around scenario tests and enhancing the documentation to include examples of unit/scenario tests and a definition of what makes a good test, to make development of stress tests more straightforward.
Copy of notes from this session available here: icehouse-summit-qa-stress-tests.txt
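The "find race conditions" goal is easiest to picture as many workers hammering the same action at once and tallying failures. A minimal sketch of that shape using plain threads and a stand-in action, rather than Tempest's actual stress framework:

```python
# Minimal sketch of the stress-test shape: run the same action concurrently
# from many workers and count failures. A real Tempest stress test would
# drive an OpenStack API client here instead of the stand-in action.
import threading


def create_and_delete_server():
    """Stand-in for an API action such as boot-a-server-then-delete-it."""
    pass


def worker(iterations, failures):
    for _ in range(iterations):
        try:
            create_and_delete_server()
        except Exception:
            failures.append(1)  # list.append is safe enough for a simple tally


def run_stress(num_workers=10, iterations=50):
    failures = []
    threads = [threading.Thread(target=worker, args=(iterations, failures))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(failures)


if __name__ == '__main__':
    print('failures: %d' % run_stress())
```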
– Parallel tempest moving forward –
Parallel testing in Tempest currently exists and the speed of testing has greatly improved as a result, hooray! So this session was a review of some of the improvements needed to move forward. Topics included improving reliability, further speed improvements (first step: increase the number of test runners. Eliminate setUpClass? Distributed testing?) and testr UI vs Tempest UI.
Copy of notes from this session available here: icehouse-summit-qa-parallel.txt
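The setUpClass question comes down to the fact that expensive fixtures built once per test class tie all of that class's tests to a single worker and make failures coarse-grained. A small generic unittest sketch of the pattern being questioned (not Tempest code):

```python
# The pattern under discussion: class-level fixtures are created once and
# shared by every test in the class. That saves time serially, but it ties
# all of a class's tests to one worker when distributing them, and a failure
# in setUpClass takes the whole class down with it.
import unittest


class ServerActionsSketch(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # In Tempest this would boot a server or allocate credentials; the
        # shared resource here is just a stand-in.
        cls.shared_server = {'id': 'fake-server', 'status': 'ACTIVE'}

    def test_reboot(self):
        self.assertEqual('ACTIVE', self.shared_server['status'])

    def test_resize(self):
        self.assertEqual('ACTIVE', self.shared_server['status'])


if __name__ == '__main__':
    unittest.main()
```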
– Zuul job runners and log management –
The first part of this session discussed log management for the logs produced from test runs, continuing an infrastructure mailing list thread from October: [OpenStack-Infra] Log storage/serving.
Next up: we use a limited number of features from Jenkins these days due to our workflow, so there has been discussion about writing a new/different job runner for Zuul, which would have several requirements:
- Distributed (no centralized ‘master’ architecture)
- Secure (should be able to run untrusted jobs)
- Should be able to publish artifacts appropriate to a job’s security context
- Lightweight (should do one thing and do it simply)
Copy of notes from this session available here: icehouse-summit-zuul-job-runners-and-log.txt
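Since Zuul already hands jobs to Jenkins over Gearman, one way to picture a lightweight replacement runner is a small Gearman worker that registers for build functions and executes them. A rough sketch using the python gearman library; the server address, function name and job handling are placeholders, and none of the security requirements above are addressed here:

```python
# Rough sketch of a minimal Gearman-based job worker, since Zuul already
# dispatches builds over Gearman. The server address and function name are
# placeholders; a real worker would also report status and publish logs and
# artifacts per the requirements listed above.
import json

import gearman


def run_build(worker, job):
    params = json.loads(job.data)  # job parameters arrive as JSON
    # ... check out the change described in params and run the job here ...
    return json.dumps({'result': 'SUCCESS'})


worker = gearman.GearmanWorker(['zuul.example.org:4730'])
worker.set_client_id('example-worker')
worker.register_task('build:example-job', run_build)
worker.work()  # blocks, handling one job at a time
```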
– More Salt in Infra –
Much of the OpenStack Infrastructure is currently managed by Puppet, but there are some things, like event-based dependencies, that are non-trivial to do in Puppet but which Salt has built-in support for. The primary example that inspired this was manage_projects.py, which tends to have race/failure problems due to event dependencies.
Copy of notes from this session available here: icehouse-summit-more-salt-in-infra.txt
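To make the event-based dependency point concrete: with Salt you can react to an event on the master's bus instead of hoping two configuration runs happen to land in the right order. In practice this would be wired up as a Salt reactor, but here is a rough Python illustration of the idea; the event tag, target host and command are made up:

```python
# Rough illustration of the event-driven idea discussed in the session:
# wait for a "new project" event on the Salt master's event bus and only
# then run the dependent step, instead of relying on ordering between
# independent Puppet runs. Event tag, target and command are hypothetical.
import salt.client
import salt.utils.event

event_bus = salt.utils.event.MasterEvent('/var/run/salt/master')
local = salt.client.LocalClient()

while True:
    event = event_bus.get_event(wait=30, tag='openstack/infra/new-project')
    if event is None:
        continue  # no matching event yet; keep waiting
    # Run manage-projects on the Gerrit host only after the triggering event.
    local.cmd('review.example.org', 'cmd.run', ['manage-projects'])
```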
My evening wrapped up by heading down to Kowloon to enjoy dinner with several of my Infrastructure colleagues from HP, Wikimedia and the OpenStack Foundation.