Corrigible: Status of Rocket Mode
Posted on Dec 30, 2014 at 12:35AM
Rocket mode is a very cool idea that was poorly implemented and I'm still in the middle of the fix.
The purpose of rocket mode is similar to that of run selectors (which work fine, btw), but it is very much developer-focused, whereas run selectors have uses beyond simply shaving hours off the effort that goes into plan development.
Run selectors save time by making it possible to include or exclude parts of the plan using one or more selectors at run time. A system plan developer might use a run selector called quick_reinstall to skip the installation of debs and python modules, saving several minutes of execution time at each iteration. But an administrator might also use run selectors like backup_database or process_logs to execute different maintenance tasks associated with a system, so run selectors make it possible to call "procedures" (of sorts) as well.
By contrast, rocket mode has a single purpose. When called with --ROCKET-MODE, corrigible will attempt to skip as much of the plan as possible. If a plan hasn't changed since the last time corrigible ran, then corrigible won't try to run it again.
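One way to decide whether a plan "hasn't changed" is to fingerprint its rendered text and compare that against a fingerprint saved on the previous run. The sketch below assumes that approach; it is not necessarily how corrigible does it, and the names plan_digest, load_digests, has_changed and the JSON cache file are all made up for illustration.

```python
import hashlib
import json
from pathlib import Path


def plan_digest(rendered_text: str) -> str:
    """Return a stable fingerprint of a rendered plan."""
    return hashlib.sha256(rendered_text.encode("utf-8")).hexdigest()


def load_digests(cache_path: Path) -> dict:
    """Load the digests recorded by the previous run (hypothetical cache file)."""
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    return {}


def has_changed(name: str, rendered_text: str, previous: dict) -> bool:
    """A plan counts as changed if it is new or its digest differs."""
    return previous.get(name) != plan_digest(rendered_text)
```

A plan that was never seen before has no stored digest, so it is treated as changed and gets run.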
The initial tests looked good, but they looked good because I wasn't very thorough. I failed to consider what happens when a plan remains the same from one run to the next, but one of its child plans has changed.
What should happen is that the parent plans of all changed plans are considered changed as well. What happens now (in the master branch) is that the parent plans are not re-run. Worse, if a parent plan hasn't changed but the child plan has, the child plan isn't run either, because once corrigible sees that the parent plan is change-free it never looks to see whether the child plan has changed. FAIL.
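To make the difference concrete, here is a small sketch of both behaviors on a toy plan tree. The Plan class and both function names are hypothetical stand-ins, not corrigible's actual code; the "buggy" version only mimics the failure described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    name: str
    changed: bool = False  # did this plan's own content change?
    children: List["Plan"] = field(default_factory=list)


def plans_to_run_buggy(plan: Plan) -> List[str]:
    """Mimics the failure: an unchanged parent hides its children."""
    if not plan.changed:
        return []  # bails out without ever looking at the children
    names = [plan.name]
    for child in plan.children:
        names.extend(plans_to_run_buggy(child))
    return names


def plans_to_run_fixed(plan: Plan) -> List[str]:
    """A plan runs if it changed or if anything beneath it changed."""
    child_names: List[str] = []
    for child in plan.children:
        child_names.extend(plans_to_run_fixed(child))
    if plan.changed or child_names:
        return [plan.name] + child_names
    return []


tree = Plan("site", children=[
    Plan("webserver", children=[Plan("nginx_conf", changed=True)]),
    Plan("database"),
])

print(plans_to_run_buggy(tree))  # []
print(plans_to_run_fixed(tree))  # ['site', 'webserver', 'nginx_conf']
```

The fixed version visits children before deciding about the parent, so a change deep in the tree pulls in every ancestor on the way up while untouched siblings like database are still skipped.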
Complicating the fix is a design choice I made on day #1 for no thoughtful reason at all. Basically, corrigible builds the ansible output yaml file one plan at a time by simply starting with an empty string and appending to it each rendered plan that isn't excluded due to run selectors. When it runs out of plans, it writes the file. Once a snippet has been appended to that string, there is no easy way to go back and decide whether it should have been included.
The path to a working rocket mode, I've concluded, involves coming up with a more structured intermediate storage mechanism for the output ansible snippets. I'll traverse the plan tree normally and store the parent-child relationship, the plan depth and the order along with the rendered snippet text. Just before stringing it all together, I'll exclude only those snippets that didn't change and whose children were also unchanged.
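A rough sketch of what that intermediate storage and the just-in-time exclusion could look like follows. The Snippet record and the assemble function are hypothetical names, and the sketch reads "children" as "any descendant", which is my assumption about the intended behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snippet:
    order: int             # position in a normal traversal of the plan tree
    depth: int             # nesting level, stored alongside but unused here
    parent: Optional[int]  # order of the parent snippet, None for the root
    text: str              # rendered ansible yaml for this plan
    changed: bool          # did this plan change since the last run?


def assemble(snippets: List[Snippet]) -> str:
    """Drop snippets that are unchanged and have no changed descendant, then join."""
    keep = {s.order: s.changed for s in snippets}
    # A parent is always visited before its children in a normal traversal,
    # so walking backwards lets each kept snippet mark its parent as kept
    # before that parent is itself examined.
    for s in sorted(snippets, key=lambda s: s.order, reverse=True):
        if keep[s.order] and s.parent is not None:
            keep[s.parent] = True
    kept = [s for s in sorted(snippets, key=lambda s: s.order) if keep[s.order]]
    return "".join(s.text for s in kept)
```

Because nothing is written until assemble runs, the decision to exclude a snippet can take the whole tree into account, which the append-to-a-string approach could not do.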
To that end, I've changed out the intermediate storage mechanism and, just today, I've got it passing all the tests that the master branch passes. So its behavior should be functionally the same as the master branch minus the broken rocket mode functionality.
Left to do, then, are the just-in-time exclusion and the battery of tests that prove it's working as it should.