In Bootstrapping and Rooting Documentation: Part 1, I laid out a blueprint for using documentation as the bootstrap for entry into an operations environment. In this article I will dig into the three main concepts mentioned in Part 1. In Part 3, I will demonstrate our use case for a data center of about 100 servers in two different locations.
Bootstrapping & Rooting
When a new employee, or even a veteran, enters an unfamiliar part of the operations environment, there is often a sense of being lost. Typically, another employee who is already familiar with that part gives them a quick crash course and then turns them loose. Each mentor has a different view of what is important for the trainee and will focus on those things accordingly, possibly leaving out critical information. Using documentation as the bootstrap instead of a mentor can help solve this problem. If the documentation is the entry point, everyone, including management, will be aware of what is and isn’t important in and around the system.
As an example, say that a new systems administrator is given access to the Nagios monitoring server because he needs to add a machine. In an environment that is not automated, the new systems administrator will be tempted to copy from a current configuration file and use it for the new machine. This can lead to regression problems since there is no formal root to start from.
This problem can be solved with a little automation, and many would quit there, but how does the new systems administrator know where to find the automation? Is it a command line tool or web based? Where is the password stored for accessing the tool? Is somebody in management supposed to be notified when a new machine is added? What happens when we build new automation with new features and need to track down all of the associated business processes? A rooted documentation repository can solve these problems.
Rooting the documentation can be a fundamental paradigm shift for many programmers and systems administrators. When trying to build a rooted and bootstrapped documentation system, always think about the user’s entry point to the system and cover all of the major bases. Document the URL of the web based tool, where the passwords can be found, and who the business owner is. This gives the new user the information necessary to use the tool properly and to understand its organizational impact.
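As a rough illustration, the entry-point information listed above could be captured in a small record that is checked for blanks. This is only a sketch: the `ToolEntry` structure, its field names, and the example values are all hypothetical and are not taken from any real system.

```python
from dataclasses import dataclass, fields


@dataclass
class ToolEntry:
    """Hypothetical entry-point record for one tool or system."""
    name: str
    url: str                # where the web based tool lives
    password_location: str  # where the credentials are kept, never the credentials
    business_owner: str     # who must know about changes


def missing_fields(entry: ToolEntry) -> list[str]:
    """Return the names of any fields that were left blank."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Illustrative values only; the password location has been left blank on purpose.
entry = ToolEntry(
    name="Nagios",
    url="https://nagios.example.com/",
    password_location="",
    business_owner="Operations manager",
)
print(missing_fields(entry))
```

A check like this does not replace the written page, but it makes an incomplete entry point visible before a new user stumbles over it.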
Metadata is important even when working only within the operations or development environments. Each person has pet projects and often likes to keep tabs on how the system is working months or years after moving on to other projects. This meta information would not normally be captured in a configuration file or piece of code, but it can be useful when a critical systems administrator or programmer leaves or moves on. This kind of documentation gives the key architect a forum for explaining why something is the way it is, and it helps build consensus on architectural decisions.
Metadata and bootstrapping are critical when documenting scripts that work together, or web based applications that have an internal help system but do not enumerate an organization’s standard use cases. As an example, imagine the wiki page that documents how to properly create new pages in the wiki. This is bootstrapping.
Culture of Self Service
Now that the documentation is the root or entry point into the system, it is possible to point users to it and start to foster a culture of self service. This is critical for efficiency. Many users will still ask questions, but as the culture becomes ingrained in the organization, each participant will start to seed their projects with this kind of documentation. The documentation doesn’t have to be cumbersome; it just has to capture the right information to facilitate self service.
The documentation will take on authority in a way that may be counterintuitive. Imagine trying to install Gentoo without looking at the documentation; it would be nearly impossible from memory. Many would argue that installation of an OS can be automated, and I would agree, but when the goal is to teach a person about the operating system installation, it is effective to strategically leave pieces un-automated so that the user must think or make a decision where necessary.
Applied to operations, for example, when building a kickstart or automated Puppet deployment, document how to expand the automation in such a way that one must start with the documentation. One cannot automate the extension of automation, so bootstrap it with documentation and carefully set up qualitative hooks that require user intervention and raise red flags if someone is doing something wrong. For example, have a script, run by the systems administrator, that pulls Puppet templates from a version control system while at the same time prompting the administrator to document the new Puppet script. This allows a systems administrator to expand the automation in a controlled way and seamlessly ties the documentation to the code.
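A minimal sketch of such a hook follows. It is not the script used in our environment. The function name `expand_automation`, the `fetch` and `ask` callables, and the one-text-file-per-template layout are all assumptions made for illustration. `fetch` stands in for whatever pulls the template from version control, and `ask` stands in for an interactive prompt.

```python
from pathlib import Path
from typing import Callable


def expand_automation(
    template_name: str,
    fetch: Callable[[str], str],
    ask: Callable[[str], str],
    doc_dir: Path,
) -> Path:
    """Fetch a template, but refuse to finish until it has been documented.

    `fetch` and `ask` are injected so the sketch does not depend on any
    particular version control system or user interface.
    """
    template = fetch(template_name)
    purpose = ask(f"What does {template_name} do, and who owns it? ").strip()
    if not purpose:
        # The qualitative hook: no documentation, no new automation.
        raise ValueError(f"{template_name} was not documented; aborting")
    doc_dir.mkdir(parents=True, exist_ok=True)
    doc_page = doc_dir / f"{template_name}.txt"
    doc_page.write_text(f"{purpose}\n\nTemplate:\n{template}\n")
    return doc_page
```

In interactive use, `ask` could simply be the built-in `input`, and `fetch` could wrap a checkout from the repository. An empty answer stops the run, which is the red flag described above.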
When documentation is constructed in such a way that it is tied to the code, it is selfish. It is selfish because the architect is freed to move on to other projects, but also because it allows the architect to maintain influence on a system in which he or she may no longer officially participate. As pieces of the system are replaced or expanded upon, the same model will propagate.
To get to a fully bootstrapped and rooted documentation system, one must work in stages. First, start with a ticket system and create a project for a new wiki to house this documentation. Then require that, from then on, every ticket that defines a project is complete only when its bootstrapped documentation is complete. Start a wiki and put the new documentation there, but first bootstrap it. Define a system and templates for creating pages in the wiki. Document the types of pages and link them together so that the wiki will stand on its own after the creator has moved on to other projects. At some point there will be such a large body of knowledge that relying on one’s own memory instead of starting at the wiki will cease to be economically viable.
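The rule that a project ticket is complete only when its documentation is complete can be expressed as a simple gate. The sketch below assumes a hypothetical `Ticket` record and a hypothetical list of required wiki sections. A real ticket system and wiki would expose this information through their own interfaces.

```python
from dataclasses import dataclass, field

# Hypothetical sections that every bootstrapped wiki page must contain.
REQUIRED_SECTIONS = ("Entry point", "Credentials location", "Business owner")


@dataclass
class Ticket:
    """Hypothetical project ticket with its linked wiki page."""
    title: str
    work_done: bool = False
    wiki_page: dict[str, str] = field(default_factory=dict)  # section -> text


def can_close(ticket: Ticket) -> bool:
    """A project ticket closes only when the work and the documentation are done."""
    documented = all(ticket.wiki_page.get(s, "").strip() for s in REQUIRED_SECTIONS)
    return ticket.work_done and documented
```

The page template defined for the wiki is what would supply the list of required sections, so the gate and the template stay in step.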
In this article I have defined some of the major paradigm shifts necessary to implement and use bootstrapped documentation. In Part 3, I will map this to an existing system that our company is using to successfully bootstrap approximately 100 servers, associated network gear, and facility systems spread across two data centers.