Let's Talk Documentation
[this was originally published on LinkedIn in 2024] Let’s talk documentation. Specifically, SRE and Operational documentation.
Let's open with some ground rules about operational documentation:
- Documentation is always wrong, at least a little bit.
- Documentation is always out of date.
- Documenting isn’t considered valuable time spent by an SRE until a problem has already manifested and the Subject Matter Expert isn’t available.
- Processes that happen rarely are less likely to have accurate, up-to-date documentation.
Each of these issues is a problem of prioritization and perceived value, especially the fourth rule. Processes that are rarely run usually have a low priority for correct documentation, because finishing the process feels more important than documenting it. Paradoxically, the less often a process is run, the more valuable documentation of that process becomes, since nobody remembers the steps.
The fixes are relatively simple:
- Make writing and updating documentation part of the operational process from the outset. Create a documentation ticket, and track that ticket’s lifecycle.
- Ticket and prioritize documentation updates in each sprint, alongside the rest of the work.
- Hold periodic documentation reviews. At least a couple of times a year, have team members work through a process using the documentation itself as their reference. See what fails and what’s missing.
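One way to seed a periodic review is to list the documents nobody has touched in a while. The sketch below is a minimal illustration, not a prescribed tool: it assumes the docs are Markdown files in a directory tree, and the function name `find_stale_docs` and the 180-day threshold are hypothetical choices. File modification time is only a rough proxy for "last reviewed"; in a version-controlled repo, the last commit date may be a better signal.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def find_stale_docs(docs_dir, max_age_days=180, now=None):
    """Return (path, age_in_days) for Markdown files older than max_age_days.

    Age is based on file modification time, which is an assumption:
    it approximates, but does not prove, when a doc was last reviewed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = timedelta(days=max_age_days)
    stale = []
    for path in sorted(Path(docs_dir).rglob("*.md")):
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        age = now - modified
        if age > cutoff:
            stale.append((path, age.days))
    return stale
```

Calling `find_stale_docs("docs")` returns the candidates for the next review, and each result could become one of the documentation tickets described above.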
All of this should be obvious, but it’s up to team leads to assign ownership and encourage the team to tackle the mundane with as much gusto as the new.