From Zero to Hero: Recommended Practices for Training your Ever-Evolving SRE Teams
Andrew Widdowson, Google
or, "How can I strap a jetpack to my newbies, while keeping everyone up to speed?"
SRE teams go to where the action is, but when team members are deeply embedded in large-scale problems, little time is left to do things like train one's newest teammates. "Here kid, grab a hose and help me fight this fire" only works up to a limit that you will definitely exceed when you're trying to mold your newest systems or software engineer into a fully functional Site Reliability Engineer. Plus, the stack(s) that your team is on-call for are rapidly evolving, and if you blink, even your most senior SREs can quickly be out of touch with the state of the systems. Uh oh!
The often understated truth is that SREs need to be as good--or better--at scaling humans as they are at scaling computers, if they want to be able to keep up with the systems that they oversee. How, then, can you keep your existing SREs up to speed and sharp as a tack, while making sure that your newest teammates can learn the ropes and become just as seasoned, sooner rather than later or never?
In this talk, Andrew will share a set of practices we're using at Google to train our next wave of SREs better, stronger, and faster... and then keep them that way! You'll learn about ways to encourage large-scale systems thinking, provide hands-on opportunities for learning, and convey, as quickly as possible, the technical and philosophical subtleties of what makes the best SREs so effective.
View the full SREcon15 Program at https://www.usenix.org/conference/srecon15/program