The initial problem with finding and hiring a Site Reliability Engineer (SRE) is that not everyone agrees on who counts as one and what the exact requirements of the role are.  That may be an exaggeration, but the questions remain:

  • Is an SRE responsible for building servers?
  • Does an SRE write a lot of code?
  • Who is impacted the most by the work of an SRE?

The definitive answer to the problem comes from Ben Treynor, VP of Engineering at Google.  He knows all about SREs because he is one.  Site reliability engineering, he says, is what happens when a software engineer is asked to design an operations function.  In addition to automating processes like server configuration, SREs ensure that websites are fast and available and provide a best-in-class web experience for the customer base.

Does it sound overwhelming?  It’s a lot of responsibility and those who can handle it are in high demand.  That means they are difficult to recruit, but first, you have to find them.

Where Do You Find Them?

You won’t find them on the popular social networking sites.  They just don’t have the spare time to spend where nobody understands their work and their problems.  Specialized forums such as The Cisco Learning Network and Spiceworks are favored spots to hang out.  SREs also share knowledge and conversation at AnandTech, Server Fault, and Network Engineering.

How Do You Talk to Them?

After you find one, how do you recruit them?  Know enough about the person and the work they do to maintain a reasonable conversation.  No, you don’t need a degree in the subject.  Just show an interest in the work.  A willingness to ask questions is important, especially about the challenges SREs face.

  • SREs create runbooks with instructions on what to do or check when something goes wrong with a device. Such documentation helps resolve issues as quickly as possible.
  • SREs are not stuck with a lot of code writing. This observation comes from Andrew Fong, Director of Engineering at Dropbox. Rather, he says, “SREs are worried about data center deployment and design.”
  • SREs, he adds, “worry about other layers of the stack.” This directs their attention to areas other than the software engineering side.

Know What They Look for in New Jobs

From the other side of the recruiting process, there are a few specific things that SREs want in their workplace.  An interview with Mark Henderson, an SRE at Stack Overflow, explains.

  • Openness to change is important: SREs want organizations that allow flexibility in exploring new technologies. Organizations that are too rigid are to be avoided.
  • Mutual respect between developers and operations is essential. Everyone works better if developers trust the SREs.
  • Realistic expectations about on-call shifts are a “must” in recruiting. Henderson explains that most SREs expect to be on call on a rotating schedule,