Start With Empathy
This post collects my observations and thoughts, gathered over a number of years, on the interactions I have seen between engineering teams and development teams and the faults therein, and on how we can improve those interactions to build a better DevOps culture where SREs and developers work together towards a common goal of site reliability.
To build a relationship in any situation in life, you have to start with empathy, and it’s even more important for relationships between teams, where people can have varying degrees of knowledge and skill. As a member of an SRE team, you need to be empathetic to the development team so that you can listen, understand their way of thinking, and fully understand their requirements. Without empathy, we fail to see the problem from the other person’s point of view.
When we talk about building great DevOps cultures, good communication is the most important part, and communication can only happen freely when teams trust each other. This trust can only be built over time by openly sharing and discussing ideas. When people feel that their ideas are listened to and valued, they communicate more. This allows problems to surface early and friction to be resolved.
For SRE and dev teams to communicate better, the dev teams need to trust the SRE team, and this can only be achieved by listening without prejudice and showing that we have their best interests at heart and that we are there to help them. Ask questions, and try to understand why they are asking for something and what core need lies behind the request. Believe me, it will surprise you how much you can learn, and how much stronger your relationships become, just by being curious about their needs. That in turn allows you to suggest and provide better solutions.
So the next time a developer comes to you with a request or raises a ticket, or you are in a meeting with dev teams about a new project, don’t criticize the lack of detail in the plan or their lack of understanding of production environments. Instead, keep in mind that they might not know what their code looks like in production, or where and how it’s deployed. As developers, they are most likely busy with several other tasks in their own domain, just as we are busy most of the time with production issues. Unlike us, they haven’t spent years learning about web servers and databases, because they have been focused on acquiring different knowledge specific to their own field. Just keep an open mind: we are all working towards the common goal of site reliability, and we all have an important part to play in it.
If teams don’t talk to each other, then no amount of fancy software, pipelines, or SRE titles is going to solve the silos, the integration problems, and the constant tech debt that disjointed teams churn out.
In SRE we often talk about SLOs and use them as quantifiable measures of how well we are doing our job, and then set them as the service level objectives to meet as a team. These SLOs reflect the user experience: how fast our site is and how quickly we resolve an incident.
There can also be an internal SLO to measure the experience of our in-house users: for example, the turnaround time for tickets or user creation, or the stability of our pipeline runs. SLOs on their own are measurable, quantifiable objectives, but we can make them more humane by adding a human touch. For example, we can say that the performance team couldn’t do their job for four hours because their environment pipeline failed. This helps your SRE team understand the impact of these failures and see the pain as felt by the performance team, instead of seeing just another pipeline failure to fix. You have now attached the perf team’s experience to the failure of the pipeline. Knowing the human impact of any future failure, your SRE team will make better decisions and build a better pipeline, because they understand that there is a user experience cost attached to each of these failures.
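As a rough illustration of the idea, here is a minimal Python sketch that reports a pipeline SLO together with the human impact behind it. Everything in it is an assumption made for the example: the `PipelineRun` record, its `blocked_team` and `hours_blocked` fields, and the 95% target are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class PipelineRun:
    """One pipeline run (hypothetical record, invented for this sketch)."""
    succeeded: bool
    blocked_team: str = ""      # team that was waiting on this run, if any
    hours_blocked: float = 0.0  # how long that team could not work


def success_rate(runs):
    """Fraction of runs that succeeded: the bare, numeric indicator."""
    if not runs:
        return 1.0  # no runs means nothing has failed
    return sum(r.succeeded for r in runs) / len(runs)


def human_impact(runs):
    """Total hours each team was blocked by failed runs."""
    impact = {}
    for r in runs:
        if not r.succeeded and r.blocked_team:
            impact[r.blocked_team] = impact.get(r.blocked_team, 0.0) + r.hours_blocked
    return impact


def report(runs, target=0.95):
    """Pair the number with the people affected (target is an assumed value)."""
    lines = [f"Pipeline success rate: {success_rate(runs):.1%} (target {target:.0%})"]
    for team, hours in sorted(human_impact(runs).items()):
        lines.append(f"The {team} team could not work for {hours:g} hours")
    return "\n".join(lines)


if __name__ == "__main__":
    # 19 good runs and one failure that blocked the performance team.
    runs = [PipelineRun(True) for _ in range(19)]
    runs.append(PipelineRun(False, blocked_team="performance", hours_blocked=4))
    print(report(runs))
```

In this made-up data the success rate still meets the assumed target, yet the second line of the report shows that one team lost four hours of work, which is the part a bare percentage hides.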
Being empathetic not only helps relationships across teams, it also improves communication within the SRE team. It makes people open to other team members’ views and helps them listen properly and understand each other. People are also more understanding once their own ideas have been listened to. And it helps senior members become better at mentoring junior staff.
In summary, when it comes to building a DevOps culture, empathy is the most important quality you can foster within yourself and others.