My First Three Months as an SRE: A Retrospective

July 6, 2025 · WORK/MINDSET

Introduction

I've been at the new company for about three months now. Looking back, this is the biggest career shift I've made so far. I started as a Frontend Engineer intern, moved into Full Stack, then Backend — and now I'm an SRE. Each transition felt natural at the time, but this one is different. The role, the environment, and the way I think about systems have all changed significantly.

What has been happening in my life recently

The first three months were a lot to take in, both personally and professionally. I felt exhausted in a way I hadn't experienced before, but not because I disliked the work. If anything, the opposite: I'm a workaholic by nature, and not having meaningful work to do burns me out faster than overworking does.

On the work side, the biggest adjustment was the environment itself. My previous role involved a lot of cloud infrastructure: AWS, Terraform, writing modules, managing services. I was comfortable in that world. Creating a VM, a database, or a CDN takes a few clicks, and a managed platform handles everything underneath. The new role is on-premise, where none of that is handed to you.

What happened during my SRE probation

From day one, I was given real work. My first task: refactor the Jenkins pipelines for all of our microservices. We have over 30 of them. I won't lie: when I first understood the full scope of it, I stared at my screen for a moment.

Here's what I worked through during probation:

  1. Refactor Jenkins pipelines for all microservices — each service had its own slightly different pipeline setup that had accumulated over time. The first step was just understanding what every service actually did before touching anything.

  2. Redesign the Access Libraries permission model — the existing permission model for our shared libraries had grown inconsistent over time. I redesigned it from scratch, thinking through who should own what, and what the right boundaries between teams should look like. This kind of work is quiet but important — getting permissions wrong is a security issue, and getting them too tight breaks developer workflows.

  3. Write a Trivy scanning script for on-premise machines — I'd only ever written simple shell scripts before this. Writing a script that actually runs Trivy across our machines and produces something useful pushed me to think more carefully about how shell scripting works at scale. On-premise, vulnerability scanning isn't handed to you — you build it yourself.

  4. Write a custom Groovy script for Jenkins build failure notifications: I originally looked for a plugin that could send build failure messages in the way we needed. None of them fit our setup, so I ended up writing a Groovy script from scratch. It was the first real Groovy code I had written, and having to read through plugin source code and the Jenkins docs to understand what was possible was a good lesson in figuring things out when there's no ready-made solution.
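To give a feel for what "produces something useful" means in item 3, here is a minimal sketch. It is not the script I actually wrote (that one is a shell script). It is a Python illustration that assumes the `trivy` binary is installed and on the PATH, and that its JSON report has a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field. The function names are made up for this example.

```python
import json
import socket
import subprocess
from collections import Counter

# Order in which severities are shown in the one-line summary.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "UNKNOWN"]


def summarize_trivy_report(report: dict) -> Counter:
    """Count vulnerabilities by severity in a parsed Trivy JSON report."""
    counts = Counter()
    # "Results" and "Vulnerabilities" can be missing or null, so fall back to [].
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def format_summary(host: str, counts: Counter) -> str:
    """Render one line per host, listing only severities that were found."""
    parts = [f"{sev}={counts[sev]}" for sev in SEVERITY_ORDER if counts[sev]]
    return f"{host}: " + (", ".join(parts) if parts else "no findings")


def scan_filesystem(path: str = "/") -> dict:
    """Run Trivy against a local path and return the parsed JSON report.

    Assumption: trivy is installed locally; this raises if the command fails.
    """
    proc = subprocess.run(
        ["trivy", "fs", "--format", "json", "--quiet", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    report = scan_filesystem("/")
    print(format_summary(socket.gethostname(), summarize_trivy_report(report)))
```

Run on each machine, this prints one short line per host. A line like that is easier to act on than the raw report, which is the point of wrapping the scanner in a script.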

The mindset shift

The biggest thing these three months taught me is how differently I think about systems now.

As a developer, I cared about the code. As an SRE, I care about the code and everything underneath it — the pipelines, the machines, the permissions, the reliability. When something breaks, I can't hand it off to an ops team. It lands on me, and I need to understand it at every layer.

On-premise also forces a kind of discipline that the cloud quietly makes optional. You can't just spin up a new instance and retire the old one without thinking about it. Resources are physical and finite, so every decision carries more weight.

I'm still learning a lot, and I expect the next few months to be just as challenging. But I can already feel that this role is reshaping how I approach software and infrastructure as a whole. That feels like exactly the right kind of hard.

Made with ❤️ by Jiawei Hong