Braze

Senior Site Reliability Engineer

Brazil · São Paulo · Senior
Engineering · DevOps Engineer

Overview

At Braze, we have found our people. We’re a genuinely approachable, exceptionally kind, and intensely passionate crew.

We seek to ignite that passion by setting high standards, championing teamwork, and creating work-life harmony as we collectively navigate rapid growth on a global scale while striving for greater equity and opportunity – inside and outside our organization.

To flourish here, you must be prepared to set a high bar for yourself and those around you. There is always a way to contribute: acting with autonomy, having accountability, and being open to new perspectives are essential to our continued success.

Our deep curiosity to learn and our eagerness to share diverse passions with others give us balance and inject a one-of-a-kind vibrancy into our culture.

If you are driven to solve exhilarating challenges and have a bias toward action in the face of change, you will be empowered to make a real impact here, with a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can’t wait to meet you.

Responsibilities

Braze runs one of the largest MongoDB deployments in the world – powering real-time customer engagement for thousands of the world’s leading brands. We process hundreds of billions of data points each month across more than 3.3 billion monthly active users, with MongoDB at the core of how we store, query, and serve that data at scale.

As a Senior SRE on the MongoDB Platform team, your primary mission is to make MongoDB better for Braze – and to do so with the rigor, automation-first mindset, and engineering discipline of a world-class SRE. You won’t just keep the lights on; you’ll architect a more reliable, scalable, and observable MongoDB platform that the entire engineering organization depends on.

Main responsibilities:

Own MongoDB Reliability at Scale

  • Design and operate Braze’s MongoDB infrastructure to meet strict enterprise-grade SLAs, with deep ownership of availability, durability, and query performance
  • Build proactive monitoring and alerting that fires on symptoms – before customers feel impact – with rich MongoDB-specific observability (oplog lag, replication health, lock contention, index hit rates, etc.)
  • Lead capacity planning and sharding strategy as data volumes and query patterns evolve
  • Drive root-cause analysis on MongoDB incidents and translate findings into permanent system improvements

Improve the MongoDB Developer Experience

  • Partner with product engineering teams to review schema designs, index strategies, and aggregation pipelines – catching scalability anti-patterns before they reach production
  • Build self-service tooling, automation, and runbooks that let engineers interact with MongoDB safely and efficiently without needing to page the platform team
  • Define and enforce connection pool sizing, write-concern defaults, and read-preference standards across the fleet

Build and Automate Infrastructure

  • Manage MongoDB cluster lifecycle (provisioning, upgrades, failovers, decommissions) on Kubernetes using the MongoDB Enterprise Kubernetes Operator, with infrastructure defined as code via Terraform and Ansible
  • Develop and maintain automated backup, restore, and point-in-time recovery workflows – tested regularly against real workloads
  • Contribute to internal platform tooling in Ruby and/or Go that reduces operational toil across the SRE organization

Incident Response & On-Call

  • Participate in a PagerDuty on-call rotation with a clear charter: use every quiet shift to eliminate the next page
  • Lead incident retrospectives with a bias toward systemic fixes, automation, and documentation – not blame
  • Maintain and improve runbooks so that any engineer on the team can respond effectively to MongoDB incidents

Required:

  • 5+ years of experience as a Software Engineer, DevOps Engineer, or Site Reliability Engineer in a production environment
  • Hands-on MongoDB expertise: replica sets, sharding, index design, aggregation pipelines, explain plans, and performance tuning under real load
  • Strong Linux fundamentals and comfort operating at the OS level (disk I/O, memory, networking, process management)
  • Strong programming skills in one or more of: Python, Go, Ruby, or JavaScript – you write automation, not just scripts (JavaScript/Python experience is a plus for MongoDB shell scripting and aggregation pipeline work)
  • Experience with IaC tools: Terraform, Ansible, or equivalent
  • Experience with container orchestration: Docker and Kubernetes
  • A systems thinker who reasons about interfaces, failure modes, edge cases, and cascading effects across the stack
  • Bias toward documentation and asynchronous collaboration across global remote teams

Nice to Have:

  • Experience running MongoDB at multi-terabyte scale or in a sharded topology
  • Familiarity with MongoDB Atlas, Ops Manager, or Cloud Manager
  • Experience with complementary data technologies in Braze’s stack: Redis, Kafka, Postgres
  • Prior work on database platform engineering or database reliability engineering (DBRE) teams

#LI-Hybrid

What We Offer

Braze benefits vary by location, and we encourage you to review our specific benefits offerings for each country. More details on benefits plans will be provided if you receive an offer of employment.

From offering comprehensive benefits to fostering hybrid ways of working, we’ve got you covered so you can prioritize work-life harmony. Braze offers benefits such as:

  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
  • A curated in-office employee experience, designed to foster community, team connections, and innovation
  • Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
  • Employee Resource Groups that provide supportive communities within Braze
  • Collaborative, transparent, and fun culture recognized as a Great Place to Work®

Location & Eligibility

Where is the job: São Paulo, Brazil (on-site at the office)
Who can apply: BR

Listing Details

Posted: June 3, 2026
First seen: June 4, 2026
Last seen: June 7, 2026

Braze is a comprehensive customer engagement platform that powers relevant and memorable experiences between consumers and the brands they love.

Employees: 750
Founded: 2011
Domain: braze.com