Software Engineer - Infrastructure
Quick Summary
Emergent builds autonomous coding agents that replace traditional software development by generating, testing, and deploying production applications directly from plain-language intent. Our systems run in production at global scale and are used to build millions of real applications.
Since public launch, Emergent has reached $100M ARR in 8 months. 6M+ users across 190+ countries have built 6.5M+ applications on Emergent. We've raised $100M+, backed by Khosla Ventures, SoftBank, Google, Lightspeed, Prosus, Together, and Y Combinator.
We're solving the hard part of AI-driven software creation: correctness, reliability, security, and scale in real production systems. The team is built by repeat founders, Olympiad medalists, IIT & IIM alumni, and leaders from Google, Amazon, and Dropbox.
We're hiring builders who want ownership, speed, and impact at global scale.
- Maintain stability of our platform consisting of distributed microservices closely interacting with Kubernetes and cloud providers (GCP, AWS)
- Manage Kubernetes workloads with ArgoCD (GitOps) — deploy, monitor, and troubleshoot application syncs, resource trees, and rollouts
- Debug and resolve complex Kubernetes issues across clusters
- Manage CDN and edge infrastructure (Cloudflare) for performance, caching, and traffic management
- Automate infrastructure lifecycle operations and workflows
- Own the observability stack: Grafana (dashboards, Loki logs, Prometheus metrics), New Relic (APM, golden metrics, transaction analysis)
- Enhance monitoring, alerting, and distributed tracing across services
- Participate in on-call rotation via PagerDuty, handle incident response, and perform root cause analysis
- Proactively identify reliability risks before they become incidents
- Support the platform that runs AI agent workloads — job scheduling, trajectory tracking, environment provisioning, deployments and cost attribution
- Develop Kubernetes controllers and operators to extend platform capabilities for agent orchestration
- Work closely with product and backend teams to ensure platform scalability and reliability
- Build internal tools, automate workflows, and integrate systems to improve team productivity
- Stay current with Kubernetes releases, CNCF ecosystem updates, and cloud-native best practices
Requirements
- 3+ years of software/platform engineering experience with production systems
- Strong proficiency in Go or Python — you write production code in at least one daily
- Hands-on experience building and deploying services on Kubernetes — not just YAML, you've developed something that runs on K8s
- Experience with GitOps tooling (ArgoCD, Flux, or similar)
- Strong networking and DNS fundamentals — TCP/IP, HTTP, load balancing, DNS resolution, TLS, and debugging connectivity issues
- Solid Linux/OS fundamentals — process management, filesystems, memory, and systemd; comfortable debugging with tools like strace, tcpdump, and netstat
- Relational databases — experience with PostgreSQL, MySQL, or similar; indexing, query optimization, replication, and backup/restore procedures
- NoSQL databases — familiarity with MongoDB, DynamoDB, Redis, or similar for document/key-value workloads
- Caching — experience with Redis, Memcached, or similar for application and infrastructure-level caching
- Message queues & streaming — hands-on with Kafka, SQS, RabbitMQ, or similar for event-driven architectures
- Strong SQL skills for debugging and operational queries
- Comfortable with the CNCF ecosystem — Helm, Kustomize, cert-manager, Ingress controllers, CNI/CSI interfaces
- Hands-on with at least one observability stack (Grafana/Prometheus/Loki, New Relic, Datadog, or similar)
- Familiarity with GCP and/or AWS — managed Kubernetes (GKE/EKS), networking, IAM, storage, and cloud-native services (SES, SQS, S3, etc.)
- Experience with CDN/edge platforms (Cloudflare, CloudFront, or similar)
Nice to Have
- Experience building Kubernetes Operators (kubebuilder, operator-sdk, or controller-runtime)
- Experience tuning Kubernetes core components (API server, kubelet, scheduler)
- Familiarity with AI/LLM infrastructure — token management, cost tracking, agent orchestration
- Experience with CI/CD pipelines (GitHub Actions, automated testing, deployment pipelines)
- Infrastructure as Code experience (Terraform, Pulumi, or similar)
- Previous work on large-scale distributed systems or platform-as-a-service
- Startup experience — you thrive in fast-paced, ambiguous environments
- You're a generalist who can context-switch between debugging a K8s deployment, setting up a Grafana alert, and configuring CDN rules — all in the same day
- You enjoy solving complex infrastructure challenges and automating away toil
- You dig deep — when something breaks, you find the root cause, not just the workaround
- You communicate clearly and can collaborate effectively in a fast-moving, distributed team
We don't require previous experience with our entire stack, but enthusiasm for learning is key.
Go · Python · Kubernetes · ArgoCD · Helm · GCP · AWS · Cloudflare · Grafana · Prometheus · Loki · New Relic · PagerDuty · PostgreSQL · MongoDB · Redis · Kafka · GitHub
- YC S24 backed with strong investor support
- Building at the frontier of AI-powered software creation
- Small team, high ownership, real impact from day one
Listing Details
- Posted: March 13, 2026
- First seen: March 26, 2026
- Last seen: April 14, 2026