Staff/Senior Staff Engineer, Kubernetes

OKX will be prioritising applicants who have a current right to work in Singapore, and do not require OKX's sponsorship of a visa.

At OKX, we believe that crypto will reshape the future and ultimately contribute to every individual's freedom.
 
OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves.
 
Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.

OKX is part of OKG, a group that brings the value of blockchain to users around the world through our leading products: OKX, OKX Wallet, OKLink, and more.

 

Responsibilities

  • K8s cluster lifecycle management: Own the build, scaling, version upgrades, daily operations, fault diagnosis, and performance tuning of large-scale production Kubernetes clusters; ensure 24/7 high availability and stable operations; support continuous business iteration.
  • Alibaba Cloud & AWS multi-cloud operations (core responsibility): Operate, govern, and optimize Alibaba Cloud and AWS resources across dual-cloud environments, covering container services, networking, storage, IAM, load balancing, databases, and object storage; manage configuration changes, cost optimization, and disaster recovery to achieve unified multi-cloud governance.
  • Cloud-native architecture and optimization: Lead containerization and microservices operational rollout; optimize Pod scheduling, resource quotas, network policies, image management, and log monitoring systems; resolve cluster resource fragmentation, business adaptation, and network interoperability challenges.
  • Stability and security: Build comprehensive K8s cluster monitoring, alerting, logging, and distributed tracing systems; define operations runbooks, change processes, and incident response plans; strengthen cluster security controls, disable high-risk permissions, harden container runtime environments, and ensure infrastructure and business data security.
  • Automated operations and DevOps: Develop operations automation scripts using Shell/Python; integrate Jenkins, GitLab CI, and ArgoCD to build automated release, inspection, and backup systems; implement Infrastructure as Code (IaC) principles to improve efficiency and reduce human error.
  • Incident management and post-mortem optimization: Lead online incident response, conduct root cause analysis, produce post-mortem reports, and continuously optimize cluster architecture, resource allocation, monitoring strategy, and long-term stability assurance mechanisms.
  • Technical knowledge sharing and team empowerment: Track cloud-native and public cloud technology developments; document operations best practices and technical specifications; help the team improve its multi-cloud K8s operations capabilities.

 

Requirements

  • Bachelor's degree or above in a computer-related field; 4+ years of hands-on experience operating production-grade Kubernetes clusters; proficient in K8s core principles and components, including Pod, Deployment, StatefulSet, Service, Ingress, CRD, controllers, scheduling strategies, network models, and storage mounting; able to independently resolve complex cluster failures and performance bottlenecks.
  • Proficient in Alibaba Cloud and AWS dual-cloud operations, with independent experience in dual-cloud production environments:
      • Alibaba Cloud: proficient in ACK Container Service, ECS, SLB, VPC, RAM, RDS, OSS, CloudMonitor, security groups, and snapshot backups.
      • AWS: proficient in EKS, EC2, S3, VPC, IAM, TGW, load balancing, CloudWatch, and security policies; practical experience in overseas cloud deployment, operations, and disaster recovery.
  • Proficient in Linux system administration; familiar with system optimization, permission control, process management, log analysis, and online troubleshooting.
  • Familiar with mainstream container runtimes (containerd/Docker); understand K8s networking (CNI plugins such as Calico/Flannel), storage (CSI), and multi-cluster management; familiar with Istio/Envoy service mesh, east-west traffic governance, canary (gray-scale) releases, and network interoperability.
  • Strong Shell and Python automation skills; experienced with CI/CD pipelines (Jenkins, GitLab CI, ArgoCD); familiar with IaC tools (Terraform, Ansible, Helm); experienced with observability stacks (Prometheus, Grafana, ELK/EFK, Jaeger, SkyWalking).
  • Preferred: experience in large-scale public cloud environments (100+ nodes); multi-cloud cost optimization; K8s security hardening (OPA/Gatekeeper, Pod Security Standards, Falco); Kubernetes CKA/CKS certification; experience with AI/LLM workload scheduling (GPU scheduling, distributed training).

 

What We Offer

  • Competitive total compensation package
  • L&D programs and education subsidy for employees' growth and development
  • Various team-building programs and company events
  • Wellness and meal allowances
  • Comprehensive healthcare schemes for employees and dependants
  • More that we would love to tell you about along the way!

Location & Eligibility

Location: Singapore, Singapore (on-site at the office)
Who can apply: SG

Listing Details

Posted: June 10, 2026
First seen: June 10, 2026
Last seen: June 10, 2026

OKX is a global cryptocurrency exchange and Web3 technology company, offering trading, wallet services, and access to decentralized finance. Founded in 2017, it serves millions of users in over 100 countries.

Employees: 5k+
Founded: 2017
Domain: okx.com