Csit

System Reliability Engineer (Data Centre)

Singapore, Singapore · Full-time · Mid-level
Engineering · Reliability Engineer

Overview

You will be part of a dynamic team responsible for ensuring the reliability, availability, and performance of our data centre's IT operations. As a System Reliability Engineer (Data Centre), you will oversee day-to-day IT operations within the data centre and work closely with various teams to ensure seamless IT service delivery. The primary focus of this role is IT operations, although knowledge of data centre power and cooling infrastructure is beneficial, and you will collaborate with the Data Centre Facilities teams on power, cooling, and physical infrastructure matters as needed. You must have a good understanding of cloud infrastructure technologies and architecture, and of site reliability engineering (SRE) principles.

Responsibilities

  • Oversee and manage IT operations within the data centre, including day-to-day monitoring, incident management, and problem management
  • Lead the end-to-end incident management lifecycle, which encompasses immediate troubleshooting, root cause identification, and resolution to restore services, followed by comprehensive post-incident analysis
  • Develop and maintain documentation on IT infrastructure, operations, and procedures within the data centre
  • Perform capacity planning to ensure IT infrastructure is scalable for future demands
  • Collaborate and coordinate with Data Centre Facilities teams on matters related to power, cooling, and physical infrastructure
  • Design and implement a robust observability platform, alongside network monitoring tools, for performance monitoring and real-time alerting of IT devices and networks
  • Implement and manage remote management tools for out-of-band access and control of IT devices and servers
  • Define, implement, and track SRE metrics, including service level objectives (SLOs), service level indicators (SLIs), and error budgets, to improve data centre IT reliability

Requirements

  • Background in Computer Science, Computer or Electrical Engineering, Information Technology, or a related field
  • Good technical knowledge of IT infrastructure, including servers, storage, networking, and cloud technologies
  • Proficient in IT management software and tools
  • 2 years of work experience in IT operations is preferred
  • Fresh graduates are welcome to apply

    As CSIT is an agency under the Ministry of Defence (Singapore), only Singapore Citizens will be considered.

    Location & Eligibility

    Location: Singapore, Singapore (on-site at the office)
    Who can apply: Singapore (SG)

    Listing Details

    Posted: June 22, 2026
    First seen: June 22, 2026
    Last seen: June 22, 2026

    CSIT
    Employees: 125
    Founded: 2009