Site Reliability Engineer

Canada · Toronto · Mid-level

Who We Are

At Momentum Financial Services Group, we help people move forward by reimagining how money works for those who need it most. With more than 40 years of experience, we’re the team behind Money Mart, Canada’s largest non-bank branch network, and a leader in financial solutions for…

Compensation Philosophy – Competitive pay aligned with experience and market standards
Discretionary Annual Bonus – Rewarding both individual and company performance
Comprehensive Benefits – Health and dental coverage with premiums fully paid, plus access to an Employee Assistance Program
Retirement Plans – Helping you plan and save for the future
Hybrid Work Environment – Flexibility to balance remote and in-office collaboration; enjoy our corporate HQ spaces designed for teamwork and creativity
Perks and Rewards – Tuition reimbursement, professional development support, discounts through Perkopolis, and recognition programs that celebrate your impact

The Site Reliability Engineer is responsible for ensuring the availability, performance, and resilience of the organization's digital banking and financial services platforms. This role focuses on automating operational processes, defining and maintaining service-level objectives, and engineering systems that can withstand and recover from failure. You will work closely with engineering, DevOps, QA, cybersecurity, and compliance teams to ensure platform reliability meets both technical and regulatory standards, while minimizing risk to production systems through proactive monitoring, incident response, and continuous improvement of the software delivery lifecycle.

Define and maintain service-level objectives (SLOs), error budgets, and reliability targets aligned with business goals and compliance deadlines.
Oversee the end-to-end service lifecycle, from code integration to production deployment, with a focus on stability and risk reduction.
Ensure all changes comply with relevant financial regulations.
Conduct reliability risk and blast-radius assessments before production changes.
Coordinate go/no-go decisions with engineering, QA, compliance, and operations stakeholders.

Own build, test, and deployment pipelines across multiple environments (staging, UAT, production), ensuring changes are safe, repeatable, and observable.
Design and maintain automated CI/CD pipelines and enforce version control policies (e.g., Git Flow) to reduce toil and human error.
Engineer zero-downtime deployments and low-impact change strategies for high-availability systems.
Develop and maintain rollback, failover, and disaster recovery runbooks for production incidents.

Collaborate with Information Security and Compliance teams to validate that infrastructure and deployment practices meet data protection and privacy standards.
Maintain audit-ready documentation of change activity, incident timelines, and remediation records.
Support internal and external audits with detailed operational and change history.

Drive automation, standardization, and observability improvements across the production environment.
Conduct post-incident reviews (blameless post-mortems) to identify systemic failures and prevent recurrence.
Contribute to DevOps and SRE maturity initiatives across engineering teams.

Act as the central liaison between product, development, and compliance teams on production health and change risk.
Communicate change scope, reliability risks, and incident status clearly to both technical and non-technical stakeholders.
Provide regular reliability reporting, SLO performance metrics, and incident trends to senior management.

Technical Proficiency

CI/CD tools
Cloud platforms (AWS, Azure)
Containers and orchestration (Docker, Kubernetes)
Scripting languages (Python, Bash)
Infrastructure as Code (Terraform, Ansible)
Observability and monitoring tools

Strong cross-functional collaboration and communication across engineering and compliance teams.
Rigorous attention to detail with a proactive approach to risk and failure detection.
Ability to perform under pressure and respond decisively during incidents and regulatory deadlines.

Bachelor's degree in Computer Science, Information Technology, or related field.
3-5 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering within financial services or fintech.
Hands-on experience maintaining reliability for real-time transaction systems, mobile banking, or payment gateways.
Familiarity with regulatory compliance requirements and their operational implications for production systems.

Ready to apply your Site Reliability Engineering expertise to make a real impact? Join us and help shape the future of tech at MFSG. Apply today.

Committed to Equal Opportunity:

MFSG is committed to accommodating applicants up to the point of undue hardship during the recruitment, assessment and selection process. If you are selected for an interview, please notify MFSG if you require accommodation in respect of the materials or procedures used at any time during this process. If you require accommodation, MFSG will work with you to determine how to meet your needs.

Please note: The salary range for this position is C$110,000 to C$120,000.

About MFSG – Our Commitment to Responsible Innovation

At MFSG, we are committed to building innovative solutions grounded in ethical, transparent, and responsible use of data and technology. Aligned with the principles outlined in Canada’s proposed Artificial Intelligence and Data Act (AIDA), we take a proactive approach to ensuring that any AI or data-driven systems we use are safe, fair, and accountable.

This posting is for a current position within our organization, offering the opportunity to contribute to meaningful, responsible innovation that supports our employees, clients, and communities.

We prioritize strong data governance, clear communication around how systems work, and safeguards that reduce risks and protect individuals. Our focus is on developing tools and processes that promote equity, reliability, and trust, supported by ongoing monitoring and continuous improvement.

Joining MFSG means contributing to a future-focused organization that values both innovation and integrity, where your work helps shape solutions that responsibly support our employees, clients, and communities.