L3 Support Engineer

Israel · Mid-level

Quick Summary

Key Responsibilities

Deep Technical Investigation (Primary Focus): Lead root cause analysis beyond L2 depth (GPU failures, firmware issues, Linux-level faults, HW/SW interactions).

Responsibilities

  • Lead root cause analysis beyond L2 depth (GPU failures, firmware issues, Linux-level faults, HW/SW interactions).

  • Detect recurring patterns across sites and convert findings into durable fixes.

  • Own technical workstreams during high-severity incidents.

  • Build evidence packs and drive escalations with ODM and R&D.

  • Push for firmware, component, and platform-level resolutions.

  • Track outcomes and ensure knowledge flows back to operations.

  • Support validation and rollout of firmware updates (risk assessment, staging, rollback planning).

  • Help operationalize platform standards across datacenters.

  • Create scalable runbooks, troubleshooting guides, and error catalogs.

  • Turn investigations into playbooks that elevate L1/L2 teams.

  • Travel to datacenters for complex troubleshooting, new platform readiness, or incident containment.

Requirements

  • Strong hands-on experience with datacenter servers and deep Linux troubleshooting.

  • Ability to diagnose across hardware, BIOS/BMC firmware, and Linux (logs, drivers, storage basics, performance triage).

  • Structured incident response experience and clear communication under pressure.

  • Experience driving evidence-based escalations with vendors/R&D.

  • Fluent English (written and spoken).

Nice to Have

  • Strong familiarity with GPU server platforms and tooling (for example: nvidia-smi, dcgmi, Linux logs correlation).

  • Experience with ipmitool and Redfish workflows, firmware lifecycle, and staged rollouts.

  • Scripting skills (bash and basic Python) for log collection, triage automation, and simple reliability analysis.

  • Exposure to OCP-based platforms and ODM manufacturing ecosystems.

  • Experience supporting enterprise bare metal customers under contractual SLAs.

What We Offer

Competitive salary and comprehensive benefits package.
Opportunities for professional growth within Nebius.
Flexible working arrangements.
A dynamic and collaborative work environment that values initiative and innovation.

Listing Details

First seen: April 3, 2026
Last seen: April 26, 2026

Nebius

Nebius is a cutting-edge AI cloud platform that offers scalable infrastructure for developing and deploying AI solutions.

Employees: 350
Founded: 2022