Post-doctoral Fellow in Agentic Control for Cluster Management - 12-month contract
Quick Summary
Who are we? Télécom Paris, part of the IMT (Institut Mines-Télécom) and a founding member of the Institut Polytechnique de Paris, is one of France's top 5 general engineering schools.
The driving purpose of Télécom Paris is to train, imagine and undertake in order to design digital models, technologies and solutions for a society and economy that respect people and their environment.
We are looking for a postdoctoral researcher specialising in agentic control for cluster management to join the INFRES department at Télécom Paris.
Kubernetes has become a core platform for deploying and managing cloud-native systems and is increasingly used to host production AI workloads. Despite its maturity as an orchestration platform with built-in automation, day-to-day Kubernetes operations still often require significant human involvement. Cluster operators must inspect cluster state, interpret metrics, logs, traces, and events, diagnose failures, select corrective actions, execute commands or API operations, and verify that the system has returned to a healthy state. Recent LLM-based Kubernetes tools and research prototypes demonstrate the potential of language models to support these tasks through natural language interaction, command-line and API interaction, and cluster-aware reasoning, pointing towards more autonomous Kubernetes and Site Reliability Engineering (SRE) operations. The degree of autonomy varies across existing solutions from interactive human-in-the-loop assistance to more autonomous execution.
At the same time, the growing use of Kubernetes in edge computing environments makes autonomous cluster management an increasingly important research problem. While most existing studies focus on cloud environments or general Kubernetes management, edge deployments may involve multiple independently managed Kubernetes clusters operating under very different conditions. These clusters may be deployed at heterogeneous, resource-constrained, or physically hard-to-reach sites, including remote deployments for applications such as environmental monitoring. They may also face changing resource availability, unstable network conditions, and limited connectivity. In such environments, failures are harder and costlier to address through manual intervention, which increases the importance of zero-touch management and autonomous recovery at the level of each individual cluster. These constraints also make locally deployable open-weight models a practical option for supporting on-site reasoning, control, and recovery. Their utility can be further strengthened by retrieval-augmented generation, which allows decisions to be grounded in relevant local documents and operational data without continuous reliance on remote third-party services.
This postdoctoral project will investigate closed-loop agentic control for autonomous Kubernetes management in resource-constrained edge environments. The project will study how AI agents can observe the state of a Kubernetes cluster, interpret heterogeneous operational signals, reason over possible causes and corrective actions under safety constraints, execute selected recovery steps, and verify whether the cluster has returned to a healthy state. The research will particularly examine how locally deployable open-weight models, supported by retrieval-augmented generation over local documentation and operational data, can provide practical autonomy under limited connectivity and infrastructure constraints. The designed solution will be evaluated using either an existing evaluation framework, such as AIOpsLab, or a dedicated Kubernetes operational benchmark developed within the project. The evaluation is planned to use realistic Kubernetes failure diagnosis and recovery scenarios, administration tasks inspired by the Certified Kubernetes Administrator (CKA) exam, and repeated experiments to assess reliability under resource-constrained edge conditions.
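For candidates who want a concrete picture of the observe, reason, act and verify cycle described above, the following is a minimal sketch only. It does not reflect the project's actual design: the names Observation, ALLOWED_ACTIONS, diagnose, control_loop and FakeCluster are hypothetical, the cluster is simulated in memory rather than reached through the Kubernetes API, and the rule-based diagnose function merely stands in for the LLM and retrieval-based reasoning the project will study.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    """Hypothetical snapshot of one workload's state."""
    pod: str
    status: str  # displayed status string, e.g. "Running" or "CrashLoopBackOff"


# Hypothetical allow-list standing in for the safety constraints.
ALLOWED_ACTIONS = {"restart_pod", "noop"}


def diagnose(obs: Observation) -> str:
    """Placeholder for the reasoning step (an assumption, not the project's method)."""
    return "noop" if obs.status == "Running" else "restart_pod"


def control_loop(observe: Callable[[], Observation],
                 act: Callable[[str, Observation], None],
                 max_steps: int = 3) -> bool:
    """Run observe -> diagnose -> act until healthy; return True if healthy."""
    for _ in range(max_steps):
        obs = observe()
        action = diagnose(obs)
        if action not in ALLOWED_ACTIONS:
            return False  # refuse anything outside the allow-list
        if action == "noop":
            return True   # nothing to fix: state verified as healthy
        act(action, obs)
    # Final verification after the last permitted recovery step.
    return observe().status == "Running"


class FakeCluster:
    """In-memory stand-in for a cluster, used only to exercise the loop."""

    def __init__(self) -> None:
        self.status = "CrashLoopBackOff"
        self.actions: list[str] = []

    def observe(self) -> Observation:
        return Observation(pod="demo", status=self.status)

    def act(self, action: str, obs: Observation) -> None:
        self.actions.append(action)
        self.status = "Running"  # the simulated restart always succeeds


cluster = FakeCluster()
healthy = control_loop(cluster.observe, cluster.act)
```

In this simulated run the loop issues one restart and then confirms the healthy state on the next observation. A real agent would also have to handle recovery steps that fail or only partially succeed.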
This postdoctoral position is based in the Computer Science and Networks Department (INFRES), in the Networks, Mobility and Services (RMS) team, which is affiliated with the LTCI research laboratory. The INFRES department addresses scientific challenges arising from widespread digitization, drawing on its expertise in areas such as the architecture, design and verification of software systems and communication networks, data science, human-machine interaction, security, mobility, and the control of energy consumption. The research activities of the RMS team focus on very large networks and operated systems. In particular, we design the mobile networks and communications of tomorrow, the future Internet, the Internet of Things, and the evolution of the cloud and of virtualization. Our methodologies range from experimentation to theory: we experiment on testbeds, develop metrology tools, design architectures and protocols, and develop algorithms and analytical methods for evaluating and optimizing networks.
Your main responsibilities:
To carry out research in the field of autonomous cluster management for resource-constrained edge environments.
To contribute to the reputation of the School, the Institut Mines-Télécom and the Institut Polytechnique de Paris.
You hold a PhD or equivalent qualification. The role requires strong skills in artificial intelligence, machine learning and computer systems, as well as a good command of Linux programming. Experience with LLMs, RAG architectures, cloud-native technologies and Kubernetes is required, along with a good command of English.
Knowledge of distributed systems, edge computing or open source is an advantage. The ideal candidate must also be able to work as part of a team, communicate effectively and demonstrate the ability to synthesise information.
Why join us?
You'll be working in a fast-growing, pleasant, green and accessible environment (especially for people with disabilities) just 20 km from Paris (RER B and C suburban train lines, close to major roads, shared shuttle departing from Porte d'Orléans). You will benefit from:
49 days annual leave (CA + RTT)
flexible working hours (depending on department activity)
telecommuting 1 to 3 days/week possible
75% public transport pass reimbursement
Proximity to numerous sports facilities, concierge service, underground parking, in-house catering, etc.
Good to know: our social security contributions are lower than in the private sector
Other information:
Application deadline: August 30, 2026
Job type: 12-month fixed-term contract
Our recruitment is based on skills, without distinction of origin, age, gender identity, or sexual orientation, and all our positions are open to individuals with disabilities.