saltsquare
~14d ago

AI/ML Architect/Lead

Bosnia and Herzegovina · Tuzla/Sarajevo · Lead
Architect · Construction & Real Estate

Quick Summary

Overview

Salt Square is a growing outsourcing company providing high-quality software development services to clients across a wide range of industries. Our team is composed of skilled and dedicated professionals delivering innovative solutions that meet and exceed client expectations.

Technical Tools
airflow, aws, dbt, github-actions, huggingface, jenkins, jira, langchain, python, pytorch, terraform, ab-testing, ci-cd, data-analysis, etl, machine-learning, mentoring, networking, project-management

Our team is seeking an AI/ML Architect/Lead to own the architecture, infrastructure, and delivery of an end-to-end AI/ML ecosystem. The ideal candidate is a technically deep, strategically minded engineer who has built and scaled production-grade AI/ML systems on AWS, and who can design for long-term success while guiding implementation and delivery. This role sits at the intersection of platform engineering, data science, and applied AI, requiring hands-on expertise across the full ML lifecycle from model development through production observability.

Responsibilities

Reporting directly to the VP, Data Science & Analytics, the AI/ML Architect/Lead will serve as the technical authority for the client’s AI/ML platform. You will design and own the infrastructure that powers our machine learning and NLP capabilities, establish MLOps standards and practices, lead the deployment and monitoring of production AI/ML systems, and provide architectural guidance and technical mentorship to the AI/ML engineering team.

The following responsibilities are considered essential functions of this position and are not intended to be an exhaustive list of all duties.

  • AI/ML Infrastructure Architecture (AWS)

  • Architect, deploy, and manage scalable AWS infrastructure for AI/ML workloads including EC2, Lambda, ECS/EKS, S3, SageMaker, and related services

  • Design and maintain VPC networking, security group configurations, and IAM roles and policies governing all AI/ML platform components in partnership with infrastructure

  • Define and enforce infrastructure-as-code standards and CI/CD practices for AI/ML platform components

  • Monitor infrastructure health, optimize compute performance, and drive cost efficiency across model training and inference workloads

  • Ensure high availability, fault tolerance, and disaster recovery posture for all production AI/ML systems

  • Manage multi-account AWS environments with IAM roles, environment-specific security boundaries (dev, test, production), and secure access patterns

  • Architect and own the end-to-end ML lifecycle: experimentation, training, evaluation, deployment, monitoring, and retraining pipelines

  • Establish and enforce MLOps best practices including model versioning, experiment tracking, reproducibility standards, and deployment automation

  • Design and implement model serving infrastructure for both real-time inference and batch scoring, optimizing for latency, throughput, and cost

  • Build pipeline orchestration frameworks for ML workflows using tools such as Airflow, Step Functions, or equivalent

  • Implement model monitoring and observability frameworks to detect data drift, model degradation, and production anomalies

  • Own CI/CD pipelines for model and infrastructure promotion across development, testing, and production environments

  • Architect and manage Qdrant (or equivalent vector database) deployments for semantic search, similarity retrieval, and RAG (Retrieval-Augmented Generation) applications

  • Design embedding pipelines that transform patient-generated text into vector representations for downstream AI/ML applications

  • Optimize vector index configurations for query performance, recall, and storage efficiency at scale

  • Integrate vector retrieval layers with LLM-based applications and NLP pipelines

  • Partner with the Senior Data Engineers to ensure seamless integration between the Redshift data warehouse and AI/ML feature pipelines

  • Design and build feature stores and feature engineering pipelines that source structured and unstructured data for model training

  • Establish data contracts and quality standards between data engineering and AI/ML platform layers

  • Build ELT/ETL patterns tailored to AI/ML workloads including incremental feature computation, backfill strategies, and schema evolution handling

  • Own AI/ML platform security including IAM policies, encryption at rest and in transit, network access controls, and secure model artifact storage

  • Ensure compliance with HIPAA, SOC 2, and applicable healthcare data privacy regulations as they apply to AI/ML systems and model outputs

  • Design PII anonymization and de-identification pipelines to provision safe, production-representative training data to development and test environments

  • Implement model governance standards including audit logging, lineage tracking, and access controls over model artifacts and inference endpoints

  • Serve as the technical authority and escalation point for the AI/ML engineering team, providing architectural guidance and hands-on support

  • Collaborate with Data Analytics, Data Engineering, Product, and Software Engineering teams to align AI/ML platform capabilities with roadmap priorities

  • Contribute to architecture reviews, technology evaluations, and build-vs-buy decisions across the AI/ML tooling landscape

  • Build and maintain architecture documentation, runbooks, and operational playbooks for all production AI/ML systems

  • Mentor AI/ML engineers on platform best practices, code quality standards, and production engineering principles

Qualifications

  • Cloud Infrastructure: 5+ years of hands-on experience architecting and managing AWS data and AI/ML infrastructure (EC2, Lambda, ECS/EKS, SageMaker, S3, IAM, VPC, CloudWatch)

  • MLOps & ML Lifecycle: 5+ years of experience building and operating production ML systems, including training pipelines, model serving, CI/CD automation, and monitoring

  • Model Deployment: Deep experience with model serving patterns (real-time inference, batch scoring, A/B testing, canary deployments) and frameworks such as SageMaker Endpoints, TorchServe, or equivalent

  • Vector Databases: Hands-on experience with Qdrant or equivalent vector databases (Pinecone, Weaviate, pgvector), including index design, embedding pipelines, and RAG architecture

  • Python: Strong Python development skills for ML pipeline engineering, including frameworks such as PyTorch, HuggingFace Transformers, LangChain, and AWS SDK (boto3)

  • Data Engineering: Experience integrating AI/ML platforms with cloud data warehouses (Redshift preferred) and building feature pipelines using tools such as dbt, Airflow, or Spark

  • Infrastructure as Code: Experience with Terraform, CloudFormation, or equivalent IaC tools for reproducible infrastructure provisioning

  • Source Control & CI/CD: Proficiency with Git including branching strategies, pull request workflows, and integration with CI/CD platforms (Jenkins, GitHub Actions, or equivalent)

  • Strong systems thinking with the ability to design AI/ML platforms for scalability, reliability, and long-term maintainability

  • Demonstrated experience with ML experiment tracking, model versioning, and reproducibility frameworks (MLflow, Weights & Biases, or equivalent)

  • Experience with NLP and large language model (LLM) applications, including fine-tuning, prompt engineering, and RAG patterns

  • Ability to translate business and product requirements into architectural decisions and phased implementation plans

  • Experience with healthcare data, HIPAA regulations, and patient data privacy requirements

  • Experience working with a US-based product team

  • Excellent verbal and written communication skills, with the ability to translate complex technical concepts for both technical and non-technical stakeholders

  • Proven ability to work cross-functionally and drive technical decisions collaboratively

  • Experience with project management and issue tracking software such as Jira

What We Offer

Competitive salary and benefits package.
23 days of paid leave.
Opportunities for professional growth and development.
A collaborative work environment with talented and dedicated colleagues.

Location & Eligibility

Where is the job
Tuzla/Sarajevo, Bosnia and Herzegovina
On-site at the office
Who can apply
BA

Listing Details

First seen
May 6, 2026
Last seen
May 21, 2026

Posting Health

Days active
14
Repost count
0
Trust Level
19%
Scored at
May 21, 2026
