Senior Lead Data Engineer - R01566889
Responsibilities
• Design, develop, and maintain ETL/ELT pipelines using PySpark and Python in Databricks (including notebooks, jobs, Delta Lake tables, and Unity Catalog for governance).
• Implement medallion architecture (bronze/silver/gold layers) and optimize Spark jobs for performance, cost, scalability, and reliability (handling partitioning, skew, caching, adaptive query execution, etc.).
• Write efficient SQL queries for data transformation, validation, and analytics within Databricks.
• Provision and manage cloud infrastructure using Terraform (IaC) for Databricks workspaces, clusters, jobs, storage (ADLS/S3), networking, IAM roles/permissions, and related resources on Azure and/or AWS.
• Implement and maintain CI/CD pipelines using Jenkins and GitHub (Actions/Repositories), with branching strategies, for automated testing and deployment of notebooks, jobs, Delta Live Tables, and Terraform configurations.
• Integrate data from diverse sources (databases, APIs, streaming, files) into cloud storage and processing layers.
• Ensure data quality, lineage, security, and compliance (including Delta Lake ACID transactions, schema evolution, time travel, and access controls).
• Monitor pipeline performance, troubleshoot failures, and implement alerting/observability (using Databricks tools, cloud monitoring services, or third-party solutions).
• Optimize cloud costs through auto-scaling clusters, spot instances, job scheduling, and efficient resource usage.
• Collaborate in agile teams, participate in code reviews, and contribute to best practices for data engineering.
Requirements
• Strong proficiency in Python and PySpark for distributed data processing and ETL.
• Advanced SQL skills with experience in complex querying, window functions, and optimization.
• Hands-on experience with Databricks (clusters, notebooks, Delta Lake, Unity Catalog, Delta Live Tables, workflows/jobs).
• Proficiency in Terraform for infrastructure provisioning and management (Databricks resources, cloud storage, IAM, networking).
• Experience with GitHub for version control and collaboration (branching, pull requests, code reviews).
• Solid knowledge of CI/CD practices and tools, particularly Jenkins (pipelines, plugins for Databricks/GitHub/Terraform).
• Working experience with Azure (Data Lake, Data Factory, Synapse, Key Vault, etc.) and/or AWS (S3, Glue, EMR, IAM, Lambda, etc.).
• Understanding of big data concepts, data modeling (star/snowflake, dimensional), and lakehouse principles.
• Familiarity with performance tuning in Spark/Databricks environments.
Know more about DAE: https://www.brillio.com/services-data-analytics/
Know what it’s like to work and grow at Brillio: https://www.brillio.com/join-us/
Brillio is an equal opportunity employer to all, regardless of age, ancestry, colour, disability (mental and physical), exercising the right to family care and medical leave, gender, gender expression, gender identity, genetic information, marital status, medical condition, military or veteran status, national origin, political affiliation, race, religious creed, sex (includes pregnancy, childbirth, breastfeeding, and related medical conditions), and sexual orientation.
Listing Details
- Posted: June 23, 2026
- First seen: June 26, 2026
- Last seen: June 26, 2026