
Lead DevOps Engineer (Data & AI Platform)

Remote Worldwide Hiring now

Are you ready to join a cutting-edge Digital Solutions company and help shape the future of enterprise software? Our client is advancing a world-class data and AI platform that powers decision-making across the entire mining value chain. Built on Azure and Databricks, their platform enables scalable data products, advanced analytics, and emerging AI capabilities, from digital twins to intelligent automation and natural language querying.

We are looking for a Lead DevOps Engineer to play a critical role in evolving the platform into a highly automated, AI-enabled delivery ecosystem. This role goes beyond traditional DevOps, focusing on platform engineering, AI-assisted development, and intelligent automation of software delivery and operations. You will lead the design and implementation of modern DevOps practices, integrating AI tools, copilots, and automation frameworks to significantly improve developer productivity, pipeline efficiency, and platform reliability.

As a Lead DevOps Engineer, you will refine and enhance CI/CD processes, integrate new capabilities into release pipelines, and drive automation across the software delivery lifecycle. You will also play a key role in embedding AI-driven practices into DevOps, enabling faster and more reliable delivery of data and AI products.

Essential Functions:

DevOps Strategy and Platform Evolution
- Define and implement a modern DevOps and platform engineering strategy aligned with data and AI platform goals.
- Develop roadmaps that incorporate AI-assisted development, testing, and operations.
- Drive the evolution from traditional DevOps to intelligent, self-service platform capabilities.
- Continuously evaluate emerging technologies (e.g., GenAI, LLMOps, AIOps) and incorporate them where relevant.

AI-Enabled CI/CD and Automation
- Design and optimize CI/CD pipelines using AI-assisted tools (e.g., code generation, test generation, pipeline optimization).
- Integrate AI copilots and automation agents into development and deployment workflows.
- Implement intelligent quality gates (e.g., automated code reviews, anomaly detection in pipelines).
- Enable self-healing pipelines and automated failure diagnostics where possible.

Automation and Framework Enhancement
- Build scalable automation frameworks leveraging AI, scripting, and infrastructure as code.
- Automate repetitive tasks using AI agents, prompt-based workflows, or orchestration frameworks.
- Enhance DevOps pipelines to support data products and AI/ML workloads (MLOps/LLMOps).
- Standardize reusable templates and pipeline components for platform-wide adoption.

Data & AI Platform Integration
- Analyze and optimize integrations across the Anglo American Data Platform, including:
  o Databricks (data processing, workflows, DABs)
  o Airflow (orchestration)
  o Azure services (compute, storage, identity)
  o Power BI / downstream consumption layers
- Support deployment patterns for AI/ML models, feature pipelines, and inference services.
- Enable end-to-end lifecycle management for AI applications (training → deployment → monitoring).

Governance, Security, and Reliability
- Implement governance practices across pipelines, including policy-as-code and automated compliance checks.
- Manage access control and ensure secure DevOps practices across environments.
- Introduce AIOps practices for monitoring, alerting, and incident management.
- Ensure high availability, scalability, and observability of DevOps processes.

Documentation and Developer Experience
- Create and maintain clear documentation, including AI-assisted "how-to" guides and self-service enablement.
- Improve developer experience through intelligent tooling, chat-based interfaces, and automation.
- Promote adoption of DevOps and AI capabilities across teams.

Troubleshooting and Operational Support
- Collaborate with Data Delivery and platform teams to resolve issues efficiently.
- Use AI-assisted diagnostics and root cause analysis tools to accelerate incident resolution.
- Support production environments and ensure stability of pipelines and deployments.

Standards and Best Practices
- Define and promote best practices in DevOps, platform engineering, and AI-enabled delivery.
- Coach teams on adopting modern DevOps and AI approaches.
- Drive consistency and reuse across teams and projects.

Required Skills and Qualifications:

Technical Expertise
- Strong experience with CI/CD tools (e.g., Azure DevOps, GitHub Actions).
- Expertise in infrastructure as code (Bicep, ARM, or similar).
- Proficiency in scripting (PowerShell, Python, Bash).
- Deep understanding of DevOps principles, Git workflows, and release strategies.
- Experience with Azure services and cloud-native architectures.
- Familiarity with data platforms (Databricks, ADF, Airflow, SQL, AAS, or equivalent).

AI & Modern DevOps Capabilities
- Hands-on experience or strong familiarity with:
  o AI-assisted development tools (e.g., GitHub Copilot, ChatGPT, code assistants)
  o MLOps / LLMOps concepts (model deployment, monitoring, versioning)
  o AIOps tools for monitoring and incident management
- Understanding of how AI can be applied to:
  o Code generation and testing
  o Pipeline optimization
  o Incident detection and resolution
- Experience integrating APIs or services for AI capabilities into workflows is a plus.

Platform & Systems Knowledge
- Experience with the Azure cloud platform.
- Knowledge of data and AI workload deployment patterns.
- Understanding of observability tools and practices.

Problem-Solving and Analytical Skills
- Strong ability to analyze complex systems and improve scalability and performance.
- Proven troubleshooting skills in DevOps and platform environments.

Collaboration and Communication
- Ability to work across technical and business teams.
- Strong communication skills, including documenting and explaining complex concepts.
- Experience enabling teams through tooling and best practices.

Governance and Standards
- Experience with governance frameworks, access control, and compliance.
- Ability to implement and enforce DevOps standards at scale.

