Senior AI DevOps / LLMOps
TechBiz Global · Baden-Baden · hybrid · senior · 6-10 years
Description
At TechBiz Global, we provide recruitment services to the top clients in our portfolio. We are currently seeking a Senior AI DevOps / LLMOps specialist to join one of our clients' teams. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.

Key Responsibilities

1. Automation of Build-to-Production
- Design and implement robust CI/CD pipelines tailored for AI, covering model weights, dataset versioning, and application code.
- Develop specialized workflows for PromptOps, ensuring that system prompts are version-controlled, tested for regressions, and deployed with the same rigor as traditional code.
- Automate the deployment of agentic workflows, managing the complexities of stateful AI interactions and multi-agent handoffs.

2. AI Infrastructure as Code (IaC)
- Provision and manage high-performance compute environments (GPU clusters, TPU pods) using Terraform, Pulumi, or Ansible.
- Define and enforce Policy-as-Code for AI endpoints to ensure compliance with security, cost-usage limits, and data residency requirements.
- Maintain a consistent environment across hybrid infrastructure, ensuring seamless parity between on-premises development and cloud production.

3. Safe Experimentation & Controlled Releases
- Architect progressive delivery strategies for AI, including canary releases, blue-green deployments, and shadowing (where new models run in parallel with production to compare outputs).
- Build “Evaluation-in-the-Loop” gates within the pipeline to automatically test for bias, hallucination, and performance degradation before a release.
- Implement A/B testing frameworks specifically designed for LLM outputs and agentic behavior.

4. Monitoring & Observability
- Establish deep observability into inference endpoints, tracking metrics like tokens-per-second, latency, and drift in model accuracy.
- Integrate feedback loops that capture production “edge cases” to feed back into the training and fine-tuning pipelines.

Must-Have Technical Skills:
- Orchestration: Advanced Kubernetes (K8s) skills, specifically with Kubeflow, Ray, or NVIDIA Triton.
- CI/CD & IaC: Expertise in GitHub Actions/GitLab CI, and Terraform or Pulumi.
- AI Tooling: Experience with Weights & Biases, MLflow, LangSmith, or Arize Phoenix.
- Hardware: Understanding of GPU virtualization, CUDA drivers, and on-premises hardware management.
- Security: Familiarity with Open Policy Agent (OPA) and secret management (Vault).

Experience:
- 10+ years in DevOps, SRE, or Cloud Engineering.
- 2+ years of hands-on experience in MLOps or LLMOps, specifically moving LLMs from notebook to production.
- Proven experience managing hybrid cloud environments (e.g., AWS/Azure + private data center).
Required skills
Engineering · bachelor's degree
Tech stack
Terraform · Kubernetes · GitHub Actions · AWS · Azure
Glassdoor rating: 3.5/5
Industry: IT Jobs