Senior AI DevOps / LLMOps

TechBiz Global
Baden-Baden, hybrid, senior, 6-10 years

Description

At TechBiz Global, we provide recruitment services to the top clients in our portfolio. We are currently seeking a Senior AI DevOps / LLMOps specialist to join one of our clients' teams. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.

Key Responsibilities

1. Automation of Build-to-Production
- Design and implement robust CI/CD pipelines tailored for AI, covering model weights, dataset versioning, and application code.
- Develop specialized workflows for PromptOps, ensuring that system prompts are version-controlled, tested for regressions, and deployed with the same rigor as traditional code.
- Automate the deployment of agentic workflows, managing the complexities of stateful AI interactions and multi-agent handoffs.

2. AI Infrastructure as Code (IaC)
- Provision and manage high-performance compute environments (GPU clusters, TPU pods) using Terraform, Pulumi, or Ansible.
- Define and enforce Policy-as-Code for AI endpoints to ensure compliance with security, cost-usage limits, and data residency requirements.
- Maintain a consistent environment across hybrid infrastructure, ensuring seamless parity between on-premises development and cloud production.

3. Safe Experimentation & Controlled Releases
- Architect Progressive Delivery strategies for AI, including canary releases, blue-green deployments, and shadowing (where new models run in parallel with production to compare outputs).
- Build "Evaluation-in-the-Loop" gates within the pipeline to automatically test for bias, hallucination, and performance degradation before a release.
- Implement A/B testing frameworks specifically designed for LLM outputs and agentic behavior.

4. Monitoring & Observability
- Establish deep observability into inference endpoints, tracking metrics like tokens-per-second, latency, and drift in model accuracy.
- Integrate feedback loops that capture production "edge cases" to feed back into the training and fine-tuning pipelines.

Must-Have Technical Skills:
- Orchestration: Advanced Kubernetes (K8s) skills, specifically with Kubeflow, Ray, or NVIDIA Triton.
- CI/CD & IaC: Expertise in GitHub Actions/GitLab CI, and Terraform or Pulumi.
- AI Tooling: Experience with Weights & Biases, MLflow, LangSmith, or Arize Phoenix.
- Hardware: Understanding of GPU virtualization, CUDA drivers, and on-premises hardware management.
- Security: Familiarity with Open Policy Agent (OPA) and secret management (Vault).

Experience:
- 10+ years in DevOps, SRE, or Cloud Engineering.
- 2+ years of hands-on experience in MLOps or LLMOps, specifically moving LLMs from notebook to production.
- Proven experience managing hybrid cloud environments (e.g., AWS/Azure + private data center).

Required skills

Engineering, bachelor's degree

Tech stack

Terraform, Kubernetes, GitHub Actions, AWS, Azure
Posted 5 days ago. Source: Arbeitnow

Glassdoor rating: 3.5/5
Industry: IT Jobs
