Posted on Tuesday, 6th January 2026
We are recruiting on behalf of a leading data consultancy specializing in logistics solutions for Germany’s Mittelstand firms. You will help build trustworthy AI pipelines powering predictive maintenance, supply chain optimization, and GenAI applications that comply with EU AI Act requirements.
Design and productionise Retrieval-Augmented Generation (RAG) pipelines that reduce LLM hallucinations by grounding model responses in enterprise data. Implement vector databases and governed data products using Unity Catalog, Delta Lake, and observability tooling for scalable, audit-ready lakehouses on Azure Databricks.
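To give candidates a flavor of the retrieval step at the heart of such a pipeline, here is a minimal sketch in plain Python. The chunking, bag-of-words embedding, and helper names are illustrative stand-ins: a production system would use a real embedding model (e.g. from Hugging Face) and a managed vector store rather than these toy functions.

```python
import math
from collections import Counter

def chunk(text, size=40):
    """Split a document into overlapping word windows (toy chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size // 2)]

def embed(text):
    """Toy embedding: a bag-of-words count vector, standing in for a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the top-k chunks most similar to the query - the grounding context."""
    q = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:k]
```

The retrieved chunks are then placed into the LLM prompt so answers are grounded in enterprise data rather than model memory.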
Proven experience building end-to-end RAG systems: document chunking, embedding generation (e.g. Hugging Face models), hybrid search, and orchestration with LangChain or LlamaIndex.
Expertise in vector databases (Pinecone, Weaviate, or Databricks Vector Search) integrated with PySpark/SQL ETL pipelines.
Strong governance skills: Unity Catalog volumes, data lineage, quality gates (Great Expectations), and MLflow for model/data product deployment.
3+ years in data engineering, with Databricks Certified Data Engineer Associate/Professional.
Proficiency in Python, Delta Live Tables, streaming (Kafka), and cloud (Azure/AWS).
Experience in regulated German sectors (manufacturing, energy, finance) is a plus.
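As a flavor of the governance skills above, a quality gate can be expressed as named row-level expectations that block bad records before they reach a governed table. The sketch below is hand-rolled plain Python with invented field names (sensor_id, temperature); in the role itself this would be done with Great Expectations or Delta Live Tables expectations.

```python
def expect(name, predicate):
    """Pair a human-readable expectation name with its row-level check."""
    return (name, predicate)

# Illustrative expectations for a hypothetical sensor-readings table.
EXPECTATIONS = [
    expect("sensor_id is present", lambda r: bool(r.get("sensor_id"))),
    expect("temperature in plausible range",
           lambda r: -50 <= r.get("temperature", float("inf")) <= 150),
]

def quality_gate(rows):
    """Split rows into passed and failed; failures carry the violated expectation names."""
    passed, failed = [], []
    for row in rows:
        violations = [name for name, check in EXPECTATIONS if not check(row)]
        if violations:
            failed.append((row, violations))
        else:
            passed.append(row)
    return passed, failed
```

Failed rows, with the names of the expectations they violated, would typically land in a quarantine table for audit, which is what makes the lakehouse "audit-ready".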
Munich (hybrid). €100 per hour contract rate.
