Posted on Tuesday, 3rd February 2026
Key Responsibilities:
Develop and maintain ETL pipelines using PySpark, SparkSQL, and Python within our Lakehouse Data Platform.
Work with cloud-based infrastructure (AWS) to build, deploy, and maintain scalable data solutions.
Ensure data quality, implement monitoring, and support CI/CD processes for data pipelines.
Collaborate closely with data analysts, ML engineers, and other data professionals to deliver reliable, well-structured datasets.
Contribute to the modernisation of our existing data stack, helping migrate from legacy solutions to a unified lakehouse architecture.
You’ll Need:
Bachelor’s degree in Computer Science, Information Systems, Engineering, Mathematics, or a related field.
1–3 years of experience in data engineering.
Strong hands-on skills in Python and Spark (PySpark) – required for daily development.
Practical experience with at least one major cloud platform (AWS, Azure, or Google Cloud) – required.
SQL skills for working with structured and semi-structured data.
Understanding of data lake and data modelling principles.
Experience with Git-based development workflows and CI/CD practices.
Fluency in English and a strong problem-solving mindset.
Nice to Have:
Experience with Terraform or another Infrastructure as Code tool.
Basic knowledge of Databricks, AWS Glue, or other distributed data processing tools.
Familiarity with monitoring, data testing frameworks, or data quality automation.
Curiosity about MLOps and data orchestration tools.
What We Offer & What You Can Expect:
Challenging tasks and real impact – you’ll be directly involved in bringing new projects to life and influencing how we grow as an e-commerce business.
Fast-paced learning environment – we’re an international team, so you’ll constantly pick up new skills and insights.
Flexible working hours and a B2B contract – choose when and how you work.
Lots of room to grow – through hands-on experience, training, and working with experts from different parts of the world.
