Description
About the Position:
We are looking for a Data Engineer to help us expand our data pipeline and architecture. You will join our data group of data analysts, data scientists, and data engineers, and work closely with our R&D team to deploy solutions to multiple production environments. You will play a key role in building the data pipeline and ETL processes, designing and building new data-driven services, and deploying machine learning models to production.
Main Responsibilities
- Take full ownership of the company’s data pipeline across multiple products, and build the company’s data warehouse.
- Build the infrastructure for our ETL processes, processing data from various sources, including relational and non-relational databases, data streams, and object storage.
- Deploy the solutions to multiple production environments in an automated manner.
- Work with data analysts and data scientists to ensure data quality and accessibility.
- Build internal tools for the company to automate analysis of the data and training of models.
Requirements
- Advanced programming skills, ideally in Python or a similar language.
- Advanced knowledge of SQL and experience working with a variety of relational and non-relational databases, such as PostgreSQL and Elasticsearch.
- Experience building data pipelines, ETL processes, and data warehouses, and with workflow orchestration tools such as Airflow.
- Experience with big data tools such as Spark, Kafka, and Presto.
- Experience with AWS cloud services such as RDS, EC2, EMR, and Glue.