Data Engineer Apprentice

Toulouse  - Alternance (12 Months)


About Sigfox

Our vision: bringing objects to life! In the future, billions of objects around the world will be connected to the Internet; their data will be stored in the Cloud and will participate in the digitization of our environment. A simple, low-cost, low-power, global connectivity solution is fundamental.

Each Sigfoxer is driven by the company's project to revolutionize the world!

Sigfox is the place for personal and collective challenge. Every day, nearly 44 different nationalities come together here: diversity is one of our strengths.

Job description

We started data-centric activities at Sigfox five years ago to transform Sigfox into a data-driven company. Among these activities, the current Data Management Service (DMS) team is responsible for developing the big data platform and analytics, both for ourselves and for our customers.

The team currently consists of six people, and each member, decision and action matters. As an apprentice, you will be fully integrated and considered a full team member.


You will have one principal and one secondary activity during your apprenticeship:


  • You will fully own one subject as your main objective. A data-driven company must have absolute trust in its data, and we have defined the business requirements for a Data Quality (DQ) functionality. Your objective, under the supervision of your mentor and with the help of your colleagues, will be to define and implement the MVP solution: find open-source components to speed up implementation, design the data quality datamart model and pipelines, and implement a first round of quality probes and the data flow feeding the DQ datamart. This will allow us to control data quality inside our development projects (accelerating validation) and to maintain a DQ dashboard over all our production data assets.

 

  • After five years, the first generation of our data flows is already legacy, as things move fast. We need to migrate them to a more mature, industrial stack based on Spark, Airflow and Kafka. As your secondary mission, you will work with other data engineers to enhance and migrate parts of these legacy data flows to the new technical stack and environment. This will help you understand how and where DQ is valuable.


We are looking for an apprentice at 3rd-year master level (or equivalent) in Computer Science, with a data specialization. You must have:

  • Good knowledge of SQL and relational databases, with hands-on practice.
  • Python development skills (a must); other languages are a plus (e.g. Java, Scala, Bash).
  • Proficiency in English, at least in reading and writing (our specifications and documentation are in English).

Ideally, you should have some basic understanding, knowledge or experience of:

  • Data models (e.g. Codd's relational model, star schema or fact/dimension modeling)
  • Big data paradigms, both batch and streaming

It will also help if you already have some knowledge of, or practical experience with:

  • Big data tools and frameworks (Hadoop, Spark, Hive, Flink, Kafka, NoSQL databases, …)
  • Cloud platforms (AWS, GCP, Azure)
  • Docker and/or Airflow

The technical environment is the AWS cloud, using many of its technology offerings: mainly EC2, EMR (Spark), S3, CloudWatch and Redshift; outside AWS: Tableau and Airflow.

Profile

Every Sigfoxer is driven by the ambitious project of making things come alive! The DMS team is full of passionate people. We are looking for an apprentice with a strong sense of initiative, technical curiosity and good humor.

 

  • We expect you to own your tasks and take charge.
  • We expect you to be proactive and to propose and take initiatives.
  • Personal success is team success; individualists need not apply. Conversely, we will collectively help you succeed.
  • Our technical stack is quite rich and the time is short, so you must be a fast learner.
  • Communication within the team and with others is key to success.

