Data Engineer | Casablanca (Morocco)

Company culture:

Inetum operates within a predominantly collaborative culture, where people, trust, and teamwork are at the core of the organization. The company promotes a supportive management approach focused on guidance, accountability, and skills development. This collaborative foundation is reinforced by a performance-driven mindset that emphasizes ambition, self-improvement, and customer focus. A focus on innovation and organization rounds out this culture, encouraging initiative and agility while relying on structured processes to ensure efficiency and reliability.

Job:

As part of the Data team, you will design, industrialize, and optimize data pipelines in a Big Data environment (Hadoop/HDFS, Hive, Spark). You will ensure the quality, traceability, and availability of the datasets that feed BI (Power BI) and business analytics needs.

Key Responsibilities:

  • Ingestion & Modeling

Integrate data from multiple RDBMS (PostgreSQL, SQL Server, MySQL, IBM DB2) and files via Sqoop/ETL.
Structure the bronze/silver/gold zones and define their Hive schemas.
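
For illustration, the sketch below pulls one source table from PostgreSQL into a bronze Hive zone. A PySpark JDBC read stands in for the shell-driven Sqoop import so that the examples here stay in Python; the host, tables, credentials, and zone names are all hypothetical.

```python
# Hypothetical sketch of a bronze-zone ingestion; database, table, and
# credential values are placeholders, and the PostgreSQL JDBC driver must
# be on the Spark classpath.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("bronze_ingestion")
    .enableHiveSupport()
    .getOrCreate()
)

# Read one source table from PostgreSQL (Sqoop would do this from the shell).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")  # hypothetical host
    .option("dbtable", "public.orders")                     # hypothetical table
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Land the raw rows in the bronze zone, partitioned by ingestion date
# (assumes a Hive database named `bronze` already exists).
(
    orders.withColumn("ingestion_date", F.current_date())
    .write.mode("append")
    .partitionBy("ingestion_date")
    .saveAsTable("bronze.orders_raw")
)
```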

  • Distributed Processing

Develop and optimize Spark/PySpark jobs (partitioning, broadcast joins, caching, bucketing).
Write efficient and maintainable SQL/HiveQL transformations.
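
The tuning levers named above map to concrete PySpark calls. A hedged sketch, with hypothetical table and column names:

```python
# Hypothetical silver-layer job showing broadcast joins, caching,
# repartitioning, and bucketing; all table/column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("silver_transform")
    .enableHiveSupport()
    .getOrCreate()
)

orders = spark.table("bronze.orders_raw")
countries = spark.table("bronze.countries")  # small dimension table

# Broadcast the small side of the join to avoid shuffling the large table.
enriched = orders.join(F.broadcast(countries), "country_code")

# Cache the intermediate result when several outputs are derived from it.
enriched = enriched.cache()

# Repartition on the write key to balance partitions, then bucket by
# customer_id so that later joins on customer_id can skip the shuffle.
(
    enriched.repartition("order_date")
    .write.mode("overwrite")
    .bucketBy(64, "customer_id")
    .sortBy("customer_id")
    .saveAsTable("silver.orders_enriched")
)
```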

  • Orchestration & Production

Design and maintain Airflow DAGs (scheduling, retries, SLAs, alerting).
Industrialize delivery via GitLab CI/CD, Shell scripts, and data DevOps best practices.
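
A minimal sketch of such a DAG, showing where scheduling, retries, SLAs, and alerting plug in (Airflow 2.4+ `schedule` syntax is assumed; the DAG id, cron expression, job paths, and alert address are hypothetical):

```python
# Minimal Airflow DAG sketch; all names and paths are placeholders, and
# email alerting assumes SMTP is configured in the Airflow deployment.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 2,                          # re-run a failed task twice
    "retry_delay": timedelta(minutes=10),
    "email": ["data-team@example.com"],    # hypothetical alert address
    "email_on_failure": True,
    "sla": timedelta(hours=2),             # flag tasks running past 2 hours
}

with DAG(
    dag_id="orders_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="0 3 * * *",                  # daily at 03:00
    catchup=False,
    default_args=default_args,
) as dag:
    ingest = BashOperator(
        task_id="ingest_bronze",
        bash_command="spark-submit /opt/jobs/bronze_ingestion.py",  # hypothetical path
    )
    build_silver = BashOperator(
        task_id="build_silver",
        bash_command="spark-submit /opt/jobs/silver_transform.py",  # hypothetical path
    )
    ingest >> build_silver
```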

  • Quality & Governance

Implement data checks (completeness, uniqueness, referential integrity), unit and data tests, and documentation (data catalogue, dictionaries).
Ensure traceability (lineage) and incident management (RCAs, runbooks).
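
Checks of this kind can be expressed as plain PySpark assertions run after each load (dedicated quality tools would work equally well). A sketch with hypothetical tables, columns, and thresholds:

```python
# Hedged sketch of post-load quality gates; the tables, key columns, and
# the 1% completeness threshold are all hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("dq_checks")
    .enableHiveSupport()
    .getOrCreate()
)

df = spark.table("silver.orders_enriched")
total = df.count()

# Completeness: at most 1% of rows may lack a customer_id.
missing = df.filter(F.col("customer_id").isNull()).count()
assert missing <= 0.01 * total, f"completeness failed: {missing} null customer_id"

# Uniqueness: order_id must be a key of the table.
distinct = df.select("order_id").distinct().count()
assert distinct == total, f"uniqueness failed: {total - distinct} duplicates"

# Referential check: every country_code must exist in the dimension table.
dim = spark.table("silver.countries")
orphans = df.join(dim, "country_code", "left_anti").count()
assert orphans == 0, f"referential check failed: {orphans} unknown country_code"
```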

  • Value & BI

Publish "analytics-ready" datasets and optimize how data is fed to Power BI (materialized views, aggregations).
Contribute to KPI computation and reliability.
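
For example, publishing a pre-aggregated "gold" table keeps the Power BI model lean. A sketch with hypothetical KPIs:

```python
# Sketch of an "analytics-ready" gold table that a Power BI dataset can
# point at directly; the KPI definitions and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("gold_kpis")
    .enableHiveSupport()
    .getOrCreate()
)

# Pre-aggregate once in Spark so the Power BI model stays small and
# refreshes quickly, instead of scanning detail rows at report time.
daily_kpis = (
    spark.table("silver.orders_enriched")
    .groupBy("order_date", "country_code")
    .agg(
        F.sum("amount").alias("revenue"),
        F.countDistinct("customer_id").alias("active_customers"),
    )
)

daily_kpis.write.mode("overwrite").saveAsTable("gold.daily_sales_kpis")
```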

Required profile:

  • 2 to 4 years of experience in Data Engineering/Big Data, with demonstrable delivery in PySpark/Hive and Airflow.

  • Master's degree (Big Data & AI, Data Engineering, or equivalent).

  • Proficiency with RDBMS (PostgreSQL, SQL Server, MySQL, IBM DB2) and query optimization.

  • Familiarity with Linux environments and Shell scripting.

  • Ability to document, test, and monitor production pipelines.

Technical Stack:

  • Big Data Processing: Spark/PySpark, Hive, HDFS (MapReduce/Impala a plus).

  • Languages & Data: Python, advanced SQL, Shell (bash).

  • Orchestration: Apache Airflow.

  • Dataviz/BI: Power BI (dashboards, datasets).

  • OS & Tools: Linux (Ubuntu/CentOS), Git/GitLab, CI/CD.

  • Bonus: Pandas/NumPy for prototyping; knowledge of MongoDB/HBase.

Behavioral Skills:

  • Rigor and attention to quality (tests, code reviews, documentation).

  • Team spirit and clear communication with business and BI teams.

  • Autonomy in incident investigation and proactivity in continuous improvement.

  • Results-oriented: adherence to SLAs and performance culture. 

Post date: Today
Publisher: Bayt