Company culture:
Inetum operates within a predominantly collaborative culture, where people, trust and teamwork are at the core of the organization. The company promotes a supportive management approach focused on guidance, accountability and skills development. This collaborative foundation is strengthened by a performance-driven mindset, emphasizing ambition, self-improvement and customer focus. Innovation and organizational dimensions further complement this culture, encouraging initiative and agility while relying on structured processes to ensure efficiency and reliability.
Job:
As part of the Data team, you will be responsible for the design, industrialization, and optimization of data pipelines in a Big Data environment (Hadoop/HDFS, Hive, Spark). You will ensure the quality, traceability, and availability of the datasets that feed Power BI reporting and business analytics needs.
Key Responsibilities:
- Integrate data from multiple RDBMS (PostgreSQL, SQL Server, MySQL, IBM DB2) and flat files via Sqoop/ETL.
- Structure bronze/silver/gold zones and define Hive schemas.
- Develop and optimize Spark/PySpark jobs (partitioning, broadcast joins, caching, bucketing); see the PySpark sketch after this list.
- Write efficient, maintainable SQL/HiveQL transformations.
Orchestration & Production:
- Design and maintain Airflow DAGs (scheduling, retries, SLAs, alerting); a minimal DAG sketch follows this list.
- Industrialize via GitLab CI/CD, Shell scripts, and Data DevOps best practices.
- Implement data-quality checks (completeness, uniqueness, referential integrity), unit/data tests, and documentation (catalogue, data dictionaries); see the quality-check sketch below.
- Ensure traceability (lineage) and incident management (RCAs, runbooks).
- Publish "analytics-ready" datasets and optimize the Power BI data feeds (materialized views, aggregations).
- Contribute to KPI calculation and reliability.
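For illustration, here is a minimal PySpark sketch of the kind of job optimization mentioned above (broadcast join on a small dimension, caching a reused DataFrame, writing partitioned output). Table and job names such as bronze.orders are assumptions for the example, not part of the actual project.

```python
# Minimal sketch, assuming a Spark cluster with Hive support and
# hypothetical tables bronze.orders (large) and bronze.countries (small).
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("orders-silver")          # hypothetical job name
    .enableHiveSupport()
    .getOrCreate()
)

orders = spark.table("bronze.orders")        # large fact table (assumed)
countries = spark.table("bronze.countries")  # small dimension (assumed)

# Broadcast the small dimension to avoid a shuffle join,
# and cache the enriched frame because it is reused twice below.
enriched = orders.join(broadcast(countries), "country_code").cache()

daily_kpi = (
    enriched.groupBy("order_date")
    .agg(F.sum("amount").alias("total_amount"),
         F.count("*").alias("nb_orders"))
)

# Partition by date so downstream Hive/Power BI queries can prune partitions.
(
    enriched.write.mode("overwrite")
    .partitionBy("order_date")
    .format("parquet")
    .saveAsTable("silver.orders_enriched")
)
daily_kpi.write.mode("overwrite").saveAsTable("gold.daily_order_kpi")
```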
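Likewise, a minimal Airflow 2 DAG sketch showing the scheduling, retry, SLA and alerting aspects; the DAG id, schedule, e-mail address and commands are hypothetical placeholders.

```python
# Minimal sketch of an Airflow DAG with retries, an SLA and failure alerting;
# all identifiers and commands below are illustrative assumptions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-team",
    "retries": 2,                          # retry failed tasks twice
    "retry_delay": timedelta(minutes=10),
    "sla": timedelta(hours=2),             # flag an SLA miss past 2 h
    "email": ["data-alerts@example.com"],  # hypothetical alerting address
    "email_on_failure": True,
}

with DAG(
    dag_id="orders_silver_daily",          # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 3 * * *",         # daily at 03:00
    catchup=False,
    default_args=default_args,
) as dag:

    ingest = BashOperator(
        task_id="sqoop_import_orders",
        # $JDBC_URL and the table are placeholders for the real connection.
        bash_command="sqoop import --connect $JDBC_URL --table orders "
                     "--target-dir /data/bronze/orders",
    )

    transform = BashOperator(
        task_id="spark_transform_orders",
        bash_command="spark-submit jobs/orders_silver.py",  # hypothetical script
    )

    ingest >> transform
```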
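Finally, a short sketch of the kind of completeness, uniqueness and referential checks expected on published datasets; table and column names are again assumptions.

```python
# Minimal data-quality sketch against a hypothetical published table.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.table("silver.orders_enriched")   # hypothetical dataset under test

total = df.count()

# Completeness: key business columns must not contain nulls.
for col in ["order_id", "order_date", "amount"]:
    non_null = df.filter(F.col(col).isNotNull()).count()
    assert non_null == total, f"{col}: {total - non_null} null value(s) found"

# Uniqueness: the primary key must not contain duplicates.
distinct_keys = df.select("order_id").distinct().count()
assert distinct_keys == total, f"order_id: {total - distinct_keys} duplicate(s)"

# Referential integrity: every country_code must exist in the dimension table.
orphans = df.join(
    spark.table("bronze.countries"), "country_code", "left_anti"
).count()
assert orphans == 0, f"{orphans} row(s) reference an unknown country_code"
```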
Required profile:
- 2 to 4 years of experience in Data Engineering/Big Data, with demonstrated hands-on delivery in PySpark/Hive and Airflow.
- Master's degree (Big Data & AI, Data Engineering, or equivalent).
- Proficiency with RDBMS (PostgreSQL, SQL Server, MySQL, IBM DB2) and query optimization.
- Familiarity with Linux environments and Shell scripting.
- Ability to document, test, and monitor production pipelines.
Technical Stack:
- Big Data Processing: Spark / PySpark, Hive, HDFS (MapReduce/Impala a plus).
- Languages & Data: Python, advanced SQL, Shell (bash).
- Orchestration: Apache Airflow.
- Dataviz/BI: Power BI (dashboards, datasets).
- OS & Tools: Linux (Ubuntu/CentOS), Git/GitLab, CI/CD.
- Bonus: pandas/NumPy for prototyping, MongoDB/HBase knowledge.
Behavioral Skills:
- Rigor and attention to quality (tests, code reviews, documentation).
- Team spirit and clear communication with business and BI teams.
- Autonomy in incident investigation and proactivity in continuous improvement.
- Results-oriented: adherence to SLAs and performance culture.