omario

Data Engineer

I got my PhD in data mining in 2010. My work focused on designing a statistical data analysis model called Symbolic Data Analysis (SDA), combined with relational and nested algebra. On top of the algebraic soundness and completeness of the model, a set of statistical and data mining / machine learning algorithms (k-means, hierarchical clustering, and others) were expressed as simple queries. All of this was embedded in SQL Server using C#. Since 2016, I have been working as a data engineer on big-data/Azure projects for massive data processing, data quality, and data migration.


Experience: 11 years

Yearly salary: $0

Hourly rate: $90

Nationality: 🇫🇷 France

Residency: 🇫🇷 France


Experience

Azure / Data Engineer / Developer
Saur
2024 - 2026
As a member of the TheFactory team (5 people), my role was to design, integrate, and develop medium/large-scale, cloud-native data platform solutions on Microsoft Azure for our users (15+ teams), covering both batch and real-time data pipelines, plus L3 support activities.
* Design, operate, and evolve a multi-layer Azure Data Lake architecture (Curated, Refined, Featured) on Azure Data Lake Storage Gen2.
* Operate and maintain 230+ data ingestion pipelines across multiple data domains using Azure Data Factory.
* Develop and maintain batch data pipelines using Databricks (PySpark), Delta Lake, DBT, and Databricks Asset Bundles (DAB).
* Build and operate real-time, event-driven applications using Azure Event Hubs (Kafka protocol) and Spark Structured Streaming on AKS (Kubernetes): consume events from RabbitMQ queues; enrich and transform streaming data (Azure Event Hubs/Kafka + Spark Streaming); route processed events back to downstream RabbitMQ queues; package and deploy streaming services using Helm; reproduce the environment locally with Docker/Docker Compose.
* Operate and maintain Azure SQL databases per business domain (billing, contracts, assets, telemetry, customers), managing high-volume datasets.
* Integrate heterogeneous internal and external data sources, including ERP systems, public datasets, and operational platforms.
* Export and partition Delta Lake datasets for analytics and operational consumption, ensuring robustness, auditability, and fault tolerance.
* Automate CI/CD pipelines and release workflows across dev/rec/prep/prod environments using Azure DevOps templates.
* Provision and manage Azure and Databricks infrastructure using Terraform and CDKTF (clusters, SQL Warehouses, Unity Catalog).
* Implement end-to-end observability and alerting using Grafana, Azure Monitor, Application Insights, Log Analytics, and KQL.
Technical Environment: Azure Data Factory, Azure Data Lake Storage Gen2, Databricks, Delta Lake, DBT, Python, PySpark, Terraform, CDKTF, Azure SQL, RabbitMQ, Azure Event Hubs, AKS, Azure DevOps, Log Analytics, KQL, Helm, Docker/Docker Compose, WSL/Bash, Az CLI, Databricks CLI, Jinja2 templates, Git.
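As an illustration of the enrichment step in such a streaming job, a minimal sketch with hypothetical field names and inline reference data (not the production code):

```python
import json
from datetime import datetime, timezone

# Hypothetical reference data; in the real pipeline this would come from
# a Delta Lake lookup table or an Azure SQL domain database.
CONTRACT_REGIONS = {"C-001": "sud-ouest", "C-002": "bretagne"}

def enrich_event(raw: bytes) -> bytes:
    """Enrich one event before routing it to a downstream queue.

    Takes a raw JSON payload consumed from RabbitMQ / Event Hubs, adds a
    processing timestamp and a region looked up from reference data, and
    returns the payload ready to publish downstream.
    """
    event = json.loads(raw)
    event["region"] = CONTRACT_REGIONS.get(event.get("contract_id"), "unknown")
    event["processed_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(event).encode("utf-8")

# Example: one incoming event
out = json.loads(enrich_event(b'{"contract_id": "C-001", "volume_m3": 12.5}'))
```

In the Spark Structured Streaming job, this kind of per-event function would typically be wrapped in a UDF or applied inside foreachBatch before publishing back to RabbitMQ.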
Azure / Data Engineer / Developer
KPN, Telecom, Amsterdam, Netherlands
2022 - 2024
As a member of the DPPR team (20 people), my role was to contribute to designing solutions (and providing L1/L2/L3 support) enabling our users (15+ teams, 100+ users) to set up Azure infrastructure for their data loading within a modern data organization (Data Mesh).
* Onboard (and support) on-premise big data projects moving to the cloud (Azure), throughout the whole migration process.
* Maintain and evolve the in-house framework that integrates sources and loads data into the raw, core, and mirror layers and Teradata.
* Enable new features: Azure services (AKS, HDInsight, Databricks, etc.), Oracle GoldenGate.
* Enable custom features: Airflow, specific Spark features (e.g. dynamic allocation), etc.
* Automate feature delivery and enablement (rendering code from Jinja templates and custom YAML configuration).
* Resolve security vulnerabilities.
Technical infra environment: Terraform, Azure HDInsight (4.0 → 5.x), Databricks (+ API), Docker, Azure Kubernetes Service, Virtual Machine Desktop, PostgreSQL/MySQL/SQL Server, Azure Storage (Gen2), WSL/Bash, Az CLI, Oracle GoldenGate. Technical soft/dev environment: Python, Spark (PySpark), Hive, Jinja2 templates, Livy, Airflow, Git, Teradata, Robot Framework.
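The feature-delivery automation mentioned above (rendering code from Jinja templates driven by YAML configuration) can be sketched as follows; the template content and configuration keys are hypothetical, and the configuration is shown as an inline dict standing in for the YAML file:

```python
from jinja2 import Template  # third-party: pip install Jinja2

# Stand-in for a custom YAML configuration file (keys are hypothetical).
feature_config = {
    "min_executors": 2,
    "max_executors": 20,
}

# Stand-in for a spark-defaults template shipped with the framework.
template = Template(
    "spark.dynamicAllocation.enabled true\n"
    "spark.dynamicAllocation.minExecutors {{ min_executors }}\n"
    "spark.dynamicAllocation.maxExecutors {{ max_executors }}\n"
)

# Render the configuration file that gets delivered to the target project.
rendered = template.render(**feature_config)
```

The same pattern generalizes: one template per deliverable artifact, one YAML file per team or environment, and the framework renders and ships the result.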
Azure / Data Engineer / Developer
Societe Generale, Corporate & Investment Banking, Paris, France
2019 - 2022
As a member of the ARC/LAB team (15 people), my role was to onboard on-premise big data projects onto the Azure cloud platform and assist them while migrating from on-premise to the cloud. I was then enrolled as a data engineer, in charge of the Azure stream, in a new team (DDS/DAT/DMS), to provide a Data Quality tool for big data projects. Tasks and missions:
Big Data Trainer: several training formats were organized for different audiences and purposes.
* Internal/corporate training: produce and present resources (Java, Scala, Spark, Hadoop, HBase) for data project teams (56 attendees, 7 sessions) to level up their skills in big data projects (5-day training), on-site.
* Public training: 2 different formats, big data fundamentals (1-day session) and advanced (2-day session), remotely.
Data Project Migration & Onboarding: onboard data projects migrating to the Azure platform, against the established data governance (Data Mesh) and other technical constraints (security, scalability, etc.).
Data Quality (DQ): enhance the existing app/framework for DQ on the data lake, for each data project.
* Management of the Azure stream: ensure coherence of the Azure target solution with the on-premise solution; onboard teams to set up a cloud-native solution (or migrate from on-premise) that includes the in-house DQ solution in their data project; evangelization and technical workshops.
* Other daily technical tasks: set up and maintain Airflow workflows (DAGs, Python, Livy); monitor DQ metrics, benchmark performance with ELK, build dashboards with Kibana; enhance (features), maintain (bugs and issues), and integrate (against data governance constraints) the DQ solution for Azure and on-premise (Ansible, Java/Scala, microservices on Kubernetes, Data Mesh, HDI); support/release.
Technical infra environment: Azure Service Endpoint Policy, Azure App Certificate, Azure HDInsight (3.6 → 4.0), Azure Kubernetes Service (ingress, deployments, services, secrets, PVCs), Cosmos DB, Azure SQL Server, PostgreSQL, Azure Data Lake Store (Gen2), Queue, Table, Virtual Machine Desktop, Elasticsearch, Kibana, Ansible. Technical soft/dev environment: Java, Scala, Spark, Spring Boot, Livy, Zeppelin, Airflow (DAGs in Python), Swagger, Git, PowerShell/Bash.
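The Airflow/Livy workflows mentioned above boil down to submitting Spark batches through Livy's REST API. A minimal sketch of building and posting such a batch, using only the standard library (the endpoint, jar path, and class name are hypothetical):

```python
import json
from urllib import request

LIVY_URL = "http://livy.example.internal:8998"  # hypothetical endpoint

def build_batch_payload(jar: str, klass: str, args: list) -> dict:
    """Build the JSON body for Livy's POST /batches endpoint."""
    return {
        "file": jar,
        "className": klass,
        "args": args,
        "conf": {"spark.dynamicAllocation.enabled": "true"},
    }

payload = build_batch_payload(
    "abfss://jobs@lake.dfs.core.windows.net/dq-checks.jar",  # hypothetical path
    "com.example.dq.RunChecks",                               # hypothetical class
    ["--date", "2021-06-01"],
)

def submit(payload: dict) -> int:
    """POST the batch to Livy and return the created batch id."""
    req = request.Request(
        f"{LIVY_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["id"]

# submit(payload)  # requires a reachable Livy server
```

An Airflow DAG then wraps this submission in an operator and polls GET /batches/{id}/state until the job completes.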
Data/Infra Engineer
Total, Energy, Nanterre, France
2019 - 2019
My role was to provide technical support for data projects: platform migration, new data project setup, and data governance.
Azure Developer
Colas/SPEIG, Civil Engineering, Velizy, France
2018 - 2019
My main role was to contribute, as a team member, to building a software suite from scratch based on Microsoft (ASP.NET Core MVC) and cloud technologies (Web Apps, Functions, Azure SQL, Service Bus, etc.), and to deliver business features (front and back end) in an agile (Scrum) setting.
Data Developer/Analyst
Mappy.com, Massive web-traffic app, Paris, France
2017 - 2018
My role was to develop data pipelines, charts, and dashboards for requested data insights (server performance monitoring, API usage, user behavior, etc.) for all internal teams (infra, dev, marketing, etc.) and external stakeholders (clients, top management, etc.).
Data/Machine Learning Developer
SeLoger.com, Massive web-traffic app, Paris, France
2016 - 2017
My role was to implement, fully or partially, data science processes using many different data sources for the marketing department.
.NET Developer
GDF SUEZ Trading, Commodities trading, La Défense, France
2015 - 2016
Developed (within the BEE entity) a new application, Omega, for gas asset management (sale/purchase, distribution, optimization, production, etc.).
.NET Developer
Mutuelle des Architectes Français, Insurance, Paris, France
2013 - 2015
Involved as a .NET developer in two .NET projects (intranet and extranet apps) for selling insurance contracts.
.NET Developer
Aeroports de Paris, Air transport, Orly, France
2012 - 2013
Implemented an airport resources manager.
.NET Software Architect
Banque de France, Banking, Paris, France
2012 - 2012
Developed, maintained, and packaged a shared technical framework for building rich client applications.
R&D Developer
Artza Technologies, Software editor, Paris, France
2012 - 2013
In charge of recruiting and validating resources to achieve development goals, and of assigning and scheduling user stories to iterations (Scrum).
.Net Developer
SGCIB, Investment banking, La Défense, France
2010 - 2012
Agile software development for Front-Office Trading tools.
Ph.D. candidate
CEREMADE & LAMSADE labs, R&D Labs, Paris, France
2006 - 2010
Scaled up an existing data analysis model called Symbolic Data Analysis (SDA).

Skills

agile
big-data
c-sharp
docker
hadoop
java
kubernetes
math
python
remote
trader
analyst
english
arabic
french
spanish