
☁️ Microsoft Fabric Data Engineering Pipeline
A real-time data pipeline connecting Kibo Commerce to Microsoft Fabric for unified analytics
Azure Functions (.NET 8) · Azure Data Factory · Microsoft Fabric · PySpark · Delta Lake · Power BI · OneLake

Overview
At London Drugs, I designed and implemented a data engineering pipeline that connects Kibo Commerce webhooks and APIs to Microsoft Fabric (OneLake) for unified analytics. The goal was to modernize how e-commerce data — including orders, payments, shipments, and returns — flows into our analytics environment.
Previously, data was siloed, manually exported, and hard to keep consistent across systems. My project automated the ingestion, transformation, and storage of this data in real time, laying the foundation for the company's future data platform.
Tech Stack
The solution is built on Azure Functions (.NET 8 isolated worker) to process incoming webhook events, Azure Data Factory (ADF) pipelines for orchestration, and Microsoft Fabric for data storage and analytics.
I used PySpark notebooks inside Fabric for transformation and deduplication, Delta Lake for schema evolution and partitioning, and Power BI for reporting. Data is stored in OneLake and organized by domain (Orders, Shipments, Returns, Payments).
Challenges
The biggest challenge was handling schema evolution and large payloads. Kibo's APIs occasionally changed structure, causing ingestion failures. To address this, I implemented dynamic deserialization using Newtonsoft.Json and designed a flexible schema auto-merge pattern in PySpark.
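The core idea behind the auto-merge pattern can be illustrated in plain Python (the production version uses Newtonsoft.Json on the ingestion side and Delta Lake's schema merging in PySpark; the field names below are made up for illustration):

```python
import json

def infer_schema(record: dict) -> dict:
    """Map each top-level field to its Python type name."""
    return {field: type(value).__name__ for field, value in record.items()}

def merge_schemas(current: dict, incoming: dict) -> dict:
    """Union of known fields: new fields are added, existing ones kept as-is."""
    merged = dict(current)
    for field, dtype in incoming.items():
        merged.setdefault(field, dtype)
    return merged

# An upstream payload gains a new field without warning:
v1 = json.loads('{"orderId": "A1", "total": 99.5}')
v2 = json.loads('{"orderId": "A2", "total": 12.0, "currency": "CAD"}')

schema = infer_schema(v1)
schema = merge_schemas(schema, infer_schema(v2))
# schema now covers every field seen so far, so older readers keep working
```

Because the merge only ever adds fields, previously written records stay readable when the API evolves, which is the same guarantee Delta Lake's `mergeSchema` write option provides at the table level.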
Another issue was token refresh and API pagination: keeping synchronization continuous without losing data. Building robust retry and checkpoint logic was key to maintaining reliability.
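A minimal sketch of the retry-and-paginate loop, assuming a caller-supplied `fetch_page(offset)` that returns `(records, next_offset)` with `next_offset=None` on the last page (the function name and pagination shape are illustrative, not Kibo's actual API):

```python
import time

def fetch_all(fetch_page, start_offset=0, max_retries=3):
    """Walk a paginated API, retrying transient failures with backoff.

    The offset only advances after a page succeeds, so persisting it
    between iterations gives a natural resume checkpoint after a crash.
    """
    records, offset = [], start_offset
    while offset is not None:
        for attempt in range(max_retries):
            try:
                page, next_offset = fetch_page(offset)
                break
            except ConnectionError:
                if attempt == max_retries - 1:
                    raise  # give up after the final attempt
                time.sleep(2 ** attempt)  # exponential backoff
        records.extend(page)
        offset = next_offset  # checkpoint advances only on success
    return records
```

In the real pipeline the checkpoint would be persisted (e.g. to storage) rather than held in memory, so a restarted run resumes from the last successful page instead of re-pulling everything.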
Solution
The architecture uses an event-driven pattern: each webhook triggers an Azure Function that normalizes and stores the event into OneLake in Delta format. ADF pipelines handle periodic syncs and archival, while PySpark notebooks perform transformations and aggregations for semantic models.
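The normalization step inside each function can be sketched as follows; the event fields here are hypothetical stand-ins, not Kibo's real webhook schema:

```python
from datetime import datetime, timezone

def normalize_event(event: dict) -> dict:
    """Flatten a webhook payload into one record ready for Delta storage.

    Field names ("id", "topic", "body") are illustrative assumptions.
    """
    return {
        "event_id": event["id"],
        "domain": event["topic"].split(".")[0],  # "order.updated" -> "order"
        "entity_id": event["body"]["id"],
        "received_at": datetime.now(timezone.utc).isoformat(),
        "payload": event["body"],  # keep the raw body for replay/debugging
    }

evt = {"id": "e-1", "topic": "order.updated", "body": {"id": "o-42", "total": 10.0}}
rec = normalize_event(evt)
```

Deriving the domain from the event topic is what lets OneLake stay organized by domain (Orders, Shipments, Returns, Payments) without any per-event routing configuration.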
I also built semantic datasets in Fabric — "OrdersNBA", "ShipmentsNBA", "PaymentsNBA", and "ReturnsNBA" — for Power BI and executive dashboards.
Impact
The new pipeline replaced hours of manual exports with automated, near-real-time analytics. It provided better visibility into e-commerce performance, fulfillment delays, and refund trends.
This project solidified my expertise in data engineering and cloud integration and gave me hands-on experience connecting enterprise commerce systems with Microsoft's latest data platform technologies. It also reinforced my love for building reliable, scalable systems that directly improve how businesses operate.