10 September 2025
Metadata Frameworks in Microsoft Fabric: Logging with Eventhouse (Part 2)
Following up on the previous post about YAML-based metadata frameworks, let’s talk about logging – the part of the framework that often stays invisible until something fails. In Part 1, YAML helped us replace config tables with cleaner, version-controlled definitions. Now, logging ensures we have the visibility to understand what’s really happening inside our Fabric pipelines.
Because without proper logs, troubleshooting a failed run is a bit like fixing a car in the dark – you know something broke, but you have no clue where the problem is.
When we switched to YAML for configurations, we moved away from storing pipeline metadata in the warehouse or SQL config tables. For logging, we deliberately chose Eventhouse and its KQL database.
Why? Because KQL is purpose-built for this kind of workload: high-volume, append-only ingestion, real-time queries, and built-in time-series analysis over semi-structured log data.
This is exactly how Azure Monitor and Log Analytics work under the hood, so using KQL in Fabric for logging isn’t reinventing the wheel; it’s adopting a proven pattern within the Fabric ecosystem.
That said, the Eventhouse approach is not without limits. On lower Fabric SKUs, we’ve seen throttling under moderate to high concurrency. Retry mechanisms (like exponential backoff) can help, but they’re still workarounds rather than a complete fix.
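The retry idea is straightforward to sketch. The helper below wraps any write call with exponential backoff and jitter; `with_backoff` and its parameters are illustrative, not part of the framework, and the wrapped function stands in for a throttled Eventhouse write.

```python
import random
import time


def with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry fn on failure, doubling the wait (plus jitter) each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the original error
            # exponential backoff: base, 2x base, 4x base, ... capped at max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, 0.5))
```

Even with backoff in place, sustained throttling on a small SKU just stretches run times, which is why we call this a workaround rather than a fix.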
While logs could technically be stored in a Warehouse or Lakehouse, in practice those engines are tuned for analytical batch queries rather than for many small, concurrent log writes.
Even if the logging volume is “not that high,” KQL’s concurrency handling and simple write API make it the most practical choice.
We capture logs at two different levels: ingestion pipeline execution details and overall orchestration.
After each ingestion pipeline finishes, a generic pipeline step calls a KQL activity that logs the run's execution details, such as status and duration.
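The exact schema is a framework choice; a single ingestion log row might look like the sketch below. All field names and values here are illustrative assumptions, not the framework's actual schema.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical shape of one ingestion log row (field names are illustrative).
log_entry = {
    "RunId": str(uuid.uuid4()),
    "PipelineName": "ingest_sales_orders",  # assumed example pipeline name
    "Status": "Succeeded",
    "RowsRead": 125_000,
    "RowsWritten": 125_000,
    "DurationSeconds": 42.7,
    "Timestamp": datetime.now(timezone.utc).isoformat(),
}

print(json.dumps(log_entry, indent=2))
```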
A single orchestration notebook reads/parses the YAML, builds the DAG, and executes tasks.
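At a high level, that step amounts to turning the YAML definitions into task-dependency pairs and walking them in topological order. A minimal sketch using Python's standard-library `graphlib` (the task names are hypothetical, and the `print` stands in for the actual notebook run call):

```python
from graphlib import TopologicalSorter

# Tasks and their upstream dependencies, as they might come out of the YAML.
# (Names are illustrative; the real definitions live in the framework's YAML.)
tasks = {
    "ingest_customers": set(),
    "ingest_orders": set(),
    "transform_sales": {"ingest_customers", "ingest_orders"},
    "publish_model": {"transform_sales"},
}

# static_order() yields tasks so every dependency runs before its dependents.
execution_order = list(TopologicalSorter(tasks).static_order())
for task in execution_order:
    print(f"running {task}")  # in the real orchestrator: run the task notebook
```

A real orchestrator would also run independent branches in parallel and stop dependents when an upstream task fails, but the ordering logic is the same.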
Instead of adding logging code inside every task notebook, we use a notebook wrapper.
The wrapper writes each log entry to Eventhouse using the Kusto Spark Connector.
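A rough sketch of what the wrapper assembles is below. The `build_task_log` helper and its field names are illustrative, and the commented-out write shows the general shape of a Kusto Spark Connector append; the cluster, database, and table values are placeholders, not the framework's actual settings.

```python
from datetime import datetime, timezone


def build_task_log(task_name, status, started_at, finished_at, error=None):
    """Assemble one log row for a wrapped task run (schema is illustrative)."""
    return {
        "TaskName": task_name,
        "Status": status,
        "StartTime": started_at.isoformat(),
        "EndTime": finished_at.isoformat(),
        "DurationSeconds": (finished_at - started_at).total_seconds(),
        "Error": error,
    }


# In the wrapper notebook, the row is written to Eventhouse with the
# Kusto Spark Connector, roughly like this (values are placeholders):
#
#   df = spark.createDataFrame([build_task_log(...)])
#   (df.write
#      .format("com.microsoft.kusto.spark.datasource")
#      .option("kustoCluster", "<eventhouse-query-uri>")
#      .option("kustoDatabase", "<kql-database>")
#      .option("kustoTable", "TaskLogs")
#      .option("accessToken", token)
#      .mode("Append")
#      .save())
```

Keeping the record-building separate from the write makes the wrapper easy to unit test without a Spark session.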
This gives us granular ingestion logs from pipelines and consistent, centralised task logs from the orchestration layer — all in Eventhouse for analysis and troubleshooting.
Logging isn’t just a side feature; it’s the backbone of reliable pipelines. By choosing Eventhouse, we align with Fabric’s strengths — specifically, KQL for ingestion, analysis, and time-series queries — while maintaining consistency in both pipeline and notebook logging.
Other engines, such as a Warehouse or Lakehouse table, can also be considered depending on needs.
For most scenarios, Eventhouse remains the natural fit: scalable ingestion, real-time visibility, and built-in time-series analysis — exactly what operational logging needs.
In the next part of this series, we’ll look at how DevOps pipelines were set up for YAML deployment in Fabric, covering version control, environment promotion, and approval workflows.