Power Query Just Became a Programmable Engine — and That Changes Everything for Fabric
The Challenge
Every organisation that uses Microsoft Fabric, Power BI, or Excel has Power Query scripts. Thousands of them. M language transformations that clean, reshape, and prepare data across the business. The problem? Those scripts have always been trapped inside interactive tools and scheduled dataflow refreshes.
If you wanted to call an M transformation from a Python notebook, you couldn't. If you wanted to trigger a Power Query script as part of an automated pipeline, you had to work around the limitation with clunky refresh triggers and polling. And if you wanted to mix Power Query logic with Spark or SQL in the same workflow, you were out of luck.
This created an odd split in the data engineering world. Low-code teams build in Power Query. Code-first engineers write Spark and SQL. The two rarely meet, and when they do, someone rewrites the logic from scratch.
What's Changed
Microsoft has released the Execute Query API in public preview — a REST endpoint that lets you run Power Query M scripts programmatically within Fabric. You send an M script (or reference an existing query in a Dataflow Gen2 artifact), and you get results back as an Apache Arrow stream. That stream slots straight into Spark DataFrames or Pandas without any conversion overhead.
Here's what the call looks like from a Fabric notebook:
```python
import requests
import pyarrow as pa

# Token scoped to the Power BI / Fabric resource
token = notebookutils.credentials.getToken(
    "https://analysis.windows.net/powerbi/api/"
)

url = (
    f"https://api.fabric.microsoft.com/v1/"
    f"workspaces/{workspace_id}/dataflows/{dataflow_id}/executeQuery"
)

body = {
    "queryName": "CleanedSalesData",
    "customMashupDocument": "section Section1; shared CleanedSalesData =..."
}

# stream=True keeps the Arrow payload from being buffered in full
response = requests.post(url, headers={
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}, json=body, stream=True)

# Parse the Arrow IPC stream straight into a Pandas DataFrame
with pa.ipc.open_stream(response.raw) as reader:
    df = reader.read_pandas()
```
The API operates against a Dataflow Gen2 artifact, which provides the execution context — the connections, credentials, and gateway configuration. You can either reference an existing named query in the dataflow or pass a custom M script at runtime.
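As a sketch, the two request-body shapes might look like this. The keys are taken from the example above; the minimal named-query body (without a custom mashup document) is an assumption based on the endpoint's description, not confirmed documentation:

```python
import json

# Mode 1 (assumed shape): run a query already defined in the dataflow.
body_named = {"queryName": "CleanedSalesData"}

# Mode 2: supply a custom M mashup document at call time. The query name
# must match a query declared inside the document.
body_custom = {
    "queryName": "CleanedSalesData",
    "customMashupDocument": (
        'section Section1; '
        'shared CleanedSalesData = '
        'Table.FromRecords({[Region = "EMEA", Sales = 1200]});'
    ),
}

payload = json.dumps(body_custom)
```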
This is not just another REST wrapper. It turns Power Query into a first-class compute engine within Fabric that can be called from notebooks, pipelines, external applications, or anything that speaks HTTP. The Arrow-based response format means results are columnar and fast, without the overhead of CSV or JSON serialisation.
Three capabilities stand out:
Dynamic M scripts at runtime. You can pass different M expressions with each API call. This means parameterised transformations — the same endpoint can clean different datasets based on runtime input. For agentic workflows, this is significant: an AI agent could construct and execute M transformations on demand.
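A hypothetical sketch of that parameterisation using plain Python string templating — the query name, source table, column, and threshold below are illustrative, not part of the API:

```python
def build_m_script(query_name: str, source_table: str, min_amount: int) -> str:
    """Assemble a mashup document that filters a source table at runtime.

    All identifiers here are illustrative; a real script would reference
    connections configured in the Dataflow Gen2 artifact.
    """
    return (
        f"section Section1; "
        f"shared {query_name} = "
        f"let "
        f"  Source = {source_table}, "
        f"  Filtered = Table.SelectRows(Source, each [Amount] >= {min_amount}) "
        f"in "
        f"  Filtered;"
    )

# Each call can produce a different transformation from the same endpoint —
# the building block an agent would need to compose M on demand.
script = build_m_script("CleanedSalesData", "Sales", 500)
```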
Gateway and on-premises access. The API supports gateway-backed connections. If your data lives behind a firewall or in a private network, you can still invoke Power Query against it programmatically. This is one of the few API surfaces in Fabric that bridges cloud and on-premises in a single call.
Cross-engine integration. Power Query results flow directly into Spark and SQL workflows. A notebook can call Power Query for the transformation step, then push the Arrow DataFrame into a lakehouse table using Spark. No intermediate files, no staging layers.
Getting Started
The prerequisites are straightforward:
- Create a Dataflow Gen2 (CI/CD) artifact in your Fabric workspace. This acts as the execution context.
- Configure connections in the dataflow for the data sources your queries will access.
- Acquire a token using Azure CLI or notebookutils.credentials.getToken() in a Fabric notebook.
- POST to the Execute Query endpoint with your workspace ID, dataflow ID, and either a query name or custom M script.
The Execute Query API Reference has full parameter documentation. Start with a simple query that references a lakehouse table and verify the Arrow response parses correctly.
For pipeline integration, wrap the API call in a Notebook Activity within a Fabric pipeline. Store your M scripts in Git alongside your pipeline definitions — version-controlled transformations that can be tested independently.
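One way to keep those Git-tracked scripts and the API call together is a small helper that assembles the endpoint and payload — the function name, file layout, and IDs below are illustrative, not part of the API:

```python
from pathlib import Path
from typing import Optional

FABRIC_API = "https://api.fabric.microsoft.com/v1"  # base URL from the example above

def execute_query_request(workspace_id: str, dataflow_id: str,
                          query_name: str,
                          m_script_path: Optional[Path] = None):
    """Build the Execute Query URL and body. The M script, if given, is
    read from a Git-tracked file so it can be reviewed and tested
    independently of the pipeline that calls it."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/dataflows/{dataflow_id}/executeQuery"
    body = {"queryName": query_name}
    if m_script_path is not None:
        body["customMashupDocument"] = m_script_path.read_text()
    return url, body

# Hypothetical IDs; in a pipeline these would come from parameters.
url, body = execute_query_request("ws-123", "df-456", "CleanedSalesData")
```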
A practical first step: identify the M scripts your team already maintains in Dataflow Gen2. Pick one that runs on a schedule, convert it to an API call from a notebook, and compare the results. You'll see immediately whether the API's 90-second execution timeout is a constraint for your workloads.
What This Means
Power Query has over a decade of investment behind it. Millions of M scripts exist across organisations. Making those scripts callable via API doesn't replace Spark or SQL — it makes them composable with Spark and SQL.
In my experience working with enterprise customers, the most painful data engineering problems aren't about individual transformations. They're about the gaps between tools. The Power BI team builds one pipeline. The data engineering team builds another. Both transform the same data, differently, because the tools didn't talk to each other.
The Execute Query API doesn't solve all of that, but it removes a major barrier. M scripts become reusable assets rather than locked-in artefacts. And for organisations exploring agentic data workflows — where AI systems orchestrate data preparation — having a programmable transformation API is a prerequisite, not a nice-to-have.
The 90-second timeout and read-only limitation are real constraints for now. But for the pattern of "transform this subset, return the result" — which covers a huge proportion of data preparation tasks — this is exactly what was missing.
Leon Godwin, Principal Cloud Evangelist at Cloud Direct