Mastering Data Privacy Levels: A Guide to Resolving Firewall Errors in Power BI

The significance of data privacy levels and their impact on Power Query's Firewall

Dhyanendra Singh Rathore
Microsoft Power BI

--

Photo by Matthew Henry on Unsplash

Data privacy levels are one of the least discussed and underrated security features of Power BI. For most of the Power BI developers, self-service business analysts, and advanced business users, it solely exists to frustrate them with no clear benefits.

In this article, we will discuss the importance and role of data privacy levels and their impact on Power Query's data privacy Firewall. Additionally, we will explore practical methods to tackle Firewall errors and address them appropriately.

Please read our terms of use before proceeding with this article: https://dhyanintech.medium.com/disclaimer-disclosure-terms-of-use-fb3bfbd1e0e5

Data Privacy Levels

In Power BI, data privacy levels define the isolation level between data sources and influence the extent to which one data source remains separate. Privacy levels provide an additional layer of security by ensuring that data doesn't unintentionally leak across the data sources. A trade-off is that privacy levels can reduce the functionality and performance of Power Query operations.

For example, suppose you have a table with credit card numbers sourced from a SQL database. And you're using an external REST service to get the card details, such as card type, card issuer, etc. One way to achieve this is to get the credit card numbers from the SQL table and pass them to the REST service. Data privacy levels exist to define and determine whether such a mashup of data across data sources is allowed.

In Power Query terms, privacy levels determine whether data from a specific source will be "folded" into other sources. How does Power Query ensure privacy levels are implemented and respected during data mashups? By using a Firewall. More on folding and Firewall later in the article.

Privacy Levels and Compatibility

There are three distinct privacy levels with well-defined compatibility with each other. The compatibility determines whether it is acceptable to share data between different sources. Understanding the different privacy levels and their compatibility is essential to combine data in a way that Firewall can allow.

Power BI Data Privacy Levels and Compatibility (Image by author)

Where do we define the data privacy levels?

Privacy levels are defined for data sources and can be found under each data source's setting in both Power BI Desktop and Power BI Service.

Power BI Desktop: In Power Desktop, data source settings can be accessed from multiple places

  • File > Options and settings > Data source settings
  • Home > Transform data > Data source settings
  • Home > Transform data > Power Query Editor > Home > Data source settings

Select a data source from the list and then choose Edit Permissions. Under Privacy Level, select a level in the drop-down.

In the Data source settings prompt, data sources are categorized into two different groups:

  • Data sources in current file: lists the data sources connected to the current file
  • Global permissions: lists all the data sources connected across all the files that you've opened in your Power BI Desktop
Power BI Desktop: Privacy levels for a data source (Image by author)

Power BI uniformly applies data source permissions, such as credentials, privacy level, encryption, etc., across all the files. So if we change the privacy level for a data source in the current file, the privacy level for that data source will change for all the files connected to that data source and vice-versa.

Power BI Service: In Power BI Service, data source settings are available for Dataset, Dataflow, and Datamart and can be found under More Options(…) > Settings > Data source credentials

Power BI Service: Privacy levels for a data source (Image by author)

Can we control when to apply the privacy levels?

Power BI Desktop has a separate set of settings that allows us to disable or enforce the privacy levels for the current or all the files. Global settings take priority over the current file's setting.

  • File > Options and settings > Options > GLOBAL/Privacy
  • File > Options and settings > Options > CURRENT FILE/Privacy
Power BI Desktop: Privacy settings (Image by author)

These settings can be described as

Privacy level file settings (Image by author)

It’s critical to remember that while Power BI Desktop does provide an option to ignore the privacy levels, the setting doesn’t work in the Power BI Service.

Reports that ignore the privacy levels and are published to the Power BI Service don’t adhere to the setting and the data refresh fails.

What if we don't set privacy levels?

Power BI Desktop is intuitive and prompts the user to set the privacy levels during data previews and refresh if the following conditions are met:

  • Privacy level settings are not set to ignore the privacy levels
  • Privacy level for at least one data source is not set
  • Data mashup operations are performed across data sources
Power BI Desktop: Privacy level required prompt (Image by author)

Folding and Firewall

"Folding" refers to converting M expressions, like filters, renames, joins, etc., into operations executed against a data source such as SQL or OData. One of the significant strengths of Power Query is its ability to translate user actions from its UI into more complicated SQL or other backend data source languages.

This capability allows for seamless integration and execution of user-defined operations using a common set of commands on various data sources, leveraging the performance benefit of native data source operations.

Consider a scenario where you have a huge fact table coming from a SQL data source, and you're joining this table with a small CSV file in Power Query. One way to achieve this is to read the entire SQL table, read the CSV file, and join them together in the local memory. However, Power Query may determine during folding that the most efficient way to execute this mashup is to read the CSV file, inline the data into a SQL statement, and let the database perform the join.

How are privacy levels and folding related to Firewall errors?

Power Query's Data Privacy Firewall enforces the data privacy levels during folding in Power Query. The Firewall's primary purpose is to safeguard against unintentional data leakage between data sources.

Unintentional data leakage can happen due to the folding of expressions, and Firewall is designed to prevent such data leaks. In our example of joining a SQL table with a CSV file, folding can cause unintentional data leakage if Power Query sends the CSV data to the SQL data source. On the other hand, our example of credit card numbers and REST service was intentional data leakage.

Regardless of whether intentional or unintentional, the Firewall blocks the data mashup operations and raises errors if the following rule is broken:

A partition may either access compatible data sources, or reference other partitions, but not both.

What is a partition?

To accomplish its duty, the Firewall divides the M queries into partitions. In simple terms, a partition is a collection of one or more query steps. And at the most basic level, a partition is a single query step.

Power Query Editor: Steps and Queries (Image by author)

A detailed description of how the partitions are created and why the Firewall needs to ensure that the rule is enforced is beyond the scope of this article. However, you can find an elaborated article here.

To summarize, Firewall raises an error when it detects that a partition can access an incompatible data source and another partition simultaneously.

The errors signify that the Firewall can’t enforce the data privacy during data refresh.

Resolving Firewall errors

Based on the above rule, the Firewall can raise two different types of errors. The errors are only raised if privacy rules are enforced, i.e., the privacy level setting is not set to ignore the privacy levels. In other words, you can resolve both errors by disabling the Firewall from the privacy level settings. However, remember that the ignore privacy level setting doesn't work in the Power BI Service, so your future refreshes will not succeed.

Consider a scenario with a local CSV file with country codes and a SQL table with sales records. We want to read the country codes from the CSV file and then pass the country codes in the SQL statement to load only the related sales records from the SQL table. Let's look at how to resolve the Firewall errors for this scenario.

Query references other queries or steps

First, we create a Country query to read the country codes from the local CSV file and some steps to convert it to a quoted and comma-separated list for the SQL statement.

-- Country
let
Source = Csv.Document(
File.Contents("C:\Users\dhyan\Downloads\Countries.csv"),
[Delimiter = ",", Columns = 1, Encoding = 1252, QuoteStyle = QuoteStyle.None]
),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
Added_Column = Table.AddColumn(#"Promoted Headers", "Quoted", each "'" & [Country] & "'"),
To_Text = Text.Combine(Added_Column[Quoted], ",")
in
To_Text

Next, we create a Sales query that references the Country and passes the data to the SQL server.

-- Sales
let
Source = Sql.Database(
"everydaybi-sql-server-dev.database.windows.net",
"everydaybi-sql-dw-dev",
[ Query = "SELECT
[Sales Date]
,[ISO]
,[ID Business Area]
,[ID Product Line]
,[Sales]
FROM [everydaybi].[fact_sales_view]
WHERE [ISO] IN ( " & Country & ")",
CreateNavigationProperties = false
]
)
in
Source

While evaluating the Sales query, the Firewall raises an error.

Formula.Firewall: Query 'Sales' (step 'Source') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

Power Query Editor: Firewall error — Query references other queries or steps (Image by author)

The error is raised because the Source step of Sales violates the Firewall's rule by referencing a data source (SQL server) and another partition (from Country) in the same partition.

A simple way to resolve this error is to flatten and combine both queries. We can combine the Country and Sales queries as follows (let's call it Sales Flattened):

-- Sales Flattened
let
Source = Csv.Document(
File.Contents("C:\Users\dhyan\Downloads\Countries.csv"),
[Delimiter = ",", Columns = 1, Encoding = 1252, QuoteStyle = QuoteStyle.None]
),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
Added_Column = Table.AddColumn(#"Promoted Headers", "Quoted", each "'" & [Country] & "'"),
To_Text = Text.Combine(Added_Column[Quoted], ","),
Get_Table = Sql.Database(
"everydaybi-sql-server-dev.database.windows.net",
"everydaybi-sql-dw-dev",
[ Query = "SELECT
[Sales Date]
,[ISO]
,[ID Business Area]
,[ID Product Line]
,[Sales]
FROM [everydaybi].[fact_sales_view]
WHERE [ISO] IN ( " & To_Text & ")",
CreateNavigationProperties = false
]
)
in
Get_Table

Now everything happens inside a single partition, and assuming that the privacy levels for both the data sources are compatible, the Firewall allows the query to execute.

Power Query Editor: Resolving Firewall errors by flattening queries (Image by author)

Incompatible privacy levels

Continuing with our example, we have two sources: a CSV file and a SQL server. After rewriting the queries, the Firewall will only allow the query to execute if the privacy levels of both sources are compatible. Let's change the privacy level of CSV to Private and SQL server to Organizational.

The Firewall raises an error while evaluating the Sales Flattened query.

Formula.Firewall: Query 'Sales Flattened' (step 'Get_Table') is accessing data sources that have privacy levels which cannot be used together. Please rebuild this data combination.

Power Query Editor: Firewall error — Incompatible privacy levels (Image by author)

The error is raised because the Get_Table step of Sales Flattened is violating the Firewall's rule by referencing incompatible data sources in the same partition.

To resolve this error, we must set compatible privacy levels for both the sources, i.e., Organizational in our case.

And with that, we reach the end. That's all for now.

Conclusion

In this article, we delved into the significance and function of data privacy levels in Power BI and their impact on the Power Query's Firewall. Resolving Firewall errors can often be a complex task. You can now troubleshoot these errors in your specific scenarios with our insights.

Further reading

If you face Firewall errors in Tabular Editor, resolving them may be more complex than in Power BI Desktop and may involve additional steps. Fortunately, we have an excellent article to help you overcome these challenges effectively.

Coming Soon . . .

Like this post? Connect with Dhyan

Let's be friends! You can find me on LinkedIn or join me on Medium.

👉 Don’t forget to subscribe to our

Power BI publication and Weekly Newsletter

--

--