3

Here is the code to reproduce the issue:

CREATE TABLE [dbo].[EmployeeDataMasking](
    [RowId] [int] IDENTITY(1,1) NOT NULL,
    [EmployeeId] [int] NULL,
    [LastName] [varchar](50) MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL,
    [FirstName] [varchar](50) MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL,
 CONSTRAINT [PK_EmployeeDataMasking] PRIMARY KEY CLUSTERED 
(
    [RowId] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY],
) ON [PRIMARY]
GO

Insert Into dbo.EmployeeDataMasking (EmployeeId, LastName, FirstName) VALUES( 1,'Smithsonian','Daniel'),( 2,'Templeton','Ronald')

-- Partial data masking works correctly here Select
EmployeeId, LastName, FirstName From dbo.EmployeeDataMasking

-- Partial data masking does not work correctly here. Full masking is used. Select
EmployeeId, ISNULL(LastName,'') as LastName, ISNULL(FirstName,'') as FirstName From dbo.EmployeeDataMasking

enter image description here

It appears, when using ISNULL, the partial masking is being converted to a full (default) mask. Is this the expected behavior?

Update: I discovered that COALESCE does not exhibit this problem. These queries return the expected results:

Select  
    EmployeeId,
    COALESCE(LastName,'') as LastName,
    COALESCE(FirstName,'') as FirstName
From dbo.EmployeeDataMasking

Using Azure SQL Database

i-one
  • 2,374
  • 2
  • 14
  • 23
Randy Minder
  • 2,032
  • 4
  • 22
  • 41

2 Answers2

2

As a test, I created a custom user function that does nothing except pass the string back to the caller; it also exhibits this behavior.

Be aware, this code will drop users and objects, so only run it in a test database.

First, we create the required objects, users, and insert some data:

IF OBJECT_ID(N'dbo.EmployeeDataMasking', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.EmployeeDataMasking;
END
IF OBJECT_ID(N'dbo.SomeFunc', N'FN') IS NOT NULL
BEGIN
    DROP FUNCTION dbo.SomeFunc;
END
GO

IF EXISTS ( SELECT 1 FROM sys.database_principals dp WHERE dp.name = N'SomeUser' ) BEGIN DROP USER [SomeUser]; END CREATE USER [SomeUser] WITHOUT LOGIN; ALTER ROLE db_datareader ADD MEMBER [SomeUser];

CREATE TABLE [dbo].[EmployeeDataMasking]( [RowId] [int] IDENTITY(1,1) NOT NULL, [EmployeeId] [int] NULL, [LastName] varchar MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL, [FirstName] varchar MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL, CONSTRAINT [PK_EmployeeDataMasking] PRIMARY KEY CLUSTERED ( [RowId] ASC )WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY], ) ON [PRIMARY] GO

GO CREATE FUNCTION dbo.SomeFunc(@i varchar(50)) RETURNS varchar(50) AS BEGIN RETURN @i; END GO ALTER AUTHORIZATION ON dbo.SomeFunc TO [SOmeUser]; GRANT EXECUTE TO [SomeUser]; GRANT SHOWPLAN TO [SomeUser];

Insert Into dbo.EmployeeDataMasking (EmployeeId, LastName, FirstName) VALUES( 1,'Smithsonian','Daniel'),( 2,'Templeton','Ronald')

Next, we'll query the data from the perspective of the non-privileged user:

EXECUTE AS USER = N'SomeUser';
Select  
    EmployeeId,
    ISNULL(LastName,'') as LastName,
    ISNULL(FirstName,'') as FirstName,
    dbo.SomeFunc(LastName) as LastName,
    dbo.SomeFunc(FirstName) as FirstName,
    COALESCE(LastName,'') as LastName,
    COALESCE(FirstName,'') as FirstName
FROM dbo.EmployeeDataMasking
REVERT

The results:

╔════════════╦══════════╦═══════════╦══════════╦═══════════╦══════════╦═══════════╗
║ EmployeeId ║ LastName ║ FirstName ║ LastName ║ FirstName ║ LastName ║ FirstName ║
╠════════════╬══════════╬═══════════╬══════════╬═══════════╬══════════╬═══════════╣
║          1 ║ xxxx     ║ xxxx      ║ xxxx     ║ xxxx      ║ SmXXXXan ║ DaXXXXel  ║
║          2 ║ xxxx     ║ xxxx      ║ xxxx     ║ xxxx      ║ TeXXXXon ║ RoXXXXld  ║
╚════════════╩══════════╩═══════════╩══════════╩═══════════╩══════════╩═══════════╝

As you can see, the custom user function, dbo.SomeFunc exhibits the same behavior as ISNULL. At this point, I'm unsure why, since COALESCE does not exhibit the unwanted behavior.

Hannah Vernon
  • 70,928
  • 22
  • 177
  • 323
2

It appears, when using ISNULL, the partial masking is being converted to a full (default) mask. Is this the expected behavior?

The product documentation says:

Whenever you project an expression referencing a column for which a data masking function is defined, the expression is also masked. Regardless of the function (default, email, random, custom string) used to mask the referenced column, the resulting expression will always be masked with the default function.

To that extent yes, it is expected that projecting an expression over a masked column would return data masked using the default function rather than any more specific masking function on the column.

ISNULL(masked_column, value) counts as applying a function to the column.

This remains the case even though the ISNULL is simplified away in your example (because the column is defined as NOT NULL).

SQL Server could reassess the type of data masking after performing this simplification, but it does not.

I discovered that COALESCE does not exhibit this problem

COALESCE expands to a CASE expression of the form:

CASE WHEN masked_column IS NOT NULL THEN masked_column ELSE '' END

This is not considered as applying a function to column since the column references appear unmodified. The CASE itself does not count as an expression over the column in this context.

Applying a function to the column reference directly or to the result of the CASE would result in masking using the default function:

-- Masked with the default function due to the COLLATE
-- (assuming the database collation is not Latin1_General_BIN2)
SELECT  
    EmployeeId,
    LastName = COALESCE(LastName,'') COLLATE Latin1_General_BIN2,
    FirstName = COALESCE(FirstName,'') COLLATE Latin1_General_BIN2
FROM dbo.EmployeeDataMasking

db<>fiddle demo

The documentation is light on detail in this area, but the behaviour is as intended.

Paul White
  • 94,921
  • 30
  • 437
  • 687