Questions tagged [data-masking]

Data masking is the practice of replacing meaningful data with non-meaningful, or "masked", data for use in development, testing, and similar non-production work. It is generally used to protect personally identifiable information from being seen outside of a production system.

The main reason for masking a data field is to protect data classified as personally identifiable, personally sensitive, or commercially sensitive. However, the data must remain usable for valid test cycles. It must also look real and appear consistent.
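
One way to keep masked values consistent is to derive the substitute deterministically from the real value, so the same input always maps to the same output across tables and refreshes. The sketch below is a minimal illustration in Python, not a reference implementation: the key, the substitute list, and the function name `mask_surname` are all hypothetical, and a real substitute list would need to be much larger to avoid many real names collapsing onto the same masked name.

```python
import hashlib
import hmac

# Hypothetical secret and substitute list, for illustration only.
# The key should be kept outside the non-production environment.
SECRET_KEY = b"replace-with-a-secret-kept-outside-test"
SUBSTITUTE_SURNAMES = ["Archer", "Bennett", "Castillo", "Dawson", "Ellis", "Foster"]


def mask_surname(real_surname: str) -> str:
    """Map a real surname to a substitute; the same input always gives the same output."""
    # Normalize so that "Smith" and " smith " are treated as the same person.
    normalized = real_surname.strip().lower().encode("utf-8")
    # A keyed hash makes the mapping hard to rebuild without the key.
    digest = hmac.new(SECRET_KEY, normalized, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(SUBSTITUTE_SURNAMES)
    return SUBSTITUTE_SURNAMES[index]


if __name__ == "__main__":
    # Both rows refer to the same person, so both get the same masked surname.
    print(mask_surname("Smith"), mask_surname(" smith "))
```

Because the mapping depends only on the input and the key, joins between masked tables still line up, which is what "consistent" requires here.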

The primary concern from a corporate governance perspective is that personnel working in non-production environments do not always have the security clearance to handle the information contained in production data. Using unmasked production data in those environments creates a security hole: data can be copied by unauthorized personnel, and the controls that apply in production can be easily bypassed. This is an access point for a data security breach.

A well-conceived data-masking program will have, among others, the following attributes:

  1. The data must remain meaningful to the application logic. For example, if address elements are obfuscated and cities and suburbs are replaced with substitutes, any application feature that validates or looks up postcodes must still operate without error and as expected. The same is true for credit-card number validation algorithms and Social Security Number validation.

  2. The data must be changed enough that it is not obvious that the masked data derives from production data. For example, it may be common knowledge in an organisation that there are 10 senior managers all earning in excess of $300K. If a test environment of the organisation's HR system also includes 10 identities in the same earnings bracket, then other information could be pieced together to reverse-engineer a real-life identity. In principle, if it is obvious that the data has been masked or obfuscated from production data, someone intending a data breach could reasonably assume that they could reverse-engineer identity data, given some knowledge of the identities in the production data set. Accordingly, masking must be applied to a data set in a way that protects identity and sensitive data records as a whole, not just the individual data elements in discrete fields and tables.
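
The first attribute can be illustrated with card numbers, which are commonly validated with the Luhn check-digit algorithm. The Python sketch below replaces most digits with random ones and then recomputes the check digit so that the masked number still passes validation. It is a rough sketch under stated assumptions: the input is a digits-only string, keeping the first six digits is an illustrative choice rather than a requirement, and the function names are hypothetical.

```python
import random


def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string that lacks its check digit."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled, then every other one.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """Check that the last digit matches the Luhn check digit of the rest."""
    if not number.isdigit() or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card_number(real_number: str, rng: random.Random) -> str:
    """Keep the first six digits, randomize the rest, and recompute the check digit."""
    prefix = real_number[:6]
    body_length = len(real_number) - len(prefix) - 1
    body = "".join(str(rng.randrange(10)) for _ in range(body_length))
    payload = prefix + body
    return payload + luhn_check_digit(payload)


if __name__ == "__main__":
    # 4111111111111111 is a widely used test card number, not a real account.
    masked = mask_card_number("4111111111111111", random.Random())
    print(masked, is_luhn_valid(masked))
```

The masked value keeps the original length and still satisfies the check-digit rule, so application-side validation continues to work. Note that `random` is not a cryptographic generator; the `secrets` module may be a better fit where predictability matters.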

24 questions
4 votes, 2 answers

Scrubbing sensitive data

I am looking for an automated solution to scrub sensitive data from my prod environment to my DEV and DEVINT environments so that I don't have to write lots of code to get this done. Does anyone know if Data Quality Services and a data cleansing…
Easportsaz11
4 votes, 1 answer

SQL Data Masking a copy of data for backup

I need to send a database backup to a vendor for an upgrade, and somehow need to mask several columns containing PII. Was looking into static data masking, but this seems to change the data permanently. Dynamic data masking seems better as it…
BrianC
4 votes, 4 answers

Create SQL Randomized Date of Birth

Does anyone have any SQL code to automatically generate randomized birth dates, where the date of birth is earlier than today? Please add date-of-birth range parameters, e.g. from 18 to 70 years old. Is there any inline SQL or function to do this? We…
user129291
3 votes, 2 answers

How does my system understand if data got masked?

I was trying to understand the difference between encryption and masking. The statement below says that the real data is replaced and gone! Masking protects your data by transforming it into a readable format that’s useless to anyone who steals it. The actual…
kudlatiger
3 votes, 1 answer

Dynamic Data Masking Issue when Concatenating Fields

You can reproduce the issue here: CREATE TABLE [dbo].[EmployeeDataMasking]( [RowId] [int] IDENTITY(1,1) NOT NULL, [EmployeeId] [int] NULL, [LastName] [varchar](50) MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL, …
Randy Minder
3 votes, 2 answers

Dynamic Data Masking Doesn't Seem To Work Correctly With ISNULL

Here is the code to reproduce the issue: CREATE TABLE [dbo].[EmployeeDataMasking]( [RowId] [int] IDENTITY(1,1) NOT NULL, [EmployeeId] [int] NULL, [LastName] [varchar](50) MASKED WITH (FUNCTION = 'partial(2, "XXXX", 2)') NOT NULL, …
2 votes, 3 answers

Making production data accessible to developers via masking

We want to provide developers in our organization masked data from production to help troubleshoot production issues. What would be the best way to approach it? I've read this article…
areller
2 votes, 1 answer

Data redaction with regex in Oracle

I would like to redact data with regex in an Oracle database. I know the procedure, but I get unpredictable results. I use the following script to apply redaction to a certain column with card numbers. BEGIN DBMS_REDACT.ADD_POLICY ( object_schema …
RokX
2 votes, 1 answer

Static Data Masking does not appear in SSMS 18 preview 6

I installed SSMS version 18 preview 6 and am using it with SQL Server 2017. I want to test the Static Data Masking feature, but the option does not appear.
2 votes, 2 answers

Masking PII columns in Oracle

I am using Oracle 11g R2 on Linux, and I have multiple tables in my database with columns that need to be masked. I tried using OEM to mask the data as described in the hyperlink below, and it worked fine. …
2 votes, 1 answer

Create Random Word to obfuscate a SQL Column

Does anyone have any SQL code to automatically generate words (all letters, no numbers)? Is there any inline SQL to perform this on a column? We are trying to obfuscate first/last name and other word columns in a table. This answer currently does not…
user129291
2 votes, 2 answers

Can't create indexed view against table with masked columns

I am trying to create an index on a view that references a table with a masked column (SQL Server 2016). The masked column is not the only one in that table, and it's not used in the view. create unique clustered index [IX_Name] on…
1 vote, 1 answer

How to apply Static Data Masking With Replication of Database

We used to have multiple environments for applications, such as Production, Staging, UAT, Dev, and Sandbox. It is a regular task for the DBA to refresh the lower (staging, test, UAT, etc.) environments with a backup of the production database.…
1 vote, 1 answer

Using aggregate function on data masked column returns zeroes

Recently we implemented dynamic data masking in a SQL Server 2019 database in order to hide sensitive information from developers. However, for testing purposes I'd like them to see close-to-real values, so my mask looks like this: CREATE TABLE…
1 vote, 5 answers

Create Random Number up to 50 Digits and Store in Varchar

Does anyone have code for a random number generator function? I would just supply a variable length, e.g. @NumberLength = 50. It should create numbers up to 50 digits long and store them in a varchar. (Bigint cannot store values this large.) I am using…
user129291