Skip to main content

Quickstart

Goal

This Quickstart will explain the Anonymizer workflow:

  • From having a database
  • To running your anonymizing tasks and rules against the database

It will also provide some tips.

To follow along requires access to our services.

The database

We will use the Address and Customer Example Project throughout this Quickstart. The database service provider here is SQL Server. You may follow along with your own database and your own database service provider.

Tip

When developing new anonymizing Tasks and Rules for your database, make sure to do it on a database copy (e.g. in Localhost) where data can easily be reset. This allows you to test your anonymization logic against production data each time you write and run new versions of the Tasks and Rules.

Reset the database from script - reset_database_script.sql

drop table pii_tables.Customer;
drop table pii_tables.Address;

CREATE TABLE pii_tables.Address (
addressNo BIGINT not null,
homeAddress varchar(50) null ,
postalCode varchar(10) null,
created datetime null,
CONSTRAINT Address_PK PRIMARY KEY (addressNo),
);

CREATE TABLE pii_tables.Customer (
ID int NOT NULL identity(1,1),
addressNo BIGINT not null,
email varchar(50) null ,
name varchar(50) null,
created datetime null,
CONSTRAINT Customer_PK PRIMARY KEY (ID),
FOREIGN KEY (addressNo) REFERENCES pii_tables.Address(addressNo)
);

INSERT INTO pii_tables.Address (addressNo, homeAddress, postalCode, created) VALUES (372036854775807, 'Prod Address 1', '1234', '2020-01-28 12:24:06');
INSERT INTO pii_tables.Address (addressNo, homeAddress, postalCode, created) VALUES (137862354775678, 'Prod Address 2', '6542', '2020-07-15 16:35:13');
INSERT INTO pii_tables.Address (addressNo, homeAddress, postalCode, created) VALUES (825675368758452, 'Prod Address 3', '1008', '2021-01-01 14:05:34');
INSERT INTO pii_tables.Address (addressNo, homeAddress, postalCode, created) VALUES (265736898756443, 'Prod Address 4', '4064', '2021-12-21 08:57:30');

INSERT INTO pii_tables.Customer (addressNo,email,name,created) VALUES (372036854775807,'jean.smith@mail.com', null, '2020-01-28 12:24:06');
INSERT INTO pii_tables.Customer (addressNo,email,name,created) VALUES (137862354775678,'m.simmons@somemail.com','Mark Simmons', '2020-07-15 16:35:13');
INSERT INTO pii_tables.Customer (addressNo,email,name,created) VALUES (825675368758452,null,'Edward Snake', '2021-01-01 14:05:34');
INSERT INTO pii_tables.Customer (addressNo,email,name,created) VALUES (265736898756443,'mary.lake@foomail.com','Mary Lake', '2021-12-21 08:57:30');


1. Extract database definition

The only relevant parts of the database are the tables that shall be anonymized, and their constraints.

Export the DDL statements for your tables, and store them in an .sql file.

If you know which tables need anonymizations, and which does not, you may stick to only exporting the required tables.

Exported DDL Statements - creates_and_constraints.sql

-- pii_tables.Customer definition

CREATE TABLE `Customer` (
`ID` int NOT NULL AUTO_INCREMENT,
`addressNo` bigint NOT NULL,
`email` varchar(50) DEFAULT NULL,
`name` varchar(50) DEFAULT NULL,
`created` datetime DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `addressNo` (`addressNo`),
CONSTRAINT `customer_ibfk_1` FOREIGN KEY (`addressNo`) REFERENCES `Address` (`addressNo`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

-- pii_tables.Address definition

CREATE TABLE `Address` (
`ID` int NOT NULL AUTO_INCREMENT,
`addressNo` bigint NOT NULL,
`homeAddress` varchar(50) DEFAULT NULL,
`postalCode` varchar(10) DEFAULT NULL,
`created` datetime DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `Address_UK` (`addressNo`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;


2. Translate

Use our DBano Service to translate your database definitions by providing it the .sql file. Register and log in to gain access.l The service will provide you an .ano file.

This .ano file is also where you will write your Task and Rules for anonymizing your database.

ANO File for your database - address_and_customer.ano


table Address
column integer ID
column integer addressNo
column text homeAddress
size 50
column text postalCode
size 10
column datetime created
primary-key ID
unique-key addressNo addressNo

table Customer
column integer ID
column text email
size 50
column text name
size 50
column datetime created
column integer addressNo
primary-key ID

foreign-key
Address ID
Customer addressNo


/*
...
*/


3. Tasks and Rules

In the .ano file we now write the logic for our desired anonymizations. Consult the Writing Tasks and Writing Rules sections for details on how to do this.

Tip

Edit the .ano file using Visual Studio Code with the Ano Extension installed. This makes your life easier.

ANO File With Tasks and Rules - address_and_customer_done.ano

 // - - - - Database Schema Information - - - - //
table Address
column integer addressNo
column text homeAddress
size 50
column text postalCode
size 10
column datetime created
primary-key addressNo

table Customer
column integer ID
column text email
size 50
column text name
size 50
column datetime created
column integer addressNo
primary-key ID

// - - - - Foreign Keys (Optional) - - - - //
foreign-key
Address addressNo
Customer addressNo


// - - - - Tasks and Rules - - - - //

task updateAdressTasks
{
update Address UpdateAddressFields
mask homeAddress UpdateAddressUsingFile
format %s
file src/main/resources/addresses.txt random-order

mask postalCode UpdateToRandomPostalCodeAsString
format "%d"
//transform PostCodeGeneralization
random-integer 1000 9999

shuffle created
}

task updateCustomerTasks
{
update Customer UpdateCustomerNameUsingTwoFiles
sql-before "delete from Customer where name is null"
mask name UpdateNamesUsingFiles
format "%s %s"
file src/main/resources/firstnames.txt random-order
file src/main/resources/lastnames.txt random-order map name_map.txt output
update Customer UpdateCustomerEmailNotNullSubset
where "email is not null"
mask email UpdateEmailsBasedOnNameColumn
format %s
transform Email // appends @mail.com to formatted names. Results in firstname.lastname@mail.com
column name

update Customer UpdateCustomerRandomizeCreated
randomize created
type datetime // datetime adds noise in seconds
format "%1$tF %1$tT"
convert String2DateTime
offset 360000.0
flat-noise 50000.0
}

task propagateTask
{
update Address
mask addressNo
format %d
unique
sequence 10000 1 // start at 10000, increment by 1
temporary-value 9999999
propagate Customer.addressNo
}


4. Generate

Use our DBmasker Service to generate the runtime project by providing the .ano file. Buy DBmasker and log in to gain access. The service will provide you an .zip file containing a Java 8, Maven 3.6 project.

Tip

Install Java 8 on your computer. Use Intellij (Community or Professional) to setup the Java application. Install maven.


5. Configure

Configure the Java 8 application by using this guide. There are two required steps. First, you must add the Maven Dependency that properly connects the Java application to your database. Second, you must provide the Java application the database URL and login credentials.

Maven Dependency for the SQL Server Database Provider - pom.xml

  <dependencies>
...
<dependency>
<groupId>com.microsoft.sqlserver</groupId>
<artifactId>mssql-jdbc</artifactId>
<version>9.4.0.jre8</version>
</dependency>
...
</dependencies>

Edit database URL and credentials - src/main/resources/config.properties

# Database connection parameters
connection.host = localhost
connection.port = 1433
connection.db = master
connection.schema = pii_tables
connection.user = sa
connection.password = myPassword
connection.driverClass = com.microsoft.sqlserver.jdbc.SQLServerDriver
connection.url = jdbc:sqlserver://localhost:1433;database=master
file.encryptionkey =
sql.wrapper = "
...


6. Execute Tasks

Execute tasks by using this guide.