Transform
Goal
Use transform
to parse the values created by the anonymization rule. The transform is a java function that takes the output of the format
string as input, and processes them before storing the results to the database.
The difference between a convert
and a transform
is that a convert
is done on an input source
and a transform
is done on the output of the anonymization rule, i.e., the output of the format
string.
Use cases for transform
Use transform
to ensure that the generated anonymized data follows business specific structural rules.
Ano Structure and Examples
- Structure
- Custom Structure
- Email transform Example
- CreditCard transform Example
...
<rule> <column>
format %s
transform <transformation>
<input-source>
...
- Passes the value of
%s
to the<transformation>
function. The result is then stored database.
transformation <package-location>.<transformation>
task <task-name> {
...
<rule> <column>
format %s
transform <transformation>
<input-source>
}
- Custom Transformation: If you use a custom transformation and not a built-in one, then import it in the
.ano
file using thetransformation
keyword - import the custom function from its package in the java project
`transformation example.anonymizer.transformations.PostCodeGeneralization
`
...
update CUSTOMER
mask EMAIL
format "%s"
transform Email
unique
column NAME
...
update Customer
mask CreditCard
format "%d%d"
transform CreditCard
random-integer 40005000 60000000
random-integer 10001000 99919991
- Note that in order to produce 16 digit integer, we must produce 2 integers of 8 size each, and concatenate them in a string. This is because of the size limit of the Java.lang.Integer class in Java.
Built-in transformation functions
The Email
built-in transformation function does the following to the format string output:
- Letters are lowercased:
A-Z
->a-z
- Numbers and lowercased letters are unchanged
- Some country specific letters are replaced:
æ
->ae
ø
->oe
Ã¥
->aa
- Some symbols are replaced by a dot:
- (whitespace)
.
-
->.
_
->.
..
->.
- (whitespace)
- All other characters are removed
- Checks if string contains
@
:- if not, then append
@mail.com
to the end of the string. - if last character, then append
mail.com
to the end of the string. - else, nothing is appended.
- if not, then append
CreditCard
The CreditCard
built-in transformation function uses the Luhn algorithm for Credit Card encoding. It replaces the last digit for MOD10 checksum validation.
- The format string must be digits for the transform to work, e.g., "1234567890123456".
Custom Transformations
If you want to write your own transformation, see Custom Transformations for details.