
Distribute

Goal

Use distribute when creating values for a column by picking values from another table's column.

  • Every value created in the child table using distribute will be taken from the parent table column.
  • Choose a <distribution> function to decide how parent table values are distributed in the child table column, for example so that every parent value appears at least once.

Use cases for distribute

Use distribute to create

  • foreign key column values using existing parent key column values
  • values in a column based upon a set of values from other columns
  • a desired distribution of rows in a table, based upon the occurrence of values in other tables

distribute is only usable in create Work Tasks.

Ano Structure and Examples

task <task-name> {
    create <table> minimum-rows <#rows>
        distribute <distribution> <params> // params optional
            table <parent-table> <params> // params optional
                child <foreign-key>
                parent <key>
}

How many rows are created?

The number of rows created is either minimum-rows, as defined on the create Work Task, or the number of rows needed to satisfy the condition of the distribution function, whichever is larger.
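The rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the product: the function name and both parameter names are hypothetical, and taking the larger of the two values is an assumption based on the AllCombinations description on this page.

```python
def rows_to_create(minimum_rows: int, rows_required_by_distribution: int) -> int:
    """Return the number of child rows to create.

    Assumption: the larger of the two values is used. minimum_rows mirrors
    the minimum-rows setting on the create Work Task; the second argument is
    whatever the chosen distribution function needs to satisfy its condition.
    """
    return max(minimum_rows, rows_required_by_distribution)


# A distribution that needs 600 rows overrides a minimum of 100 ...
needed = rows_to_create(minimum_rows=100, rows_required_by_distribution=600)
# ... while a distribution with no extra requirement leaves minimum-rows as is.
unchanged = rows_to_create(minimum_rows=100, rows_required_by_distribution=0)
```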

Built-in Distribution Functions

SimpleSpread

  • Foreign key columns are assigned values picked at random from the existing values in the corresponding parent table columns

Number of rows

The number of rows will be equal to what is specified in minimum-rows on the create Work Task.
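As a rough illustration of what a random spread means, the Python sketch below picks one existing parent key per child row. It is not the built-in implementation: the function name, the seed parameter, and the use of a uniform random choice are all assumptions made for the example.

```python
import random


def simple_spread(parent_keys, minimum_rows, seed=None):
    """Pick a parent key at random for each of minimum_rows child rows.

    Hypothetical sketch: uniform sampling with replacement is assumed.
    Some parent keys may therefore never be picked, and others may be
    picked many times.
    """
    rng = random.Random(seed)
    return [rng.choice(parent_keys) for _ in range(minimum_rows)]


# Five parent keys spread over twelve child rows.
foreign_keys = simple_spread([10, 20, 30, 40, 50], minimum_rows=12, seed=1)
```

Every generated value comes from the parent key list, and the row count equals minimum_rows, matching the two properties stated above.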

EvenWithDeviation

  • Parent values are spread evenly across the child rows, with a deviation in how often each parent occurs
  • Takes a number as a parameter at the table level

Number of rows

The number of rows will be equal to what is specified in minimum-rows on the create Work Task.

AllCombinations

  • Every combination of parent values appears at least once

Number of rows

The number of rows will be at least the product of the row counts of all parent tables.

Example: Suppose a child table has three parent tables, i.e., it has three foreign key columns. If the parents have 100, 500, and 800 rows, the child gets at least 100 * 500 * 800 = 40 000 000 rows. If minimum-rows is larger than this value, minimum-rows rows are created instead.
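The row count in this example can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical, and filling the rows beyond the full set of combinations with random picks is an assumption, since this page does not say how the extra rows are chosen.

```python
import itertools
import math
import random


def required_rows(parent_row_counts):
    """Number of rows needed to cover every combination of parent rows."""
    return math.prod(parent_row_counts)


def all_combinations(parents, minimum_rows, seed=None):
    """Build child foreign-key tuples covering every parent combination.

    parents is a list of key lists, one list per parent table.
    Assumption: rows beyond the full Cartesian product are random picks.
    """
    rng = random.Random(seed)
    rows = list(itertools.product(*parents))
    while len(rows) < minimum_rows:
        rows.append(tuple(rng.choice(keys) for keys in parents))
    return rows


# Row count for parents with 100, 500, and 800 rows.
needed = required_rows([100, 500, 800])

# Small case: 2 * 3 = 6 combinations, more than minimum-rows = 4.
small = all_combinations([[1, 2], ["a", "b", "c"]], minimum_rows=4)

# Same parents, but minimum-rows = 10 exceeds the 6 combinations.
padded = all_combinations([[1, 2], ["a", "b", "c"]], minimum_rows=10, seed=0)
```

Generating the full product in memory is only practical for small parents; at 40 000 000 rows a real implementation would presumably stream the combinations instead.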

Custom Distributions

If you want to write your own distribution, see Custom Distributions for details.