How to Split a load into batches

Modified on: Fri, 3 Jun, 2022 at 8:57 AM

When a source table holds a large volume of data, it can be useful to split the load to the target into several parallel batches.

This can be done easily when using a Load Template which supports the "Split Parallel Degree" parameter.

Note: if your Template does not have this parameter, contact the support team to ask if this can be added.

First, add a "SPLIT_BY" tag to the metadata node of the source column on which the split will be based.

Important:

- the column must be numeric,

- the rows should be evenly distributed over this key for better performance,

- an index on this column is needed for better performance.

Then, in the Load Template, set the "Split Parallel Degree" parameter to the number of batches you would like to produce.

In this example, we want to make 5 batches. If the source table has 500,000 rows that are evenly distributed over the split column, this will produce 5 batches of about 100,000 rows each.
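To see why the distribution of the split column matters, here is a minimal Python sketch. It assumes the batches are equal-width ranges between the Min and Max values of an integer key; the exact splitting rule depends on the Template, and the function name `batch_sizes` is hypothetical.

```python
from collections import Counter


def batch_sizes(values, degree):
    """Count rows per batch when [min, max] is cut into `degree` equal-width ranges.

    Assumption: integer keys and equal-width ranges; the real Template may differ.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo + 1) / degree
    # Clamp the index so the maximum value always falls in the last batch.
    counts = Counter(min(int((v - lo) / width), degree - 1) for v in values)
    return [counts.get(i, 0) for i in range(degree)]


# Evenly distributed keys 1..500000 give five batches of 100,000 rows each.
even = batch_sizes(range(1, 500_001), 5)

# Skewed keys: almost all rows sit at the low end and one outlier stretches the range.
skewed = batch_sizes(list(range(1, 1000)) + [1_000_000], 5)

print(even)
print(skewed)
```

With the skewed keys, nearly every row lands in the first batch, so running the batches in parallel brings little benefit.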

As a result, the generated process will:

1. get the Min and Max values of the split column,

2. execute 5 "Load Data" steps in parallel,

3. have each "Load Data" step use a SELECT statement with a WHERE clause based on the split column.

Each step looks like this:
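The SQL actually generated depends on the Template and the source technology. As an illustration only, the following Python sketch builds one WHERE clause per batch from the Min and Max values, assuming an integer key and equal-width ranges. The function `split_where_clauses` and the column name `CUS_ID` are hypothetical.

```python
def split_where_clauses(column, min_value, max_value, degree):
    """Return one WHERE clause per batch, covering [min_value, max_value] without overlap.

    Assumption: integer keys and equal-width ranges; not the Template's actual SQL.
    """
    span = max_value - min_value + 1
    clauses = []
    for i in range(degree):
        lower = min_value + (span * i) // degree
        upper = min_value + (span * (i + 1)) // degree - 1
        clauses.append(f"WHERE {column} >= {lower} AND {column} <= {upper}")
    return clauses


# Min = 1, Max = 500000, Split Parallel Degree = 5
clauses = split_where_clauses("CUS_ID", 1, 500_000, 5)
for clause in clauses:
    print(clause)
```

Each "Load Data" step would then run its SELECT with one of these clauses, so that every source row is read by exactly one step.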
