4.10. Union node
Number of inputs: 2 or more.
Number of outputs: 1.
- Definition
A union node stacks multiple input datasets: the rows of each input are appended as new rows in the output (a union in the SQL sense).
- Configuration
A union node is used to combine two or more datasets.
Each dataset is appended to the end of the previous one.
Three stacking strategies are available:
Use the first dataset as a benchmark:
the first dataset determines the names and types of the fields in the output dataset. This option triggers an error if the number or type of the input fields does not match.
Merge input fields:
if, for example, the first dataset has fields A, B and C and the second dataset has fields B, C and D, then the output dataset will have the fields A, B, C and D. Fields whose types do not match are converted to text.
Intersection of input fields:
if, for example, the first dataset has fields A, B and C and the second dataset has fields B, C and D, then the output dataset will have the fields B and C. Fields whose types do not match are converted to text.
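The three strategies can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name resolve_fields, the strategy keys ("first", "merge", "intersection") and the type labels are hypothetical, and the positional comparison used for the first strategy is an assumption, since the text only states that the number or type of fields must match.

```python
def resolve_fields(schemas, strategy):
    """Return the output schema (field name -> type) for a list of input schemas.

    Each schema is a dict mapping field names to type labels, in field order.
    """
    first = schemas[0]

    if strategy == "first":
        first_types = list(first.values())
        for schema in schemas[1:]:
            # Assumption: fields are compared by position, because the first
            # dataset alone supplies the output names. A different length or
            # a different type at any position is reported as an error.
            if list(schema.values()) != first_types:
                raise ValueError("number or type of input fields does not match")
        return dict(first)

    if strategy == "merge":
        # Every field seen in any input, in order of first appearance.
        names = []
        for schema in schemas:
            for name in schema:
                if name not in names:
                    names.append(name)
    elif strategy == "intersection":
        # Only the fields present in every input.
        names = [n for n in first if all(n in s for s in schemas[1:])]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    output = {}
    for name in names:
        types = {s[name] for s in schemas if name in s}
        # Fields whose types do not match are converted to text.
        output[name] = types.pop() if len(types) == 1 else "text"
    return output


first_schema = {"A": "integer", "B": "text", "C": "date"}
second_schema = {"B": "text", "C": "text", "D": "decimal"}

merged = resolve_fields([first_schema, second_schema], "merge")
common = resolve_fields([first_schema, second_schema], "intersection")
```

With these two hypothetical schemas, merged contains the fields A, B, C and D, and common contains B and C. In both results C becomes text because its type differs between the inputs. The "first" strategy raises an error for this pair, because the field types do not line up.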
Within the union node configurator, you can also specify a provenance field by name. This field is added after the union and indicates the dataset of origin of each record.
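The effect of a provenance field can be sketched in a few lines of Python. This is an illustration only: the function union_with_provenance and the dataset names are hypothetical, and storing the dataset name as the provenance value is an assumption (the text says only that the field indicates the dataset of origin).

```python
def union_with_provenance(datasets, provenance_field="Origin"):
    """Stack the rows of several datasets and tag each row with its source.

    datasets maps a dataset name to a list of rows (dicts). Rows are copied,
    so the inputs are left unchanged.
    """
    stacked = []
    for dataset_name, rows in datasets.items():
        for row in rows:
            record = dict(row)
            # Assumption: the provenance value is the name of the input dataset.
            record[provenance_field] = dataset_name
            stacked.append(record)
    return stacked


inputs = {
    "customers_2023": [{"id": 1}],
    "customers_2024": [{"id": 2}, {"id": 3}],
}
result = union_with_provenance(inputs, provenance_field="Origin")
```

Each of the three output rows keeps its original fields and gains an Origin field holding the name of the dataset it came from.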
- Example
In this example, the union of input fields (the Merge input fields strategy) is set up with an additional provenance field (Origin):