In Dataflow, the Join node merges multiple data nodes into a single block.
Settings
To learn about common node settings, please visit this article:
Nodes: Main settings
Join type
Join nodes take data from multiple sources and aggregate it into a single output. Join is currently the only type of node that accepts more than one source.
Merge
Allows you to aggregate multiple data sources without performing joins between them. Product IDs must be unique across all sources.
Example: I have several Google Merchant Center data sources and I want to get a complete dataset with all of them.
When using a merge join, if the same product ID is present in more than one source, the behavior is undefined.
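Because duplicate IDs lead to undefined behavior, it can be worth checking the sources for collisions before merging them. The sketch below is only an illustration of the merge rule described above, not Dataflow's actual implementation: the function names and the dictionary-based product layout are assumptions made for the example.

```python
from collections import Counter


def find_duplicate_ids(sources):
    """Return the product IDs that appear more than once across all sources."""
    counts = Counter(product["id"] for source in sources for product in source)
    return sorted(pid for pid, n in counts.items() if n > 1)


def merge_sources(sources):
    """Concatenate all sources into one list, refusing to merge on ID collisions."""
    duplicates = find_duplicate_ids(sources)
    if duplicates:
        # Hypothetical safeguard: the real node's behavior here is undefined.
        raise ValueError(f"duplicate product IDs across sources: {duplicates}")
    return [product for source in sources for product in source]


# Two illustrative feeds with distinct IDs.
feed_a = [{"id": "A1", "title": "Mug"}, {"id": "A2", "title": "Cup"}]
feed_b = [{"id": "B1", "title": "Plate"}]
merged = merge_sources([feed_a, feed_b])
```

With distinct IDs, the output simply contains every product from every source. If an ID appears twice, this sketch raises an error instead of guessing which copy to keep.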
Left-join
Left-join allows you to add complementary data to a primary feed.
Example: I want to enrich my main feed with attributes generated using GenAI.
The primary source determines the output products of the node: if there are 10 products in the main branch, there will be 10 products in the output.
Data from secondary branches is merged into the products of the primary branch when the IDs match; otherwise, it is ignored.
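The left-join rule above can be sketched in a few lines. This is a simplified model rather than Dataflow's actual code: the function name and the product layout are hypothetical, and the sketch assumes that secondary values overwrite primary values when attribute names clash, which this article does not specify.

```python
def left_join(primary, secondaries):
    """Enrich primary products with attributes from secondary sources."""
    # Index every secondary source by product ID for fast lookups.
    indexes = [{row["id"]: row for row in source} for source in secondaries]
    output = []
    for product in primary:
        enriched = dict(product)  # copy so the primary feed is not mutated
        for index in indexes:
            match = index.get(product["id"])
            if match is not None:
                # Assumption: secondary values win on attribute name clashes.
                enriched.update(match)
        output.append(enriched)
    return output


main_feed = [{"id": "1", "title": "Mug"}, {"id": "2", "title": "Cup"}]
genai_feed = [
    {"id": "2", "description": "A sturdy cup"},
    {"id": "3", "description": "No matching product"},
]
joined = left_join(main_feed, [genai_feed])
```

The output has exactly as many products as the main feed. Product 2 gains a description, product 1 is passed through unchanged, and the secondary row with ID 3 is ignored because no primary product matches it.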
Primary source (left-join)
Choose the source that becomes the main feed during the left-join.
Join On (left-join)
On the primary feed side, the product is always identified by its product ID. The other sources can use different types of ID, which are resolved during the join operation.
id: this is the default ID type (lpoid), used in most cases.
offerID: joins on the primary source's offerIDs when the secondary source uses this type of identifier. This is common for external CSV files.
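To illustrate how the Join On setting changes the matching, here is a minimal sketch. It is an assumption-laden model, not Dataflow's implementation: the function name, the "key" column of the secondary source, and the idea that each primary product carries both an id and an offerID field are all hypothetical.

```python
def left_join_on(primary, secondary, join_on="id"):
    """Left-join where join_on names the primary field the secondary keys refer to."""
    # Hypothetical layout: every secondary row stores its identifier under "key".
    lookup = {row["key"]: row for row in secondary}
    output = []
    for product in primary:
        enriched = dict(product)
        match = lookup.get(product.get(join_on))
        if match is not None:
            # Copy every secondary attribute except the join key itself.
            enriched.update({k: v for k, v in match.items() if k != "key"})
        output.append(enriched)
    return output


primary = [
    {"id": "lp-1", "offerID": "SKU-100", "title": "Mug"},
    {"id": "lp-2", "offerID": "SKU-200", "title": "Cup"},
]
# Rows as they might come from an external CSV file keyed by offer ID.
csv_rows = [
    {"key": "SKU-200", "margin": "high"},
    {"key": "SKU-999", "margin": "low"},
]
result = left_join_on(primary, csv_rows, join_on="offerID")
```

Here the CSV rows are matched against the offerID of each primary product instead of its default id, so only the product with offerID SKU-200 is enriched.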



