Thursday, January 14, 2021

Einstein Analytics: Many-to-Many data transformation

I am a big fan of Dataflow rather than Recipe (Dataprep); however, Dataflow will not work when you deal with a many-to-many data source.

In Dataflow, you always have a 'table' that defines the granularity of the data. You can 'clone' the data into multiple streams, then use an append node to combine them back, so the granular data is 'exploded'. However, you cannot 'explode' the data based on data from a different dataset.
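To make the clone-and-append idea concrete, here is a minimal sketch in plain Python (not Dataflow JSON). The rows, field names, and stream labels are all hypothetical and only there for illustration.

```python
# Hypothetical opportunity rows; the field names are illustrative only.
opportunities = [
    {"OpportunityId": "006A1", "Amount": 100},
    {"OpportunityId": "006A2", "Amount": 250},
]


def clone_stream(rows, label):
    """Copy every row and tag it with the stream it belongs to."""
    return [{**row, "Stream": label} for row in rows]


# Two cloned streams combined back together, similar in spirit to an append node.
appended = clone_stream(opportunities, "StreamA") + clone_stream(opportunities, "StreamB")

for row in appended:
    print(row)
```

Each source row appears once per stream, so the multiplication factor is fixed by how many streams you build, not by how many matching rows exist in another dataset.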

Here are samples of the data:

This is the expected result:

The object relationship:

Solution: using Left Join in Recipe

1. Create Recipe

Go to Data Manager > Dataflows & Recipes tab > Recipes tab, then click the Create Recipe button.

2. Select Data

Now you are on a blank canvas, click the Select Data button.

Select the left dataset, click +, select Join, then select the right dataset. Make sure the join keys are aligned for both datasets; in my sample, the key is Opportunity Id.

Select Left in Join Type, make sure all fields to be included in the result are selected, then click the Apply button.

You can preview the result from the Preview tab.

3. Add Output node

Enter the label and API name for the result dataset, and we are done.

Here is the dataset created:

Opportunity Id 006A5 in the custom object does not appear in the result because we are using a left join, so the result contains only the keys from the left dataset.
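The left join behavior can be sketched in plain Python as below. This is only an illustration under assumptions, not how the Recipe is implemented: the rows, the `Product` and `Region` fields, and the `left_join` helper are hypothetical, and only the Opportunity Id key and the 006A5 case come from my sample.

```python
from collections import defaultdict

# Hypothetical rows; both sides repeat Opportunity Id 006A1 (many-to-many).
left = [
    {"OpportunityId": "006A1", "Product": "P1"},
    {"OpportunityId": "006A1", "Product": "P2"},
    {"OpportunityId": "006A2", "Product": "P3"},
]
right = [
    {"OpportunityId": "006A1", "Region": "R1"},
    {"OpportunityId": "006A1", "Region": "R2"},
    {"OpportunityId": "006A5", "Region": "R3"},
]


def left_join(left_rows, right_rows, key):
    """Keep every left row; emit one output row per matching right row."""
    right_by_key = defaultdict(list)
    for row in right_rows:
        right_by_key[row[key]].append(row)
    # Right-side fields, used to fill nulls when a left row has no match.
    right_fields = {f for row in right_rows for f in row if f != key}

    joined = []
    for lrow in left_rows:
        matches = right_by_key.get(lrow[key], [])
        if not matches:
            joined.append({**lrow, **{f: None for f in right_fields}})
        for rrow in matches:
            joined.append({**lrow, **rrow})
    return joined


result = left_join(left, right, "OpportunityId")
for row in result:
    print(row)
```

In this sketch, 006A1 explodes into four rows (two left rows times two right rows), 006A2 is kept with an empty `Region`, and 006A5 is dropped because it exists only on the right side.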

Select Right if you want the right dataset to be the key; here is the result:

Use Inner if the key should exist in both datasets:

And the last one is Outer, where all data from both datasets is included:
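As a quick way to remember which keys survive each join type, here is a small sketch using Python sets. The key values are hypothetical, apart from 006A5, which exists only on the right side in my sample.

```python
# Hypothetical Opportunity Id keys present in each dataset.
left_keys = {"006A1", "006A2"}
right_keys = {"006A1", "006A5"}

# Keys that end up in the result for each join type.
surviving_keys = {
    "Left": left_keys,                # every key from the left dataset
    "Right": right_keys,              # every key from the right dataset
    "Inner": left_keys & right_keys,  # keys present in both datasets
    "Outer": left_keys | right_keys,  # keys present in either dataset
}

for join_type, keys in surviving_keys.items():
    print(join_type, sorted(keys))
```

This only tracks keys; the number of rows per key still depends on how many matching rows each side has.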

