Pages

Saturday, April 27, 2019

Einstein Analytics: Handling null values

This blog is not related to Null Handling in Measures. As per this article, null values in Dimensions are not completely supported in Einstein Analytics, however, we often deal with null values in many scenarios.

To prevent data quality issues, Einstein Analytics will disregard any fields in Salesforce (or columns in external data) that are entirely null.

Grouping with Null
Date field by default is null, when you use it in a chart for grouping, null value will not be shown. To overcome this, if you pull data from Salesforce, use defaultValue (e.g. 1900-01-01) to override in sfdcDigest.



Dimension field by default is null, when you use it in a chart for grouping, null value will not be shown. To overcome this, if you pull data from Salesforce, use defaultValue (e.g. NA, " is not needed) to override in sfdcDigest.

Measure field by default will become 0 for null value. We cannot use Measure field for grouping, but we can use it for filter.


Filter null records in SAQL
q = load "DTC_Opportunity_SAMPLE";
q = foreach q generate 'Account_Owner' as 'Account_Owner', 'Product_Name' as 'Product_Name', (case when 'Product_Name' is null then "" else 'Product_Name' end) as 'PM';
q = filter q by PM == "";
q = order q by 'Account_Owner';
q = limit q 10;

Count Not null records in SAQL
q = load "Lead";
q = foreach q generate (case when 'SFDC_Lead_ID__c' is null then "" else 'SFDC_Lead_ID__c' end) as 'SFDC_Lead_ID__c';
q = filter q by 'SFDC_Lead_ID__c' != "";
q = group q by all;
q = foreach q generate count() as 'count';

To group null as NA
q = load "DTC_Opportunity_SAMPLE";
q = foreach q generate coalesce('Product_Name',"NA") as 'Product_Name', count() as count ;
q = group q by 'Product_Name';
q = foreach q generate 'Product_Name' as 'Product_Name', count() as count;
q = order q by count desc;


Augment transformation cannot find the parent will cause Null


SAQL Expression in above computeExpression case when 'Acq.Industry__c' is null then "Parent not available" else 'Acq.Industry__c' end

Data in Salesforce

Result in Einstein Analytics

Notes from above screenshot:
  • 1st row for Parent Industry = NA, because we set the default value to NA in sfdcDigest.
  • 1st and 2nd row for Account Source = null, because we set the default value to null in sfdcDigest.
  • 1st row for Employees and 2nd row for Annual Revenue = 0, Einstein Analytics auto set 0 for Measure field with null value.
  • 3rd and 4th row for Parent Industry = Parent not available, this is because there is parent lookup value, we use computeExpression to set the value, this is different with 1st row, where the parent lookup value is available, but Industry for that parent record is null.


Reference:


No comments:

Post a Comment

Page-level ad