Sunday, March 29, 2020

Salesforce: EmailMessage object

When Enhanced Email is enabled, Salesforce stores emails in the EmailMessage object. Emails sent from Salesforce are saved as both Email Message records and Task records. The Email Message record links to its Task record through the ActivityId field.

If you use the Outlook panel (without enabling EAC), you can manually "Log Email" to Salesforce for emails received and sent. Both received and sent emails are stored as an EmailMessage record and a Task record.

How do you differentiate an email sent from Salesforce from one manually logged from an email client?
You will not find any difference on the Task object: both Type and TaskSubType are populated with "Email". But there are some differences in the Email Message object. Check out this query:
SELECT Id, ActivityId, FromAddress, ToAddress, FromName, IsClientManaged, MessageIdentifier, Subject, TextBody FROM EmailMessage

Rows 1, 4, 5 - emails manually logged from Outlook
Rows 2, 3 - emails sent from Salesforce

As you can see, IsClientManaged and MessageIdentifier are different.
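
Once the query result is exported, the two kinds of records can be separated on that flag. The following is a minimal sketch, assuming (this is not spelled out in the text above, so verify it in your own org) that IsClientManaged is true for emails logged from an email client and false for emails sent from Salesforce. The function name and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_origin(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split EmailMessage rows into (logged_from_client, sent_from_salesforce).

    Assumption: IsClientManaged is True for manually logged emails.
    """
    logged = [r for r in records if r.get("IsClientManaged")]
    sent = [r for r in records if not r.get("IsClientManaged")]
    return logged, sent


# Hypothetical rows shaped like the result of the query above
rows = [
    {"Id": "row1", "Subject": "Logged from Outlook", "IsClientManaged": True},
    {"Id": "row2", "Subject": "Sent from Salesforce", "IsClientManaged": False},
]
logged, sent = split_by_origin(rows)
print([r["Id"] for r in logged], [r["Id"] for r in sent])
```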

Note: using the My Email to Salesforce service (BCC) does not create an Email Message record, only a Task.


Monday, March 23, 2020

Einstein Analytics: using Flatten node to get Account Parent

Here is our scenario: we have a multi-level account hierarchy. The sample accounts for this blog:

Use case 1: display all accounts and their opportunities when the top parent is selected.


We need to manually edit the Flatten node in JSON. By default, the Multi Field and Path Field fields are created as system fields, which aren't visible in the user interface. To make the fields appear in the user interface and dataflow, add a schema section to the flatten transformation and set the isSystemField metadata attribute to false for each field in the transformation.

"Flatten_UltimateParent": {
    "schema": {
      "objects": [
        {
          "label": "Flatten_UltimateParent",
          "fields": [
            {
              "name": "UltimateParentPath",
              "label": "UltimateParentPath",
              "isSystemField": false
            },
            {
              "name": "AccountParentIds",
              "label": "AccountParentIds",
              "isSystemField": false
            }
          ]
        }
      ]
    },
    "action": "flatten",
    "parameters": {
      "include_self_id": true,
      "multi_field": "AccountParentIds",
      "path_field": "UltimateParentPath",
      "source": "getAccount",
      "self_field": "Id",
      "parent_field": "ParentId"
    }
}

Unfortunately, we will not see the schema in the dataflow UI.

We also need to Connect Data Source between Account and Opportunity using the Ultimate Parent Name.

Here is the dashboard:


1) Multi_Field from the flatten node contains the account's own Id (we selected "Include Self ID" in the flatten node) and the Ids of all its parents.

Notice that the top Parent Id 0018000001BNnOPAA1 is stored across the WHOLE hierarchy, while the Account F Id is stored only on Account F itself and its child, Account G. Don't be tricked when AccountParentIds shows only 1 value: it is a multi-value field.

2) Path_Field shows the whole hierarchy from the account's own Id, up to the parent Id, and all the way to the top level.

* I use Image (under Show Data As) to show the full content of the UltimateParentPath column.
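
To make the two fields concrete, here is a rough Python sketch of what the flatten node computes from Id and ParentId. It is an illustration, not the platform's implementation: the function name, the short sample Ids, and the backslash-separated path format are assumptions.

```python
from typing import Dict, List, Optional


def flatten(accounts: Dict[str, Optional[str]],
            include_self_id: bool = True) -> Dict[str, Dict]:
    """accounts maps each Id to its ParentId (None for a top-level account)."""
    result = {}
    for account_id in accounts:
        chain: List[str] = []          # parents, nearest first
        seen = {account_id}            # guards against circular hierarchies
        parent = accounts.get(account_id)
        while parent is not None and parent not in seen:
            chain.append(parent)
            seen.add(parent)
            parent = accounts.get(parent)
        ids = ([account_id] if include_self_id else []) + chain
        result[account_id] = {
            # multi_field: own Id plus every ancestor Id
            "AccountParentIds": ids,
            # path_field: assumed here to be a backslash-separated path
            "UltimateParentPath": ("\\" + "\\".join(ids)) if ids else "",
        }
    return result


# Hypothetical hierarchy: A is the top parent, F is under A, G is under F
flat = flatten({"A": None, "F": "A", "G": "F"})
print(flat["G"]["AccountParentIds"])    # ['G', 'F', 'A']
print(flat["G"]["UltimateParentPath"])  # \G\F\A
```

The top parent A appears in the AccountParentIds of every account, while F appears only on F and G, which is why filtering on the multi-value field returns the selected account together with everything beneath it.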

Use case 2: display all accounts and their opportunities when an Account is selected, the account could be the top, middle, or lower level in the account hierarchy.

Dataflow: let us modify the existing dataflow as below:

We just need to add an augment node that brings in the Account Name, keyed on the Multi_Field AccountParentIds.

You also need to Connect Data Source between Account and Opportunity using the Account Parent Name from the augment node.

Here is the dashboard:

When Account F is selected, the dashboard filters to Account F and its child accounts.

Reference: flatten Parameters

Saturday, March 14, 2020

Einstein Analytics: Dataflow Performance Best Practice

Performance is critical for Einstein Analytics dataflows. For example, an optimized dataflow may take only 10 minutes, while the same dataflow with a poor design may take 1 hour (this includes sync setup) to run. Without well-architected dataflows, it will be hard to maintain and sustain Einstein Analytics as a whole as the company evolves.

Here are a few items based on my personal findings and experience. If you have additional input or a different perspective, feel free to reach out to me.

1. Combine all computeExpression nodes whenever possible



In image-1, the calcURI node contains 1 compute field that returns Numeric, and the calURI2 node contains 1 other compute field that also returns Numeric, for a total of calcURI1 + calURI2 = 3:41 sec.

In image-2, we combined both compute fields into the calcURI node, and it took only 2:0 sec.
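
If you maintain the dataflow JSON by hand, the merge can be sketched as below. This is only an illustration: the helper name, node names, field names, and SAQL expressions are hypothetical, and it assumes the second node does not depend on a field computed by the first.

```python
import copy
from typing import Dict


def merge_compute_nodes(first: Dict, second: Dict) -> Dict:
    """Return a copy of `first` that also carries the computed fields of `second`."""
    merged = copy.deepcopy(first)
    merged["parameters"]["computedFields"].extend(
        copy.deepcopy(second["parameters"]["computedFields"])
    )
    return merged


# Hypothetical computeExpression nodes, one Numeric field each
calc_a = {
    "action": "computeExpression",
    "parameters": {
        "source": "getOpportunity",
        "mergeWithSource": True,
        "computedFields": [
            {"name": "Field_A", "type": "Numeric", "precision": 18, "scale": 2,
             "saqlExpression": "Amount * 2"},
        ],
    },
}
calc_b = {
    "action": "computeExpression",
    "parameters": {
        "source": "calc_a",
        "mergeWithSource": True,
        "computedFields": [
            {"name": "Field_B", "type": "Numeric", "precision": 18, "scale": 2,
             "saqlExpression": "Amount / 2"},
        ],
    },
}

combined = merge_compute_nodes(calc_a, calc_b)
print([f["name"] for f in combined["parameters"]["computedFields"]])
```

After merging, any node that used the second node as its source has to be repointed to the combined node.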

2. Do compute as early as possible, and augment as late as possible

The rationale is that a compute node placed before an augment processes fewer fields (an augment always adds fields to the stream). The exception is when the computation needs a field that comes from the augment node.

3. Remove all unnecessary fields

In my experience, a dataflow usually serves a dashboard or a clone of a dashboard. The more fields each node handles, the more power and time it needs, so slice out fields that are not needed in the dashboard or lens.


Notice that calcURI3 in image-1 and image-2 took around 2:08 sec. In image-3, we added a slice node before calcURI3 to remove unnecessary fields. This reduces the number of fields processed in calcURI3, so it took only 1:55 sec.

4. Combine all sfdcDigest nodes of the same object into one node, if sync is not enabled

For some reason, your org may not have sync enabled. This does not mean you "must" enable it straight away; please DO NOT enable it without a complete analysis, as this may cause data filtering issues.

You should combine all sfdcDigest nodes of the same object into one node. Imagine you have 10 million Opportunity rows and every sfdcDigest node takes 10 minutes (as an example): if the dataflow designer adds 3 sfdcDigest nodes for Opportunity, the data retrieval alone will need 30 minutes.
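
As a rough sketch of that consolidation, the snippet below unions the field lists of several sfdcDigest node definitions that read the same object. The helper name and the sample nodes are hypothetical, and it deliberately ignores filters: nodes that apply different filters cannot be merged this simply.

```python
from typing import Dict, List


def combine_digests(nodes: List[Dict]) -> Dict:
    """Build one sfdcDigest node holding the union of the fields of `nodes`."""
    objects = {n["parameters"]["object"] for n in nodes}
    if len(objects) != 1:
        raise ValueError("all nodes must digest the same object")
    seen = set()
    fields = []
    for node in nodes:
        for field in node["parameters"]["fields"]:
            if field["name"] not in seen:   # keep first occurrence, preserve order
                seen.add(field["name"])
                fields.append({"name": field["name"]})
    return {
        "action": "sfdcDigest",
        "parameters": {"object": objects.pop(), "fields": fields},
    }


# Hypothetical digest nodes that each extract Opportunity
digest_1 = {"action": "sfdcDigest", "parameters": {
    "object": "Opportunity", "fields": [{"name": "Id"}, {"name": "Amount"}]}}
digest_2 = {"action": "sfdcDigest", "parameters": {
    "object": "Opportunity", "fields": [{"name": "Id"}, {"name": "StageName"}]}}

single = combine_digests([digest_1, digest_2])
print([f["name"] for f in single["parameters"]["fields"]])
```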

Thursday, March 12, 2020

Einstein Analytics: Precision and Scale

Precision and Scale are required for a computeExpression node that returns Numeric in a dataflow; otherwise, your dataflow run will fail.

For numeric, as per this article External Data Metadata Format Reference
  • precision: the maximum number of digits in a numeric value, includes all numbers to the left and to the right of the decimal point (but excludes the decimal point character). Value can be up to 18.
  • scale: the number of digits to the right of the decimal point in a numeric value, must be less than the precision value.

But in short:
  precision: must be 1 - 18
  scale: must be 0 - 17 and less than the precision value
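
These two rules are easy to check before uploading a dataflow. A minimal sketch, with hypothetical helper names, could look like this:

```python
def validate_precision_scale(precision: int, scale: int) -> None:
    """Raise ValueError if the pair breaks the rules listed above."""
    if not 1 <= precision <= 18:
        raise ValueError(f"precision must be 1 - 18, got {precision}")
    if not 0 <= scale <= 17:
        raise ValueError(f"scale must be 0 - 17, got {scale}")
    if scale >= precision:
        raise ValueError(f"scale ({scale}) must be less than precision ({precision})")


def is_valid(precision: int, scale: int) -> bool:
    """Boolean wrapper around validate_precision_scale."""
    try:
        validate_precision_scale(precision, scale)
    except ValueError:
        return False
    return True


print(is_valid(10, 5), is_valid(19, 2), is_valid(5, 5))
```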

Let us see how this works in reality. I ran the same calculation several times in computeExpression with different precision and scale. The formula is A/B for all calculations. Here is the result:

Calc_10_5 means precision = 10 and scale = 5, and so on. At a glance, you may think the decimals do not exist; this is incorrect, as you need to "format numbers" on the widget or in the metadata.

For this blog's testing, I set 5 decimal places:

Here is the result after all fields are set to 5 decimal places:

From the above table, "scale" makes the difference in the calculation result: the result is rounded up or down to the number of decimal places defined by the scale.

Notice that decimals "below 0.5" are rounded down, while "0.5 and above" are rounded up. But if scale = 0, all decimals are rounded down; see calc_10_0.
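
The behaviour observed in this test can be mimicked in Python with the decimal module. This only reproduces what the table shows (half-up rounding when scale is above 0, truncation when scale = 0); it is not a statement about how the platform is implemented, and the function name and sample values are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def apply_scale(value: str, scale: int) -> Decimal:
    """Round `value` to `scale` decimal places the way the test results suggest."""
    # Observed behaviour: scale = 0 drops the decimals, otherwise round half up
    rounding = ROUND_DOWN if scale == 0 else ROUND_HALF_UP
    quantum = Decimal(1).scaleb(-scale)   # scale 2 -> 0.01, scale 0 -> 1
    return Decimal(value).quantize(quantum, rounding=rounding)


print(apply_scale("2.66667", 2))  # 2.67
print(apply_scale("1.234", 2))    # 1.23
print(apply_scale("2.66667", 0))  # 2
```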

Reference: External Data Metadata Format Reference

Tuesday, January 14, 2020

Einstein Analytics: Using EdgeMart object from Salesforce Direct

In the Winter '20 release, Einstein Analytics introduced Salesforce Direct; read the release notes for complete information on Salesforce Direct.

Salesforce Direct lets you query more than just Salesforce objects: it also reaches data about Einstein Analytics itself. One such object is EdgeMart; do not confuse it with the edgemart node in a dataflow.

Let's get hands-on. Make sure you are in a Production org; the EdgeMart object is not available in sandboxes at this moment.

1. Create a new Dashboard
2. Click the Create Query button (if you do not see the button, click the blank canvas)
3. Select Salesforce Direct as the data source
4. Type EdgeMart in the search box
5. Select EdgeMart

6. You will be presented with an Untitled Query showing a bar chart with a count of rows; the row count represents the number of datasets you have.

7. You can switch to table mode as needed.

The table above shows where each dataset is located, who created it, who last modified it, the data refresh date, etc.

Reference: Show your data's refresh date with Salesforce Direct

Friday, January 10, 2020

Einstein Analytics: Grouping in Dataflow

After the blogs on transposing data from columns to rows and from rows to columns, today I have another challenge: grouping data based on a date.

Here is the data

I know a recipe offers this functionality to group data easily; however, I am reluctant to put a recipe between two dataflows, as it will cause a maintenance nightmare in the future.

But can we do this in a dataflow? Dataflow does not offer data grouping by default, but we can still achieve it with some tricks. Here we go:

The key node here is cr1, which is a computeRelative node. I add 4 fields here:
- Sum_1
- Sum_2
- Sum_3
- IsLast

1. Partition the data with Date

2. For fields Sum_1 to Sum_3, choose SAQL (not Source Field); the Type should be Numeric, and remember to enter Scale and Default Value.
Here is the SAQL Expression:
case when previous(Sum_1) is null then current(Data_1) else current(Data_1) + previous(Sum_1) end

3. For IsLast, choose SAQL (not Source Field), the Type should be Text. Here is the SAQL Expression
case when next(Data_1) is null then "Yes" else "No" end

data after computeRelative, before cleanup

4. Delete unused rows with Filter node and unused columns with Slice node.
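
The steps above can be simulated outside the platform to check the logic. The Python sketch below mirrors the partition, the running sum built with previous(), the IsLast flag, and the final filter; the function names and the sample rows are hypothetical, and it assumes Data_1 is never null (the SAQL version detects the last row through next(Data_1)).

```python
from itertools import groupby
from typing import Dict, List


def running_sum_by_date(rows: List[Dict]) -> List[Dict]:
    """Add Sum_1 (running total within each Date) and IsLast to every row."""
    ordered = sorted(rows, key=lambda r: r["Date"])        # step 1: partition by Date
    out = []
    for _, group in groupby(ordered, key=lambda r: r["Date"]):
        partition = list(group)
        previous_sum = None                                # previous(Sum_1) is null on the first row
        for i, row in enumerate(partition):
            # step 2: case when previous(Sum_1) is null then current(Data_1) else ... end
            sum_1 = row["Data_1"] if previous_sum is None else row["Data_1"] + previous_sum
            # step 3: the last row of the partition has no next row
            is_last = "Yes" if i == len(partition) - 1 else "No"
            out.append({**row, "Sum_1": sum_1, "IsLast": is_last})
            previous_sum = sum_1
    return out


def grouped_totals(rows: List[Dict]) -> List[Dict]:
    """Step 4: keep only the last row per Date and only the needed columns."""
    return [{"Date": r["Date"], "Sum_1": r["Sum_1"]}
            for r in running_sum_by_date(rows) if r["IsLast"] == "Yes"]


# Hypothetical input rows
data = [
    {"Date": "2020-01-01", "Data_1": 1},
    {"Date": "2020-01-01", "Data_1": 2},
    {"Date": "2020-01-02", "Data_1": 5},
]
totals = grouped_totals(data)
print(totals)
```

Only the last row of each partition carries the full total, which is why the Filter node keeps rows where IsLast is "Yes".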

In another scenario, if you just need to count the items in a group, replace Data_1 with 1.
For example, with Count_1 as the field name in the computeRelative node:
case when previous(Count_1) is null then 1 else 1 + previous(Count_1) end

