Pages

Tuesday, June 26, 2018

Einstein Analytics: computeRelative getting started

computeExpression is a well-known transformation in Dataflow: it produces new fields based on data in the same row. computeRelative is also a Dataflow transformation, but it lets you compute fields based on data in the same row or in other rows of a partition (first, last, next, and previous).

In computeRelative, you need to specify:
- input source
- partition by field: a unique key field, such as Opportunity Id, Account Id, etc.
- sort field: defines the order of rows within each partition
- computed fields: the result fields produced by computeRelative
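The previous() logic described above can be sketched in Python; the field and partition names come from this post's example (Opportunity_ID, Snapshot_Date, Amount), but the helper itself is hypothetical, not part of Einstein Analytics:

```python
from itertools import groupby

def compute_previous(rows, partition_by, order_by, source_field, default=None):
    """Mimic computeRelative with offset previous(): within each partition,
    ordered by the sort field, each row receives the previous row's value
    of source_field; the first row in a partition gets the default."""
    rows = sorted(rows, key=lambda r: (r[partition_by], r[order_by]))
    out = []
    for _, group in groupby(rows, key=lambda r: r[partition_by]):
        prev = default
        for r in group:
            new = dict(r)
            new["Previous" + source_field] = prev
            prev = r[source_field]
            out.append(new)
    return out

# hypothetical snapshot rows for one opportunity
rows = [
    {"Opportunity_ID": "006A", "Snapshot_Date": "2018-06-01", "Amount": 100},
    {"Opportunity_ID": "006A", "Snapshot_Date": "2018-06-15", "Amount": 150},
]
result = compute_previous(rows, "Opportunity_ID", "Snapshot_Date", "Amount")
# the first snapshot has no previous row, so it keeps the default (blank/None)
```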

Let's use this sample dataset:


Create a dataflow using edgemart to get existing dataset in Einstein Analytics:
- add computeRelative to get Previous Amount and Previous Closed Date
- using Opportunity Id as partition field
- order by Snapshot_Date


Here is the field attribute:


In this case, we would like to leave Previous Amount and Previous Close Date blank when there is no previous data for a particular record. However, if we leave the Default Value in the field attributes blank, no default value is defined in the JSON, and this causes errors when the dataflow runs.

To fix that, download the JSON file and edit it manually, then upload it back to the dataflow.

  "computeRel1": {
    "action": "computeRelative",
    "parameters": {
      "source": "EdgeProspect",
      "computedFields": [
        {
          "name": "PreviousAmount",
          "label": "PreviousAmount",
          "expression": {
            "sourceField": "Amount",
            "offset": "previous()",
            "default": null
          }
        },
        {
          "name": "PreviousCloseDate",
          "label": "PreviousCloseDate",
          "expression": {
            "sourceField": "Closed_Date",
            "offset": "previous()",
            "default": null
          }
        }
      ],
      "orderBy": [
        {
          "name": "Snapshot_Date",
          "direction": "asc"
        }
      ],
      "partitionBy": [
        "Opportunity_ID"
      ]
    }
  }


Here is the final result:




Reference: computeRelative Parameters



Monday, June 25, 2018

Einstein Analytics: Binding syntax

Binding is one of the most powerful techniques to implement when building a dashboard, although it is also one of the most difficult, as you need to edit the dashboard JSON directly. In a few earlier blog posts (here and here), you may have noticed a function called coalesce(); let's see what this function does before we jump further into the binding technique.

The coalesce function returns the first non-null source from a list of sources. This function is useful for providing a default value in case another function returns null.
syntax: coalesce(source1, source2, ...)
example: coalesce(cell(step1.selection, 0, "column1"), "green")
  output: the result returned by cell(step1.selection, 0, "column1"). However, if that cell returns null, the output is "green".
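The behavior can be sketched in Python; this is a hypothetical helper mirroring the semantics, not the dashboard runtime:

```python
def coalesce(*sources):
    """Return the first non-null (non-None) source, mirroring the
    dashboard coalesce() function's fallback behavior."""
    for s in sources:
        if s is not None:
            return s
    return None

# cell(step1.selection, 0, "column1") returned null -> fall back to "green"
selection_value = None
print(coalesce(selection_value, "green"))  # -> green
print(coalesce("red", "green"))            # -> red
```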


We can use the cell, row, and column functions for binding; let's look at each of them:
The cell function returns a single cell of data as a scalar, like "This salesperson rocks", 2, or null.
syntax: cell(source, rowIndex, columnName)
An error occurs if rowIndex is not an integer, columnName is not a string, or the cell doesn't exist in the table.
example: assume the following rows come from the step:
    {stateName: 'CA', Amount: 100},
    {stateName: 'TX', Amount: 200},
    {stateName: 'OR', Amount: 300},
    {stateName: 'AL', Amount: 400}

Although Einstein Analytics doesn't store this data as a table, let's show the data in this format to make the example easier to understand:
(row index) stateName Amount
0 CA 100
1 TX 200
2 OR 300
3 AL 400

sample function: cell(myStep.selection, 1, "stateName")
result: "TX"
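The cell lookup can be sketched in Python against the same sample rows; the helper is hypothetical, but it reproduces the rowIndex/columnName semantics described above:

```python
def cell(rows, row_index, column_name):
    """Return a single scalar from a list of row dicts, like the binding
    cell() function; raises if the arguments or the cell are invalid."""
    if not isinstance(row_index, int) or not isinstance(column_name, str):
        raise TypeError("rowIndex must be an int and columnName a string")
    return rows[row_index][column_name]

step_selection = [
    {"stateName": "CA", "Amount": 100},
    {"stateName": "TX", "Amount": 200},
    {"stateName": "OR", "Amount": 300},
    {"stateName": "AL", "Amount": 400},
]
print(cell(step_selection, 1, "stateName"))  # -> TX
```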


The column function returns one column of data (as a one-dimensional array) or multiple columns of data (as a two-dimensional array) -- this allows users to select multiple values.
syntax: column(source, [columnNames...])

Let's use a step similar to the one in the cell example above:
    {stateName: 'CA', Amount: 100},
    {stateName: 'TX', Amount: 200},
    {stateName: 'OR', Amount: 300},
    {stateName: 'AL', Amount: 400}

And here is the same data in the virtual table format:
(row index) stateName Amount
0 CA 100
1 TX 200
2 OR 300
3 AL 400

sample function: column(myStep.selection, "stateName")
result: ["CA", "TX", "OR", "AL"]


Now let us continue with binding. There are two types of bindings, selection binding and results binding: the selection or results of one step trigger updates in other steps in the dashboard.

Selection binding is a method used to update a step based on the selection in another step. Selection bindings are interaction-driven: they are evaluated each time the user selects something in a widget.

Results binding is a method used to update a step based on the results of another step.

syntax: {{ cell(<stepName>.<result|selection>, <rowIndex>, <columnName>).asString() }}
example: {{ cell(Static_Step_1.selection, 0, \"value\").asString() }}
--> rowIndex 0 above refers to the 1st row
--> {{ }} denotes a binding

syntax: {{ column(<stepName>.<result|selection>,<columnName>).<asString()|asObject()> }}
example: {{ column(Static_Step_1.selection, \"value\").asObject() }}

Since the coalesce function returns the first non-null source, we can combine it with our cell binding to fall back to the step's results when nothing is selected:
{{ coalesce(cell(Static_Step_1.selection, 0, \"value\"), cell(Static_Step_1.result, 0, \"value\")).asString() }}


From the samples above, there are also two more functions: asString() and asObject().
The asString() function serializes a scalar, one-dimensional array, or two-dimensional array as a string, and escapes double quotes in strings.
syntax: <input data>.asString()
example: cell(stepOpportunity.selection, 1, \"measure\").asString()
--> rowIndex 1 above refers to the 2nd row
{{ cell(color_1.result, 0, \"color\").asString() }}


The asObject() function passes data through with no serialization and returns data as an object (e.g. an array of strings).
syntax: <input data>.asObject()
example: column(StaticMeasureNamesStep.selection, [\"value\"]).asObject()
{{ cell(static_1.selection, 0, \"value\").asObject() }}




Thursday, June 21, 2018

Einstein Analytics: Connect Data Sources and Binding between Dataset

Einstein Analytics allows us to have more than one dataset in a dashboard. With Connect Data Sources, we can link datasets using a common field from both datasets, but the field must be used in the dashboard.

Let's use the following sample datasets:



Connect the data source using "Region"



With connected data sources, when we select a Region in Target, the Actual Region picklist is filtered too -- if "Apply filters from faceting" is enabled, which it is by default.



The same happens when we select a value in Actual Region: Target Region is filtered too -- if "Apply filters from faceting" is enabled.



Even without adding the Region (actual) drop-down list, the Actual value will still be filtered correctly. That's all for Connect Data Sources. In some cases, however, we can't connect data sources; binding is powerful there, but you need to get your hands dirty with JSON.

Let's see the result first:


Selecting a Region in Target does not filter the picklist values in Actual, but the Actual value changes accordingly, the same as with Connect Data Sources.

In this sample, we need to add a filter to the step that produces the Actual value.
This is the original query:
 "query": {
            "measures": [
                    [
                       "sum",
                       "Sales_Actual"
                    ]
            ],
            "groups": [
                       "Sales_Region"
            ]
          }

Add filters to the query:
"query": {
            "measures": [
                    [
                      "sum",
                      "Sales_Actual"
                    ]
            ],
            "groups": [
                       "Sales_Region"
            ],
            "filters": [
                    [
                       "Sales_Region",
                       "{{column(list_Target.selection,[\"Region\"]).asObject()}}",
                       "in"
                    ]
             ]
          }
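What the injected filter does can be sketched in Python; the rows are hypothetical, and the binding is assumed to have resolved to a list of the Region values selected in list_Target:

```python
def apply_in_filter(rows, field, selected_values):
    """Mimic the added SAQL filter: keep only rows whose field value is
    'in' the list injected by the column(...).asObject() binding."""
    return [r for r in rows if r[field] in selected_values]

actual_rows = [
    {"Sales_Region": "East", "Sales_Actual": 100},
    {"Sales_Region": "West", "Sales_Actual": 250},
]
# the binding resolved to the Region selected in the list_Target step
selected = ["West"]
filtered = apply_in_filter(actual_rows, "Sales_Region", selected)
# only the West row survives the filter
```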

  • Sales_Region is the API name of the Region field in the Actual dataset -- the field we filter on.
  • list_Target is the step name of the Target Region drop-down.
  • Region is the API name of the Region field in the Target dataset.




Wednesday, June 13, 2018

Einstein Analytics: Multi-Select Picklist to Text

As of the Summer '18 release, Einstein Analytics is not really friendly with multi-select picklist fields. One of the issues: when a record has multiple values, only the 1st value shows when you display the data as a values table in a lens or a table widget in a dashboard.

Service Type contains multi-values in Salesforce


Service Type only shows the 1st value in EA


As per this document, we can customize the dataflow JSON to treat multi-select values as text. Once you have the dataflow built:

  • Download the JSON file
  • Edit the file, add ,"isMultiValue": false after the field name, and save it (you should back up the original JSON file first)
  • Upload the edited JSON file back to the dataflow



Here is the result: the multi-select values now show as text in Einstein Analytics.



Note: if you do not update the dataflow JSON as above, the data will flow into EA as multi-values. The field will still work as a list filter, with each value treated independently, but record selection will behave as if the record holds all of its multiple values.


Reference: Limitations of Multi-Value Fields in Einstein Analytics



Sunday, June 10, 2018

Einstein Analytics: Getting started with cogroup

You can combine data from two or more data streams into a single data stream using cogroup. The data streams must have at least one common field. Only data that exists in both groups appears in the results.

example:
qs = cogroup qsd by 'State', qsp by 'State';
in this sample, data stream qsd contains the field State and data stream qsp also contains the field State, so we can use it for grouping.

q = cogroup ops by 'Account', meetings by 'Company';
Account in the ops data stream is matched against Company in the meetings data stream.
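The inner-join behavior of cogroup can be sketched in Python; the helper and sample rows are hypothetical:

```python
from collections import defaultdict

def cogroup(left, left_key, right, right_key):
    """Inner cogroup: group both streams on their key fields and keep only
    keys present in BOTH streams, mirroring SAQL's default cogroup."""
    lg, rg = defaultdict(list), defaultdict(list)
    for r in left:
        lg[r[left_key]].append(r)
    for r in right:
        rg[r[right_key]].append(r)
    common = lg.keys() & rg.keys()
    return {k: (lg[k], rg[k]) for k in common}

ops = [{"Account": "Acme", "Amount": 10}]
meetings = [{"Company": "Acme"}, {"Company": "Globex"}]
groups = cogroup(ops, "Account", meetings, "Company")
# Globex has no matching opportunity, so it is dropped from the result
```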


Use case: show deaths per state as a percentage, from two datasets.



Let's use cogroup to combine the dataset:
dsd = load "StateDeath2";
dsp = load "StatePopulation2";
ds = cogroup dsd by 'State', dsp by 'State';
ds = foreach ds generate dsp.'State' as 'State', sum(dsp.'Count') as 'Population', sum(dsd.'Count') as 'Death', (sum(dsd.'Count')/sum(dsp.'Count')*100) as 'Death (%)';

The Result


Let's try to use Opportunity and User datasets from our previous blog.
dsu = load "user1";
dso = load "opportunity";
ds = cogroup dsu by 'Id', dso by 'OwnerId';
ds = foreach ds generate first(dsu.'Name') as 'Name', sum(dso.'Amount') as 'Sum_Amount';
ds = order ds by Name;


If you notice, Angela is not shown in that chart, because she does not have any Opportunity records. Remember that only data that exists in both groups appears in the results.






Saturday, June 9, 2018

Einstein Analytics: Opportunity Dashboard with SAQL Union to show all User

Let's go straight to the business requirement: "show all sales reps with their total sales amount; if a sales rep owns no opportunity, the rep's name must still be shown with $0."

High-level solution:
1. Load User master data
2. Load Opportunity data
3. Use UNION to combine the dataset

User data


Opportunity data


Build User master Lens group by Id and Name
q = load "user1";
q = group q by ('Id', 'Name');
q = foreach q generate 'Id' as 'Id', 'Name' as 'Name', count() as 'count';
q = order q by ('Id' asc, 'Name' asc);
q = limit q 2000;

Let's modify the SAQL:
1. Rename the q data stream to dsu -- for easier identification and uniqueness
2. Rename the projected 'Id' to 'User_Id' and 'Name' to 'User_Name' -- I'll tell you the reason later
3. Remove 'count', as we do not need it -- User_Id is unique
4. Add 'sum_Amount' with value 0 in foreach -- I'll tell you the reason later
5. Remove the limit

Here is the result
dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'User_Name', 0 as 'sum_Amount';
dsu = order dsu by 'User_Id';


Build Opportunity Lens group by OwnerId
q = load "opportunity";
q = group q by 'OwnerId';
q = foreach q generate 'OwnerId' as 'OwnerId', sum('Amount') as 'sum_Amount';
q = order q by 'OwnerId' asc;
q = limit q 2000;

Let's modify the SAQL:
6. Rename the q data stream to dso -- for easier identification and uniqueness
7. Rename the projected 'OwnerId' to 'User_Id' -- I'll tell you the reason later
8. Add 'User_Name' with value "-" in foreach -- I'll tell you the reason later
9. Remove the limit

Here is the result
dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', "-" as 'User_Name', sum('Amount') as 'sum_Amount';
dso = order dso by 'User_Id';


Combine the dataset with UNION
final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('User_Name') as 'User_Name', sum('sum_Amount') as 'sum_Amount';


The Complete SAQL
dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'User_Name', 0 as 'sum_Amount';
dsu = order dsu by 'User_Id';

dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', "-" as 'User_Name', sum('Amount') as 'sum_Amount';
dso = order dso by 'User_Id';

final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('User_Name') as 'User_Name', sum('sum_Amount') as 'sum_Amount';


The Moment of Truth



Explanation
  • we rename Id and Name in step (2) so the columns have the same names as in steps (7) and (8)
  • we add 'sum_Amount' in step (4) so the column names match the dso data stream
  • for our use case, we align both data streams on the same columns by adding dummy columns before applying UNION
  • in the last row, we use the aggregate function first() to return the first user name; our union starts with dsu, which contains the real user names, while dso, coming second/last, always contains "-" for the user name, see step (8)
  • in the last row, we also sum 'sum_Amount' again; practically this sums 'sum_Amount' with 0, remember we added 0 as a dummy value in step (4)
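The union + group + first()/sum() combination can be sketched in Python; the helper and the sample rows are hypothetical:

```python
from collections import defaultdict

def union_and_sum(dsu, dso):
    """Mimic the SAQL union + group by User_Id: take the first User_Name
    (dsu rows come first in the union) and sum sum_Amount per user."""
    grouped = defaultdict(list)
    for row in dsu + dso:  # union keeps all rows from both streams, dsu first
        grouped[row["User_Id"]].append(row)
    result = []
    for user_id, rows in sorted(grouped.items()):
        result.append({
            "User_Name": rows[0]["User_Name"],  # first() -> the dsu name
            "sum_Amount": sum(r["sum_Amount"] for r in rows),
        })
    return result

dsu = [{"User_Id": "1", "User_Name": "Angela", "sum_Amount": 0},
       {"User_Id": "2", "User_Name": "Ben", "sum_Amount": 0}]
dso = [{"User_Id": "2", "User_Name": "-", "sum_Amount": 500}]
final = union_and_sum(dsu, dso)
# Angela keeps $0 because she has no opportunity rows; Ben gets 0 + 500
```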

Make it simple
Since sum_Amount is always 0 in dsu and User_Name is always "-" in dso, we simply don't need to add them to the data streams; we will still get the same result. Let's remove the unnecessary statements.

dsu = load "user1";
dsu = group dsu by ('Id', 'Name');
dsu = foreach dsu generate 'Id' as 'User_Id', 'Name' as 'Name';

dso = load "opportunity";
dso = group dso by 'OwnerId';
dso = foreach dso generate 'OwnerId' as 'User_Id', sum('Amount') as 'sum_Amount';

final = union dsu,dso;
final = group final by ('User_Id');
final = foreach final generate first('Name') as 'Name', sum('sum_Amount') as 'sum_Amount';
final = order final by 'Name';



