Pages

Wednesday, May 13, 2020

Einstein Analytics: Compare Table functions

Compare Table is another powerful and easy to use feature in Einstein Analytics to meet your business requirements, you do not need to manually construct the SAQL or edit with JSON. It allows you to aggregate data into a table (or chart) based on formulas defined.

You can use familiar SAQL syntax to create your own formula for a column of multiple columns in a table or visualize it to a chart. 

On top of manually building custom formula with SAQL, Einstein Analytics also provide defined windowing functions to analyze data across rows.

Sliding Window
Applies an aggregate function to the current row with respect to a configurable range of rows.
- Column: source data (a column)
- Function: Average, Sum, Min, Max
- Start: start row to analyze, -1 mean 1 previous row from the current row
- End: end row to analyze, 0 mean current rows, 1 mean 1 row after current row
- Reset Group: if only you have more than 1 grouping, you can reset by 

Let us see a sample:

This function compares the current value with 1 previous row (start = -1, and end = 0), then get the Max value. 


Once saved, we will see the formula
max(A) over ([-1..0] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day'))

Now, let us add Account Name as 2nd level grouping without adding Reset Group


Because there are 3 accounts on 13-May-2017, the row explodes to 3 lines, but the logic still the same, which is comparing max value between the current and previous row. In this change, the formula will not change.

Now, let us add Reset Group = Opty.CloseDate


The formula will be changed by adding Account Name (in this sample the API name is "Name")
max(A) over ([-1..0] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day','Name'))


Because Close Date added as Reset Group, windowing functions only compare within the same date.


Percentage of Group
Calculates the percentage each row is of its group total, or of the grand total. The percentage only applicable for data shown in the table, NOT for the whole data in dataset (if you apply filter).

The formula: A/sum(A) over ([..] partition by all)



The function can be reset on a grouping defined in the table, this is the same with Sliding Window, you need to have minimum 2 grouping to apply Reset Group.


The formula: A/sum(A) over ([..] partition by 'Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day')


Rank Within Group
There are 4 functions offered by Rank Within Group:
  • Rank: assigns rank based on order. Repeats rank when the value is the same, and skips as many on the next non-match
  • Dense Rank: same as rank() but doesn’t skip values on previous repetitions.
  • Cumulative Distribution: calculates the cumulative distribution (relative position) of the data in the reset group
  • Row-number: assigns a number incremented by 1 for every row in the reset group.

Formulas for each function:
  • Rank: rank() over([..] partition by all order by A desc)
  • Dense Rank: dense_rank() over([..] partition by all order by A desc)
  • Cumulative Distribution: cume_dist() over([..] partition by all order by A desc)
  • Row Number: row_number() over([..] partition by all order by A desc)



The Order menu determines the direction of ranking based on the values being ranked. Ascending ranks the lowest value as number 1, while descending ranks the highest value as number 1. Same as Sliding Window and Percentage of Group function, we can add Reset Group in Rank Within Group.


Period Over Period
To use Period Over Period, the table must be group by Date field. Period Over Period function compare periods of time to calculate changes in values, for example: year-over-year, quarter-over-quarter, month-over-month, week-over-week, or day-over-day.

Then, we have option to show the result as: % Change or Unit Change

Formula for % Change
(A - sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day')))/(sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day')))

Formula for Unit Change
A - sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day'))


Period Over Period do not offer Reset Group.


Change from Previous
Compares the value of the current row with that of the previous row and calculates the difference. This function similar with Period Over Period, but you do not need to group by Date. 

Change from Previous offer Reset Group function and similar with Period Over Period, we have option to show the result as: % Change or Unit Change.

Formula for % Change
(A - sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day')))/(sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day')))

Formula for Unit Change
A - sum(A) over ([-1..-1] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day'))


Above table looks very similar with Period Over Period, but Change from Previous can be implemented to any grouping, not just Date over a period.


Running Total
Calculates the total value of the current row summed with all previous rows. 

Running Total formula without Reset Group: sum(A) over ([..0] partition by all order by ('Opty.CloseDate_Year~~~Opty.CloseDate_Month~~~Opty.CloseDate_Day'))

Running Total also offer Reset Group if you have more than 1 grouping in table. Let us see in sample with and without reset Group:





Reference:



Sunday, May 10, 2020

Einstein Analytics: computeExpression return Date

In some scenarios, we need to return a Date field from computeExpression node. 

Date value can be null
sample:
case when ClosedDate_sec_epoch > date_to_epoch(now()) then toDate(ClosedDate,"yyyy-MM-ddTHH:mm:ss.000Z") end
or
case when ClosedDate_sec_epoch > date_to_epoch(now()) then toDate(ClosedDate_sec_epoch) end

Date or Date/Time
For a Numeric field, Precision and Scale are required, while for a Date field, Date Format is required. Einstein Analytics by default will "explode" a Date field into:
- Year
- Quarter
- Month
- Week
- Day
- Hour
- Minute
- Second
- Epoch days
- Epoch seconds

Let us see some samples of computeExpression returning Date type. In this sample, ClosedDate is a Date/Time field in Salesforce:

(1) SAQL Expression: toDate(ClosedDate,"yyyy-MM-ddTHH:mm:ss.000Z")
Date Format: yyyy-MM-ddTHH:mm:ss.000Z

(2) SAQL Expression: toDate(ClosedDate_sec_epoch)
Date Format: yyyy-MM-ddTHH:mm:ss.000Z

(3) SAQL Expression: toDate(ClosedDate,"yyyy-MM-ddTHH:mm:ss.000Z")
Date Format: yyyy-MM-dd

(4) SAQL Expression: toDate(ClosedDate,"yyyy-MM-dd")
Date Format: yyyy-MM-ddTHH:mm:ss.000Z
this will cause error because date format with HH:mm:ss, while toDate() format string parameter does not have.

(5) SAQL Expression: toDate(ClosedDate_sec_epoch)
Date Format: yyyy-MM-dd

(6) SAQL Expression: ClosedDate
Date Format: yyyy-MM-dd
this will cause error, we can't copy directly even it is a date field 

(7) SAQL Expression: now()
Date Format: yyyy-MM-dd
this is okay, it will return UTC date and time


As the screenshot above, the date format determines how the values stored in Einstein Analytics. But, let us drill further:


Even hour, minute, and second are not added in the date format then not shown in the date field, Einstein Analytics still maintain hour, minute, and second in the exploded fields.

yyyy-MM-ddTHH:mm:ss.000Z is similar with yyyy-MM-ddTHH:mm:ss.SSSZ and similar with yyyy-MM-dd'T'HH:mm:ss.SSS'Z'


Now, let us try to use Date field from Salesforce, instead of Date/Time field. CloseDate here is a Date field in Salesforce:

(1) SAQL Expression: toDate(CloseDate,"yyyy-MM-dd")
Date Format: yyyy-MM-ddTHH:mm:ss.000Z

(2) SAQL Expression: toDate(CloseDate_sec_epoch)
Date Format: yyyy-MM-ddTHH:mm:ss.000Z

(3) SAQL Expression: toDate(CloseDate,"yyyy-MM-ddTHH:mm:ss.000Z")
Date Format: yyyy-MM-dd
this will cause error

(4) SAQL Expression: toDate(CloseDate_sec_epoch)
Date Format: yyyy-MM-dd

(5) SAQL Expression: toDate(CloseDate,"yyyy-MM-ddTHH:mm:ss.000Z")
Date Format: yyyy-MM-ddTHH:mm:ss.000Z
this will cause error

(6) SAQL Expression: toDate(CloseDate,"yyyy-MM-dd")
Date Format: yyyy-MM-dd

Because there is no time in the source field, so all HH mm ss will be 00.


In summary:
  • for SAQL Expression, it is easier to use toDate(xxxx_sec_epoch) 
  • for Date Format, use yyyy-MM-ddTHH:mm:ss.000Z for Date/Time, or use yyyy-MM-dd for Date field

Side Note: can we refer a field in computeExpression within the same note?
Yes, as long as the field referred is located above (in order) from the field call it, see screenshot below, CloseDate_2 can use CloseDate_1, but not CloseDate_4 and CloseDate_6.








Wednesday, May 6, 2020

Einstein Analytics: Hidden Step

Hidden step probably not the official term in Einstein Analytics, but you probably heard when having a conversation with Einstein Analytics experts with many years of experience, furthermore, Salesforce no longer calls it "step", but now it calls "query".

So, what is the hidden step? it is a query that used to support other queries, but not directly uses for widgets. The query can be in compact form, Salesforce Direct, SAQL, and SOQL.

As mentioned above, we can use a hidden step to support other queries, in this case, usually, we use "result" binding, remember selection binding will be trigger when user change/select something.sim

Sample SOQL query:
 "QueryLoginUser_1": {
                "groups": [],
                "numbers": [],
                "query": "SELECT Name,Id FROM User Where Name = '!{User.Name}'",
                "selectMode": "single",
                "strings": [],
                "type": "soql"
            }

use it in a "real" query:
         "filters": [
                        [
                            "OwnerId",
                            "{{cell(QueryLoginUser_1.result,0,\"Id\").asString()}}",
                            "in"
                        ]
                    ]


Sample SAQL query:
q = load "Quote_History";
q = filter q by 'StatusChange' == "Created to In Progress";
q = group q by all;
q = foreach q generate avg('Cycle_Time') as 'avg_Cycle_Time';
q = foreach q generate avg_Cycle_Time as 'ave_seconds', floor(avg_Cycle_Time / 3600) as 'hours', floor((avg_Cycle_Time / 60) - (floor(avg_Cycle_Time / 3600) * 60)) as 'minutes';
q = foreach q generate ave_seconds as 'ave_seconds', hours as 'hours', minutes as 'minutes', floor(ave_seconds - ((3600 * hours) + (60 * minutes))) as 'seconds';
q = limit q 2000;

If you open in JSON, it would become
"cycle_time_1": {        
"query": "q = load \"Quote_History\";\nq = filter q by 'StatusChange' == \"Created to In Progress\";\nq = group q by all;\nq = foreach q generate avg('Cycle_Time') as 'avg_Cycle_Time';\nq = foreach q generate avg_Cycle_Time as 'ave_seconds', floor(avg_Cycle_Time / 3600) as 'hours', floor((avg_Cycle_Time / 60) - (floor(avg_Cycle_Time / 3600) * 60)) as 'minutes';\nq = foreach q generate ave_seconds as 'ave_seconds', hours as 'hours', minutes as 'minutes', floor(ave_seconds - ((3600 * hours) + (60 * minutes))) as 'seconds';\nq = limit q 2000;"}


use it in a "real" query
"text": "{{column(cycle_time_1.result, [\"seconds\"]).asObject()}}",



Reference:



Page-level ad