Wednesday, October 9, 2019

Salesforce Einstein – Where to start to experiment and understand Machine Learning?

by Jean-Michel Mougeolle, Salesforce MVP hall of fame, Salesforce Einstein Champion, SharinPix CEO.

What is the Einstein Champions Program?
Let me start this blog with the Einstein Champions Program. It is for Trailblazers who are passionate about the Einstein Platform and want to share their advanced knowledge with peers and evangelize the power of Einstein.

I have the chance to be one of them, certainly thanks to the various presentations I have given on Einstein Vision at Dreamforce and at many Dreamin’ events. I’m convinced that Einstein Vision is a great way to start learning about Machine Learning in Salesforce.

Why start with Einstein Vision?
First, it makes it very easy to understand the benefits of Machine Learning and the approach it requires.

Second, you can easily play with it, FOR FREE!
For Free? You mean you don’t need any licenses?
No, you just have to install the Einstein Vision and Language Model Builder by Salesforce Labs to start playing with it. Creating models is free, and to test them you also get up to 2,000 predictions per month for free.

So where should we start to create our first model?
I would go with Einstein Vision Image Classification. All it takes to get started is a zip file with a few images organized in folders, one folder per label. Of course, you have to gather enough images per label to get something that works, and pay attention to image format, size and resolution. But if you only plan to create a model for testing, images pulled from a Google search should be enough to get nice results.

Can you explain the basis of Image Classification?
Yes, for sure. Image Classification predicts what a picture shows, based on the examples it has been trained on. As an example, the model can tell a cat from a dog if it has been trained with enough dog and cat pictures.

For our demo jam with SharinPix we used images from Google to create models that classify food pictures. The model can recognize hot dogs, pizza, burgers, drinks, dessert, BBQ meat and more. That’s a good example of how to classify the items of a menu from their images so that they are sorted automatically.

You mean that you can train a model that easily?
Yes, you just have to collect enough images per label (100), build a zip file with them and create a dataset from it. The UI of the Salesforce Labs package lets you easily create a dataset from a zip file. Once you have a dataset, you can train a model from the same package. The model is the "engine" that creates predictions.
Once you have a model, you can present a picture to it and the model will make a prediction.
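
To make the zip step concrete, here is a minimal Python sketch. It assumes a local folder with one sub-folder per label, counts the images in each, and writes them to a zip file with the folder name used as the label. The function names, the accepted file extensions and the folder layout are my own assumptions for illustration and are not part of the Salesforce Labs package; only the 100-images-per-label figure comes from the answer above.

```python
import zipfile
from collections import Counter
from pathlib import Path

# Assumed image formats for this sketch; check the service's current limits.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
MIN_IMAGES_PER_LABEL = 100  # figure quoted in the text above


def is_image(path):
    """True for regular files with one of the assumed image extensions."""
    return path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES


def count_images_per_label(root):
    """Count image files in each label folder directly under root."""
    counts = Counter()
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(1 for f in folder.iterdir() if is_image(f))
    return counts


def build_dataset_zip(root, zip_path, min_images=MIN_IMAGES_PER_LABEL):
    """Zip root/<label>/<image> files and return the labels short on images."""
    root = Path(root)
    counts = count_images_per_label(root)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for label in counts:
            for f in sorted((root / label).iterdir()):
                if is_image(f):
                    # Stored as "<label>/<file name>" so the folder name is the label.
                    archive.write(f, arcname=f"{label}/{f.name}")
    return {label: n for label, n in counts.items() if n < min_images}
```

Calling build_dataset_zip("food", "food.zip") on a hypothetical "food" folder would return a dictionary of the labels that still need more pictures, which is a quick check to run before uploading anything.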

What can we expect to learn from that?
The limits of a poor dataset. As an example, if you upload only white cats and only black dogs, you get a poor-quality dataset. If you then present a black cat to the model, it will most likely predict that it is a dog.

Getting a good dataset is key, and it’s really easy to understand why from an example that does not work. As Image Classification is very visual, you can easily learn the right and wrong approaches to Machine Learning.
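
The white cats and black dogs trap can be reproduced with a deliberately naive toy model. The sketch below is not how Einstein Vision works; it assumes a single made-up feature per image (average brightness between 0 and 1) and a "model" that simply remembers the mean brightness of each label. All numbers are invented for illustration.

```python
from statistics import mean

# Toy training set: one made-up feature per image, average brightness in [0, 1].
# Every cat is white (bright) and every dog is black (dark).
training = {
    "cat": [0.90, 0.85, 0.95],
    "dog": [0.10, 0.15, 0.05],
}


def train(examples):
    """'Training' here is just the mean brightness per label."""
    return {label: mean(values) for label, values in examples.items()}


def predict(model, brightness):
    """Return the label whose mean brightness is closest to the input."""
    return min(model, key=lambda label: abs(model[label] - brightness))


model = train(training)
print(predict(model, 0.12))  # a black cat: the toy model answers "dog"
```

The toy model has learned color, not animals, because color was the only thing separating the two labels in the examples it was given.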

What about Object Detection?
It follows much the same principle as Image Classification, but it can detect many objects in one picture and return the position of each, the number of objects and of course the probability associated with each recognition. The main use for this is to automate retail execution from shelf display pictures.
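
To illustrate what you can do with position, count and probability, here is a small Python sketch that counts detected objects per label and ignores low-probability detections. The list of dictionaries and its field names ("label", "probability", "box") are assumptions made for this example, not the actual response format of the service, and the 0.5 threshold is arbitrary.

```python
from collections import Counter


def count_detections(detections, min_probability=0.5):
    """Count detected objects per label, skipping low-probability detections."""
    return Counter(
        d["label"] for d in detections if d["probability"] >= min_probability
    )


# Hypothetical detections for one shelf picture; "box" is (x, y, width, height).
detections = [
    {"label": "cola", "probability": 0.97, "box": (10, 20, 60, 140)},
    {"label": "cola", "probability": 0.91, "box": (75, 22, 60, 138)},
    {"label": "water", "probability": 0.88, "box": (140, 18, 55, 150)},
    {"label": "water", "probability": 0.31, "box": (200, 25, 50, 140)},
]

print(count_detections(detections))
```

With these made-up values the shelf counts two colas and one water, because the second water detection falls below the threshold.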

Is that as easy as for the Image Classification?
Yes and no.
It doesn’t require a different technology and the approach is the same: create a dataset with pictures and train a model to get predictions. But this time you need to label the pictures with bounding boxes around all the objects you want to recognize.

So, in the retail execution example, the model has to learn from shelf display images on which you draw a box around each object you want to recognize and give it a name. And this time you don’t need 100 pictures per label, but 200 bounding boxes per label across all the pictures used in the dataset. The boxes must be drawn precisely to get good predictions.
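
Because the target is counted in boxes and not in pictures, it helps to tally boxes per label across the whole dataset. The Python sketch below assumes the annotations are available as a dictionary mapping each image name to its list of labeled boxes; that structure and the function names are hypothetical, and only the figure of 200 boxes per label comes from the answer above.

```python
from collections import Counter

MIN_BOXES_PER_LABEL = 200  # figure quoted in the text above


def boxes_per_label(annotations):
    """Count bounding boxes per label across all images."""
    counts = Counter()
    for boxes in annotations.values():
        counts.update(box["label"] for box in boxes)
    return counts


def labels_short_of_boxes(annotations, minimum=MIN_BOXES_PER_LABEL):
    """Return the labels that have fewer than `minimum` boxes in the dataset."""
    counts = boxes_per_label(annotations)
    return {label: n for label, n in counts.items() if n < minimum}


# Hypothetical annotations: image name -> boxes drawn on that image.
annotations = {
    "shelf_01.jpg": [
        {"label": "cola", "x": 10, "y": 20, "width": 60, "height": 140},
        {"label": "cola", "x": 75, "y": 22, "width": 60, "height": 138},
        {"label": "water", "x": 140, "y": 18, "width": 55, "height": 150},
    ],
    "shelf_02.jpg": [
        {"label": "cola", "x": 12, "y": 25, "width": 58, "height": 135},
    ],
}

print(labels_short_of_boxes(annotations, minimum=2))
```

A single picture of a full shelf can contribute dozens of boxes for one label, so the number of pictures alone says little about whether a label is covered.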

What are the main problems that can make you have a bad dataset?
The first is poor labeling quality. AI bases its logic on the examples you feed it. If you give it wrong examples, you get bad predictions. When you label hundreds of images, it’s easy to make mistakes, so QA is mandatory to catch labeling errors.
The second is the images themselves: their diversity, frequency, and quality are key. You should not use images taken at too sharp an angle or with too much light. You may also need each object to appear with roughly the same frequency across all the images in the dataset.
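
Two of these checks are easy to automate. The sketch below, written under my own assumptions, flags labels that are not in an agreed list (typically typos made during labeling) and computes how unbalanced the label frequencies are. The function names and the sample labels are hypothetical.

```python
from collections import Counter


def find_unknown_labels(labels, allowed):
    """Return labels that are not in the agreed list, e.g. typos such as 'piza'."""
    return sorted(set(labels) - set(allowed))


def imbalance_ratio(labels):
    """Most frequent label count divided by least frequent label count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical labels collected from a labeled dataset.
allowed = {"pizza", "burger"}
labels = ["pizza"] * 4 + ["burger"] * 2 + ["piza"]

print(find_unknown_labels(labels, allowed))
clean = [label for label in labels if label in allowed]
print(imbalance_ratio(clean))
```

A ratio close to 1 means every object appears about as often as the others. What counts as too unbalanced depends on the project, so I leave that threshold to you.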

You seem to be very well experienced around that, does it come from what you have done with SharinPix?
Yes, we have provided services to create tons of models for various big retail customers, but also for companies in other industries. We have labeled datasets that can recognize several hundred objects and contain several thousand images.

A quality-driven approach is key in that kind of project: getting organized, having the right level of QA and a good understanding of the risk behind each problem you encounter is really important.
We have built an app to help teams that want to be serious about model making, model optimization, and model maintenance. We use it internally and provide the related services worldwide to many different companies.

Is that available on the AppExchange?
Yes, it’s part of the SharinPix App, and you can reach out to me with any question about Machine Learning and the app whenever you need!

So, can you recap the best thing to start with if you want to learn about Einstein?
Yes, the first one is of course Trailhead, where there is an incredible Trailmix that will teach you a lot:

Then you can install the Model Builder provided by Salesforce Labs from the AppExchange:

And of course, if you want some help and want to get serious about Image Recognition, you can rely on the SharinPix App and Labelling Services:
