Staying focused – Using a methodology to organize your thoughts and project activities
So far in this series we have introduced the notion of Einstein Analytics Stories, why you might create one and, at a high level, what sorts of business problems and analyses you can approach using a Story. But, as we’re sure you are aware, Stories are just one part of the business problem you are tackling. This next blog in our series aims to get you familiar with a methodology (not Salesforce’s, but from an independent consortium) that we’ve seen succeed when used with Einstein Analytics.
Expecting the first Story you create in Einstein Discovery to tell you everything you ever wanted to know about your business problem would be a mistake. As mentioned, Einstein Analytics Stories are very simple to use (after all, that’s the whole idea of the product), but as with most things in life, it is better if you make a plan first. We think of it as the “Look before you leap” principle! To help create this plan you can look to some of the many standard data mining methodologies to guide your thoughts and steps. One of these is CRISP-DM, the Cross-Industry Standard Process for Data Mining, which is a widely used approach. Oh, by the way, data mining is a fancy term for discovering patterns in large datasets using machine learning and statistics, which is exactly what Einstein Analytics Stories do for you!
A closer look at CRISP-DM
CRISP-DM gives you a framework to drive your project planning and execution. The methodology breaks the process into six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.
What is first noticeable is that the process is not linear; in fact, it is iterative (and sometimes recursive), with movement back and forth between several of the phases. However, the first step should always be to get as much business understanding as possible and create what is commonly termed a use case, which we will talk about in more detail shortly. Understanding what is interesting for the business helps you qualify data sources and data quality for your project. The next thing to understand is what data is available, where it sits and how you might start to represent the business problem in a digital format. Even at this point you may go back and forth between these phases, because you don’t always immediately have the data that you want, or the decisions taken in your business problem aren’t recorded in the data the way you thought they were.
OK, good: you’re happy that you understand the business problem, the context in which you want to use your analytics, and that there is (at least at first sight) enough data to start modeling your business problem. When making business decisions, we humans often synthesize lots of different bits of data, so knowing the data is available is one thing, but getting it ready for statistical analysis is a whole different matter. So next you need to prepare your data: you may have several different sources, or you may find that you need to segment your data into different subsets, create calculations or summaries, or even create quite different views of your underlying raw data. This is where the powerful data prep tools of Einstein Analytics come in. With this step finished, or as finished as you feel comfortable with for a first draft, you look to build your statistical model, or Story as it is called in Einstein Analytics.
As we mentioned previously, you cannot expect to get this right the first time; you will often have to go back and forth between data prep and modeling, adjusting, tweaking, seeking new data, etc., before you are happy. OK, great, you’ve iterated over a few data and Story cycles, so now it comes to evaluating the model. You are aiming to see whether it’s statistically any good and whether it actually answers the business questions defined in the use case. Once you have fettled, iterated and justified, it’s time to tackle the last phase of the methodology and get the results of the analytics into the hands of the decision-makers and end-users through deployment. Deployment should always be within some end-user or customer experience which either demonstrably increases revenue/profit or decreases cost/waste. Analytics without action is just a hobby. In our case, deployment to Salesforce record pages is the first thing you should think about.
This is a brief introduction to the CRISP-DM methodology: if you want to read more about this framework a good place to start is this Wikipedia article.
Creating your use case
The kingpin of the methodology, and frankly the best way of communicating your project to all stakeholders, is to create what’s called a use case. Your use case will be the foundation and North Star throughout the life-cycle of the methodology. You, your colleagues and your end-users will be referencing it in all the phases of the methodology. We’ve put together an example skeleton use case template that will help you stay on point. Just a thought to bear in mind: we are sure that once you start thinking about your use case you won’t have just one; you will quickly have identified and enumerated many. So prepare to spend a bit of time consulting on the relative impact to the business of tackling each of them, and then working up a priority and strategy for your multiple Einstein Analytics projects. Instead of digging into every single use case you can come up with, prioritize them based on the business value each will bring and pick up to three to start with. Don’t stretch yourself thin; you can always get back to the use cases you drop in the first round.
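As a rough illustration of that prioritization step, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `UseCase` class, the 1 to 5 `business_value` score, the `data_available` flag and the candidate names are all hypothetical, and in practice the scores would come out of your conversations with the business rather than from code.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    business_value: int   # hypothetical 1-5 score agreed with stakeholders
    data_available: bool  # is there, at first sight, enough data?


def pick_first_round(use_cases, limit=3):
    """Return up to `limit` feasible use cases, highest business value first."""
    feasible = [uc for uc in use_cases if uc.data_available]
    ranked = sorted(feasible, key=lambda uc: uc.business_value, reverse=True)
    return ranked[:limit]


# Made-up candidates, purely for illustration.
candidates = [
    UseCase("Predict opportunity wins", 5, True),
    UseCase("Reduce case resolution time", 3, True),
    UseCase("Forecast renewals", 4, False),   # valuable, but no data yet
    UseCase("Improve lead conversion", 4, True),
    UseCase("Shorten days in pipeline", 2, True),
]

first_round = pick_first_round(candidates)
print([uc.name for uc in first_round])
```

The use cases that miss the cut are not lost; they simply wait for a later round.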
What can you do to get all stakeholders aligned and hungry for the output of your project? Create your use case with them. Necessarily less technical, this first round of thinking helps you explain, in easy language and enough detail, the business challenge, the approach and data required, and the strategic and tactical insights you expect to find and deliver. Doing this exercise will truly help you and your business prioritize which use cases you want to focus on, as it addresses business value and impact as well as whether data is available. Here’s an example based on being able to predict the likelihood that an open opportunity will be closed won.
In the business challenge section, you list all the challenges or reasons behind the goal: what is it you are trying to solve in the business? For the approach and data required, you need to consider what data you have available and where it is stored. If you are looking at an opportunity’s likelihood of being won, you most likely want to look at the Opportunity object in Salesforce. However, you may also want to look at other sources with other relevant details about the opportunity: account, sales activity, prior service case history, NPS, prior opportunities, etc. Finally, you list the strategic and tactical insights. The strategic insights are those you get from the Story as a whole, which help you understand how the business is working and what the drivers of success are, whereas tactical insights relate to the individual record you are going to score in Salesforce (in this case an opportunity).
When you think you have finished writing up each use case outline it’s best to go back to the business and validate each one. Why? Well, they may spot things you missed or got wrong, things that are not operationally valid, or places where you have the wrong emphasis. After all, they will be the ones using the decision support provided by your Einstein Analytics Stories, so you’d better get full buy-in if you want high adoption rates!
Converting your use case to a use case reference
Your next job is to create your working use case reference. This is necessarily more technical and translates what you have in your use case outline from before into more tangible and project-based terminology and activities. Here’s a template you may want to adopt and adapt.
Let’s pick this use case reference template apart a bit using the example below. Doing so will help you think about writing up your candidate use cases and converting some questions and wishes into concrete aspects of your project.
The first step of the template is to name your use case; it is always a good idea to give it something meaningful and motivational. It also really does help everyone to be able to define the goal in one sentence. In the example use case below, the single-sentence goal has been defined as “Understanding the drivers of winning deals so we can predict future wins”. Next is the operational approach, which is about explaining what you are trying to understand, with what data, and what you wish to predict. Expanding on the goal from our example, the operational approach is described as “Analysing the historical opportunities data for all regions and divisions we will generate a story that explains the effects which are driving more opportunities to be won. Further after understanding these drivers we want to then predict the probability of future opportunities to close as won”.
Now think about the question you want to ask of an Einstein Analytics Story; think back to the yes/no or how much/how many types of question we looked at in our previous blog in the series. This will help determine the next parts of the template: data granularity, desirable outcome (model target), explanatory variables and actionable variables. With the defined goal and operational approach, the tantalizing question is “will this opportunity be won?”. Keeping this simple question in mind, it’s easy to determine the data granularity which is, of course, an opportunity. The outcome is binary, true (yes) or false (no), as the opportunity is either closed won or it’s not (closed lost). This is easily found as the checkbox field isWon on the Opportunity object. While thinking these facets of your use case through, you are probably starting to make further decisions about what good examples of the thing you are trying to predict look like, and what you will use to contrast with those good examples. As a result, you will be building a smaller, model building dataset rather than a reporting dataset that potentially covers all records. The model building dataset will be made of records that are as representative of the business issue as you can find, and may well be culled from historic records or rehydrated from some of your historical data points. In your discussions with the business and user communities, note down anything like “opportunities closed in the last two years since the sales processes changed but which belong to the core sales teams”, or “opportunities that belong to Accounts that are not on an Enterprise License Agreement or Discount scheme”, as these define the rows of data you include or exclude in your model building dataset.
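To make the include/exclude idea concrete, here is a minimal Python sketch of turning such notes into row filters and a yes/no target. It is an illustration under stated assumptions, not how Einstein Analytics data prep works: the records are made up, the column names (`isClosed`, `closeDate`, `team`, `on_ela`) are hypothetical stand-ins, the two example rules are combined purely for demonstration, and the cut-off date is arbitrary.

```python
from datetime import date

# Made-up opportunity rows; every column name here is illustrative.
opportunities = [
    {"id": "A", "isClosed": True,  "isWon": True,  "closeDate": date(2023, 5, 2),   "team": "core",    "on_ela": False},
    {"id": "B", "isClosed": True,  "isWon": False, "closeDate": date(2023, 9, 14),  "team": "core",    "on_ela": False},
    {"id": "C", "isClosed": True,  "isWon": True,  "closeDate": date(2019, 1, 20),  "team": "core",    "on_ela": False},  # before the process change
    {"id": "D", "isClosed": False, "isWon": False, "closeDate": date(2024, 3, 1),   "team": "core",    "on_ela": False},  # still open, no outcome yet
    {"id": "E", "isClosed": True,  "isWon": True,  "closeDate": date(2023, 7, 7),   "team": "partner", "on_ela": False},  # not a core sales team
    {"id": "F", "isClosed": True,  "isWon": False, "closeDate": date(2023, 11, 30), "team": "core",    "on_ela": True},   # on an ELA
]


def in_scope(opp, earliest_close):
    """Apply the inclusion/exclusion rules noted down with the business."""
    return (
        opp["isClosed"]                          # only closed deals have a known outcome
        and opp["closeDate"] >= earliest_close   # since the sales process changed
        and opp["team"] == "core"                # core sales teams only
        and not opp["on_ela"]                    # exclude ELA accounts
    )


def build_model_rows(records, earliest_close):
    """Keep in-scope rows and attach the binary target for the Story."""
    return [
        {**opp, "target": "yes" if opp["isWon"] else "no"}
        for opp in records
        if in_scope(opp, earliest_close)
    ]


model_rows = build_model_rows(opportunities, earliest_close=date(2022, 1, 1))
print([(row["id"], row["target"]) for row in model_rows])
```

Note that the filtered set keeps both won and lost deals: the Story needs the contrasting examples as much as the good ones.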
The next step of the template helps you determine explanatory and actionable variables, which are essentially dataset fields. Explanatory variables are any fields in your dataset that could help determine the likelihood of an opportunity closing as won or lost. Notice we said could: Einstein Analytics Stories will let you know whether they are predictive. Your job is to put as many potential predictors as you can into the Story creation, and Einstein Analytics does the evaluation of each of them for you. Go crazy on explanatory variables and think like an end user who’s been given access to any of your company’s data sources. These fields can be anything that describes the opportunity, and they are not limited to your Opportunity object; you can include account details or maybe even the number of activities associated with the opportunity as well. But remember to check whether the fields are actually completed. You may want to use a field, but if sales never enter any value in it, the field is pointless. Actionable variables are similar to explanatory variables in that they also describe the opportunity; however, they can be adjusted or changed during the creation or progression of the opportunity. Einstein Analytics Stories will use these actionable variables to make recommendations on what is likely to make a difference in each and every opportunity. So, looking at our example, we have picked “type”, “region”, “product”, “win rate”, “account rep tenure”, “closed date” and “days in pipeline” as a starter list for your explanatory variables. Now two things to note. The first is that not all of these variables are necessarily fields on the object; some may be derived in the data prep. The second is that you need to make your best guess using your business and domain knowledge; Einstein Analytics cannot tell you which fields to include, that’s up to you.
The Einstein Analytics Story will, however, tell you whether or not it detects any statistically significant patterns in the data in your chosen explanatory fields. For the actionable variables, think about things an account executive can change to increase the likelihood of the opportunity being won, which should, of course, be registered somewhere on the opportunity. Returning to our example, we have defined “promotions”, “discount”, “executive meeting” and “customer relationship status” as key variables the account executive can offer or modify during the deal process. Like explanatory variables, actionable variables are assessed in the Story to check that they actually have an impact on the Story target and, if they do, how much of a difference choosing something actionable could make.
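Before handing either list of variables to a Story, it is worth checking how well each field is actually filled in. The Python sketch below is one simple way to do that outside the product; the sample rows, the `competitor` field and the 70% threshold are all assumptions chosen for illustration, not recommendations.

```python
def fill_rate(records, field_name):
    """Fraction of records where the field holds a real value (not None or '')."""
    if not records:
        return 0.0
    filled = sum(1 for row in records if row.get(field_name) not in (None, ""))
    return filled / len(records)


def usable_fields(records, field_names, min_fill=0.7):
    """Keep only the candidate fields that are filled in often enough."""
    return [name for name in field_names if fill_rate(records, name) >= min_fill]


# Made-up rows; "competitor" is a hypothetical field that sales rarely complete.
sample = [
    {"type": "New Business", "region": "EMEA", "competitor": ""},
    {"type": "Renewal",      "region": "AMER", "competitor": None},
    {"type": "New Business", "region": None,   "competitor": "Acme"},
    {"type": "Upsell",       "region": "APAC", "competitor": ""},
]

for name in ("type", "region", "competitor"):
    print(name, fill_rate(sample, name))

print(usable_fields(sample, ["type", "region", "competitor"]))
```

A field that fails the check is not necessarily useless; it may be a prompt to ask the business why it is never completed.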
Fun, isn’t it? The use case exercise is a fantastic way of concentrating the mind, getting the physical pieces of the project lined up with the CRISP-DM methodology and properly preparing you to undertake your project.
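One way to keep those pieces lined up is to record the use case reference as a small structured object that everyone can review. The Python sketch below is only an illustration: the `UseCaseReference` class, its attribute names and the use case name are hypothetical, the operational approach is paraphrased, and the remaining values are taken from the opportunity example discussed above.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseReference:
    name: str
    goal: str                  # ideally a single sentence
    operational_approach: str
    question: str
    data_granularity: str
    target: str                # desirable outcome (model target)
    explanatory_variables: list = field(default_factory=list)
    actionable_variables: list = field(default_factory=list)

    def story_fields(self):
        """Target plus every candidate variable, in order, without duplicates."""
        seen, ordered = set(), []
        for name in [self.target, *self.explanatory_variables, *self.actionable_variables]:
            if name not in seen:
                seen.add(name)
                ordered.append(name)
        return ordered


win_reference = UseCaseReference(
    name="Win more opportunities",  # hypothetical name
    goal="Understanding the drivers of winning deals so we can predict future wins",
    operational_approach=(
        "Analyse historical opportunities for all regions and divisions, "
        "then predict the probability of future opportunities closing as won"
    ),
    question="Will this opportunity be won?",
    data_granularity="opportunity",
    target="isWon",
    explanatory_variables=[
        "type", "region", "product", "win rate",
        "account rep tenure", "closed date", "days in pipeline",
    ],
    actionable_variables=[
        "promotions", "discount", "executive meeting",
        "customer relationship status",
    ],
)

print(win_reference.story_fields())
```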
In the next blog, we’ll revisit the data topics we started above and spend more time discussing some of the finer details of how you go about building the best possible model building dataset for your use case.