Custom prediction

This feature is currently available in the Beta version.

Instead of using a predefined template, you can create a custom prediction, which lets you specify your own prediction parameters. To do so, go to Analyses > Predictions > + New prediction > Custom prediction.

Custom prediction generally

Setting up a custom prediction consists of four steps:

  • Target customer filter - define the goal you would like to predict (for example, purchase)
  • Eligible customer filter - define the customers for which the model should try to predict the target (for example, customers who opened a particular email)
  • Events, attributes - define what data should form the basis of the prediction
  • Algorithm settings - define the basic setting for the algorithm

Target customer filter

The target of the prediction is an event or customer property that you want to predict.

Prediction type

The first thing you need to define for the target is the prediction type - the type of the data which should come as a result. You have the following options:

| Prediction type | Output value | Description | Example usage |
|---|---|---|---|
| Binomial classification | Probability | Determines whether customers will achieve a prediction goal in the future or not. | What is the probability that customers will make a purchase within a week? |
| Multinomial classification | Best segment | Determines which segment is the best one for each customer from a given set of segments. | Which campaign channel is the best one for a customer? |
| Regression | Absolute number | Predicts the numerical value of certain customer attributes in the future. | How much money will a customer spend this month? |

Filter condition

The second thing you need to do is to define the specific target (filter condition) which you want to predict. The target can be any combination of events or customers' properties. The options are limited only by the variety of data that you have tracked about your customers. You choose the target by clicking on + Add filter condition. In the example below, we chose purchase as the target event.

Looking at your customers' past behavior allows you to learn what seems to determine it. To do this, the predictions module looks at two time frames:

  • Feature window - the time frame out of which the prediction extracts customer behavior patterns preceding the target
  • Target window - the time frame out of which the prediction extracts whether a particular target has or has not been reached

When you combine the behavior within the feature window and its result within the target window, you can predict the future.
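The two windows can be pictured as follows. This is a minimal sketch, not the product's implementation: the function name `build_training_row`, the event names, and the use of simple event counts as features are assumptions made for illustration only.

```python
from datetime import datetime

def build_training_row(events, feature_start, feature_end, target_end,
                       target_event="purchase"):
    """Turn one customer's events into (features, label).

    events: list of (timestamp, event_name) tuples for a single customer.
    Feature window is [feature_start, feature_end);
    target window is [feature_end, target_end).
    """
    features = {}
    label = 0
    for ts, name in events:
        if feature_start <= ts < feature_end:
            # Behavior preceding the target becomes an input (event counts here).
            features[name] = features.get(name, 0) + 1
        elif feature_end <= ts < target_end and name == target_event:
            # The target window only tells us whether the target was reached.
            label = 1
    return features, label

# Hypothetical customer history.
events = [
    (datetime(2024, 1, 10), "session_start"),
    (datetime(2024, 2, 5), "view_item"),
    (datetime(2024, 2, 20), "view_item"),
    (datetime(2024, 3, 15), "purchase"),
]
features, label = build_training_row(
    events,
    feature_start=datetime(2024, 1, 1),
    feature_end=datetime(2024, 3, 1),
    target_end=datetime(2024, 3, 31),
)
```

Each customer contributes one such row; a model trained on many rows learns which feature patterns tend to precede a reached target.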

For the model to be accurate, you should have at least a few hundred customers in each segment. Ideally, they would be equally distributed.
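If you export the target labels, a quick check like the one below can flag segments that are too small or badly skewed. It is only a sketch: the threshold of 300 is our stand-in for "a few hundred", and the function name and label values are hypothetical.

```python
from collections import Counter

def check_segment_sizes(labels, min_per_segment=300):
    """Count customers per segment and report undersized segments."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels supplied")
    too_small = sorted(seg for seg, n in counts.items() if n < min_per_segment)
    # Ratio of the largest to the smallest segment; 1.0 means evenly distributed.
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return {
        "counts": dict(counts),
        "too_small": too_small,
        "imbalance_ratio": imbalance_ratio,
    }

# Hypothetical label export: 500 email clickers, 120 push clickers.
labels = ["email"] * 500 + ["browser_push_notification"] * 120
report = check_segment_sizes(labels)
```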

Eligible customer filter

You should filter customers so that only those relevant to the particular prediction remain. Ideally, these are customers for whom you tracked the target events in the past three months. However, you might need to widen the criteria if you do not have enough such data. The eligible customer filter is used in the same way as a regular customer filter.

The dataset resulting from the filter should be of the size of at least a few thousand customers so that the model can create accurate predictions.

The time frame should be set based on whether you want to predict the target happening now or in the further future. Read about how to choose the correct time frame in the next Events, attributes… section.

Events, attributes....

In the Events, attributes.... tab, you define which events or customer properties should be considered by the model. You also choose the time frame within which these should have been tracked. This timeframe is called the feature window. You have two options when choosing what the window should be:

  • If you want to predict something happening now you should choose a feature window which overlaps with the target window. This is explained in the Now prediction section.
  • If not, then the feature window and the target window should be separate. Read more about it in the Window prediction section.

Window prediction

If you are not predicting what is going to happen now but in the further future, you should choose a feature window that ends before the target window begins. We call this the Window prediction.

In the picture below, the feature window is the 60 days preceding the target window, which consists of 30 days. This means that we analyze 60 days of customer behavior to extract the patterns, and afterward we have another 30 days to understand which patterns lead to the target and which do not. Often, the feature window will be identical to the time frame for eligible customers, but this depends on your exact needs.

The resulting information allows us to predict the future. In the picture below, we have new customers tracked in the last 60 days. When we compare their behavior with the behavior and results of the training model, we can calculate the probability that a particular customer will achieve the target in the next 30 days.
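The date arithmetic behind the 60-day and 30-day example can be sketched as follows. The function name and the half-open `(start, end)` convention are assumptions for illustration; how the application computes the ranges internally is not described here.

```python
from datetime import date, timedelta

def window_prediction_ranges(today, feature_days=60, target_days=30):
    """Return (start, end) date pairs for training and for scoring."""
    # Training: the most recent complete target window ends today,
    # and the feature window is the period directly before it.
    train_target_start = today - timedelta(days=target_days)
    train_feature_start = train_target_start - timedelta(days=feature_days)
    return {
        "train_feature": (train_feature_start, train_target_start),
        "train_target": (train_target_start, today),
        # Scoring: the latest feature_days of behavior predict the next target_days.
        "score_feature": (today - timedelta(days=feature_days), today),
        "score_target": (today, today + timedelta(days=target_days)),
    }

ranges = window_prediction_ranges(date(2024, 6, 30))
```

Note that the training feature window ends exactly where the training target window starts, which is what distinguishes the Window prediction from the Now prediction.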

To set up this template in the app, replicate the following screenshots:

Now prediction

If you are predicting what is going to happen now (in the current session, this hour, or today), you should choose a feature window that overlaps with the target window. We call this the Now prediction. The feature window is calculated for every customer based on the target timestamp.
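A per-customer feature window anchored on the target timestamp could look like the sketch below. The 7-day lookback, the function names, and the choice to end the window exactly at the target timestamp are assumptions; the source only states that the window is derived from the target timestamp.

```python
from datetime import datetime, timedelta

def now_feature_window(target_ts, lookback=timedelta(days=7)):
    """Feature window for one customer, ending at their target timestamp."""
    return target_ts - lookback, target_ts

def events_in_window(events, window):
    """Keep the (timestamp, name) events that fall inside [start, end)."""
    start, end = window
    return [(ts, name) for ts, name in events if start <= ts < end]

# Hypothetical customer whose target event happened on 20 May at 15:00.
events = [
    (datetime(2024, 5, 1, 9), "session_start"),
    (datetime(2024, 5, 20, 14), "view_item"),
    (datetime(2024, 5, 20, 15), "campaign"),
]
window = now_feature_window(datetime(2024, 5, 20, 15))
recent = events_in_window(events, window)
```

Because every customer has a different target timestamp, every customer gets a different window, unlike the fixed calendar ranges of the Window prediction.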

To set up this template in the app, replicate the following screenshot:

Advanced settings

The tab includes the following advanced settings:

  • Target offset allows you to define when the model should stop taking the relevant events into account prior to the target event (applies only to the customers who fulfill the target).
  • Floating time window allows the model to take into account relevant events that occurred before the start of the time window if the target was tracked at the beginning of the time window (when the preceding data would not normally be available to the model).
  • Dataset balancing should be activated when the data is too unbalanced (when the target rate is too high or too low).
  • Data validation specifies what percentage of the data is used to create the model and what percentage is used to test it. In other words, it divides the data between training and testing datasets.
  • Prediction change over time means that the prediction stored in the customer property will not be the percentage chance of reaching the target but the percentage difference between what the chance was in the past and what it is now. This feature is rarely used.
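To make Data validation and Dataset balancing more concrete, here is a minimal sketch of one common way to do both: a random train/test split followed by downsampling the majority class in the training part only. The 80/20 share, the downsampling strategy, and the function name are assumptions, not a description of what the application does internally.

```python
import random

def split_and_balance(rows, train_share=0.8, seed=0):
    """rows: list of (features, label) tuples with label 0 or 1."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    train, test = shuffled[:cut], shuffled[cut:]
    # Balance only the training part by downsampling the larger class;
    # the test part keeps the real target rate so evaluation stays honest.
    positives = [r for r in train if r[1] == 1]
    negatives = [r for r in train if r[1] == 0]
    n = min(len(positives), len(negatives))
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced, test

# Hypothetical dataset with a 10% target rate.
rows = [({"views": i}, 1) for i in range(10)] + [({"views": i}, 0) for i in range(90)]
train_rows, test_rows = split_and_balance(rows)
```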

Algorithm settings

Lastly, you can decide to either use the default algorithm settings or to customize them. The following settings are available:

  • Minimum instances per node allows you to set the minimum number of data instances per decision node in a decision tree.
  • Maximal depth allows you to constrain the depth of a decision tree.
  • Algorithm type allows you to request a specific algorithm.
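The toy decision tree below shows how the first two settings typically act as stopping rules. It is a generic illustration and not the algorithm used by the application: the Gini-based split search, the reading of "minimum instances per node" as a minimum size for each child node, and all names are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(rows, labels, max_depth, min_instances, depth=0):
    """rows: list of numeric feature lists; labels: list of class labels."""
    majority = max(sorted(set(labels)), key=labels.count)
    # Maximal depth: stop splitting once the limit is reached (or the node is pure).
    if depth >= max_depth or len(set(labels)) == 1:
        return {"leaf": majority, "n": len(labels)}
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            # Minimum instances per node: reject splits with an undersized child.
            if len(left) < min_instances or len(right) < min_instances:
                continue
            score = sum(len(s) * gini([labels[i] for i in s]) for s in (left, right))
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return {"leaf": majority, "n": len(labels)}
    _, f, threshold, left, right = best
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left],
                           max_depth, min_instances, depth + 1),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right],
                            max_depth, min_instances, depth + 1),
    }

def tree_depth(node):
    """Number of decision levels below this node."""
    if "leaf" in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

rows = [[1], [2], [3], [4], [5], [6], [7], [8]]
labels = ["a", "a", "b", "b", "a", "a", "b", "b"]
shallow = build_tree(rows, labels, max_depth=1, min_instances=1)
strict = build_tree(rows, labels, max_depth=3, min_instances=5)
```

A lower maximal depth or a higher minimum instance count gives a simpler tree that is less prone to overfitting, at the cost of missing finer patterns.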

Best channel use case

The purpose of the Best channel prediction is to identify the channel which has the highest probability of leading to the desired result (for example, click, conversion, subscription). This will allow you to choose the right channel for each customer, maximizing the effectiveness of your campaigns.

Steps to execute:

  • Create a prediction model
  • Optimal send time as a prediction
  • Create segmentation based on the model
  • Create AB test to figure out what works the best for every segment
  • Evaluate and run

In the example for this section, we are going to predict whether it is more effective to target a customer with a marketing email or a browser push notification.

Create a prediction model

To identify the best channel, we need to create a new custom prediction. To do so, go to Analyses > Predictions > + New prediction > Custom prediction.

Setting up this type of prediction consists of four steps:

  • Target customer filter
  • Eligible customer filter
  • Events, attributes
  • Algorithm settings

Target customer filter

Prediction type

Firstly, you need to define the prediction type of the target. When looking for the best channel, you want to categorize your customers into different segments - according to the channel through which they are most approachable. As you are looking for the best segment, you should choose Multinomial classification as the prediction type.

Filter condition

Secondly, you need to define the combination of events and event attributes that should be the target of your prediction. As you are looking for the best communication channel, you should use the filter to choose events with attributes that match your desired result for each channel.

In our example, we choose the following:

  • campaign >
    • action_type > equals > browser_push_notification
    • status > equals > clicked
  • campaign >
    • action_type > equals > email
    • status > equals > clicked

You should choose the channels which you want to compare and make predictions about. The target in our example is a click after an email or browser push notification is delivered. As we had chosen Multinomial classification before, these will form the basis of the two resulting segments - customers who are more likely to click on an email and those who would rather do so after receiving a browser push notification.
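Expressed as data, the two filter conditions above amount to matching campaign events on `action_type` and `status`. The sketch below assumes each campaign event is available as a plain dictionary; the `TARGETS` mapping and the function name are hypothetical. It returns every matching segment rather than picking one, because how a customer who clicked on both channels is assigned is not covered here.

```python
# One target condition per segment, mirroring the filter in the example.
TARGETS = {
    "browser_push_notification": {
        "action_type": "browser_push_notification",
        "status": "clicked",
    },
    "email": {"action_type": "email", "status": "clicked"},
}

def matching_segments(campaign_events):
    """Return the segments whose condition is met by at least one event."""
    found = set()
    for event in campaign_events:
        for segment, condition in TARGETS.items():
            if all(event.get(key) == value for key, value in condition.items()):
                found.add(segment)
    return found

# Hypothetical customer: clicked an email, only received a push notification.
customer_events = [
    {"action_type": "email", "status": "delivered"},
    {"action_type": "email", "status": "clicked"},
    {"action_type": "browser_push_notification", "status": "delivered"},
]
segments = matching_segments(customer_events)
```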

For the model to be accurate, you should have at least a few hundred customers in each segment. Ideally, they would be equally distributed.

Eligible customer filter

To include only the customers relevant to the prediction, you need to choose the relevant customer filter and its timeframe. In this case, you want the prediction to consider only the delivered campaigns of the relevant action_type. The timeframe should ideally be set at 60 to 90 days.
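The eligibility rule can be read as the check sketched below. It assumes campaign events are dictionaries with `action_type`, `status`, and `timestamp` keys and that delivery is recorded as `status` equal to `delivered`; both are assumptions about the data, so adapt them to what you actually track.

```python
from datetime import datetime, timedelta

# Channels compared in this use case.
CHANNELS = {"email", "browser_push_notification"}

def is_eligible(campaign_events, now, days=90):
    """True if a campaign on a compared channel was delivered in the last `days` days."""
    cutoff = now - timedelta(days=days)
    return any(
        event["status"] == "delivered"
        and event["action_type"] in CHANNELS
        and cutoff <= event["timestamp"] <= now
        for event in campaign_events
    )

now = datetime(2024, 6, 30)
recent_email = [
    {"action_type": "email", "status": "delivered", "timestamp": datetime(2024, 5, 15)},
]
eligible = is_eligible(recent_email, now)
```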

The dataset resulting from the filter should be of the size of at least a few thousand customers so that the model can create accurate predictions.

Events, attributes

To include only the events and attributes relevant to the prediction, you need to choose the relevant events and attributes and their timeframe. In the best channel case, you would like to get the probability that a customer will click through the campaign now, not in the wider time window.

Since you want to get the probability that the customer clicks on the campaign right now, the time window for Events, attributes… needs to overlap the time frame of the target.

This setting means that the feature window is calculated for every customer based on their delivered campaign.

Algorithm settings

The last part is the algorithm setting. Unless you have some special preferences, you can use the default settings.

Check its performance

Before using a model, you always need to evaluate its performance. As explained in the Model quality evaluation section you can make your evaluation based on multiple different metrics. We suggest using F1 for this particular model. If the performance seems satisfactory, continue to the next step. If not, it means that no strong predictions could be derived from the available training data. In such cases, try using a different time frame or a custom template where custom features can be chosen instead.
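For reference, F1 is the harmonic mean of precision and recall, which is equivalent to 2·TP / (2·TP + FP + FN). The sketch below computes it per segment from hypothetical actual and predicted channels; averaging the per-segment values (a macro average) is our assumption, since how the application aggregates F1 across segments is not stated here.

```python
def f1_per_segment(actual, predicted):
    """F1 score for each segment, treating that segment as the positive class."""
    scores = {}
    for segment in sorted(set(actual) | set(predicted)):
        pairs = list(zip(actual, predicted))
        tp = sum(a == segment and p == segment for a, p in pairs)
        fp = sum(a != segment and p == segment for a, p in pairs)
        fn = sum(a == segment and p != segment for a, p in pairs)
        denominator = 2 * tp + fp + fn
        scores[segment] = 2 * tp / denominator if denominator else 0.0
    return scores

# Hypothetical test set of five customers.
actual = ["email", "email", "push", "push", "email"]
predicted = ["email", "push", "push", "push", "email"]
scores = f1_per_segment(actual, predicted)
macro_f1 = sum(scores.values()) / len(scores)
```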

We also recommend going through the resulting decision tree and checking the result of the individual nodes. This can give you not only business insights into probabilities of different groups of customers taking certain actions but also a way of verifying your model. If a certain result seems improbable, you probably fell into the overfitting trap and need to adjust the model.

Create AB test to figure out what works the best for every segment

After the last step, you know the predicted best channel for each customer, but you still need to test it in a scenario so that you can evaluate its effectiveness. You should create an AB test that divides customers into three groups:

  • Variant - best channel
  • Control group (email)
  • Control group (browser push notification)
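In the application, the split is configured in the scenario itself. The sketch below only illustrates the logic of the three groups, using a hash of the customer ID so that the same customer always lands in the same group; the group names, the salt, and the function names are hypothetical.

```python
import hashlib

GROUPS = ["variant_best_channel", "control_email", "control_browser_push"]

def assign_group(customer_id, salt="best-channel-test"):
    """Deterministically map a customer to one of the three groups."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode()).hexdigest()
    return GROUPS[int(digest, 16) % len(GROUPS)]

def channel_for(customer_id, predicted_best_channel):
    """Channel to use: fixed for the control groups, predicted for the variant."""
    group = assign_group(customer_id)
    if group == "control_email":
        return "email"
    if group == "control_browser_push":
        return "browser_push_notification"
    return predicted_best_channel

# Group sizes for 300 hypothetical customer IDs.
group_counts = {name: 0 for name in GROUPS}
for i in range(300):
    group_counts[assign_group(f"customer-{i}")] += 1
```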

Evaluate and run for all relevant customers

After running the scenario for a number of days, you can evaluate whether your best option is sending an email to everyone, sending a browser push notification to everyone, or using the personalized "best channel" derived from the prediction.

To understand whether the difference in the AB test between the variant and the control group is statistically significant, we recommend using our Bayesian calculator.
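To give an idea of what a Bayesian comparison involves, the sketch below estimates the probability that the variant's true conversion rate is higher than the control's. It is a generic Beta-Binomial simulation with uniform priors, not the calculator itself; the function name, the prior, and the example numbers are assumptions.

```python
import random

def prob_variant_beats_control(variant_conv, variant_n, control_conv, control_n,
                               draws=20000, seed=0):
    """Monte Carlo estimate of P(variant rate > control rate)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior of each conversion rate under a uniform Beta(1, 1) prior.
        v = rng.betavariate(1 + variant_conv, 1 + variant_n - variant_conv)
        c = rng.betavariate(1 + control_conv, 1 + control_n - control_conv)
        wins += v > c
    return wins / draws

# Hypothetical results: 150 of 1000 clicks for the variant, 100 of 1000 for a control.
p_better = prob_variant_beats_control(150, 1000, 100, 1000)
```

With three groups, you would run the comparison once against each control group.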

Based on the evaluation, the final adjustment of the campaign can be made so that real-time predictions will decide whether a web layer is displayed to a certain customer.

Updated 3 days ago

