A4E Case Team Mentors:
- Alexander Efremov (email@example.com)
A4E Case Team:
- Gergana Damyanova (firstname.lastname@example.org), Irena Lazarova (email@example.com), Stanislav Georgiev (firstname.lastname@example.org), Martin Boyanov (email@example.com), Doychin Damyanov (firstname.lastname@example.org), Vladimir Vutov (email@example.com)
Tools:
- Python: GenSim, Keras, TensorFlow, scikit-learn (Random Forest regression), PyWavelets
- R: Apriori
- IBM SPSS Modeler
- SQL Server Express 2016 (SQL), Excel
Who: Retail Client
What (1): Optimal Recommendation for Combined Offer (CO) for next week
What (2): Market Basket Analysis (MBA)
- Explore datasets' structure
- Discover potential variable dependencies across datasets
- Identify the subsets of data that modelling would be based on
- Raw Data:
- Dataset 1 --> Data Type: Transactions --> Data Format: CSV
- Dataset 2 --> Data Type: Weather Data --> Data Format: CSV
- EDA + ETL + Prep:
- Remove: Duplicates and data waste based on business rules
- Aggregate: Product Total Sales per Day
- Attach: Daily weather data
- Variable reduction: removed low-variance variables and predictors weakly correlated with the dependent variables (for Sales Volume Prediction Approach 2)
- For the target forecast, two approaches were adopted: forecasting on weekly aggregated time intervals and on a daily basis. Outliers and extreme values were identified, and we verified that no variable had a large share of missing or outlier values. For Sales Volume Prediction Approach 2 the dataset was split 70%/30% into training/testing sets
- Produce up to 24 random permutations of each transaction to increase the size of the dataset needed for Neural Network modelling
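The permutation step above can be sketched as follows. This is a minimal stand-in (the helper name and toy basket are my own, not the team's code): since a transaction is an unordered basket, shuffling item order yields extra training sequences for the sequence-based model without changing basket content.

```python
import random

def augment_transactions(transactions, max_perms=24, seed=42):
    """Return up to `max_perms` distinct random orderings of each transaction.

    Hypothetical helper illustrating the augmentation step described in the
    write-up: duplicate orderings are dropped, so short baskets yield fewer
    than `max_perms` permutations.
    """
    rng = random.Random(seed)
    augmented = []
    for basket in transactions:
        seen = set()
        for _ in range(max_perms):
            perm = basket[:]
            rng.shuffle(perm)
            key = tuple(perm)
            if key not in seen:  # keep only distinct orderings
                seen.add(key)
                augmented.append(perm)
    return augmented

baskets = [["espresso", "melba1", "sundaeYogurt2"]]
aug = augment_transactions(baskets)  # a 3-item basket has at most 3! = 6 orderings
```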
- Filter the data frame to the last month, extract the cake-piece products, identify the most frequently appearing drink product alongside them, and normalize the data to mean 0 and variance 1 using the "scale" function (for the Graphical Models approach)
- Step 1 --> Market Basket Analysis (MBA):
- Used the Apriori algorithm in R --> convert the data into basket form based on the saleID identifier and find the association rules by specifying minimum values for support (0.5%) and confidence (50%). Output of this step --> 45 association rules satisfying the support and confidence constraints, with lift ranging from 1.6 to 17.7. Greater lift values indicate stronger associations.
- Based on the client's interest in the cakePieces category, we filtered the association rules down to 6 rules with lift between 1.9 and 2.4
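The team ran Apriori in R; purely to illustrate what the support, confidence and lift thresholds above mean, here is a small self-contained Python sketch of the rule metrics (the toy transactions and function name are hypothetical):

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute (support, confidence, lift) for the rule antecedent -> consequent.

    support    = P(antecedent and consequent)
    confidence = P(consequent | antecedent)
    lift       = confidence / P(consequent); lift > 1 means a positive association.
    """
    n = len(transactions)
    a, c = frozenset(antecedent), frozenset(consequent)
    both = sum(1 for t in transactions if a | c <= set(t))
    only_a = sum(1 for t in transactions if a <= set(t))
    only_c = sum(1 for t in transactions if c <= set(t))
    support = both / n
    confidence = both / only_a if only_a else 0.0
    lift = confidence / (only_c / n) if only_c else 0.0
    return support, confidence, lift

tx = [["cakePiece1", "mojito1"], ["cakePiece1", "mojito1"],
      ["cakePiece1"], ["mojito1"], ["espresso"]]
s, c, l = rule_metrics(tx, ["cakePiece1"], ["mojito1"])
# s = 2/5, c = 2/3, l = (2/3)/(3/5) = 10/9
```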
- Step 2 --> Sales Volume Prediction --> Approach 1:
- The sales-volume predictors are various weather-condition and calendar features (working days etc.): date, minTemp, maxTemp, avrTemp, windSpeed, wind16dir, precipit, humidity, pressure, cloudCover, FeelsLike, workDay, bankHoliday, workOff
- Currently the regressor used is Random Forest regression with the usual hyperparameters.
- The sales volume is decomposed using the stationary wavelet transform (SWT) with sym14 at level 1, which produces one low-frequency and one high-frequency component. These two signals are fed into two separate Random Forest regressors; together the two trained models represent the overall sales-volume model.
- At prediction time, the weather conditions for the desired period are passed to the two models; the two predicted signals are then combined via the inverse SWT to reconstruct the predicted sales volume.
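A minimal end-to-end sketch of this decompose/train/reconstruct loop, using PyWavelets and scikit-learn (both in the project's toolset). The data here is synthetic; the real features are the weather and calendar variables listed above, and in real use the models would predict on held-out future features rather than the training rows used below:

```python
import numpy as np
import pywt
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for daily sales and weather features; 64 days keeps the
# series length even, as a level-1 SWT requires.
rng = np.random.default_rng(0)
days = 64
X = rng.normal(size=(days, 5))  # e.g. temp, wind, humidity, ...
sales = 100 + 10 * np.sin(np.arange(days) / 7) + rng.normal(scale=2, size=days)

# 1) Decompose sales with the stationary wavelet transform (sym14, level 1):
#    one low-frequency (cA) and one high-frequency (cD) component.
(cA, cD), = pywt.swt(sales, "sym14", level=1)

# 2) Train one Random Forest per component.
rf_low = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, cA)
rf_high = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, cD)

# 3) Predict both components from the feature matrix and invert the SWT to
#    reconstruct the sales-volume forecast.
pred = pywt.iswt([(rf_low.predict(X), rf_high.predict(X))], "sym14")
```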
Here the regressor could instead be an SVR with an RBF kernel, which is harder to tune but could give somewhat better results, or another ensemble predictor similar to Random Forest.
The wavelet transform itself could also be tuned: more decomposition levels, different filters, etc.
For volatile product sales it would be a good idea to tune separate regressors for the low- and high-frequency decompositions.
For example, with an RBF-kernel SVR, epsilon and C should be set differently for the low- and high-frequency signals.
- Step 2 --> Sales Volume Prediction --> Approach 2:
- Modeling technique: the Expert Modeler option (Time Series modelling node) in SPSS Modeler v15, which automatically finds the best-fitting model for each dependent series. A 95% confidence interval was chosen. For each of the 11 target variables, apart from varying the time interval, model tuning was done by changing the input explanatory variables and using the node's automatic outlier detection. The weekly time-series approach was not successful: no significant models were created. A set of ARIMA-based models was finalized.
- Neural Network modelling
- A transaction can be seen as a stream of items.
This allows us to apply various models that tackle NLP problems. We were interested in trying out some deep learning techniques.
- Run word2vec on the transactions.
Word2vec is a popular technique for mapping words to a continuous vector space that is supposed to carry some semantic interpretation. We applied word2vec to map the products to such a vector space; the resulting vectors for similar products are close to each other.
- Feed these vectors to a Recurrent Neural Network with the objective of predicting the last item in the transaction.
- Graphical Models modelling
- Calculate partial correlation
- Plot the links on a tree structure (Reingold-Tilford layout)
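The partial-correlation step can be sketched in a few lines of NumPy (the R workflow used `scale` plus a partial-correlation routine; this Python version is my own illustration of the same computation via the precision matrix):

```python
import numpy as np

def partial_correlation(X):
    """Partial-correlation matrix via the inverse covariance (precision) matrix.

    After standardizing each column to mean 0 and variance 1 (R's `scale`),
    pcorr[i, j] = -P[i, j] / sqrt(P[i, i] * P[j, j]), where P is the precision
    matrix: the correlation of i and j with all other variables partialled out.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # mean 0, variance 1
    P = np.linalg.pinv(np.cov(Xs, rowvar=False))       # precision matrix
    d = np.sqrt(np.diag(P))
    pcorr = -P / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))  # stand-in for the product sales columns
pc = partial_correlation(X)
```

The off-diagonal entries of `pc` are the link weights that would then be laid out on the Reingold-Tilford tree.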
- MBA provides a set of association rules which, combined with a frequency analysis of items sold, helps the business choose a specific combined offer.
- Modelling Approach 1: Most suitable for clients interested in long-term CO recommendations and larger datasets (which could be expanded with artificial data). Allows quick model development and relatively easy deployment.
- Modelling Approach 2: Most suitable for clients interested in short-term CO recommendations. The ARIMA models in SPSS are relatively fast to develop, which matters when the customer product catalogue changes and the MBA is updated; for most of the forecasted target variables, the models show good results in the short term.
Neural Network Approach: It is evaluated against the results of the Market Basket Analysis. Given the antecedent of a rule, the neural network predicts the consequent ~52% of the time.
The word2vec vectors are evaluated manually. Overall, items that humans perceive as similar are also similar in the word2vec model:
melba1 is similar to ('sundaeYogurt2', 0.6426471471786499) ('sundaeYogurt1', 0.5623082518577576) ('melba2', 0.47845354676246643)
mojito2 is similar to ('mojito3', 0.7359622716903687) ('mojito1', 0.6873572468757629)
burger is similar to ('sandwich7', 0.5521785616874695) ('sandwich6', 0.5061086416244507)
All the similarities can be found here: https://drive.google.com/file/d/0BxYLkQRqdXrcTVd1Y01XVUNEeTA/view?usp=sharing
It could be deployed as a self-service SaaS application.
Client loads data via a secured web-based form --> data is ingested and validated through an ETL cycle --> data is fed to the recommendation engine --> the engine outputs recommendations in the client's account dashboard --> the client decides whether to act on the recommendation
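The ingest-and-recommend stages of that flow can be sketched as a pair of stub functions; everything here (names, business rule, pair-counting placeholder for the full MBA/forecast engine) is hypothetical scaffolding, not the production design:

```python
from collections import Counter

def ingest(raw_rows):
    """ETL stub: drop exact duplicates and rows failing a basic business rule."""
    seen, clean = set(), []
    for row in raw_rows:
        key = tuple(sorted(row.items()))
        if key not in seen and row.get("qty", 0) > 0:
            seen.add(key)
            clean.append(row)
    return clean

def recommend(clean_rows):
    """Engine stub: most frequent product pair stands in for the real
    MBA + sales-forecast recommendation logic."""
    by_sale = {}
    for row in clean_rows:
        by_sale.setdefault(row["saleID"], set()).add(row["product"])
    pairs = Counter()
    for items in by_sale.values():
        for a in items:
            for b in items:
                if a < b:
                    pairs[(a, b)] += 1
    return pairs.most_common(1)

rows = [
    {"saleID": 1, "product": "cakePiece1", "qty": 1},
    {"saleID": 1, "product": "mojito1", "qty": 1},
    {"saleID": 2, "product": "cakePiece1", "qty": 1},
    {"saleID": 2, "product": "mojito1", "qty": 1},
]
top = recommend(ingest(rows))  # the pair seen together most often
```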