Our client is a recognized leader in retail, with thousands of outlets worldwide. They plan to accelerate organic growth in Europe with +40 new outlets per year in the coming years, which means orchestrating hundreds of millions in investment. To do so, they want to select the best locations for these new outlets by combining the business expertise of their network developers with an AI-based tool. Our goal was to enable them to build local market studies and to forecast the expected year-1 turnover of a new shop.
Our client wanted a solution based on an industrialized pipeline that would give them rapid access to results. In practice, this meant a web app for the client’s network developers and business analytics teams, in which they can select any outlet in a given country to monitor its performance and visualize KPIs such as competition, locality and socio-demographic indicators. They also wanted to be able to create an “outlet project” anywhere on the country map and obtain a local market study covering many indicators, together with a forecast of the outlet's year-1 turnover and the expected cannibalization impact on existing outlets in the selected area.
However, the client did not have a data lake centralizing all the necessary data. At first, it was therefore not possible to create automated data flows, a major obstacle when building a fast, responsive solution. Data quality was also poor, particularly for descriptive information on outlets, so we needed to collect external data to enrich the client’s view.
For business interpretability, we also wanted to avoid a black-box algorithm, a common pitfall in this type of sophisticated technical project.
Our core conviction is that breaking silos, internally and externally, is a supercharger of efficiency and agility. At Ekimetrics, we hire very different profiles so that we can create multi-disciplinary teams for our clients and build an integrated approach for every project. That’s exactly what we did for this retail client. We gathered data scientists, data engineers, full stack developers and design thinking experts in a single team dedicated to this project.
We started with a design thinking phase to make sure we deeply understood the client’s needs and expectations from day 1. This step is crucial to guarantee business relevance at the end of the project, and to avoid ending up with a sophisticated AI tool that is disconnected from the ground teams’ workflow. It also helped us build a better app, with a better design and a more user-friendly interface, which is crucial for adoption across the business.
We then collected and consolidated data: internal data, third-party data and open data available on the internet (demographics, flows, behaviors etc.). All collected data was stored in a MongoDB database. We chose this solution for its speed and its ease of access for both the web application and the machine learning algorithms.
We then developed a machine learning algorithm that takes most of the collected data as input to forecast the turnover of new shops in year 1. To do so, we combined state-of-the-art algorithms and technologies to obtain the best model in terms of both prediction quality and speed to production. We used a LightGBM-based algorithm to reach a comfortable level of prediction quickly and efficiently, and MLflow to track and compare runs while tuning its hyperparameters. All the exploration and training phases were developed on Databricks. This platform deeply integrates all MLflow modules, provides collaborative notebooks to speed up teamwork and allows ML models to be put in production in a few clicks.
To promote business interpretability, we also added modules to move away from the black-box effect and bridge the gap to business decision making. We used Shapley values, a game-theoretic approach, to explain the outputs of the ML model by attributing each forecast to the contributions of its input features. These values were also made available in the final web app to ensure direct and painless interpretation. That was a key success factor, helping to convince the field teams and engage them to use the tool after the final delivery.
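To make the idea concrete, here is a minimal sketch of how Shapley values split a forecast into per-feature contributions. It is not the project's implementation: `toy_turnover_model`, the feature names and all numbers are hypothetical, and the exact enumeration below is only practical for a handful of features (production setups typically rely on a dedicated library and approximations).

```python
from itertools import combinations
from math import factorial


def toy_turnover_model(features):
    """Hypothetical stand-in for the trained model (arbitrary turnover units)."""
    return (
        500.0
        + 0.02 * features["population"]
        - 40.0 * features["competitors"]
        + 0.001 * features["population"] * features["footfall_index"]
    )


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a `baseline` outlet.

    A feature "in the coalition" takes its value from `instance`;
    every other feature keeps its `baseline` value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {
            name: instance[name] if name in coalition else baseline[name]
            for name in names
        }
        return model(mixed)

    contributions = {}
    for name in names:
        others = [other for other in names if other != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        contributions[name] = total
    return contributions


# Hypothetical "average outlet" and a candidate outlet project.
baseline = {"population": 20000, "competitors": 3, "footfall_index": 1.0}
project = {"population": 30000, "competitors": 5, "footfall_index": 1.5}

contributions = shapley_values(toy_turnover_model, project, baseline)
for feature, contribution in contributions.items():
    print(f"{feature}: {contribution:+.1f}")
```

The contributions always sum to the difference between the project's forecast and the baseline forecast, which is what lets a business user read a forecast as "baseline plus or minus the effect of each local characteristic".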
Finally, we deployed the solution based on an industrialized pipeline which allowed our retail client's users to access results rapidly. All key users of the solution were involved at each step, from the setup phase to the tool delivery, validating decisions about content, design, KPIs, interface etc.
To ensure the model used the most up-to-date data, we automated the data collection and model retraining. For data collection, we developed an automated scraping robot that runs monthly to collect the open data and the competitors' store-locator data. For training, we also used MLflow to easily create a machine learning pipeline that starts training as soon as new data becomes available and simply keeps the best model after every training loop.
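The "keep the best model after every training loop" rule can be sketched as a champion/challenger comparison. The sketch below is an illustration under stated assumptions, not the project's pipeline: it uses a plain least-squares line instead of LightGBM, makes no MLflow calls, and the names `Candidate`, `fit_line` and `retrain_and_select` as well as the data are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Row = Tuple[float, float]  # (feature value, observed turnover)


@dataclass
class Candidate:
    name: str
    predict: Callable[[float], float]
    validation_mae: float


def mean_absolute_error(predict: Callable[[float], float], rows: List[Row]) -> float:
    return sum(abs(predict(x) - y) for x, y in rows) / len(rows)


def fit_line(rows: List[Row]) -> Callable[[float], float]:
    """Least-squares line; a stand-in for the real training step."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = covariance / variance if variance else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def retrain_and_select(
    champion: Optional[Candidate],
    train_rows: List[Row],
    validation_rows: List[Row],
    run_name: str,
) -> Candidate:
    """Train a challenger on fresh data and keep whichever model scores best."""
    challenger_predict = fit_line(train_rows)
    challenger = Candidate(
        run_name,
        challenger_predict,
        mean_absolute_error(challenger_predict, validation_rows),
    )
    if champion is None:
        return challenger
    # Re-score the champion on the same fresh validation data for a fair comparison.
    champion_mae = mean_absolute_error(champion.predict, validation_rows)
    if challenger.validation_mae < champion_mae:
        return challenger
    return Candidate(champion.name, champion.predict, champion_mae)


# Three hypothetical monthly refreshes of the data.
best = retrain_and_select(None, [(1, 2), (2, 4), (3, 6)], [(4, 8)], "run-1")
best = retrain_and_select(best, [(1, 3), (2, 6), (3, 9)], [(4, 12)], "run-2")
best = retrain_and_select(best, [(1, 3), (2, 7), (3, 9)], [(4, 12)], "run-3")
print(best.name, best.validation_mae)
```

Re-scoring the current champion on the newest validation data, rather than reusing its old score, is a design choice of this sketch: it keeps the comparison fair when the market has shifted since the champion was trained.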
Fortunately, thanks to prior projects conducted with the same client, we had developed deep business knowledge of the retail company, which helped us to maintain the right level of interaction whilst keeping an eye on the strategic north star.
It took 3 months to set up a useful, usable and used data science tool for this client: we delivered a complete industrialized solution in 12 weeks, starting from scratch.
Thanks to the design thinking approach and our different methods to foster collaborative work throughout the project, the teams adopted the tool widely: more than 100 shop projects were tested by the client in 30 days.
Finally, the deployed architecture allows the client to produce local market studies and forecasts in less than 1 second, once the characteristics of the shop have been entered in the application.