Why should we be building data products?

Oct 03, 2022

🧮 Why should we be building data products? They enable our organisation to reuse the blocks we have built over the years and empower any data analyst to serve our customer base without requiring advanced software engineering skills.

☁️ Imagine pairing an existing ETL process with a segmentation model to power a churn dashboard that is available to anyone in the company and that anyone can update at will, or to enable personalised newsletter recommendations targeting customers at risk. All of this is built from existing blocks originally made for internal, ad-hoc consumption. That is the idea of data products.

💡 Data products mean going live: releasing back-office tools and customer-facing features powered by the data, modelling and analysis carried out by the data organisation. It means going beyond providing an API endpoint to offering a user interface for engaging with our data. It also means that data scientists, analysts and machine learning practitioners can focus on their strengths, modelling and prediction, without requiring software engineering skills.

📈 Data products are the natural evolution of ad-hoc models and dashboards. Something every data organisation has in common is that we repeat our work regularly. How often are we asked to update a report or a model? And nobody besides us can do this update. The data and the deliverable remain siloed, and as soon as we leave the organisation, someone must start from scratch to continue our task. With a data product, this process is systematised and does not require manual updates. At most, one block of the process must be updated or replaced, not the whole system and method. Data products are the natural evolution of self-serve analytics.

💸 Depending on whom you believe, data products are a 300bn market. We should assume that everyone is building a data product. Yet their development remains a challenge. Deploying a predictive model in production is, on its own, a non-trivial task. To this day, most data scientists remain unable to bring their models into a live system. And let us not mention the millions of data analysts who excel in modelling but do not have software engineering skills, or who work with tools such as R or SPSS that are not suited to general software development.

In this regard, native warehouse apps are a significant enabler for building data products.

How do you deploy your data artefacts today? Are you already developing data products?
