Why Data + Model + Users define AI products.

Think about these first. They build on each other.

DataLife360 and Matt Blasa

Mar 26, 2024

Hi there, it’s Matt! This is part of a larger series on AI products. Also check out our AI strategy thinking series.

Lately, I’ve consistently run into the assumption that the AI model is the product. Both technical and business teams make this assumption. It usually doesn’t end well.

There’s a big drive (much of it FOMO) to add AI to a product. That isn’t enough. It creates a product about AI rather than one built around a business or customer need. I think that holds a lot of potential AI products back.

But if AI isn’t the product, where do you get started? Start with the data. Then look at the models. Then the people who will use it.

Today, we cover four helpful concepts:

Data + Model define AI products. From idea to execution, data and the model define your AI product.
Think about data. Data determines the potential of your AI products.
Think about AI models. How models process data affects how effective they are for a user.
Think how a user interacts with your AI Product. Data and AI models alone don’t make a product useful for the user.

Let’s check these out.

1. Both Data + Model define AI products.

A common misconception is that the AI model is the product. It’s so common that it leads to a lot of AI products failing. It’s an easy assumption to make, since stakeholders see the model’s outputs and assume the model is the product.

AI needs data - for training, inference, and evaluation.

You can’t build an AI model without the data. You can’t have an AI product without a model of some sort. You can have neither without data.

An AI product at its core is:

AI Model + Data

An AI product is the car - it needs fuel and an engine.

Data is the fuel.

Data creates the foundation for an AI product. You need data to train. You also need data to keep making predictions or generating outputs. If the AI model is the engine, it needs fuel.

The dataset on which an AI model is trained fundamentally shapes the capabilities of the resulting AI product. Here’s why:

Context: Datasets provide the context that AI models learn from. They reflect the knowledge, patterns, and use cases of the specific domain or problem an AI product addresses.
Diversity: The quality, diversity, and relevance of the dataset directly impact the AI product. Good datasets are robust and well-curated. They lay a solid foundation for AI to solve business problems and perform its tasks well.
Models Reality: The datasets for an AI model are what we think represents reality. It’s a hypothesis - an educated guess. You curate and prepare data that reflects real-life situations for the AI product. It’s data that is relevant to the predictions, insights, or outputs the product generates.

The data is the key element in an AI product. You must make sure that data is available for teams to train models. You must also make sure you can build a model with it.

It keeps the engine of the AI product - the model - running.

The AI model is the engine.

The model defines the essence of an AI product. Its responses and generative output define the core value proposition of an AI product.

Value lies in three key areas:

Training: Models determine how training occurs, from feature extraction and representation learning to the optimization of objectives. The way a model is trained affects its ability to make predictions or generate outputs. After data, it determines an AI product's effectiveness and experience.
Pattern Recognition: AI models identify, learn, and generalize patterns within datasets. They act as interpreters, translating raw data into actionable insights or decisions.
Adaptability and Specialization: The model's architecture allows for the adaptation to the specifics of the dataset it's trained on, enabling the AI to specialize in the task at hand. This adaptability makes the model an effective conduit between the foundational data and the desired AI capabilities.

These three factors affect AI product quality. So does how well you optimize them. Like an engine, the product can speed up, slow down, or stop depending on how well it’s tuned.

The model isn’t static. Like an engine, it needs maintenance - retraining. How often a model in an AI product needs retraining is a major cost factor.

Think carefully about both data and models before you build.

Let’s dig into the importance of data a bit.

2. Think about your data.

Data dictates the potential of your AI products. Assumptions about data and its context affect the final AI product that users interact with.

Air Canada told it is responsible for errors by its website chatbot | Vancouver Sun — Data can give different AI product experiences

How the data is interpreted by an AI model matters. Air Canada recently had a chatbot that made up a refund policy it was ultimately forced to honor.

Air Canada’s chatbot gave wrong responses. It didn’t understand the data about the refund policy.

Technically, the model worked. The LLM produced a coherent response. But the context about refunds wasn’t included. The data wasn’t in a useful (viable) state - it hadn’t been defined or explored.

Explore the data and define its impact. Think about the different aspects that affect the product.

Here are the key ones I’ve run into over the last 6 months:

Different datasets create different AI products.

This is a big one I keep seeing. A common request I get is “Can’t you add more data to make the model work?”. This doesn’t help AI products because:

More data doesn't mean a better fit for the product's purpose.
Wrong data can lead to a less effective product.
AI models can only learn so much; extra data doesn’t always give lift.
Too much data can make the product harder to use.
What makes a product unique is a focus on user needs first, then data - not just how much data it has.

A distinct dataset trains a unique AI model. Using varied datasets to train a model will create different product experiences.

The dataset shapes the model. As an AI model learns from a dataset, it uncovers the patterns unique to that data. So, the choice of data can transform the model - and the product. When product functionality changes, so do its product-market fit and its users.

Take for example, a voice recognition AI product used in smart speakers.

If we train the model on a dataset of English commands, it can understand and do tasks for English speakers.

But if we train the same model on a dataset of Spanish commands, it becomes a product for Spanish speakers.

The AI model’s algorithm might be the same. But different datasets lead to two distinct products. They focus on unique customer segments and markets.
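Here’s a minimal sketch of that idea in Python. It is a toy, not a real speech model: the `train` and `predict` functions, the intent names, and the two tiny command lists are all hypothetical, made up for illustration. The point is only that the same training code, fed two different datasets, gives you two models that serve two different sets of users.

```python
from collections import Counter


def train(commands):
    """Learn which words signal which intent from (text, intent) pairs."""
    word_counts = {}
    for text, intent in commands:
        word_counts.setdefault(intent, Counter()).update(text.lower().split())
    return word_counts


def predict(model, text):
    """Return the intent whose training words overlap most, or None."""
    words = text.lower().split()
    scores = {
        intent: sum(counts[w] for w in words)
        for intent, counts in model.items()
    }
    best = max(scores, key=scores.get)
    # If no word was ever seen in training, the model has nothing to go on.
    return best if scores[best] > 0 else None


# Hypothetical toy datasets; a real product would need far more data.
english_commands = [
    ("turn on the lights", "lights_on"),
    ("switch the lights on", "lights_on"),
    ("play some music", "play_music"),
    ("play my playlist", "play_music"),
]
spanish_commands = [
    ("enciende las luces", "lights_on"),
    ("prende las luces", "lights_on"),
    ("pon música", "play_music"),
    ("reproduce mi lista", "play_music"),
]

# Same algorithm, two datasets, two different products.
english_model = train(english_commands)
spanish_model = train(spanish_commands)

print(predict(english_model, "turn on the lights"))
print(predict(spanish_model, "turn on the lights"))
print(predict(spanish_model, "enciende las luces"))
```

The English model handles the English command, while the Spanish model returns nothing for it because it never saw those words. Nothing in the code changed between the two. Only the data did.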

To find product-market fit (PMF), datasets must be curated by use case - like speakers of different languages. This increases their product value.

Curated datasets determine AI product potential.

If data fuels AI products, then datasets are an asset. They determine what you can do with AI.

Curated datasets improve product value and product-market fit, and they enhance offerings. They may even create features or products that didn’t exist before.

Here’s a good example from Aakash Gupta

Product value increases if you can iterate on datasets as markets, user needs, and technological advancements evolve. As I mentioned before, data sets model what we hypothesize is reality. In this case, market reality.

Curated datasets like this have advantages in a competitive market.

Vin Vashishta mentions this:

Each new curated dataset makes the business more valuable and increases the opportunities it can take advantage of.
What’s novel about data is that it can be used multiple times. One dataset powers multiple products without ever being depleted. The more use cases and models it supports, the more valuable it is.

This means that one dataset can enable more than one AI product. This is powerful. In the military, we called this “economy of force”.

Economy of force is about using resources wisely to achieve multiple goals. Think of it as having a curated dataset that can enable multiple product lines.

This approach is efficient because you’re getting more value from the same resource. It’s like a key that unlocks many doors - and you haven’t found all the doors yet. This does wonders when you’re building AI product roadmaps and iterating on datasets.

It gives you strategic flexibility and maximum impact across different departments, revenue streams, and profit centers.

As long as your datasets are updated and curated continuously, and are founded on a good problem scope and use cases, AI products become a major product strategy asset. They create adaptable products - both data and AI. You can meet diverse customer needs and market opportunities.

3. Think about your models.

Think of AI models like sports equipment. Each is designed for a specific game. Imagine using a basketball for soccer. It doesn’t work, because each piece of gear is specialized.

Models in an AI product are similar. Trying to use a model trained on one dataset with a new dataset that has different features? That never ends well. You have to think about the model’s assumptions.

But what other AI model factors create a unique product?

Different technical frameworks. Model frameworks differ in processing speed, integration, and adaptability to new problems. Those differences affect the support requirements, customer experience, and effectiveness of a product.
Optimization method. Just like a coach might focus on drills to perfect specific skills, AI optimization focuses on tuning specific parts of the model to improve performance. The more we optimize, the more it can affect the cost, profitability, and usefulness of an AI product.
Feature selection. Like key drills to enhance vital skills, feature selection zeroes in on crucial data attributes to boost AI model performance. Different model features mean a different AI product, or at least different output.
Specialization. Think about what the model will be used for. Think about the data that it needs. Specialized data creates specialized AI product requirements, costs, and features. For example, a multimodal model will have more requirements than an image recognition model. More datasets don’t mean a better product.
Training styles. These are like unique coaching strategies in sports, which lead to different play styles. How a model is trained can define how well it performs and adapts to new or unseen data. Retraining can get expensive if the model isn’t trained well.

Each can drastically change the AI product. To minimize impact to AI product performance, work with the data science and engineering teams. Try to understand the effects on the final product.

As a data scientist who’s built several models, I’ve seen each of these factors really affect a product, even when the data and models were the same. Understand how your data science teams version their models and data.

Track these. Work with your MLOps engineers and data scientists to save models, training data, and data sources, as well as documentation about the model. Think about how you deploy a model, and the resources it uses.
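To make the tracking idea concrete, here’s a rough sketch of what a version record could look like. Everything in it is an assumption for illustration: `ModelRecord`, `fingerprint`, `register`, the field names, and the example values are hypothetical, and a real team would more likely use a dedicated model registry than a Python dictionary. What it shows is the habit: tie each model version to a fingerprint of the exact data it was trained on, plus where that data came from.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def fingerprint(rows):
    """Hash the training rows so any change to the data changes the ID."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class ModelRecord:
    model_name: str
    model_version: str
    data_fingerprint: str
    data_sources: list
    notes: str = ""


def register(registry, record):
    """Store the record under name:version and refuse silent overwrites."""
    key = f"{record.model_name}:{record.model_version}"
    if key in registry:
        raise ValueError(f"{key} is already registered")
    registry[key] = asdict(record)
    return key


# Hypothetical training data for two versions of the same model.
rows_v1 = [{"text": "turn on the lights", "intent": "lights_on"}]
rows_v2 = rows_v1 + [{"text": "play some music", "intent": "play_music"}]

registry = {}
record = ModelRecord(
    model_name="voice-intents",
    model_version="1.0",
    data_fingerprint=fingerprint(rows_v1),
    data_sources=["hypothetical_command_logs"],
    notes="baseline",
)
key = register(registry, record)
print(key, registry[key]["data_fingerprint"])
```

If the training data changes, the fingerprint changes, so you can tell at a glance whether two model versions were trained on the same thing. That helps when a product starts behaving differently and you need to find out why.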

These go a long way if you have an AI product. They affect the profitability, overhead costs, and the life span of the model. Poor AI products can sunset very quickly because the model needs frequent retraining, new unique data, etc.

4. Think how your user interacts with the AI Product.

Models and data are important. But so is the user. Think about how users will interact with it. A technically sound solution can provide very different user experiences. Not all of them positive.

Here’s a very good example of an AI product that didn’t consider user interaction:

Did A Man 'Hack' Chevrolet's AI Chatbot To Buy A Car For $1?

The user interaction wasn’t tailored to the customers. Guardrails and parameters weren’t defined. The dealership had connected to the GPT API without thinking about the user experience. That creates huge financial and reputational risks.
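As a rough illustration of what a guardrail can look like, here’s a small sketch that checks a chatbot reply before the customer sees it. To be clear, this is not how that dealership or any vendor does it. The patterns, the fallback message, and the `apply_guardrail` function are hypothetical, and simple pattern matching like this is crude. A production system would need much more. It only shows the principle that the model’s raw output shouldn’t go straight to the user.

```python
import re

# Hypothetical patterns for commitments a sales chatbot
# should never make on its own.
BLOCKED_PATTERNS = [
    re.compile(r"\$\s?\d"),  # any quoted dollar amount
    re.compile(r"legally binding", re.IGNORECASE),
    re.compile(r"\b(deal|offer)\b", re.IGNORECASE),
]

FALLBACK = (
    "I can't confirm pricing or offers here. "
    "Let me connect you with a team member."
)


def apply_guardrail(reply):
    """Return the reply if it passes every check, else a safe fallback."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(reply):
            return FALLBACK
    return reply


risky = "That's a deal, and that's a legally binding offer: $1."
safe = "This SUV seats up to eight people."

print(apply_guardrail(risky))
print(apply_guardrail(safe))
```

The risky reply gets swapped for the fallback, and the harmless one passes through unchanged. Deciding what belongs on the blocked list is a product decision as much as a technical one.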

So, an AI product isn’t just model and data alone. Those create the foundations for an AI product. It’s also how we interact with the model’s outputs and responses. It’s not software - the code isn’t the product.

The user interaction is how they experience the responses from an AI product. At the end of the day, the user has to interact with predictive and generative outputs from an AI model. Raw data isn’t useful alone.

So you have to constantly think about user experience and operational support. Make sure to:

Scope use cases and user stories to understand the problem, data, and solution clearly.
Ensure the product interacts effectively with users, as humans cannot engage with raw data directly.
Design the interaction carefully, focusing on the platform, user interface, and the overall user journey.
Support the product with operational processes to maintain its performance in production.

These are by no means comprehensive. Interaction isn’t a magic wand, but you need to make sure it’s user-friendly. A sound AI model or curated data doesn’t mean much without that. Experiences can make or break a product.

You’re not only gaining a customer. You’re creating an experience that wows them - and keeps them coming back.

AI products may be built around models and data. But how they interact with users drives their adoption.

Final Thoughts

Recognize the clear connection from the dataset to the model to the AI product's outcomes. This understanding influences the product's success.

Creating great AI products requires a blend of technical expertise and product knowledge. Strive to develop products that are user-focused, dependable, flexible, and trustworthy.