Hi there, it’s Matt! This is part of a larger series on AI product. Also check out our AI strategy thinking series.
Sora has shown us the future of AI. The future is multimodal.
Sora is a proof of concept. But it does point the way to new and different types of AI products and product features.
This is a big opportunity for product managers. They’ll be needed to define product vision, shape roadmaps, and assess risk as companies build multimodal AI products.
Product managers, along with strategists and architects, will manage the entropy. They set the vision, build the frameworks, and tie it all to the larger business model. We need operational roles to build multimodal products - and an operating model to face disruption.
Disruption is coming to several industries. Games, marketing, news, education, and social media will be disrupted. Disney and other companies are already trying to use generative AI to simplify processes from animation to advertising.
Last year, Coca-Cola led the way with a commercial built using generative AI. Now, OpenAI’s Sora is showing the full disruptive potential of multimodal AI products.
Product managers must consider the potential of multimodal AI. They need to rethink how they approach user experiences and product planning, and how to face this typhoon of change.
How will the disruption happen?
Disruption from Gen AI tech like OpenAI’s Sora or Nvidia’s RTX won’t happen overnight. But they are proofs of concept (PoC) for product features that will be integrated into future AI products.
Profitability of generative AI products will lie in enabling these as product features, augmenting existing needs, or enhancing user experiences. Sora is a proof of concept for enhancing user experiences. RTX is a PoC for augmenting existing manual processes.
Big disruptions are business model disruptions. They challenge existing products and processes and make them more efficient and personalized.
Uber and Airbnb didn't invent new products. Taxis and accommodations already existed. They disrupted the processes of how these services were delivered, making them more accessible, convenient, and user-friendly through technology platforms. Process innovation created massive shifts in their respective industries, challenging traditional business models and setting new standards for customer expectations.
Just like Uber and Airbnb revolutionized their sectors by enhancing service delivery through technology, multimodal AI can iterate on existing product offerings, making them more intuitive, personalized, and efficient.
AI-powered multimodal products integrate various data inputs, such as voice, visuals, and text. This allows the products to understand and respond to user needs more effectively. It leads to products that are not just tools but enhancers and augmenters of user experience, dynamically adapting to individual preferences and contexts.
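To make that idea concrete, here’s a minimal sketch of what "integrating various data inputs" might look like at the product layer. Everything here is hypothetical (the `MultimodalRequest` type and `pick_strategy` function are illustrative names, not any real API): the point is simply that a product can inspect which modalities a user supplied and adapt its response strategy accordingly.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical container: a single user request may carry any mix of
# text, image, and audio inputs.
@dataclass
class MultimodalRequest:
    text: Optional[str] = None
    image_bytes: Optional[bytes] = None
    audio_bytes: Optional[bytes] = None

    def modalities(self) -> list:
        """List which input types the user actually supplied."""
        present = []
        if self.text is not None:
            present.append("text")
        if self.image_bytes is not None:
            present.append("image")
        if self.audio_bytes is not None:
            present.append("audio")
        return present


def pick_strategy(req: MultimodalRequest) -> str:
    """Choose a response strategy based on the supplied modalities."""
    mods = req.modalities()
    if not mods:
        return "ask-for-input"          # nothing to work with yet
    if len(mods) == 1:
        return "single-modality:" + mods[0]
    # Multiple inputs: the product can fuse them for a richer,
    # more personalized answer.
    return "fused:" + "+".join(mods)
```

For example, a request carrying both text and an image would route to a fused strategy (`pick_strategy(MultimodalRequest(text="hi", image_bytes=b"..."))` returns `"fused:text+image"`), while a text-only request falls back to a simpler path.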
This shakes up the usual one-size-fits-all approach, steering the market towards more tailored and user-friendly options. Companies using multimodal AI stand out by offering something special, raising the bar for what customers expect from their products. They seize the competitive initiative.
Business models won’t be disrupted overnight. Product managers should strive to understand the shape of disruption early on. As multimodal AI products evolve, prioritizing user-centric strategies becomes essential for product managers.
User Experience holds the Potential
Multimodal AI products are user-experience focused. While they do enhance workflows, their greatest potential lies in the user experience. How users engage with and perceive the product becomes the center of the product, not a secondary factor.
A big thing I was told in the military was don't attack the terrain, attack the target. This principle applies to multimodal AI as well. We shouldn't start with multimodal AI; we need to start from the user experience. Great multimodal AI products will start from personalized user experiences, and those experiences create the foundation for monetization.
OpenAI has opened testing of Sora to visual artists, designers, and filmmakers to gain feedback on how to advance the model.
By engaging directly with these creative professionals, OpenAI ensures that Sora enhances not just the efficiency of creative workflows. They’re also thinking about how multimodal AI can enrich the user experience of AI products. User experience moves to the forefront of product development.
This is an important lesson for AI product managers. We must design multimodal products with experiences in mind.
It has further implications too. Augmented and virtual reality will have a larger advantage down the road here. Multimodal AI will enable them to deliver not just expansive experiences, but tailored ones.
We must identify user needs early and think about the multimodal product experience we can create. That will fuel adoption of multimodal AI based products - not the tech alone.
What Should Product Managers Do?
Product strategy for multimodal and text-to-video products is in its infancy.
Product managers need to focus on how multimodality will create a multidimensional user experience. There's a competitive product advantage in creating experiences that engage and enthrall the user, making them come back for more.
Change is never easy, but if you focus on fundamentals, it gets a bit easier.
Center on User Experiences. Really get to understand the user. Ask who they are and what they want. It’s fairly easy to get stuck building products nobody wants, and this is even more important with multimodal products. User experience expectations may differ across user groups, and business value with one user group may not translate into business value with another. User-centric user testing must be done much more rigorously if we are going to build multimodal AI products.
Understand the current processes. Business leaders will want to evaluate how you will implement a multimodal product. You really need to understand the data and the teams that work with it. You need to understand how the current processes work and how the current products enable the business model. This will give you a realistic assessment when you are ready to build multimodal AI products. Getting to know other teams and the products they built is a worthwhile investment. Building a multimodal AI product at scale is a major project - maybe even a program.
Assess capabilities. FOMO is a big thing. We all want to build something that's cutting edge and impressive. But there is a lot of technical, platform, and architectural work needed to assess whether you can build a multimodal product. The product need and user experience may even be satisfied by something simpler. If a multimodal AI product is your goal, start planning the steps and the organizational thinking you need to get there.
Disruption from multimodal AI products won't happen in the next six months. There are several large technical, business, and process challenges. Even firms like Nvidia, Microsoft, and others with large research and development budgets are still trying to figure out how to turn proofs of concept like Sora and RTX into products.
But eventually, they and others will. The increase in these demos and proofs of concept suggests they’ve got a much bigger product vision. Multimodal AI products seem to fit within this vision.
Other companies will follow in their wake, and eventually this blue ocean for AI products will turn red. AI product managers must think now about which products and product functionality multimodal AI PoCs can enhance and augment.
The future of AI products is multimodal AI.
But current AI product frameworks and approaches will need to evolve if we want technology like Sora to be viable as AI products.