The Ultimate Guide for Setting OKRs in AI Products - Marily Nika
With 16 Exclusive Examples from Top Companies, from Netflix to Tesla
AI alone is not enough to solve user pain points. We measure AI Product success by using a combination of metrics, so we need a similar approach when making OKRs. Relying on a single metric falls short.
Marily’s framework for measuring success in AI Products:
AI Success = AI Proxy Metrics + Product Health Metrics + System Health Metrics
By considering all three types of metrics above, you will get a reasonably complete picture of your AI feature’s success. The types of metrics explained:
AI Proxy Metrics: Navigating the Indirect Impact
As an AI Product Manager, you may not have direct control over AI Proxy Metrics, but you cannot overlook them. Proxy Metrics play a significant role in gauging model accuracy. They are called "proxy" metrics because they measure the performance of the model itself rather than the ultimate goal of the product or feature. Besides accuracy, AI proxy metrics include the objective function, mean absolute error (MAE), root mean square error (RMSE), and Sensibleness and Specificity Average (SSA).
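To make two of the proxy metrics named above concrete, here is a minimal Python sketch of MAE and RMSE. The function names and the sample numbers are hypothetical and only illustrate the arithmetic; in practice your ML team would compute these with their own tooling.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; penalizes large misses more."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical ratings: what users actually gave vs. what the model predicted.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)      # (1 + 0 + 2 + 1) / 4
rmse = root_mean_square_error(actual, predicted)  # sqrt((1 + 0 + 4 + 1) / 4)
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```

RMSE comes out higher than MAE here because the single two-point miss is squared before averaging, which is why the two metrics can rank models differently.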
System Health Metrics: Acknowledge, Understand, Act
While you may not hold direct responsibility for System Health Metrics, being informed about them is essential. These metrics pertain to the performance of the overall feature in the face of millions of users. Being aware of how the system behaves under such loads helps in ensuring its robustness and scalability.
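As a rough illustration of what sits behind typical System Health numbers, the sketch below computes an uptime percentage and a 95th-percentile latency. The helper names, the 30-day window, and the sample latencies are assumptions made for this example, not figures from any real system.

```python
import math

def uptime_percent(total_minutes, downtime_minutes):
    """Share of the window during which the service was available."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical 30-day window with 216 minutes of downtime.
window_minutes = 30 * 24 * 60
uptime = uptime_percent(window_minutes, 216)

# Hypothetical request latencies in milliseconds, including one slow outlier.
latencies_ms = [120, 180, 150, 900, 200, 170, 160, 140, 190, 210]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"uptime={uptime:.2f}% p50={p50}ms p95={p95}ms")
```

Looking at a high percentile alongside the median is a common choice because a single slow outlier barely moves the median but is exactly what users under load will notice.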
Product Health Metrics: Your Realm of Influence
Product Health Metrics are squarely within your domain of responsibility. These metrics encompass vital aspects such as Engagement, Retention, Satisfaction, and more. As a PM, you are likely well-acquainted with these metrics and actively work towards optimizing them to enhance the success of your AI product.
Marily’s framework for crafting OKRs in AI Products:
OKR: Your goal
KPIs: Metrics to measure progress against that goal
North Star Metric: A KPI. The core metric that captures all the value you are bringing in for that OKR.
!!! You may have many OKRs, many KPIs under each OKR, but only one North Star Metric per OKR. !!!
While crafting OKRs in AI Products, you need to include one metric from each bucket above. Here is the framework I created:
OKR Framework for AI Products
This framework can act as a foundation. For each AI product or feature, you'd fill it in with specifics relevant to your product. The examples provided are just illustrative. Adjust the framework as necessary based on the nuances of the AI product and the strategic priorities of your organization.
Objective: What's the main goal for the next quarter? This should be user-focused (who is it for?) and clear about the desired outcome.
Example: "Enhance the user experience by providing more personalized content recommendations."
Specific feature: What features or changes will you introduce to move towards the objective?
Example: "Introduce three new personalization algorithms based on user behavior."
North Star: This is your main metric that showcases product success. It's what you're aiming to influence.
Example: "Increase user engagement with recommended content by 20%."
Product Health Metric: How do you measure user satisfaction or the health of the product?
Example: "Reduce the number of user complaints about content irrelevancy by 15%."
Guardrail Metric: What potential negative side effects do you want to monitor and keep in check?
Example: "Ensure the overall content watch time doesn't decrease by more than 5%."
System Health Metric: How will you ensure the tool or feature remains reliable and performant?
Example: "Maintain 99% system uptime and ensure content loading times stay below 2 seconds."
AI Proxy Metric: What AI-specific metric are you aiming to improve? This is often related to the algorithm's performance.
Example: "Increase the accuracy of the content recommendation algorithm by 15%."
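One way to keep a team honest about this framework is to write each OKR down as structured data and check it automatically. The sketch below is only an illustration: the class names, the category labels, and the `problems` method are hypothetical, and the check encodes two rules from this guide (one key result per category, and no more than one North Star per OKR).

```python
from dataclasses import dataclass, field
from typing import List

# Category labels follow the framework above; the names are assumptions.
REQUIRED_CATEGORIES = (
    "Features", "North Star", "Product Health",
    "Guardrail Metric", "System Health", "AI Proxy",
)

@dataclass
class KeyResult:
    category: str
    description: str

@dataclass
class OKR:
    objective: str
    key_results: List[KeyResult] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return framework violations; an empty list means the OKR is complete."""
        found = [kr.category for kr in self.key_results]
        issues = [f"missing category: {c}" for c in REQUIRED_CATEGORIES if c not in found]
        if found.count("North Star") > 1:
            issues.append("more than one North Star Metric")
        return issues

# The illustrative examples from the framework above.
example = OKR(
    objective="Enhance the user experience by providing more personalized content recommendations.",
    key_results=[
        KeyResult("Features", "Introduce three new personalization algorithms based on user behavior."),
        KeyResult("North Star", "Increase user engagement with recommended content by 20%."),
        KeyResult("Product Health", "Reduce the number of user complaints about content irrelevancy by 15%."),
        KeyResult("Guardrail Metric", "Ensure the overall content watch time doesn't decrease by more than 5%."),
        KeyResult("System Health", "Maintain 99% system uptime and ensure content loading times stay below 2 seconds."),
        KeyResult("AI Proxy", "Increase the accuracy of the content recommendation algorithm by 15%."),
    ],
)
print(example.problems())
```

An OKR that skips a bucket, or names two North Stars, would come back with a non-empty list that you can review before the quarter starts.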
The best way to support this work is to like & repost this post on LinkedIn.
Join Marily’s upcoming AI Product Management Bootcamp
16 Example AI Feature OKRs from Top Companies
Note: The percentages and objectives listed below are illustrative and based on hypothetical scenarios provided by the AI PM Academy for educational purposes. In actual practice, they should be determined by analyzing historical data, user feedback, competitive benchmarking, and product team objectives. You can ignore the labels in parentheses below, such as ‘North Star’; I add them only to show which category each KR came from.
Amazon Alexa - Speech Recognition:
Objective: Deliver a more intuitive and seamless voice command experience for home device users by optimizing the AI's natural language processing capabilities.
Key Result (Features): Expand voice command capabilities to cover 20% more tasks.
Key Result (North Star): Increase user daily interactions with Alexa by 10%.
Key Result (Product Health): Improve user satisfaction with voice command recognition by reducing user-reported issues by 20%.
Key Result (Guardrail Metric): Ensure no increase in unwanted activations or "false positives."
Key Result (System Health): Ensure 99.5% system uptime and reduce latency in command processing by 15%.
Key Result (AI Proxy): Reduce the speech recognition error rate by 15%.
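For a speech recognition feature, one widely used way to express the "error rate" in an AI Proxy key result is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The sketch below assumes WER is the metric in question, which the example above does not specify, and the sample phrases are made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the transcript
            insertion = curr[j - 1] + 1  # extra word in the transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical voice command and what the recognizer heard.
wer = word_error_rate("turn on the kitchen lights", "turn on kitchen light")
print(f"WER={wer:.2f}")  # one deletion plus one substitution over five words
```

Averaged over a representative test set, a number like this gives the team a baseline from which a relative reduction can be measured.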
Tesla - Self-driving:
Objective: Enhance the driving experience for Tesla vehicle owners by implementing dynamic risk assessment algorithms to bolster confidence and convenience in Tesla's autonomous driving.
Key Result (Features): Introduce two new user-customizable settings for driving style preferences.
Key Result (North Star): Increase the usage of autonomous mode by 15% during typical commuting times.
Key Result (Product Health): Boost user feedback scores related to the "trustworthiness" of the self-driving feature by 20%.
Key Result (Guardrail Metric): Ensure all safety alerts are clearly communicated and understood by drivers.
Key Result (System Health): Achieve a system responsiveness rate of under 200ms for hazard detection and alerting.
Key Result (AI Proxy): Maintain a vehicle intervention rate under 0.01% during autonomous mode.
Netflix - TV Show Recommendation System:
Objective: Enhance viewing pleasure for Netflix subscribers by delivering personalized content recommendations.
Key Result (Features): Release three new user interface enhancements that spotlight personalized content.
Key Result (North Star): Increase user engagement by improving the click-through rate (CTR) of recommended content by 10%.
Key Result (Product Health): Improve user satisfaction by reducing the percentage of "thumbs down" ratings on recommended content by 15%.
Key Result (Guardrail Metric): Keep any reduction in watch time per person per session to no more than 2%.
Key Result (System Health): Maintain a 99% uptime for the recommendation engine and reduce content loading times by 15%.
Key Result (AI Proxy): Implement a real-time learning mechanism to adapt the recommendation algorithm based on user feedback and preferences.
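Percentage-based key results such as the North Star and Guardrail ones above only become checkable once you fix a baseline. The sketch below assumes the percentages are relative to that baseline, which the examples do not state explicitly; the function names and all numbers are hypothetical.

```python
def relative_change(baseline, current):
    """Fractional change from baseline, e.g. -0.016 for a 1.6% drop."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline

def guardrail_holds(baseline, current, max_drop=0.02):
    """True while the metric has not fallen by more than max_drop (a fraction)."""
    return relative_change(baseline, current) >= -max_drop

def target_for_relative_lift(baseline, lift):
    """Absolute target implied by a relative improvement, e.g. lift=0.10 for +10%."""
    return baseline * (1 + lift)

# Hypothetical baseline: 50 minutes of watch time per person per session.
print(guardrail_holds(50.0, 49.2))  # 1.6% drop, inside a 2% guardrail
print(guardrail_holds(50.0, 48.5))  # 3% drop, guardrail breached

# Hypothetical baseline CTR of 8% on recommended content, aiming for +10% relative.
ctr_target = target_for_relative_lift(0.08, 0.10)
print(f"target CTR={ctr_target:.3f}")
```

Writing the target down as an absolute number avoids the common confusion between a 10% relative lift and a 10 percentage point lift.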
OpenAI - General AI Use Cases:
Objective: Democratize AI research for the global developer community by enhancing the accessibility and flexibility of Gen AI tools.
Key Result (Features): Launch five new API endpoints tailored for specific research tasks.
Key Result (North Star): Increase the number of developers integrating with OpenAI's platform by 20%.
Key Result (Product Health): Improve the user feedback score on API documentation clarity by 25%.
Key Result (Guardrail Metric): Ensure no compromise on the safety and ethical guidelines set by OpenAI.
Key Result (System Health): Achieve a 99.5% uptime for the Gen AI platform and a response time under 300ms for API queries.
Key Result (AI Proxy): Enhance the model's adaptability to diverse research tasks, aiming for a 20% increase in overall task accuracy.
Adobe - Image Generation:
Objective: Empower digital artists and designers to bring their creative visions to life by facilitating AI-enhanced image generation tools.
Key Result (Features): Introduce three AI-driven features that aid in image generation or transformation.
Key Result (North Star): Increase user engagement with AI-assisted tools by 15%.
Key Result (Product Health): Achieve a 20% improvement in user feedback scores for generated image quality.
Key Result (Guardrail Metric): Ensure that AI-generated content adheres to ethical and copyright standards.
Key Result (System Health): Ensure 98% uptime for image generation services and reduce generation times by 10%.
Key Result (AI Proxy): Enhance the AI model's ability to generate high-resolution images without artifacts, aiming for a 20% improvement in image fidelity.
Google - Image Recognition:
Objective: Provide developers and businesses with powerful image recognition capabilities to extract actionable insights from visual data.
Key Result (Features): Deploy four new features focused on nuanced image categorization and contextual understanding.
Key Result (North Star): Increase the number of API calls to Google's image recognition service by 20%.
Key Result (Product Health): Enhance user satisfaction by improving the clarity and depth of API documentation by 15%.
Key Result (Guardrail Metric): Ensure that image recognition respects user privacy and data ethics guidelines.
Key Result (System Health): Maintain a 99.8% uptime for the image recognition API and achieve a response time below 300ms.
Key Result (AI Proxy): Enhance the image recognition model's accuracy, aiming for a 15% reduction in misclassification rates.
Tinder - Matching Algorithm:
Objective: Cultivate meaningful connections for users seeking relationships by refining Tinder's AI-driven matching algorithm.
Key Result (Features): Launch two features that allow users to provide more nuanced feedback on matches.
Key Result (North Star): Boost the number of daily matches by 10% while maintaining or improving match quality.
Key Result (Product Health): Achieve a 20% improvement in user feedback scores related to match relevance.
Key Result (Guardrail Metric): Ensure that user data privacy is preserved and that there are mechanisms against misuse.
Key Result (System Health): Ensure 99% uptime for the matching engine and maintain quick profile loading times.
Key Result (AI Proxy): Enhance the matching model's sensitivity to user preferences, targeting a 15% improvement in match relevance scores.
Spotify - Music Recommendation:
Objective: Enhance the musical journey for listeners by personalizing content delivery based on their unique tastes.
Key Result (Features): Introduce two features that allow users to explore new genres or moods more easily.
Key Result (North Star): Elevate the user engagement with personalized playlists by 15%.
Key Result (Product Health): Achieve a 20% improvement in user feedback scores related to playlist song relevance.
Key Result (Guardrail Metric): Ensure that promoted or sponsored content does not exceed 10% of any personalized playlist.
Key Result (System Health): Maintain a 99.9% uptime for the recommendation engine and ensure smooth track transitions.
Key Result (AI Proxy): Improve the recommendation model's adaptability to emerging music trends and user feedback, aiming for a 20% increase in song match accuracy.
Facebook - News Feed Curation:
Objective: Foster a meaningful social connection for users by curating a relevant and engaging news feed.
Key Result (Features): Launch three features that enhance user control and transparency over feed curation.
Key Result (North Star): Increase user engagement metrics, such as likes and comments, by 10%.
Key Result (Product Health): Boost user feedback scores on feed relevance by 15%.
Key Result (Guardrail Metric): Ensure that controversial or polarizing content is flagged and reviewed appropriately.
Key Result (System Health): Achieve a 99.9% uptime for the feed curation system and ensure quick content loading times.
Key Result (AI Proxy): Optimize the feed curation model to better align with user interests, targeting a 20% improvement in content relevance scores.
Amazon - Product Recommendations:
Objective: Enhance the shopping experience for Amazon customers by delivering tailored product recommendations.
Key Result (Features): Introduce two new user interface elements that spotlight user-specific product deals.
Key Result (North Star): Boost sales from the "Recommended for You" section by 10%.
Key Result (Product Health): Improve user feedback scores on product recommendation relevance by 15%.
Key Result (Guardrail Metric): Ensure that product recommendations are balanced and not overly biased towards high-margin products.
Key Result (System Health): Ensure 99.9% uptime for the recommendation engine and maintain fast product page loading times.
Key Result (AI Proxy): Refine the recommendation model to capture nuanced user purchasing habits, aiming for a 20% improvement in recommendation accuracy.
Grammarly - Writing Assistance:
Objective: Empower writers of all skill levels to express themselves with clarity and confidence using AI-enhanced feedback.
Key Result (Features): Deploy three new features that provide genre-specific writing feedback.
Key Result (North Star): Increase the number of daily active users by 15%.
Key Result (Product Health): Boost user feedback scores related to feedback relevance and helpfulness by 20%.
Key Result (Guardrail Metric): Ensure that user writing style and voice are respected and preserved.
Key Result (System Health): Achieve a system responsiveness rate of under 200ms for real-time feedback.
Key Result (AI Proxy): Optimize the feedback model to reduce false positives in grammar and style suggestions by 15%.
Zoom - Video Background Processing:
Objective: Provide a professional and distraction-free virtual meeting environment for remote workers and teams by leveraging AI for video background enhancements.
Key Result (Features): Introduce two new customizable virtual backgrounds and a feature to blur real-time surroundings.
Key Result (North Star): Achieve a 20% increase in daily usage of virtual background features.
Key Result (Product Health): Elevate user feedback scores related to background rendering quality by 25%.
Key Result (Guardrail Metric): Ensure that AI processing does not distort user faces or main video content.
Key Result (System Health): Maintain smooth video frame rates and ensure less than 0.01% downtime for background processing services.
Key Result (AI Proxy): Optimize the background processing model to detect and enhance video backgrounds with a 20% improvement in rendering accuracy.
Microsoft Word - Auto Content Summarization:
Objective: Assist students and professionals in distilling lengthy documents into concise summaries for quicker comprehension.
Key Result (Features): Launch a "Quick Summary" feature that provides an auto-generated document overview.
Key Result (North Star): Increase the usage of the auto-summary tool by 20% among the student user segment.
Key Result (Product Health): Achieve a 20% improvement in user feedback scores related to summary relevance and quality.
Key Result (Guardrail Metric): Ensure that auto-summarization retains the main ideas and does not omit critical information.
Key Result (System Health): Ensure tool responsiveness under 1.5 seconds even during peak usage times and maintain 99.9% uptime.
Key Result (AI Proxy): Increase the semantic coherence and relevance of AI-generated content by 20%.
Adobe Illustrator - AI-assisted Design Tools:
Objective: Empower graphic designers to streamline their design process by integrating AI-driven automation and suggestion tools tailored for professional projects.
Key Result (Features): Launch two AI-powered features that assist in color matching and layout optimization.
Key Result (North Star): Drive a 25% increase in the usage of AI-powered tools within the Illustrator suite.
Key Result (Product Health): Achieve 30% faster project completion times as reported by users who frequently use AI tools.
Key Result (Guardrail Metric): Maintain user control over designs, ensuring AI suggestions can be easily overridden.
Key Result (System Health): Maintain a consistent tool loading time of under 2 seconds and a 99.5% uptime.
Key Result (AI Proxy): Improve the accuracy of design recommendations based on current design trends by 20%.
Adobe Photoshop - Image Enhancement Suggestions:
Objective: Empower photographers and graphic designers to refine and enhance their images with AI-driven insights and tools by deploying advanced image analysis techniques.
Key Result (Features): Roll out three new AI-driven image enhancement tools or features.
Key Result (North Star): Increase user utilization of the "Enhance" AI tool by 20%.
Key Result (Product Health): Achieve a 25% improvement in user feedback scores for AI-assisted image corrections.
Key Result (Guardrail Metric): Ensure user creative control remains paramount and AI suggestions are non-intrusive.
Key Result (System Health): Maintain 98% uptime and reduce image processing time by 10%.
Key Result (AI Proxy): Improve the AI's image quality assessment accuracy by 20%.
Waze - Traffic Predictions:
Objective: Deliver a more proactive and user-aligned navigation experience for daily commuters based on real-time conditions by introducing personalized notifications and warnings.
Key Result (Features): Launch two new features providing insights into traffic patterns or alternate routes.
Key Result (North Star): Elevate the usage of the "Arrival Time Prediction" feature by 15%.
Key Result (Product Health): Reduce user-reported route inaccuracies or delays by 20%.
Key Result (Guardrail Metric): Ensure timely alerts and updates without causing driver distraction.
Key Result (System Health): Ensure map loading times are under 2 seconds and maintain 99% uptime.
Key Result (AI Proxy): Improve real-time traffic prediction accuracy by 15%.
🎓 Join my next AI PM Bootcamp cohort: https://maven.com/marily-nika/ai-pm-b...
📚 All my AI PM courses: https://maven.com/marily-nika
🤝 Coaching: https://bio.link/marilynika
✉️ Enquiries: marily@aiproduct.academy