VSM2AI: A Process-First Framework for AI Adoption That Actually Works

I’ve been thinking for quite some time about how to build a framework for AI adoption. Not the theoretical kind, but the kind where a business can look at its own operations and understand where AI fits and, more importantly, where it doesn’t.
The problem is the gap between business people and the engineers building AI solutions. Engineers build what they think is cool. Business people ask for what they saw in a demo. Nobody maps either one to the actual work being done, and that gap is where most AI projects die. The initiative starts with excitement, burns through a quarter of engineering time, and delivers something that doesn’t connect to how work actually flows through the organization.
Over the last few months, I’ve been applying a framework to identify areas of AI adoption. It works because it starts with process, not technology.
What VSM2AI is (and what it avoids on purpose)
The framework uses two components: Lean Value Stream Diagrams together with basic AI knowledge. That’s it. No complex maturity models, no 47-page organizational assessments.
Value Stream Mapping comes from lean manufacturing. It’s a method for visualizing every step in a process, including the time each step takes, the handoffs between people, and the waste hiding in between. When you combine that visibility with even a basic understanding of what current AI models can and cannot do, you get something useful: a grounded, evidence-based view of where automation is realistic.
Why is this framework useful? Three reasons. First, it provides a simple step-by-step method for identifying where AI adoption can work. Second, it allows for proper cost estimation and capital investment planning, both for building and for operating AI solutions. Third, it bridges that gap between business and engineering by giving both sides a shared artifact to reason about. The value stream map becomes the common language. Engineers see the inputs and outputs. Business people see the time and cost. Everyone’s looking at the same thing.
Step 1: Map the work at 5-to-30-minute granularity
You start by building a solid understanding of the business processes your organization executes. That’s where the Lean Value Stream Diagram comes into play.
By mapping the processes and understanding the time required to complete each step, it becomes clear whether AI can be adopted there or not. The key is granularity, and this is where it’s easy to go wrong by mapping at too high a level (“process customer order”) or too low a level (“click the submit button”). Neither is useful.
The sweet spot is steps of 5 to 30 minutes as they are actually executed in operations. Why this range? Anything longer than 30 minutes is really a composite task. It would require multiple AI agent steps working in sequence, and the complexity compounds fast. You end up building an orchestration system instead of solving a business problem. Anything under 5 minutes is usually too trivial to justify the integration overhead. The API call, the data pipeline, the error handling: all of that has a baseline cost regardless of the task’s simplicity.
So keeping it at the 5-to-30-minute granularity allows for an easy and realistic assessment of what AI can actually handle.
The output of this step is an annotated process map: every step labeled with its duration, its inputs, its outputs, and who currently does it. This map is the foundation everything else builds on. If it’s wrong or too coarse, every subsequent decision will be off. Spend the time here. Walk the floor. Sit with the operators. Time things with a stopwatch if you have to.
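As a rough illustration (the field names and sample steps are my own, not part of the framework), the annotated process map can be captured as structured data so the later steps can filter and score it:

```python
from dataclasses import dataclass, field

@dataclass
class ProcessStep:
    """One step from the annotated value stream map."""
    name: str
    duration_min: float                         # observed time to complete, in minutes
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    owner: str = ""                             # who currently performs the step

def in_sweet_spot(step: ProcessStep) -> bool:
    """5-to-30-minute steps are the right granularity for AI assessment."""
    return 5 <= step.duration_min <= 30

steps = [
    ProcessStep("Create product feature image", 20,
                inputs=["product specs", "base image"],
                outputs=["annotated image"], owner="content operator"),
    ProcessStep("Click submit button", 0.1, owner="content operator"),
]

candidates = [s for s in steps if in_sweet_spot(s)]
print([s.name for s in candidates])  # → ['Create product feature image']
```

Nothing more than a spreadsheet is strictly needed here, but keeping the map machine-readable makes the scoring in the next step mechanical.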
Step 2: Turn brainstorming into a ranked AI candidate list
Once there’s a clear understanding of the operational steps, a discovery session can be organized to identify where AI solutions can be deployed.
This works especially well when it includes people who have been experimenting on their own. Maybe some operational people have been playing around with ChatGPT, creating product descriptions by dropping information into it, or drafting emails, or generating reports. These informal experiments are gold. They’re real-world evidence that a task can be partially automated, and the person running them already understands the edge cases.
To make the discovery systematic rather than just a conversation, each step from the value stream map should be scored against a set of criteria:
Repeatability: Is this step done the same way every time, or does it require significant human judgment each iteration?
Data Availability: Is the input and output digital and structured, or is it messy, verbal, or scattered across systems?
Error Tolerance: What happens if the AI gets it wrong 10-20% of the time? Minor inconvenience or critical failure?
Volume: How many times per day or per week is this step executed? High volume means higher ROI on automation.
Current Cost: What does this step cost in human time multiplied by the hourly rate of the person doing it?
Steps that score high on repeatability, data availability, and volume, and have reasonable error tolerance, are AI candidates. Steps that require deep contextual judgment, deal with unstructured or sensitive data, or happen rarely are probably not worth automating. At least not yet.
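To make the ranking concrete, here is a minimal scoring sketch. The weights and the hard disqualification on low error tolerance are illustrative assumptions, not prescribed by the framework:

```python
# Score each mapped step on the criteria above (1 = low, 5 = high).
def score_step(repeatability, data_availability, error_tolerance, volume, current_cost):
    if error_tolerance <= 1:      # mistakes would be critical failures
        return 0                  # disqualify regardless of other scores
    return (3 * repeatability + 3 * data_availability
            + 2 * volume + 1 * current_cost)

scores = {
    "Create product feature image": score_step(5, 5, 3, 4, 3),
    "Approve supplier contract":    score_step(1, 2, 1, 1, 4),
}
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked)  # → [('Create product feature image', 41), ('Approve supplier contract', 0)]
```

Whatever weights you choose matter less than applying the same ones to every step, so the resulting order is comparable across the whole map.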
One example I’ve seen involved creating pictures for products. We noticed that operators spend on average 5 to 30 minutes per product to create a picture highlighting that product’s features. It scored high on repeatability (same layout logic every time), high on volume (hundreds of products), and the data was already digital (product specs, feature lists, base images). This became a strong candidate for an AI agent that takes the base image and adds the text and relevant feature highlights automatically. The scoring turned an open-ended conversation into a ranked list of candidates, which is far more actionable than a whiteboard full of sticky notes.
Step 3 (Gate 1): Technical feasibility, can AI do it reliably enough?
The third step is to assess the feasibility of the top-scoring solutions through three distinct gates. The first gate is technical.
Not everything that sounds automatable actually is. The question here is: does a model exist today that can perform this task at an acceptable quality level? Can you prototype it in ChatGPT or Claude in an afternoon, or does it require custom training, fine-tuning, or specialized infrastructure?
This is where the quick experiments from the discovery phase pay off. If someone already dropped the inputs into ChatGPT and got reasonable outputs, you have evidence. That’s a green signal. If nobody can make it work even with manual prompting, that’s a red one. The prototype doesn’t need to be production-ready. It needs to demonstrate that the core transformation (input to output) is within the model’s capability at a quality level the business can accept.
Step 3 (Gate 2): Economic viability, model the build, run, and error costs
Gate 2 is where enthusiasm meets arithmetic. Right now, running AI agents costs real money, and sometimes these costs can exceed the cost of doing the work manually.
For example, that product feature image I mentioned: it requires about $2 in API calls to various LLMs that generate the new pictures. The question becomes: are we better off paying $2 per image, or are we actually better off just doing the whole thing with a human operator? The honest answer is sometimes yes, the human work is cheaper.
The cost model needs to account for four components. Operating cost is the number of API calls per item, multiplied by volume, multiplied by the price per call. Build cost is engineering days multiplied by the daily rate to create the solution. Maintenance cost covers MLOps, prompt tuning, and retraining as data shifts over time. And error cost captures what an AI mistake actually costs the business: a wrong product description is annoying, but a wrong financial calculation is dangerous.
We can also account for the fact that API costs tend to go down over time. A solution that’s marginally uneconomical today might be clearly viable in 6 to 12 months. The framework should flag these as “revisit later” rather than “reject.” That nuance matters because it prevents teams from permanently dismissing candidates that are just slightly outside the economic threshold right now.
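A back-of-the-envelope version of that cost model might look like this. The $2-per-image figure comes from the product image example; every other number is a hypothetical assumption:

```python
# Back-of-the-envelope economics for one candidate.
api_cost_per_item   = 2.00        # USD per generated image (from the example)
volume_per_month    = 400         # images per month (assumed)
build_cost          = 30 * 800    # 30 engineering days at $800/day (assumed)
maintenance_monthly = 500         # prompt tuning, MLOps (assumed)
error_rate          = 0.10        # fraction of outputs needing manual rework (assumed)
rework_cost         = 5.00        # USD of operator time per bad output (assumed)

manual_cost_per_item = 12.50      # ~30 min of a $25/hour operator (assumed)

ai_monthly = volume_per_month * (api_cost_per_item + error_rate * rework_cost)
manual_monthly = volume_per_month * manual_cost_per_item
monthly_saving = manual_monthly - ai_monthly

# Months to pay back the build cost; flag "revisit later" instead of rejecting.
payback_months = build_cost / (monthly_saving - maintenance_monthly)
verdict = "build" if payback_months <= 12 else "revisit later"
print(round(payback_months, 1), verdict)  # → 6.9 build
```

Rerunning the same model with next quarter's API prices is exactly how a “revisit later” candidate gets its second look.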
Step 3 (Gate 3): Implementation complexity, choose the right automation level
Once there’s an understanding of costs, we need to understand the investment required to build and maintain the agent. Things get nuanced here because agents sit on a spectrum:
Manual → AI-Assisted (copilot) → AI-Supervised (human reviews) → Fully Autonomous
At one end, you can have fully autonomous agents and accept a lower degree of accuracy. At the other, you can have agents with a human in the loop that don’t replace a person but deliver substantial work upfront: a first draft, a pre-filled template, a suggested answer that someone reviews and approves.
Most AI adoption fails because organizations try to jump from fully manual to fully autonomous. The right answer is usually somewhere in the middle, and it often moves along the spectrum over time as trust and accuracy improve.
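One way to make the middle of the spectrum operational is a confidence-based routing rule: each AI output either ships directly or goes to a reviewer. The level names mirror the spectrum above; the 0.95 threshold is an assumption for illustration:

```python
from enum import Enum

class AutomationLevel(Enum):
    MANUAL = "manual"
    AI_ASSISTED = "ai-assisted"       # copilot: human drives, AI suggests
    AI_SUPERVISED = "ai-supervised"   # AI drives, human reviews
    AUTONOMOUS = "autonomous"

def route(output_confidence: float, level: AutomationLevel) -> str:
    """Decide whether an AI output ships directly or goes to a reviewer."""
    if level is AutomationLevel.AUTONOMOUS:
        return "ship"
    if level is AutomationLevel.AI_SUPERVISED:
        # Even supervised agents can auto-ship high-confidence outputs
        # once trust is established (threshold is an assumption).
        return "ship" if output_confidence >= 0.95 else "human review"
    return "human review"

print(route(0.80, AutomationLevel.AI_SUPERVISED))  # → human review
```

Moving along the spectrum then amounts to loosening the threshold or changing the level, rather than rebuilding the system.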
Depending on the use case and the chosen automation level, building such a solution can take 20 to 40 engineering days, or it can take 60 to 80 engineering days plus additional time for an MLOps workflow that continuously calibrates the agent as new data comes in. It might also be the case that a solution works for one domain but not another. We could do feature highlighting for electronics products but not for food products, just as one example. The framework needs to surface these boundaries explicitly rather than assuming universal applicability.
Run it as a loop: remap, redesign, revisit
The framework isn’t linear. In practice, the assessment often sends you back.
You might discover during feasibility analysis that a 30-minute step actually contains three sub-steps, and only one is automatable. That sends you back to Step 1 to re-map at finer granularity. Or the cost model might show that a solution is viable only at 3x the current volume, which means it goes on a “revisit in Q3” list. Or you might realize the process itself should be redesigned before AI is even considered. That’s a lean optimization insight, not an AI insight, but the framework surfaces it.
Before running VSM2AI on any business unit, check three prerequisites that can kill an otherwise sound initiative: data maturity (does the organization actually have clean, accessible data?), change readiness (will operators adopt the solution?), and compliance constraints (are there regulatory boundaries that no amount of engineering can bypass?). These aren’t steps in the framework. They’re guardrails.
VSM2AI starts with what people actually do, minute by minute. Not with what AI can theoretically accomplish.
Map the processes. Score the candidates. Assess through three gates. Then loop back when granularity was wrong, when costs change, or when new models drop. Try running it on one business unit this quarter. Pick the messiest process you have. You’ll be surprised what the value stream map alone reveals, before AI even enters the conversation.