The Hidden AI Tax: Why Productivity Gains Turn Into Supervision Work
A practical look at when AI saves time, when it creates supervision work, and why the ROI math often lies politely.
Everyone likes the clean version of the AI productivity story.
AI saves time. Teams move faster. Manual work disappears.
The spreadsheet finally stops looking like a tiny administrative hostage situation, and for one brief second everyone believes the productivity soufflé will rise on schedule.
But what’s happening under the hood?
According to recent reporting on research from Glean’s Work AI Institute, UK digital workers save roughly 12 hours per week using AI, but spend about 6.3 to 6.4 hours supervising, correcting, rerunning, and managing AI-generated work.
That number did not surprise me.
I have seen teams build elaborate automation workflows only to discover they did not remove the work. They moved it into a new category: supervising the thing that was supposed to save them.
Some people are calling this “botsitting” (unpleasant word, but a very accurate job description).
The workflow is automated, technically. But someone still has to check whether it understood the brief, used the right context, updated the right field, routed the right lead, hallucinated something with full executive confidence, or turned a simple task into a small operational séance.
The ROI model usually misses the cleanup layer
Most AI and automation projects are evaluated with a simple promise:
Hours saved.
Tasks removed.
Headcount avoided.
Productivity improved.
The spreadsheet looks clean because spreadsheets are loyal little creatures. They will support almost any argument if you format them nicely enough.
What usually gets left out is the maintenance layer.
Every AI workflow creates obligations. Someone has to monitor failures, update prompts, check outputs, manage exceptions, adjust integrations, review edge cases, and decide when the machine is confidently wrong versus merely unhelpful.
A workflow that saves ten hours but creates five hours of supervision is not a ten-hour saving. It is a five-hour saving with a dashboard attached.
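The arithmetic is simple enough to fit in a few lines, which is part of why it is strange that it so rarely makes it into the ROI model. Here is a minimal sketch; the function names are mine, and the inputs are the ten-hour example above and the reported UK figures.

```python
def net_hours_saved(gross_saved: float, supervision: float) -> float:
    """Hours actually recovered once supervision work is subtracted."""
    return gross_saved - supervision


def supervision_share(gross_saved: float, supervision: float) -> float:
    """Fraction of the headline saving that goes back into supervision."""
    return supervision / gross_saved


# The ten-hour workflow with five hours of supervision.
print(net_hours_saved(10, 5))  # 5

# The reported UK figures: roughly 12 hours saved, 6.3 to 6.4 hours supervising.
print(round(net_hours_saved(12, 6.4), 1))  # 5.6
print(round(net_hours_saved(12, 6.3), 1))  # 5.7
```

On those figures, a little over half of the headline saving goes back into supervision before anyone counts build time.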
In the companies I work with, this distinction shows up very quickly.
Founders and GTM teams rarely have a “we need more AI usage” problem. They have a much more practical question: is this actually reducing the work required to create a better business outcome?
More seats, more prompts, more generated drafts, more internal screenshots of the team being “AI-first” can look good in an update deck. But none of it proves the system is improving revenue, speed, quality, or decision-making.
That is usually where I push the conversation back to the operating layer: what work is being removed, what work is being created, who owns the supervision, and whether the outcome is better than before.
Especially visible in GTM
In go-to-market work, the difference between useful automation and expensive babysitting becomes obvious very quickly.
AI can help research accounts, summarize calls, enrich CRM data, draft first-pass outreach, flag buying signals, route leads, clean up admin work, and speed up follow-ups.
Those are good use cases when the rules are clear and the process already works manually.
But if the ICP is vague, the qualification logic changes depending on who is asking, and the handoff between marketing and sales mostly lives in someone’s head, AI will not rescue the motion.
It will expose the mess faster. The agent will not magically know which accounts matter. The scoring model will not understand political nuance. The automated sequence will not rescue a weak offer. The CRM cleanup workflow will not solve the fact that nobody agrees what a qualified opportunity actually means.
That is where botsitting becomes the real productivity tax.
People spend time feeding the tool missing context, checking whether the output makes sense, correcting errors, rerunning prompts, rewriting drafts, and explaining to themselves why the AI sounded so confident while being so deeply unserious.
I say this as someone who loves automation, builds these systems, and knows the small emotional journey of watching a workflow behave beautifully in testing and then collapse the moment it meets messy inputs and a CRM field last updated during a previous economic cycle.
Before automating, use a filter
The best AI systems are usually built around boring clarity.
They have a narrow job.
They know where the data comes from.
They operate inside defined rules.
They flag uncertainty.
They keep a human in the loop where judgment matters.
(That last part is important!)
AI is useful when it supports judgment. It becomes expensive when it pretends to replace judgment the company has not even defined yet.
Before automating anything, I would ask six questions:
Does this task happen often enough for the saved time to compound?
Has the process survived 10-20 manual runs?
Can someone explain the workflow end to end?
Is the work rules-based, or does it require context-heavy judgment?
Does the time saved beat build time, maintenance, and error handling?
Who owns monitoring when the workflow breaks?
I use this as a simple automation filter. It keeps teams from turning every annoying task into a workflow, an agent, or a dashboard with opinions.
Automation is worth it when the work underneath it is frequent, stable, clear, and measurable. If the process is still vague, automation usually creates supervision work.
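For anyone who prefers the filter as something they can run, here is a rough sketch of the six questions as a checklist. Everything in it is illustrative: the `Candidate` fields, the function names, and the thresholds are my assumptions, apart from the floor of 10 manual runs, which comes from the question above. Tune the numbers to your own context.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Answers to the six questions for one task (illustrative fields)."""
    runs_per_month: int              # Q1: how often the task happens
    manual_runs_survived: int        # Q2: times it has been run by hand
    explainable_end_to_end: bool     # Q3: can someone walk through it?
    rules_based: bool                # Q4: rules, not context-heavy judgment
    minutes_saved_per_run: float     # Q5: inputs for the payback check
    build_hours: float
    upkeep_hours_per_month: float    # maintenance plus error handling
    monitoring_owner: Optional[str]  # Q6: who gets paged when it breaks


def monthly_net_hours(c: Candidate) -> float:
    """Hours saved per month after upkeep is subtracted."""
    return c.runs_per_month * c.minutes_saved_per_run / 60 - c.upkeep_hours_per_month


def payback_months(c: Candidate) -> Optional[float]:
    """Months until the build cost is recovered, or None if it never is."""
    net = monthly_net_hours(c)
    return c.build_hours / net if net > 0 else None


def failed_checks(
    c: Candidate,
    min_runs_per_month: int = 4,      # assumed threshold
    min_manual_runs: int = 10,        # lower bound of "10-20 manual runs"
    max_payback_months: float = 6.0,  # assumed threshold
) -> List[str]:
    """Return the questions this candidate fails; an empty list means go."""
    failures = []
    if c.runs_per_month < min_runs_per_month:
        failures.append("not frequent enough")
    if c.manual_runs_survived < min_manual_runs:
        failures.append("not enough manual runs")
    if not c.explainable_end_to_end:
        failures.append("nobody can explain it end to end")
    if not c.rules_based:
        failures.append("needs context-heavy judgment")
    months = payback_months(c)
    if months is None or months > max_payback_months:
        failures.append("time saved does not beat build and upkeep")
    if not c.monitoring_owner:
        failures.append("no monitoring owner")
    return failures


# Hypothetical numbers for a frequent, rules-based task with an owner.
routing = Candidate(200, 20, True, True, 3, 8, 1, "revops")
# Hypothetical numbers for a two-minute task that happens once a month.
one_off = Candidate(1, 12, True, True, 2, 4, 0.5, None)

print(failed_checks(routing))  # []
print(failed_checks(one_off))
```

The second candidate fails on frequency, payback, and ownership, even though the process itself is perfectly clear. That is the point of running all six questions rather than stopping at "can we build it?"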
What is actually worth automating
Good automation usually starts with repetitive work that already has a clear process.
Lead routing is a strong example.
If a company knows the rules by region, segment, account owner, company size, and SLA, automation can remove delays and reduce missed handoffs. The workflow is boring, which is usually a good sign. Boring workflows behave better in public.
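To show how boring this can be, here is a minimal sketch of rule-based routing. The queue names, the 1,000-employee segment cutoff, and the SLA minutes are all made up for illustration; the only real design choice is that a lead matching no rule goes to a human instead of being guessed at.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical routing table: (region, segment) -> (queue, SLA in minutes).
ROUTING_TABLE = {
    ("EMEA", "enterprise"): ("emea-enterprise", 15),
    ("EMEA", "smb"): ("emea-smb", 60),
    ("NA", "enterprise"): ("na-enterprise", 15),
    ("NA", "smb"): ("na-smb", 60),
}


@dataclass
class Lead:
    region: str
    employees: int
    account_owner: Optional[str] = None


@dataclass
class Assignment:
    queue: str
    sla_minutes: int
    reason: str  # kept so a human can audit why the lead landed here


def route(lead: Lead) -> Assignment:
    # Rule 1: an existing account owner always keeps the lead.
    if lead.account_owner:
        return Assignment(lead.account_owner, 60, "existing account owner")

    # Rule 2: segment by company size (assumed cutoff), then look up region.
    segment = "enterprise" if lead.employees >= 1000 else "smb"
    match = ROUTING_TABLE.get((lead.region, segment))
    if match:
        queue, sla = match
        return Assignment(queue, sla, f"matched {lead.region}/{segment}")

    # Rule 3: no rule matched, so flag it for a person.
    return Assignment("manual-review", 240, "no rule matched")


print(route(Lead("EMEA", 5000)).queue)  # emea-enterprise
print(route(Lead("APAC", 20)).queue)    # manual-review
```

Nothing here needs a model. If the table cannot be written down, that is the finding, and it is cheaper to learn it before the build than after.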
Follow-up reminders are another obvious case.
If sales keeps missing next steps because everything depends on memory, automate the reminder. No agent needed. No dramatic “autonomous revenue engine.” Just fewer opportunities quietly dying in the CRM because everyone was busy.
CRM cleanup can also be worth automating, especially when the rules are clear: standardize fields, detect duplicates, flag missing values, update lifecycle stages, or alert someone when records look wrong.
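A small sketch of what "clear rules" can look like for cleanup, assuming records are plain dictionaries. The field names and the choice of email as the duplicate key are assumptions for illustration, not a recommendation for any particular CRM. Note that it flags duplicates and gaps for review and merges nothing on its own.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical set of fields every record should have.
REQUIRED_FIELDS = ("email", "company", "owner")


def standardize(record: Dict[str, str]) -> Dict[str, str]:
    """Trim whitespace on string values and lowercase the email."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if cleaned.get("email"):
        cleaned["email"] = cleaned["email"].lower()
    return cleaned


def missing_fields(record: Dict[str, str]) -> List[str]:
    """Required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]


def duplicate_groups(records: List[Dict[str, str]]) -> List[List[int]]:
    """Indexes of records that share an email, grouped for human review."""
    by_email = defaultdict(list)
    for i, record in enumerate(records):
        email = record.get("email")
        if email:
            by_email[email].append(i)
    return [idx for idx in by_email.values() if len(idx) > 1]


raw = [
    {"email": " Ana@Example.com ", "company": "Acme", "owner": "lee"},
    {"email": "ana@example.com", "company": "Acme Inc", "owner": "lee"},
    {"email": "bo@example.com", "company": "Globex", "owner": ""},
]
records = [standardize(r) for r in raw]

print(duplicate_groups(records))    # [[0, 1]]
print(missing_fields(records[2]))   # ['owner']
```

The output is a short review list, which keeps the supervision cost visible and small instead of hiding it inside silent merges.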
Document processing, proposal generation, meeting summaries, customer response workflows, Reddit monitoring, and email triage can all create real value when the inputs are predictable and the output can be reviewed quickly.
The pattern is simple:
Frequent work.
Stable process.
Clear rules.
Visible outcome.
Low drama.
A rare combination in business, but beautiful when it appears.
What should stay manual (for now)
Some work does not deserve automation yet.
One-off tasks usually fail the ROI test. If something happens once a month, takes two minutes, and does not cause errors, leave it alone. The build cost alone will quietly eat the savings while pretending to be strategic.
Unstable processes are another trap.
If the workflow changes every few weeks, automation becomes a rebuild subscription. Run the process manually first. Let it survive reality. Then automate the version that actually works.
Judgment-heavy decisions also need care.
Lead scoring, qualification, hiring decisions, strategic prioritization, and account selection can all be supported by AI, but they should rarely be fully owned by AI until the organization can explain how good judgment is being applied.
When AI starts making context-heavy decisions inside a process nobody has clearly defined, mistakes happen quickly. They also tend to arrive wearing the confidence of a senior executive who has not read the room.
And then there is automation tourism.
Every builder has done this at least once: you discover a new tool. You build something clever. It connects three systems, sends a Slack alert, updates a spreadsheet, and makes you feel briefly like a systems genius.
Then you realize it saves three minutes per week and requires emotional maintenance.
Congratulations. You built a tiny productivity monument to your own curiosity. (Been there myself!)
The real metric is work removed
The companies getting real value from AI are not simply adding more tools.
They are redesigning work around clearer context, better rules, and measurable outcomes.
Less “we used AI.”
More “lead response time dropped from four hours to five minutes, routing accuracy improved, and sales stopped manually fixing half the records.”
That is the difference between AI activity and AI productivity.
The market will not reward companies for generating more output if the output needs constant adult supervision.
Before building the next AI workflow, ask the question that belongs in every ROI model:
Are we eliminating work, or are we creating a new job that happens to sit behind a very impressive interface?
Source note: Based on recent reporting on Glean’s Work AI Institute research via TechRadar and ITPro.





