SKU complexity reduction is no longer a spreadsheet problem
Most SKU rationalization programs still start the same way: export the portfolio, rank the tail, argue in a steering pack.
That is not where the hard part is.
The hard part is reasoning across fragmented operational context.
To decide whether a SKU should stay, merge, simplify, or exit, a team may need to reconcile:
- ERP product and cost data
- plant and line constraints
- supplier and packaging specifications
- forecast and service signals
- customer and channel exceptions
- substitution and assortment risk
- historical outcomes from prior initiatives
In large supply chain organizations, that context is rarely available in one clean model. It is spread across multiple ERPs, planning systems, spreadsheets, local trackers, retailer files, and a lot of institutional memory that lives only in people.
SKU complexity reduction is becoming AI-native work.
If you are still running it manually across fragmented systems, you are structurally behind. Your search is worse. Your memory is worse. Your prioritization is worse.
Why this problem exposes the real shape of enterprise decision work
SKU complexity reduction is a useful example because it looks simpler than it is.
On paper, the logic seems obvious: remove weak SKUs and simplify the network.
In practice, these are some of the messiest high-value decisions in supply chain.
A SKU can look marginal in one view and still matter because it:
- protects shelf presence
- supports a key customer relationship
- anchors a regional assortment role
- plays a role in pack-price architecture
- has meaningful substitution risk
- shares cost or production logic with more strategic products
At the same time, the hidden cost of complexity is real.
A marginal SKU can drive:
- more changeovers
- shorter runs
- lower line productivity
- worse forecast quality
- more inventory fragmentation
- more packaging and labeling variation
- more procurement and supplier overhead
- more planning and replenishment exceptions
This is where simplistic portfolio cuts die. The hard part is not spotting weak-looking products. It is separating commercially meaningful complexity from operational drag.
Public signals from the industry already point in this direction:
- Mondelez said in 2020 that it would remove 25% of SKUs, representing less than 2% of sales, to simplify its supply chain and reduce cost and inventory.
- Unilever has spoken publicly about reducing SKUs, raw and pack materials, and suppliers as part of lowering complexity.
- McKinsey has written that leading consumer companies have made bold portfolio decisions, including major SKU complexity reduction, as part of performance improvement.
This is not an argument for indiscriminate cuts. It is an argument that complexity management is now a systems capability. The companies that do it well are not running occasional clean-up exercises. They are building repeatable ways to identify, test, and learn.
Why manual approaches underperform
Most complexity programs underperform for predictable reasons.
1. They use incomplete economics
Teams often stop at volume, gross margin, or standard cost.
But a serious decision needs cost-to-serve logic, changeover burden, service implications, and substitution effects. A SKU does not earn its place because one metric looks acceptable.
2. The evidence is fragmented
The relevant facts are spread across finance systems, plant data, planning tools, commercial files, local spreadsheets, and historical trackers.
That fragmentation is what makes manual portfolio decisions slow, inconsistent, and hard to repeat.
3. Good complexity and bad complexity get mixed together
Not all complexity is waste. Some complexity buys real commercial value. Some is just inherited drag.
The challenge is not ranking the portfolio. It is distinguishing strategic complexity from accidental complexity.
4. Stakeholder logic arrives too late
An initiative can look rational and still fail because category, sales, quality, or a key customer will reject it.
If that logic only appears after weeks of analysis, the team has already spent time on the wrong branch.
5. Organizational memory is weak
The same initiatives get rediscovered repeatedly.
A team proposes a delisting, a pack simplification, or a spec harmonization without realizing it was already tried, blocked, or partially implemented under different conditions.
That is not mainly an intelligence problem. It is a memory problem.
What AI changes
AI matters here for one reason: it can act as a reasoning layer across fragmented operational context.
That means a well-designed agent system can do work such as:
- harmonizing entities across inconsistent naming systems
- linking SKU, pack, plant, supplier, and customer context
- surfacing hidden operational burdens from scattered evidence
- retrieving precedent from prior initiatives
- generating candidate initiatives with an explicit value hypothesis
- challenging those initiatives before humans waste time on them
- producing reviewer-ready artifacts with evidence, assumptions, blockers, and next steps
That is a different operating model from spreadsheet triage. The advantage compounds when the system learns from outcomes.
Each blocked initiative improves future screening. Each implemented initiative becomes reusable precedent. Each stalled initiative reveals a governance or sequencing pattern.
That is what I mean by the infinite brain. Not a system that knows everything. A system that keeps the lessons enterprises usually throw away.
The output should be an initiative, not a list
A weak system produces a ranked list of tail SKUs.
A useful system produces an initiative with a point of view.
That means the output already expresses:
- what change is proposed
- why it matters
- where the value comes from
- what could block it
- what evidence supports it
- what should be checked first
For example, instead of saying:
SKU X and SKU Y are low velocity.
A stronger output says:
Retire two low-velocity regional variants in one customer cluster, consolidate demand into the core pack set, and validate whether the simplification benefit survives customer and substitution review.
That is a better decision object because it is already framed around action, evidence, and feasibility.
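One way to make that difference concrete is to treat the initiative as a structured decision object rather than a row in a ranking. A minimal sketch in Python, where every field name is illustrative rather than a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Initiative:
    """A reviewer-ready decision object, not a ranked row.

    All fields are illustrative assumptions; a real schema would be
    shaped by the review process that consumes it.
    """
    proposal: str          # what change is proposed
    rationale: str         # why it matters
    value_hypothesis: str  # where the value comes from
    blockers: list = field(default_factory=list)      # what could block it
    evidence: list = field(default_factory=list)      # what supports it
    first_checks: list = field(default_factory=list)  # what to verify first

initiative = Initiative(
    proposal="Retire two low-velocity regional variants; consolidate into core packs",
    rationale="Cuts changeovers and planning exceptions in one customer cluster",
    value_hypothesis="Fewer short runs plus inventory defragmentation",
    blockers=["Key-customer assortment commitment", "Substitution risk unproven"],
    evidence=["12-month velocity data", "Changeover logs from the affected plant"],
    first_checks=["Run substitution review before commercial escalation"],
)
print(initiative.proposal)
```

The point of the structure is that the blockers and first checks travel with the proposal, so a reviewer never receives a recommendation stripped of its risks.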
The architecture that matters
The right pattern is a feedback system with three core layers, not a recommendation machine that leaves humans to clean up the mess.
1. Generator
The generator works from structured enterprise context:
- products, SKUs, plants, suppliers, customers, channels
- costs, volumes, service levels, and specifications
- historical initiative outcomes
- capability constraints and commercial guardrails
Its job is to propose plausible initiatives, not final truth.
2. Adversarial validator
Every initiative should be challenged before human review.
The validator asks questions like:
- Is this allowed by policy and customer constraints?
- Has this been tried before?
- Is the savings pool material?
- What assortment or substitution risk exists?
- What operational burden is removed or introduced?
- What is the cheapest disconfirming test?
This is one of the clearest differences between a serious enterprise system and a demo. A demo generates. A production system generates and attacks itself.
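A minimal sketch of that adversarial step, assuming the initiative is a plain dict and the checks are simple rule functions. The check names, fields, and materiality threshold are all illustrative assumptions; real validators would query policy, precedent, and cost systems:

```python
# Each check returns a challenge string, or None if it passes.

def check_policy(initiative):
    # Is this allowed by policy and customer constraints?
    if initiative.get("customer_locked"):
        return "Blocked: SKU is contractually committed to a key customer"

def check_precedent(initiative, memory):
    # Has this been tried before?
    prior = memory.get(initiative["sku"])
    if prior and prior["outcome"] == "blocked":
        return f"Tried before and blocked: {prior['reason']}"

def check_materiality(initiative, min_savings=50_000):
    # Is the savings pool material? Threshold is an illustrative assumption.
    if initiative.get("estimated_savings", 0) < min_savings:
        return "Savings pool below materiality threshold"

def validate(initiative, memory):
    """Attack the initiative before any human reviews it."""
    challenges = [
        check_policy(initiative),
        check_precedent(initiative, memory),
        check_materiality(initiative),
    ]
    return [c for c in challenges if c]

memory = {"SKU-123": {"outcome": "blocked", "reason": "retailer reset timing"}}
proposal = {"sku": "SKU-123", "estimated_savings": 30_000, "customer_locked": False}
print(validate(proposal, memory))
# Both the precedent check and the materiality check fire for this proposal.
```

Even this toy version shows the operating difference: the proposal arrives at human review already carrying the objections it must survive.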
3. Outcome memory
Every implementation, block, rejection, and stall should feed back into memory.
That is what turns one-off analysis into compounding intelligence. Without outcome memory, the system stays clever but forgetful. With outcome memory, it starts behaving more like institutional intelligence.
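A sketch of how outcome memory might be captured and queried, assuming a flat record store keyed by initiative type (the record shape and method names are assumptions for illustration, not a prescribed design):

```python
from collections import defaultdict

class OutcomeMemory:
    """Records what happened to past initiatives so future
    screening can retrieve precedent. Illustrative sketch only."""

    def __init__(self):
        self._by_type = defaultdict(list)

    def record(self, initiative_type, outcome, lesson):
        # outcome: e.g. "implemented", "blocked", "rejected", "stalled"
        self._by_type[initiative_type].append(
            {"outcome": outcome, "lesson": lesson}
        )

    def precedent(self, initiative_type):
        # Reusable history for the generator and validator to consult.
        return self._by_type[initiative_type]

    def block_rate(self, initiative_type):
        # A simple screening signal derived from accumulated outcomes.
        records = self._by_type[initiative_type]
        if not records:
            return 0.0
        blocked = sum(1 for r in records if r["outcome"] == "blocked")
        return blocked / len(records)

memory = OutcomeMemory()
memory.record("pack_delisting", "implemented", "Demand transferred to core pack")
memory.record("pack_delisting", "blocked", "Key customer rejected at reset review")
print(memory.block_rate("pack_delisting"))  # 0.5
```

The design choice that matters is that blocked and stalled initiatives are stored with the same care as implemented ones, because that is exactly the history enterprises usually throw away.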
A simple example
Take a seemingly straightforward initiative:
Remove two low-velocity pack variants from a regional assortment and redirect volume into the core pack architecture.
A weak system may rank that initiative on volume alone.
A stronger system asks:
- Are these genuinely substitutable?
- Will demand transfer or disappear?
- Do these variants matter for a major customer discussion?
- What is the real simplification benefit in planning, manufacturing, procurement, and inventory?
- Does the retailer reset cycle change the timing?
- What is the safest pilot market or customer cluster?
That is the shape of the real decision.
The value of the agent is not that it replaces judgment. The value is speed to better judgment.
Why this becomes a competitive requirement
AI is getting better, but that is not the main driver. The main driver is that the decision problem keeps getting harder.
Supply chains are carrying:
- more channel variation
- more pack and format proliferation
- more customer-specific constraints
- more regulatory complexity
- more post-merger systems fragmentation
- more pressure to find productivity without weakening service
So the cost of managing complexity manually keeps rising.
The companies that get ahead will build better decision systems, not prettier dashboards:
- faster evidence gathering
- better cross-system reconciliation
- stronger memory of prior initiatives
- tighter prioritization
- clearer reviewer artifacts
- more disciplined testing of risky assumptions
That shifts the strategic question.
It is no longer only:
Do we have a SKU rationalization framework?
It is increasingly:
Do we have an intelligent system that can continuously identify, challenge, prioritize, and learn from SKU complexity reduction initiatives across a fragmented operating environment?
If the answer is no, then yes, you are falling behind.
Final thought
The opportunity is not more recommendations. It is a system that gets better every time reality pushes back.
In supply chain, that means connecting generation, validation, cost-to-serve logic, substitution logic, capability constraints, stakeholder logic, and outcome memory into one operating loop.
When that loop exists, AI stops behaving like a smart assistant with amnesia. It starts behaving like institutional reasoning.
In this domain, memory compounds. And memory turns into margin.