Author: Marginstone
Category: Engineering
Reading Time: 6 min read
Tags: Agentic Workflows, System Design, AI, Philosophy

Agentic Workflows Manifesto

A philosophy for building agents that are useful, reliable, and worthy of trust. We believe the future of software execution is neither brittle automation nor unconstrained autonomy.

Preamble

The future of software will not be won by scripts alone. It will not be won by raw model autonomy either.

It will be won by agentic workflow design: environments where capable agents can act with range while staying pinned to explicit state, policy, and validation.

Scripts matter. Prompts matter. The environment matters more than both.

Good systems make good work easier than bad work. They keep uncertainty visible, evidence inspectable, and repair cheaper than reinvention.


What we believe

1. Environment matters more than prompting.

Prompts can nudge behavior. Environments decide it.

The real control surface in an agentic workflow is the combination of:

  • tools
  • visible state
  • artifact contracts
  • policies
  • validators
  • feedback loops

If outcomes are poor, the answer is usually not “write a cleverer prompt.” It is “design a better environment.”

2. The right pattern is deterministic substrate, agentic policy.

We do not want to script every move.

We want a deterministic substrate underneath a flexible policy layer.

The world the agent operates in should have stable artifacts, explicit state, tool contracts, and known validation rules. Inside that structure, the agent should still be able to investigate, sequence, retry, repair, and adapt.

Rigid scripts break on edge cases. Unbounded autonomy breaks trust. The design job is to give the agent room to reason without giving it room to fabricate.

3. Typed artifacts are better than fluent prose.

Agents perform better when the world contains explicit, stable objects.

We prefer:

  • structured files over informal notes
  • schemas over implied contracts
  • machine-checkable status over narrative reassurance
  • provenance-bearing state over ephemeral memory

A mature workflow is built from artifacts that can be inspected, validated, repaired, diffed, and reused.
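One way to make "machine-checkable status over narrative reassurance" concrete is a typed artifact whose contract can be checked directly. A minimal Python sketch; the `Finding` shape and its field names are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    claim: str
    source_ids: list          # provenance: where the claim came from
    confidence: str           # "high" | "medium" | "low"
    status: str = "draft"     # machine-checkable, not narrative

def validate(finding: Finding) -> list:
    """Return a list of defects; an empty list means the artifact is in contract."""
    defects = []
    if not finding.claim.strip():
        defects.append("empty claim")
    if not finding.source_ids:
        defects.append("no provenance")
    if finding.confidence not in {"high", "medium", "low"}:
        defects.append(f"bad confidence: {finding.confidence!r}")
    return defects

good = Finding("Latency fell 12% after the cache change", ["log:run-118"], "medium")
bad = Finding("", [], "certain")
print(validate(good))  # []
print(validate(bad))   # ['empty claim', 'no provenance', "bad confidence: 'certain'"]
```

Because the artifact is a plain value, it can also be diffed, serialized, and revalidated after repair, which is what makes it reusable across runs.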

4. Validation is part of the workflow, not a final gate.

Reliable systems are defined less by what they can generate than by what they can verify.

Every important workflow should make it easy to ask:

  • Is this grounded?
  • Is it in policy?
  • Is it complete?
  • Is it valid?
  • If it failed, why exactly?

The strongest systems do not wait until the end to discover that the work drifted.
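A sketch of what "validation as part of the workflow" can look like in practice: each stage runs its validators immediately, and any failure is recorded with its exact reason rather than surfacing as a bare pass/fail at the end. All names here (`run_stage`, `grounded`) are invented for illustration:

```python
def run_stage(name, stage_fn, validators, state):
    """Run one stage, then its validators, recording exactly what failed and why."""
    state[name] = stage_fn(state)
    failures = [msg for check in validators if (msg := check(state)) is not None]
    state.setdefault("report", {})[name] = failures or "ok"
    return not failures

def gather(state):
    # Stand-in for a real retrieval stage.
    return {"sources": ["doc-1", "doc-2"]}

def grounded(state):
    # Validator: returns None when in contract, a reason string when not.
    return None if state["gather"]["sources"] else "no sources: output is ungrounded"

state = {}
ok = run_stage("gather", gather, [grounded], state)
print(ok, state["report"])  # True {'gather': 'ok'}
```

If `grounded` had failed, the report would carry the reason string, so drift is discovered at the stage where it happened.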

5. Agents need bounded autonomy.

Agents need freedom in planning, diagnosis, tool choice, decomposition, and local repair. They do not need freedom to invent evidence, silently broaden scope, bypass validators, or hide uncertainty.

Useful freedom lives inside hard boundaries.

6. Dense feedback beats binary pass/fail.

A single success or failure signal at the end is weak.

Good workflows expose local signals such as:

  • schema valid or invalid
  • evidence sufficient or insufficient
  • confidence high, medium, or low
  • policy in-bounds or out-of-bounds
  • coverage complete, partial, or failed

Dense feedback is what makes search efficient and repair local.
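The local signals above can be carried as a structured value instead of a single boolean. A hedged sketch, assuming a made-up `Feedback` vocabulary (real systems would define their own):

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    schema: str      # "valid" | "invalid"
    evidence: str    # "sufficient" | "insufficient"
    confidence: str  # "high" | "medium" | "low"
    coverage: str    # "complete" | "partial" | "failed"

    def repair_targets(self):
        """Dense feedback tells the agent *where* to repair, not just that it failed."""
        return [name for name, want in [
            ("schema", "valid"),
            ("evidence", "sufficient"),
            ("coverage", "complete"),
        ] if getattr(self, name) != want]

fb = Feedback(schema="valid", evidence="insufficient",
              confidence="medium", coverage="partial")
print(fb.repair_targets())  # ['evidence', 'coverage']
```

A binary signal would force the agent to guess; the structured one narrows repair to two named dimensions.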

7. Reversibility is a design requirement.

Agents should be able to make small, safe, reversible moves.

A healthy workflow supports:

  • rerunning one stage
  • repairing one artifact
  • retrying one failed validator
  • comparing current and previous outputs
  • fixing a local defect without regenerating everything

Global regeneration should be the exception, not the default.
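One way to support these moves is an artifact store that keeps prior versions, so a single artifact can be repaired, compared, or rolled back without touching anything else. An illustrative sketch (the `ArtifactStore` interface is an assumption, not a reference design):

```python
import copy

class ArtifactStore:
    """Versioned artifact store: small, safe, reversible moves on one artifact."""

    def __init__(self):
        self._versions = {}           # artifact name -> list of snapshots

    def write(self, name, value):
        self._versions.setdefault(name, []).append(copy.deepcopy(value))

    def current(self, name):
        return self._versions[name][-1]

    def previous(self, name):
        # Enables comparing current and previous outputs.
        return self._versions[name][-2]

    def rollback(self, name):
        # Reversible: undo just this artifact, leave the rest of the run intact.
        self._versions[name].pop()

store = ArtifactStore()
store.write("summary", {"text": "v1"})
store.write("summary", {"text": "v2, with a defect"})
store.rollback("summary")
print(store.current("summary"))  # {'text': 'v1'}
```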

8. Human verification is a first-class design goal.

In many enterprise workflows, correctness cannot be established through deterministic tests alone.

Software engineering gives us useful proxies such as compilation, tests, and type checks. Many other domains do not. In research, operations, procurement, planning, and strategy, the crucial question is often whether a qualified human can verify the output efficiently.

We therefore design workflows so that:

  • important claims are adjacent to evidence
  • assumptions are explicit
  • uncertainty is visible
  • provenance is preserved
  • review is targeted by risk

We are after high-leverage human supervision, not blind automation.
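"Review targeted by risk" can be as simple as ordering the review queue so a time-constrained reviewer sees the riskiest and least-evidenced claims first. A sketch with invented claims and an assumed numeric `risk` field:

```python
claims = [
    {"claim": "Vendor price is $40k/yr", "risk": 0.9, "evidence": ["quote.pdf"]},
    {"claim": "Team prefers option B", "risk": 0.3, "evidence": ["notes:2024-05"]},
    {"claim": "Contract auto-renews in June", "risk": 0.7, "evidence": []},
]

def review_queue(claims):
    # Unevidenced claims jump the queue regardless of their stated risk score.
    return sorted(claims,
                  key=lambda c: (len(c["evidence"]) == 0, c["risk"]),
                  reverse=True)

for c in review_queue(claims):
    print(c["claim"])
# Contract auto-renews in June
# Vendor price is $40k/yr
# Team prefers option B
```

Keeping each claim adjacent to its evidence list is what lets the reviewer check it without redoing the task.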

9. Trust should be earned through inspectability.

Trust does not come from polish or speed. It comes from making it easy to answer:

  • Where did this come from?
  • What evidence supports it?
  • What was observed directly?
  • What was inferred?
  • What remains uncertain?
  • What should a skeptical reviewer check first?

An output that cannot be checked easily is not ready to be trusted.

10. Memory should be externalized.

Important context should not live only inside transient model state.

It should live in files, manifests, logs, scratchpads, structured memory stores, and other inspectable artifacts.

Externalized memory improves continuity, auditability, transferability, and multi-agent coordination.
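At its simplest, externalized memory is a manifest on disk that any later run, or any other agent, can inspect and resume from. A minimal sketch; the `manifest.json` name and state keys are illustrative:

```python
import json
import pathlib
import tempfile

def checkpoint(run_dir, state):
    """Write run state to an inspectable file instead of transient model memory."""
    path = pathlib.Path(run_dir) / "manifest.json"
    path.write_text(json.dumps(state, indent=2))
    return path

def resume(run_dir):
    """A later run (or a different agent) picks up exactly where this one left off."""
    return json.loads((pathlib.Path(run_dir) / "manifest.json").read_text())

with tempfile.TemporaryDirectory() as d:
    checkpoint(d, {"stage": "gather", "open_questions": ["pricing unverified"]})
    print(resume(d)["stage"])  # gather
```

Because the manifest is a plain file, it is also auditable after the fact, which transient context never is.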

11. Context should be engineered, not merely supplied.

More context is not automatically better.

Performance depends on what information is presented, in what order, at what level of abstraction, and with what retrieval strategy.

Good systems reveal the essential surface first and disclose deeper detail only when needed.

Context design is part of the control surface.
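"Reveal the essential surface first" can be implemented as progressive disclosure: the agent sees a summary and a table of contents, and fetches a section only when it decides it needs it. A sketch with an invented document shape:

```python
document = {
    "summary": "Migration plan: 3 phases, est. 6 weeks; main risk is the auth cutover.",
    "sections": {
        "phase-1": "Inventory services and freeze schema changes...",
        "phase-2": "Dual-write period with shadow reads...",
        "phase-3": "Auth cutover and rollback rehearsal...",
    },
}

def essential_surface(doc):
    # First disclosure: the summary plus the names of what else exists.
    return {"summary": doc["summary"], "available": sorted(doc["sections"])}

def expand(doc, key):
    # Deeper detail is fetched only when the agent asks for it.
    return doc["sections"][key]

print(essential_surface(document)["available"])  # ['phase-1', 'phase-2', 'phase-3']
```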

12. Reliability means convergence, not one impressive run.

One strong demo proves almost nothing.

A workflow is mature when multiple competent agents, taking different paths, converge toward similar high-quality outputs under the same constraints.

Convergence is the standard.


What we reject

We reject the claims that:

  • prompts alone can guarantee quality
  • autonomy is inherently good
  • one-shot outputs are trustworthy by default
  • hidden assumptions are acceptable in production workflows
  • one good run proves a workflow is mature
  • AI-native should mean unstructured

Our operating model

Deterministic substrate

  • structured state
  • typed artifacts
  • tool contracts
  • policy tables
  • validators
  • logs
  • memory stores

Agentic policy layer

  • planning
  • decomposition
  • sequencing
  • repair
  • retry
  • prioritization
  • adaptive tool use

Feedback layer

  • validation results
  • quality signals
  • confidence markers
  • failure codes
  • runtime traces
  • portfolio summaries

Human verification layer

  • reviewer-oriented summaries
  • evidence packets
  • claim-to-evidence adjacency
  • risk-based review queues
  • approval, rejection, and escalation surfaces

This is how we align flexibility with reliability.
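The three machine layers can be wired together in a few lines. A hedged sketch with all names invented for illustration: the validator is the deterministic substrate, the repair loop is the agentic policy, and the defect list plus attempt count are the feedback layer.

```python
def run_with_repair(generate, validate, repair, max_attempts=3):
    """Bounded autonomy: adaptive repair inside a hard validation boundary."""
    output = generate()
    for attempt in range(max_attempts):
        defects = validate(output)        # substrate: deterministic check
        if not defects:
            return output, attempt        # feedback: how many repairs it took
        output = repair(output, defects)  # policy: local repair, not regeneration
    raise RuntimeError(f"still failing after {max_attempts} attempts: {defects}")

draft = {"claim": "Q3 spend fell", "evidence": []}
validate = lambda o: [] if o["evidence"] else ["claim has no evidence"]
repair = lambda o, d: {**o, "evidence": ["ledger:2024-Q3"]}

result, attempts = run_with_repair(lambda: draft, validate, repair)
print(attempts, result["evidence"])  # 1 ['ledger:2024-Q3']
```

The human verification layer sits outside this loop: what survives validation still arrives with its evidence attached for review.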


Our standard

An agentic workflow is good when:

  1. the correct path is easier than the incorrect path
  2. failure is visible and actionable
  3. uncertainty is surfaced instead of hidden
  4. repair is possible without collapse
  5. multiple agents converge toward similar good results
  6. a time-constrained human can verify important claims without redoing the task from scratch
  7. outputs are useful in the real world

Final declaration

We are not trying to make software sound intelligent.

We are trying to make intelligence dependable.

That means substrate, constraints, validation, externalized memory, and human verification by design. It means trust earned through inspectable mechanism, not mystique.

We do not script the intelligence. We build the conditions under which intelligence can do useful work reliably.

That is agentic workflow design.