Agentic Workflows in Fintech: Orchestrating LLM Agents for Autonomous Decision-Making
How autonomous AI agents are transforming financial services — from loan underwriting to fraud investigation — and the architectural patterns that make them production-ready.
Introduction
The convergence of large language models and agentic architectures is opening new frontiers in financial services. Unlike traditional ML pipelines that output a single prediction, agentic workflows enable autonomous, multi-step reasoning — an LLM that can plan, use tools, retrieve documents, and take actions in a loop until a task is complete.
At American Express, we've been exploring how these patterns can transform core banking operations. This post distills what we've learned.
The Problem: Why Static Pipelines Fall Short
Traditional ML in fintech follows a rigid predict → threshold → action pattern:
# Traditional pipeline — brittle, single-step
prediction = model.predict(transaction_features)
if prediction > FRAUD_THRESHOLD:
    block_transaction()

This works for well-defined tasks but struggles with ambiguous, multi-faceted decisions — for example, investigating a suspicious pattern across multiple accounts, which requires cross-referencing documents, policies, and transaction histories.
Architecture: The Agent Loop
The core of an agentic system is the ReAct loop (Reason + Act). At each step the agent reasons over its current context, picks an action, and observes the result:

a_t = LLM(s_t),    o_t = execute(a_t),    s_{t+1} = s_t ⊕ (a_t, o_t)

where s_t is the current state, a_t is an action (a tool call), o_t is the resulting observation, and ⊕ denotes appending to the context. The cycle repeats until a termination condition is met.
class FinancialAgent:
    def __init__(self, llm, tools, memory):
        self.llm = llm
        self.tools = tools
        self.memory = memory

    async def run(self, task: str) -> str:
        """Execute the agent loop until task completion."""
        self.memory.add("user", task)
        for step in range(MAX_STEPS):
            response = await self.llm.generate(
                messages=self.memory.messages,
                tools=self.tools.schema(),
            )
            if response.is_final:
                return response.content
            # Execute the requested tool and feed the result back
            result = await self.tools.execute(response.tool_call)
            self.memory.add("tool", result)
        return "Max steps reached"

Key Design Decisions
1. Tool Design Granularity
Keep tools atomic and composable. A search_transactions tool should not also filter and summarize — let the LLM compose smaller primitives.
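As a sketch of what that granularity looks like in practice, here are three atomic primitives the LLM can chain. The tool names and the in-memory store are illustrative assumptions, not a real production API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    fn: Callable

# Toy in-memory store standing in for the real transaction database.
_TXNS = {
    "acct-1": [{"amount": 120.0, "merchant": "A"},
               {"amount": 9500.0, "merchant": "B"}],
}

def search_transactions(account_id: str) -> list[dict]:
    """Atomic: fetch raw transactions for one account (no filtering, no summarizing)."""
    return _TXNS.get(account_id, [])

def filter_large(txns: list[dict], threshold: float) -> list[dict]:
    """Atomic: keep only transactions above a threshold."""
    return [t for t in txns if t["amount"] > threshold]

def summarize(txns: list[dict]) -> str:
    """Atomic: condense a transaction list into one line."""
    return f"{len(txns)} transaction(s), total ${sum(t['amount'] for t in txns):,.2f}"

TOOLS = [
    Tool("search_transactions", "Fetch transactions for an account", search_transactions),
    Tool("filter_large", "Filter transactions above an amount", filter_large),
    Tool("summarize", "Summarize a transaction list", summarize),
]

# The LLM composes the primitives instead of calling one monolithic tool:
summary = summarize(filter_large(search_transactions("acct-1"), 1000.0))
# → "1 transaction(s), total $9,500.00"
```

Because each primitive does one thing, the same three tools also cover "summarize everything" or "just count large transactions" without any new tool code.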
2. Guardrails for Regulated Environments
In fintech, agents must operate within strict compliance boundaries:
- Action allowlists: Only pre-approved tools can be called
- Human-in-the-loop: High-value decisions require human approval
- Audit trails: Every reasoning step is logged immutably
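The three guardrails above can be composed into a single gate in front of every tool call. Here is a minimal sketch, assuming a fixed allowlist and a hypothetical $10,000 escalation threshold (both would come from policy configuration, and the log would go to append-only storage, in a real system):

```python
import json
import time

ACTION_ALLOWLIST = {"search_transactions", "flag_account"}  # pre-approved tools
HITL_THRESHOLD = 10_000.0  # assumed policy value: above this, a human must approve

audit_log: list[str] = []  # stands in for an immutable, append-only store

def guarded_execute(tool_name: str, args: dict, amount: float = 0.0) -> str:
    """Check allowlist and escalation rules before any tool runs; log every decision."""
    entry = {"ts": time.time(), "tool": tool_name, "args": args}
    if tool_name not in ACTION_ALLOWLIST:
        entry["outcome"] = "rejected: not allowlisted"
    elif amount > HITL_THRESHOLD:
        entry["outcome"] = "queued: human approval required"
    else:
        entry["outcome"] = "executed"
    audit_log.append(json.dumps(entry))  # every path is logged, including rejections
    return entry["outcome"]
```

Note that rejections and escalations are logged too: the audit trail records what the agent tried to do, not just what it was allowed to do.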
3. Memory Architecture
Short-term (conversation) + long-term (vector DB) memory enables agents to learn from past investigations without retraining.
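A toy sketch of that split is below. Retrieval here is naive keyword overlap so the example stays self-contained; a real deployment would embed the notes and query a vector DB instead:

```python
class AgentMemory:
    """Short-term conversation buffer plus a toy long-term store."""

    def __init__(self):
        self.messages: list[dict] = []   # short-term: the current conversation
        self.long_term: list[str] = []   # long-term: notes from past investigations

    def add(self, role: str, content: str) -> None:
        """Append a turn to the current conversation."""
        self.messages.append({"role": role, "content": content})

    def archive(self, note: str) -> None:
        """Persist a finding so future investigations can recall it."""
        self.long_term.append(note)

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return the k archived notes sharing the most words with the query."""
        q = set(query.lower().split())
        ranked = sorted(self.long_term,
                        key=lambda n: len(q & set(n.lower().split())),
                        reverse=True)
        return ranked[:k]
```

The agent loop shown earlier only reads `memory.messages`; `recall` results are injected into that buffer at the start of a task, which is how past investigations inform new ones without touching model weights.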
Key Takeaways
- Agentic workflows excel at ambiguous, multi-step tasks that traditional pipelines can't handle
- Tool design is more important than model choice — well-designed tools make mediocre models effective
- Guardrails are non-negotiable in regulated industries — build them into the architecture, not as afterthoughts
- Start with human-in-the-loop and gradually increase autonomy as trust is established
References
- Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (2023)
- Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
- Chase, "LangChain: Building applications with LLMs through composability" (2023)
Written by
Rohit Raj
Senior AI Engineer @ American Express