Rethinking Process Engines in the Age of AI




As AI reshapes how software is built, it is worth rethinking the value of visual, no-code approaches for describing complex business workflows. This article argues that for complex, long-running, and mission-critical processes, especially those incorporating AI, a "workflows-as-code" approach powered by durable execution engines offers a more robust and flexible alternative to traditional visual process modelers.
The Modern Workflow Challenge: Three Distinct Use Cases
Based on my experience, organizations tend to use process engines for three primary categories of workflows:
- Business Process Management (BPM): Handling internal and external, usually manual procedures required for running a business, such as signing contracts, monthly reporting, hiring new staff, or handling complaints.
- Office Task Automation: The democratization of tools for automating daily chores, such as creating CRM records for meetings, sending files and emails, or posting Slack/Teams notifications.
- Client-Centric Solutions: Workflows that are a core part of a custom-built solution for customers. Examples include loan application processing, handling payments, and managing financial transfers.
Tools optimized for one area often fall short in others. There is no one-size-fits-all solution, and understanding the differences between these use cases is key to choosing the right tool.
The Limits of Visual Modeling (BPMN)
For the last 20 years, the idea of "runnable diagrams" using Business Process Model and Notation (BPMN) has been dominant. This approach promised that the process definition would double as its documentation, and that non-developers could modify processes with a drag-and-drop approach.
However, my observations from working with financial organizations show a different reality:
- Organizations have hundreds of complex process definitions that are no longer easy to read, because representing a complex process as a graph is not always optimal.
- Despite BPMN being a standard, heavy customization and tool-specific extensions prevent easy migration of diagrams between the different engines that companies have accumulated over the years.
- Given the complexity, the number of integrations, custom parameters and plugins, workflow process definitions cannot be easily modified by non-developers.
- The actual control flow is often hidden as it depends on external data and decision-making agents not visible in the diagram.
- In practice, modifying in-flight workflows and versioning processes presents significant challenges.
Where BPMN-based tools shine is in managing multiple versions of workflows and handling thousands of simple, company-internal processes. They excel at involving people via predefined UI forms, task assignments, reminders, and escalations. Some tools also provide Robotic Process Automation (RPA) to integrate with software that is not "API-friendly." When implemented correctly, this can provide a good overview of internal procedures happening inside the company.
Introducing non-deterministic AI components makes process definitions even more complex. Existing process definitions may acquire agentic components, and many new workflows will be created to accommodate them. Graph-based workflows lack the abstractions needed to design dynamic control flows based on agentic decisions and tool selection. Implementing compensation logic for the many potential failure points can become more complex than the workflow itself.
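The contrast becomes concrete in code. A dynamic, agent-driven control flow that is awkward to draw as a static graph is just a loop in a general-purpose language. A minimal sketch, where the `agent_decide` stub and the tool names are hypothetical stand-ins for an LLM call and real integrations:

```python
# Toy illustration of an agent-driven control flow that resists static
# graph modeling: the next step depends on runtime decisions, not on a
# fixed set of edges. The decision function and tools are hypothetical.

def agent_decide(state):
    """Stub for an LLM/agent call: picks the next tool based on state."""
    if "document" not in state:
        return "fetch_document"
    if not state.get("verified"):
        return "verify"
    return "done"

TOOLS = {
    "fetch_document": lambda s: {**s, "document": "contract.pdf"},
    "verify": lambda s: {**s, "verified": True},
}

def run(state):
    # The control flow is a loop, not a node graph: the agent may invoke
    # tools in any order, any number of times, until it decides to stop.
    while (step := agent_decide(state)) != "done":
        state = TOOLS[step](state)
    return state

result = run({})
```

In a graph-based tool, every possible tool invocation and ordering would need explicit nodes and edges; in code, the dynamism costs three lines.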
The more complex and automated the processes are, the less value is brought by the human-centric advantages of BPMN process engines.
Simple versions of BPMN-like processes are sometimes maintained as Miro boards or similar tools to build a common understanding, and only later recreated in a BPMN tool. What brings the most value is the common language maintained across the solution, rather than the tool itself.
The Hidden Complexity of "Workflows-as-Code"
Given the limitations of visual tools, designing workflows directly in a programming language seems more effective for expressing complex scenarios, loops, and agentic decisions. It also allows developers to use familiar patterns for testing and versioning.
However, two factors make a naive "as-code" implementation difficult:
- Long-Running Processes: A process like a loan application can take days to complete, involving both automated steps and human activities. The application must "survive" server restarts and new code deployments.
- Integration Failures: Calls to external services (like a CRM or KYC/AML provider) can fail due to errors or network issues. Such actions must be retried or handled without restarting the entire process from scratch.
Implementing these requirements from scratch often results in a database-backed state machine. The foundational "framework" code for managing state, handling retries, and upgrading processes quickly becomes more complex than the business logic, slowing down feature development.
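A minimal sketch shows what such a hand-rolled, database-backed state machine tends to look like. Here an in-memory dict stands in for the database, and the step names are hypothetical; note how little of the code is business logic:

```python
import json

# In-memory stand-in for a database table of process instances.
DB = {}

STEPS = ["credit_check", "kyc", "approve"]  # hypothetical business steps

def business_logic(step, data):
    """The actual business logic: one line per step in this toy example."""
    data[step] = "ok"
    return data

def save(pid, state):
    DB[pid] = json.dumps(state)          # persist after every step

def load(pid):
    return json.loads(DB[pid]) if pid in DB else {"step": 0, "data": {}}

def run(pid, fail_at=None):
    """Framework code: load state, resume at the right step, persist
    progress. This scaffolding already dwarfs the business logic above,
    and real versions also need retries, timeouts, and migrations."""
    state = load(pid)
    while state["step"] < len(STEPS):
        step = STEPS[state["step"]]
        if step == fail_at:              # simulate a crash mid-process
            raise RuntimeError(f"crashed at {step}")
        state["data"] = business_logic(step, state["data"])
        state["step"] += 1
        save(pid, state)
    return state["data"]

# A crash during "kyc" loses no work: a later run resumes from the log.
try:
    run("loan-42", fail_at="kyc")
except RuntimeError:
    pass
result = run("loan-42")
```

Everything except `business_logic` is undifferentiated infrastructure, and it still handles none of the hard parts (concurrent workers, versioned deployments, compensation).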
The Pitfalls of Event-Driven Choreography
To avoid building a monolithic orchestrator, architects often turn to event-driven choreography. The idea is simple: each step is a small service that listens for an event, performs its task, and emits a new event for the next step to consume. This pattern avoids long-running processes by design, breaking them into many short-lived, event-triggered functions.
Unfortunately, this approach introduces its own significant drawbacks, making it very hard to answer critical business questions:
- Which step is a specific process currently stuck at?
- How many processes are "in flight," and what are they waiting for?
- What is the complete end-to-end flow required to finish the process?
This approach requires extensive monitoring to ensure that every process eventually completes or times out. Complexity explodes when synchronization is needed: imagine requiring two approvals that can arrive in any order over two days. Handling such an event-stream "join" requires stateful execution, with all its associated drawbacks around reliability, clustering, and data handling: the very problems we were trying to avoid.
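The "join" problem fits in a few lines, yet even this toy version needs durable state keyed by process id, which is exactly what the short-lived-function model tries to avoid. Names are illustrative:

```python
# Toy event handler joining two approvals that may arrive in any order,
# possibly days apart. In a real system the `received` store must survive
# restarts, which reintroduces stateful execution into the choreography.

received = {}            # process_id -> set of approval kinds seen so far
completed = []           # process_ids whose join has closed

REQUIRED = {"compliance", "manager"}

def on_approval(process_id, kind):
    seen = received.setdefault(process_id, set())
    seen.add(kind)
    if seen >= REQUIRED:                 # both approvals present: proceed
        completed.append(process_id)
        del received[process_id]

on_approval("p1", "manager")
on_approval("p2", "compliance")
on_approval("p1", "compliance")          # second approval closes the join
```

The handler itself is stateless only in appearance: the partial-approval state in `received` is the hidden orchestrator, and it must be persisted, replicated, and garbage-collected somewhere.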
Inevitably, an "orchestrator component" re-emerges within the choreography pattern. One of the best-known examples is Netflix Conductor, which, ironically, describes a workflow as a graph (in JSON instead of BPMN). The solution is quite popular, currently maintained as the open-source project Conductor OSS and provided as SaaS by Orkes.
An Alternative: The Rise of Durable Execution Engines
A more effective approach combines the flexibility of code with the execution guarantees of a traditional workflow engine. In recent years, several "durable execution engines" have emerged to solve this problem. Notable examples include:
- Cadence (Open Source from Uber)
- Temporal (Open Source and probably the most popular one)
- Restate (Open Source combining workflow and event log)
The concepts behind these solutions are very similar:
- Treat the workflow code execution as a log, so the workflow can be re-run, paused, and resumed with its state preserved, independent of wall-clock time.
- Isolate process steps as "tasks" or "activities" that may fail, so the framework can handle retries and errors and support compensation actions.
This split between workflow code (usually expressed in Java, Python, Go, TypeScript, etc.) and activity code brings the flexibility of an "as-code" solution while the framework handles the rest. Code execution gains built-in reliability guarantees that are critical in financial and other mission-critical solutions.
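The replay idea can be illustrated without any framework. The toy runner below records each activity result in a log; on re-execution it replays logged results instead of calling the activity again, so a crashed workflow resumes deterministically. This is a simplified sketch of the mechanism, not any engine's actual API:

```python
class DurableRun:
    """Toy durable executor: results of completed activities are replayed
    from the log instead of being re-executed after a crash or restart."""
    def __init__(self, log):
        self.log = log        # would live in durable storage in a real engine
        self.pos = 0
        self.executed = []    # tracks real executions, for illustration only

    def activity(self, name, fn):
        if self.pos < len(self.log):     # replay path: reuse logged result
            result = self.log[self.pos]
        else:                            # first-execution path: run and log
            result = fn()
            self.log.append(result)
            self.executed.append(name)
        self.pos += 1
        return result

def workflow(run, crash=False):
    a = run.activity("credit_check", lambda: "ok")
    if crash:
        raise RuntimeError("process crashed")    # e.g. a deploy or outage
    b = run.activity("open_account", lambda: "acct-1")
    return (a, b)

log = []
try:
    workflow(DurableRun(log), crash=True)        # crashes after one activity
except RuntimeError:
    pass
resumed = DurableRun(log)                        # "restart" with the same log
result = workflow(resumed)
```

On the second run, `credit_check` is replayed from the log rather than re-executed, so side effects are not repeated; this is why workflow code in such engines must stay deterministic while non-deterministic work lives in activities.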
Furthermore, in the AI era it is possible to “talk with the code” using LLMs and streamline the process implementation like any other code.
Of course, these solutions are raw processing engines; compared to BPMN workflow engines, they lack human-centric features such as task assignments and automated UI generation.
Based on my experience, implementing a customer-onboarding process with Temporal turned out to be no more complex than a BPMN-based solution, and implementing a UI using the Temporal API for both the back-office and the client was quite easy. Integration options are much wider with custom code as you are not limited to connectors, and the code is arguably more readable than a BPMN graph with hidden data passing between nodes.
A key lesson learned from the BPMN-like approach is to reuse as much business language as possible in Temporal workflow descriptions to enable understanding and troubleshooting.
The major drawback compared to graph-based tools (like BPMN or Conductor) is that Temporal cannot show the process ahead of its current state.
What about agentic AI frameworks?
Every AI application is a workflow. As organizations build more sophisticated "agentic" systems, the available frameworks fall into two camps: no-code solutions for quick agent creation and code-based libraries (mostly in Python) for developers.
When selecting a tool, it's crucial to understand the use case. Is the goal to add an AI feature to an existing business process or to build a new, AI-native, client-centric solution? Most emerging AI frameworks focus on LLM integration, making strong assumptions about the workflow. A failed workflow often means restarting it from the beginning; this is acceptable for simple cases but not for reliable business processes.
Before adopting the latest agentic framework for a serious financial application, ask these questions:
- How does it handle long-running processes that may last for days?
- How does it handle integration and network failures?
- How does it manage process state across restarts and releases?
- How does it handle load balancing and versioning?
Issues like distributed state management, retries, observability, and audit logging are already solved by durable execution engines like Temporal and Restate. Implementing this reliability layer is often much harder than the LLM integration itself.
This raises a fundamental question: beyond democratization, what is the value of a no-code interface for AI when AI itself is increasingly "pro-code"?
Conclusion: Choosing the Right Tool for the Job
This article's purpose is not to dismiss no-code tools or BPMN/RPA engines. Business Process Management solutions are widely adopted, and at GFT, we have a successful track record of implementing them for banking clients, with many such projects currently underway. When writing this article, I was able to exchange experiences with my fellow architects who support a variety of customers and bring different perspectives.
Choosing the right tool for a specific project requires careful analysis based on key architectural drivers. We support our customers in this selection process, whether they are migrating workflows, building new products, or initiating process automation within the project framework.
What is most important is to understand the wide range of options available to ensure selection of the most effective workflow solution in the age of AI. Ultimately, it is a strategic decision: leverage a graph-based approach and accept its complexity and potential for vendor lock-in, or adopt a code-driven approach and build the missing human-centric features around it.



