Below is an explanation of what’s likely happening and why the Wait node appears to behave unexpectedly in your setup.
How the Wait Node Resumes Execution
The Wait node pauses a workflow and later resumes the same execution through a generated resume URL (available as the variable $execution.resumeUrl). Normally, when a request (for example, a webhook call) reaches that URL, the original execution’s data is reloaded from the database and processing continues from where it left off.
What’s Likely Causing the Issue
In your scenario, the workflow reaches the Wait node after triggering the Azure DevOps pipeline. However, the logs show that the original execution is “removed” (i.e. offloaded or cleared from active memory) about 30 seconds later. When the resume URL is called after this delay, n8n starts a new execution from the webhook call.
Because the Wait node is not a trigger, it cannot act as the entry point of a new execution, so this new execution has no valid starting node. This results in the error message:
“No node to start the workflow from could be found (execution 559)”
This behavior is more noticeable with Queue mode enabled because the execution is truly offloaded to the database after a certain delay. In this state, once the original execution is removed, the system no longer has the context needed to “resume” from the Wait node.
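As a rough mental model of the failure described above, the sketch below treats paused executions as entries in a store keyed by execution ID. It is not n8n's actual implementation: the class WaitingExecutionStore, its methods, the sample data, and execution ID 558 are all hypothetical, and only the error text is taken from the message above. The model also collapses "a new execution starts and finds no start node" into a single failed lookup.

```python
class WaitingExecutionStore:
    """Toy stand-in for wherever paused executions are kept (hypothetical)."""

    def __init__(self):
        self._paused = {}  # execution id -> data saved at the Wait node

    def pause(self, execution_id, data):
        # Reaching the Wait node: save the context so it can be resumed later.
        self._paused[execution_id] = data

    def remove(self, execution_id):
        # The execution is "removed" (offloaded or cleared) before the callback.
        self._paused.pop(execution_id, None)

    def resume(self, execution_id):
        # A call to the resume URL only works if the saved context still exists.
        if execution_id not in self._paused:
            raise LookupError(
                "No node to start the workflow from could be found "
                f"(execution {execution_id})"
            )
        return self._paused.pop(execution_id)


store = WaitingExecutionStore()

# Happy path: the callback arrives while the context is still stored.
store.pause(558, {"pipeline": "started"})
resumed = store.resume(558)

# Failure path: the context is removed first, then the callback arrives.
store.pause(559, {"pipeline": "started"})
store.remove(559)
try:
    store.resume(559)
    error = None
except LookupError as exc:
    error = str(exc)
```

The only difference between the two paths is whether the saved context still exists when the resume call arrives.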
Why It Differs from Your Expectations
During testing, you might see all the nodes run correctly and the pipeline start as expected. The confusion arises because:
- The Wait node generates a resume URL tied to the original execution.
- In Queue mode in your containerized environment, the original execution data is purged (or “removed”) once the wait condition is met or after a certain timeout.
- When the resume URL is eventually invoked, it triggers a new workflow execution from the webhook processor rather than resuming the initial execution.
This is why you see errors in both the main processing log and on nodes like the webhook processor, even though you expected the original execution to handle the entire workflow.
What You Can Do
While splitting the workflow to use a dedicated webhook entry point is one workaround, here are a few points to consider if you wish to stick with the current design:
- Timeliness of the resume call: Make sure the resume webhook is called while the original execution context still exists. In modes such as Queue mode, where the execution may be offloaded after a delay, a late call can arrive after that context is gone.
- Execution mode considerations: Queue mode offloads execution data based on timing. In environments like Azure Container Apps, additional factors (such as container scaling or network latency) may push the resume call past the point at which the execution is purged.
- Workflow design implication: The Wait node’s resume URL is not meant to be a new entry point for a workflow. It is intended to “wake up” the same paused execution with its stored data. If that context is lost for any reason, you’ll see errors like the one you’re encountering.
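To make the timeliness point concrete, here is a minimal sketch of the calling side, for example a final pipeline step that calls back into n8n. It assumes the pipeline was handed the value of $execution.resumeUrl when it was triggered. The URL shown, the function name build_resume_request, and the payload fields are placeholders, and whether your Wait node expects POST or another method depends on how it is configured.

```python
import json
import urllib.request


def build_resume_request(resume_url, payload):
    """Build (but do not send) a POST request to the Wait node's resume URL."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        resume_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: in practice, use the exact value of $execution.resumeUrl.
request = build_resume_request(
    "https://n8n.example.com/webhook-waiting/559",
    {"status": "succeeded"},
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

Passing the resume URL through unchanged matters here, since it is tied to one specific execution.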
Recommended Approach
If adjusting the timing isn’t feasible, you may need to consider one of the following:
- Split the workflow: Use a dedicated webhook-triggered workflow to continue the process. One workflow triggers the pipeline, and a second workflow with its own webhook entry point handles the callback. This is not ideal if you prefer a single workflow, but it matches how n8n expects resume behavior to work when the original execution context isn’t available.
- Optimize the environment: Check whether tuning the settings related to execution offloading or timing in Queue mode mitigates the delay. Note, however, that architectural constraints (like those in Azure Container Apps) could impose delays of their own.
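The split-workflow option can be pictured roughly as follows: the first workflow saves whatever the second one will need under a correlation key, and the webhook-triggered workflow looks that key up when the callback arrives. Everything in this sketch is hypothetical, including the function names, the run ID, and the use of a Python dict as storage. In a real setup the storage would have to live somewhere both workflows can reach, such as a database.

```python
# Hypothetical shared storage; in practice this would live outside n8n's
# execution state (for example in a database both workflows can reach).
pending_runs = {}


def start_pipeline(run_id, context):
    """Workflow A: trigger the pipeline and remember the context by run ID."""
    pending_runs[run_id] = context
    # The pipeline would be told to call Workflow B's webhook with this run ID.
    return {"run_id": run_id}


def handle_callback(payload):
    """Workflow B: webhook entry point that continues the process."""
    context = pending_runs.pop(payload["run_id"], None)
    if context is None:
        return {"ok": False, "reason": "unknown run_id"}
    return {"ok": True, "context": context, "result": payload["result"]}


start_pipeline("run-42", {"requested_by": "alice"})
outcome = handle_callback({"run_id": "run-42", "result": "succeeded"})
repeat = handle_callback({"run_id": "run-42", "result": "succeeded"})
```

Because the second workflow starts from its own webhook, it does not depend on the first execution still being available. A repeated or unknown callback is rejected rather than starting an execution with no context.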
Summary
The behavior you’re seeing occurs because, in Queue mode on your deployed setup, the original execution (tied to the Wait node) has already been removed from active memory by the time the resume webhook is called. The resume URL then starts a new execution in which the Wait node isn’t a valid entry point, which causes the error.
I hope this helps clarify why the Wait node behaves the way it does on your setup. If you have further questions on configuring the Wait node or need assistance with workflow design adjustments, feel free to ask!