Logic Apps Error Handling With Nested Scopes Broken

by Admin

Hey guys, are you seeing problems with error handling in your Logic Apps workflows? Specifically, is your try/catch logic failing when scopes are nested? Several of us seem to be hitting this, especially with the Service Bus peek-lock pattern that renews the message lock while processing runs. Let's dive into the details and see what's going on.

Describe the Bug

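We've got workflows that rely on the Service Bus peek-lock mechanism combined with lock renewal. Peek-lock keeps a message locked until the workflow explicitly completes or abandons it, and renewing the lock keeps it from expiring while long-running business logic is still in progress. Check out the workflow template: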

[Diagram of Workflow Template]

The template sets up a workflow structure like this:

[Diagram of Workflow Structure]

The idea here is that if something goes wrong within the Business Logic scope (Business_Logic_Scope in the workflow JSON), the error should bubble up to the parent Scope. This is vital because a failed parent Scope triggers the Compensation Logic action, which is followed by abandoning the message instead of completing it.

[Diagram of Error Handling Logic]
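
Condensed from the workflow JSON further down, the wiring this relies on looks roughly like the fragment here. Only the runAfter properties are shown; the type and inputs of each action are left out for brevity, so treat it as an illustration rather than something to paste into a workflow.

```json
{
    "Compensation_Logic": {
        "runAfter": {
            "Scope": [ "FAILED", "SKIPPED", "TIMEDOUT" ]
        }
    },
    "Abandon_the_message_in_a_queue": {
        "runAfter": {
            "Compensation_Logic": [ "SUCCEEDED" ]
        }
    },
    "Complete_the_message_in_a_queue": {
        "runAfter": {
            "Scope": [ "SUCCEEDED" ]
        }
    }
}
```

In other words, whether the message is abandoned or completed depends entirely on the status the parent Scope ends up with.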

The Problem

Here's the kicker: in a more recent release of Logic Apps, this error propagation no longer seems to work as expected. The parent Scope does not pick up the error from the nested Business Logic scope. Instead, it reports a Succeeded status, so the message is completed rather than abandoned. This effectively breaks the error handling mechanism: every workflow run appears to succeed, even when the business logic failed.

Plan Type

We're seeing this issue in the Standard plan type.

Steps to Reproduce the Bug or Issue

To reproduce this bug, you can create a workflow using the template mentioned above. Then, introduce an action within the Business Logic Scope that is guaranteed to fail. A simple example is an HTTP GET request to an invalid address:

[Diagram of HTTP Get Action Configuration]

When the workflow runs, the HTTP request fails. However, instead of abandoning the message (as it should), the workflow completes it because the parent Scope is incorrectly marked as succeeded:

[Diagram of Workflow Run Result]

Workflow JSON

Here's the JSON definition of the workflow for your reference:

{
    "definition": {
        "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
        "actions": {
            "Abandon_the_message_in_a_queue": {
                "type": "ServiceProvider",
                "inputs": {
                    "parameters": {
                        "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                        "lockToken": "@triggerBody()?['lockToken']"
                    },
                    "serviceProviderConfiguration": {
                        "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                        "operationId": "abandonQueueMessageV2",
                        "serviceProviderId": "/serviceProviders/serviceBus"
                    }
                },
                "runAfter": {
                    "Compensation_Logic": [
                        "SUCCEEDED"
                    ]
                }
            },
            "Compensation_Logic": {
                "type": "Compose",
                "inputs": "Replace this action with the actions for your exception handling logic.",
                "runAfter": {
                    "Scope": [
                        "FAILED",
                        "SKIPPED",
                        "TIMEDOUT"
                    ]
                }
            },
            "Complete_the_message_in_a_queue": {
                "type": "ServiceProvider",
                "inputs": {
                    "parameters": {
                        "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                        "lockToken": "@triggerBody()?['lockToken']"
                    },
                    "serviceProviderConfiguration": {
                        "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                        "operationId": "completeQueueMessageV2",
                        "serviceProviderId": "/serviceProviders/serviceBus"
                    }
                },
                "runAfter": {
                    "Scope": [
                        "SUCCEEDED"
                    ]
                }
            },
            "Initialize_Process_Complete_Flag": {
                "type": "InitializeVariable",
                "inputs": {
                    "variables": [
                        {
                            "name": "processCompleted",
                            "type": "boolean",
                            "value": false
                        }
                    ]
                },
                "runAfter": {}
            },
            "Scope": {
                "type": "Scope",
                "actions": {
                    "Business_Logic_Scope": {
                        "type": "Scope",
                        "actions": {
                            "Business_Logic": {
                                "type": "Compose",
                                "inputs": "Replace this action with the actions for your business logic."
                            },
                            "HTTP": {
                                "type": "Http",
                                "inputs": {
                                    "uri": "https://someinvalidurl321.com",
                                    "method": "GET"
                                },
                                "runAfter": {
                                    "Business_Logic": [
                                        "SUCCEEDED"
                                    ]
                                },
                                "runtimeConfiguration": {
                                    "contentTransfer": {
                                        "transferMode": "Chunked"
                                    }
                                }
                            }
                        }
                    },
                    "Process_Finished": {
                        "type": "SetVariable",
                        "inputs": {
                            "name": "processCompleted",
                            "value": true
                        },
                        "runAfter": {
                            "Business_Logic_Scope": [
                                "SUCCEEDED",
                                "TIMEDOUT",
                                "SKIPPED",
                                "FAILED"
                            ]
                        }
                    },
                    "Until": {
                        "type": "Until",
                        "expression": "@equals(variables('processCompleted'),true)",
                        "limit": {
                            "count": 60,
                            "timeout": "PT1H"
                        },
                        "actions": {
                            "Renew_lock_on_a_message_in_queue": {
                                "type": "ServiceProvider",
                                "inputs": {
                                    "parameters": {
                                        "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                                        "lockToken": "@triggerBody()?['lockToken']"
                                    },
                                    "serviceProviderConfiguration": {
                                        "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                                        "operationId": "renewLockQueueMessageV2",
                                        "serviceProviderId": "/serviceProviders/serviceBus"
                                    }
                                }
                            },
                            "Wait_for_Process_to_Complete": {
                                "type": "Wait",
                                "inputs": {
                                    "interval": {
                                        "count": "@parameters('delayInMinutes_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                                        "unit": "Minute"
                                    }
                                },
                                "runAfter": {
                                    "Renew_lock_on_a_message_in_queue": [
                                        "SUCCEEDED"
                                    ]
                                }
                            }
                        }
                    }
                },
                "runAfter": {
                    "Initialize_Process_Complete_Flag": [
                        "SUCCEEDED"
                    ]
                }
            }
        },
        "contentVersion": "1.0.0.0",
        "outputs": {},
        "triggers": {
            "When_messages_are_available_in_a_queue_(peek-lock)": {
                "type": "ServiceProvider",
                "inputs": {
                    "parameters": {
                        "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                        "maxMessageBatchSize": "@parameters('messageBatchSize_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')"
                    },
                    "serviceProviderConfiguration": {
                        "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                        "operationId": "peekLockQueueMessagesV2",
                        "serviceProviderId": "/serviceProviders/serviceBus"
                    }
                },
                "splitOn": "@triggerOutputs()?['body']"
            }
        }
    },
    "kind": "stateful"
}

Impact

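This bug significantly impacts the reliability of workflows that depend on proper error handling with nested scopes. A completed message is removed from the queue, so when a failed run completes its message instead of abandoning it, that message is never retried and the failure can go unnoticed, leading to data loss or inconsistent processing.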

Possible Causes

It's possible that a recent update to the Logic Apps runtime introduced a regression in how a scope's status is derived from the actions nested inside it, including how a failure in an inner scope is propagated to the outer one.
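
One detail in the definition may be worth ruling out while investigating: inside the parent Scope, Process_Finished is set to run after Business_Logic_Scope for every outcome, including FAILED. The fragment here is condensed from the workflow JSON above, with the inputs omitted.

```json
{
    "Process_Finished": {
        "type": "SetVariable",
        "runAfter": {
            "Business_Logic_Scope": [ "SUCCEEDED", "TIMEDOUT", "SKIPPED", "FAILED" ]
        }
    }
}
```

That makes Process_Finished and Until, not Business_Logic_Scope, the last actions to finish inside Scope. As far as we understand Microsoft's error handling documentation, a scope's status is derived from its final actions, so if the runtime now applies that rule more strictly than before, it could explain why Scope reports Succeeded. This is a guess on our part and not something the original report confirms.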

Next Steps

  1. Verify the Issue: Try reproducing this bug in your own Logic Apps environments to confirm that it's not isolated.
  2. Report to Microsoft Support: If you can reproduce the issue, file a bug report with Microsoft Support, providing detailed steps and the workflow JSON.
  3. Explore Workarounds: While waiting for a fix, investigate potential workarounds, such as:
    • Consolidating Scopes: Restructure the workflow to minimize nested scopes, if feasible.
    • Manual Error Propagation: Make the outcome of the parent Scope depend explicitly on the outcome of the Business Logic Scope instead of relying on implicit propagation.
    • Custom Error Tracking: Implement a custom mechanism to track errors and trigger compensation logic.
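
As a rough sketch of the custom error tracking idea, the two fragments here record the failure in a variable and then branch on it. The names processFailed, Mark_Process_Failed, Initialize_Process_Failed_Flag, Complete_or_Abandon and Abandon_after_tracked_failure are made up for this example and are not part of the template, and we have not tested this against the affected runtime. The first fragment would go inside the "actions" of the parent Scope, next to Process_Finished:

```json
{
    "Mark_Process_Failed": {
        "type": "SetVariable",
        "inputs": {
            "name": "processFailed",
            "value": true
        },
        "runAfter": {
            "Business_Logic_Scope": [
                "FAILED",
                "TIMEDOUT"
            ]
        }
    }
}
```

The second fragment would go into the top-level "actions", with Complete_or_Abandon taking the place of the existing top-level Complete_the_message_in_a_queue action:

```json
{
    "Initialize_Process_Failed_Flag": {
        "type": "InitializeVariable",
        "inputs": {
            "variables": [
                {
                    "name": "processFailed",
                    "type": "boolean",
                    "value": false
                }
            ]
        },
        "runAfter": {
            "Initialize_Process_Complete_Flag": [ "SUCCEEDED" ]
        }
    },
    "Complete_or_Abandon": {
        "type": "If",
        "expression": {
            "and": [
                { "equals": [ "@variables('processFailed')", true ] }
            ]
        },
        "actions": {
            "Abandon_after_tracked_failure": {
                "type": "ServiceProvider",
                "inputs": {
                    "parameters": {
                        "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                        "lockToken": "@triggerBody()?['lockToken']"
                    },
                    "serviceProviderConfiguration": {
                        "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                        "operationId": "abandonQueueMessageV2",
                        "serviceProviderId": "/serviceProviders/serviceBus"
                    }
                }
            }
        },
        "else": {
            "actions": {
                "Complete_the_message_in_a_queue": {
                    "type": "ServiceProvider",
                    "inputs": {
                        "parameters": {
                            "queueName": "@parameters('queueName_Get_Message_Service_Bus_Queue_Peek_Lock_Renew')",
                            "lockToken": "@triggerBody()?['lockToken']"
                        },
                        "serviceProviderConfiguration": {
                            "connectionName": "serviceBus_Get_Message_Service_Bus_Queue_Peek_Lock_Renew",
                            "operationId": "completeQueueMessageV2",
                            "serviceProviderId": "/serviceProviders/serviceBus"
                        }
                    }
                }
            }
        },
        "runAfter": {
            "Scope": [ "SUCCEEDED" ]
        }
    }
}
```

With this in place, the parent Scope's runAfter should point at Initialize_Process_Failed_Flag instead of Initialize_Process_Complete_Flag, so the new variable exists before anything inside the Scope sets it. The existing Compensation_Logic path can stay as it is and will still cover the case where Scope itself reports a failure. The point of the design is that the complete-or-abandon decision no longer depends on the status the parent Scope reports.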

Community Discussion

Has anyone else encountered this issue? Share your experiences and any workarounds you've discovered in the comments below! Let's work together to find a solution and get this bug addressed.

This bug could be a serious issue for many Logic Apps users, so it's important to raise awareness and collaborate on finding a resolution.

Let's get this fixed, guys!