Description
I want to use dispatch-workflow to manually trigger CI deploys for various microservices. For this, I may want to deploy dozens of services simultaneously, which results in dozens of concurrent workflow dispatches.
For example, I may tick all the checkboxes and hit "Approve and deploy" at the same time. Each of these will trigger a workflow dispatch:

I have started encountering a behaviour where a given service's workflow dispatch isn't discovered, even when changing the exponential backoff parameters so aggressively that it keeps retrying for 1m 46s!
🔄 Exponential backoff parameters:
starting-delay: 200
max-attempts: 10
time-multiple: 2
⌛ Fetching workflow id for deploy.yml
✅ Fetched workflow id: REDACTED
✅ Successfully dispatched workflow using workflow_dispatch method:
repository: REDACTED
branch: main
workflow-id: REDACTED
distinct-id: 7b4db870-2a2c-474e-973d-ae9c3ce5502b
workflow-inputs: {"env":"dev","service":"redacted"}
⌛ Fetching run-ids for workflow with distinct-id=7b4db870-2a2c-474e-973d-ae9c3ce5502b
Warning: 🟠 Does the token have the correct permissions?
Error: 🔴 Failed to complete:
getDispatchedWorkflowRun: Failed to find dispatched workflow
Distinct ID: 7b4db870-2a2c-474e-973d-ae9c3ce5502b
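As a rough sanity check on those backoff parameters, here is a minimal sketch of the total sleep time they imply. It assumes the delay starts at `starting-delay` milliseconds, is multiplied by `time-multiple` after each failed attempt, and that no sleep follows the last attempt; the action's real schedule may differ, and `totalBackoffMs` is a hypothetical helper, not part of the action.

```typescript
// Hypothetical helper: total sleep time for an exponential backoff schedule.
// Assumption: the delay is multiplied after every failed attempt and there
// is no sleep after the final attempt.
function totalBackoffMs(
  startingDelayMs: number,
  maxAttempts: number,
  timeMultiple: number
): number {
  let total = 0;
  let delay = startingDelayMs;
  // maxAttempts attempts leave (maxAttempts - 1) gaps to sleep in.
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    total += delay;
    delay *= timeMultiple;
  }
  return total;
}

// 200 + 400 + ... + 51200 = 102200 ms, roughly 1m 42s of sleeping
const sleepMs = totalBackoffMs(200, 10, 2);
```

Under these assumptions the sleeps alone account for about 1m 42s, so the remaining few seconds of the observed 1m 46s would be time spent on the API requests themselves.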
I turned on debug logging, and can see logs that look suspiciously like limited / paginated results that haven't been followed.
For example, take this deploy that did find the workflow dispatch, but took 29 seconds to find it!
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312561579,14312482181,14312482192,14312482436,14312481975]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312561579,14312482181,14312482192,14312482436,14312481975]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312561579,14312482181,14312482192,14312482436,14312481975]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312600660,14312600650,14312600645,14312600640,14312561579]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312601005,14312600650,14312600645,14312600640,14312600660]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312601005,14312600650,14312600645,14312600640,14312600660]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312601005,14312600640,14312600660,14312600645,14312600650]
##[debug]
##[debug]Fetched Workflow Runs
##[debug]Repository: REDACTED
##[debug]Branch: main
##[debug]Runs Fetched: [14312601005,14312600640,14312600660,14312600854,14312600645]
✅ Successfully identified remote run:
run-id: 14312600854
run-url: https://github.com/REDACTED/actions/runs/14312600854
So basically we kept getting almost the same five run IDs back, until on the final API request the limited response finally happened to include our desired run ID.
From the action's code, the value '5' seems to be a magic-number page size limit on PRs (and on main it's 10):
dispatch-workflow/src/api/index.ts
Lines 120 to 132 in 5623bf1
    response = await octokit.rest.actions.listWorkflowRuns({
      owner: config.owner,
      repo: config.repo,
      workflow_id: config.workflow,
      ...(branchName
        ? {
            branch: branchName,
            per_page: 5
          }
        : {
            per_page: 10
          })
    })
Perhaps you limited the response size to keep the API calls fast so that polling stays efficient; however, in my use case it breaks the action.
I think the fix requires the action to follow the paginated responses as far as they go, so that the desired workflow dispatch is guaranteed to be included.
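A minimal sketch of what following the pages could look like. To keep it runnable without network access, the API call is injected as a function: `FetchRunIdsPage`, `findRunAcrossPages`, and the `matches` predicate are all hypothetical names, and `matches` stands in for however the action checks a run for the distinct ID. In the action itself, the fetcher would presumably wrap `octokit.rest.actions.listWorkflowRuns` using its `page` and `per_page` parameters.

```typescript
// Returns the run IDs on one page (pages are 1-indexed, as in the REST API).
type FetchRunIdsPage = (page: number, perPage: number) => Promise<number[]>;

// Hypothetical helper: walk pages until `matches` accepts a run ID,
// a short (last) page is returned, or `maxPages` is reached.
async function findRunAcrossPages(
  fetchPage: FetchRunIdsPage,
  matches: (runId: number) => Promise<boolean>,
  perPage = 100,
  maxPages = 10
): Promise<number | undefined> {
  for (let page = 1; page <= maxPages; page++) {
    const runIds = await fetchPage(page, perPage);
    for (const runId of runIds) {
      if (await matches(runId)) return runId;
    }
    // Fewer results than requested means there are no further pages.
    if (runIds.length < perPage) break;
  }
  return undefined;
}
```

The `maxPages` cap is a design choice for this sketch: it bounds the number of requests per polling attempt, which matters once a branch has accumulated many runs.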
Scaling issue?
If we follow the paginated response all the way to the end, this will introduce a scaling issue if there were a large number of workflow dispatches on a branch.
For example, in my use-case I will be making hundreds/thousands of workflow dispatches over time on the same branch (the scenario is, I merge my changes to main, then deploy to production via workflow dispatch, multiplied by hundreds of deploys). Over time the action will take longer and longer to reach the end of the paginated responses...
I think there is a solution. I looked at the List workflow runs for a repository API, and I see there is a created field:
created string
Returns workflow runs created within the given date-time range. For more information on the syntax, see "Understanding the search syntax."
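To illustrate, here is a small sketch of building a value for `created` from a recorded start time. The helper name `createdSince` and the 5-second default allowance for clock drift are assumptions for illustration; the `>=` prefix with an ISO 8601 timestamp follows the search syntax that the quoted documentation refers to.

```typescript
// Hypothetical helper: build a `created` filter that keeps only runs
// created at or after `startedAt`, minus a small clock-drift allowance.
function createdSince(startedAt: Date, skewSeconds = 5): string {
  const from = new Date(startedAt.getTime() - skewSeconds * 1000);
  // Drop milliseconds: 2025-04-07T12:00:00.000Z -> 2025-04-07T12:00:00Z
  const iso = from.toISOString().replace(/\.\d{3}Z$/, "Z");
  return `>=${iso}`;
}

const created = createdSince(new Date("2025-04-07T12:00:05Z"));
// created === ">=2025-04-07T12:00:00Z"
```

The resulting string would then be passed as the `created` option alongside `owner`, `repo`, and `workflow_id`, so that older runs on the branch never enter the result set.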
Suggested Changes
I think it would make sense to make the following changes:
- The `created` field should be assigned so that the `listWorkflowRuns` call only shows workflow dispatches that were created after the action was triggered. This would involve recording a timestamp when `lasith-kg/dispatch-workflow` is first invoked, then calling the `listWorkflowRuns` API with this value (perhaps subtract a few seconds to account for any clock drift between the runner and GitHub APIs).
- The `per_page` option could possibly be increased from `5`. Maybe this could be an optional input if you are worried about performance regressions.
- The paginated results should be followed. It looks like your usage of `octokit` does not currently do this. The docs for how to implement pagination for octokit can be found here.
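For the optional page-size input, a minimal sketch of how the raw value could be validated. The input name `per-page` and the helper `parsePerPage` are hypothetical; the upper bound of 100 is the maximum `per_page` the GitHub REST API accepts, and the fallback of 5 simply mirrors the current branch-filtered behaviour.

```typescript
// Hypothetical parser for an optional `per-page` input (name assumed).
// Falls back to the current default when the input is missing or invalid,
// and clamps to the 1..100 range accepted by the GitHub REST API.
function parsePerPage(raw: string | undefined, fallback = 5): number {
  const value = parseInt(raw ?? "", 10);
  if (Number.isNaN(value)) return fallback;
  return Math.min(100, Math.max(1, value));
}
```

Keeping the default unchanged means existing users would see no difference in request size unless they opt in to a larger page.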