Error Handling

Overview

This section explores strategies for identifying, surfacing, and recovering from errors within a workflow to maintain operational integrity and minimize downtime.

Executor Options

ExecutorOptions define how a workflow should respond when errors occur during execution. They allow you to specify, in advance, whether the workflow pauses for error handling or terminates immediately, based on your workflow requirements.

Here’s how to configure the executor options in your workflow definition:

workflow = Workflow(
    workcell=workcell,
    # ...
    options=Options(
        executor=ExecutorOptions(
            # Pause at the failed task so an operator can respond.
            behaviour_on_failure="pause"
        )
    )
)

class linq.workflow.ExecutorOptions(*, behaviour_on_failure: Literal['abort', 'pause'])

Options for the execution of a workflow.

behaviour_on_failure: Literal['abort', 'pause']: Behaviour of the system when a task fails. Defaults to cancelling the workflow.

The options for behaviour_on_failure behave as follows:

"pause": the workflow stops at the point of failure, allowing an operator to decide the next steps manually. This is outlined below.
"abort": the workflow terminates immediately when any task fails.

Tip

"pause" is the recommended option.
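
Because behaviour_on_failure accepts only the lowercase literals shown in the signature, it can help to normalise a setting that comes from a config file or operator input before passing it to ExecutorOptions. The following is a minimal sketch; normalise_behaviour is a hypothetical helper and is not part of linq.

```python
from typing import Literal, get_args

# The two values accepted by ExecutorOptions.behaviour_on_failure,
# taken from the class signature above.
FailureBehaviour = Literal["abort", "pause"]


def normalise_behaviour(value: str) -> FailureBehaviour:
    """Hypothetical helper (not part of linq): lowercase and validate a setting."""
    candidate = value.strip().lower()
    allowed = get_args(FailureBehaviour)
    if candidate not in allowed:
        raise ValueError(
            f"behaviour_on_failure must be one of {allowed}; got {value!r}"
        )
    return candidate  # type: ignore[return-value]
```

The returned string can then be passed as behaviour_on_failure when building ExecutorOptions.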

Get Latest Error

This command retrieves the most recent error encountered by the workflow on the specified workcell and returns its Error ID. The Error ID uniquely identifies the error event and can be used with Error Run Controls to decide how to recover.

linq workcell error get-latest --workcell-id=<workcell-id>

The returned Error ID represents the error event, which may also impact labware that depends on the failed step.
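
If you drive the CLI from a script, the command above can be wrapped as follows. This is a sketch only: the function names are hypothetical, it assumes the linq CLI is installed and on PATH, and because the exact output format is not specified here, the output is returned unparsed.

```python
import subprocess
from typing import List


def get_latest_error_command(workcell_id: str) -> List[str]:
    """Build the argument list for the get-latest command shown above."""
    if not workcell_id:
        raise ValueError("workcell_id must not be empty")
    return ["linq", "workcell", "error", "get-latest", f"--workcell-id={workcell_id}"]


def get_latest_error_output(workcell_id: str) -> str:
    """Run the command and return its raw stdout.

    Assumption: the linq CLI is on PATH. The output is not parsed because
    its exact format is not documented in this section.
    """
    result = subprocess.run(
        get_latest_error_command(workcell_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```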

Error Run Controls

Once you have the Error ID, the next step is to decide how to respond to the error. Use the CLI's respond command with one of the following actions.

SKIP TASK: Bypass the failed task and continue with the next one in the workflow. Be cautious when skipping a transport task, because this skips both the robot and transport layer movements. Make sure you manually place the labware on the next instrument if needed.
```
    linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=skip-task
```
RETRY TASK: Re-attempt the failed task. This is useful when the task is recoverable, and is aimed mainly at SCARA and transport tasks. If the retry fails, you can manually move the plate to the next slot and use skip-task instead.
```
    linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=retry-task
```

ABORT LABWARE: Abort labware related to the error and continue execution for labware whose flow is not affected. Affected labware will not be processed further and must be cleared from the system before execution can continue.

Labware is considered affected when it depends on the journey of labware involved in the error. For example, if one labware is used alongside another that has errored, it is also considered affected. Task dependencies do not determine whether labware is affected.

    linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=abort-labware

This command will return the list of labware that is affected by the error and that needs to be removed from the workcell:

The following labware will be affected by the abort-labware action:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Labware ID ┃ Reason        ┃ Current Location     ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ labware_1  │ some reason 1 │ instrument_x, slot 1 │
│ labware_2  │ some reason 2 │ instrument_y, slot 2 │
└────────────┴───────────────┴──────────────────────┘
Remove the labware from the workcell and then choose confirm, or choose cancel.

This action can be confirmed or cancelled. Cancelling does not execute the abort-labware action and does not abort the run; other error handling actions can still be attempted afterwards.
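
If you capture the abort-labware output in a script, for example to log which labware must be removed, the table can be turned into records. The sketch below assumes the output uses the box-drawing layout shown in the sample above; the real CLI output may wrap or differ, and parse_affected_labware is a hypothetical helper, not part of linq.

```python
from typing import Dict, List


def parse_affected_labware(output: str) -> List[Dict[str, str]]:
    """Parse the box-drawn table from abort-labware into one dict per row."""
    headers: List[str] = []
    rows: List[Dict[str, str]] = []
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("┃"):
            # Heavy vertical bars delimit the header row.
            headers = [cell.strip() for cell in line.strip("┃").split("┃")]
        elif line.startswith("│") and headers:
            # Light vertical bars delimit data rows.
            cells = [cell.strip() for cell in line.strip("│").split("│")]
            rows.append(dict(zip(headers, cells)))
        # Border lines and surrounding prose are ignored.
    return rows
```

Each returned dictionary is keyed by the column headers, so the location of a labware item is available as row["Current Location"].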

For a video tutorial on using abort-labware, see the error handling tutorials page.

ABORT: Stop the current workflow and terminate execution entirely. Use this as a last resort when the error is unrecoverable. You will be required to reset after aborting.
```
    linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=abort
```
RESET: Reset the specified workcell, returning it to its initial state. You must abort the workflow before resetting the workcell.
```
    linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=reset
```
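
The five respond actions above share one command shape, so a script can build them with a single helper that rejects unknown actions before anything is sent to the workcell. This is a minimal sketch: respond_command is a hypothetical name, and only the flags and action values listed in this section are used.

```python
from typing import List

# Actions documented in this section for `linq workcell error respond`.
RESPOND_ACTIONS = ("skip-task", "retry-task", "abort-labware", "abort", "reset")


def respond_command(workcell_id: str, error_id: str, action: str) -> List[str]:
    """Build the argument list for responding to an error."""
    if action not in RESPOND_ACTIONS:
        raise ValueError(
            f"Unknown action {action!r}; expected one of {RESPOND_ACTIONS}"
        )
    if not workcell_id or not error_id:
        raise ValueError("workcell_id and error_id must not be empty")
    return [
        "linq", "workcell", "error", "respond",
        f"--workcell-id={workcell_id}",
        f"--error-id={error_id}",
        f"--action={action}",
    ]
```

The resulting list can be passed to subprocess.run. Keeping it as a list rather than a single string avoids shell quoting issues with IDs.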

Available Failure Actions

The available_failure_actions parameter allows you to override the default error handling behavior in an ActionTask. By specifying options, you can control what recovery actions are available to the operator when a task fails. Only complete_manually and retry are supported.

wash_dilution_plate = ActionTask(
    ...
    available_failure_actions=["retry"],
)

Options

complete_manually: Allows manual completion of the task.
retry: Enables retrying the failed task.
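
Since only these two values are supported, a workflow script might check the list before constructing an ActionTask. The helper below is a hypothetical pre-check and is not part of linq; how ActionTask itself reacts to unsupported values is not described here.

```python
from typing import Iterable, List

# The only values this section lists as supported.
SUPPORTED_FAILURE_ACTIONS = ("complete_manually", "retry")


def validate_failure_actions(actions: Iterable[str]) -> List[str]:
    """Hypothetical pre-check: return the actions as a list, or raise if any is unsupported."""
    checked = list(actions)
    unsupported = [a for a in checked if a not in SUPPORTED_FAILURE_ACTIONS]
    if unsupported:
        raise ValueError(
            f"Unsupported failure actions {unsupported}; "
            f"expected a subset of {SUPPORTED_FAILURE_ACTIONS}"
        )
    return checked
```

For example, validate_failure_actions(["retry"]) returns the list unchanged and could be passed as available_failure_actions.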