Error Handling

Overview

This section explores strategies for identifying, surfacing, and recovering from errors within a workflow to maintain operational integrity and minimize downtime.

Executor Options

ExecutorOptions defines how the workflow behaves during execution. It is currently scoped to behavior on error, so you can control whether the workflow pauses for error handling or proceeds automatically, depending on the situation.

Here’s how to configure the executor options in your workflow definition:

workflow = Workflow(
    workcell=workcell,
    #...
    options=Options(
        executor=ExecutorOptions(
            behaviour_on_failure="pause"
        )
    )
)

class linq.workflow.ExecutorOptions(*, behaviour_on_failure: Literal['abort', 'pause', 'continue'])

Options for the execution of a workflow.

behaviour_on_failure: Literal['abort', 'pause', 'continue']

Behaviour of the system when a task fails. Defaults to cancelling the workflow.

The values of behaviour_on_failure behave as follows:

  • "pause": the workflow stops at the point of failure, allowing an operator to decide the next steps manually. This is outlined below.

  • "abort": the workflow terminates immediately when any task fails.

  • "continue": the workflow ignores the failure and proceeds with the remaining tasks. This is not recommended except in specific use cases.
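To make the difference between the three values concrete, the following is a toy model of the semantics described above. It is not how linq executes workflows: run_tasks, ok, and fail are hypothetical names invented for this illustration, and a real paused workflow waits for an operator response instead of returning.

```python
from typing import Callable, List, Literal, Tuple

Behaviour = Literal["abort", "pause", "continue"]


def run_tasks(
    tasks: List[Tuple[str, Callable[[], None]]],
    behaviour_on_failure: Behaviour,
) -> Tuple[str, List[str]]:
    """Toy executor (hypothetical): returns the final state and the attempted task names."""
    attempted: List[str] = []
    for name, task in tasks:
        attempted.append(name)
        try:
            task()
        except Exception:
            if behaviour_on_failure == "abort":
                # Terminate immediately; remaining tasks never run.
                return "aborted", attempted
            if behaviour_on_failure == "pause":
                # Stop at the point of failure and wait for an operator.
                return "paused", attempted
            # "continue": ignore the failure and move on to the next task.
            continue
    return "completed", attempted


def ok() -> None:
    pass


def fail() -> None:
    raise RuntimeError("simulated instrument fault")


tasks = [("load", ok), ("wash", fail), ("read", ok)]
for mode in ("abort", "pause", "continue"):
    state, attempted = run_tasks(tasks, mode)
    print(f"{mode}: {state}, attempted={attempted}")
# abort: aborted, attempted=['load', 'wash']
# pause: paused, attempted=['load', 'wash']
# continue: completed, attempted=['load', 'wash', 'read']
```

Note that with "continue" the final task runs even though an earlier step failed, which is why that value is only appropriate when downstream tasks do not depend on the failed one.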

Tip

"pause" is the recommended option.

Get Latest Error

This command retrieves the most recent error encountered by the workflow on the specified workcell. Use the resulting Error ID with the Error Run Controls below to respond to the error.

linq workcell error get-latest --workcell-id=<workcell-id>

This command returns the error ID and a list of the labware IDs affected by the error. Labware is affected when its labware flow is affected by the error.
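If you want to call this command from a script, a minimal sketch could build the argument list as shown here. The helper name get_latest_error_command and the workcell ID "my-workcell" are assumptions made for illustration; only the command and its --workcell-id flag come from this page.

```python
import shlex
from typing import List


def get_latest_error_command(workcell_id: str) -> List[str]:
    """Build the argv for `linq workcell error get-latest` (hypothetical helper)."""
    if not workcell_id:
        raise ValueError("workcell_id must be non-empty")
    return ["linq", "workcell", "error", "get-latest", f"--workcell-id={workcell_id}"]


cmd = get_latest_error_command("my-workcell")  # "my-workcell" is a placeholder ID
print(shlex.join(cmd))
# linq workcell error get-latest --workcell-id=my-workcell
```

The list can be passed to subprocess.run(cmd, capture_output=True, text=True). The exact output format is not specified here, so the sketch does not attempt to parse it.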

Error Run Controls

Once you have the Error ID, the next step is to decide how to respond to the error. You can respond from the CLI by running the respond command with one of the following actions.

  • SKIP TASK: Bypass the current failed task and continue with the next one in the workflow. When skipping a transport task, be cautious as it skips both the robot and transport layer movements. Ensure that you manually place the labware on the next instrument if needed.

    ```bash
        linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=skip-task
    ```
    
  • RETRY TASK: Re-attempt the failed task. This is useful when the failure is recoverable, and is aimed mainly at SCARA and transport tasks. If the retry fails, you can manually move the plate to the next slot and use SKIP TASK instead.

    ```bash
        linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=retry-task
    ```
    
  • ABORT LABWARE: Abort the labware related to the error and continue execution for labware whose flow is not affected by the error. Affected labware will not be continued and needs to be cleared from the system before continuing.

    Labware is considered affected when it depends on the labware journey of labware involved in the error, for example when it is used alongside an errored labware. Task dependencies do not influence which labware is affected.
    
    ```bash
        linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=abort-labware
    ```
    This command will return the list of labware that is affected by the error and that needs to be removed from the workcell:
    ```console
    The following labware will be affected by the abort-labware action:
    ┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
    ┃ Labware ID ┃ Reason        ┃ Current Location     ┃
    ┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
    │ labware_1  │ some reason 1 │ instrument_x, slot 1 │
    │ labware_2  │ some reason 2 │ instrument_y, slot 2 │
    └────────────┴───────────────┴──────────────────────┘
    Remove the labware from the workcell and then choose confirm, or choose cancel
    ```
    This action can be confirmed or canceled. Canceling does not execute the abort-labware action and does not abort the run; other error handling actions can still be attempted afterwards.
    
  • ABORT: Stop the current workflow and terminate execution entirely. Use this as the last resort when the error is unrecoverable. You must reset the workcell after aborting.

    ```bash
        linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=abort
    ```
    
  • RESET: Reset the specified workcell, returning it to its initial state. You must abort the workflow before resetting the workcell.

        linq workcell error respond --workcell-id=<workcell-id> --error-id=<error-id> --action=reset
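The respond actions above can be wrapped in a small script. The sketch below is only an illustration: respond_command and check_reset_follows_abort are hypothetical helpers, and the IDs are placeholders. It validates the action name against the five actions listed above and checks the ordering rule that a reset must come after an abort.

```python
from typing import List, Sequence

# The five actions documented for `linq workcell error respond`.
RESPOND_ACTIONS = ("skip-task", "retry-task", "abort-labware", "abort", "reset")


def respond_command(workcell_id: str, error_id: str, action: str) -> List[str]:
    """Build the argv for one respond action (hypothetical helper)."""
    if action not in RESPOND_ACTIONS:
        raise ValueError(f"unknown action {action!r}; expected one of {RESPOND_ACTIONS}")
    return [
        "linq", "workcell", "error", "respond",
        f"--workcell-id={workcell_id}",
        f"--error-id={error_id}",
        f"--action={action}",
    ]


def check_reset_follows_abort(actions: Sequence[str]) -> None:
    """Raise if a planned sequence resets the workcell before aborting the workflow."""
    aborted = False
    for action in actions:
        if action == "abort":
            aborted = True
        elif action == "reset" and not aborted:
            raise ValueError("reset requested before abort")


plan = ["abort", "reset"]
check_reset_follows_abort(plan)
for action in plan:
    # "my-workcell" and "my-error" are placeholder IDs.
    print(" ".join(respond_command("my-workcell", "my-error", action)))
# linq workcell error respond --workcell-id=my-workcell --error-id=my-error --action=abort
# linq workcell error respond --workcell-id=my-workcell --error-id=my-error --action=reset
```

Building the command as a list rather than a single string avoids shell quoting problems when it is handed to subprocess.run.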
    

Available Failure Actions

The available_failure_actions parameter allows you to override the default error handling behavior in an ActionTask. By specifying options, you can control what recovery actions are available to the operator when a task fails. This is useful when retrying isn’t beneficial or when it’s safe to ignore an error.

wash_dilution_plate = ActionTask(
    ...
    available_failure_actions=["retry", "ignore"],
)

Options

  • complete_manually: Allows manual completion of the task.

  • retry: Enables retrying the failed task.

  • ignore: Skips the failure and continues the workflow.
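Because available_failure_actions takes plain strings, a typo in an action name is easy to miss. As a rough sketch, a pre-check such as the following could catch it before the task is defined. The helper checked_failure_actions is hypothetical and not part of linq, and whether ActionTask performs its own validation is not stated here.

```python
from typing import Iterable, List

# The three option names documented above.
DOCUMENTED_FAILURE_ACTIONS = ("complete_manually", "retry", "ignore")


def checked_failure_actions(actions: Iterable[str]) -> List[str]:
    """Return the actions as a list, raising on any name not documented above."""
    result = list(actions)
    unknown = [a for a in result if a not in DOCUMENTED_FAILURE_ACTIONS]
    if unknown:
        raise ValueError(
            f"unknown failure actions {unknown}; expected a subset of {DOCUMENTED_FAILURE_ACTIONS}"
        )
    return result


print(checked_failure_actions(["retry", "ignore"]))
# ['retry', 'ignore']
```

The returned list could then be passed as available_failure_actions=checked_failure_actions([...]) when constructing the task.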