AWS Step Functions - Variables and JSONata

I finally had a chance to review some of the newer features of AWS Step Functions. In particular, I was very excited to explore the possibilities of defining Variables, which are available throughout the entire execution. Moreover, JSONata adds an incredibly expressive syntax that expands the logic we can apply within native states/tasks. I encourage the reader to explore some of the blog posts demonstrating the “before and after”, in which JSONata has the potential to reduce the need to hop into Lambda functions for more “advanced” logic.

Below I am going to run through a simple flow that uses conditional execution. While the example is basic, it’s not a stretch to think about how this type of solution could be applied to create agentic workflows. More on that at the end.

Step Functions are underrated

I don’t think AWS Step Functions get enough attention in the data world. In a future post I will discuss the benefits of this service, but for now, a simple list:

Managed Service
Error Handling
Retry logic
Self-loops, parallel execution, and even using child executions
Logging
Low Code builder but with full IaC support. Think: sandbox an idea in the console, and then rip out the code for deployment.

Our First Flow

Go ahead and create a new Step Function from the AWS Console. For my purposes, I am going to use a Standard workflow, but note that in practice we would very likely leverage Express workflows. Discussing the tradeoffs between the two is outside the scope of this article.

I am going to step through the logic to help build intuition for Variables. If you haven’t used Step Functions in the past, below is the visual Workflow editor.

The Pass state is incredibly powerful, and in this case, we can set variables and apply other logic as needed at the start of our execution.

If you select the Pass State in the canvas, and then the Variables tab, you can manually specify the variables.

From the pills at the top of the screen, you can select Config to rename the Workflow and define the execution role for the Step Function. I already have an Execution Role, but for the purposes of this post, you can let AWS create one for you.

As I noted earlier, not only can we define our workflow logic via drag-and-drop, but this solution is backed by a templating language that we can later use to package our projects via IaC. Below is the first workflow via JSON.

{
  "QueryLanguage": "JSONPath",
  "Comment": "I can set a comment about my workflow, what is required at invocation, the task this solves, etc.",
  "StartAt": "Pass",
  "States": {
    "Pass": {
      "Type": "Pass",
      "End": true,
      "Comment": "This will set the variables that will pass through the execution",
      "Assign": {
        "max_iter": 3,
        "counter": 0
      }
    }
  }
}

If you are following along with this post, it’s important to note that you can copy the code above and paste the contents into the code editor in the console. Select the Code pill, and paste the workflow definition on the left.
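Before pasting a definition into the console, a quick local sanity check can catch a mistyped state name. The sketch below is a hypothetical helper (check_definition is my own name, not part of any AWS tooling) that only verifies that StartAt and every transition target point at a state that exists. The console and the service validate far more than this, so treat it as a convenience rather than a substitute.

```python
import json


def check_definition(definition_json: str) -> list[str]:
    """Return a list of problems found in a workflow definition (empty if none)."""
    definition = json.loads(definition_json)  # raises ValueError on malformed JSON
    states = definition.get("States", {})
    problems = []

    if definition.get("StartAt") not in states:
        problems.append(f"StartAt points at unknown state: {definition.get('StartAt')!r}")

    for name, state in states.items():
        # Collect every transition target this state declares.
        targets = [state["Next"]] if "Next" in state else []
        if state.get("Type") == "Choice":
            targets += [rule["Next"] for rule in state.get("Choices", []) if "Next" in rule]
            if "Default" in state:
                targets.append(state["Default"])
        for target in targets:
            if target not in states:
                problems.append(f"{name!r} transitions to unknown state {target!r}")
        # A state with no outgoing transition must be terminal.
        terminal = state.get("End") is True or state.get("Type") in ("Succeed", "Fail")
        if not targets and not terminal:
            problems.append(f"{name!r} has no Next and is not terminal")
    return problems


# The first workflow from this post, trimmed of its comments.
first_flow = """
{
  "QueryLanguage": "JSONPath",
  "StartAt": "Pass",
  "States": {
    "Pass": {
      "Type": "Pass",
      "End": true,
      "Assign": {"max_iter": 3, "counter": 0}
    }
  }
}
"""

print(check_definition(first_flow))  # prints []
```

An empty list means the sketch found nothing to complain about; it says nothing about whether the workflow does what you intend.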

With our workflow created, we can now review the execution logic. Click Execute to run the workflow. No need to enter a payload yet, just click Start Execution in the lower right.

We can see that the execution successfully ran, and that the variables were assigned. These will be accessible throughout the execution of this pipeline.

Second Flow

Let’s update the workflow to add some additional logic. As with above, you can paste the code definition directly into the console to see the changes.

{
  "QueryLanguage": "JSONPath",
  "StartAt": "Pass",
  "States": {
    "Pass": {
      "Type": "Pass",
      "Assign": {
        "variableName": "$.states.input",
        "max_iter": 3,
        "counter": 0
      },
      "QueryLanguage": "JSONata",
      "Comment": "Manually creating an array for now, but ideally just say if you get to this state, we can specify an iterator as a for loop.  Might be missing something, but this expects \"data\" to iterate over, versus a for loop with an exit (for max 3 iterations, do this, else exit) -> like training, but Agent loops is the obvious use case.",
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$counter",
          "NumericLessThanPath": "$max_iter",
          "Next": "update counter"
        }
      ],
      "Default": "Success"
    },
    "update counter": {
      "Type": "Pass",
      "Next": "Success",
      "Assign": {
        "counter": "{% $counter + 1 %}"
      }
    },
    "Success": {
      "Type": "Succeed"
    }
  },
  "Comment": "Play around with the new JSONata things.  Really expressive syntax (is there a plugin/vs AI help) and the variables is huge.  Seems like just reference states.input.<object> and then assign variables as needed.  Only keep what you need.  The concept of just passing JSON around is kinda gross, but also, JSONPath is very simple"
}

Above we have added a few additional nodes to our workflow, namely the Choice which allows us to conditionally route the execution path within our workflow.

Let’s step back for a moment.

We have defined two variables, max_iter and counter. The former sets the upper bound on how many times we allow the self-loop (or self-reflection, as I may refer to it) to occur. The counter variable keeps track of how many times we have evaluated this path.
For each evaluation, we increment the counter by 1.
When the expression counter < max_iter is no longer true, the Choice state follows a different path and completes the execution.

Save and execute the workflow. In reviewing the output, you will notice that the counter doesn’t properly update.
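One likely explanation, and this is my reading rather than something the execution output spells out: the update counter state inherits the workflow-level JSONPath query language, and in a JSONPath state a value written as {% ... %} is treated as an ordinary string instead of an expression. The toy Python model below illustrates that assumption. It is not how Step Functions is implemented, resolve_assign_value is a made-up name, and the only expression shape it understands is "$name + integer".

```python
import re

# Matches a whole-string JSONata expression such as "{% $counter + 1 %}".
JSONATA_WRAPPER = re.compile(r"^\{%\s*(.*?)\s*%\}$")
# The only expression shape this toy understands: "$name + <integer>".
INCREMENT = re.compile(r"\$(\w+)\s*\+\s*(\d+)")


def resolve_assign_value(value, variables, query_language):
    """Toy model of resolving one Assign value under a given query language."""
    if query_language == "JSONata" and isinstance(value, str):
        wrapped = JSONATA_WRAPPER.match(value)
        if wrapped:
            expr = INCREMENT.fullmatch(wrapped.group(1))
            if expr is None:
                raise ValueError(f"unsupported expression: {wrapped.group(1)}")
            name, amount = expr.group(1), int(expr.group(2))
            return variables[name] + amount
    # JSONPath state (or a plain value): keep the value exactly as written.
    return value


variables = {"max_iter": 3, "counter": 0}
expression = "{% $counter + 1 %}"

print(repr(resolve_assign_value(expression, variables, "JSONPath")))  # prints '{% $counter + 1 %}'
print(repr(resolve_assign_value(expression, variables, "JSONata")))   # prints 1
```

Under this model the counter ends up holding the expression text itself, which matches the symptom of a counter that never becomes 1.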

JSONata

Let’s update the workflow to explicitly define the use of JSONata in the update counter state/task.

{
  "QueryLanguage": "JSONPath",
  "StartAt": "Pass",
  "States": {
    "Pass": {
      "Type": "Pass",
      "Assign": {
        "max_iter": 3,
        "counter": 0
      },
      "QueryLanguage": "JSONata",
      "Comment": "Manually creating an array for now, but ideally just say if you get to this state, we can specify an iterator as a for loop.  Might be missing something, but this expects \"data\" to iterate over, versus a for loop with an exit (for max 3 iterations, do this, else exit) -> like training, but Agent loops is the obvious use case.",
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$counter",
          "NumericLessThanPath": "$max_iter",
          "Next": "update counter"
        }
      ],
      "Default": "Success"
    },
    "update counter": {
      "Type": "Pass",
      "Next": "Success",
      "Assign": {
        "counter": "{% $counter + 1 %}"
      },
      "QueryLanguage": "JSONata"
    },
    "Success": {
      "Type": "Succeed"
    }
  },
  "Comment": "Play around with the new JSONata things.  Really expressive syntax (is there a plugin/vs AI help) and the variables is huge.  Seems like just reference states.input.<object> and then assign variables as needed.  Only keep what you need.  The concept of just passing JSON around is kinda gross, but also, JSONPath is very simple"
}

After saving and executing the workflow, review the output. You can see above that the interface highlights that our counter is changing values, a subtle but nice touch!

However, you can see from the logging that the update counter task didn’t fire three times.

The Final Edit

Above, everything is working correctly, but we are immediately ending our workflow’s execution without going back for another evaluation pass.

Below we will update the update counter state to transition back to the Choice state, which allows our workflow to perform another pass.

{
  "QueryLanguage": "JSONPath",
  "StartAt": "Pass",
  "States": {
    "Pass": {
      "Type": "Pass",
      "Assign": {
        "max_iter": 3,
        "counter": 0
      },
      "QueryLanguage": "JSONata",
      "Comment": "Manually creating an array for now, but ideally just say if you get to this state, we can specify an iterator as a for loop.  Might be missing something, but this expects \"data\" to iterate over, versus a for loop with an exit (for max 3 iterations, do this, else exit) -> like training, but Agent loops is the obvious use case.",
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$counter",
          "NumericLessThanPath": "$max_iter",
          "Next": "update counter"
        }
      ],
      "Default": "Success"
    },
    "update counter": {
      "Type": "Pass",
      "Next": "Choice",
      "Assign": {
        "counter": "{% $counter + 1 %}"
      },
      "QueryLanguage": "JSONata"
    },
    "Success": {
      "Type": "Succeed"
    }
  },
  "Comment": "Play around with the new JSONata things.  Really expressive syntax (is there a plugin/vs AI help) and the variables is huge.  Seems like just reference states.input.<object> and then assign variables as needed.  Only keep what you need.  The concept of just passing JSON around is kinda gross, but also, JSONPath is very simple"
}

Save and execute.

By looking at the events, we can now see that the update counter State/Task was invoked three times, as expected.
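To make the difference between the last two definitions concrete, here is a small Python walk-through of the same four states. It is a hand-written simulation of the control flow only (the run function and its update_next parameter are my own illustration, not anything Step Functions provides), with the Next target of update counter as the single knob.

```python
def run(update_next, max_iter=3):
    """Walk the toy state machine; return (visited state names, final variables)."""
    variables = {}
    visited = []
    state = "Pass"
    while True:
        visited.append(state)
        if state == "Pass":
            # Mirrors the Assign block of the Pass state.
            variables.update(max_iter=max_iter, counter=0)
            state = "Choice"
        elif state == "Choice":
            # Mirrors the rule: counter < max_iter -> update counter, else Success.
            if variables["counter"] < variables["max_iter"]:
                state = "update counter"
            else:
                state = "Success"
        elif state == "update counter":
            variables["counter"] += 1
            state = update_next  # "Success" in the earlier flow, "Choice" in the final one
        elif state == "Success":
            return visited, variables
        else:
            raise ValueError(f"unknown state: {state!r}")


for target in ("Success", "Choice"):
    visited, variables = run(target)
    print(target, visited.count("update counter"), variables["counter"])
# prints:
# Success 1 1
# Choice 3 3
```

With Next set to Success the increment happens once and the execution ends; pointing it back at Choice lets the condition be re-evaluated until counter reaches max_iter.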

Summary

Variables allow us to keep track of core pieces of information throughout the full execution of our workflow. There is a limit to how much data we can pass around, but the limit is quite reasonable.
JSONata allows us to reference these variables and, as shown above, update their values conditionally as needed.
It’s worth pointing out that you can choose JSONata when you create your workflow. I created mine with the older JSONPath query language to highlight that it is possible to mix-and-match as needed, and it’s just a matter of setting QueryLanguage within the individual state.
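For comparison, here is a sketch of what the same loop might look like with JSONata set once at the top level, written as a Python dictionary so it can be serialized to JSON for the console or an IaC template. I have not run this variant in this post. It assumes that JSONata Choice rules use a Condition field in place of Variable and NumericLessThanPath, which matches my reading of the Step Functions documentation.

```python
import json

# Assumption: with a workflow-level "QueryLanguage": "JSONata", no state needs
# its own override, and the Choice rule becomes a single boolean expression.
definition = {
    "QueryLanguage": "JSONata",
    "StartAt": "Pass",
    "States": {
        "Pass": {
            "Type": "Pass",
            "Assign": {"max_iter": 3, "counter": 0},
            "Next": "Choice",
        },
        "Choice": {
            "Type": "Choice",
            "Choices": [
                {"Condition": "{% $counter < $max_iter %}", "Next": "update counter"}
            ],
            "Default": "Success",
        },
        "update counter": {
            "Type": "Pass",
            "Assign": {"counter": "{% $counter + 1 %}"},
            "Next": "Choice",
        },
        "Success": {"Type": "Succeed"},
    },
}

# Serialize to the JSON text you would paste into the Code pill.
print(json.dumps(definition, indent=2))
```

The trade-off is mostly one of consistency: a single query language means expressions like {% $counter + 1 %} behave the same way in every state.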

The above example is silly and a basic proof-of-concept, but let’s zoom out to why I find this new capability incredibly interesting. Agentic workflows allow us to combine multiple generative AI calls together in a system that aims to perform a given task.

As one example, consider an LLM-as-a-judge workflow, which uses one LLM to generate output and another to evaluate it. The loop above shows how we could allow up to three calls to the LLM judge. With structured output, the judge could either provide feedback for the initial agent to improve on or, if it “approved”, let the workflow move on to new tasks.
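A rough Python sketch of that idea, with the LLM calls replaced by stand-in functions so the control flow is visible. Everything here is hypothetical: Verdict, judge_loop, fake_generate and fake_judge are names I made up, and in a real workflow the generate and judge steps would be tasks in the state machine instead of local functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Structured output from the judge: a decision plus optional feedback."""
    approved: bool
    feedback: str = ""


def judge_loop(
    generate: Callable[[str, Optional[str]], str],
    judge: Callable[[str], Verdict],
    task: str,
    max_iter: int = 3,
):
    """Generate, judge, and retry with feedback, for at most max_iter rounds."""
    counter = 0
    feedback = None
    draft = None
    while counter < max_iter:  # same condition as the Choice state
        draft = generate(task, feedback)
        verdict = judge(draft)
        counter += 1  # same role as the update counter state
        if verdict.approved:
            return draft, counter, True
        feedback = verdict.feedback
    return draft, counter, False


# Stand-ins for the two LLM calls, so the sketch runs without any model.
def fake_generate(task, feedback):
    return task if feedback is None else f"{task} ({feedback})"


def fake_judge(draft):
    if "add sources" in draft:
        return Verdict(approved=True)
    return Verdict(approved=False, feedback="add sources")


result = judge_loop(fake_generate, fake_judge, "summarize the report")
print(result)  # prints ('summarize the report (add sources)', 2, True)
```

The max_iter bound plays the same role here as in the workflow: if the judge never approves, the loop still exits after three rounds.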