TL;DR AWS doesn’t provide a dashboard or user interface to efficiently run multiple Fargate tasks against a container with different inputs. Apache Airflow allows you to define your tasks via Python scripts programmatically. Using the AWS API, via an ECS operator, will enable you to orchestrate and run your container.
Say you have an application that takes a particular set of inputs, performs some form of analysis, and stores the results in a specified location—and you want to run this application in the cloud. AWS Fargate is a serverless computing engine best suited for running task-based containerized applications.
What makes Fargate ideal for this scenario is that it removes the need to provision and manage servers. Instead, it lets you pay only for specified resources used to run your container. That way, you don’t have to over-provision and end up paying for servers you’re not using.
But there is a snag…
Depending on the purpose of your containerized application, it’s possible to set up and configure AWS to run multiple instances of your container, providing each instance with a different set of input values and output storage locations. The problem is that AWS doesn’t provide a dashboard or user interface that makes it easy to configure and run your containers in parallel, programmatically.
AWS does, however, expose API endpoints that Apache Airflow can use to orchestrate and run your containers, either in parallel or in the order of your choosing.
What is Airflow?
Apache Airflow, created by Airbnb in October 2014, is an open-source workflow management tool capable of programmatically authoring, scheduling, and monitoring workflows. Airflow’s workflow execution builds on the concept of a Directed Acyclic Graph (DAG). Defined by a Python script, a DAG is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.
For example, a simple DAG could comprise three tasks: A, B, and C. It could state that task B can only start executing after task A has succeeded, but C can execute at any time (not waiting for A or B).
A DAG can also state that a task should timeout after five minutes and or retry up a total of three times in the event of a failure. It might also state that the entire workflow runs every morning at 9:00 am, but not before a specific date.
Another core concept in Airflow is an operator. While a DAG describes how to run a workflow, an operator describes a single task in a workflow. For example, execute a bash command, send an email, or execute an AWS Fargate task.
To illustrate, let’s look at a simple Python application I’ve created that emulates importing the user’s financial data for a specific year, analyzing it, then exporting the results to a database. The year is specified as an input parameter that the application accepts at run time.
First, we Dockerize the analyzer application and upload it to Amazon Elastic Container Registry (ECR) to allow the Fargate task to locate the docker image.
Next, we create a Networking only Cluster under Amazon’s Elastic Container Service (ECS) and an associated Fargate Task Definition to run the container’s logic.
Once we have everything created in AWS, we’re able to run an individual Fargate task to test the application. Below are the CloudWatch output logs when the task is executed with 1999 as the input value for the year.
Now that we’re able to run one task with a specific year successfully, we’re going to create a setup and configure Airflow with a DAG that takes a range of years and executes a Fargate task for each year in parallel.
As mentioned earlier, scheduling and or running multiple Fargate tasks in parallel solely in AWS is a very ambitious endeavor that doesn’t leave much room for dynamic customization, such as changing the value of the year for each task.
For more information on creating Fargate Tasks visit Amazon’s ECS developer guide.
Airflow DAG setup
Airflow can be installed and hosted on several systems and environments. I’ve opted to use a Docker Image and run it on my local machine. Once that container is up and running, my first task is to configure our AWS connection via the Connections page under the Admin dropdown menu. Airflow comes with a list of predefined connections out of the box. For our solution, we want to edit the aws_default connection to enter AWS credentials, entering our AWS_ACCESS_KEY_ID in the Login text box and AWS_SECRET_ACCESS_KEY in the Password field.
Once we’ve saved those changes, we’ll now navigate to the Variables screen under Admin to configure known variables that our DAG can dynamically utilize. For the ECS operator to successfully launch our Fargate tasks, it needs to know the Cluster name, Task definition, Region, and the subnet, among other configuration values that we’ll review when looking at the DAG. We also have the start and end year variables that we’ll be using to identify the range/list of years that we’ll be performing analysis for—finally, the container name if we wanted to override the container at run time.
The variables mentioned can all be specified in the DAG file. However, this route allows us to change specific values on demand, without having to edit the Python DAG file.
Now we’ll look at our Python code for our DAG. At the top, lines 1-6, we have our import statements. Those import statements are followed by the use of Variable.get() to access and get the values of the variables that we configured earlier and assign them to local variables.
The problem we’re solving for is being able to run multiple instances on the same containerized application with different inputs in parallel. Specifically, in our use case, we want to run an analysis for each year within a specific range.
On line 17, we use the start and end year values to generate the list of years that we’ll use as an input parameter for each instance of the container we’ll be running. Alternatively, if they were not sequential, we could specify the years that we want to utilize in an array or other data structure.
In Lines 32-54, we define our ECS operator template with all the values that the tasks will have in common. Subsequently, in the for loop that starts on line 59, for each year in the yearstoAnalyze, we utilize the template, append the specific command—for example -y 2002—and add each ESC Task to the task list.
Next, we add the Python file to Airflow’s DAG folder. Within seconds, Airflow will display the analyzer DAG on the home screen. Clicking on the DAG takes us to the Tree View screen, which gives us visual confirmation on the creation of each unique task.
Running our DAG
Once triggered, our DAG will instruct the AWS to provision 21 Fargate Tasks, one for each year between 1999 and 2020. The output of each container can be viewed in the logs for each task.
As mentioned earlier, if any of the tasks fail for some reason, in the code, we configured the DAG to retry twice automatically. We can also retry a specific job manually, or the entire DAG.
Going with the flow
You may already be utilizing Fargate for scheduling and running multiple tasks, but the manual process of having to generate and configure each job may not seem worthwhile. You may have already learned how to use the AWS API to trigger multiple tasks with different inputs, but don’t have a way of effectively scaling and monitoring success and failures with automatic retries.
In either case, Airflow is an excellent fit for dynamically generating your tasks with Python. There is the added benefit of adding scheduling and many other operators such as Slack integrations that notify you when tasks succeed or fail. Airflow’s documentation is very straightforward and hosts several examples to get you started. And now that you know how to solve for taking one task/job and running the same logic with different input values in a parallel, you can save time and avoid the tedium of a sequential approach.