How to Run Argo Workflows with Docker Desktop
April 1, 2022
It's simple to deploy Argo Workflows to a K8s cluster using Docker Desktop. We'll implement the main components of an Argo Workflows deployment (Argo Server, Controller, UI, CLI, and artifact repo), and run two basic workflows.
Argo’s ecosystem of Kubernetes-based open source tools is increasingly popular in the cloud-native community because the tools can be combined to create powerful Kubernetes-native workflows, rollouts, and continuous delivery (CD) tasks.
Argo Workflows is a workflow engine used to create, manage, and orchestrate parallel jobs in Kubernetes. Workflows are defined as Kubernetes custom resources through a `CustomResourceDefinition` (CRD) and enable you to carry out functions such as the following:
- Define container-based steps in workflows
- Execute compute-intensive extract, load, transform (ELT) operations, machine-learning tasks, and data processing tasks
- Create Kubernetes-native CI/CD pipelines
In this tutorial, you’ll learn how to deploy Argo Workflows to a Kubernetes cluster in Docker Desktop.
About Argo Workflows
Argo Workflows offers a number of useful features, including the following:
- Web-based UI
- Native artifact support (MinIO, S3, Artifactory, HDFS, OSS, HTTP, Git, GCS, raw)
- Templating and cron workflows
- Workflow archive
- REST API, Argo CLI
The `Workflow` resource defines the workflow to be executed and also stores its state. Workflows consist of instructions that operate like functions, known as templates in Argo.
Templates detail the steps of execution in the workflow. The `spec` is the most important part of the `Workflow` manifest file, and it has two key fields:
- Templates: This is where you define the different types of templates you want to use.
- Entrypoint: The entrypoint determines which template will be used first.
A template can be any of the following:
- Container: This is probably the most common template type, and as the name implies, it schedules a container. Its spec is identical to a Kubernetes container spec.
- Script: This is a convenience wrapper around the container. The spec is the same as the container, but it has a `source` field that allows you to define a script. The script will be saved to a file and executed from there.
- Resource: This template can be used to perform create, read, update, delete (CRUD) operations directly on resources in the cluster.
- Suspend: This template is used to suspend the execution of a workflow either for a specified duration or indefinitely.
- Directed Acyclic Graph (DAG): This template allows you to define the tasks in a workflow as a graph of dependencies.
- Steps: This template allows you to define the tasks in your workflow as sequential steps. It consists of inner and outer lists; inner lists run in parallel, while outer lists run one after the other.
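To make the inner and outer lists of a Steps template concrete, here is a minimal sketch modeled on the steps example in the Argo Workflows repository. The file name `steps-example.yaml` and the step names are illustrative choices, not something this tutorial depends on.

```shell
# Write an illustrative Workflow manifest (the file name is an arbitrary choice).
cat > steps-example.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: steps-
spec:
  entrypoint: hello-hello-hello      # the template that runs first
  templates:
  - name: hello-hello-hello
    steps:
    - - name: hello1                 # outer item 1: runs on its own
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "hello1"
    - - name: hello2a                # outer item 2: starts after hello1
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "hello2a"
      - name: hello2b                # same inner list: parallel with hello2a
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "hello2b"
  - name: whalesay                   # container template reused by every step
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
EOF
```

Here `entrypoint` selects the Steps template, which in turn calls the Container template three times. Once the cluster and CLI from the later sections are in place, a manifest like this could be submitted with `argo submit`.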
To follow this tutorial, you’ll need the following:
- Docker Desktop, with Kubernetes enabled and the memory allocation increased to at least 12 GB (for MinIO)
- Helm (used later to install MinIO)
- K9s (optional)
Running Argo Workflows Using Docker Desktop
Once you have your prerequisites, open Docker Desktop and begin.
Connecting to Kubernetes Cluster
Once Docker Desktop is running, go to the preferences menu to increase memory resource allocation to at least 12 GB and enable Kubernetes. When your cluster is up, make sure that you’re connected to the `docker-desktop` Kubernetes cluster.
Run the following commands to list, select, and verify the cluster context, respectively:
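A minimal sketch of the three context commands, using standard `kubectl config` subcommands. They are wrapped in a function (the name `use_docker_desktop_context` is just a label for this sketch) so that nothing changes your kubeconfig until you call it.

```shell
# Nothing runs until you call the function.
use_docker_desktop_context() {
  kubectl config get-contexts                 # list all known contexts
  kubectl config use-context docker-desktop   # select the Docker Desktop cluster
  kubectl config current-context              # verify the active context
}
```

Run `use_docker_desktop_context` in your terminal; the last command should report `docker-desktop`.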
Installing Argo CLI
The next step is to install the Argo CLI. You can use the latest version or select an earlier one from the Argo Workflows releases page on GitHub. The commands vary by operating system: on macOS and Linux they differ only in the name of the downloaded binary, while on Windows you can download the relevant executable from the Assets section of the releases page.
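The sketch below folds the macOS and Linux installs into one function that takes the OS name as its argument. It assumes release `v3.3.1`, an amd64 machine, and `/usr/local/bin` as the install location; adjust all three to your setup. The function names are hypothetical helpers for this sketch.

```shell
# Assumed release; pick the one you want from the releases page.
ARGO_VERSION="${ARGO_VERSION:-v3.3.1}"

# Build the download URL for an OS: "darwin" for macOS, "linux" for Linux.
argo_release_url() {
  echo "https://github.com/argoproj/argo-workflows/releases/download/${ARGO_VERSION}/argo-$1-amd64.gz"
}

# Download, unpack, and install the CLI. Nothing runs until you call this.
install_argo_cli() {
  os="$1"
  curl -sLO "$(argo_release_url "$os")"               # fetch the gzipped binary
  gunzip "argo-${os}-amd64.gz"                        # unpack it
  chmod +x "argo-${os}-amd64"                         # make it executable
  sudo mv "./argo-${os}-amd64" /usr/local/bin/argo    # put it on your PATH
  argo version                                        # confirm the install
}
```

Call `install_argo_cli darwin` on macOS or `install_argo_cli linux` on Linux.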
Installing Argo Controller and UI
Before installing the Argo Workflows resources, you need to create an `argo` namespace:
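As a sketch, the namespace and the Argo Workflows resources can be created as follows. The install manifest URL assumes release `v3.3.1`; use the `install.yaml` that matches your CLI version. The function name is a hypothetical label.

```shell
ARGO_VERSION="${ARGO_VERSION:-v3.3.1}"   # assumed release; match your CLI version

install_argo_workflows() {
  kubectl create namespace argo
  # Install the controller, server, and CRDs into the new namespace.
  kubectl apply -n argo -f "https://github.com/argoproj/argo-workflows/releases/download/${ARGO_VERSION}/install.yaml"
  # Check that the workflow-controller and argo-server Pods are starting.
  kubectl get pods -n argo
}
```

Run `install_argo_workflows` and wait until both Pods report `Running` before continuing.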
To access the Argo Workflows UI, you will need to expose it. This can be done in multiple ways, but for this tutorial, use the port-forwarding method:
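A minimal port-forwarding sketch, assuming the server runs as the `argo-server` Deployment in the `argo` namespace (the default for the standard install manifest):

```shell
forward_argo_ui() {
  # Blocks while forwarding; run it in a separate terminal and stop it with Ctrl+C.
  kubectl -n argo port-forward deployment/argo-server 2746:2746
}
```

Call `forward_argo_ui` and leave it running while you use the UI.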
Open your browser and go to https://127.0.0.1:2746. You will be redirected to a page for authentication.
In order for you to log in, you will need to generate an access token for the `ServiceAccount` that you will use to manage your workflows. The first step will be to create a `Role`, `ServiceAccount`, and `RoleBinding`:
- `Role`: This resource is used to determine a set of permitted operations on certain Kubernetes resources in a specific namespace.
- `RoleBinding`: This is used to determine which users or `ServiceAccounts` are authorized to execute operations on certain resources in a given namespace, as specified in an attached `Role`.
- `ServiceAccount`: A `ServiceAccount` gives processes running in a Pod an identity they can use to authenticate to your Kubernetes cluster. The API server in the control plane performs this authentication (AuthN).
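The three resources can be created imperatively with `kubectl create`. In this sketch the account name `argo-user` is hypothetical, and the `list` and `update` verbs are a minimal starting set that follows the pattern in the Argo access-token documentation.

```shell
ARGO_SA="${ARGO_SA:-argo-user}"   # hypothetical name shared by all three resources

create_argo_access() {
  # Role: allow listing and updating Workflow resources in the argo namespace.
  kubectl -n argo create role "$ARGO_SA" --verb=list,update --resource=workflows.argoproj.io
  # ServiceAccount: the identity whose token you will use to log in.
  kubectl -n argo create serviceaccount "$ARGO_SA"
  # RoleBinding: grant the Role to the ServiceAccount.
  kubectl -n argo create rolebinding "$ARGO_SA" --role="$ARGO_SA" --serviceaccount="argo:${ARGO_SA}"
}
```

Run `create_argo_access` once; rerunning it reports that the resources already exist.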
If you want to increase the permissions for your `Role`, you can create and modify a manifest file that specifies the additional operations you want to carry out, along with the resources they should be executed against. The code block below contains an example:
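One possible shape for such a manifest is sketched here. The Role name `argo-user`, the file name `argo-role.yaml`, and the exact resources and verbs are assumptions; trim them to what your workflows actually need.

```shell
# Write a Role manifest with broader permissions (all names are illustrative).
cat > argo-role.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-user          # hypothetical; match the Role bound to your ServiceAccount
  namespace: argo
rules:
- apiGroups: ["argoproj.io"]
  resources: ["workflows", "workflowtemplates", "cronworkflows"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]          # core API group, needed to read Pods and their logs
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
EOF
```

Apply it with `kubectl apply -f argo-role.yaml`. Because the existing `RoleBinding` refers to the Role by name, the new rules take effect without rebinding.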
After that, you can create an access token and store it in an environment variable (such as `ARGO_TOKEN`):
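A sketch of the token lookup, assuming the hypothetical `argo-user` ServiceAccount and a cluster running Kubernetes 1.23 or earlier, where a token Secret is created automatically for each ServiceAccount:

```shell
ARGO_SA="${ARGO_SA:-argo-user}"   # hypothetical ServiceAccount name

get_argo_token() {
  # Find the auto-generated token Secret attached to the ServiceAccount.
  secret=$(kubectl -n argo get sa "$ARGO_SA" -o=jsonpath='{.secrets[0].name}')
  # Secret values are base64 encoded, so decode the token before use.
  token=$(kubectl -n argo get secret "$secret" -o=jsonpath='{.data.token}' | base64 --decode)
  echo "Bearer ${token}"
}
```

Store and print the result with `ARGO_TOKEN="$(get_argo_token)"; echo "$ARGO_TOKEN"`. On Kubernetes 1.24 and later, token Secrets are no longer created automatically; there, `kubectl -n argo create token argo-user` issues a token that you prefix with `Bearer ` yourself.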
Take the printed Bearer token and paste it into the text area of the login section on the landing page of the Argo Workflows UI.
Now, you can explore the sidebar menu and select Workflows.
In this section, you will create both a basic and an advanced workflow.
Hello World Example
For starters, you will deploy a basic `hello world` example to familiarize yourself with the Workflow manifest file:
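The manifest below is modeled on the `hello-world` example in the Argo Workflows repository; the file name `hello-world.yaml` is an assumption used for the rest of this walkthrough.

```shell
# Write the hello world Workflow manifest to a local file.
cat > hello-world.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow                     # the custom resource type
metadata:
  generateName: hello-world-       # Argo appends a random suffix to the name
spec:
  entrypoint: whalesay             # the template that runs first
  templates:
  - name: whalesay                 # a single Container template
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]
EOF
```

The `spec` holds the two fields described earlier: `entrypoint` names the starting template, and `templates` defines it.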
In order to submit this workflow, ensure that your kubeconfig context is still set to `docker-desktop`. Argo will use these credentials when communicating with the API server. Once you’ve verified this, run the following command:
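A minimal submission sketch, assuming the manifest was saved as `hello-world.yaml` in the current directory:

```shell
submit_hello_world() {
  argo submit -n argo --watch hello-world.yaml   # submit and follow progress
  argo list -n argo                              # list workflows in the namespace
  argo get -n argo @latest                       # show details of the newest workflow
}
```

Run `submit_hello_world`; the `--watch` flag keeps the terminal attached until the workflow finishes.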
Upon completion, review the workflow in the Argo Workflows UI.
The container prints an ASCII-art whale with the text `hello world`, which is visible in the logs of the Pod. You can view the logs in the Argo Workflows UI or with the `kubectl logs` command.
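One way to fetch those logs with `kubectl` is to select the Pod by the `workflows.argoproj.io/workflow` label that Argo sets on workflow Pods. The function name is a hypothetical helper, and the workflow name comes from `argo list -n argo`.

```shell
workflow_logs() {
  # $1: the workflow name, for example hello-world-abcde
  # "main" is the container that runs the step's own command.
  kubectl -n argo logs -l "workflows.argoproj.io/workflow=$1" -c main
}
```

For example, `workflow_logs hello-world-abcde` (with your generated name substituted). The Argo CLI equivalent is `argo logs -n argo @latest`.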
Artifact Passing Example
Next, you’ll deploy a more advanced workflow that consists of generating an artifact and passing it into and out of containers in sequential steps. The artifact will be stored in MinIO, a Kubernetes-native object storage solution.
With Argo Workflows already running in your Docker Desktop cluster from the previous step, you just need to install MinIO on your cluster using `helm`:
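A sketch of the Helm install, following the pattern from older Argo documentation. It assumes the legacy MinIO chart repository at `https://helm.min.io/`, the release name `argo-artifacts`, and the `argo` namespace; newer MinIO charts use a different repository and different Secret keys.

```shell
install_minio() {
  helm repo add minio https://helm.min.io/   # legacy chart repository (assumed)
  helm repo update
  # fullnameOverride makes the Service and Secret both named argo-artifacts.
  helm install argo-artifacts minio/minio -n argo \
    --set service.type=LoadBalancer \
    --set fullnameOverride=argo-artifacts
}
```

Run `install_minio`, then check with `kubectl get pods -n argo` that the MinIO Pod is running.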
After MinIO has been deployed, your artifact server will be up and running. To access your artifact server, you can set up port forwarding with the following command:
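A port-forwarding sketch, assuming MinIO is exposed as the `argo-artifacts` Service in the `argo` namespace on its default port 9000:

```shell
forward_minio() {
  # Blocks while forwarding; run it in a separate terminal and stop it with Ctrl+C.
  kubectl -n argo port-forward service/argo-artifacts 9000:9000
}
```

With `forward_minio` running, the MinIO web interface should be reachable at http://127.0.0.1:9000.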
You will need to use the credentials (access key and secret key) from the generated `argo-artifacts` secret to log in. These values are base64 encoded by default and will need to be decoded. To do this, run the following commands:
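A decoding sketch, assuming the Secret lives in the `argo` namespace and stores its values under the keys `accesskey` and `secretkey`, as the legacy MinIO chart does. Check the real key names with `kubectl -n argo get secret argo-artifacts -o yaml` if these do not match.

```shell
minio_credential() {
  # $1: key inside the Secret, assumed to be "accesskey" or "secretkey"
  kubectl -n argo get secret argo-artifacts -o "jsonpath={.data.$1}" | base64 --decode
}
```

Print each value with `minio_credential accesskey` and `minio_credential secretkey`.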
Next, configure Argo to use MinIO as its artifact repository and use the relevant credentials for authentication. To do this, you can either follow the steps outlined here or you can use the simple patch manifest file below, created by Stephen Bailey:
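The sketch below shows what such a patch could look like; it is an illustration, not the manifest referenced above. The file name `artifact-repo-patch.yaml`, the bucket `my-bucket`, the endpoint, and the Secret key names are all assumptions, and the bucket must already exist in MinIO.

```shell
# Write a patch for the workflow-controller ConfigMap (all values are illustrative).
cat > artifact-repo-patch.yaml <<'EOF'
data:
  artifactRepository: |
    s3:
      bucket: my-bucket                   # create this bucket in MinIO first
      endpoint: argo-artifacts.argo:9000  # <service>.<namespace>:<port>
      insecure: true                      # plain HTTP inside the cluster
      accessKeySecret:
        name: argo-artifacts
        key: accesskey
      secretKeySecret:
        name: argo-artifacts
        key: secretkey
EOF
```

The `accessKeySecret` and `secretKeySecret` entries point Argo at the Secret instead of embedding credentials in the ConfigMap.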
Run the following command to execute the patch above:
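A sketch of the patch command, assuming the patch was saved as `artifact-repo-patch.yaml` and that the controller reads its settings from the `workflow-controller-configmap` ConfigMap in the `argo` namespace (the default for the standard install):

```shell
patch_artifact_repository() {
  # Merge the artifactRepository entry into the controller's ConfigMap.
  kubectl -n argo patch configmap workflow-controller-configmap \
    --patch "$(cat artifact-repo-patch.yaml)"
}
```

After running `patch_artifact_repository`, you can confirm the change with `kubectl -n argo get configmap workflow-controller-configmap -o yaml`.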
Finally, run the `artifact-passing` example stored in the Argo GitHub repository:
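Since `argo submit` accepts a URL as well as a local file, the example can be submitted straight from GitHub. This sketch pins the URL to the assumed `v3.3.1` tag so the manifest matches the installed version.

```shell
ARGO_VERSION="${ARGO_VERSION:-v3.3.1}"   # assumed tag; match your installed release

submit_artifact_passing() {
  argo submit -n argo --watch \
    "https://raw.githubusercontent.com/argoproj/argo-workflows/${ARGO_VERSION}/examples/artifact-passing.yaml"
}
```

Run `submit_artifact_passing` and watch both steps complete; the generated artifact then appears in the workflow's details in the UI.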
Download the generated artifact and view it from your local Downloads directory.
As you saw in this tutorial, Argo Workflows can function as a powerful tool for creating Kubernetes-native workflows for various use cases. You should have a better understanding of how to implement basic Workflows concepts in a single-node Kubernetes cluster with Docker Desktop. More complex, enterprise-level projects, though, might cause difficulties around scaling and optimization.
Pipekit presents a solution. The control plane for your Argo Workflows offers a cost-effective model that supports your larger Kubernetes jobs. You can quickly set up your data pipeline infrastructure and get expert help as you scale. To see how Pipekit can help, check the documentation.