Nomad

HashiCorp Nomad

Cheat sheet

Task                                      Command
Start an agent (-dev mode for testing)    nomad agent -dev -bind 0.0.0.0 -log-level INFO
Show node status                          nomad node status
Show connected members                    nomad server members
Create an example job                     nomad job init
Run a job                                 nomad job run example.nomad
Show a job's status                       nomad job status example
Examine an allocation                     nomad alloc status <allocation-id>
Examine an allocation's logs              nomad alloc logs <allocation-id> <task-name>

Concepts

See: https://learn.hashicorp.com/tutorials/nomad/get-started-vocab?in=nomad/get-started

  • Task: The smallest unit of work, executed by a task driver (e.g. a container run via Docker)
  • Group: A set of tasks that must run together on the same node (e.g. a web server and a database server)
  • Job: A set of one or more groups. This is where you constrain the scheduler, set update strategies, and apply ACLs

When defining a job, you:

  1. Specify a job definition
  2. Within the job, specify one or more groups
  3. Within each group, specify its replica count (count = N), its tasks, and its services
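
For instance, a minimal job following this structure might look like the sketch below (the job name, port, and image are made up for illustration and are not from these notes):

job "example-web" {
  datacenters = ["dc1"]

  group "web" {
    # Replica count: run two instances of this group
    count = 2

    network {
      # Dynamic host port mapped to port 8080 inside the container
      port "http" {
        to = 8080
      }
    }

    # Registers the group in Consul (if Consul integration is set up)
    service {
      name = "example-web"
      port = "http"
    }

    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args  = ["-listen=:8080", "-text=hello from nomad"]
        ports = ["http"]
      }
    }
  }
}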

Installation

Installation instructions are outlined at https://www.nomadproject.io/docs/install. For CentOS, run:

# yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
# yum -y install nomad

You can start Nomad immediately by running nomad agent -dev-connect. In dev mode, you can poke around with Nomad with minimal risk, since nothing you do will persist.

Configuration

The Nomad configuration file is located at: /etc/nomad.d/nomad.hcl.

data_dir = "/opt/nomad/data"
bind_addr = "0.0.0.0"

server {
  enabled = true
  bootstrap_expect = 1
}

client {
  enabled = true
  servers = ["127.0.0.1:4646"]
}

When you start Nomad, point it at this configuration with -config: nomad agent -config=/etc/nomad.d. You can pass -config multiple times to load additional locations. A config path can also be a directory, in which case Nomad loads all .hcl and .json files within that directory.
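
For example (the paths here are only illustrative):

# Load every .hcl and .json file in /etc/nomad.d, plus one additional file
nomad agent -config=/etc/nomad.d -config=/srv/nomad/extra.hcl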

Question: Does this file need to be replicated on all member nodes? Or does it replicate automatically somehow, as diagrammed at https://learn.hashicorp.com/tutorials/nomad/get-started-vocab?in=nomad/get-started#nomad-cluster

Docker

If you intend to use Docker volumes, you need to enable them by specifying a custom docker plugin config. An example is given below.

plugin "docker" {
    config {
        endpoint = "unix:///var/run/docker.sock"
        volumes {
            enabled      = true
            selinuxlabel = "z"
        }
    }
}

See: https://www.nomadproject.io/docs/drivers/docker#enabled-1

By default, Docker tasks within a group run in isolated network namespaces. You can place multiple containers in the same network namespace by setting mode = "bridge" in the group's network stanza. This is useful if you run a web application with a database backend but do not necessarily want to expose the database port.
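
A rough sketch of such a group (the names and images are made up for illustration; bridge networking also requires the CNI plugins to be installed on the client):

group "app" {
  network {
    mode = "bridge"
    # Dynamic host port forwarded to port 8080 inside the shared namespace
    port "http" {
      to = 8080
    }
  }

  # Both tasks share one network namespace, so "web" can reach the
  # database on localhost:5432 without the database port being published.
  task "web" {
    driver = "docker"
    config {
      image = "my-webapp:latest"   # hypothetical image
    }
  }

  task "db" {
    driver = "docker"
    config {
      image = "postgres:13"
    }
  }
}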

Other

Consul integration

Start consul: consul agent -dev

Nomad appears to just detect this automatically, like magic (by default it looks for Consul at 127.0.0.1:8500).
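
If Consul is running somewhere other than the default address, you can point Nomad at it explicitly with a consul stanza in nomad.hcl (the address shown here is just the Consul default):

consul {
  address = "127.0.0.1:8500"
}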

Persistent Volumes

Host Volumes

To have a container mount a specific host volume, we need to:

  1. Define a host_volume stanza in the Nomad agent client configuration. Restart the Nomad agent to apply.
  2. Define a volume declaration within a task's group.
  3. Define a volume_mount declaration in a task.

If you are using Docker, you may be interested in skipping all this and just mounting volumes as you would with docker-compose; see the Docker side note below.

Step 1: Define a host_volume stanza within the client block in your Nomad configuration file. See: https://www.nomadproject.io/docs/configuration/client#host_volume-stanza

host_volume "mysql" {
    path      = "/opt/mysql/data"
    read_only = false
  }

Restart Nomad and then verify that the new host volume shows up when running nomad node status -short -self.

Step 2: Within the job/group, define the volume. Eg:

group "something" {
  volume "mysql" {
    type = "host"
    source = "mysql"
    read_only = false
  }
  ...
}

Step 3: Within the task, mount this volume:

group "something" {
  ...
  task "container" {
    ...
    volume_mount {
        volume      = "mysql"
        destination = "/var/lib/mysql"
        read_only   = false
    }
    ...
  }
}

The downside/upside to this way of doing things is that the volume is tied to the Nomad agent (which is ideal if you want to only schedule this task on nodes that have the volume available), but adding additional host_volumes requires restarting the agent.

Docker side note

If you have enabled Docker volumes, you can also add additional volumes without needing to define host_volumes simply by listing the volumes within the config section of the task. For example, to mount /local/volumes/container1 to /data in the container:

task "container" {
  driver = "docker"

  config {
	image = "..."

	volumes = [
		"/local/volumes/container1:/data"
	]
...
}

NFS Volumes

I am trying to replicate what I had with docker-compose where certain volumes were automatically mounted via NFS using the ContainX/docker-volume-netshare Docker plugin. Combing through the documentation and through some trial and error, I found out that you can define a specific volume driver to use with Docker.

To get this working, install the docker-volume-netshare plugin and start the plugin as a service.
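
As a quick sketch (assuming the netshare binary from ContainX/docker-volume-netshare is installed on the Nomad client; how you run it as a proper service depends on how it was installed), the plugin can be started in NFS mode with:

# Run the netshare volume plugin in NFS mode on the Docker host.
# By default it mounts shares under /var/lib/docker-volumes/netshare/nfs/.
docker-volume-netshare nfs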

In a task definition, specify a mount configuration as follows.

task "something" {
  driver = "docker"

  config {
	image = "some-image"
	ports = ["http"]

	mount {
		type = "volume"
		target = "/TV"
		source = "tv-volume"
		# readonly = true

		volume_options {
			driver_config {
				name = "nfs"
				options {
					share = "bnas:/remote/nfs/export"
				}
			}
		}
	}

  }
  
  ...
}

The target is the path of the volume inside the container. The NFS server and export are defined as the share in the driver options. The mount source value is analogous to the volume name you would specify in the volumes definition of a docker-compose.yml, and it is the directory name netshare uses when mounting the volume on the host machine. In the example above, when the container starts, netshare will mount the volume at /var/lib/docker-volumes/netshare/nfs/tv-volume.

If you get a "volumes are not enabled; cannot mount host paths" error, you need to enable Docker volumes in the docker plugin configuration. Edit nomad.hcl and add:

plugin "docker" {
    config {
        endpoint = "unix:///var/run/docker.sock"
        volumes {
            enabled      = true
            selinuxlabel = "z"
        }
    }
}

Traefik load balancing

See: https://learn.hashicorp.com/tutorials/nomad/load-balancing-traefik

Configure Traefik with the consulCatalog provider in the traefik.toml configuration file:

# Enable Consul Catalog configuration backend.
[providers.consulCatalog]
    prefix           = "traefik"
    exposedByDefault = false

    [providers.consulCatalog.endpoint]
      address = "127.0.0.1:8500"
      scheme  = "http"

When defining a service within a job in Nomad, specify the following tags:

service {
  name = "demo-webapp"
  port = "http"

  tags = [
    "traefik.enable=true",
    "traefik.http.routers.http.rule=Path(`/myapp`)",
  ]
}
Traefik is then notified through Consul whenever this service changes and updates its backends accordingly.

Open Questions

  • Can we have the load balancing Traefik instance be an external service? If so, how?

Other Questions

  • Can the Docker driver run containers as specific users? According to https://www.nomadproject.io/docs/job-specification/task#user, "Docker and rkt images specify their own default users", which is super crummy if true, as you can't force containers to run as non-root.
  • Can tasks within the same group share a volume? Like local volumes with docker-compose (it just makes a Docker volume that can be shared). This does not need to be persistent, and I do not want to define host_volumes for something that's supposed to be temporary. ephemeral_disk doesn't share a volume across tasks (though the shared alloc/ directory is mounted into every task in an allocation, which may be enough for this). Boo.
  • Can linked containers be made? Multiple tasks within a group? No. It looks like each container is separate and they can't look each other up. Use Consul's DNS server, which can do service discovery. But how do you get this to work within Docker?