Cloud-init

From Leo's Notes
Last edited on 26 July 2023, at 16:36.

Cloud-init is a program that configures a guest virtual machine when it first boots based on configuration data that's supplied to it via network or a storage volume. It is capable of setting things like the guest's network configuration (IP addresses, routes, resolvers), account credentials and SSH authorized keys.

Building Cloud-Init images

Linux images that support Cloud-Init should have the Cloud-Init package installed and enabled on start up.

Distro Commands
RHEL / Rocky Linux / CentOS
## Cloud Init
# yum -y install cloud-init cloud-utils-growpart gdisk
# systemctl enable cloud-init.service
Alpine Linux
## Cloud Init
# apk add cloud-init

## Start Cloud-Init on Boot
# rc-update add cloud-init default
# rc-update add cloud-init-local default
# rc-update add cloud-config default
# rc-update add cloud-final default

The default user

By default, Cloud-Init creates a non-root user account named 'cloud-user'. Some distros use a different name, such as 'alpine' on Alpine Linux or 'ubuntu' on Ubuntu.

The default user account is created only if no users are defined. If you do define a list of users, you must also include the default user in that list, or it won't be created.
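
As a sketch of the second case: a users list can contain the special entry 'default', which tells Cloud-Init to keep the distro's default user alongside the accounts you define. The output path and the extra 'leo' account below are placeholders, not something the notes above prescribe.

```shell
# Placeholder path for the generated user data; override with USER_DATA=...
USER_DATA="${USER_DATA:-/tmp/user-data.yaml}"

# Write a cloud-config whose users list keeps the default user.
cat <<'EOF' > "$USER_DATA"
#cloud-config
users:
  - default            # keep the distro's default user (e.g. cloud-user)
  - name: leo          # example additional account
    groups: users, wheel
EOF
```

Dropping the '- default' line from this file would leave only 'leo' on the resulting VM.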

Do not create the default user account
I noticed that some Packer builds create the default user account as part of the initial image setup. Don't do this, because it makes assumptions about how the VM will be configured.

The user may have overridden the default username with something else, or may want to skip creating the default user account entirely. Precreating the account causes problems in both scenarios.

Additionally, leaving the account creation to Cloud-Init allows it to take care of additional account settings including the SSH authorized key setup and group memberships.


Changing the default user username

To change the default user's username, edit the system_info section of the /etc/cloud/cloud.cfg configuration file:

system_info:
  default_user:
    name: cloud-user
    lock_passwd: true
    gecos: Cloud User

For my Rocky Linux image, I ran the following to change the default username as part of the kickstart post install script.

## The default user is 'cloud-user'. We'll change this to 'rocky'.
# sed -i "s/name: cloud-user/name: rocky/" /etc/cloud/cloud.cfg

The default user has their password locked, thereby requiring authentication using SSH keys. You may change this by modifying cloud.cfg as well:

# sed -i "s/lock_passwd: true/lock_passwd: false/" /etc/cloud/cloud.cfg
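
A plain sed substitution succeeds silently when the pattern is absent, for instance if the distro's cloud.cfg already uses a different default username. The following is a minimal sketch of a guarded version; the function name is hypothetical and it assumes a sed that supports -i (GNU or BusyBox).

```shell
# Rename the default user in a cloud.cfg-style file.
# $1: path to the config file, $2: old username, $3: new username
rename_default_user() {
  cfg="$1"; old="$2"; new="$3"
  # Fail early if the expected entry is missing, so a no-op sed is noticed.
  if ! grep -q "name: ${old}\$" "$cfg"; then
    echo "no 'name: ${old}' entry in ${cfg}" >&2
    return 1
  fi
  sed -i "s/name: ${old}\$/name: ${new}/" "$cfg"
}
```

In a kickstart post-install script this could be called as: rename_default_user /etc/cloud/cloud.cfg cloud-user rocky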

Hypervisor support

Proxmox

Proxmox supports cloud-init using either the NoCloud (v1, used for Linux based guest OSes) or ConfigDrive (v2, used for Windows based guest OSes) data sources. To learn more about different Cloud-Init data sources, see https://cloudinit.readthedocs.io/en/latest/topics/datasources.html#known-sources.

NoCloud configuration data is passed to the guest using a 4MB CD-ROM image. If you are creating a new VM, you must attach this CD-ROM device under 'Hardware' -> 'Add' -> 'CloudInit Drive'. Any change to the Cloud-Init settings requires this CD-ROM image to be regenerated.
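
The same drive can also be attached from the Proxmox host's shell with the qm tool. This is a sketch only: the VM ID 9000 and the storage name 'local-lvm' are placeholders, the function name is hypothetical, and it assumes the ide2 slot is free on the VM.

```shell
# Command used to talk to Proxmox; override (e.g. QM=echo) for a dry run.
QM="${QM:-qm}"

# Attach a Cloud-Init drive to a VM on the ide2 slot.
# $1: VM ID, $2: storage name
attach_cloudinit_drive() {
  "$QM" set "$1" --ide2 "${2}:cloudinit"
}
```

Example call on a Proxmox host: attach_cloudinit_drive 9000 local-lvm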

When building images for use with Proxmox, it's recommended to add the following cloud.cfg.d file to restrict the data sources to only the supported options. Otherwise, you may have the VM try downloading the configs using some other method until it times out.

# /etc/cloud/cloud.cfg.d/99_pve.cfg
datasource_list: [ NoCloud, ConfigDrive ]

CloudStack

CloudStack has its own data source type. Source code for the data source is available at https://github.com/canonical/cloud-init/blob/main/cloudinit/sources/DataSourceCloudStack.py.

To configure CloudInit to use CloudStack, add the following configuration:

# cat <<EOF > /etc/cloud/cloud.cfg.d/99_cloudstack.cfg
datasource:
  CloudStack: {}
  None: {}
datasource_list:
  - CloudStack
EOF

The CloudStack data source does two things: it fetches the user data containing the user's cloud-init config, and it sets a custom password on the default user account. The data is obtained from services running on the virtual router (i.e. your VM's gateway) on port 80 (metadata) and port 8080 (password). Your VM template should allow access to these services for Cloud-Init to function properly.

A few things you can try are:

Setting VM passwords

For VM templates that support passwords, CloudStack generates a random 6-character password when the VM is created or when a password reset is requested. The password is stored on the virtual router for the VM's network, where it can be found at /var/cache/cloud/passwords-$network. The guest VM, running either Cloud-Init or the cloud-set-guest-password script, then fetches the password and sets it on the default user account (as defined in the Cloud-Init config or the script, respectively). Finally, the guest makes a second call to clear the saved password from the virtual router.
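
To reproduce the two calls by hand from inside a guest, something like the sketch below can help. It assumes the password service answers plain HTTP requests carrying a 'DomU_Request' header, as the cloud-set-guest-password script does; the function names and the example router address are hypothetical.

```shell
# HTTP client to use; override (e.g. CURL=echo) to print the call instead.
CURL="${CURL:-curl}"

# Ask the virtual router for the pending password.
# $1: virtual router address (normally the VM's default gateway)
fetch_cloudstack_password() {
  "$CURL" -s --max-time 20 -H "DomU_Request: send_my_password" "http://${1}:8080/"
}

# Tell the virtual router the password was saved, so it clears its copy.
ack_cloudstack_password() {
  "$CURL" -s --max-time 20 -H "DomU_Request: saved_password" "http://${1}:8080/"
}
```

Example: fetch_cloudstack_password 10.1.1.1 followed by ack_cloudstack_password 10.1.1.1. Note that acknowledging clears the password on the router, so Cloud-Init will no longer be able to fetch it afterwards.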

If the password isn't being set in your VM, there are a few things to check for:

  • Is CloudInit running? You should see some log messages in journalctl about a password being obtained and set.
  • Is the default user defined? Is the default user being created? Remember, the password is set on the default user account (not root) by the CloudStack Cloud-Init data source.
  • If you provide a custom users list in your cloud init data, you'll need to also include the default user or else it won't be created and the password won't be set.
  • Is the password being set on the virtual router? If you log in to the virtual router, you can tail the /var/cache/cloud/passwords-$network file and see whether the password is being set by the management server and cleared by the guest VM.
  • If you are intentionally setting the password on the root account, make sure the root account isn't locked. You have to set disable_root: false, as it defaults to true.

Cloud Configs

See: https://cloudinit.readthedocs.io/en/latest/topics/examples.html

Modules: https://cloudinit.readthedocs.io/en/latest/topics/modules.html

Example cloud-config snippets

Here are some cloud config snippets. You can combine any number of these into a single file.

Create a new user

Create a new user in the wheel group. The passwd hash can be obtained by running 'openssl passwd -1 -salt CUSTOMSALT'.

#cloud-config
users:
  - name: leo
    lock_passwd: false
    inactive: false
    gecos: Leo Leung
    primary_group: leo
    groups: users, wheel
    passwd: $1$ADUODeAy$eCJ1lPSxhSGmSvrmWxjLC1
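
The -1 option produces an MD5-based hash. As a hedged alternative, assuming OpenSSL 1.1.1 or newer is available, the -6 option produces a SHA-512-based hash that can be used in the passwd field the same way. The function name below is hypothetical; the password is fed on stdin so it does not show up in the process list.

```shell
# Print a SHA-512 crypt hash for use in a cloud-config 'passwd:' field.
# $1: plaintext password, $2: salt (up to 16 characters)
make_passwd_hash() {
  printf '%s\n' "$1" | openssl passwd -6 -salt "$2" -stdin
}
```

Example: make_passwd_hash 'my-password' CUSTOMSALT prints a string starting with $6$CUSTOMSALT$.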

Enable the root account and set a root password

Most VM templates disable the root account. To enable the root account and set the root password:

#cloud-config
disable_root: false
chpasswd:
  list:
    - root:password
  expire: false

Install specific packages

You can add additional packages or perform system updates when the system first comes up.

#cloud-config
packages:
 - tmux
 - git

package_update: true
package_upgrade: true

Create a file

Use the write_files module to create files. You can use other encodings like base64 as well.

#cloud-config
write_files:
  - path: /hello
    content: |
      Hello CloudInit
    permissions: '0644'
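
For the base64 case, here is a small sketch that turns an existing file into a write_files entry. The function name is hypothetical; it relies on the 'encoding: b64' option of write_files and prints the cloud-config to stdout.

```shell
# Emit a cloud-config that recreates a local file on the guest.
# $1: destination path on the guest, $2: local source file
emit_write_file_b64() {
  dest="$1"; src="$2"
  # Strip newlines so the encoded content stays on a single YAML line.
  b64=$(base64 < "$src" | tr -d '\n')
  cat <<EOF
#cloud-config
write_files:
  - path: ${dest}
    encoding: b64
    content: ${b64}
    permissions: '0644'
EOF
}
```

Example: emit_write_file_b64 /hello ./hello.txt > user-data.yaml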

Resize the root LVM volume on startup

Assuming that the LVM physical volume is the third partition of /dev/vda and that the root filesystem is XFS, you could do something like this:

#cloud-config
write_files:
  - path: /usr/bin/expand_lvm_root
    content: |
      #!/bin/bash
      /usr/bin/growpart /dev/vda 3
      /usr/sbin/pvresize -y -q /dev/vda3
      /usr/sbin/lvresize -y -q -r -l +100%FREE /dev/mapper/*root
      /usr/sbin/xfs_growfs /
    permissions: '0700'

runcmd:
  - /usr/bin/expand_lvm_root
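
The script above hardcodes /dev/vda and partition 3. If the template may land on other disk types, one option is to derive the disk and partition number from the physical volume's device name. The helper below is a hypothetical sketch that only does the string splitting, including the 'p' separator used by NVMe and MMC device names.

```shell
# Split a partition device into "<disk> <partition number>".
# $1: partition device, e.g. /dev/vda3 or /dev/nvme0n1p3
split_partition() {
  dev="$1"
  part="${dev##*[!0-9]}"       # trailing digits, e.g. 3
  disk="${dev%"$part"}"        # everything before them
  case "$disk" in
    *[0-9]p) disk="${disk%p}" ;;   # nvme0n1p3 -> nvme0n1
  esac
  printf '%s %s\n' "$disk" "$part"
}
```

The two values can then be passed to growpart, for example: set -- $(split_partition /dev/vda3); growpart "$1" "$2"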

POST data when done booting

Cloud Init can POST to a web server once the system is ready. This could be useful to automate an Ansible call on the new VM, for example.

#cloud-config
phone_home:
    url: http://webhook-server/$INSTANCE_ID/
    post: [ pub_key_dsa, pub_key_rsa, pub_key_ecdsa, instance_id ]
    tries: 10

Tasks

Disable cloud-init

The cloud-init service is set up so that it doesn't run if /etc/cloud/cloud-init.disabled exists or if 'cloud-init=disabled' appears on the kernel command line.
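
Both conditions are easy to script. The helpers below are a sketch with hypothetical names; the optional arguments exist only so the functions can be pointed at a mounted image or a test directory instead of the live system.

```shell
# Create the marker file that disables cloud-init.
# $1: optional root prefix (empty means the running system)
disable_cloud_init() {
  root="${1:-}"
  mkdir -p "${root}/etc/cloud"
  touch "${root}/etc/cloud/cloud-init.disabled"
}

# Succeed if the kernel command line asks for cloud-init to be disabled.
# $1: optional file to read instead of /proc/cmdline
disabled_on_cmdline() {
  grep -qw 'cloud-init=disabled' "${1:-/proc/cmdline}"
}
```

Removing /etc/cloud/cloud-init.disabled again re-enables cloud-init on the next boot.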

Building a Cloud-init Linux template image

After installing the Linux distro, you will have to do the following to prepare it as a template image:

  1. Install cloud-init packages. Ensure it's enabled.
    yum -y install cloud-init cloud-utils-growpart gdisk
    systemctl enable cloud-init.service
    
  2. Edit /etc/cloud/cloud.cfg and ensure the defaults work for your cloud environment. You might want to ensure the default user is set up appropriately.
  3. For CloudStack, add in the CloudStack datasource by adding a file in /etc/cloud/cloud.cfg.d/99_cloudstack.cfg:
    cat <<EOF > /etc/cloud/cloud.cfg.d/99_cloudstack.cfg
    datasource:
      CloudStack: {}
      None: {}
    datasource_list:
      - CloudStack
    EOF
    
  4. Clean up anything that identifies this host.
    # Remove the SSH host keys so each VM generates its own
    shred -u /etc/ssh/*_key /etc/ssh/*_key.pub

    # Clear login records (': >' truncates without writing a stray newline)
    rm -f /var/run/utmp
    : > /var/log/lastlog
    : > /var/log/wtmp
    : > /var/log/btmp

    rm -rf /tmp/* /var/tmp/*

    # If you're updating an existing template with cloud-init, ensure anything cloud-init generated is cleaned up
    rm -rf /var/lib/cloud/instances/* /var/lib/cloud/data/*

    # Stop recording shell history and remove existing history files
    unset HISTFILE
    rm -rf /home/*/.*history /root/.*history

    # Red Hat kickstart file
    rm -f /root/*ks*cfg

    # Fill the free disk space with zeros, then remove the filler files
    dd if=/dev/zero of=/EMPTY bs=1M || true
    dd if=/dev/zero of=/boot/EMPTY bs=1M || true
    rm -f /EMPTY
    rm -f /boot/EMPTY

    # Block until the empty file has been removed, otherwise, Packer
    # will try to kill the box while the disk is still full and that's bad
    sync
    sync

  5. Power off and snapshot it as a template.
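
Before powering off, it can be worth confirming that the cleanup in step 4 actually took effect. The check below is a sketch with a hypothetical function name; it only looks for leftover SSH host keys and cloud-init instance data, and the optional root prefix lets it run against a mounted image.

```shell
# Report leftovers that would identify the template's original host.
# $1: optional root prefix (empty means the running system)
# Returns 0 when nothing is found, 1 otherwise.
verify_template_clean() {
  root="${1:-}"
  rc=0
  for f in "$root"/etc/ssh/*_key "$root"/etc/ssh/*_key.pub \
           "$root"/var/lib/cloud/instances/*; do
    # An unmatched glob stays literal and fails the -e test, so it is skipped.
    if [ -e "$f" ]; then
      echo "leftover: $f"
      rc=1
    fi
  done
  return "$rc"
}
```

Example: verify_template_clean || echo "template still contains host-specific files"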

Troubleshooting

Cloud-Init isn't setting the static IP address

On Alpine Linux, Cloud-Init runs but the VM is still obtaining the IP address via DHCP.

After much debugging, I believe the issue comes down to the networking service starting before cloud-init. As a result, the network config in /etc/network/interfaces isn't written by Cloud-Init in time, and the networking service falls back to DHCP. The fix is to make networking start after cloud-init. To do this, remove the networking service from the boot runlevel and re-add it to the default runlevel. Because 'networking' comes after 'cloud-init', cloud-init should run before networking.

As part of my packer provisioning script, I added the following lines:

## Start networking after cloud-init
# rc-update del networking boot
# rc-update add networking default
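
To confirm the change took effect, the runlevel assignments can be read back from 'rc-update show'. The helper below is a sketch with a hypothetical name; it assumes the usual 'service | runlevels' output format and reads that output on stdin.

```shell
# Print the runlevels a service is assigned to, one per line.
# stdin: output of `rc-update show`, $1: service name
service_runlevels() {
  awk -F'|' -v svc="$1" '
    { name = $1; gsub(/[ \t]/, "", name) }
    name == svc {
      n = split($2, levels, " ")
      for (i = 1; i <= n; i++) print levels[i]
    }
  '
}
```

Example: rc-update show | service_runlevels networking should now list default and no longer list boot.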