<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://leo.leung.xyz/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Leo</id>
	<title>Leo&#039;s Notes - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://leo.leung.xyz/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Leo"/>
	<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/Special:Contributions/Leo"/>
	<updated>2026-04-29T23:52:31Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.6</generator>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=FreeIPA&amp;diff=7728</id>
		<title>FreeIPA</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=FreeIPA&amp;diff=7728"/>
		<updated>2026-04-07T06:00:20Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Configure the Samba server */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Installation ==&lt;br /&gt;
Here are my notes as I fumble my way setting up FreeIPA.&lt;br /&gt;
&lt;br /&gt;
=== Docker ===&lt;br /&gt;
There is an official Docker container that has a complete FreeIPA installation. This container uses systemd to start up FreeIPA along with the other related services such as OpenLDAP, Bind, and Kerberos. See more at: https://github.com/freeipa/freeipa-container&lt;br /&gt;
&lt;br /&gt;
If you are using Docker, you &#039;&#039;&#039;must disable cgroup v2&#039;&#039;&#039; (this is enabled by default on RHEL9 and above). More about this in the Troubleshooting section below.&lt;br /&gt;
&lt;br /&gt;
Use the following &amp;lt;code&amp;gt;docker-compose.yml&amp;lt;/code&amp;gt; stack to quickly get started with FreeIPA:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
&lt;br /&gt;
services:&lt;br /&gt;
&lt;br /&gt;
  freeipa:&lt;br /&gt;
    image: freeipa/freeipa-server:rocky-8&lt;br /&gt;
    restart: unless-stopped&lt;br /&gt;
    tty: true&lt;br /&gt;
    stdin_open: true&lt;br /&gt;
    hostname: ipa&lt;br /&gt;
    domainname: home.steamr.com&lt;br /&gt;
    extra_hosts:&lt;br /&gt;
      - &amp;quot;ipa.home.steamr.com:10.1.2.12&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - IPA_SERVER_HOSTNAME=ipa.home.steamr.com&lt;br /&gt;
      - IPA_SERVER_IP=10.1.2.12&lt;br /&gt;
      - DNS=10.1.0.8&lt;br /&gt;
      - TZ=America/Edmonton&lt;br /&gt;
    command:&lt;br /&gt;
      - ipa-server-install&lt;br /&gt;
      - --realm=home.steamr.com&lt;br /&gt;
      - --domain=home.steamr.com&lt;br /&gt;
      - --ds-password=xxxxxxxxxx&lt;br /&gt;
      - --admin-password=xxxxxxxxxx&lt;br /&gt;
      - --no-host-dns&lt;br /&gt;
      - --setup-dns&lt;br /&gt;
      - --auto-forwarders&lt;br /&gt;
      - --allow-zone-overlap&lt;br /&gt;
      - --no-dnssec-validation&lt;br /&gt;
      - --unattended&lt;br /&gt;
    sysctls:&lt;br /&gt;
      - net.ipv6.conf.all.disable_ipv6=0&lt;br /&gt;
    volumes:&lt;br /&gt;
      - ./data:/data&lt;br /&gt;
      - ./logs:/var/logs&lt;br /&gt;
      - /sys/fs/cgroup:/sys/fs/cgroup:ro&lt;br /&gt;
    tmpfs:&lt;br /&gt;
      - /run&lt;br /&gt;
      - /var/cache&lt;br /&gt;
      - /tmp&lt;br /&gt;
    cap_add:&lt;br /&gt;
      - SYS_TIME&lt;br /&gt;
    ports:&lt;br /&gt;
      - &amp;quot;10.1.2.12:80:80/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:443:443/tcp&amp;quot;&lt;br /&gt;
      # DNS&lt;br /&gt;
      - &amp;quot;10.1.2.12:53:53/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:53:53/udp&amp;quot;&lt;br /&gt;
      # LDAP(S)&lt;br /&gt;
      - &amp;quot;10.1.2.12:389:389/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:636:636/tcp&amp;quot;&lt;br /&gt;
      # Kerberos&lt;br /&gt;
      - &amp;quot;10.1.2.12:88:88/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:464:464/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:88:88/udp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:464:464/udp&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Samba integration ==&lt;br /&gt;
There are 3 methods to using FreeIPA with Samba. I eventually settled on method #2.&lt;br /&gt;
&lt;br /&gt;
# Configure Samba to use FreeIPA as a simple LDAP server, using &#039;&#039;&#039;ldapsam&#039;&#039;&#039; as the passdb backend. This requires a schema change to include the sambaSAMAccount and sambaGroupMapping, and sambaSID object classes. There is a DNA (distributed numeric assignment) plugin that can be used to update these fields.&lt;br /&gt;
# Configure Samba to use FreeIPA using &#039;&#039;&#039;ipasam&#039;&#039;&#039; as the passdb backend. This requires the &amp;lt;code&amp;gt;ipasam.so&amp;lt;/code&amp;gt; module installed on the samba servers.&lt;br /&gt;
# Configure Samba to use &#039;&#039;&#039;Kerberos&#039;&#039;&#039;. Does not seem to allow users to use password authentication. &lt;br /&gt;
&lt;br /&gt;
=== Method 1: ldapsam ===&lt;br /&gt;
{{Warning|I didn&#039;t get this to work|I wasn&#039;t able to fully get this method to work. If you are using OpenLDAP only, this way of integrating Samba does work. The issue I had was getting the DNA plugin to work as advertised. &lt;br /&gt;
&lt;br /&gt;
I eventually settled on the ipasam method in the section below.}}&lt;br /&gt;
To use ldapsam, we need to make some changes to the FreeIPA LDAP server by adding sambaSAMAccount and sambaGroupMapping as a default user object class and group object class. &lt;br /&gt;
&lt;br /&gt;
You can either set this in the FreeIPA web interface under configuration, or run:{{Highlight&lt;br /&gt;
| code = # ldapmodify &amp;lt;&amp;lt;EOF&lt;br /&gt;
dn: cn=ipaConfig,cn=etc,dc=home,dc=steamr,dc=com&lt;br /&gt;
changetype: modify&lt;br /&gt;
add: ipaUserObjectClasses&lt;br /&gt;
ipaUserObjectClasses: sambaSAMAccount&lt;br /&gt;
-&lt;br /&gt;
add: ipaGroupObjectClasses&lt;br /&gt;
ipaGroupObjectClasses: sambaGroupMapping&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}We will then need to make a custom DNA (distributed numeric assignment) plugin to update these attributes whenever something related to the object (such as the password) is changed. This can be done by adding a DNA object into LDAP.{{Highlight&lt;br /&gt;
| code = ldapadd &amp;lt;&amp;lt;EOF&lt;br /&gt;
dn: cn=SambaSid,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: extensibleObject&lt;br /&gt;
dnatype: sambaSID&lt;br /&gt;
dnaprefix: S-1-5-21-2049073866-1371207509-1214748462&lt;br /&gt;
dnainterval: 1&lt;br /&gt;
dnamagicregen: assign&lt;br /&gt;
dnafilter: ({{!}}(objectclass=sambasamaccount)(objectclass=sambagroupmapping))&lt;br /&gt;
dnascope: dc=home,dc=steamr,dc=com&lt;br /&gt;
cn: SambaSid&lt;br /&gt;
dnanextvalue: 2&lt;br /&gt;
&lt;br /&gt;
dn: cn=sambaGroupType,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: extensibleObject&lt;br /&gt;
cn: sambaGroupType&lt;br /&gt;
dnatype: sambaGroupType&lt;br /&gt;
dnainterval: 1&lt;br /&gt;
dnamagicregen: assign&lt;br /&gt;
dnafilter: (objectClass=sambagroupmapping)&lt;br /&gt;
dnascope: dc=home,dc=steamr,dc=com&lt;br /&gt;
dnanextvalue: 2&lt;br /&gt;
EOF&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
A ldapsam user entry should have these fields. I don&#039;t see these though.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = dn: uid=guest2, ou=People,dc=quenya,dc=org&lt;br /&gt;
sambaLMPassword: 878D8014606CDA29677A44EFA1353FC7&lt;br /&gt;
sambaPwdMustChange: 2147483647&lt;br /&gt;
sambaPrimaryGroupSID: S-1-5-21-2447931902-1787058256-3961074038-513&lt;br /&gt;
sambaNTPassword: 552902031BEDE9EFAAD3B435B51404EE&lt;br /&gt;
sambaPwdLastSet: 1010179124&lt;br /&gt;
sambaLogonTime: 0&lt;br /&gt;
objectClass: sambaSamAccount&lt;br /&gt;
uid: guest2&lt;br /&gt;
sambaKickoffTime: 2147483647&lt;br /&gt;
sambaAcctFlags: [UX         ]&lt;br /&gt;
sambaLogoffTime: 2147483647&lt;br /&gt;
sambaSID: S-1-5-21-2447931902-1787058256-3961074038-5006&lt;br /&gt;
sambaPwdCanChange: 0&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Configure the Samba server ====&lt;br /&gt;
You can either use a specific binding credential that&#039;s shared across all your samba servers, or use the machine&#039;s cifs service account to authenticate to the LDAP server.&lt;br /&gt;
&lt;br /&gt;
I tried to do the following using the admin account as the bind DN: (&#039;&#039;&#039;using the admin account like this is probably a bad idea, I&#039;m just testing&#039;&#039;&#039;)&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
	# freeipa configurations&lt;br /&gt;
	passdb backend = ipasam:ldap://home.steamr.com&lt;br /&gt;
	ldap admin dn = uid=admin,cn=users,cn=accounts,dc=home,dc=steamr,dc=com&lt;br /&gt;
	ldapsam:trusted = yes&lt;br /&gt;
	ldap suffix = cn=accounts,dc=home,dc=steamr,dc=com&lt;br /&gt;
	ldap user suffix = cn=users,cn=accounts&lt;br /&gt;
	ldap machine suffix = cn=computers,cn=accounts&lt;br /&gt;
	ldap group suffix = cn=groups,cn=accounts&lt;br /&gt;
	ldap passwd sync = only&lt;br /&gt;
	ldap ssl = no&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Run &amp;lt;code&amp;gt;smbpasswd -w password&amp;lt;/code&amp;gt; to set your bind credential passwords.  &lt;br /&gt;
&lt;br /&gt;
=== Method 2: ipasam ===&lt;br /&gt;
See: https://bgstack15.wordpress.com/2017/05/10/samba-share-with-freeipa-auth/&lt;br /&gt;
&lt;br /&gt;
Install the adtrust components on the FreeIPA server. Install &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; and run &amp;lt;code&amp;gt;ipa-adtrust-install --add-sids&amp;lt;/code&amp;gt;. This will add the additional IPASAM attributes such as ipaNtPassword in user objects.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-adtrust-install --add-sids&lt;br /&gt;
## Answer yes to overwrite smb.conf&lt;br /&gt;
## Answer yes to install slapi-nis&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Ensure that your hostname is set to the FQDN of the hostname otherwise this process will fail. If you are using an external DNS server, ensure that the additional service records are present. If you are using a Docker container, set the hostname to the full FQDN (eg. ipa.example.com, rather than just &#039;ipa&#039;).&lt;br /&gt;
&lt;br /&gt;
Next, create a new user and then change the user&#039;s password. This should populate the &amp;lt;code&amp;gt;ipaNTPassword&amp;lt;/code&amp;gt; attribute. {{Highlight&lt;br /&gt;
| code = # ipa user-add leo --first=Leo --last=Leung&lt;br /&gt;
# ipa group-add-member smbgrp --users=leo&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
When modifying smb.conf, the service account or bind DN must have access to these new ipasam attributes. You need to create a new set of privilege and role and grant the account access. Create the role and permissions in the web interface or run:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa permission-add &amp;quot;CIFS server can read user passwords&amp;quot;  --attrs={ipaNTHash,ipaNTSecurityIdentifier} --type=user --right={read,search,compare} --bindtype=permission&lt;br /&gt;
# ipa privilege-add &amp;quot;CIFS server privilege&amp;quot;&lt;br /&gt;
# ipa privilege-add-permission &amp;quot;CIFS server privilege&amp;quot; --permission=&amp;quot;CIFS server can read user passwords&amp;quot;&lt;br /&gt;
# ipa role-add &amp;quot;CIFS server&amp;quot;&lt;br /&gt;
# ipa role-add-privilege &amp;quot;CIFS server&amp;quot; --privilege=&amp;quot;CIFS server privilege&amp;quot;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Then, add your service account or bind DN to the &#039;CIFS server&#039; role. For example:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
# ipa role-add-member &amp;quot;CIFS server&amp;quot; --services=cifs/dnas.home.steamr.com&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Configure the Samba server ====&lt;br /&gt;
Generate a keytab file for samba.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # kinit -kt /etc/krb5.keytab&lt;br /&gt;
## Note: we ran this in the previous step.&lt;br /&gt;
## ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
# ipa-getkeytab -s ipa.home.steamr.com -p cifs/dnas.home.steamr.com -k /etc/samba/samba.keytab&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Then, tweak smb.conf.{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
passdb backend = ipasam:ldap://ipa.home.steamr.com&lt;br /&gt;
ldapsam:trusted = yes&lt;br /&gt;
ldap suffix = dc=home,dc=steamr,dc=com&lt;br /&gt;
ldap user suffix = cn=users,cn=accounts&lt;br /&gt;
ldap machine suffix = cn=computers,cn=accounts&lt;br /&gt;
ldap group suffix = cn=groups,cn=accounts&lt;br /&gt;
ldap ssl = no&lt;br /&gt;
idmap config * : backend = tdb  &lt;br /&gt;
create krb5 conf = No &lt;br /&gt;
dedicated keytab file = FILE:/etc/samba/samba.keytab&lt;br /&gt;
kerberos method = dedicated keytab&lt;br /&gt;
| lang = text&lt;br /&gt;
}}If you get: &amp;lt;code&amp;gt;NT_STATUS_BAD_TOKEN_TYPE&amp;lt;/code&amp;gt;, you need to disable MS-PAC in the FreeIPA settings or disable it specifically for this cifs service account.&lt;br /&gt;
&lt;br /&gt;
The ipasam passdb provider is available from the &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; package. However, this package also pulls in a ton of other IPA dependencies which aren&#039;t needed if you just want to run Samba that talks to IPA and not the entire FreeIPA server. If you just want the provider to work on a bare minimal samba server, you can simply just copy (or extract from the &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; package) the &amp;lt;code&amp;gt;ipasam.so&amp;lt;/code&amp;gt; file to &amp;lt;code&amp;gt;/usr/lib64/samba/pdb/ipasam.so&amp;lt;/code&amp;gt; with this set of commands:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum download ipa-server-trust-ad&lt;br /&gt;
# mkdir x &amp;amp;&amp;amp; cd x&lt;br /&gt;
# rpm2cpio ../ipa-server-trust-ad*rpm {{!}} cpio -id ./usr/lib64/samba/pdb/ipasam.so&lt;br /&gt;
# cp ./usr/lib64/samba/pdb/ipasam.so /usr/lib64/samba/pdb/ipasam.so&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Method 3: Kerberos ===&lt;br /&gt;
&lt;br /&gt;
This method is similar to the ipasam method above and you will need to set up the server in the same way. However, the way you configure Samba is different. &lt;br /&gt;
&lt;br /&gt;
==== On the Samba server ====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Join the samba server to FreeIPA&lt;br /&gt;
# ipa-client-install&lt;br /&gt;
&lt;br /&gt;
## Then add the client to samba.&lt;br /&gt;
# ipa-client-samba&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Note that when you try adding the samba client, the IPA server has to be able to resolve the A record for the Samba server you&#039;re adding or else it will fail.&lt;br /&gt;
&lt;br /&gt;
This should automatically set up the cifs service accounts for this particular samba server, get the samba keytab file in &amp;lt;code&amp;gt;/etc/samba/samba.keytab&amp;lt;/code&amp;gt;, and then tweak the smb.conf file to use this keytab file. &lt;br /&gt;
&lt;br /&gt;
The smb.conf file now looks like:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
    # Limit number of forked processes to avoid SMBLoris attack&lt;br /&gt;
    max smbd processes = 1000&lt;br /&gt;
    # Use dedicated Samba keytab. The key there must be synchronized&lt;br /&gt;
    # with Samba tdb databases or nothing will work&lt;br /&gt;
    dedicated keytab file = FILE:/etc/samba/samba.keytab&lt;br /&gt;
    kerberos method = dedicated keytab&lt;br /&gt;
    # Set up logging per machine and Samba process&lt;br /&gt;
    log file = /var/log/samba/log.%m&lt;br /&gt;
    log level = 1&lt;br /&gt;
    # We force &#039;member server&#039; role to allow winbind automatically&lt;br /&gt;
    # discover what is supported by the domain controller side&lt;br /&gt;
    server role = member server&lt;br /&gt;
    realm = HOME.STEAMR.COM&lt;br /&gt;
    netbios name = DNAS&lt;br /&gt;
    workgroup = HOME&lt;br /&gt;
    # Local writable range for IDs not coming from IPA or trusted domains&lt;br /&gt;
    idmap config * : range = 0 - 0&lt;br /&gt;
    idmap config * : backend = tdb&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
    idmap config HOME : range = 100000 - 299999&lt;br /&gt;
    idmap config HOME : backend = sss&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Default homes share&lt;br /&gt;
[homes]&lt;br /&gt;
    read only = no&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
You need to then start winbind and smb.  For some reason, this method doesn&#039;t seem to work as winbind is stuck with &amp;quot;&amp;lt;code&amp;gt;wb_parent_idmap_setup_lookupname_done: Lookup domain name &#039;home&#039; failed &#039;NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND&#039;&amp;lt;/code&amp;gt;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
=== Debug samba issues ===&lt;br /&gt;
Use &amp;lt;code&amp;gt;smbclient&amp;lt;/code&amp;gt; to help debug issues. This utility is provided by the &amp;lt;code&amp;gt;samba-client&amp;lt;/code&amp;gt; package. You can then test authentication by running: &amp;lt;code&amp;gt;smbclient -d 10 -U leo //dnas/home&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tasks ==&lt;br /&gt;
&lt;br /&gt;
=== Join a computer to a FreeIPA ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;ipa-client-install&amp;lt;/code&amp;gt; command to add a computer to a FreeIPA server. This should also automatically add a computer account, generate a keytab file, and tweak sssd to use FreeIPA as an authentication mechanism.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-client-install -U -p admin -w $Password --server ipa.home.steamr.com --domain home.steamr.com --force-join --no-ntp --fixed-primary&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== Error: did not receive Kerberos credentials ===&lt;br /&gt;
Tools such as &#039;ipa&#039; uses your session&#039;s Kerberos tickets for authentication. If you don&#039;t have any tickets or if your tickets expired, you may get an &amp;lt;code&amp;gt;ipa: ERROR: did not receive Kerberos credentials&amp;lt;/code&amp;gt; error. Fix this by running:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Renew/obtain Kerberos tickets for &#039;admin&#039;&lt;br /&gt;
# kinit admin&lt;br /&gt;
Password for admin@HOME.STEAMR.COM:  ****&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Verify if your tickets are available with &amp;lt;code&amp;gt;klist&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # klist&lt;br /&gt;
Ticket cache: FILE:/tmp/krb5cc_0&lt;br /&gt;
Default principal: admin@STEAMR.COM&lt;br /&gt;
&lt;br /&gt;
Valid starting     Expires            Service principal&lt;br /&gt;
03/06/22 14:44:35  03/07/22 14:39:47  krbtgt/STEAMR.COM@STEAMR.COM&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Container issues ===&lt;br /&gt;
&lt;br /&gt;
* Don&#039;t mount /var/log because the container image symlinks everything into /data. If you do mount /var/log, make sure you make the expected directories or else the installer will fail.&lt;br /&gt;
* Error with &amp;lt;code&amp;gt;AssertionError: Another instance named &#039;HOME-STEAMR-COM&#039; may already exist&amp;lt;/code&amp;gt;. I can&#039;t figure out what&#039;s causing lib389 to think there&#039;s another instance. I built a container image on top of this image with the assertion patched out. This seemed to have fixed the issue. &lt;br /&gt;
* {{Highlight&lt;br /&gt;
| code = FROM freeipa/freeipa-server:rocky-8&lt;br /&gt;
RUN sed &#039;s/assert_c(len(insts)/# assert_c(len(insts)/&#039; -i /usr/lib/python3.6/site-packages/lib389/instance/setup.py&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Can&#039;t find the ipa-adtrust-install package ====&lt;br /&gt;
The FreeIPA packages are under a different app stream repo. Enable it by running &amp;lt;code&amp;gt;dnf -y module enable idm:DL1&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== sssd: Decrypt integrity check failed ===&lt;br /&gt;
After recreating the FreeIPA server, I uninstalled and reinstalled the FreeIPA client on a machine. Kinit works as expected, but sssd authentication fails with the following error in /var/log/sssd/.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;[krb5_child[29691]] [get_and_save_tgt] (0x0020): [RID#6] 1725: [-1765328353][Decrypt integrity check failed]&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The fix here is to wipe out all the caches. &amp;lt;code&amp;gt;sss_cache -E&amp;lt;/code&amp;gt; isn&#039;t sufficient. You have to stop sssd and delete all the databases:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop sssd&lt;br /&gt;
# rm -rf /var/lib/sss/db/*&lt;br /&gt;
# systemctl start sssd&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== File permission issues ===&lt;br /&gt;
I ran into issues starting FreeIPA in a Docker container. Symptoms include the following issues in the subsections below.&lt;br /&gt;
&lt;br /&gt;
==== Bind / named doesn&#039;t start: ====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ipa named-pkcs11[5407]: LDAP error: Invalid credentials: bind to LDAP server failed&lt;br /&gt;
ipa named-pkcs11[5407]: couldn&#039;t establish connection in LDAP connection pool: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: dynamic database &#039;ipa&#039; configuration failed: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: loading configuration: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: exiting (due to fatal error)&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Ignore this for now. You have other issues, likely.&lt;br /&gt;
&lt;br /&gt;
==== Samba doesn&#039;t start:  Error: Invalid credentials ====&lt;br /&gt;
When trying to start smb, I get the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [2023/01/24 05:17:46.170883,  0, pid=5124] ipa_sam.c:4945(bind_callback)&lt;br /&gt;
  bind_callback: cannot perform interactive SASL bind with GSSAPI. LDAP security error is 49&lt;br /&gt;
[2023/01/24 05:17:46.171155,  0, pid=5124] ../../source3/lib/smbldap.c:1059(smbldap_connect_system)&lt;br /&gt;
  failed to bind to server ldapi://%2fvar%2frun%2fslapd-HOME-STEAMR-COM.socket with dn=&amp;quot;[Anonymous bind]&amp;quot; Error: Invalid credentials&lt;br /&gt;
        (unknown)&lt;br /&gt;
[2023/01/24 05:17:46.171419,  1, pid=5124] ../../source3/lib/smbldap.c:1272(get_cached_ldap_connect)&lt;br /&gt;
  Connection to LDAP server failed for the 1 try!&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The fix was to stop FreeIPA with &amp;lt;code&amp;gt;ipactl stop&amp;lt;/code&amp;gt;, then &amp;lt;code&amp;gt;rm /run/samba/krb5cc_samba&amp;lt;/code&amp;gt;, and restarting FreeIPA with &amp;lt;code&amp;gt;ipactl start&amp;lt;/code&amp;gt;. If the error recurrs after restarting FreeIPA, then you have other issues that&#039;s preventing Samba from starting.&lt;br /&gt;
&lt;br /&gt;
==== Tomcat doesn&#039;t start: status=5/NOTINSTALLED ====&lt;br /&gt;
When trying to start Tomcat, you get a 5 exit code, as reported by systemd:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = pki-tomcatd@pki-tomcat.service: Control process exited, code=exited, status=5/NOTINSTALLED&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Failed with result &#039;exit-code&#039;.&lt;br /&gt;
Failed to start PKI Tomcat Server pki-tomcat.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The exit code 5 being &#039;NOTINSTALLED&#039; is a red herring. You likely have permission issues that&#039;s preventing the user running tomcat (pkiuser) from accessing some certs or configs. Go through your Docker volumes and chown anything directory called &#039;pki&#039; to the pkiuser.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # chown -R 17:17 etc/pki etc/sysconfig/pki var/lib/pki var/lib/ipa/pki-ca&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Tomcat still doesn&#039;t start: start-post operation timed out. Terminating. ====&lt;br /&gt;
Tomcat doesn&#039;t start as it times out.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = pki-tomcatd@pki-tomcat.service: start-post operation timed out. Terminating.&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Control process exited, code=killed, status=15/TERM&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Failed with result &#039;timeout&#039;.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
There&#039;s likely still something tomcat can&#039;t write to.&lt;br /&gt;
&lt;br /&gt;
It seemed to get better when made the ownerships for /data/var/lib/pki/pki-tomcat/logs/ and /var/log/pki/pki-tomcat to pkiuser.&lt;br /&gt;
&lt;br /&gt;
To help troubleshoot, su as pkiuser and try running tomcat as per the service file and see if you get any stack traces.&lt;br /&gt;
&lt;br /&gt;
==== IPA Web GUI reports &amp;quot;Your session has expired. Please re-login&amp;quot; ====&lt;br /&gt;
Review the logs for dirsrv and see if there are any errors. &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # tail -f /var/log/dirsrv/slapd-*/access&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
For this instance, this was caused when I accidentally removed &amp;lt;code&amp;gt;/etc/dirsrv/ds.keytab.&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Configuring Samba issues ===&lt;br /&gt;
When running # ipa-adtrust-install --add-sids, you get:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-adtrust-install --add-sids&lt;br /&gt;
...&lt;br /&gt;
Configuring CIFS&lt;br /&gt;
  [1/25]: validate server hostname&lt;br /&gt;
  [error] ValueError: Host reports different name than configured: &#039;ipa&#039; versus &#039;ipa.home.steamr.com&#039;. Samba requires to have the same hostname or Kerberos principal &#039;cifs/ipa.home.steamr.com&#039; will not be found in Samba keytab.&lt;br /&gt;
Unexpected error - see /var/log/ipaserver-adtrust-install.log for details:&lt;br /&gt;
ValueError: Host reports different name than configured: &#039;ipa&#039; versus &#039;ipa.home.steamr.com&#039;. Samba requires to have the same hostname or Kerberos principal &#039;cifs/ipa.home.steamr.com&#039; will not be found in Samba keytab.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Ensure that your hostname is the full FQDN.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # hostname&lt;br /&gt;
ipa&lt;br /&gt;
# hostname -f&lt;br /&gt;
ipa.home.steamr.com&lt;br /&gt;
&lt;br /&gt;
## You need to fix the hostname so that this is what you get:&lt;br /&gt;
# hostname ipa.home.steamr.com&lt;br /&gt;
# hostname&lt;br /&gt;
ipa.home.steamr.com&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Once the hostname is fixed, try the command again.&lt;br /&gt;
&lt;br /&gt;
=== DNS is missing A/AAAA entries for hosts ===&lt;br /&gt;
When trying to add a new host, the DNS silently gets ignored. Possible issues:&lt;br /&gt;
&lt;br /&gt;
* I think this is related to this error when running ipa-client-install: &amp;lt;code&amp;gt;Could not update DNS SSHFP records.&amp;lt;/code&amp;gt;. &lt;br /&gt;
* Possibly bad DNS entries during install was detected and it skipped the DNS tasks? {{Highlight&lt;br /&gt;
| code = Hostname (fc37.home.steamr.com) does not have A/AAAA record.&lt;br /&gt;
Failed to update DNS records.&lt;br /&gt;
Missing A/AAAA record(s) for host fc37.home.steamr.com: 10.1.2.32.&lt;br /&gt;
Incorrect reverse record(s):&lt;br /&gt;
10.1.2.32 is pointing to fc35.home.steamr.com. instead of fc37.home.steamr.com.&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Without this DNS entry set, other issues will crop up later on:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@ipa /]# ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
ipa: ERROR: Host &#039;dnas.home.steamr.com&#039; does not have corresponding DNS A/AAAA record&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The quick work-around would be to add the DNS entry manually and try again.&lt;br /&gt;
&lt;br /&gt;
=== FreeIPA doesn&#039;t start under Docker: Failed to allocate manager object ===&lt;br /&gt;
When trying to run FreeIPA under Docker, you get the following message almost immediately on startup:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = Detected virtualization docker.&lt;br /&gt;
Detected architecture x86-64.&lt;br /&gt;
Failed to create /init.scope control group: Read-only file system&lt;br /&gt;
Failed to allocate manager object: Read-only file system&lt;br /&gt;
[!!!!!!] Failed to allocate manager object.&lt;br /&gt;
Exiting PID 1...&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
You&#039;re likely running this under a Docker host that has cgroup v2 enabled. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Possible workaround&#039;&#039;&#039; (though I couldn&#039;t get it to work before I reverted to Rocky Linux 8, but I realized after downgrading FreeIPA takes a ton of time before it appears to be functional): You will have to disable this by adding &amp;lt;code&amp;gt;systemd.unified_cgroup_hierarchy=0&amp;lt;/code&amp;gt; as a kernel arguments to &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt;: . Rebuild grub configs &amp;lt;code&amp;gt;grub2-mkconfig -o /boot/grub2/grub.cfg&amp;lt;/code&amp;gt; and reboot. Bring the container up as usual.&lt;br /&gt;
&lt;br /&gt;
Alternatively, use Podman (except that I can&#039;t because I use docker-compose for all my setups).&lt;br /&gt;
&lt;br /&gt;
=== Failed to authenticate to CA REST API ===&lt;br /&gt;
After suffering a brief power outage, my FreeIPA stack stopped working. A contributing factor might have been my auto-update mechanism which pulled in the most recent version of the freeipa/freeipa:rocky-9 container image.  Looking at the container logs, I see that the container beings to shutdown after the upgrade command fails. Looking at the &amp;lt;code&amp;gt;/var/log/ipaupgrade.log&amp;lt;/code&amp;gt; log file, I see the following error:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2025-06-10T03:54:37Z DEBUG The ipa-server-upgrade command failed, exception: RemoteRetrieveError: Failed to authenticate to CA REST API&lt;br /&gt;
2025-06-10T03:54:37Z ERROR Unexpected error - see /var/log/ipaupgrade.log for details:&lt;br /&gt;
RemoteRetrieveError: Failed to authenticate to CA REST API&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
My first step was to try downgrading. The second oldest container image was FreeIPA 4.11 and attempting to downgrade to this version didn&#039;t work as the data I had already had migrated to 4.12.&lt;br /&gt;
&lt;br /&gt;
Some searching turned up a [https://access.redhat.com/solutions/7122683 Red Hat knowledgebase article] which states that this is a bug with ipa-server 4.12.2-14. The fix is to change the following files:&lt;br /&gt;
&lt;br /&gt;
* Add &amp;lt;code&amp;gt;/etc/pki/pki-tomcat/Catalina/localhost/rewrite.config&amp;lt;/code&amp;gt; (copy it from &amp;lt;code&amp;gt;/usr/share/pki/server/conf/Catalina/localhost/rewrite.config&amp;lt;/code&amp;gt;)&lt;br /&gt;
* Edit &amp;lt;code&amp;gt;/etc/pki/pki-tomcat/server.xml&amp;lt;/code&amp;gt; to include before the closing &amp;lt;code&amp;gt;&amp;lt;/Host&amp;gt;&amp;lt;/code&amp;gt; tag near the bottom of the file: {{Highlight&lt;br /&gt;
| code = &amp;lt;Valve className=&amp;quot;org.apache.catalina.valves.rewrite.RewriteValve&amp;quot;/&amp;gt;&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Because both files are in &amp;lt;code&amp;gt;/etc&amp;lt;/code&amp;gt;, these two files should be in the data volume that&#039;s mounted into the FreeIPA container. Edit both files from the data volume and then try restarting the container. It should start properly.&lt;br /&gt;
&lt;br /&gt;
Enterprise software? FreeIPA feels like it was put together with duct tape.&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* https://bgstack15.wordpress.com/2017/05/10/samba-share-with-freeipa-auth/ - Older guide walking through how to set up FreeIPA and Samba&lt;br /&gt;
* https://blog.cubieserver.de/2018/synology-nas-samba-nfs-and-kerberos-with-freeipa-ldap/&lt;br /&gt;
* https://www.freeipa.org/page/Howto/Integrating_a_Samba_File_Server_With_IPA - Samba integration using kerberos (not ipasam)&lt;br /&gt;
* https://freeipa-users.redhat.narkive.com/ez2uKpFS/authenticate-samba-3-or-4-with-freeipa&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:Networking]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7727</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7727"/>
		<updated>2025-12-29T04:06:57Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Running Valetudo ==&lt;br /&gt;
The Dreame L40 Ultra robot needs the Dreame App to function, which requires the robot to have a constant Internet connection to Dreame&#039;s servers. If this doesn&#039;t sound appealing, you can root the robot and install Valetudo on it. Valetudo serves a web interface right from the robot itself and offers most of the functionality of the Dreame App. It makes the necessary calls to the underlying firmware so that you do not need to rely on Dreame&#039;s cloud services for the robot to function, thereby making the robot completely self-sufficient on your home network.&lt;br /&gt;
&lt;br /&gt;
Be aware of the downsides with Valetudo:&lt;br /&gt;
* Controlling it from outside your home network or sharing it with multiple people requires more work (a reverse proxy or tailscale should do the trick). If you&#039;re using HomeAssistant, this might not be a huge issue as there&#039;s a HA integration with Valetudo.&lt;br /&gt;
* Some of the more niche Dreame features are not available (CleanGenius is missing, so things like auto-recleaning or auto-rewashing won&#039;t work, and there is no way to do the washboard base cleaning in Valetudo either)&lt;br /&gt;
* No multi-level / multi-map support (you can sort of hack this by replacing the map directory when you switch the robot)&lt;br /&gt;
* No camera view (but you can install go2rtc in addition to Valetudo and watch the camera stream that way)&lt;br /&gt;
* Map view is good, but there&#039;s no way to hide/delete rooms.&lt;br /&gt;
* Because you&#039;ve rooted the device and divorced it from the cloud, there won&#039;t be any more firmware / security updates or new (anti-)features.&lt;br /&gt;
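&lt;br /&gt;
For the remote access point above, a minimal reverse proxy in front of Valetudo is usually enough. A hedged nginx sketch, where the hostname, certificate paths, and the robot&#039;s LAN IP are all assumptions (Valetudo serves plain HTTP on the robot):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = server {&lt;br /&gt;
    listen 443 ssl;&lt;br /&gt;
    server_name vacuum.example.com;          # hypothetical hostname&lt;br /&gt;
&lt;br /&gt;
    ssl_certificate     /etc/nginx/certs/vacuum.crt;&lt;br /&gt;
    ssl_certificate_key /etc/nginx/certs/vacuum.key;&lt;br /&gt;
&lt;br /&gt;
    location / {&lt;br /&gt;
        proxy_pass http://192.168.1.50;      # robot&#039;s LAN IP (assumption)&lt;br /&gt;
        proxy_set_header Host $host;&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
| lang = nginx&lt;br /&gt;
}}&lt;br /&gt;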
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug headers. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html) and ideally require the Dreame breakout PCB, because you will need to use the FEL / fastboot installation method for this model. The UART shell method is not available for this model due to secure boot. The installation instructions are somewhat scattered across multiple pages and websites, so I am documenting everything here for this specific model so it&#039;s all in one place.&lt;br /&gt;
&lt;br /&gt;
{{Warning|Trusting random people on the internet|Pretty much everything below requires trusting that Dennis Giese (the person behind rooting the robot) isn&#039;t doing something malicious.&lt;br /&gt;
&lt;br /&gt;
The FEL boot images that we&#039;ll be running are entirely generated by the dustbuilder service that Dennis provides, and it isn&#039;t completely open source at the time of writing.&lt;br /&gt;
&lt;br /&gt;
Once you gain root access, you can look around for anything suspicious. At the time of writing, after changing the NTP server to one hosted locally, my robot was absolutely network-silent.&lt;br /&gt;
&lt;br /&gt;
But then again... you did get a Chinese robot vacuum cleaner that requires constant Internet access...}}&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3v USB serial adapter and some dupont wires (optional, but is nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module whose provenance I don&#039;t really know&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows running the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by pressing the reset button next to the wifi light for 10 seconds (it&#039;s next to the dust compartment). Once it comes back up, press and hold the power button until the vacuum is off&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a USB micro cable from your computer to the breakout port&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html), enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and make sure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me to generate a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them. Keep the bin files somewhere safe; they should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot again. We do this because there is a 160 second watchdog that reboots the robot and we want as much time as we can get to avoid having it reboot on us before we&#039;re done.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there)&lt;br /&gt;
# Backup all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot and once the robot comes back up, you should see the Valetudo web interface on the robot at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
&lt;br /&gt;
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be viewed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires installing tihmstar&#039;s vacuumstreamer repo and the official go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
{{Warning|Trusting random people on the internet... part 2|The vacuumstreamer repo has a video_monitor binary which seems to have come from another AVA powered vacuum cleaner. The source of this binary isn&#039;t clear. You&#039;re running someone&#039;s binary blob at this point.}}{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download vacuumstreamer packages and the go2rtc binary and copy it to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;&amp;lt;nowiki&amp;gt;}}&amp;lt;/nowiki&amp;gt;&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;br /&gt;
&lt;br /&gt;
=== Multiple maps (for different floors) ===&lt;br /&gt;
Because Valetudo only supports a single map, you will have to save/restore the map data if you want to use multiple maps. There is a project that already does this via MQTT called Maploader (https://github.com/pkoehlers/maploader). It uses the MQTT settings from the Valetudo config file.&lt;br /&gt;
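&lt;br /&gt;
Because Maploader listens on MQTT topics, you can also switch maps directly without Home Assistant. A hedged example using the mosquitto clients, assuming your broker is reachable at &amp;lt;code&amp;gt;mqtt.home&amp;lt;/code&amp;gt; (an assumption) and the topic prefix matches your Valetudo identifier:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## request a switch to the &#039;second_floor&#039; map (broker host is an assumption)&lt;br /&gt;
computer# mosquitto_pub -h mqtt.home -t valetudo/DreameL40Ultra/maploader/map/set -m second_floor&lt;br /&gt;
&lt;br /&gt;
## watch Maploader&#039;s status while it saves/loads the map&lt;br /&gt;
computer# mosquitto_sub -h mqtt.home -t valetudo/DreameL40Ultra/maploader/status&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;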
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
Install Maploader on the vacuum cleaner (see the project&#039;s README).&lt;br /&gt;
&lt;br /&gt;
In Home Assistant&#039;s configuration.yaml file, add the following YAML config. I added a &amp;lt;code&amp;gt;unique_id&amp;lt;/code&amp;gt; so that these entities can be added to a dashboard.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = mqtt:&lt;br /&gt;
  sensor:&lt;br /&gt;
    - state_topic: valetudo/DreameL40Ultra/maploader/status&lt;br /&gt;
      unique_id: &amp;quot;vacuum_maploader_status&amp;quot;&lt;br /&gt;
      name: &amp;quot;vacuum_maploader_status&amp;quot;&lt;br /&gt;
  select:&lt;br /&gt;
    - command_topic: valetudo/DreameL40Ultra/maploader/map/set&lt;br /&gt;
      state_topic: valetudo/DreameL40Ultra/maploader/map&lt;br /&gt;
      unique_id: &amp;quot;vacuum_maploader_map&amp;quot;&lt;br /&gt;
      name: &amp;quot;vacuum_maploader_map&amp;quot;&lt;br /&gt;
      options:&lt;br /&gt;
        - main&lt;br /&gt;
        - second_floor&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Restart Home Assistant and the vacuum cleaner so that Maploader is running. In Home Assistant, find the sensors by searching for &#039;maploader&#039; and then add them to your dashboard.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Benchmark_for_Intel_E5-1660_v3&amp;diff=7726</id>
		<title>Benchmark for Intel E5-1660 v3</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Benchmark_for_Intel_E5-1660_v3&amp;diff=7726"/>
		<updated>2025-12-24T23:13:02Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is the CPU on the X99 board that I got off eBay in early 2025.&lt;br /&gt;
&lt;br /&gt;
{{Benchmark&lt;br /&gt;
 | hardware =   X99-E WS/USB 3.1&lt;br /&gt;
 | cpu =        Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz&lt;br /&gt;
 | memory =     128 GB DDR4 2133&lt;br /&gt;
 | disk =       128 GB SanDisk SD7SB3Q-128G-1006, SATA III&lt;br /&gt;
 | os =         Debian GNU/Linux 12&lt;br /&gt;
 | type =       server&lt;br /&gt;
 | score =      1471.6 / 8075.3&lt;br /&gt;
 | raw = &lt;br /&gt;
========================================================================&lt;br /&gt;
   BYTE UNIX Benchmarks (Version 5.1.3)&lt;br /&gt;
&lt;br /&gt;
   System: pve: GNU/Linux&lt;br /&gt;
   OS: GNU/Linux -- 6.8.12-15-pve -- #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-15 (2025-09-12T11:02Z)&lt;br /&gt;
   Machine: x86_64 (unknown)&lt;br /&gt;
   Language: en_US.utf8 (charmap=&amp;quot;UTF-8&amp;quot;, collate=&amp;quot;UTF-8&amp;quot;)&lt;br /&gt;
   CPU 0: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 1: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 2: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 3: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 4: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 5: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 6: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 7: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 8: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 9: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 10: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 11: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 12: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 13: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 14: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 15: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   14:28:32 up 30 days,  5:17,  2 users,  load average: 3.47, 3.91, 3.85; runlevel 2025-11-24&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------&lt;br /&gt;
Benchmark Run: Wed Dec 24 2025 14:28:32 - 14:56:32&lt;br /&gt;
16 CPUs in system; running 1 parallel copy of tests&lt;br /&gt;
&lt;br /&gt;
Dhrystone 2 using register variables       48113206.2 lps   (10.0 s, 7 samples)&lt;br /&gt;
Double-Precision Whetstone                     7561.5 MWIPS (10.0 s, 7 samples)&lt;br /&gt;
Execl Throughput                               3508.8 lps   (30.0 s, 2 samples)&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks        780264.4 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks          202458.5 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks       2477067.6 KBps  (30.0 s, 2 samples)&lt;br /&gt;
Pipe Throughput                             1013971.6 lps   (10.0 s, 7 samples)&lt;br /&gt;
Pipe-based Context Switching                 150952.0 lps   (10.0 s, 7 samples)&lt;br /&gt;
Process Creation                               7841.4 lps   (30.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (1 concurrent)                  13070.0 lpm   (60.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (8 concurrent)                   5925.7 lpm   (60.0 s, 2 samples)&lt;br /&gt;
System Call Overhead                         557537.5 lps   (10.0 s, 7 samples)&lt;br /&gt;
&lt;br /&gt;
System Benchmarks Index Values               BASELINE       RESULT    INDEX&lt;br /&gt;
Dhrystone 2 using register variables         116700.0   48113206.2   4122.8&lt;br /&gt;
Double-Precision Whetstone                       55.0       7561.5   1374.8&lt;br /&gt;
Execl Throughput                                 43.0       3508.8    816.0&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks          3960.0     780264.4   1970.4&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks            1655.0     202458.5   1223.3&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks          5800.0    2477067.6   4270.8&lt;br /&gt;
Pipe Throughput                               12440.0    1013971.6    815.1&lt;br /&gt;
Pipe-based Context Switching                   4000.0     150952.0    377.4&lt;br /&gt;
Process Creation                                126.0       7841.4    622.3&lt;br /&gt;
Shell Scripts (1 concurrent)                     42.4      13070.0   3082.6&lt;br /&gt;
Shell Scripts (8 concurrent)                      6.0       5925.7   9876.2&lt;br /&gt;
System Call Overhead                          15000.0     557537.5    371.7&lt;br /&gt;
                                                                   ========&lt;br /&gt;
System Benchmarks Index Score                                        1471.6&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------&lt;br /&gt;
Benchmark Run: Wed Dec 24 2025 14:56:32 - 15:24:35&lt;br /&gt;
16 CPUs in system; running 16 parallel copies of tests&lt;br /&gt;
&lt;br /&gt;
Dhrystone 2 using register variables      503918570.5 lps   (10.0 s, 7 samples)&lt;br /&gt;
Double-Precision Whetstone                   105274.5 MWIPS (10.0 s, 7 samples)&lt;br /&gt;
Execl Throughput                              31544.8 lps   (30.0 s, 2 samples)&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks       1496137.5 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks          392970.6 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks       4403975.7 KBps  (30.0 s, 2 samples)&lt;br /&gt;
Pipe Throughput                             9061747.9 lps   (10.0 s, 7 samples)&lt;br /&gt;
Pipe-based Context Switching                1607736.7 lps   (10.0 s, 7 samples)&lt;br /&gt;
Process Creation                              74350.4 lps   (30.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (1 concurrent)                  82013.7 lpm   (60.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (8 concurrent)                  10784.8 lpm   (60.0 s, 2 samples)&lt;br /&gt;
System Call Overhead                        4649424.3 lps   (10.0 s, 7 samples)&lt;br /&gt;
&lt;br /&gt;
System Benchmarks Index Values               BASELINE       RESULT    INDEX&lt;br /&gt;
Dhrystone 2 using register variables         116700.0  503918570.5  43180.7&lt;br /&gt;
Double-Precision Whetstone                       55.0     105274.5  19140.8&lt;br /&gt;
Execl Throughput                                 43.0      31544.8   7336.0&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks          3960.0    1496137.5   3778.1&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks            1655.0     392970.6   2374.4&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks          5800.0    4403975.7   7593.1&lt;br /&gt;
Pipe Throughput                               12440.0    9061747.9   7284.4&lt;br /&gt;
Pipe-based Context Switching                   4000.0    1607736.7   4019.3&lt;br /&gt;
Process Creation                                126.0      74350.4   5900.8&lt;br /&gt;
Shell Scripts (1 concurrent)                     42.4      82013.7  19342.8&lt;br /&gt;
Shell Scripts (8 concurrent)                      6.0      10784.8  17974.7&lt;br /&gt;
System Call Overhead                          15000.0    4649424.3   3099.6&lt;br /&gt;
                                                                   ========&lt;br /&gt;
System Benchmarks Index Score                                        8075.3&lt;br /&gt;
&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{Navbox Benchmarks}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Benchmark_for_Intel_E5-1660_v3&amp;diff=7725</id>
		<title>Benchmark for Intel E5-1660 v3</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Benchmark_for_Intel_E5-1660_v3&amp;diff=7725"/>
		<updated>2025-12-24T21:02:12Z</updated>

		<summary type="html">&lt;p&gt;Leo: Created page with &amp;quot;This is the CPU on the x99 board that I got off EBay in early 2025.  {{Benchmark  | hardware =   X99-E WS/USB 3.1  | cpu =        Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz  CPU @ 3.0GHz  | memory =     128 GB DDR4 2133  | disk =       128 GB SanDisk SD7SB3Q-128G-1006, SATA III  | os =         Debian GNU/Linux 12  | type =       server  | score =      1397.1 / 7161.3  | raw =  ========================================================================    BYTE UNIX Benchmarks...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is the CPU on the X99 board that I got off eBay in early 2025.&lt;br /&gt;
&lt;br /&gt;
{{Benchmark&lt;br /&gt;
 | hardware =   X99-E WS/USB 3.1&lt;br /&gt;
 | cpu =        Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz&lt;br /&gt;
 | memory =     128 GB DDR4 2133&lt;br /&gt;
 | disk =       128 GB SanDisk SD7SB3Q-128G-1006, SATA III&lt;br /&gt;
 | os =         Debian GNU/Linux 12&lt;br /&gt;
 | type =       server&lt;br /&gt;
 | score =      1397.1 / 7161.3&lt;br /&gt;
 | raw = &lt;br /&gt;
========================================================================&lt;br /&gt;
   BYTE UNIX Benchmarks (Version 5.1.3)&lt;br /&gt;
&lt;br /&gt;
   System: pve: GNU/Linux&lt;br /&gt;
   OS: GNU/Linux -- 6.8.12-15-pve -- #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-15 (2025-09-12T11:02Z)&lt;br /&gt;
   Machine: x86_64 (unknown)&lt;br /&gt;
   Language: en_US.utf8 (charmap=&amp;quot;UTF-8&amp;quot;, collate=&amp;quot;UTF-8&amp;quot;)&lt;br /&gt;
   CPU 0: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 1: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 2: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 3: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 4: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 5: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 6: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 7: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 8: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 9: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 10: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 11: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 12: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 13: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 14: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   CPU 15: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz (6010.2 bogomips)&lt;br /&gt;
          Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization&lt;br /&gt;
   18:15:54 up 28 days,  9:04,  3 users,  load average: 2.33, 2.03, 2.09; runlevel 2025-11-24&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------&lt;br /&gt;
Benchmark Run: Mon Dec 22 2025 18:15:54 - 18:54:46&lt;br /&gt;
16 CPUs in system; running 1 parallel copy of tests&lt;br /&gt;
&lt;br /&gt;
Dhrystone 2 using register variables       46788047.8 lps   (10.0 s, 7 samples)&lt;br /&gt;
Double-Precision Whetstone                     7496.2 MWIPS (10.0 s, 7 samples)&lt;br /&gt;
Execl Throughput                               3359.7 lps   (29.3 s, 2 samples)&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks        753698.8 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks          197234.5 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks       2184722.3 KBps  (30.0 s, 2 samples)&lt;br /&gt;
Pipe Throughput                              999845.1 lps   (10.0 s, 7 samples)&lt;br /&gt;
Pipe-based Context Switching                 139904.5 lps   (10.0 s, 7 samples)&lt;br /&gt;
Process Creation                               7180.2 lps   (30.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (1 concurrent)                  12207.0 lpm   (60.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (8 concurrent)                   5380.9 lpm   (60.0 s, 2 samples)&lt;br /&gt;
System Call Overhead                         549632.4 lps   (10.0 s, 7 samples)&lt;br /&gt;
&lt;br /&gt;
System Benchmarks Index Values               BASELINE       RESULT    INDEX&lt;br /&gt;
Dhrystone 2 using register variables         116700.0   46788047.8   4009.3&lt;br /&gt;
Double-Precision Whetstone                       55.0       7496.2   1362.9&lt;br /&gt;
Execl Throughput                                 43.0       3359.7    781.3&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks          3960.0     753698.8   1903.3&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks            1655.0     197234.5   1191.7&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks          5800.0    2184722.3   3766.8&lt;br /&gt;
Pipe Throughput                               12440.0     999845.1    803.7&lt;br /&gt;
Pipe-based Context Switching                   4000.0     139904.5    349.8&lt;br /&gt;
Process Creation                                126.0       7180.2    569.9&lt;br /&gt;
Shell Scripts (1 concurrent)                     42.4      12207.0   2879.0&lt;br /&gt;
Shell Scripts (8 concurrent)                      6.0       5380.9   8968.1&lt;br /&gt;
System Call Overhead                          15000.0     549632.4    366.4&lt;br /&gt;
                                                                   ========&lt;br /&gt;
System Benchmarks Index Score                                        1397.1&lt;br /&gt;
&lt;br /&gt;
------------------------------------------------------------------------&lt;br /&gt;
Benchmark Run: Mon Dec 22 2025 18:54:46 - 19:34:27&lt;br /&gt;
16 CPUs in system; running 16 parallel copies of tests&lt;br /&gt;
&lt;br /&gt;
Dhrystone 2 using register variables      446158122.0 lps   (10.0 s, 7 samples)&lt;br /&gt;
Double-Precision Whetstone                    97049.0 MWIPS (10.0 s, 7 samples)&lt;br /&gt;
Execl Throughput                              27936.0 lps   (29.2 s, 2 samples)&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks       1364285.5 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks          368175.9 KBps  (30.0 s, 2 samples)&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks       3847624.4 KBps  (30.0 s, 2 samples)&lt;br /&gt;
Pipe Throughput                             8199713.7 lps   (10.0 s, 7 samples)&lt;br /&gt;
Pipe-based Context Switching                1435331.2 lps   (10.0 s, 7 samples)&lt;br /&gt;
Process Creation                              61806.6 lps   (30.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (1 concurrent)                  66920.1 lpm   (60.0 s, 2 samples)&lt;br /&gt;
Shell Scripts (8 concurrent)                   9336.9 lpm   (60.0 s, 2 samples)&lt;br /&gt;
System Call Overhead                        4298119.8 lps   (10.0 s, 7 samples)&lt;br /&gt;
&lt;br /&gt;
System Benchmarks Index Values               BASELINE       RESULT    INDEX&lt;br /&gt;
Dhrystone 2 using register variables         116700.0  446158122.0  38231.2&lt;br /&gt;
Double-Precision Whetstone                       55.0      97049.0  17645.3&lt;br /&gt;
Execl Throughput                                 43.0      27936.0   6496.8&lt;br /&gt;
File Copy 1024 bufsize 2000 maxblocks          3960.0    1364285.5   3445.2&lt;br /&gt;
File Copy 256 bufsize 500 maxblocks            1655.0     368175.9   2224.6&lt;br /&gt;
File Copy 4096 bufsize 8000 maxblocks          5800.0    3847624.4   6633.8&lt;br /&gt;
Pipe Throughput                               12440.0    8199713.7   6591.4&lt;br /&gt;
Pipe-based Context Switching                   4000.0    1435331.2   3588.3&lt;br /&gt;
Process Creation                                126.0      61806.6   4905.3&lt;br /&gt;
Shell Scripts (1 concurrent)                     42.4      66920.1  15783.1&lt;br /&gt;
Shell Scripts (8 concurrent)                      6.0       9336.9  15561.4&lt;br /&gt;
System Call Overhead                          15000.0    4298119.8   2865.4&lt;br /&gt;
                                                                   ========&lt;br /&gt;
System Benchmarks Index Score                                        7161.3&lt;br /&gt;
&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{Navbox Benchmarks}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7724</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7724"/>
		<updated>2025-12-24T00:11:31Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Running Valetudo ==&lt;br /&gt;
The Dreame L40 Ultra robot needs the Dreame App to function, which requires the robot to have a constant internet connection to Dreame&#039;s servers. If this doesn&#039;t sound appealing, you can root the robot and then install Valetudo. Valetudo serves a web interface right from the robot itself and offers most of the functionality of the Dreame App. It makes the necessary calls to the underlying firmware directly, so you do not need to rely on Dreame&#039;s cloud services for the robot to function, thereby making the robot completely self-sufficient on your home network.&lt;br /&gt;
&lt;br /&gt;
Be aware of the downsides with Valetudo:&lt;br /&gt;
* Controlling it from outside your home network or sharing it with multiple people requires more work (a reverse proxy or Tailscale should do the trick). If you&#039;re using Home Assistant, this might not be a huge issue as there&#039;s an HA integration with Valetudo.&lt;br /&gt;
* Some of the more niche Dreame features are not available (there is no CleanGenius, so things like auto-recleaning or auto-rewashing won&#039;t work, and there is no way to run the washboard base cleaning in Valetudo either)&lt;br /&gt;
* No multi-level / multi-map support (you can sort of hack this by replacing the map directory when you switch the robot)&lt;br /&gt;
* No camera view (but you can install go2rtc alongside Valetudo and watch the camera stream that way)&lt;br /&gt;
* Map view is good, but there&#039;s no way to hide/delete rooms.&lt;br /&gt;
* Because you&#039;ve rooted the device and divorced it from the cloud, there won&#039;t be any more firmware / security updates or new (anti-)features.&lt;br /&gt;
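&lt;br /&gt;
If you do want access from outside your home network, a minimal reverse proxy in front of Valetudo is enough. This nginx sketch assumes a made-up robot address (192.168.1.50) and hostname (vacuum.example.com); substitute your own, and add TLS and authentication before exposing it anywhere:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = server {&lt;br /&gt;
    listen 80;&lt;br /&gt;
    server_name vacuum.example.com;   ## placeholder hostname&lt;br /&gt;
&lt;br /&gt;
    location / {&lt;br /&gt;
        ## forward everything to the Valetudo web UI on the robot&lt;br /&gt;
        proxy_pass http://192.168.1.50:80;   ## placeholder robot IP&lt;br /&gt;
        proxy_set_header Host $host;&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
| lang = nginx&lt;br /&gt;
}}&lt;br /&gt;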
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug headers. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html), and the process ideally requires the Dreame breakout PCB because you will need to use the FEL / fastboot installation method for this model. The UART shell method is not available for this model due to secure boot. The installation instructions are scattered across multiple pages and websites, so I am documenting everything for this specific model here in one place.&lt;br /&gt;
&lt;br /&gt;
{{Warning|Trusting random people on the internet|Pretty much everything below requires trusting that Dennis Giese (the person behind rooting the robot) isn&#039;t doing something malicious.&lt;br /&gt;
&lt;br /&gt;
The FEL boot images that we&#039;ll be running are generated entirely by the dustbuilder service that Dennis provides, and it isn&#039;t completely open source at the time of writing.&lt;br /&gt;
&lt;br /&gt;
Once you gain root access, you can look around for anything suspicious. At the time of writing, I was able to get a robot that, after changing the NTP server to one hosted locally, was absolutely network-silent.&lt;br /&gt;
&lt;br /&gt;
But then again... you did get a Chinese robot vacuum cleaner that requires constant Internet access...}}&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3V USB serial adapter and some Dupont wires (optional, but nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module (and I don&#039;t really know where its source comes from)&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows us to run the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by pressing the reset button next to the wifi light for 10 seconds (it&#039;s next to the dust compartment). Once it comes back up, press and hold the power button until the vacuum is off.&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a micro USB cable from your computer into the breakout port.&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and ensure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them. Keep the bin files somewhere safe. They should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot. We do this because there is a 160 second watchdog that reboots the robot, and we want as much time as we can get so it doesn&#039;t reboot on us before we&#039;re done.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there).&lt;br /&gt;
# Back up all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot and once the robot comes back up, you should see the Valetudo web interface on the robot at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
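&lt;br /&gt;
Before moving on, it&#039;s worth confirming on your computer that the calibration backup from the earlier step is actually a readable archive (a quick sanity check; the filename matches the backup commands above):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = computer# gzip -t calibration.tar.gz &amp;amp;&amp;amp; echo archive OK   ## test gzip integrity&lt;br /&gt;
computer# tar -tzf calibration.tar.gz   ## list the archived files; errors here mean a bad copy&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;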
&lt;br /&gt;
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be streamed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires installing tihmstar&#039;s vacuumstreamer repo and the official go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
{{Warning|Trusting random people on the internet... part 2|The vacuumstreamer repo has a video_monitor binary which seems to have come from another AVA-powered vacuum cleaner. The source of this binary isn&#039;t clear. You&#039;re running someone&#039;s binary blob at this point.}}{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download vacuumstreamer packages and the go2rtc binary and copy it to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;&amp;lt;nowiki&amp;gt;}}&amp;lt;/nowiki&amp;gt;&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
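&lt;br /&gt;
The inline &amp;lt;code&amp;gt;-c&amp;lt;/code&amp;gt; config passed to go2rtc above is equivalent to this file-based config, if you prefer keeping a go2rtc.yaml next to the binary (the stream name tcp_magic is arbitrary):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = streams:&lt;br /&gt;
  tcp_magic: tcp://127.0.0.1:6969&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;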
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7723</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7723"/>
		<updated>2025-12-24T00:00:33Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Running Valetudo ==&lt;br /&gt;
The Dreame L40 Ultra robot needs the Dreame App to function, which requires the robot to have a constant internet connection to Dreame&#039;s servers. If this doesn&#039;t sound appealing, you can install Valetudo on this robot vacuum, which presents a web-based interface served from the robot itself with most of the functionality of the Dreame app. Part of the rooting process can also divorce your robot from all internet/cloud-facing services so it is completely self-sufficient on your home network.&lt;br /&gt;
&lt;br /&gt;
Be aware of the downsides with Valetudo:&lt;br /&gt;
&lt;br /&gt;
* No native iOS client, but you can always control it via a web browser (even from a desktop, which is really a plus)&lt;br /&gt;
* Controlling it from outside your home network or sharing it with multiple people requires more work (a reverse proxy or Tailscale should do the trick). If you&#039;re using Home Assistant, this might not be a huge issue as there&#039;s an HA integration with Valetudo.&lt;br /&gt;
* Some of the more niche Dreame features are not available (there is no CleanGenius, so things like auto-recleaning or auto-rewashing won&#039;t work, and there is no way to run the washboard base cleaning in Valetudo either)&lt;br /&gt;
* No multi-level / multi-map support (you can sort of hack this by replacing the map directory when you switch the robot)&lt;br /&gt;
* No camera view (but you can install go2rtc alongside Valetudo and watch the camera stream that way)&lt;br /&gt;
* Map view is good, but there&#039;s no way to hide/delete rooms.&lt;br /&gt;
&lt;br /&gt;
The upside is that it is completely self-sufficient and doesn&#039;t require any cloud services to function.&lt;br /&gt;
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug headers. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html), and the process ideally requires the Dreame breakout PCB because you will need to use the FEL / fastboot installation method for this model. The UART shell method is not available for this model due to secure boot. The installation instructions are scattered across multiple pages and websites, so I am documenting everything for this specific model here in one place.&lt;br /&gt;
&lt;br /&gt;
{{Warning|Trusting random people on the internet|Pretty much everything below requires trusting that Dennis Giese (the person behind rooting the robot) isn&#039;t doing something malicious.&lt;br /&gt;
&lt;br /&gt;
The FEL boot images that we&#039;ll be running are generated entirely by the dustbuilder service that Dennis provides, and it isn&#039;t completely open source at the time of writing.&lt;br /&gt;
&lt;br /&gt;
Once you gain root access, you can look around for anything suspicious. At the time of writing, I was able to get a robot that, after changing the NTP server to one hosted locally, was absolutely network-silent.&lt;br /&gt;
&lt;br /&gt;
But then again... you did get a Chinese robot vacuum cleaner that requires constant Internet access...}}&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3V USB serial adapter and some Dupont wires (optional, but nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module (and I don&#039;t really know where its source comes from)&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows us to run the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by pressing the reset button next to the wifi light for 10 seconds (it&#039;s next to the dust compartment). Once it comes back up, press and hold the power button until the vacuum is off.&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a micro USB cable from your computer into the breakout port.&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and ensure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them. Keep the bin files somewhere safe. They should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot. We do this because there is a 160 second watchdog that reboots the robot, and we want as much time as we can get so it doesn&#039;t reboot on us before we&#039;re done.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there).&lt;br /&gt;
# Back up all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot and once the robot comes back up, you should see the Valetudo web interface on the robot at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
&lt;br /&gt;
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be streamed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires installing tihmstar&#039;s vacuumstreamer repo and the official go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
{{Warning|Trusting random people on the internet... part 2|The vacuumstreamer repo has a video_monitor binary which seems to have come from another AVA-powered vacuum cleaner. The source of this binary isn&#039;t clear. You&#039;re running someone&#039;s binary blob at this point.}}{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download vacuumstreamer packages and the go2rtc binary and copy it to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;&amp;lt;nowiki&amp;gt;}}&amp;lt;/nowiki&amp;gt;&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7722</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7722"/>
		<updated>2025-12-23T23:36:43Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug headers. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html) and ideally require the Dreame breakout PCB, because you will need to use the FEL / fastboot installation method for this model. The installation instructions are scattered across multiple pages and websites, so I am documenting everything here for this specific model so it&#039;s all in one place.&lt;br /&gt;
&lt;br /&gt;
{{Warning|Trusting random people on the internet|Pretty much everything below requires trusting that Dennis Giese (the person behind rooting the robot) isn&#039;t doing something malicious.&lt;br /&gt;
&lt;br /&gt;
The FEL boot images we&#039;ll be running are entirely generated by the dustbuilder service that Dennis provides, and the service isn&#039;t completely open source at the time of writing.&lt;br /&gt;
&lt;br /&gt;
Once you gain root access, you can look around for anything suspicious. At the time of writing, I was able to get a robot that, after changing the NTP server to one hosted locally, was absolutely network-silent.&lt;br /&gt;
&lt;br /&gt;
But then again... you did get a Chinese robot vacuum cleaner that requires constant Internet access...}}&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3v USB serial adapter and some dupont wires (optional, but is nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module (and I don&#039;t know where its source comes from)&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows us to run the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by holding the reset button next to the WiFi light (near the dust compartment) for 10 seconds. Once it comes back up, press and hold the power button until the vacuum is off.&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a USB micro cable from your computer to the breakout port&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and make sure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them later. Keep the bin files somewhere safe. They should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot. We do this because a 160-second watchdog reboots the robot, and we want as much time as possible before it reboots on us mid-flash.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there)&lt;br /&gt;
# Back up all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot, and once the robot comes back up, you should see the Valetudo web interface at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
&lt;br /&gt;
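As a quick sanity check on the calibration backup from the steps above, you can list the archive contents on your computer before moving on (standard tar, nothing robot-specific): {{Highlight&lt;br /&gt;
| code = ## on your computer: confirm the tarball is readable and contains /mnt&lt;br /&gt;
computer# tar -tzf calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;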
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be streamed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires tihmstar&#039;s vacuumstreamer and the official go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
{{Warning|Trusting random people on the internet... part 2|The vacuumstreamer repo has a video_monitor binary which seems to have come from another AVA-powered vacuum cleaner. The source of this binary isn&#039;t clear; you&#039;re running someone&#039;s binary blob at this point.}}{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download the vacuumstreamer files and the go2rtc binary and copy them to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;&amp;lt;nowiki&amp;gt;}}&amp;lt;/nowiki&amp;gt;&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7721</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7721"/>
		<updated>2025-12-23T23:27:00Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug header pins. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html) and ideally require the Dreame breakout PCB, because you will need to use the FEL / fastboot installation method for this model. The installation instructions are scattered across multiple pages and websites, so I am documenting everything here for this specific model so it&#039;s all in one place.&lt;br /&gt;
&lt;br /&gt;
{{Warning|Trusting random people on the internet|Pretty much everything below requires trusting that Dennis Giese (the person behind rooting the robot) isn&#039;t doing something malicious.&lt;br /&gt;
&lt;br /&gt;
The FEL boot images we&#039;ll be running are entirely generated by the dustbuilder service that Dennis provides, and the service isn&#039;t completely open source at the time of writing.}}&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3v USB serial adapter and some dupont wires (optional, but is nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module (and I don&#039;t know where its source comes from)&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows us to run the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by holding the reset button next to the WiFi light (near the dust compartment) for 10 seconds. Once it comes back up, press and hold the power button until the vacuum is off.&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a USB micro cable from your computer to the breakout port&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and make sure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them later. Keep the bin files somewhere safe. They should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot. We do this because a 160-second watchdog reboots the robot, and we want as much time as possible before it reboots on us mid-flash.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there)&lt;br /&gt;
# Back up all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot, and once the robot comes back up, you should see the Valetudo web interface at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
&lt;br /&gt;
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be streamed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires tihmstar&#039;s vacuumstreamer and the official go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
{{Warning|Trusting random people on the internet... part 2|The vacuumstreamer repo has a video_monitor binary which seems to have come from another AVA-powered vacuum cleaner. The source of this binary isn&#039;t clear; you&#039;re running someone&#039;s binary blob at this point.}}{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download the vacuumstreamer files and the go2rtc binary and copy them to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;&amp;lt;nowiki&amp;gt;}}&amp;lt;/nowiki&amp;gt;&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7720</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7720"/>
		<updated>2025-12-23T23:16:55Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug header pins. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html) and ideally require the Dreame breakout PCB, because you will need to use the FEL / fastboot installation method for this model. The installation instructions are scattered across multiple pages and websites, so I am documenting everything here for this specific model so it&#039;s all in one place.&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3v USB serial adapter and some dupont wires (optional, but is nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian is preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one you use for anything important, because you&#039;ll need to download and install a kernel module (and I don&#039;t know where its source comes from)&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows us to run the robot without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by holding the reset button next to the WiFi light (near the dust compartment) for 10 seconds. Once it comes back up, press and hold the power button until the vacuum is off.&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a USB micro cable from your computer to the breakout port&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment) and the config value from the command above, and make sure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them later. Keep the bin files somewhere safe. They should each be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot. We do this because a 160-second watchdog reboots the robot, and we want as much time as possible before it reboots on us mid-flash.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there)&lt;br /&gt;
# Back up all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot, and once the robot comes back up, you should see the Valetudo web interface at http://192.168.5.1. Complete the robot setup using Valetudo (such as connecting it to your home&#039;s WiFi network).&lt;br /&gt;
&lt;br /&gt;
=== Camera Streaming ===&lt;br /&gt;
The camera on the vacuum cleaner can be streamed via go2rtc. Anthony Zhang&#039;s blog covers this in more detail (https://anthony-zhang.me/blog/offline-robot-vacuum/); it requires tihmstar&#039;s vacuumstreamer repo and the go2rtc binary:&lt;br /&gt;
* https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
* https://github.com/AlexxIT/go2rtc&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## on the robot, make /data/vacuumstreamer&lt;br /&gt;
robot# mkdir /data/vacuumstreamer&lt;br /&gt;
&lt;br /&gt;
## on your computer, download the vacuumstreamer files and the go2rtc binary and copy them to the robot&lt;br /&gt;
computer# git clone https://github.com/tihmstar/vacuumstreamer&lt;br /&gt;
computer# scp vacuumstreamer/vacuumstreamer.so                root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp vacuumstreamer/dist/usr/bin/video_monitor       root@vacuum:/data/vacuumstreamer/&lt;br /&gt;
computer# scp -r vacuumstreamer/dist/ava/conf/video_monitor   root@vacuum:/data/vacuumstreamer/ava_conf_video_monitor&lt;br /&gt;
&lt;br /&gt;
computer# wget https://github.com/AlexxIT/go2rtc/releases/download/v1.9.13/go2rtc_linux_arm64&lt;br /&gt;
computer# scp go2rtc_linux_arm64  root@vacuum:/data/vacuumstreamer/go2rtc&lt;br /&gt;
&lt;br /&gt;
## back on the robot, set up the streamer&lt;br /&gt;
robot# cp -r /mnt/private /data/vacuumstreamer/mnt_private_copy &lt;br /&gt;
robot# touch /data/vacuumstreamer/mnt_private_copy/certificate.bin  # workaround for missing certificate bug, see https://github.com/tihmstar/vacuumstreamer/issues/1 for details&lt;br /&gt;
robot# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; /data/_root_postboot.sh&lt;br /&gt;
&lt;br /&gt;
if [[ -f /data/vacuumstreamer/video_monitor ]]; then&lt;br /&gt;
    mount --bind /data/vacuumstreamer/ava_conf_video_monitor /ava/conf/video_monitor&lt;br /&gt;
    mount --bind /data/vacuumstreamer/mnt_private_copy /mnt/private&lt;br /&gt;
    LD_PRELOAD=/data/vacuumstreamer/vacuumstreamer.so /data/vacuumstreamer/video_monitor &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
    /data/vacuumstreamer/go2rtc -c &#039;{&amp;quot;streams&amp;quot;: {&amp;quot;tcp_magic&amp;quot;: &amp;quot;tcp://127.0.0.1:6969&amp;quot;}}&#039; &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;&lt;br /&gt;
fi&lt;br /&gt;
EOF&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reboot. You should see the video stream at http://robot:1984.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7719</id>
		<title>Dreame L40 Ultra</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Dreame_L40_Ultra&amp;diff=7719"/>
		<updated>2025-12-23T23:05:36Z</updated>

		<summary type="html">&lt;p&gt;Leo: Initial content&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Dreame L40 Ultra is a robot vacuum and mop cleaner. I purchased this off Amazon in early November 2025 for about $580 CAD.&lt;br /&gt;
&lt;br /&gt;
== Rooting and installing Valetudo ==&lt;br /&gt;
You can root and install Valetudo on the Dreame L40 Ultra without disassembling the vacuum cleaner by using the exposed debug header pins. The steps are documented on Valetudo&#039;s installation page (https://valetudo.cloud/pages/installation/dreame.html) and ideally require the Dreame breakout PCB, because you will need to use the FEL / fastboot installation method for this model. The installation instructions are scattered across multiple pages and websites, so I am documenting everything for this specific model here in one place.&lt;br /&gt;
&lt;br /&gt;
=== Prerequisites ===&lt;br /&gt;
Things you need to root and install Valetudo:&lt;br /&gt;
&lt;br /&gt;
* Dreame Breakout PCB - https://github.com/Hypfer/valetudo-dreameadapter&lt;br /&gt;
** You can order the PCB from JLCPCB and it&#039;ll come in about 2 weeks (~$7 CAD).&lt;br /&gt;
** You will need to make sure you have 2mm headers ($2 CAD), some 2.4mm female headers ($1 CAD), a DIP button, a vertical micro USB header that fits the PCB layout ($2 CAD for 10), and a USB 2.0 type A female port ($2 CAD for 20, optional for this device/installation process).&lt;br /&gt;
* A 3.3v USB serial adapter and some dupont wires (optional, but is nice to have if you want to connect to the robot via serial)&lt;br /&gt;
* A computer running Linux (Ubuntu or Debian are preferable because we need to run Hypfer&#039;s fork of the LiveSuit program, which targets those distros). Ideally, this computer shouldn&#039;t be one used for anything important, because you&#039;ll need to download and install a kernel module whose source I can&#039;t vouch for&lt;br /&gt;
** Install Livesuit (https://github.com/Hypfer/valetudo-sunxi-livesuit). Follow the instructions in the README&lt;br /&gt;
* A rooted FEL firmware from https://builder.dontvacuum.me/_dreame_r2492.html (more on this in the instructions below).&lt;br /&gt;
** You will need to use the dustbuilder tool to generate a FEL image that allows the robot to run without secure boot. The tools to do this don&#039;t seem to be fully open source, possibly because Dennis (the author) wants to delay Dreame from patching this issue.&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&lt;br /&gt;
# Pry off the front of the robot to expose the debug headers. Your Dreame Breakout PCB with the 2mm headers should fit and the top of the breakout board should be facing the LIDAR.&lt;br /&gt;
# Download https://builder.dontvacuum.me/nextgen/dust-livesuit-mr813-ddr4.img and then run LiveSuit. Set the image to dust-livesuit-mr813-ddr4.img&lt;br /&gt;
# Factory reset the vacuum cleaner by pressing the reset button next to the wifi light for 10 seconds (it&#039;s next to the dust compartment). Once it comes back up, press and hold the power button until the vacuum is off&lt;br /&gt;
# While pressing the boot sel button on the breakout PCB, press and hold the power button on the vacuum cleaner for 5 seconds until the vacuum indicator lights are flashing. You can release the boot sel button once this happens.&lt;br /&gt;
# Plug a USB micro cable from your computer to the breakout port&lt;br /&gt;
# LiveSuit will ask to format the partition. Press No.&lt;br /&gt;
# In a terminal, you should be able to run: {{Highlight&lt;br /&gt;
| code = # fastboot devices&lt;br /&gt;
# fastboot getvar dustversion&lt;br /&gt;
# fastboot getvar config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Go to the dustbuilder page (https://builder.dontvacuum.me/_dreame_r2492.html) and enter your device serial number (the QR code next to the dust compartment or the serial number under the dust compartment), the config value from the command above, and ensure you check off &#039;Create FEL image&#039;. The dustbuilder took about 20 minutes for me before it generated a FEL image I could download. While that&#039;s being generated, we can continue...&lt;br /&gt;
# Back up the boot stages in case we need them. Keep the bin files somewhere safe. They should all be approximately 400MB.{{Highlight&lt;br /&gt;
| code = # fastboot get_staged dustx100.bin&lt;br /&gt;
# fastboot oem stage1&lt;br /&gt;
# fastboot get_staged dustx101.bin&lt;br /&gt;
# fastboot oem stage2&lt;br /&gt;
# fastboot get_staged dustx102.bin&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
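#: Before continuing, it&#039;s worth verifying the staged backups on your computer (a quick check of my own, not from the official guide): {{Highlight&lt;br /&gt;
| code = computer# ls -lh dustx100.bin dustx101.bin dustx102.bin   ## each should be roughly 400MB&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;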
# Download the firmware image from the dustbuilder. You should have received a dreame.vacuum.r2492_1680_fel.zip file in the package. Extract the contents.&lt;br /&gt;
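#: For example (a sketch; the exact archive contents may differ slightly): {{Highlight&lt;br /&gt;
| code = computer# unzip dreame.vacuum.r2492_1680_fel.zip&lt;br /&gt;
computer# ls   ## expect the _dreame.vacuum.r2492_phoenixsuit.img, toc1.img, boot.img, rootfs.img, and check.txt files&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;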
# Open LiveSuit and set the image to the _dreame.vacuum.r2492_phoenixsuit.img image file.&lt;br /&gt;
# Make a note of the contents in check.txt. You need this ready for the next step.&lt;br /&gt;
# Reboot the vacuum cleaner and re-enter fastboot again. We do this because there is a 160 second watchdog that reboots the robot and we want as much time as we can get to avoid having it reboot on us before we&#039;re done.&lt;br /&gt;
# Run: {{Highlight&lt;br /&gt;
| code = # fastboot getvar config   ## Confirm fastboot works&lt;br /&gt;
# fastboot oem dust xxxxxxx   ## Replace xxxx with the contents in check.txt&lt;br /&gt;
# fastboot oem prep   ## If this fails, don&#039;t proceed. Ensure LiveSuit has the correct img file set&lt;br /&gt;
## Ensure that everything below can run before the watchdog resets the device. If you took too long above, reset and try again.&lt;br /&gt;
# fastboot flash toc1 toc1.img   ## Should return OKAY. Stop otherwise and re-assess.&lt;br /&gt;
# fastboot flash boot1 boot.img&lt;br /&gt;
# fastboot flash rootfs1 rootfs.img&lt;br /&gt;
# fastboot flash boot2 boot.img&lt;br /&gt;
# fastboot flash rootfs2 rootfs.img&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Once all the flash commands succeed, you can run &amp;lt;code&amp;gt;fastboot reboot&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Press the outer buttons together for 3 seconds so that the robot starts its WiFi access point. Connect your computer to the robot&#039;s WiFi network.&lt;br /&gt;
# Confirm that you can SSH to the robot at root@192.168.5.1 using the SSH key that you selected (or were provided) in dustbuilder. If you don&#039;t have a working key, you can always connect to the robot via serial (there&#039;s a root shell running there)&lt;br /&gt;
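#: For example (a sketch; substitute the key path for whichever key you selected or were provided in dustbuilder): {{Highlight&lt;br /&gt;
| code = computer# ssh -i ~/.ssh/id_rsa root@192.168.5.1&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;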
# Backup all the calibration data from the robot. {{Highlight&lt;br /&gt;
| code = robot# tar -czf /tmp/calibration.tar.gz /mnt&lt;br /&gt;
&lt;br /&gt;
## on your computer, scp it out, or if dropbear utils aren&#039;t available, you can just do:&lt;br /&gt;
computer# ssh -i *rsa root@192.168.5.1 cat /tmp/calibration.tar.gz &amp;gt; calibration.tar.gz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Download Valetudo&#039;s latest release from https://github.com/Hypfer/Valetudo and copy it into the robot to &amp;lt;code&amp;gt;/data/valetudo&amp;lt;/code&amp;gt;. {{Highlight&lt;br /&gt;
| code = robot# scp leo@computer:~/valetudo /data/valetudo&lt;br /&gt;
robot# chmod +x /data/valetudo&lt;br /&gt;
robot# cp /misc/_root_postboot.sh.tpl /data/_root_postboot.sh&lt;br /&gt;
robot# chmod +x /data/_root_postboot.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7718</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7718"/>
		<updated>2025-12-06T06:39:26Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hello!&lt;br /&gt;
&lt;br /&gt;
I am a senior Linux administrator with {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of experience managing Linux systems including HPC, cloud, and on‑prem Linux desktop and server infrastructure. My recent experience has proven my expertise in automating large‑scale environments to support scientific research, teaching &amp;amp; learning, and enterprise workloads, as well as custom solutions for metrics gathering and report generation. I am adept at translating business needs into reliable, scalable solutions while mentoring teams and driving process improvements.&lt;br /&gt;
&lt;br /&gt;
== Core competencies ==&lt;br /&gt;
* &#039;&#039;&#039;Operating systems&#039;&#039;&#039;: {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Red Hat-based OSes (CentOS, Rocky Linux, Fedora), Debian, Ubuntu. Experience with Solaris and FreeBSD&lt;br /&gt;
* &#039;&#039;&#039;Automation &amp;amp; configuration&#039;&#039;&#039;: Ansible, Terraform, Packer, SaltStack, Puppet, GitLab CI/CD&lt;br /&gt;
* &#039;&#039;&#039;Container Platforms&#039;&#039;&#039;: Docker, Kubernetes, Apptainer&lt;br /&gt;
* &#039;&#039;&#039;Cloud &amp;amp; Virtualization&#039;&#039;&#039;: AWS, OCI, VMware vSphere/Automation, Proxmox, CloudStack&lt;br /&gt;
* &#039;&#039;&#039;Programming &amp;amp; Scripting&#039;&#039;&#039;: Bash, PowerShell, Python, Java, C#, PHP, C&lt;br /&gt;
* &#039;&#039;&#039;Databases&#039;&#039;&#039;: MySQL, MariaDB, PostgreSQL, SQLite, InfluxDB / Flux, ElasticSearch&lt;br /&gt;
* &#039;&#039;&#039;Monitoring&#039;&#039;&#039;: Grafana, InfluxDB, Telegraf, ELK stack&lt;br /&gt;
* &#039;&#039;&#039;Networking&#039;&#039;&#039;: HP and Cisco switches, software firewalls, NAS systems&lt;br /&gt;
* &#039;&#039;&#039;Security&#039;&#039;&#039;: Kerberos, LDAP, PAM, PKI, log‑based threat hunting, network packet analysis&lt;br /&gt;
* &#039;&#039;&#039;Hardware Platforms&#039;&#039;&#039;: x86_64, ARM64, Sun, Dell, HP, IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
== Professional experience ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary (December 2019 - Present) ===&lt;br /&gt;
One of three lead administrators of HPC resources at Research Computing Services (RCS). Tasked with the successful transition of the 1PB+ GPFS/Tape storage system at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) into RCS&#039;s service offering. Migrated over 1,000 HPC cluster nodes to Ansible, automated account vending, and implemented a robust metrics collection and reporting system.&lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1,000 nodes to Ansible-driven automated configuration deployments, leveraging CI/CD pipelines to automate account provisioning and systems configuration.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources (hardware sensors, servers, network gear) into InfluxDB and visualized CPU, memory, and storage usage by user, department, and faculty with Grafana to drive proactive capacity planning.&lt;br /&gt;
* Developed custom log-parsing scripts to ingest Slurm jobs, system metrics, and audit logs for reports, alerting, and governance reporting.&lt;br /&gt;
* Created RCS&#039;s Open OnDemand portal and containerized scientific apps, improving user productivity and easing the learning curve for users new to HPC.&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution and authored training materials and one-on-one workshops with researchers.&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary (December 2018 - December 2019) ===&lt;br /&gt;
Part of a newly created team tasked with spearheading the University&#039;s first internal Kubernetes as a service platform and VMware vRealize Automation pipeline for delivering virtual machines for secure research workloads.&lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of 8 CoreOS Tectonic Kubernetes clusters while coordinating the internal web team’s transition into our in-house Kubernetes infrastructure&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Built and migrated custom applications into a containerized environment using Helm charts&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud with VMware vRealize, enabling zero-touch VM provisioning from ServiceNow and integration with Red Hat Satellite&lt;br /&gt;
* Authored change-management documentation compliant with ITIL and IT policies, ensuring audit readiness&lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary (July 2018 - November 2018) ===&lt;br /&gt;
I spearheaded the standardized Linux desktop environment for the Customer Technology Services (CTS) group at the University of Calgary. Building on my experience from the Computer Science department, I transitioned and unified disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
* Deployed Foreman, an open-source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers. Used by over 250 Linux workstations in the Department of Computer Science&lt;br /&gt;
* Deployed infrastructure as code and CI/CD pipeline based on Puppet and GitLab for automated configuration management&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Authored support materials for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary (October 2012 - July 2018) ===&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary. I ensured that all teaching and learning needs were met with a consistent and reliable computing environment while providing end-user support for faculty, staff, and students within the department.&lt;br /&gt;
&lt;br /&gt;
* Spearheaded a rewrite of the departmental firewall and intrusion detection system. Firewall rules were parsed and pattern matching was deployed to block malicious traffic patterns from reaching the department network&lt;br /&gt;
* Modernized over 400 machines from an aging CFEngine configuration management system to SaltStack, simplifying systems provisioning and account management&lt;br /&gt;
* Enabled visibility into lab usage with custom machine telemetry data, integrated automatic data polling of student data and courses from the IT data warehouse, and data dashboards&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Converted applications on dedicated servers into containerized solutions, reducing server count by half while supporting new teaching and research apps, including WebCAT, OwnCloud, and Moodle&lt;br /&gt;
* Authored support documentation and end-user guides with a focus on undergraduate student experience&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. (2011 - 2012) ===&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on Windows while working with low-level hardware interfaces with custom in-house pigs&lt;br /&gt;
* Developed analysis software using C# .NET4 to visualize pipeline data&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr (2005 - 2015) ===&lt;br /&gt;
I was a freelance web designer and developer for over 20 clients across North America, delivering custom web designs and applications. I was responsible for ensuring projects were completed on time, on budget, and to client requirements.&lt;br /&gt;
* Consulted with clients on requirements, budget, and timelines. Delivered frequent deliverables and project updates to ensure transparency&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++ while ensuring full compatibility across half a dozen web browsers&lt;br /&gt;
=== Server Administrator - SwiftHost.net (2004 - 2012) ===&lt;br /&gt;
Operated a web hosting company with a partner with a focus on affordable shared and reseller hosting. Servers utilized were based on the LAMP stack along with the use of the cPanel control panel. &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responsible for responding to customer support tickets and provided technical support for a wide range of web applications&lt;br /&gt;
* Performed security audits and server hardening, firewall and intrusion detection in a shared hosting environment&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory (2007) ===&lt;br /&gt;
I worked as a night operator on the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company.&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and without errors.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim cheques&lt;br /&gt;
* Performed a feasibility study on moving existing projects based on JSP servlets to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Education &amp;amp; Training ==&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary, 2011&lt;br /&gt;
** Notable courses: CPSC 550: System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* &#039;&#039;&#039;Languages&#039;&#039;&#039;: English, Cantonese&lt;br /&gt;
* &#039;&#039;&#039;Interests&#039;&#039;&#039;: system infrastructure, systems security, game development, electronics &amp;amp; hardware tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7717</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7717"/>
		<updated>2025-12-06T06:37:28Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hello!&lt;br /&gt;
&lt;br /&gt;
I am a senior Linux administrator with {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of experience managing Linux systems including HPC, cloud, and on‑prem Linux desktop and server infrastructure. My recent experience has proven my expertise in automating large‑scale environments to support scientific research, teaching &amp;amp; learning, and enterprise workloads, as well as custom solutions for metrics gathering and report generation. I am adept at translating business needs into reliable, scalable solutions while mentoring teams and driving process improvements.&lt;br /&gt;
&lt;br /&gt;
== Core competencies ==&lt;br /&gt;
* &#039;&#039;&#039;Operating systems&#039;&#039;&#039;: {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Red Hat-based OSes (CentOS, Rocky Linux, Fedora), Debian, Ubuntu. Experience with Solaris and FreeBSD&lt;br /&gt;
* &#039;&#039;&#039;Automation &amp;amp; configuration&#039;&#039;&#039;: Ansible, Terraform, Packer, SaltStack, Puppet, GitLab CI/CD&lt;br /&gt;
* &#039;&#039;&#039;Container Platforms&#039;&#039;&#039;: Docker, Kubernetes, Apptainer&lt;br /&gt;
* &#039;&#039;&#039;Cloud &amp;amp; Virtualization&#039;&#039;&#039;: AWS, OCI, VMware vSphere/Automation, Proxmox, CloudStack&lt;br /&gt;
* &#039;&#039;&#039;Programming &amp;amp; Scripting&#039;&#039;&#039;: Bash, PowerShell, Python, Java, C#, PHP, C/C++&lt;br /&gt;
* &#039;&#039;&#039;Databases&#039;&#039;&#039;: MySQL, MariaDB, PostgreSQL, SQLite, InfluxDB / Flux&lt;br /&gt;
* &#039;&#039;&#039;Monitoring&#039;&#039;&#039;: Grafana, InfluxDB, Telegraf, ELK stack&lt;br /&gt;
* &#039;&#039;&#039;Networking&#039;&#039;&#039;: HP and Cisco switches, software firewalls, NAS systems&lt;br /&gt;
* &#039;&#039;&#039;Security&#039;&#039;&#039;: Kerberos, LDAP, PAM, PKI, log‑based threat hunting, network packet analysis&lt;br /&gt;
* &#039;&#039;&#039;Hardware Platforms&#039;&#039;&#039;: x86_64, ARM64, Sun, Dell, HP, IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
== Professional experience ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary (December 2019 - Present) ===&lt;br /&gt;
One of three lead administrators of HPC resources at Research Computing Services (RCS). Tasked with the successful transition of the 1PB+ GPFS/Tape storage system at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) into RCS&#039;s service offering. Migrated over 1,000 HPC cluster nodes to Ansible, automated account vending, and implemented a robust metrics collection and reporting system.&lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1,000 nodes to Ansible-driven automated configuration deployments, leveraging CI/CD pipelines to automate account provisioning and systems configuration.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources (hardware sensors, servers, network gear) into InfluxDB and visualized CPU, memory, and storage usage by user, department, and faculty with Grafana to drive proactive capacity planning.&lt;br /&gt;
* Developed custom log-parsing scripts to ingest Slurm jobs, system metrics, and audit logs for reports, alerting, and governance reporting.&lt;br /&gt;
* Created RCS&#039;s Open OnDemand portal and containerized scientific apps, improving user productivity and easing the learning curve for users new to HPC.&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution and authored training materials and one-on-one workshops with researchers.&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary (December 2018 - December 2019) ===&lt;br /&gt;
Part of a newly created team tasked with spearheading the University&#039;s first internal Kubernetes as a service platform and VMware vRealize Automation pipeline for delivering virtual machines for secure research workloads.&lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of 8 CoreOS Tectonic Kubernetes clusters while coordinating the internal web team’s transition into our in-house Kubernetes infrastructure&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Built and migrated custom applications into a containerized environment using Helm charts&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud with VMware vRealize, enabling zero-touch VM provisioning from ServiceNow and integration with Red Hat Satellite&lt;br /&gt;
* Authored change-management documentation compliant with ITIL and IT policies, ensuring audit readiness&lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary (July 2018 - November 2018) ===&lt;br /&gt;
I spearheaded the standardized Linux desktop environment for the Customer Technology Services (CTS) group at the University of Calgary. Building on my experience from the Computer Science department, I transitioned and unified disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
* Deployed Foreman, an open-source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers. Used by over 250 Linux workstations in the Department of Computer Science&lt;br /&gt;
* Deployed infrastructure as code and CI/CD pipeline based on Puppet and GitLab for automated configuration management&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Authored support materials for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary (October 2012 - July 2018) ===&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary. I ensured that all teaching and learning needs were met with a consistent and reliable computing environment while providing end-user support for faculty, staff, and students within the department.&lt;br /&gt;
&lt;br /&gt;
* Spearheaded a rewrite of the departmental firewall and intrusion detection system. Firewall rules were parsed and pattern matching was deployed to block malicious traffic patterns from reaching the department network&lt;br /&gt;
* Modernized over 400 machines from an aging CFEngine configuration management system to SaltStack, simplifying systems provisioning and account management&lt;br /&gt;
* Enabled visibility into lab usage with custom machine telemetry data, integrated automatic data polling of student data and courses from the IT data warehouse, and data dashboards&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Converted applications on dedicated servers into containerized solutions, reducing server count by half while supporting new teaching and research apps, including WebCAT, OwnCloud, and Moodle&lt;br /&gt;
* Authored support documentation and end-user guides with a focus on undergraduate student experience&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. (2011 - 2012) ===&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on Windows while working with low-level hardware interfaces with custom in-house pigs&lt;br /&gt;
* Developed analysis software using C# .NET4 to visualize pipeline data&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr (2005 - 2015) ===&lt;br /&gt;
I was a freelance web designer and developer for over 20 clients across North America, delivering custom web designs and applications. I was responsible for ensuring projects were completed on time, on budget, and to client requirements.&lt;br /&gt;
* Consulted with clients on requirements, budget, and timelines. Delivered frequent deliverables and project updates to ensure transparency&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++ while ensuring full compatibility across half a dozen web browsers&lt;br /&gt;
=== Server Administrator - SwiftHost.net (2004 - 2012) ===&lt;br /&gt;
Operated a web hosting company with a partner with a focus on affordable shared and reseller hosting. Servers utilized were based on the LAMP stack along with the use of the cPanel control panel. &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responsible for responding to customer support tickets and provided technical support for a wide range of web applications&lt;br /&gt;
* Performed security audits and server hardening, firewall and intrusion detection in a shared hosting environment&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory (2007) ===&lt;br /&gt;
I worked as a night operator on the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company.&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and without errors.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim cheques&lt;br /&gt;
* Performed a feasibility study on moving existing projects based on JSP servlets to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Education &amp;amp; Training ==&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary, 2011&lt;br /&gt;
** Notable courses: CPSC 550: System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* &#039;&#039;&#039;Languages&#039;&#039;&#039;: English, Cantonese&lt;br /&gt;
* &#039;&#039;&#039;Interests&#039;&#039;&#039;: system infrastructure, systems security, game development, electronics &amp;amp; hardware tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7716</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7716"/>
		<updated>2025-12-06T05:42:37Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hello!&lt;br /&gt;
&lt;br /&gt;
I am a senior Linux administrator with {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of experience managing Linux systems including HPC, cloud, and on‑prem Linux desktop and server infrastructure. My recent experience has proven my expertise in automating large‑scale environments to support scientific research, teaching &amp;amp; learning, and enterprise workloads, as well as custom solutions for metrics gathering and report generation. I am adept at translating business needs into reliable, scalable solutions while mentoring teams and driving process improvements.&lt;br /&gt;
&lt;br /&gt;
== Core competencies ==&lt;br /&gt;
* &#039;&#039;&#039;Operating systems&#039;&#039;&#039;: {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Red Hat-based OSes (CentOS, Rocky Linux, Fedora), Debian, Ubuntu.  Past experience with Solaris and FreeBSD&lt;br /&gt;
* &#039;&#039;&#039;Automation &amp;amp; configuration&#039;&#039;&#039;: Ansible, Terraform, Packer, SaltStack, Puppet, GitLab CI/CD&lt;br /&gt;
* &#039;&#039;&#039;Container Platforms&#039;&#039;&#039;: Docker, Kubernetes, Apptainer&lt;br /&gt;
* &#039;&#039;&#039;Cloud &amp;amp; Virtualization&#039;&#039;&#039;: AWS, OCI, VMware vSphere/Automation, Proxmox, CloudStack&lt;br /&gt;
* &#039;&#039;&#039;Programming &amp;amp; Scripting&#039;&#039;&#039;: Bash, PowerShell, Python, Java, C#, PHP, C/C++&lt;br /&gt;
* &#039;&#039;&#039;Databases&#039;&#039;&#039;: MySQL, MariaDB, PostgreSQL, SQLite, InfluxDB / Flux&lt;br /&gt;
* &#039;&#039;&#039;Monitoring&#039;&#039;&#039;: Grafana, InfluxDB, Telegraf, ELK stack&lt;br /&gt;
* &#039;&#039;&#039;Networking&#039;&#039;&#039;: HP and Cisco switches, software firewalls, NAS systems&lt;br /&gt;
* &#039;&#039;&#039;Security&#039;&#039;&#039;: Kerberos, LDAP, PAM, PKI, log‑based threat hunting, network packet analysis&lt;br /&gt;
* &#039;&#039;&#039;Hardware Platforms&#039;&#039;&#039;: x86_64, ARM64, Sun, Dell, HP, IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
== Professional experience ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary (December 2019 - December 2025) ===&lt;br /&gt;
One of three lead architects and operators for the HPC resources at Research Computing Services (RCS). Tasked with the successful transition of the 1PB+ storage system at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) into RCS&#039;s service offering. Migrated over 1000 HPC cluster nodes to Ansible, automated account vending, and implemented a robust metrics collection and reporting system.&lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1000 nodes to Ansible for automated configuration deployment, leveraging CI/CD pipelines to automate account provisioning and systems configuration.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources (e.g. sensors, servers, network gear). Visualized CPU, memory, and storage usage by user, department, and faculty to drive proactive capacity planning.&lt;br /&gt;
* Developed custom log-parsing scripts to ingest Slurm jobs, system metrics, and audit logs for reports, alerting, and governance reporting.&lt;br /&gt;
* Created RCS&#039;s Open OnDemand portal and containerized scientific apps, improving user productivity and easing the learning curve for users new to HPC.&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution and authored training materials and one-on-one workshops with researchers.&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary (December 2018 - December 2019) ===&lt;br /&gt;
Part of a newly created team tasked with spearheading the University&#039;s first internal Kubernetes-as-a-service platform and a VMware vRealize Automation pipeline for delivering virtual machines for secure research workloads.&lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of around 8 CoreOS Tectonic Kubernetes clusters&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Coordinated with our internal web team to transition all of the University&#039;s websites onto our Kubernetes infrastructure&lt;br /&gt;
* Built and migrated custom applications into a containerized environment using Helm charts&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud with VMware vRealize, enabling zero-touch VM provisioning from ServiceNow and integration with Red Hat Satellite&lt;br /&gt;
* Authored change-management documentation compliant with ITIL and IT policies, ensuring audit readiness&lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary (July 2018 - November 2018) ===&lt;br /&gt;
I spearheaded the standardized Linux desktop environment for the Customer Technology Services (CTS) group at the University of Calgary. Building on my past experience from the Computer Science department, I transitioned and unified disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
* Deployed TheForeman, an open source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers. Used by over 250 Linux workstations in the Department of Computer Science&lt;br /&gt;
* Wrote and integrated Puppet manifests with TheForeman for desired state configuration on all hosts&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Authored support materials on the new deployment system for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary (October 2012 - July 2018) ===&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary. I ensured that all teaching and learning needs were met with a consistent and reliable computing environment while providing end-user support for faculty, staff, and students within the department.&lt;br /&gt;
&lt;br /&gt;
* Spearheaded a rewrite of the departmental firewall and intrusion detection system. Firewall rules were parsed and pattern matching was deployed to block malicious traffic patterns from reaching the department network&lt;br /&gt;
* Modernized over 400 machines from an aging CFEngine configuration management system to SaltStack, simplifying systems provisioning and account management&lt;br /&gt;
* Enabled visibility into lab usage with custom machine telemetry data, integrated automatic data polling of student data and courses from the IT data warehouse, and data dashboards&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Converted applications on dedicated servers into containerized solutions, reducing server count by half and ensuring new teaching and research apps were supported, including WebCAT, OwnCloud, and Moodle.&lt;br /&gt;
* Authored support documentation and end-user guides with a focus on undergraduate student experience&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. (2011 - 2012) ===&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on the Windows platform&lt;br /&gt;
* Developed low-level hardware interfaces with custom-designed pigs&lt;br /&gt;
* Developed the next version of their analysis software from scratch using C# .NET4&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr (2005 - 2015) ===&lt;br /&gt;
I was a freelance web designer and developer, providing over 20 clients across North America with custom web designs and applications. I was responsible for ensuring projects were completed on time, on budget, and to client requirements.&lt;br /&gt;
* Consulted with clients on requirements, budget, and timelines. Delivered frequent deliverables and project updates to ensure transparency&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++ while ensuring full compatibility across half a dozen web browsers&lt;br /&gt;
=== Server Administrator - SwiftHost.net (2004 - 2012) ===&lt;br /&gt;
Operated a web hosting company with a partner with a focus on affordable shared and reseller hosting. Servers utilized were based on the LAMP stack along with the use of the cPanel control panel. &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responsible for responding to customer support tickets and provided technical support for a wide range of web applications&lt;br /&gt;
* Performed security audits and server hardening, firewall and intrusion detection in a shared hosting environment&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory (2007) ===&lt;br /&gt;
I worked as a night operator as part of the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company as a summer job.&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and error-free.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim cheques&lt;br /&gt;
* Ensured paper was loaded and aligned properly to the printers before printing&lt;br /&gt;
* Researched the feasibility of migrating their existing JSP servlet-based projects to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Education &amp;amp; Training ==&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary, 2011&lt;br /&gt;
** Notable courses: CPSC 550; System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* Languages: English, Cantonese&lt;br /&gt;
* Interests: system infrastructure, systems security, game development, electronics &amp;amp; hardware tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7715</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7715"/>
		<updated>2025-12-06T01:50:31Z</updated>

		<summary type="html">&lt;p&gt;Leo: Simplify professional experience&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Hello!&lt;br /&gt;
&lt;br /&gt;
I am a senior Linux administrator with {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of experience managing Linux systems including HPC, cloud, and on‑prem Linux desktop and server infrastructure. My recent experience has proven my expertise in automating large‑scale environments to support scientific research, teaching &amp;amp; learning, and enterprise workloads, as well as custom solutions for metrics gathering and report generation. I am adept at translating business needs into reliable, scalable solutions while mentoring teams and driving process improvements.&lt;br /&gt;
&lt;br /&gt;
== Core competencies ==&lt;br /&gt;
* &#039;&#039;&#039;Operating systems&#039;&#039;&#039;: {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Red Hat-based OSes (CentOS, Rocky Linux, Fedora), Debian, Ubuntu.  Past experience with Solaris and FreeBSD&lt;br /&gt;
* &#039;&#039;&#039;Automation &amp;amp; configuration&#039;&#039;&#039;: Ansible, Terraform, Packer, SaltStack, Puppet, GitLab CI/CD&lt;br /&gt;
* &#039;&#039;&#039;Container Platforms&#039;&#039;&#039;: Docker, Kubernetes, Apptainer&lt;br /&gt;
* &#039;&#039;&#039;Cloud &amp;amp; Virtualization&#039;&#039;&#039;: AWS, OCI, VMware vSphere/Automation, Proxmox, CloudStack&lt;br /&gt;
* &#039;&#039;&#039;Programming &amp;amp; Scripting&#039;&#039;&#039;: Bash, PowerShell, Python, Java, C#, PHP, C/C++&lt;br /&gt;
* &#039;&#039;&#039;Databases&#039;&#039;&#039;: MySQL, MariaDB, PostgreSQL, SQLite, InfluxDB / Flux&lt;br /&gt;
* &#039;&#039;&#039;Monitoring&#039;&#039;&#039;: Grafana, InfluxDB, Telegraf, ELK stack&lt;br /&gt;
* &#039;&#039;&#039;Networking&#039;&#039;&#039;: HP and Cisco switches, software firewalls, NAS systems&lt;br /&gt;
* &#039;&#039;&#039;Security&#039;&#039;&#039;: Kerberos, LDAP, PAM, PKI, log‑based threat hunting, network packet analysis&lt;br /&gt;
* &#039;&#039;&#039;Hardware Platforms&#039;&#039;&#039;: x86_64, ARM64, Sun, Dell, HP, IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
=== Education &amp;amp; Training ===&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary&lt;br /&gt;
** August 2007 - May 2011&lt;br /&gt;
** Notable courses: CPSC 550; System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
** Completed August 2018&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
&lt;br /&gt;
== Professional experience ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary (December 2019 - December 2025) ===&lt;br /&gt;
One of three lead architects and operators for the HPC resources at Research Computing Services (RCS). Tasked with the successful transition of the 1PB+ storage system at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) into RCS&#039;s service offering. Migrated over 1000 HPC cluster nodes to Ansible, automated account vending, and implemented a robust metrics collection and reporting system.&lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1000 nodes to Ansible for automated configuration deployment, leveraging CI/CD pipelines to automate account provisioning and systems configuration.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources (e.g. sensors, servers, network gear). Visualized CPU, memory, and storage usage by user, department, and faculty to drive proactive capacity planning.&lt;br /&gt;
* Developed custom log-parsing scripts to ingest Slurm jobs, system metrics, and audit logs for reports, alerting, and governance reporting.&lt;br /&gt;
* Created RCS&#039;s Open OnDemand portal and containerized scientific apps, improving user productivity and easing the learning curve for users new to HPC.&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution and authored training materials and one-on-one workshops with researchers.&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary (December 2018 - December 2019) ===&lt;br /&gt;
Part of a newly created team tasked with spearheading the University&#039;s first internal Kubernetes-as-a-service platform and a VMware vRealize Automation pipeline for delivering virtual machines for secure research workloads.&lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of around 8 CoreOS Tectonic Kubernetes clusters&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Coordinated with our internal web team to transition all of the University&#039;s websites onto our Kubernetes infrastructure&lt;br /&gt;
* Built and migrated custom applications into a containerized environment using Helm charts&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud with VMware vRealize, enabling zero-touch VM provisioning from ServiceNow and integration with Red Hat Satellite&lt;br /&gt;
* Authored change-management documentation compliant with ITIL and IT policies, ensuring audit readiness&lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary (July 2018 - November 2018) ===&lt;br /&gt;
I spearheaded the standardized Linux desktop environment for the Customer Technology Services (CTS) group at the University of Calgary. Building on my past experience from the Computer Science department, I transitioned and unified disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
* Deployed TheForeman, an open source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers. Used by over 250 Linux workstations in the Department of Computer Science&lt;br /&gt;
* Wrote and integrated Puppet manifests with TheForeman for desired state configuration on all hosts&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Authored support materials on the new deployment system for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary (October 2012 - July 2018) ===&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary. I ensured that all teaching and learning needs were met with a consistent and reliable computing environment while providing end-user support for faculty, staff, and students within the department.&lt;br /&gt;
&lt;br /&gt;
* Spearheaded a rewrite of the departmental firewall and intrusion detection system. Firewall rules were parsed and pattern matching was deployed to block malicious traffic patterns from reaching the department network&lt;br /&gt;
* Modernized over 400 machines from an aging CFEngine configuration management system to SaltStack, simplifying systems provisioning and account management&lt;br /&gt;
* Enabled visibility into lab usage with custom machine telemetry data, integrated automatic data polling of student data and courses from the IT data warehouse, and data dashboards&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Converted applications on dedicated servers into containerized solutions, reducing server count by half and ensuring new teaching and research apps were supported, including WebCAT, OwnCloud, and Moodle.&lt;br /&gt;
* Authored support documentation and end-user guides with a focus on undergraduate student experience&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. ===&lt;br /&gt;
May 2011 – September 2013, Full Time, http://onstream-pipeline.com/ &lt;br /&gt;
&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on the Windows platform&lt;br /&gt;
* Developed low-level hardware interfaces with custom-designed pigs&lt;br /&gt;
* Developed the next version of their analysis software from scratch using C# .NET4&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr ===&lt;br /&gt;
June 2005 – 2014, Freelance, http://steamr.com/ &lt;br /&gt;
&lt;br /&gt;
I was a freelance web designer and developer, providing clients primarily in North America with custom web designs and applications. I was responsible for ensuring projects were completed on time, on budget, and to client requirements.&lt;br /&gt;
&lt;br /&gt;
A sample of my work can be found on my portfolio at: http://steamr.com/portfolio &lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Consulted with clients on budget, timeline, and user requirements for every project and delivered frequent deliverables and project updates.&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++, and half a dozen web browsers. &lt;br /&gt;
* Ensured all websites were optimal, standards compliant, cross-browser compatible, and search engine optimized (SEO) based on web analytics.&lt;br /&gt;
* Created and maintained over 20 projects under the LAMP stack, along with using PHP frameworks and libraries including: Kohana, CodeIgniter, FPDF, Smarty.&lt;br /&gt;
* Created over 40 individual websites from scratch for businesses and individuals.&lt;br /&gt;
* Customized design themes for WHMCS, SolusVM, Kayako Support Suite for hosting companies.&lt;br /&gt;
	&lt;br /&gt;
Related Work:&lt;br /&gt;
* Previous webmaster for GlobalTech Communications including: nworks.ca, gtcomm.net, unmeteredserver.net, zenprotection.com&lt;br /&gt;
* Responsible for updating content and promotions while ensuring the websites correctly interfaced with the point-of-sale application.&lt;br /&gt;
* Developed an internal network management application for gtcomm.net in PHP utilizing MRTG. Abnormal network patterns would notify an administrator via email and SMS.&lt;br /&gt;
* Created a distributed network monitoring system based on Java, PHP and MySQL, which utilizes http and ping tests to determine the availability and latency of remote services. Results are then recorded and monthly reports generated.&lt;br /&gt;
* Designed the front-end of an in-house inventory system used by gtcomm.net used to track networking/server hardware along with managing available IP pools.&lt;br /&gt;
* Designed the front-end of an in-house project enabling end users to purchase and manage DDoS protection services by zenprotection.net.&lt;br /&gt;
&lt;br /&gt;
=== Server Administrator - SwiftHost.net ===&lt;br /&gt;
January 2004 – May 2012, http://swifthost.net/ &lt;br /&gt;
&lt;br /&gt;
With a partner, I created a web hosting company, focusing on affordable shared and reseller hosting. Servers utilized were based on the LAMP stack along with the use of the cPanel control panel. &lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responded to customer support tickets within 6 hours for issues in the technical, abuse, and billing departments.&lt;br /&gt;
* Handled technical issues ranging from fixing simple permission problems to requests for specific software such as Apache/PHP modules, system libraries, and caching / PHP accelerators&lt;br /&gt;
* Performed security audits and server hardening by disabling or restricting services, configuring the firewall, detecting compromised websites and spam control in a shared hosting environment.&lt;br /&gt;
* Configured weekly, network-based backups to an off-site location using cronjobs, rsync and shell scripts. Performed account or file specific restores from backups.&lt;br /&gt;
	&lt;br /&gt;
Projects:&lt;br /&gt;
* Designed a custom billing solution for account management using PHP, MySQL and the PayPal API.&lt;br /&gt;
* A status page which monitored and reported live server status information such as processor load, traffic load, service availability, memory usage to customers.&lt;br /&gt;
* Developed a custom web control panel, similar to cPanel, targeted at free hosting accounts. The control panel, based on PHP, MySQL, and shell + Perl scripts, enabled end users to manage their websites, domains, and databases while automatically maintaining and enforcing disk and bandwidth quotas.&lt;br /&gt;
&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory ===&lt;br /&gt;
May 2007 – July 2007, Part Time&lt;br /&gt;
&lt;br /&gt;
I worked as a night operator as part of the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company as a summer job.&lt;br /&gt;
&lt;br /&gt;
Responsibilities:&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and error-free.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim cheques&lt;br /&gt;
* Ensured paper was loaded and aligned properly to the printers before printing&lt;br /&gt;
* Researched the feasibility of migrating their existing JSP servlet-based projects to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* Canadian; Speaks English &amp;amp; Cantonese&lt;br /&gt;
* Interest in anything Linux related, System Infrastructure, Systems Security, Game Development, Electronics &amp;amp; Tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7714</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7714"/>
		<updated>2025-12-05T07:17:05Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Relevant Skills */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is my current resume / CV.&lt;br /&gt;
&lt;br /&gt;
== Objective ==&lt;br /&gt;
With my wide skillset in different technical areas, my goal is to come up with elegant and streamlined solutions that meet both user and IT business operation requirements. &lt;br /&gt;
&lt;br /&gt;
== Relevant Skills ==&lt;br /&gt;
&lt;br /&gt;
=== OPERATING SYSTEMS ===&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Linux (Primarily Red Hat-based, Eg. CentOS, Rocky Linux, Fedora)&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2005) }}+ years of desktop support experience with Windows&lt;br /&gt;
* Experience with Solaris and FreeBSD&lt;br /&gt;
&lt;br /&gt;
=== NETWORKING ===&lt;br /&gt;
* Familiar with physical network maintenance, software firewalls, NAS appliances&lt;br /&gt;
* Experience with HP and Cisco switches&lt;br /&gt;
&lt;br /&gt;
=== SOFTWARE ===&lt;br /&gt;
* Containers: Docker, Kubernetes, Apptainer&lt;br /&gt;
* Configuration Management: Ansible, SaltStack, Puppet, Terraform, Packer&lt;br /&gt;
* Logs and metrics: Telegraf, InfluxDB, Grafana, EFK/ELK (ElasticSearch, Logstash/Fluentd, Kibana)&lt;br /&gt;
* Products: Red Hat Satellite, GitLab, CloudStack, Proxmox&lt;br /&gt;
&lt;br /&gt;
=== SECURITY ===&lt;br /&gt;
* Experience with Kerberos and LDAP authentication&lt;br /&gt;
* Can perform security analysis via log processing, packet analysis, tcpdump&lt;br /&gt;
&lt;br /&gt;
=== WEB ===&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2008) }}+ years configuring and deploying web applications&lt;br /&gt;
* Containerize web applications for Kubernetes&lt;br /&gt;
* Customize templates for MediaWiki, WordPress, Drupal&lt;br /&gt;
* Experience with Amazon Web Services (AWS), Oracle Cloud Infrastructure (OCI)&lt;br /&gt;
&lt;br /&gt;
=== PROGRAMMING ===&lt;br /&gt;
* Scripting: bash/sh/awk, Powershell&lt;br /&gt;
* Coding: Java, C#, PHP, Python, C&lt;br /&gt;
* Database: MySQL/MariaDB, SQLite, PostgreSQL, InfluxDB/flux&lt;br /&gt;
* Debug: C, C++, gdb, strace, Ghidra&lt;br /&gt;
&lt;br /&gt;
=== HARDWARE ===&lt;br /&gt;
* Experience with x86/x86_64/ARM64 Hardware&lt;br /&gt;
* Experience with server and workstation hardware from Sun, Dell, and HP&lt;br /&gt;
* Experience with IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
== Education &amp;amp; Training ==&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary&lt;br /&gt;
** August 2007 - May 2011&lt;br /&gt;
** Notable courses: CPSC 550; System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
** Completed August 2018&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
&lt;br /&gt;
== Work History ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary ===&lt;br /&gt;
December 2019 - December 2025, Full Time, http://ucalgary.ca/hpc&lt;br /&gt;
&lt;br /&gt;
I maintained and transitioned infrastructure operated at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) to the centralized IT infrastructure, operated by Research Computing Services (RCS). &lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1000 nodes to Ansible and automated configuration deployments. Explored using Warewulf for node deployment.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources (Eg. sensors, servers, network gear) for senior leadership and institutional decision making&lt;br /&gt;
* Developed customized solutions for log parsing and processing for reports and alerts&lt;br /&gt;
* Implemented and maintained Open OnDemand and custom containerized apps as a friendlier interface to RCS&#039;s HPC resources&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution; written training material and provided training sessions for users&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
My day to day duties included:&lt;br /&gt;
* Operated, maintained, and troubleshot HPC clusters running Slurm&lt;br /&gt;
* Collected cluster metrics and hardware performance data, stored in InfluxDB and visualized with Grafana&lt;br /&gt;
* Provided researcher and lab support for HPC, Open OnDemand, storage, CloudStack requests and issues&lt;br /&gt;
* Gathered researcher requirements and ensured deployed services met their operational needs&lt;br /&gt;
* Automated and streamlined operational processes with custom scripts and CI/CD pipelines&lt;br /&gt;
* Supported RCS&#039;s internal development team by providing guidance related to systems programming and development&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary ===&lt;br /&gt;
December 2018 - December 2019, Full Time Limited Term, http://ucalgary.ca/it&lt;br /&gt;
&lt;br /&gt;
I was part of a newly created team of two tasked with creating a new infrastructure automation system using VMware vRealize Automation as well as creating a managed on-site Kubernetes as a service solution for the University of Calgary.&lt;br /&gt;
&lt;br /&gt;
Due to the early stage of both infrastructure automation and Kubernetes as a Service projects, my tasks revolved around requirements gathering, solution testing, ensuring compatibility with the University&#039;s existing infrastructure, and adhering to all change and release processes. &lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of around 8 CoreOS Tectonic Kubernetes clusters&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Coordinated with our internal web team to transition the University&#039;s websites onto our Kubernetes infrastructure&lt;br /&gt;
* Built and migrated applications into a containerized environment&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud&lt;br /&gt;
* Automated Linux deployments with VMware vRealize Automation and VMware vRealize Orchestrator with integration to Red Hat Satellite 6&lt;br /&gt;
* Documented change / release as per IT policy for new service offerings &lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary ===&lt;br /&gt;
July 2018 - November 2018, Full Time, http://ucalgary.ca/it&lt;br /&gt;
&lt;br /&gt;
I was a Linux systems analyst under the Customer Technology Standards group tasked with designing and building a standardized Linux desktop environment for the University of Calgary. This position required transitioning and unifying the disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
&lt;br /&gt;
Over a period of a few short months, I successfully:&lt;br /&gt;
* Deployed TheForeman, an open source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers&lt;br /&gt;
* Wrote and integrated Puppet manifests with TheForeman for desired state configuration on all hosts&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Deployed 250+ Linux workstations for the Department of Computer Science&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Created support materials on the new deployment system for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary ===&lt;br /&gt;
October 2013 - July 2018, Full Time, http://ucalgary.ca/cpsc&lt;br /&gt;
&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary.&lt;br /&gt;
&lt;br /&gt;
My primary tasks were to provide a consistent and reliable computing environment for research and teaching by supporting all Linux-based servers and workstations along with their applications and services. I was also primarily responsible for the proper function and operation of the department&#039;s firewall and intrusion detection system.&lt;br /&gt;
&lt;br /&gt;
Secondary tasks included:&lt;br /&gt;
* Providing tier 2-3 support (via desk-side, phone, email, help desk) for faculty, staff, and students, including Windows and MacOS support&lt;br /&gt;
* Testing and applying updates to servers and services every month&lt;br /&gt;
* Monitoring, maintaining, and configuring hardware for desktops, servers, and switches&lt;br /&gt;
* Deploying new servers and workstations and disposing or relocating old computers as required&lt;br /&gt;
* Ensuring public infrastructure services are up to date and operational, including mirror.cpsc.ucalgary.ca, CPSC&#039;s public NTP servers, and one of four DNS nameservers for the University.&lt;br /&gt;
* Maintaining and updating a private package repository, including creating custom packages for servers and workstations&lt;br /&gt;
* Maintaining proper support documentation and end-user guides on services&lt;br /&gt;
* Consulting with faculty members to ensure software requirements for teaching are met&lt;br /&gt;
* Ensuring labs are using an updated and fully patched distribution every year&lt;br /&gt;
&lt;br /&gt;
Other achievements that I am proud of are:&lt;br /&gt;
* Integrated Linux workstations and servers with Windows Active Directory to streamline authentication&lt;br /&gt;
* Developed many in-house applications that are used by the Faculty of Science including:&lt;br /&gt;
** PokeAPI - A RESTful API that provides a unified facade over data from multiple sources including LDAP, the data warehouse, databases, and file servers&lt;br /&gt;
** User Telemetry - A set of tools to gather computer usage in our undergraduate labs on both Windows and Linux&lt;br /&gt;
** Displays - A web-based tool for managing contents for digital signage across the Faculty of Science powered by a combination of embedded devices and computers&lt;br /&gt;
** The CPSC Guide - A GUI tool that helps students do common tasks rather than through the command line&lt;br /&gt;
** CS Agreement - A Windows/Linux application that allows students to agree to an AUP before being allowed to use our computers&lt;br /&gt;
* Created an automated build system for workstations and servers that deploys machines through PXE and automatically configures their DNS and DHCP entries&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Installed and maintained services for teaching and research including WebCAT, OwnCloud, Moodle, cPanel+CloudOS, VMware ESXi&lt;br /&gt;
* Replaced an aging CFEngine automated configuration tool with SaltStack&lt;br /&gt;
* Rolled out ZFS file servers on undergraduate servers, enabling self-served file restore through snapshots&lt;br /&gt;
* Consolidated multiple servers and services into individual Docker containers&lt;br /&gt;
* Rebuilt the Raspberry Pi lab, reducing lost parts and clutter in the lab.&lt;br /&gt;
&lt;br /&gt;
Recognition:&lt;br /&gt;
* My team and I received the Faculty of Science Innovation and Change Award of Excellence in May 2015.&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. ===&lt;br /&gt;
May 2011 – September 2013, Full Time, http://onstream-pipeline.com/ &lt;br /&gt;
&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on the Windows platform&lt;br /&gt;
* Wrote low-level hardware interfacing code for custom-designed inspection pigs&lt;br /&gt;
* Developed the next version of their analysis software from scratch using C# .NET4&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr ===&lt;br /&gt;
June 2005 – 2014, Freelance, http://steamr.com/ &lt;br /&gt;
&lt;br /&gt;
I was a freelance web designer and developer who provided clients, primarily in North America, with custom web designs and applications. I was responsible for ensuring projects were completed on time and on budget and satisfied client requirements.&lt;br /&gt;
&lt;br /&gt;
A sample of my work can be found on my portfolio at: http://steamr.com/portfolio &lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Consulted with clients for every project for budget, timeline and user requirements and delivered frequent deliverables and project updates.&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++, and half a dozen web browsers. &lt;br /&gt;
* Ensured all websites were performant, standards compliant, cross-browser compatible, and search engine optimized (SEO) based on web analytics.&lt;br /&gt;
* Created and maintained over 20 projects under the LAMP stack, along with using PHP frameworks and libraries including: Kohana, CodeIgniter, FPDF, Smarty.&lt;br /&gt;
* Created over 40 individual websites from scratch for businesses and individuals.&lt;br /&gt;
* Customized design themes for WHMCS, SolusVM, Kayako Support Suite for hosting companies.&lt;br /&gt;
	&lt;br /&gt;
Related Work:&lt;br /&gt;
* Previous webmaster for GlobalTech Communications including: nworks.ca, gtcomm.net, unmeteredserver.net, zenprotection.com&lt;br /&gt;
* Responsible for updating content and promotions while ensuring the websites correctly interfaced with the point-of-sale application.&lt;br /&gt;
* Developed an internal network management application for gtcomm.net in PHP utilizing MRTG. Abnormal network patterns would notify an administrator via email and SMS.&lt;br /&gt;
* Created a distributed network monitoring system based on Java, PHP, and MySQL, which used HTTP and ping tests to determine the availability and latency of remote services, recorded the results, and generated monthly reports.&lt;br /&gt;
* Designed the front-end of an in-house inventory system used by gtcomm.net to track networking/server hardware and manage available IP pools.&lt;br /&gt;
* Designed the front-end of an in-house project enabling end users to purchase and manage DDoS protection services by zenprotection.net.&lt;br /&gt;
&lt;br /&gt;
=== Server Administrator - SwiftHost.net ===&lt;br /&gt;
January 2004 – May 2012, http://swifthost.net/ &lt;br /&gt;
&lt;br /&gt;
With a partner, I created a web hosting company focusing on affordable shared and reseller hosting. Servers ran the LAMP stack along with the cPanel control panel.&lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responsible for responding to customer support tickets within 6 hours for issues under the technical, abuse, and billing departments.&lt;br /&gt;
* Handled technical issues ranging from fixing simple permission problems to requests for specific software such as Apache/PHP modules, system libraries, and caching/PHP accelerators&lt;br /&gt;
* Performed security audits and server hardening by disabling or restricting services, configuring the firewall, detecting compromised websites and spam control in a shared hosting environment.&lt;br /&gt;
* Configured weekly, network-based backups to an off-site location using cronjobs, rsync and shell scripts. Performed account or file specific restores from backups.&lt;br /&gt;
	&lt;br /&gt;
Projects:&lt;br /&gt;
* Designed a custom billing solution for account management using PHP, MySQL and the PayPal API.&lt;br /&gt;
* Built a status page that monitored and reported live server status information (processor load, traffic load, service availability, memory usage) to customers.&lt;br /&gt;
* Developed a custom web control panel, similar to cPanel, targeted at free hosting accounts. The control panel, based on PHP, MySQL, and shell and Perl scripts, enabled end users to manage their websites, domains, and databases while automatically maintaining and enforcing disk and bandwidth quotas.&lt;br /&gt;
&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory ===&lt;br /&gt;
May 2007 – July 2007, Part Time&lt;br /&gt;
&lt;br /&gt;
I worked as a night operator on the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company as a summer job.&lt;br /&gt;
&lt;br /&gt;
Responsibilities:&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and without errors.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim checks&lt;br /&gt;
* Ensured paper was loaded and aligned properly to the printers before printing&lt;br /&gt;
* Researched the feasibility of migrating their existing JSP servlet projects to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* Canadian; Speaks English &amp;amp; Cantonese&lt;br /&gt;
* Interest in anything Linux related, System Infrastructure, Systems Security, Game Development, Electronics &amp;amp; Tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7713</id>
		<title>Hire Me</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Hire_Me&amp;diff=7713"/>
		<updated>2025-12-05T07:01:01Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is my current resume / CV.&lt;br /&gt;
&lt;br /&gt;
== Objective ==&lt;br /&gt;
With my wide skillset in different technical areas, my goal is to come up with elegant and streamlined solutions that meet both user and IT business operation requirements. &lt;br /&gt;
&lt;br /&gt;
== Relevant Skills ==&lt;br /&gt;
&lt;br /&gt;
=== OPERATING SYSTEMS ===&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2006) }}+ years of server experience with Linux (Primarily Red Hat-based, CentOS, Rocky Linux, Fedora)&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2005) }}+ years of desktop support experience with Windows&lt;br /&gt;
* Experience with Solaris and FreeBSD&lt;br /&gt;
&lt;br /&gt;
=== NETWORKING ===&lt;br /&gt;
* Understands key concepts of network protocols (TCP/IP, IP routing, VLANs, etc.)&lt;br /&gt;
* Familiar with physical network maintenance, software firewalls (IPTables), NAS appliances&lt;br /&gt;
* Experience with HP and Cisco switches&lt;br /&gt;
&lt;br /&gt;
=== SERVICES AND SOFTWARE ===&lt;br /&gt;
* Containers: Docker, Docker Swarm, Kubernetes, Singularity&lt;br /&gt;
* Configuration Tools: Ansible, SaltStack, Puppet, Terraform, Packer&lt;br /&gt;
* Services: HTTP/HTTPS (Apache 2, nodejs), SMTP (Sendmail, Exim), POP/IMAP (Dovecot), DNS (Bind), DHCP, NFS, NTP, FTP, PXE/TFTP, Samba, Active Directory/LDAP&lt;br /&gt;
* Products: Spectrum Archive, Spectrum Protect (TSM), VMware vSphere, TheForeman / Red Hat Satellite, GitLab, EFK/ELK (ElasticSearch, Logstash/Fluentd, Kibana), CloudStack&lt;br /&gt;
&lt;br /&gt;
=== SECURITY ===&lt;br /&gt;
* Experience with Kerberos and LDAP authentication&lt;br /&gt;
* Can perform security analysis via log processing, packet analysis, tcpdump&lt;br /&gt;
&lt;br /&gt;
=== WEB ===&lt;br /&gt;
* {{#expr: floor( {{CURRENTYEAR}} - 2008) }}+ years configuring and deploying web applications&lt;br /&gt;
* Containerize web applications for Kubernetes&lt;br /&gt;
* Customize templates for MediaWiki, WordPress, Drupal, WHMCS, Kayako Support Suite&lt;br /&gt;
* Experience with installing and administering cPanel/WHM, Plesk, DirectAdmin, SolusVM, HyperVM, Webmin&lt;br /&gt;
* Experience with Amazon Web Services (AWS), Oracle Cloud Infrastructure (OCI)&lt;br /&gt;
&lt;br /&gt;
=== PROGRAMMING ===&lt;br /&gt;
* Coding: Java, C#, PHP, Python, HTML + CSS, Javascript (AngularJS, jQuery, nodejs), C, C++&lt;br /&gt;
* Scripting: bash/sh/sed/awk, Perl, Powershell&lt;br /&gt;
* Database: MySQL/MariaDB, SQLite, PostgreSQL, InfluxDB/flux&lt;br /&gt;
* Debug: C, C++, gdb, strace, Ghidra&lt;br /&gt;
&lt;br /&gt;
=== HARDWARE ===&lt;br /&gt;
* Experience with x86/x86_64/ARM64 Hardware&lt;br /&gt;
* Experience with server and workstation hardware from Sun, Dell, and HP&lt;br /&gt;
* Experience with IBM Tape Library (TS4500)&lt;br /&gt;
&lt;br /&gt;
== Education &amp;amp; Training ==&lt;br /&gt;
* Bachelor of Science in Computer Science, University of Calgary&lt;br /&gt;
** August 2007 - May 2011&lt;br /&gt;
** Notable courses: CPSC 550; System Administration with Darcy Grant, Wayne Pearson (2010 – 2011)&lt;br /&gt;
* Red Hat Satellite 6 Training&lt;br /&gt;
** Completed August 2018&lt;br /&gt;
* VMware vSphere 6.7&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
* VMware vRealize Automation 6.5&lt;br /&gt;
** Completed January 2019&lt;br /&gt;
&lt;br /&gt;
== Work History ==&lt;br /&gt;
=== HPC System Administrator - University of Calgary ===&lt;br /&gt;
December 2019 - December 2025, Full Time, http://ucalgary.ca/hpc&lt;br /&gt;
&lt;br /&gt;
I maintained and transitioned infrastructure operated at the Centre for Health Genomics and Informatics (CHGI) at the Cumming School of Medicine (CSM) to the centralized IT infrastructure, operated by Research Computing Services (RCS). &lt;br /&gt;
&lt;br /&gt;
Major accomplishments include:&lt;br /&gt;
* Transitioned our HPC cluster of over 1000 nodes to Ansible and automated configuration deployments. This helped speed up deployments and day-to-day tasks such as user creation and node deployment.&lt;br /&gt;
* Automated HPC cluster usage report generation from various sources for senior leadership and decision making&lt;br /&gt;
* Implemented and maintained Open OnDemand and custom containerized apps as a friendlier interface to RCS&#039;s HPC resources&lt;br /&gt;
* Implemented and maintained CloudStack as a private cloud solution; wrote training material and provided training sessions for users&lt;br /&gt;
* Onboarded researchers into the RCS-managed AWS Control Tower environment while ensuring institutional security compliance and governance&lt;br /&gt;
&lt;br /&gt;
My duties included:&lt;br /&gt;
* Operated, maintained, and troubleshot HPC clusters running Slurm&lt;br /&gt;
* Collected cluster metrics and hardware performance data, stored in InfluxDB and visualized with Grafana&lt;br /&gt;
* Provided researcher support for HPC, Open OnDemand, storage, CloudStack requests and issues&lt;br /&gt;
* Gathered researcher requirements and ensured deployed services met their operational needs&lt;br /&gt;
* Automated and streamlined operational processes with custom scripts and CI/CD pipelines&lt;br /&gt;
* Supported RCS&#039;s internal development team by providing guidance related to systems programming and development&lt;br /&gt;
&lt;br /&gt;
=== Infrastructure Automation &amp;amp; Containers Specialist - University of Calgary ===&lt;br /&gt;
December 2018 - December 2019, Full Time Limited Term, http://ucalgary.ca/it&lt;br /&gt;
&lt;br /&gt;
I was part of a newly created team of two tasked with building a new infrastructure automation system using VMware vRealize Automation, as well as a managed on-site Kubernetes-as-a-Service solution for the University of Calgary.&lt;br /&gt;
&lt;br /&gt;
Due to the early stage of both infrastructure automation and Kubernetes as a Service projects, my tasks revolved around requirements gathering, solution testing, ensuring compatibility with the University&#039;s existing infrastructure, and adhering to all change and release processes. &lt;br /&gt;
&lt;br /&gt;
Achievements include:&lt;br /&gt;
* Ensured the smooth operation of around 8 CoreOS Tectonic Kubernetes clusters&lt;br /&gt;
* Evaluated Pivotal Enterprise PKS and Red Hat OpenShift as a new managed Kubernetes solution&lt;br /&gt;
* Coordinated with our internal web team to transition the University&#039;s websites onto our Kubernetes infrastructure&lt;br /&gt;
* Built and migrated applications into a containerized environment&lt;br /&gt;
* Automated VM deployments on the University&#039;s private cloud&lt;br /&gt;
* Automated Linux deployments with VMware vRealize Automation and VMware vRealize Orchestrator with integration to Red Hat Satellite 6&lt;br /&gt;
* Documented change/release processes per IT policy for new service offerings&lt;br /&gt;
&lt;br /&gt;
=== Technical Systems Analyst for Linux Desktop - University of Calgary ===&lt;br /&gt;
July 2018 - November 2018, Full Time, http://ucalgary.ca/it&lt;br /&gt;
&lt;br /&gt;
I was a Linux systems analyst under the Customer Technology Standards group tasked with designing and building a standardized Linux desktop environment for the University of Calgary. This position required transitioning and unifying the disparate Linux desktop infrastructures across various departments into a single standardized solution managed by central IT. &lt;br /&gt;
&lt;br /&gt;
Over a period of a few short months, I successfully:&lt;br /&gt;
* Deployed TheForeman, an open source alternative to Red Hat Satellite, for unattended Linux deployments supporting Fedora &amp;amp; Kali Linux on both Intel and ARM based computers&lt;br /&gt;
* Wrote and integrated Puppet manifests with TheForeman for desired state configuration on all hosts&lt;br /&gt;
* Created and updated custom Linux based packages used by teaching and research&lt;br /&gt;
* Deployed 250+ Linux workstations for the Department of Computer Science&lt;br /&gt;
* Consulted with faculty members to ensure software requirements for teaching were met&lt;br /&gt;
* Created support materials on the new deployment system for technical staff in Computer Science&lt;br /&gt;
&lt;br /&gt;
=== Linux System Administrator - University of Calgary ===&lt;br /&gt;
October 2013 - July 2018, Full Time, http://ucalgary.ca/cpsc&lt;br /&gt;
&lt;br /&gt;
I was a Linux system administrator who oversaw over 400 Linux-based servers and workstations for the Department of Computer Science at the University of Calgary.&lt;br /&gt;
&lt;br /&gt;
My primary tasks were to provide a consistent and reliable computing environment for research and teaching by supporting all Linux-based servers and workstations along with their applications and services. I was also primarily responsible for the proper function and operation of the department&#039;s firewall and intrusion detection system.&lt;br /&gt;
&lt;br /&gt;
Secondary tasks included:&lt;br /&gt;
* Providing tier 2-3 support (via desk-side, phone, email, help desk) for faculty, staff, and students, including Windows and MacOS support&lt;br /&gt;
* Testing and applying updates to servers and services every month&lt;br /&gt;
* Monitoring, maintaining, and configuring hardware for desktops, servers, and switches&lt;br /&gt;
* Deploying new servers and workstations and disposing or relocating old computers as required&lt;br /&gt;
* Ensuring public infrastructure services are up to date and operational, including mirror.cpsc.ucalgary.ca, CPSC&#039;s public NTP servers, and one of four DNS nameservers for the University.&lt;br /&gt;
* Maintaining and updating a private package repository, including creating custom packages for servers and workstations&lt;br /&gt;
* Maintaining proper support documentation and end-user guides on services&lt;br /&gt;
* Consulting with faculty members to ensure software requirements for teaching are met&lt;br /&gt;
* Ensuring labs are using an updated and fully patched distribution every year&lt;br /&gt;
&lt;br /&gt;
Other achievements that I am proud of are:&lt;br /&gt;
* Integrated Linux workstations and servers with Windows Active Directory to streamline authentication&lt;br /&gt;
* Developed many in-house applications that are used by the Faculty of Science including:&lt;br /&gt;
** PokeAPI - A RESTful API that provides a unified facade over data from multiple sources including LDAP, the data warehouse, databases, and file servers&lt;br /&gt;
** User Telemetry - A set of tools to gather computer usage in our undergraduate labs on both Windows and Linux&lt;br /&gt;
** Displays - A web-based tool for managing contents for digital signage across the Faculty of Science powered by a combination of embedded devices and computers&lt;br /&gt;
** The CPSC Guide - A GUI tool that helps students do common tasks rather than through the command line&lt;br /&gt;
** CS Agreement - A Windows/Linux application that allows students to agree to an AUP before being allowed to use our computers&lt;br /&gt;
* Created an automated build system for workstations and servers that deploys machines through PXE and automatically configures their DNS and DHCP entries&lt;br /&gt;
* Deployed Logstash + ElasticSearch for internal logging of both Windows and Linux servers and workstations&lt;br /&gt;
* Installed and maintained services for teaching and research including WebCAT, OwnCloud, Moodle, cPanel+CloudOS, VMware ESXi&lt;br /&gt;
* Replaced an aging CFEngine automated configuration tool with SaltStack&lt;br /&gt;
* Rolled out ZFS file servers on undergraduate servers, enabling self-served file restore through snapshots&lt;br /&gt;
* Consolidated multiple servers and services into individual Docker containers&lt;br /&gt;
* Rebuilt the Raspberry Pi lab, reducing lost parts and clutter in the lab.&lt;br /&gt;
&lt;br /&gt;
Recognition:&lt;br /&gt;
* My team and I received the Faculty of Science Innovation and Change Award of Excellence in May 2015.&lt;br /&gt;
&lt;br /&gt;
=== C# Developer - Onstream Pipeline Inspection Ltd. ===&lt;br /&gt;
May 2011 – September 2013, Full Time, http://onstream-pipeline.com/ &lt;br /&gt;
&lt;br /&gt;
I was one of two C# developers at a pipeline inspection company, developing their new version of pipeline analysis software.&lt;br /&gt;
* Debugged existing in-house software written in Borland C++ on the Windows platform&lt;br /&gt;
* Wrote low-level hardware interfacing code for custom-designed inspection pigs&lt;br /&gt;
* Developed the next version of their analysis software from scratch using C# .NET4&lt;br /&gt;
&lt;br /&gt;
=== Freelance Web Developer &amp;amp; Designer – Steamr ===&lt;br /&gt;
June 2005 – 2014, Freelance, http://steamr.com/ &lt;br /&gt;
&lt;br /&gt;
I was a freelance web designer and developer who provided clients, primarily in North America, with custom web designs and applications. I was responsible for ensuring projects were completed on time and on budget and satisfied client requirements.&lt;br /&gt;
&lt;br /&gt;
A sample of my work can be found on my portfolio at: http://steamr.com/portfolio &lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Consulted with clients for every project for budget, timeline and user requirements and delivered frequent deliverables and project updates.&lt;br /&gt;
* Created and deployed custom web designs from scratch using Adobe Photoshop, Illustrator, Notepad++, and half a dozen web browsers. &lt;br /&gt;
* Ensured all websites were performant, standards compliant, cross-browser compatible, and search engine optimized (SEO) based on web analytics.&lt;br /&gt;
* Created and maintained over 20 projects under the LAMP stack, along with using PHP frameworks and libraries including: Kohana, CodeIgniter, FPDF, Smarty.&lt;br /&gt;
* Created over 40 individual websites from scratch for businesses and individuals.&lt;br /&gt;
* Customized design themes for WHMCS, SolusVM, Kayako Support Suite for hosting companies.&lt;br /&gt;
	&lt;br /&gt;
Related Work:&lt;br /&gt;
* Previous webmaster for GlobalTech Communications including: nworks.ca, gtcomm.net, unmeteredserver.net, zenprotection.com&lt;br /&gt;
* Responsible for updating content and promotions while ensuring the websites correctly interfaced with the point-of-sale application.&lt;br /&gt;
* Developed an internal network management application for gtcomm.net in PHP utilizing MRTG. Abnormal network patterns would notify an administrator via email and SMS.&lt;br /&gt;
* Created a distributed network monitoring system based on Java, PHP, and MySQL, which used HTTP and ping tests to determine the availability and latency of remote services, recorded the results, and generated monthly reports.&lt;br /&gt;
* Designed the front-end of an in-house inventory system used by gtcomm.net to track networking/server hardware and manage available IP pools.&lt;br /&gt;
* Designed the front-end of an in-house project enabling end users to purchase and manage DDoS protection services by zenprotection.net.&lt;br /&gt;
&lt;br /&gt;
=== Server Administrator - SwiftHost.net ===&lt;br /&gt;
January 2004 – May 2012, http://swifthost.net/ &lt;br /&gt;
&lt;br /&gt;
With a partner, I created a web hosting company focusing on affordable shared and reseller hosting. Servers ran the LAMP stack along with the cPanel control panel.&lt;br /&gt;
&lt;br /&gt;
Responsibilities: &lt;br /&gt;
* Managed production servers hosting over 400 individual websites&lt;br /&gt;
* Responsible for responding to customer support tickets within 6 hours for issues under the technical, abuse, and billing departments.&lt;br /&gt;
* Handled technical issues ranging from fixing simple permission problems to requests for specific software such as Apache/PHP modules, system libraries, and caching/PHP accelerators&lt;br /&gt;
* Performed security audits and server hardening by disabling or restricting services, configuring the firewall, detecting compromised websites and spam control in a shared hosting environment.&lt;br /&gt;
* Configured weekly, network-based backups to an off-site location using cronjobs, rsync and shell scripts. Performed account or file specific restores from backups.&lt;br /&gt;
	&lt;br /&gt;
Projects:&lt;br /&gt;
* Designed a custom billing solution for account management using PHP, MySQL and the PayPal API.&lt;br /&gt;
* Built a status page that monitored and reported live server status information (processor load, traffic load, service availability, memory usage) to customers.&lt;br /&gt;
* Developed a custom web control panel, similar to cPanel, targeted at free hosting accounts. The control panel, based on PHP, MySQL, and shell and Perl scripts, enabled end users to manage their websites, domains, and databases while automatically maintaining and enforcing disk and bandwidth quotas.&lt;br /&gt;
&lt;br /&gt;
=== AS/400 Night Operator - Calgary Programming Factory ===&lt;br /&gt;
May 2007 – July 2007, Part Time&lt;br /&gt;
&lt;br /&gt;
I worked as a night operator on the AS/400 support team at the Calgary Programming Factory for The Sovereign General Insurance Company as a summer job.&lt;br /&gt;
&lt;br /&gt;
Responsibilities:&lt;br /&gt;
* Operated the tape backup system, ensuring backups were valid and without errors.&lt;br /&gt;
* Operated the AS/400 system to print system logs and claim checks&lt;br /&gt;
* Ensured paper was loaded and aligned properly to the printers before printing&lt;br /&gt;
* Researched the feasibility of migrating their existing JSP servlet projects to ASP.NET&lt;br /&gt;
&lt;br /&gt;
== Personal ==&lt;br /&gt;
* Canadian; Speaks English &amp;amp; Cantonese&lt;br /&gt;
* Interest in anything Linux related, System Infrastructure, Systems Security, Game Development, Electronics &amp;amp; Tinkering&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Warewulf&amp;diff=7712</id>
		<title>Warewulf</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Warewulf&amp;diff=7712"/>
		<updated>2025-10-09T21:16:11Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Warewulf is an open-source Linux cluster provisioning system. It is typically used for HPC cluster applications, but can also be used for managing environments that need to boot multiple nodes (such as a VM infrastructure environment). Warewulf provides a lightweight and flexible way to boot diskless nodes via PXE and manage node images centrally from a head node.&lt;br /&gt;
&lt;br /&gt;
== Common Commands ==&lt;br /&gt;
&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Purpose&lt;br /&gt;
! Command&lt;br /&gt;
|-&lt;br /&gt;
| Check Warewulf node status&lt;br /&gt;
| {{code|wwctl node status}}&lt;br /&gt;
|-&lt;br /&gt;
| View node details&lt;br /&gt;
| {{code|wwctl node list -a &amp;lt;nodename&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
| Create a new node&lt;br /&gt;
| {{code|wwctl node add &amp;lt;nodename&amp;gt; --ipaddr &amp;lt;ip&amp;gt; --hwaddr &amp;lt;mac&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
| Set node to use disk as /tmp&lt;br /&gt;
| &amp;lt;code&amp;gt;wwctl node set &amp;lt;nodename&amp;gt; --diskname /dev/sda --diskwipe --partname tmp --partcreate --fsname tmp --fsformat ext4 --fspath /tmp --fswipe&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| Set boot image for a node&lt;br /&gt;
| {{code|wwctl node set &amp;lt;nodename&amp;gt; --image &amp;lt;image&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
| Build a node image&lt;br /&gt;
| {{code|wwctl image build &amp;lt;image&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
| Import a container image&lt;br /&gt;
| {{code|wwctl image import docker://rockylinux:9 rocky9}}&lt;br /&gt;
|-&lt;br /&gt;
| Push configuration (overlays) to nodes&lt;br /&gt;
| {{code|wwctl overlay build}}&lt;br /&gt;
|}&lt;br /&gt;
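&lt;br /&gt;
As a quick example, the commands above can be combined to register a new node and push the updated configuration out. The node name, IP address, and MAC address below are placeholders; adjust them for your network:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = terminal&lt;br /&gt;
| code = ## Register a new node (example values)&lt;br /&gt;
# wwctl node add node02 --ipaddr 172.19.0.102 --hwaddr 02:01:01:58:00:07&lt;br /&gt;
&lt;br /&gt;
## Assign a boot image, then rebuild the overlays so the node picks up the change&lt;br /&gt;
# wwctl node set node02 --image rocky9&lt;br /&gt;
# wwctl overlay build&lt;br /&gt;
}}&lt;br /&gt;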
&lt;br /&gt;
== Common Tasks ==&lt;br /&gt;
&lt;br /&gt;
===Install and Configure Warewulf===&lt;br /&gt;
&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = terminal&lt;br /&gt;
| code = # wget https://github.com/warewulf/warewulf/releases/download/v4.6.4/warewulf-4.6.4-1.el9.x86_64.rpm&lt;br /&gt;
# dnf -y install warewulf-4.6.4-1.el9.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
## Configure warewulf in /etc/warewulf/warewulf.conf&lt;br /&gt;
## Ensure that the IP addresses are set to something that works for your network.&lt;br /&gt;
# vi /etc/warewulf/warewulf.conf&lt;br /&gt;
&lt;br /&gt;
## Set up the first boot image and configure the first node&lt;br /&gt;
# wwctl image import docker://ghcr.io/warewulf/warewulf-rockylinux:9 rockylinux-9 --build&lt;br /&gt;
# systemctl enable --now warewulfd&lt;br /&gt;
# wwctl profile set default --image rockylinux-9&lt;br /&gt;
# wwctl node add node01  --ipaddr=172.19.0.101 --hwaddr=02:01:01:58:00:06 --gateway 172.19.0.1&lt;br /&gt;
# wwctl configure --all&lt;br /&gt;
}}&lt;br /&gt;
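&lt;br /&gt;
The settings that most often need adjusting in &amp;lt;code&amp;gt;warewulf.conf&amp;lt;/code&amp;gt; are the server address and the DHCP range. A minimal fragment, assuming the 172.19.0.0/24 network used above (the exact keys can vary between Warewulf releases, so compare against the shipped default file), might look like:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = ipaddr: 172.19.0.1&lt;br /&gt;
netmask: 255.255.255.0&lt;br /&gt;
network: 172.19.0.0&lt;br /&gt;
dhcp:&lt;br /&gt;
  enabled: true&lt;br /&gt;
  range start: 172.19.0.100&lt;br /&gt;
  range end: 172.19.0.200&lt;br /&gt;
nfs:&lt;br /&gt;
  enabled: true&lt;br /&gt;
}}&lt;br /&gt;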
&lt;br /&gt;
At this point, you should have DHCP and TFTP servers running on the system. An NFS server should also be running if you enabled it in the configuration file.&lt;br /&gt;
&lt;br /&gt;
Verify the state of warewulf by running:&lt;br /&gt;
{{highlight|lang=terminal|code=&lt;br /&gt;
# systemctl status warewulfd&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== No network gateway ===&lt;br /&gt;
There doesn&#039;t seem to be a way to define gateways in Warewulf&#039;s managed DHCP.&lt;br /&gt;
&lt;br /&gt;
Edit &amp;lt;code&amp;gt;/etc/warewulf/warewulf.conf&amp;lt;/code&amp;gt; and set the DHCP template to &#039;static&#039;, then run &amp;lt;code&amp;gt;wwctl overlay edit host etc/dhcpd.conf.ww&amp;lt;/code&amp;gt; and add the following within the host definition.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = { {- if $netdevs.Gateway} }&lt;br /&gt;
    option routers { {$netdevs.Gateway} };&lt;br /&gt;
    { {- end } }&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Warewulf&amp;diff=7711</id>
		<title>Warewulf</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Warewulf&amp;diff=7711"/>
		<updated>2025-10-07T22:13:12Z</updated>

		<summary type="html">&lt;p&gt;Leo: Created page with &amp;quot;Warewulf is an open-source Linux cluster provisioning system. It is typically used for HPC cluster applications, but can also be used for managing environments that need to boot multiple nodes (such as a VM infrastructure environment). Warewulf provides a lightweight and flexible way to boot diskless nodes via PXE and manage node images centrally from a head node.  == Common Commands ==  {|class=&amp;quot;wikitable&amp;quot; ! Purpose                           ! Command |- | Check Warewul...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Warewulf is an open-source Linux cluster provisioning system. It is typically used for HPC cluster applications, but can also be used for managing environments that need to boot multiple nodes (such as a VM infrastructure environment). Warewulf provides a lightweight and flexible way to boot diskless nodes via PXE and manage node images centrally from a head node.&lt;br /&gt;
&lt;br /&gt;
== Common Commands ==&lt;br /&gt;
&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Purpose                          &lt;br /&gt;
! Command&lt;br /&gt;
|-&lt;br /&gt;
| Check Warewulf node status            &lt;br /&gt;
| {{code|wwctl node status}}&lt;br /&gt;
|-&lt;br /&gt;
| View node details                &lt;br /&gt;
| {{code|wwctl node show &amp;lt;nodename&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
| Create a new node                &lt;br /&gt;
| {{code|wwctl node add &amp;lt;nodename&amp;gt; --ipaddr &amp;lt;ip&amp;gt; --hwaddr &amp;lt;mac&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
| Set boot image for a node        &lt;br /&gt;
| {{code|wwctl node set &amp;lt;nodename&amp;gt; --image &amp;lt;image&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
| Build VNFS (node image)          &lt;br /&gt;
| {{code|wwctl image build &amp;lt;image&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
&lt;br /&gt;
| Import container image           &lt;br /&gt;
| {{code|wwctl image import docker://rockylinux:9 rocky9}}&lt;br /&gt;
|-&lt;br /&gt;
| Push configuration to nodes      &lt;br /&gt;
| {{code|wwctl overlay build}}&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Common Tasks ==&lt;br /&gt;
&lt;br /&gt;
===Install and Configure Warewulf===&lt;br /&gt;
&lt;br /&gt;
{{highlight|lang=terminal|code=&lt;br /&gt;
# wget https://github.com/warewulf/warewulf/releases/download/v4.6.4/warewulf-4.6.4-1.el9.x86_64.rpm&lt;br /&gt;
# dnf -y install warewulf-4.6.4-1.el9.x86_64.rpm&lt;br /&gt;
&lt;br /&gt;
## Configure warewulf in /etc/warewulf/warewulf.conf&lt;br /&gt;
## Ensure that the IP addresses are set to something that works for your network.&lt;br /&gt;
# vi /etc/warewulf/warewulf.conf&lt;br /&gt;
&lt;br /&gt;
## Set up the first boot image and configure the first node&lt;br /&gt;
# wwctl image import docker://ghcr.io/warewulf/warewulf-rockylinux:9 rockylinux-9 --build&lt;br /&gt;
# systemctl enable --now warewulfd&lt;br /&gt;
# wwctl profile set default --image rockylinux-9&lt;br /&gt;
# wwctl node add node01  --ipaddr=172.19.0.101 --hwaddr=02:01:01:58:00:06&lt;br /&gt;
# wwctl configure --all&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
At this point, you should have DHCP and TFTP servers running on the system. The NFS server should also be running if you enabled it in the configuration file.&lt;br /&gt;
&lt;br /&gt;
Verify the state of Warewulf by running:&lt;br /&gt;
{{highlight|lang=terminal|code=&lt;br /&gt;
# systemctl status warewulfd&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=NixOS_inside_LXC_on_Proxmox&amp;diff=7710</id>
		<title>NixOS inside LXC on Proxmox</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=NixOS_inside_LXC_on_Proxmox&amp;diff=7710"/>
		<updated>2025-09-18T17:57:09Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page will go over how you can set up and run NixOS in LXC on Proxmox. &lt;br /&gt;
&lt;br /&gt;
The instructions here are based on the following resources:&lt;br /&gt;
&lt;br /&gt;
* https://nixos.wiki/wiki/Proxmox_Linux_Container&lt;br /&gt;
* https://blog.xirion.net/posts/nixos-proxmox-lxc/&lt;br /&gt;
&lt;br /&gt;
== Guide ==&lt;br /&gt;
&lt;br /&gt;
=== Step 1: Obtain the container tarball ===&lt;br /&gt;
Find and download a recently generated NixOS container tarball from https://hydra.nixos.org/job/nixos/trunk-combined/nixos.containerTarball.x86_64-linux. Place the &amp;lt;code&amp;gt;.tar.xz&amp;lt;/code&amp;gt; archive in your CT Volumes store in Proxmox (which is typically located under &amp;lt;code&amp;gt;/var/lib/vz/template/cache/&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
=== Step 2: Create the container ===&lt;br /&gt;
In Proxmox shell, create the container with the following command:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # pct create 300 --arch amd64 --description nixos --ostype unmanaged \&lt;br /&gt;
  --net0 name=eth0  --storage local-lvm --unprivileged 1 \&lt;br /&gt;
  local:vztmpl/nixos-system-x86_64-linux.tar.xz&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
In the PVE web interface, enable nesting for the container. Nix requires this: without nesting, the nix-daemon will have issues remounting &amp;lt;code&amp;gt;/nix/store&amp;lt;/code&amp;gt; or setting up namespaces.&lt;br /&gt;
&lt;br /&gt;
You may optionally grow the storage (it defaults to 4 GB, which may not be enough) and adjust the other resource allocations before starting the container.&lt;br /&gt;
&lt;br /&gt;
=== Step 3: Start the container and configure it ===&lt;br /&gt;
Start the CT. The console will be blank. We&#039;ll fix this shortly. However, to connect to our container in the current state, we&#039;ll have to use the Proxmox shell and run:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # lxc-attach --name 300&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Note that because we dropped into the container without any of the environment variables set, nothing other than your shell will work. To fix this, update your path with the NixOS bin path and start bash:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = sh-5.2# PATH=$PATH:/run/current-system/sw/bin/&lt;br /&gt;
sh-5.2# bash&lt;br /&gt;
[root@nixos:~]#&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
That&#039;s better! Next, we&#039;ll fix the blank console by making the getty on tty1 work. Add the following lines to &amp;lt;code&amp;gt;/etc/nixos/configuration.nix&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # Suppress systemd units that don&#039;t work because of LXC&lt;br /&gt;
  systemd.suppressedSystemUnits = [&lt;br /&gt;
    &amp;quot;dev-mqueue.mount&amp;quot;&lt;br /&gt;
    &amp;quot;sys-kernel-debug.mount&amp;quot;&lt;br /&gt;
    &amp;quot;sys-fs-fuse-connections.mount&amp;quot;&lt;br /&gt;
  ];&lt;br /&gt;
&lt;br /&gt;
  # start tty1 on serial console&lt;br /&gt;
  systemd.services.&amp;quot;getty@tty1&amp;quot; = {&lt;br /&gt;
    enable = true;&lt;br /&gt;
    wantedBy = [ &amp;quot;getty.target&amp;quot; ]; # to start at boot&lt;br /&gt;
    serviceConfig.Restart = &amp;quot;always&amp;quot;; # restart when session is closed&lt;br /&gt;
    serviceConfig.ExecStart = [&amp;quot;&amp;quot; &amp;quot;@${pkgs.util-linux}/sbin/agetty agetty --login-program ${config.services.getty.loginProgram} --noclear --keep-baud %I 115200,38400,9600 $TERM&amp;quot;];&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  environment.systemPackages = with pkgs; [&lt;br /&gt;
    vim # Do not forget to add an editor to edit configuration.nix! The Nano editor is also installed by default.&lt;br /&gt;
    binutils&lt;br /&gt;
  ];&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Then run &amp;lt;code&amp;gt;nixos-rebuild switch&amp;lt;/code&amp;gt; to update.&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== Trouble with nix-channel --update ===&lt;br /&gt;
If you did not enable nesting in the CT options, you will get: &amp;lt;code&amp;gt;unexpected Nix daemon error: error: remounting /nix/store writable: Permission denied&amp;lt;/code&amp;gt;.&lt;br /&gt;
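To fix this, enable the nesting feature for the container. Assuming CT ID 300 from the example above, the relevant line in the container's Proxmox config file looks like this (fragment only):

```
# /etc/pve/lxc/300.conf (fragment; 300 is the CT ID used earlier on this page)
features: nesting=1
```

The same can be toggled from the Proxmox UI under the container's Options, or with pct set 300 --features nesting=1; restart the CT afterwards.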
&lt;br /&gt;
=== NixOS 25.05 does not have a getty.target ===&lt;br /&gt;
After updating to NixOS 25.05, I get this error in journalctl when I try to start &amp;lt;code&amp;gt;getty@tty1&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = systemd[1]: getty@tty1.service: Service has no ExecStart=, ExecStop=, or SuccessAction=. Refusing.&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
The configuration I had was: &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = systemd.services.&amp;quot;getty@tty1&amp;quot; = {&lt;br /&gt;
    enable = lib.mkForce true;&lt;br /&gt;
    wantedBy = [ &amp;quot;getty.target&amp;quot; ]; # to start at boot&lt;br /&gt;
    serviceConfig.Restart = &amp;quot;always&amp;quot;; # restart when session is closed&lt;br /&gt;
  };&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
I&#039;m not totally sure why &amp;lt;code&amp;gt;getty.target&amp;lt;/code&amp;gt; is now missing. As a result, you have to specify the getty explicitly by adding the ExecStart line within the service definition:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = serviceConfig.ExecStart = [&amp;quot;&amp;quot; &amp;quot;@${pkgs.util-linux}/sbin/agetty agetty --login-program ${config.services.getty.loginProgram} --noclear --keep-baud %I 115200,38400,9600 $TERM&amp;quot;];&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Frigate&amp;diff=7709</id>
		<title>Frigate</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Frigate&amp;diff=7709"/>
		<updated>2025-08-23T21:07:05Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Frigate is a self-hosted, open-source video surveillance system.&lt;br /&gt;
&lt;br /&gt;
It is capable of:&lt;br /&gt;
&lt;br /&gt;
* streaming video from RTSP/RTMP&lt;br /&gt;
* performing real-time object detection or simply just motion detection&lt;br /&gt;
* keeping track of video clips and image snapshots with a retention policy&lt;br /&gt;
* MQTT integration for messaging events&lt;br /&gt;
&lt;br /&gt;
This page will contain some of my notes on using Frigate.&lt;br /&gt;
&lt;br /&gt;
== Setup ==&lt;br /&gt;
Frigate is distributed as a Docker image only and has to run as a Docker container. &lt;br /&gt;
&lt;br /&gt;
=== Object detection ===&lt;br /&gt;
The object detection system requires a detector to be set up. By default, Frigate enables a CPU detector. This isn&#039;t recommended, as object detection on the CPU is slow and burns a lot of CPU cycles. Frigate recommends a USB Google Coral for this task, but you may also choose to use an NVIDIA GPU with TensorRT (supported on post-Pascal GPUs only).&lt;br /&gt;
&lt;br /&gt;
Due to the chip shortage, Google Corals aren&#039;t easy to come by. The best alternative is to pick up a cheap NVIDIA Quadro GPU such as the P400. I got a second-hand P400 for around $80 CAD.&lt;br /&gt;
&lt;br /&gt;
==== ONNX ====&lt;br /&gt;
ONNX now replaces TensorRT. The simplest way to get this working with an NVIDIA card is to change your object detection to use onnx:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = detectors:&lt;br /&gt;
  onnx:&lt;br /&gt;
    type: onnx&lt;br /&gt;
model:&lt;br /&gt;
  model_type: yolonas&lt;br /&gt;
  width: 320&lt;br /&gt;
  height: 320&lt;br /&gt;
  input_pixel_format: bgr&lt;br /&gt;
  input_tensor: nchw&lt;br /&gt;
  path: /config/yolo_nas_s.onnx&lt;br /&gt;
  labelmap_path: /labelmap/coco-80.txt&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
To get the model, you will have to build and then download the model. This can be done using Google Colab: https://colab.research.google.com/github/blakeblackshear/frigate/blob/dev/notebooks/YOLO_NAS_Pretrained_Export.ipynb&lt;br /&gt;
&lt;br /&gt;
Open the Jupyter notebook and then run all the cells. It takes a while for the prerequisites to build, but you should eventually be given the model file that you can plop into Frigate.&lt;br /&gt;
&lt;br /&gt;
==== TensorRT ====&lt;br /&gt;
&#039;&#039;&#039;This is no longer supported -- you must use ONNX going forward.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
See the documentation on this at: https://docs.frigate.video/configuration/object_detectors/#nvidia-tensorrt-detector&lt;br /&gt;
&lt;br /&gt;
According to the official documentation, there is a long list of models that can be used. I find that the latest YOLO models work best, but they run slower and require more GPU memory. The YOLOv4-tiny model is very fast and light on GPU memory, but it had trouble detecting cars in my garage. I eventually settled on the YOLOv7-320 model, which is more accurate but slower; we&#039;ll set it up now.&lt;br /&gt;
&lt;br /&gt;
First, we&#039;ll have to convert the YOLO models into a TensorRT model that works with our GPU. Keep in mind that the TensorRT &amp;lt;code&amp;gt;.trt&amp;lt;/code&amp;gt; model file has to be built on the same GPU that you will use for detection. We&#039;ll generate the model file with NVIDIA&#039;s tensorrt container image:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Store the output models and the tensorrt_demos repo&lt;br /&gt;
# mkdir trt-models tensorrt_demos&lt;br /&gt;
&lt;br /&gt;
## Launch the tensorrt container&lt;br /&gt;
# docker run --gpus=all -it -v `pwd`/trt-models:/tensorrt_models -v `pwd`/tensorrt_demos:/tensorrt_demos -v `pwd`/tensorrt_models.sh:/tensorrt_models.sh nvcr.io/nvidia/tensorrt:22.07-py3 bash&lt;br /&gt;
&lt;br /&gt;
## Download the tensorrt_models.sh script from Frigate and run it. This will download all the YOLO models and then generate the .trt model file.&lt;br /&gt;
## If you _don&#039;t_ want to download all the YOLO models, you&#039;ll have to interrupt the script and edit the &#039;download.yolo.sh&#039; script manually.&lt;br /&gt;
container# wget https://github.com/blakeblackshear/frigate/raw/master/docker/tensorrt_models.sh&lt;br /&gt;
container# YOLO_MODELS=&amp;quot;yolov7-320&amp;quot; bash tensorrt_models.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
With the &amp;lt;code&amp;gt;yolov7-320.trt&amp;lt;/code&amp;gt; file generated, we&#039;ll configure the TensorRT based detector with the following configuration lines:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = detectors:&lt;br /&gt;
  tensorrt:&lt;br /&gt;
    type: tensorrt&lt;br /&gt;
model:&lt;br /&gt;
  path: /trt-models/yolov7-320.trt&lt;br /&gt;
  width: 320&lt;br /&gt;
  height: 320&lt;br /&gt;
  input_tensor: nchw&lt;br /&gt;
  input_pixel_format: rgb&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
If you are using a different model, you&#039;ll have to make sure the width/height sizes match what was used to train the model for best results.&lt;br /&gt;
&lt;br /&gt;
=== Tuning ===&lt;br /&gt;
If you&#039;re getting false positives, you may want to adjust the filter thresholds to help tune out the error rates. &lt;br /&gt;
&lt;br /&gt;
For example:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = cameras:&lt;br /&gt;
  front:&lt;br /&gt;
...&lt;br /&gt;
    detect:&lt;br /&gt;
      enabled: True&lt;br /&gt;
    objects:&lt;br /&gt;
      track:&lt;br /&gt;
        - person&lt;br /&gt;
        - bus&lt;br /&gt;
      filters:&lt;br /&gt;
        person:&lt;br /&gt;
          min_score: 0.6 # min score for object to initiate tracking (default: 0.5)&lt;br /&gt;
          threshold: 0.8 # min decimal percentage for tracked object&#039;s computed score to be considered a true positive (default: 0.7)&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
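To make the two knobs above concrete, here is a plain-shell illustration (not Frigate code) of where detection scores land relative to min_score 0.6 and threshold 0.8:

```shell
# Classify some example detection scores against the person filter above:
# scores below min_score never start tracking; a tracked object only counts
# as a true positive once its computed score reaches threshold.
for score in 0.55 0.65 0.85; do
  awk -v s="$score" 'BEGIN {
    if (s >= 0.8)      print s ": true positive"
    else if (s >= 0.6) print s ": tracked, below threshold"
    else               print s ": ignored (below min_score)"
  }'
done
```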
Alternatively, if you&#039;re using a GPU, you may want to swap out the TensorRT model for another one and see if that helps with the detection.&lt;br /&gt;
&lt;br /&gt;
You may also want to enable object masks to mask out areas that are causing the false positives.&lt;br /&gt;
&lt;br /&gt;
=== Go2rtc ===&lt;br /&gt;
go2rtc is a built-in service in the Frigate container that may be used to stream RTSP video from all your cameras. This way, your ffmpeg processes will stream from the internal go2rtc server rather than from the cameras directly. The added bonus with using this is that you may use the webrtc option in the Frigate web interface for higher FPS video streams.&lt;br /&gt;
&lt;br /&gt;
My configuration with my Wyze Cams running the RTSP beta firmware looks like this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    entrance: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    backyard: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    front: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    garage: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.x.x:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== General tips ==&lt;br /&gt;
Some tips I wish I had known earlier when using Frigate.&lt;br /&gt;
&lt;br /&gt;
* Use a GPU or Coral if possible. The CPU detector works, but it uses a lot of CPU cycles, which in turn means higher power usage. A cheap Quadro is both faster and more efficient than a CPU. Plus, a GPU lets you use hardware acceleration for video encoding (with ffmpeg), which also reduces your CPU usage.&lt;br /&gt;
* Object detection only happens when Frigate detects motion (that is, changed pixels) in a frame. It then sends the portion of the frame that it detected motion on for object detection. As a result, using motion masks on areas you aren&#039;t interested in can reduce the amount of object detection being performed.&lt;br /&gt;
* If you want to save clips when a specific object appears (such as &#039;person&#039; appearing in the garage) while also wanting Frigate to keep track of persistent objects but not record them (such as a &#039;car&#039; in the garage), the best approach I&#039;ve found is to create two cameras in Frigate: one for &#039;person&#039; with clip recording enabled and another for &#039;car&#039; with clip recording disabled.&lt;br /&gt;
** This was done so that I can count how many &#039;cars&#039; are in the garage for Home Assistant to act on. I do not want constant video clips of a stationary &#039;car&#039;, however.&lt;br /&gt;
* Enable go2rtc. You&#039;ll get higher framerates when viewing live video from the Frigate web interface, which won&#039;t be limited to the object detection fps.&lt;br /&gt;
* Enabling object detection will automatically have the camera report the objects in Home Assistant (also via MQTT). If you&#039;re monitoring for &#039;car&#039;, you&#039;ll get a &#039;car&#039; count in Home Assistant which you can automate on.&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== ffmpeg frequently OOMs ===&lt;br /&gt;
One camera&#039;s ffmpeg process periodically chews up all the memory assigned to the container and then gets OOM killed. This [https://github.com/blakeblackshear/frigate/issues/5924 issue seems to happen to others] as well.&lt;br /&gt;
&lt;br /&gt;
Things that I tried which did not help with the OOM are:&lt;br /&gt;
&lt;br /&gt;
* Disabled NVIDIA hwaccel flags to ffmpeg - ffmpeg continued to OOM periodically&lt;br /&gt;
* Updated go2rtc from 1.2.0 to 1.6.2 - no change&lt;br /&gt;
&lt;br /&gt;
What did stop the OOMs completely was having the Frigate camera config stream from the camera directly rather than through go2rtc.&lt;br /&gt;
&lt;br /&gt;
Speculation: all the other cameras are Wyze Cam v3 units with the same firmware, and they have no issues. The only difference I can think of is a weak Wi-Fi connection causing unexpected latency or corruption that sends go2rtc and ffmpeg haywire.&lt;br /&gt;
&lt;br /&gt;
=== Frigate migration crashes ===&lt;br /&gt;
After my container image automatically updated, Frigate now crashes on startup when it tries to do a migration.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2025-08-23 11:16:05.485451924  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Starting migrations&lt;br /&gt;
2025-08-23 11:16:05.512447249  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Migrate &amp;quot;030_create_user_review_status&amp;quot;&lt;br /&gt;
2025-08-23 11:16:05.512645584  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : sql (&#039;\n        CREATE TABLE IF NOT EXISTS &amp;quot;userreviewstatus&amp;quot; (\n            &amp;quot;id&amp;quot; INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,\n            &amp;quot;user_id&amp;quot; VARCHAR(30)&lt;br /&gt;
 NOT NULL,\n            &amp;quot;review_segment_id&amp;quot; VARCHAR(30) NOT NULL,\n            &amp;quot;has_been_reviewed&amp;quot; INTEGER NOT NULL DEFAULT 0,\n            FOREIGN KEY (&amp;quot;review_segment_id&amp;quot;) REFERENCES &amp;quot;reviewsegment&amp;quot; (&amp;quot;id&amp;quot;) ON DELETE CASCADE\n        )\n        &#039;,)&lt;br /&gt;
2025-08-23 11:16:05.513106209  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : sql (&#039;CREATE UNIQUE INDEX IF NOT EXISTS &amp;quot;userreviewstatus_user_segment&amp;quot; ON &amp;quot;userreviewstatus&amp;quot; (&amp;quot;user_id&amp;quot;, &amp;quot;review_segment_id&amp;quot;)&#039;,)&lt;br /&gt;
2025-08-23 11:16:05.513623581  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Run &amp;lt;lambda&amp;gt;&lt;br /&gt;
2025-08-23 11:16:05.553604917  Traceback (most recent call last):&lt;br /&gt;
2025-08-23 11:16:05.553628185    File &amp;quot;/usr/local/lib/python3.11/dist-packages/peewee.py&amp;quot;, line 3322, in execute_sql&lt;br /&gt;
2025-08-23 11:16:05.553725739  [2025-08-23 11:16:05] peewee_migrate.logs            ERROR   : Migration failed: 030_create_user_review_status&lt;br /&gt;
2025-08-23 11:16:05.553728331  Traceback (most recent call last):&lt;br /&gt;
2025-08-23 11:16:05.553730349    File &amp;quot;/usr/local/lib/python3.11/dist-packages/peewee.py&amp;quot;, line 3322, in execute_sql&lt;br /&gt;
2025-08-23 11:16:05.553731736      cursor.execute(sql, params or ())&lt;br /&gt;
2025-08-23 11:16:05.553734113  pysqlite3.dbapi2.IntegrityError: UNIQUE constraint failed: userreviewstatus.user_id, userreviewstatus.review_segment_id&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
It appears that the update creates a new userreviewstatus table that references reviewsegment, with a unique constraint on (user_id, review_segment_id). Maybe something happened on my instance that produced a duplicate review_segment_id, and now the migration is failing.&lt;br /&gt;
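Before deleting anything, you can check whether such duplicates actually exist. A sketch (table and column names are taken from the migration log above; run it against a backup copy of frigate.db, and note the table may be absent if the migration rolled back):

```shell
# List (user_id, review_segment_id) pairs that appear more than once in
# userreviewstatus; these are the rows that would violate the new unique index.
# Work on a backup copy of frigate.db, not the live database.
sqlite3 frigate.db 'SELECT user_id, review_segment_id, COUNT(*)
  FROM userreviewstatus
  GROUP BY user_id, review_segment_id
  HAVING COUNT(*) > 1;'
```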
&lt;br /&gt;
The fix here is to:&lt;br /&gt;
&lt;br /&gt;
# Stop frigate&lt;br /&gt;
# Edit the sqlite database: &amp;lt;code&amp;gt;sqlite3 frigate.db&amp;lt;/code&amp;gt;&lt;br /&gt;
# Delete the duplicate record in the reviewsegment table. I don&#039;t really care about the review feature, so I just ran &amp;lt;code&amp;gt;delete from reviewsegment&amp;lt;/code&amp;gt;.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Frigate&amp;diff=7708</id>
		<title>Frigate</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Frigate&amp;diff=7708"/>
		<updated>2025-08-23T20:49:51Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Frigate is a self-hosted, open-source video surveillance system.&lt;br /&gt;
&lt;br /&gt;
It is capable of:&lt;br /&gt;
&lt;br /&gt;
* streaming video from RTSP/RTMP&lt;br /&gt;
* performing real-time object detection or simply just motion detection&lt;br /&gt;
* keeping track of video clips and image snapshots with a retention policy&lt;br /&gt;
* MQTT integration for messaging events&lt;br /&gt;
&lt;br /&gt;
This page will contain some of my notes on using Frigate.&lt;br /&gt;
&lt;br /&gt;
== Setup ==&lt;br /&gt;
Frigate is distributed as a Docker image only and has to run as a Docker container. &lt;br /&gt;
&lt;br /&gt;
=== Object detection ===&lt;br /&gt;
The object detection system requires a detector to be set up. By default, Frigate enables a CPU detector. This isn&#039;t recommended, as object detection on the CPU is slow and burns a lot of CPU cycles. Frigate recommends a USB Google Coral for this task, but you may also choose to use an NVIDIA GPU with TensorRT (supported on post-Pascal GPUs only).&lt;br /&gt;
&lt;br /&gt;
Due to the chip shortage, Google Corals aren&#039;t easy to come by. The best alternative is to pick up a cheap NVIDIA Quadro GPU such as the P400. I got a second-hand P400 for around $80 CAD.&lt;br /&gt;
&lt;br /&gt;
==== TensorRT ====&lt;br /&gt;
See the documentation on this at: https://docs.frigate.video/configuration/object_detectors/#nvidia-tensorrt-detector&lt;br /&gt;
&lt;br /&gt;
According to the official documentation, there is a long list of models that can be used. I find that the latest YOLO models work best, but they run slower and require more GPU memory. The YOLOv4-tiny model is very fast and light on GPU memory, but it had trouble detecting cars in my garage. I eventually settled on the YOLOv7-320 model, which is more accurate but slower; we&#039;ll set it up now.&lt;br /&gt;
&lt;br /&gt;
First, we&#039;ll have to convert the YOLO models into a TensorRT model that works with our GPU. Keep in mind that the TensorRT &amp;lt;code&amp;gt;.trt&amp;lt;/code&amp;gt; model file has to be built on the same GPU that you will use for detection. We&#039;ll generate the model file with NVIDIA&#039;s tensorrt container image:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Store the output models and the tensorrt_demos repo&lt;br /&gt;
# mkdir trt-models tensorrt_demos&lt;br /&gt;
&lt;br /&gt;
## Launch the tensorrt container&lt;br /&gt;
# docker run --gpus=all -it -v `pwd`/trt-models:/tensorrt_models -v `pwd`/tensorrt_demos:/tensorrt_demos -v `pwd`/tensorrt_models.sh:/tensorrt_models.sh nvcr.io/nvidia/tensorrt:22.07-py3 bash&lt;br /&gt;
&lt;br /&gt;
## Download the tensorrt_models.sh script from Frigate and run it. This will download all the YOLO models and then generate the .trt model file.&lt;br /&gt;
## If you _don&#039;t_ want to download all the YOLO models, you&#039;ll have to interrupt the script and edit the &#039;download.yolo.sh&#039; script manually.&lt;br /&gt;
container# wget https://github.com/blakeblackshear/frigate/raw/master/docker/tensorrt_models.sh&lt;br /&gt;
container# YOLO_MODELS=&amp;quot;yolov7-320&amp;quot; bash tensorrt_models.sh&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
With the &amp;lt;code&amp;gt;yolov7-320.trt&amp;lt;/code&amp;gt; file generated, we&#039;ll configure the TensorRT based detector with the following configuration lines:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = detectors:&lt;br /&gt;
  tensorrt:&lt;br /&gt;
    type: tensorrt&lt;br /&gt;
model:&lt;br /&gt;
  path: /trt-models/yolov7-320.trt&lt;br /&gt;
  width: 320&lt;br /&gt;
  height: 320&lt;br /&gt;
  input_tensor: nchw&lt;br /&gt;
  input_pixel_format: rgb&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
If you are using a different model, you&#039;ll have to make sure the width/height sizes match what was used to train the model for best results.&lt;br /&gt;
&lt;br /&gt;
=== Tuning ===&lt;br /&gt;
If you&#039;re getting false positives, you may want to adjust the filter thresholds to help tune out the error rates. &lt;br /&gt;
&lt;br /&gt;
For example:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = cameras:&lt;br /&gt;
  front:&lt;br /&gt;
...&lt;br /&gt;
    detect:&lt;br /&gt;
      enabled: True&lt;br /&gt;
    objects:&lt;br /&gt;
      track:&lt;br /&gt;
        - person&lt;br /&gt;
        - bus&lt;br /&gt;
      filters:&lt;br /&gt;
        person:&lt;br /&gt;
          min_score: 0.6 # min score for object to initiate tracking (default: 0.5)&lt;br /&gt;
          threshold: 0.8 # min decimal percentage for tracked object&#039;s computed score to be considered a true positive (default: 0.7)&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Alternatively, if you&#039;re using a GPU, you may want to swap out the TensorRT model for another one and see if that helps with the detection.&lt;br /&gt;
&lt;br /&gt;
You may also want to enable object masks to mask out areas that are causing the false positives.&lt;br /&gt;
&lt;br /&gt;
=== Go2rtc ===&lt;br /&gt;
go2rtc is a built-in service in the Frigate container that may be used to stream RTSP video from all your cameras. This way, your ffmpeg processes will stream from the internal go2rtc server rather than from the cameras directly. The added bonus with using this is that you may use the webrtc option in the Frigate web interface for higher FPS video streams.&lt;br /&gt;
&lt;br /&gt;
My configuration with my Wyze Cams running the RTSP beta firmware looks like this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    entrance: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    backyard: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    front: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
    garage: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.x.x:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== General tips ==&lt;br /&gt;
Some tips I wish I had known earlier when using Frigate.&lt;br /&gt;
&lt;br /&gt;
* Use a GPU or Coral if possible. The CPU detector works, but it uses a lot of CPU cycles, which in turn means higher power usage. A cheap Quadro is both faster and more efficient than a CPU. Plus, a GPU lets you use hardware acceleration for video encoding (with ffmpeg), which also reduces your CPU usage.&lt;br /&gt;
* Object detection only happens when Frigate detects motion (that is, changed pixels) in a frame. It then sends the portion of the frame that it detected motion on for object detection. As a result, using motion masks on areas you aren&#039;t interested in can reduce the amount of object detection being performed.&lt;br /&gt;
* If you want to save clips when a specific object appears (such as &#039;person&#039; appearing in the garage) while also wanting Frigate to keep track of persistent objects but not record them (such as a &#039;car&#039; in the garage), the best approach I&#039;ve found is to create two cameras in Frigate: one for &#039;person&#039; with clip recording enabled and another for &#039;car&#039; with clip recording disabled.&lt;br /&gt;
** This was done so that I can count how many &#039;cars&#039; are in the garage for Home Assistant to act on. I do not want constant video clips of a stationary &#039;car&#039;, however.&lt;br /&gt;
* Enable go2rtc. You&#039;ll get higher framerates when viewing live video from the Frigate web interface, since it won&#039;t be limited to the object detection FPS.&lt;br /&gt;
* Enabling object detection will automatically have the camera report the objects in Home Assistant (also via MQTT). If you&#039;re monitoring for &#039;car&#039;, you&#039;ll get a &#039;car&#039; count in Home Assistant which you can automate on.&lt;br /&gt;
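&lt;br /&gt;
As a rough sketch of the motion mask tip above (the camera name and coordinates are made up for illustration), masks are defined per camera in the Frigate config:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = cameras:&lt;br /&gt;
  garage:&lt;br /&gt;
    motion:&lt;br /&gt;
      mask:&lt;br /&gt;
        # Ignore the timestamp overlay and the street beyond the driveway&lt;br /&gt;
        - 0,0,400,0,400,40,0,40&lt;br /&gt;
        - 0,900,1920,900,1920,1080,0,1080&lt;br /&gt;
}}&lt;br /&gt;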
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== ffmpeg frequently OOMs ===&lt;br /&gt;
One of the cameras&#039; ffmpeg processes periodically chews up all the memory assigned to the container and then gets OOM killed. This [https://github.com/blakeblackshear/frigate/issues/5924 issue seems to happen to others] as well.&lt;br /&gt;
&lt;br /&gt;
Things I tried that did not help with the OOMs:&lt;br /&gt;
&lt;br /&gt;
* Disabling the NVIDIA hwaccel flags to ffmpeg - ffmpeg continued to OOM periodically&lt;br /&gt;
* Updating go2rtc from 1.2.0 to 1.6.2 - no change&lt;br /&gt;
&lt;br /&gt;
What stopped the OOMs completely was configuring the Frigate camera to stream directly from the camera rather than through go2rtc.&lt;br /&gt;
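&lt;br /&gt;
Concretely, the camera&#039;s input went from the internal go2rtc restream back to the camera itself (the camera name here is illustrative; 8554 is go2rtc&#039;s default RTSP port):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = cameras:&lt;br /&gt;
  front:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      inputs:&lt;br /&gt;
        # was: rtsp://127.0.0.1:8554/front (the go2rtc restream)&lt;br /&gt;
        - path: rtsp://admin:wyzecam@10.1.x.x:554/live&lt;br /&gt;
          roles:&lt;br /&gt;
            - detect&lt;br /&gt;
            - record&lt;br /&gt;
}}&lt;br /&gt;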
&lt;br /&gt;
Speculation: all the other cameras are Wyze Cam v3 units on the same firmware and have no issues. The only difference I can think of is a weak Wi-Fi connection introducing latency or stream corruption that makes go2rtc and ffmpeg go haywire.&lt;br /&gt;
&lt;br /&gt;
=== Frigate migration crashes ===&lt;br /&gt;
After my container image automatically updated, Frigate now crashes on startup when it tries to run a database migration.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2025-08-23 11:16:05.485451924  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Starting migrations&lt;br /&gt;
2025-08-23 11:16:05.512447249  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Migrate &amp;quot;030_create_user_review_status&amp;quot;&lt;br /&gt;
2025-08-23 11:16:05.512645584  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : sql (&#039;\n        CREATE TABLE IF NOT EXISTS &amp;quot;userreviewstatus&amp;quot; (\n            &amp;quot;id&amp;quot; INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,\n            &amp;quot;user_id&amp;quot; VARCHAR(30)&lt;br /&gt;
 NOT NULL,\n            &amp;quot;review_segment_id&amp;quot; VARCHAR(30) NOT NULL,\n            &amp;quot;has_been_reviewed&amp;quot; INTEGER NOT NULL DEFAULT 0,\n            FOREIGN KEY (&amp;quot;review_segment_id&amp;quot;) REFERENCES &amp;quot;reviewsegment&amp;quot; (&amp;quot;id&amp;quot;) ON DELETE CASCADE\n        )\n        &#039;,)&lt;br /&gt;
2025-08-23 11:16:05.513106209  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : sql (&#039;CREATE UNIQUE INDEX IF NOT EXISTS &amp;quot;userreviewstatus_user_segment&amp;quot; ON &amp;quot;userreviewstatus&amp;quot; (&amp;quot;user_id&amp;quot;, &amp;quot;review_segment_id&amp;quot;)&#039;,)&lt;br /&gt;
2025-08-23 11:16:05.513623581  [2025-08-23 11:16:05] peewee_migrate.logs            INFO    : Run &amp;lt;lambda&amp;gt;&lt;br /&gt;
2025-08-23 11:16:05.553604917  Traceback (most recent call last):&lt;br /&gt;
2025-08-23 11:16:05.553628185    File &amp;quot;/usr/local/lib/python3.11/dist-packages/peewee.py&amp;quot;, line 3322, in execute_sql&lt;br /&gt;
2025-08-23 11:16:05.553725739  [2025-08-23 11:16:05] peewee_migrate.logs            ERROR   : Migration failed: 030_create_user_review_status&lt;br /&gt;
2025-08-23 11:16:05.553728331  Traceback (most recent call last):&lt;br /&gt;
2025-08-23 11:16:05.553730349    File &amp;quot;/usr/local/lib/python3.11/dist-packages/peewee.py&amp;quot;, line 3322, in execute_sql&lt;br /&gt;
2025-08-23 11:16:05.553731736      cursor.execute(sql, params or ())&lt;br /&gt;
2025-08-23 11:16:05.553734113  pysqlite3.dbapi2.IntegrityError: UNIQUE constraint failed: userreviewstatus.user_id, userreviewstatus.review_segment_id&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The update creates a new userreviewstatus table that references reviewsegment, with a unique index on (user_id, review_segment_id). Presumably something happened on my instance that left duplicate review records, so the migration fails.&lt;br /&gt;
&lt;br /&gt;
The fix here is to:&lt;br /&gt;
&lt;br /&gt;
# Stop Frigate&lt;br /&gt;
# Open the SQLite database: &amp;lt;code&amp;gt;sqlite3 frigate.db&amp;lt;/code&amp;gt;&lt;br /&gt;
# Delete the duplicate records in the reviewsegment table. I don&#039;t care about the review feature, so I simply ran &amp;lt;code&amp;gt;DELETE FROM reviewsegment;&amp;lt;/code&amp;gt;.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=MediaWiki&amp;diff=7707</id>
		<title>MediaWiki</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=MediaWiki&amp;diff=7707"/>
		<updated>2025-08-18T23:57:03Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Mariadb 12 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;MediaWiki is an open-source wiki engine written in PHP by the Wikimedia Foundation. It is used by both Wikipedia and this site.&lt;br /&gt;
&lt;br /&gt;
Visit the project website at:&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Running inside Docker==&lt;br /&gt;
You can run MediaWiki from a Docker container. A proof of concept can be found at:&lt;br /&gt;
&lt;br /&gt;
*https://git.steamr.com/docker/mediawiki&lt;br /&gt;
&lt;br /&gt;
In order to make a single MediaWiki container image which can be used regardless of custom skins or extensions, I intentionally separated custom skins and extensions into a separate directory. When the container first starts, a setup script will symlink these additional extensions and skins to the primary directory. Custom extensions and skins need to be placed in the extension and skins volume and then added to the LocalSettings.php configuration file. When adding a new extension or skin, you must restart the container for these changes to take effect. &lt;br /&gt;
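&lt;br /&gt;
A minimal sketch of what such a startup script might do (the actual script in the image may differ; this assumes extensions and skins are mounted at /extensions and /skins, with MediaWiki at /mediawiki):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = terminal&lt;br /&gt;
| code = # Link each mounted extension and skin into the MediaWiki tree&lt;br /&gt;
for d in /extensions/*/; do&lt;br /&gt;
    ln -sfn &amp;quot;$d&amp;quot; &amp;quot;/mediawiki/extensions/$(basename &amp;quot;$d&amp;quot;)&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
for d in /skins/*/; do&lt;br /&gt;
    ln -sfn &amp;quot;$d&amp;quot; &amp;quot;/mediawiki/skins/$(basename &amp;quot;$d&amp;quot;)&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
}}&lt;br /&gt;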
&lt;br /&gt;
To get started with my image, create the container with volumes for:&lt;br /&gt;
&lt;br /&gt;
#Images / Uploads as /mediawiki/images&lt;br /&gt;
#Extensions as /extensions&lt;br /&gt;
#Skins as /skins&lt;br /&gt;
#The {{code|LocalSettings.php}} configuration file under /config&lt;br /&gt;
#Database (on a remote server or container), or a SQLite file on a volume&lt;br /&gt;
&lt;br /&gt;
An example docker-compose configuration running a wiki:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = wiki:&lt;br /&gt;
    image: registry.steamr.com/docker/mediawiki:1.34&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - db-net&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8888&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - VIRTUAL_HOST=wiki.steamr.com&lt;br /&gt;
      - APP_ROOT=/mediawiki&lt;br /&gt;
      - DOCUMENT_ROOT=/mediawiki&lt;br /&gt;
      - DB_HOST=db&lt;br /&gt;
      - DB_DATABASE=wiki&lt;br /&gt;
      - DB_USERNAME=wiki&lt;br /&gt;
      - DB_PASSWORD=wiki&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/volumes/wiki/config:/config&lt;br /&gt;
      - /var/volumes/wiki/extensions:/extensions&lt;br /&gt;
      - /var/volumes/wiki/skins:/skins&lt;br /&gt;
      - /var/volumes/wiki/images:/mediawiki/images&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Configuration==&lt;br /&gt;
MediaWiki only has a single configuration file at {{code|LocalSettings.php}}.&lt;br /&gt;
&lt;br /&gt;
If you wish to run a wiki on the root of a domain, you need to set &amp;lt;code&amp;gt;$wgScriptPath&amp;lt;/code&amp;gt; empty like so:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgScriptPath = &amp;quot;&amp;quot;;&lt;br /&gt;
$wgArticlePath = &amp;quot;/$1&amp;quot;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Extensions===&lt;br /&gt;
Extensions are placed under the {{code|/extensions}} directory. Common extensions are bundled with the base installation of MediaWiki but not enabled by default. Bundled extensions can be enabled by adding a {{code|wfLoadExtension( &#039;ExtensionName&#039; )}} call to the {{code|LocalSettings.php}} file.&lt;br /&gt;
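&lt;br /&gt;
For example, to enable the bundled ParserFunctions extension, add to {{code|LocalSettings.php}}:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = // ParserFunctions ships with the MediaWiki tarball but is off by default&lt;br /&gt;
wfLoadExtension( &#039;ParserFunctions&#039; );&lt;br /&gt;
}}&lt;br /&gt;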
&lt;br /&gt;
There are some extensions that this wiki requires:&lt;br /&gt;
&lt;br /&gt;
;Intersection&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-intersection.git&lt;br /&gt;
:Generates dynamic lists&lt;br /&gt;
;Scribunto&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-Scribunto.git&lt;br /&gt;
:Generates scripted outputs using Lua&lt;br /&gt;
;Math&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-Math&lt;br /&gt;
:Generates math formulas&lt;br /&gt;
;NativeSvgHandler&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-NativeSvgHandler.git&lt;br /&gt;
:Embeds SVG files as an image for client-side rendering. Requires appending {{code|http://www.w3.org/tr/rec-rdf-syntax/}} to {{code|$validNamespaces}} in {{code|UploadBase.php}}.&lt;br /&gt;
;WikiSEO&lt;br /&gt;
:https://github.com/octfx/wiki-seo/&lt;br /&gt;
:Enables custom SEO meta keyword and description tags on the wiki.&lt;br /&gt;
;Score&lt;br /&gt;
:https://www.mediawiki.org/wiki/Extension:Score&lt;br /&gt;
: Lets you embed LilyPond music notation into your Wiki&lt;br /&gt;
&lt;br /&gt;
===Skins===&lt;br /&gt;
Skins are placed under the {{code|/skins}} directory. Modern skins are loaded with a {{code|wfLoadSkin(&#039;skin&#039;)}} call in {{code|LocalSettings.php}}, which reads the {{code|skin.json}} manifest file in the skin&#039;s directory. The manifest contains the skin&#039;s name, autoload class files, and the resource modules to load, including the stylesheets and JavaScript files that are part of the skin.&lt;br /&gt;
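&lt;br /&gt;
A minimal {{code|skin.json}} looks roughly like this (the names and file paths are illustrative, not the Reader skin&#039;s actual manifest):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = {&lt;br /&gt;
    &amp;quot;name&amp;quot;: &amp;quot;MySkin&amp;quot;,&lt;br /&gt;
    &amp;quot;ValidSkinNames&amp;quot;: { &amp;quot;myskin&amp;quot;: &amp;quot;MySkin&amp;quot; },&lt;br /&gt;
    &amp;quot;ResourceModules&amp;quot;: {&lt;br /&gt;
        &amp;quot;skins.myskin&amp;quot;: {&lt;br /&gt;
            &amp;quot;styles&amp;quot;: [ &amp;quot;resources/skin.css&amp;quot; ],&lt;br /&gt;
            &amp;quot;scripts&amp;quot;: [ &amp;quot;resources/skin.js&amp;quot; ]&lt;br /&gt;
        }&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
}}&lt;br /&gt;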
&lt;br /&gt;
MediaWiki&#039;s guide on skinning is relatively up to date, albeit a little confusing at first.&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org/wiki/Manual:Skinning_Part_2&lt;br /&gt;
&lt;br /&gt;
The Reader skin used on this wiki:&lt;br /&gt;
&lt;br /&gt;
*https://git.steamr.com/leo/mediawiki-reader-skin&lt;br /&gt;
&lt;br /&gt;
The {{code|OutputPage}} object handles all HTML generation as well as the linking of JavaScript and CSS modules. You can use its methods to inject HTML or scripts into the page. See: https://www.mediawiki.org/wiki/Manual:OutputPage.php&lt;br /&gt;
&lt;br /&gt;
===Transcluded Pages===&lt;br /&gt;
There are some pages that are used by the MediaWiki software itself, including:&lt;br /&gt;
&lt;br /&gt;
*[[MediaWiki:Aboutsite]]&lt;br /&gt;
*[[MediaWiki:Disclaimers]]&lt;br /&gt;
*[[MediaWiki:Privacy]]&lt;br /&gt;
*[[MediaWiki:Toolbox]]&lt;br /&gt;
*[[MediaWiki:Sidebar]]&lt;br /&gt;
&lt;br /&gt;
Some resources are also loaded from pages including:&lt;br /&gt;
&lt;br /&gt;
*[[MediaWiki:Geshi.css]]&lt;br /&gt;
*[[MediaWiki:Common.css]]&lt;br /&gt;
*[[MediaWiki:Common.js]]&lt;br /&gt;
&lt;br /&gt;
Skins should handle the contents of these MediaWiki pages, which are typically shown somewhere on the page. Of course, custom skins can also reference their own set of pages, such as the bootstrap/reader skin I&#039;m currently using:&lt;br /&gt;
&lt;br /&gt;
*[[Bootstrap:Footer]]&lt;br /&gt;
*[[Bootstrap:Sidebar]]&lt;br /&gt;
*[[Bootstrap:Jumbotron]]&lt;br /&gt;
&lt;br /&gt;
==Tasks==&lt;br /&gt;
&lt;br /&gt;
===Enabling Visual Editor===&lt;br /&gt;
The Visual Editor has been included out of the box since MediaWiki 1.35. &#039;&#039;&#039;The following steps are no longer required.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Visual Editor is an extension that enables the WYSIWYG editor. This extension requires Parsoid in order to properly save changes. &lt;br /&gt;
&lt;br /&gt;
Installing the Visual Editor extension is simple:&lt;br /&gt;
&lt;br /&gt;
#Download the extension to the extensions directory {{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = $ cd extensions&lt;br /&gt;
$ wget https://extdist.wmflabs.org/dist/extensions/VisualEditor-REL1_34-74116a7.tar.gz&lt;br /&gt;
$ tar -xzf VisualEditor*gz&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
#Edit LocalSettings.php and enable the extension. {{Highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = wfLoadExtension(&#039;VisualEditor&#039;);&lt;br /&gt;
$wgDefaultUserOptions[&#039;visualeditor-enable&#039;] = 1;&lt;br /&gt;
$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = array(&lt;br /&gt;
    // URL to the Parsoid instance&lt;br /&gt;
    // Use port 8142 if you use the Debian package&lt;br /&gt;
    &#039;url&#039; =&amp;gt; &#039;http://parsoid:8000&#039;,&lt;br /&gt;
    // Parsoid &amp;quot;domain&amp;quot;, see below (optional)&lt;br /&gt;
    &#039;domain&#039; =&amp;gt; &#039;wiki&#039;,&lt;br /&gt;
    # // Parsoid &amp;quot;prefix&amp;quot;, see below (optional)&lt;br /&gt;
    # &#039;prefix&#039; =&amp;gt; &#039;localhost&#039;&lt;br /&gt;
);&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Getting Parsoid running is slightly trickier. I recommend running Parsoid in a Docker container to simplify installation. There is one built at thenets/parsoid which works just fine; a clone of that project is available at https://git.steamr.com/docker/parsoid. The container takes environment variables to configure which domains Parsoid should serve. An example docker-compose file with everything working is given below.&lt;br /&gt;
&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
services:&lt;br /&gt;
  wiki:&lt;br /&gt;
    image: registry.steamr.com/docker/mediawiki:1.34&lt;br /&gt;
    labels:&lt;br /&gt;
      - &amp;quot;traefik.enable=true&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.port=8888&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.docker.network=traefik&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.frontend.rule=Host:wiki.steamr.com&amp;quot;&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - db-net&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8888&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - VIRTUAL_HOST=wiki.home.steamr.com&lt;br /&gt;
      - APP_ROOT=/mediawiki&lt;br /&gt;
      - DOCUMENT_ROOT=/mediawiki&lt;br /&gt;
      - DB_HOST=db&lt;br /&gt;
      - DB_DATABASE=wiki&lt;br /&gt;
      - DB_USERNAME=wiki&lt;br /&gt;
      - DB_PASSWORD=wiki&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/volumes/wiki/config:/config&lt;br /&gt;
      - /var/volumes/wiki/extensions:/extensions&lt;br /&gt;
      - /var/volumes/wiki/skins:/skins&lt;br /&gt;
      - /var/volumes/wiki/images:/mediawiki/images&lt;br /&gt;
&lt;br /&gt;
  parsoid:&lt;br /&gt;
    image: registry.steamr.com/docker/parsoid:latest&lt;br /&gt;
    labels:&lt;br /&gt;
      - &amp;quot;traefik.enable=true&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.port=8000&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.docker.network=traefik&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.frontend.rule=Host:parsoid.steamr.com&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - PARSOID_DOMAIN_wiki=http://wiki:8888/api.php&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8000&amp;quot;&lt;br /&gt;
    restart: always&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
The Visual Editor should be functional at this point. If the editor does not start and shows no visible error, check the JavaScript console for error messages. There are certain skin requirements that must be met for the Visual Editor to work, outlined at https://www.mediawiki.org/wiki/VisualEditor/Skin_requirements.&lt;br /&gt;
&lt;br /&gt;
===Inserting a custom script in {{code|&amp;lt;head&amp;gt;}}===&lt;br /&gt;
If using a custom skin, use the {{code|OutputPage}} object and call {{code|addHeadItem(&#039;name&#039;, &#039;&amp;lt;script&amp;gt;...&amp;lt;/script&amp;gt;&#039;)}} to inject a custom script block within the document head.&lt;br /&gt;
&lt;br /&gt;
Alternatively, register an {{code|OutputPageBeforeHTML}} hook and call {{code|addInlineScript()}} from there.&lt;br /&gt;
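&lt;br /&gt;
For example, from a custom skin&#039;s setup code (the item name and script URL are placeholders):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $out = $this-&amp;gt;getOutput();&lt;br /&gt;
// The first argument is a unique name identifying the head item&lt;br /&gt;
$out-&amp;gt;addHeadItem( &#039;my-analytics&#039;, &#039;&amp;lt;script src=&amp;quot;/analytics.js&amp;quot;&amp;gt;&amp;lt;/script&amp;gt;&#039; );&lt;br /&gt;
}}&lt;br /&gt;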
&lt;br /&gt;
===Adding a custom button to WikiEditor Toolbar===&lt;br /&gt;
To add a custom button to the WikiEditor toolbar next to the existing bold and italic buttons, edit the {{code|MediaWiki:Common.js}} file. This file can only be changed by {{code|interface administrator}} users.  Membership to this group can be assigned at [[Special:UserRights/username]].&lt;br /&gt;
&lt;br /&gt;
The {{code|Common.js}} file used to add the Code and SyntaxHighlight templates used on this wiki is given below.&lt;br /&gt;
&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = js&lt;br /&gt;
| code = var customizeToolbar = function () {&lt;br /&gt;
	$(&#039;#wpTextbox1&#039;).wikiEditor(&#039;addToToolbar&#039;, {&lt;br /&gt;
		&#039;section&#039;: &#039;main&#039;,&lt;br /&gt;
		&#039;group&#039;: &#039;format&#039;,&lt;br /&gt;
		&#039;tools&#039;: {&lt;br /&gt;
			&#039;code&#039;: {&lt;br /&gt;
				label: &#039;code&#039;,&lt;br /&gt;
				type: &#039;button&#039;,&lt;br /&gt;
				oouiIcon: &#039;code&#039;,&lt;br /&gt;
				action: {&lt;br /&gt;
					type: &#039;encapsulate&#039;,&lt;br /&gt;
					options: {&lt;br /&gt;
						pre: &amp;quot;{{code|&amp;quot;,&lt;br /&gt;
						post: &amp;quot;}}&amp;quot;&lt;br /&gt;
					}&lt;br /&gt;
				}&lt;br /&gt;
			}&lt;br /&gt;
		}&lt;br /&gt;
	});&lt;br /&gt;
	$(&#039;#wpTextbox1&#039;).wikiEditor(&#039;addToToolbar&#039;, {&lt;br /&gt;
		&#039;section&#039;: &#039;main&#039;,&lt;br /&gt;
		&#039;group&#039;: &#039;format&#039;,&lt;br /&gt;
		&#039;tools&#039;: {&lt;br /&gt;
			&#039;terminal&#039;: {&lt;br /&gt;
				label: &#039;highlight&#039;,&lt;br /&gt;
				type: &#039;button&#039;,&lt;br /&gt;
				oouiIcon: &#039;tag&#039;,&lt;br /&gt;
				action: {&lt;br /&gt;
					type: &#039;encapsulate&#039;,&lt;br /&gt;
					options: {&lt;br /&gt;
						pre: &amp;quot;&amp;lt;nowiki&amp;gt;{{&amp;lt;/nowiki&amp;gt;highlight{{!}}lang=terminal{{!}}code=\n&amp;quot;,&lt;br /&gt;
&amp;lt;nowiki&amp;gt;						post: &amp;quot;}}&amp;quot;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
					}&lt;br /&gt;
				}&lt;br /&gt;
			}&lt;br /&gt;
		}&lt;br /&gt;
	});&lt;br /&gt;
};&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Documentation for this can be found at:&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org/wiki/Extension:WikiEditor/Toolbar_customization#Basic_setup&lt;br /&gt;
&lt;br /&gt;
===Remove the &#039;Retrieved from&#039; footer message===&lt;br /&gt;
There are a few ways to hide the &#039;Retrieved from&#039; message that appears at the end of every article.&lt;br /&gt;
&lt;br /&gt;
#Edit the [[MediaWiki:Retrievedfrom]] page with an Administrator account. Comment out or remove the contents to suppress the message.&lt;br /&gt;
#Hide the content with CSS by adding to [[MediaWiki:Common.css]] the following rule: {{highlight&lt;br /&gt;
| lang = css&lt;br /&gt;
| code = /* hide the &amp;quot;Retrieved from&amp;quot; message */&lt;br /&gt;
.printfooter { display: none; &amp;lt;nowiki&amp;gt;}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
#Edit the template that your skin is using. It&#039;ll look something like: {{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = {{#html-printfooter}}&amp;lt;div class=&amp;quot;printfooter&amp;quot;&amp;gt;{{{html-printfooter}}}&amp;lt;/div&amp;gt;{{/html-printfooter}}&lt;br /&gt;
}}. Either remove the line or comment it out using a Mustache comment &amp;lt;code&amp;gt;{{! ... }}&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===How to promote a user into a sysop, bureaucrat, or interface-admin group===&lt;br /&gt;
Certain actions on the Wiki are restricted to specific user groups. The default permissions are listed at https://www.mediawiki.org/wiki/Manual:User_rights. To add a user to a specific group:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ mysql -u $User -p$Password $Database&lt;br /&gt;
&lt;br /&gt;
## In MySQL prompt, determine the user&#039;s user_id&lt;br /&gt;
MariaDB [wiki]&amp;gt;&amp;lt;nowiki&amp;gt; SELECT user_id, user_name FROM user WHERE user_name = &#039;leo&#039;;&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
| user_id | user_name |&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
|       1 | Leo       |&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
## Add the user into the desired group using the user_id above.&lt;br /&gt;
MariaDB [wiki]&amp;gt; INSERT INTO `user_groups` VALUES (1, &#039;sysop&#039;), (1, &#039;bureaucrat&#039;), (1, &#039;interface-admin&#039;);&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Running the Docker container under a different prefix using Traefik ===&lt;br /&gt;
The official Docker image, as well as my customized version based on it, expects the Wiki to be served from the domain root. However, if you want to serve the Wiki from a &#039;subdirectory&#039; like &amp;lt;code&amp;gt;/wiki&amp;lt;/code&amp;gt;, you need to do a few things:&lt;br /&gt;
&lt;br /&gt;
#In Traefik, your router should have the following rules: &amp;lt;code&amp;gt;traefik.http.routers.wiki-https.rule=(Host(`my-wiki-server.tld`) &amp;amp;&amp;amp; PathPrefix(`/wiki`))&amp;lt;/code&amp;gt;.&lt;br /&gt;
#In your Docker container, create a symlink so that &amp;lt;code&amp;gt;/var/www/html/wiki&amp;lt;/code&amp;gt; points back to &amp;lt;code&amp;gt;/var/www/html&amp;lt;/code&amp;gt;: &amp;lt;code&amp;gt;ln -s /var/www/html /var/www/html/wiki&amp;lt;/code&amp;gt;&lt;br /&gt;
#In your MediaWiki &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; configuration, you need to:&lt;br /&gt;
##Set &amp;lt;code&amp;gt;$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = [&#039;url&#039; =&amp;gt; &#039;&amp;lt;nowiki&amp;gt;http://localhost:8080/wiki/rest.php&#039;&amp;lt;/nowiki&amp;gt;];&amp;lt;/code&amp;gt; so that Parsoid continues to work.&lt;br /&gt;
##Set &amp;lt;code&amp;gt;$wgScriptPath = &amp;quot;/wiki&amp;quot;;&amp;lt;/code&amp;gt;&lt;br /&gt;
## Set &amp;lt;code&amp;gt;$wgArticlePath = &amp;quot;/wiki/$1&amp;quot;;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Running MediaWiki behind a reverse proxy===&lt;br /&gt;
When running MediaWiki behind a reverse proxy like Squid, nginx, or Traefik, edits made by users will appear to originate from the reverse proxy server.&lt;br /&gt;
&lt;br /&gt;
To fix this issue, you need to tell MediaWiki which address ranges are from the reverse proxy and to have it use the X-Forwarded-For header instead. Since MediaWiki 1.35, this can be accomplished by adding [https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:$wgUsePrivateIPs $wgUsePrivateIPs] and [https://www.mediawiki.org/wiki/Manual:$wgCdnServersNoPurge $wgCdnServersNoPurge] in your &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; file:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgUsePrivateIPs = true;&lt;br /&gt;
$wgUseCdn = true;&lt;br /&gt;
$wgCdnServersNoPurge = [];&lt;br /&gt;
$wgCdnServersNoPurge[] = &amp;quot;172.18.0.0/16&amp;quot;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
Replace 172.18.0.0/16 with the CIDR of your reverse proxy networks (this CIDR is my internal Traefik network for my Docker stack).&lt;br /&gt;
&lt;br /&gt;
=== Adding the Score extension ===&lt;br /&gt;
The Score extension allows the rendering of music notation using the Lilypond project as the renderer. I ran into a series of hurdles getting this to work for MediaWiki 1.39.&lt;br /&gt;
&lt;br /&gt;
The issues were:&lt;br /&gt;
&lt;br /&gt;
#The extension strongly recommends using Shellbox to isolate LilyPond, because LilyPond is insecure and may allow remote code execution. Since this wiki is publicly editable, this had to be done. You&#039;ll have to set up a Shellbox instance (as a separate container). MediaWiki/Wikipedia has its own container image, which doesn&#039;t work outside their infrastructure, so I had to make my own.&lt;br /&gt;
# Lilypond has issues and breaks in my environment. I had to:&lt;br /&gt;
#*Modify &amp;lt;code&amp;gt;/usr/bin/lilypond&amp;lt;/code&amp;gt; to include &amp;lt;code&amp;gt;export PATH=&amp;quot;/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin&amp;quot;&amp;lt;/code&amp;gt;, otherwise it fails silently in the extension&#039;s script. If you enable verbose mode on LilyPond, you will see &amp;lt;code&amp;gt;warning: g_spawn_sync failed (0): gs: Failed to execute child process “gs” (No such file or directory)&amp;lt;/code&amp;gt;.&lt;br /&gt;
#*Symlink the LilyPond fonts: on Debian Bullseye, the package versions for lilypond and the lilypond fonts are mismatched, so the paths are broken (what the heck, Debian?)&lt;br /&gt;
#*I tried enabling SVG output, which works after some tweaking to the generator script, but the result is a full page in size and can&#039;t be easily trimmed.&lt;br /&gt;
#The Score extension&#039;s included shell script is questionable...&lt;br /&gt;
#* The fuzzy PNG apparently comes from the Ghostscript .ps to .png conversion. I&#039;m not sure why this is done separately when LilyPond also generates PNGs. I worked around it by copying &#039;file.png&#039; over &#039;file-page1.png&#039; and commenting out the &amp;lt;code&amp;gt;runGhostscript&amp;lt;/code&amp;gt; function call in the &amp;lt;code&amp;gt;generatePngAndMidi.sh&amp;lt;/code&amp;gt; script, which keeps the PNG from being shrunk and fuzzy.&lt;br /&gt;
After overcoming all the aforementioned issues, I think I&#039;ve finally got it working.&lt;br /&gt;
&lt;br /&gt;
* Use the following Dockerfiles&lt;br /&gt;
*Add the following docker-compose entries&lt;br /&gt;
*In the volume, run: {{Highlight&lt;br /&gt;
| code = $ git clone https://gerrit.wikimedia.org/r/mediawiki/libs/Shellbox shellbox&lt;br /&gt;
&lt;br /&gt;
## In the fpm container, run as the shellbox user:&lt;br /&gt;
# su shellbox&lt;br /&gt;
$ cd /srv/shellbox&lt;br /&gt;
$ /usr/local/bin/composer install --no-dev&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
*Enable the plugin:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wfLoadExtension( &#039;Score&#039; );&lt;br /&gt;
$wgScoreTrim = true;&lt;br /&gt;
$wgScoreUseSvg = false;&lt;br /&gt;
$wgShellboxUrl = &#039;http://shellbox/shellbox&#039;;&lt;br /&gt;
$wgShellboxSecretKey = &#039;secret_key&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
* You have to tweak [[MediaWiki:Common.css]] so that the embedded images aren&#039;t too big. I made them 4em max-height as I intend to only embed single staff snippets for my notes.&lt;br /&gt;
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
===Scribunto Lua Failures===&lt;br /&gt;
If templates cause this error:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = Lua error: Internal error: The interpreter exited with status 127.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This likely means that you do not have Lua installed or it is not in the PATH. You will need to specify the Lua path in {{code|LocalSettings.php}} with this line:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgScribuntoEngineConf[&#039;luastandalone&#039;][&#039;luaPath&#039;] = &amp;quot;/usr/bin/lua5.1&amp;quot;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Database Import Incomplete ===&lt;br /&gt;
Database imports from MySQL 5.7.27 to MariaDB 10.4.7 seemed to fail. Imports only completed if the database dump was made without {{code|Enclose export in a transaction}} enabled in phpMyAdmin, and subsequent edits on the destination wiki resulted in this error message:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = The revision #0 of the page named &amp;quot;some-article&amp;quot; does not exist.&lt;br /&gt;
&lt;br /&gt;
This is usually caused by following an outdated history link to a page that has been deleted. Details can be found in the deletion log.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Solution====&lt;br /&gt;
It turns out the destination database server (mariadb:10.4.7, in Docker) was not set up properly after being upgraded from MariaDB 10.1. On startup, it showed the following error messages:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = 2019-09-01 21:14:44 0 [Note] Server socket created on IP: &#039;::&#039;.&lt;br /&gt;
2019-09-01 21:14:44 0 [Warning] &#039;user&#039; entry &#039;root@localhost.localdomain&#039; ignored in --skip-name-resolve mode.&lt;br /&gt;
2019-09-01 21:14:44 0 [Warning] &#039;proxies_priv&#039; entry &#039;@% root@localhost.localdomain&#039; ignored in --skip-name-resolve mode.&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] Missing system table mysql.roles_mapping; please run mysql_upgrade to create it&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] Incorrect definition of table mysql.event: expected column &#039;sql_mode&#039; at position 14 to have type set(&#039;REAL_AS_FLOAT&#039;,&#039;PIPES_AS_CONCAT&#039;,&#039;ANSI_QUOTES&#039;,&#039;IGNORE_SPACE&#039;,&#039;IGNORE_BAD_TABLE_OPTIONS&#039;,&#039;ONLY_FULL_GROUP_BY&#039;,&#039;NO_UNSIGNED_SUBTRACTION&#039;,&#039;NO_DIR_IN_CREATE&#039;,&#039;POSTGRESQL&#039;,&#039;ORACLE&#039;,&#039;MSSQL&#039;,&#039;DB2&#039;,&#039;MAXDB&#039;,&#039;NO_KEY_OPTIONS&#039;,&#039;NO_TABLE_OPTIONS&#039;,&#039;NO_FIELD_OPTIONS&#039;,&#039;MYSQL323&#039;,&#039;MYSQL40&#039;,&#039;ANSI&#039;,&#039;NO_AUTO_VALUE_ON_ZERO&#039;,&#039;NO_BACKSLASH_ESCAPES&#039;,&#039;STRICT_TRANS_TABLES&#039;,&#039;STRICT_ALL_TABLES&#039;,&#039;NO_ZERO_IN_DATE&#039;,&#039;NO_ZERO_DATE&#039;,&#039;INVALID_DATES&#039;,&#039;ERROR_FOR_DIVISION_BY_ZERO&#039;,&#039;TRADITIONAL&#039;,&#039;NO_AUTO_CREATE_USER&#039;,&#039;HIGH_NOT_PRECEDENCE&#039;,&#039;NO_ENGINE_SUBSTITUTION&#039;,&#039;PAD_CHAR_TO_FULL_LENGTH&#039;,&#039;EMPTY_STRING_IS_NULL&#039;,&#039;SIMULTANEOUS_ASSIGNMENT&#039;), found type set(&#039;REAL_AS_FLOAT&#039;,&#039;PIPES_AS_CONCAT&#039;,&#039;ANSI_QUOTES&#039;,&#039;IGNORE_SPACE&#039;,&#039;IGNORE_BAD_TABLE_OPTIONS&#039;,&#039;ONLY_FULL_GROUP_BY&#039;,&#039;NO_UNSIGNED_SUBTRACTION&#039;,&#039;NO_DIR_IN_CREATE&#039;,&#039;POSTGRESQL&#039;,&#039;ORACLE&#039;,&#039;MSSQL&#039;,&#039;DB2&#039;,&#039;MAXDB&#039;,&#039;NO_KEY_OPTIONS&#039;,&#039;NO_TABLE_OPTIONS&#039;,&#039;NO_FIELD_OPTIONS&#039;,&#039;MYSQL323&#039;,&#039;MYSQL40&#039;,&#039;ANSI&#039;,&#039;NO_AUTO_VALU&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.&lt;br /&gt;
2019-09-01 21:14:44 6 [Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table &#039;mysql.gtid_slave_pos&#039; doesn&#039;t exist&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] Reading of all Master_info entries succeeded&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] Added new Master_info &#039;&#039; to hash table&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] mysqld: ready for connections.&lt;br /&gt;
Version: &#039;10.4.7-MariaDB-1:10.4.7+maria~bionic&#039;  socket: &#039;/var/run/mysqld/mysqld.sock&#039;  port: 3306  mariadb.org binary distribution&lt;br /&gt;
2019-09-01 21:14:46 0 [Note] InnoDB: Buffer pool(s) load completed at 190901 21:14:46&lt;br /&gt;
2019-09-01 21:14:49 8 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.&lt;br /&gt;
2019-09-01 21:14:49 8 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Running {{code|mysql_upgrade}} fixed these errors and a subsequent database import was successful.&lt;br /&gt;
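&lt;br /&gt;
If MariaDB runs in a container (the container name &#039;db&#039; below is a placeholder), the upgrade can be run with:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = terminal&lt;br /&gt;
| code = $ docker exec -it db mysql_upgrade -u root -p&lt;br /&gt;
}}&lt;br /&gt;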
&lt;br /&gt;
===Math Extension===&lt;br /&gt;
With the latest Math extension, formulas constantly return errors similar to:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response (&amp;quot;Math extension cannot connect to Restbase.&amp;quot;) from server &amp;quot;https://wikimedia.org/api/rest_v1/&amp;quot;:): {\displaystyle V=IR&amp;lt;nowiki&amp;gt;}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
I installed Mathoid and tried pointing {{code|$wgMathFullRestbaseURL}} at the service, to no avail: {{code|$wgMathFullRestbaseURL}} expects the Restbase API rather than the Mathoid API. I did not want to install Restbase for a simple wiki, and requiring Restbase would make hosting on a shared hosting environment tricky.&lt;br /&gt;
&lt;br /&gt;
====Solution====&lt;br /&gt;
&amp;lt;s&amp;gt;It turns out that Math extension versions 1.30 and prior work.&amp;lt;/s&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Save yourself the headache and use MathML, then set the RestbaseURL and MathML URL to Wikipedia&#039;s.&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgDefaultUserOptions[&#039;math&#039;] = &#039;mathml&#039;;&lt;br /&gt;
$wgMathFullRestbaseURL = &#039;https://en.wikipedia.org/api/rest_&#039;;&lt;br /&gt;
$wgMathMathMLUrl = &#039;https://mathoid-beta.wmflabs.org/&#039;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Visual Editor===&lt;br /&gt;
&lt;br /&gt;
====All templates are puzzle pieces====&lt;br /&gt;
All templates within the Visual Editor view appear as puzzle pieces. Investigate by looking at the Parsoid logs. The non-obvious issue in my case was that template expansion was failing because I referenced the unencrypted HTTP URL rather than the HTTPS one. My 301 HTTPS redirect rule caused Parsoid&#039;s requests to fail, which resulted in templates being rendered as puzzle pieces.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = {&amp;quot;name&amp;quot;:&amp;quot;parsoid&amp;quot;,&amp;quot;hostname&amp;quot;:&amp;quot;c2774ece52ab&amp;quot;,&amp;quot;pid&amp;quot;:9,&amp;quot;level&amp;quot;:40,&amp;quot;logType&amp;quot;:&amp;quot;warn&amp;quot;,&amp;quot;wiki&amp;quot;:&amp;quot;wiki$0&amp;quot;,&amp;quot;title&amp;quot;:&amp;quot;Kubernetes&amp;quot;,&amp;quot;oldId&amp;quot;:3829,&amp;quot;reqId&amp;quot;:null,&amp;quot;userAgent&amp;quot;:&amp;quot;VisualEditor-MediaWiki/1.34.1&amp;quot;,&amp;quot;msg&amp;quot;:&amp;quot;non-200 response: 301 &amp;lt;!DOCTYPE HTML PUBLIC \&amp;quot;-//IETF//DTD HTML 2.0//EN\&amp;quot;&amp;gt;\n&amp;lt;html&amp;gt;&amp;lt;head&amp;gt;\n&amp;lt;title&amp;gt;301 Moved Permanently&amp;lt;/title&amp;gt;\n&amp;lt;/head&amp;gt;&amp;lt;body&amp;gt;\n&amp;lt;h1&amp;gt;Moved Permanently&amp;lt;/h1&amp;gt;\n&amp;lt;p&amp;gt;The document has moved &amp;lt;a href=\&amp;quot;https://leo.leung.xyz/wiki/api.php\&amp;quot;&amp;gt;here&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;\n&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;\n&amp;quot;,&amp;quot;longMsg&amp;quot;:&amp;quot;non-200 response:\n301\n&amp;lt;!DOCTYPE HTML PUBLIC \&amp;quot;-//IETF//DTD HTML 2.0//EN\&amp;quot;&amp;gt;\n&amp;lt;html&amp;gt;&amp;lt;head&amp;gt;\n&amp;lt;title&amp;gt;301 Moved Permanently&amp;lt;/title&amp;gt;\n&amp;lt;/head&amp;gt;&amp;lt;body&amp;gt;\n&amp;lt;h1&amp;gt;Moved Permanently&amp;lt;/h1&amp;gt;\n&amp;lt;p&amp;gt;The document has moved &amp;lt;a href=\&amp;quot;https://leo.leung.xyz/wiki/api.php\&amp;quot;&amp;gt;here&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;\n&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;\n&amp;quot;,&amp;quot;levelPath&amp;quot;:&amp;quot;warn&amp;quot;,&amp;quot;time&amp;quot;:&amp;quot;2020-05-24T13:02:41.718Z&amp;quot;,&amp;quot;v&amp;quot;:0}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
If you are using the Docker image I referenced above, the {{Code|docker-compose.yml}} file should have the environment variable {{Code|PARSOID_DOMAIN_domain}} reference the appropriate HTTPS version of the URL.&lt;br /&gt;
&lt;br /&gt;
====Error contacting the Parsoid/RESTBase server (HTTP 412)====&lt;br /&gt;
A private wiki of mine had this error, caused by a Traefik middleware used for authentication. I believe the script behind the middleware was generating output which somehow interfered with the rest.php request. Disabling (and later fixing) the middleware resolved the issue.&lt;br /&gt;
&lt;br /&gt;
===The DynamicPageList extension doesn&#039;t sort by lastedit properly===&lt;br /&gt;
As mentioned in [https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:DynamicPageList_(Wikimedia)#ordermethod the DynamicPageList extension documentation], the &#039;lastedit&#039; sort method doesn&#039;t actually work as you&#039;d expect:&lt;br /&gt;
{{Quote|quote=It should be noted, that lastedit really sorts by the last time the page was touched. In some cases this is not equivalent to the last edit (for example, this includes permission changes, creation or deletion of linked pages, and alteration of contained templates).|source=https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:DynamicPageList_(Wikimedia)#ordermethod}}&lt;br /&gt;
Worse yet, editing &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; causes all pages to be &#039;touched&#039;, turning the generated list into a random tossup of all pages within the wiki and rendering it completely useless.&lt;br /&gt;
&lt;br /&gt;
A fix would be to tweak the query for &#039;lastedit&#039; in &amp;lt;code&amp;gt;DynamicPageListHooks.php&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = // From:&lt;br /&gt;
case &#039;lastedit&#039;:&lt;br /&gt;
	$sqlSort = &#039;page_touched&#039;;&lt;br /&gt;
&lt;br /&gt;
// To:&lt;br /&gt;
case &#039;lastedit&#039;:&lt;br /&gt;
	$fields[&#039;rc_timestamp&#039;] = &#039;MAX(recentchanges.rc_timestamp)&#039;;&lt;br /&gt;
	$tables[&#039;recentchanges&#039;] = &#039;recentchanges&#039;;&lt;br /&gt;
	$join[&#039;recentchanges&#039;] = [&#039;LEFT JOIN&#039;, &#039;recentchanges.rc_title = page_title&#039;];&lt;br /&gt;
	$options[&#039;GROUP BY&#039;] = &amp;quot;page_id&amp;quot;;&lt;br /&gt;
	$sqlSort = &#039;rc_timestamp&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Updating from 1.34 to 1.35===&lt;br /&gt;
Version 1.35 includes a built-in PHP based parser to replace Parsoid and bundles the VisualEditor extension.&lt;br /&gt;
&lt;br /&gt;
I had issues getting the 1.34 Docker container to work after updating it to 1.35 because the wiki wasn&#039;t able to reach its own REST API via localhost within the container. Only after setting the server name to the public IP address in /etc/hosts was the wiki able to reach its REST API and have the Visual Editor work properly. This is stupid, however, because REST API traffic will leave the container, go out to my ISP, then come right back into the same container via the reverse proxy.&lt;br /&gt;
&lt;br /&gt;
Changes I had to make for VisualEditor to work:&lt;br /&gt;
&lt;br /&gt;
# Add wiki.home.steamr.com to my external IP address in /etc/hosts.&lt;br /&gt;
# Add to nginx.conf after the &amp;lt;code&amp;gt;location /&amp;lt;/code&amp;gt; section: {{Highlight&lt;br /&gt;
| code = location /rest.php/ {&lt;br /&gt;
        try_files $uri $uri/ /rest.php?$query_string;&lt;br /&gt;
}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
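For reference, the hosts entry from step 1 looks like this (203.0.113.10 stands in for the actual external IP address):&lt;br /&gt;

```text
# /etc/hosts inside the MediaWiki container
203.0.113.10    wiki.home.steamr.com
```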
&lt;br /&gt;
===Updating from 1.35 to 1.37===&lt;br /&gt;
I tried updating from 1.35 to 1.37 using the official Docker image and had issues getting Parsoid and the VisualEditor to work. Using my existing &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; config, I ran into 404 errors when launching the VisualEditor. This was resolved by defining the URL in &amp;lt;code&amp;gt;$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;]&amp;lt;/code&amp;gt; to point to the Docker container itself: &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = array(&lt;br /&gt;
    // URL to the Parsoid instance&lt;br /&gt;
    &#039;url&#039; =&amp;gt; &#039;http://localhost:8080/rest.php&#039;,&lt;br /&gt;
);&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
This however resulted in another error, a 400, and the web browser developer console showed &amp;quot;The requested relative path (...) did not match any known handler&amp;quot;. I did see the rest.php calls land on the container&#039;s web server. This was eventually fixed by loading the Parsoid extension in LocalSettings.php. My LocalSettings.php includes the following lines:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wfLoadExtension(&#039;VisualEditor&#039;);&lt;br /&gt;
wfLoadExtension( &#039;Parsoid&#039;, &#039;vendor/wikimedia/parsoid/extension.json&#039; );&lt;br /&gt;
$wgDefaultUserOptions[&#039;visualeditor-enable&#039;] = 1;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Updating from 1.37 to 1.39===&lt;br /&gt;
This update surfaced skin deprecation issues. To hide the deprecation notices, I added the following lines to LocalSettings.php:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgShowExceptionDetails = false;&lt;br /&gt;
$wgDeprecationReleaseLimit = &#039;1.30&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Corrupt objectcache table ===&lt;br /&gt;
For some odd reason, my wiki&#039;s objectcache table randomly gets corrupted. You can recover from this by dropping and re-creating the objectcache table.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # mariadb -u root -p$MYSQL_ROOT_PASSWORD&lt;br /&gt;
MariaDB [(none)]&amp;gt; use wiki&lt;br /&gt;
MariaDB [wiki]&amp;gt;  DROP TABLE IF EXISTS `objectcache`;&lt;br /&gt;
MariaDB [wiki]&amp;gt;  CREATE TABLE `objectcache` (&lt;br /&gt;
  `keyname` varbinary(255) NOT NULL DEFAULT &#039;&#039;,&lt;br /&gt;
  `value` mediumblob DEFAULT NULL,&lt;br /&gt;
  `exptime` binary(14) NOT NULL,&lt;br /&gt;
  `modtoken` varbinary(17) NOT NULL DEFAULT &#039;00000000000000000&#039;,&lt;br /&gt;
  `flags` int(10) unsigned DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`keyname`),&lt;br /&gt;
  KEY `exptime` (`exptime`)&lt;br /&gt;
) ENGINE=InnoDB DEFAULT CHARSET=binary;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== MariaDB 12 ===&lt;br /&gt;
MariaDB 12 is not supported by MediaWiki (as of the time of writing, MediaWiki 1.44). I had some docker-compose files which referenced &amp;lt;code&amp;gt;mariadb:latest&amp;lt;/code&amp;gt;, which was recently changed over to version 12, and this caused MediaWiki to think the database is in read-only mode. The result is this warning when trying to edit a page:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = The primary database server is running in read-only mode.&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
The fix is to downgrade the Docker image back to MariaDB 11. Fortunately, everything still works even after MariaDB 12&#039;s database migration.&lt;br /&gt;
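To avoid this class of surprise, pin the image tag to a supported major version instead of {{code|latest}}. A minimal docker-compose sketch (the service name and settings are illustrative, not my exact stack):&lt;br /&gt;

```yaml
services:
  db:
    # Pin to MariaDB 11 until MediaWiki supports 12
    image: mariadb:11
    restart: unless-stopped
```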
&lt;br /&gt;
This bug is tracked at https://phabricator.wikimedia.org/T401570&amp;lt;nowiki/&amp;gt;{{Navbox Web}}&lt;br /&gt;
[[Category:WebApp]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=MediaWiki&amp;diff=7706</id>
		<title>MediaWiki</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=MediaWiki&amp;diff=7706"/>
		<updated>2025-08-18T23:56:39Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Troubleshooting */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;MediaWiki is an open-source wiki engine written in PHP by the Wikimedia Foundation. It is used by both Wikipedia and this site.&lt;br /&gt;
&lt;br /&gt;
Visit the project website at:&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Running inside Docker==&lt;br /&gt;
You can run MediaWiki from a Docker container. A proof of concept can be found at:&lt;br /&gt;
&lt;br /&gt;
*https://git.steamr.com/docker/mediawiki&lt;br /&gt;
&lt;br /&gt;
In order to make a single MediaWiki container image that works regardless of custom skins or extensions, I intentionally separated custom skins and extensions into their own directories. When the container first starts, a setup script symlinks these additional extensions and skins into the primary directory. Custom extensions and skins need to be placed in the extensions and skins volumes and then added to the LocalSettings.php configuration file. When adding a new extension or skin, you must restart the container for the changes to take effect. &lt;br /&gt;
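The startup symlinking described above can be sketched roughly as follows (the directory names are assumptions for illustration; the actual script lives in the image repository):&lt;br /&gt;

```shell
# Rough sketch of the container's startup step: link every mounted
# add-on directory into the main MediaWiki tree. Paths are assumptions.
link_addons() {
    src="$1"
    dst="$2"
    mkdir -p "$dst"
    for d in "$src"/*/; do
        [ -d "$d" ] || continue
        # -sfn replaces any stale link left over from a previous start
        ln -sfn "${d%/}" "$dst/$(basename "$d")"
    done
}

# e.g. link_addons /extensions /mediawiki/extensions
#      link_addons /skins      /mediawiki/skins
```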
&lt;br /&gt;
To get started with my image, create the container with volumes for:&lt;br /&gt;
&lt;br /&gt;
#Images / Uploads as /mediawiki/images&lt;br /&gt;
#Extensions as /extensions&lt;br /&gt;
#Skins as /skins&lt;br /&gt;
#The {{code|LocalSettings.php}} configuration file under /config&lt;br /&gt;
#Database (on a remote server or container), or a SQLite file on a volume&lt;br /&gt;
&lt;br /&gt;
An example docker-compose configuration running a wiki:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = wiki:&lt;br /&gt;
    image: registry.steamr.com/docker/mediawiki:1.34&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - db-net&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8888&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - VIRTUAL_HOST=wiki.steamr.com&lt;br /&gt;
      - APP_ROOT=/mediawiki&lt;br /&gt;
      - DOCUMENT_ROOT=/mediawiki&lt;br /&gt;
      - DB_HOST=db&lt;br /&gt;
      - DB_DATABASE=wiki&lt;br /&gt;
      - DB_USERNAME=wiki&lt;br /&gt;
      - DB_PASSWORD=wiki&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/volumes/wiki/config:/config&lt;br /&gt;
      - /var/volumes/wiki/extensions:/extensions&lt;br /&gt;
      - /var/volumes/wiki/skins:/skins&lt;br /&gt;
      - /var/volumes/wiki/images:/mediawiki/images&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Configuration==&lt;br /&gt;
MediaWiki only has a single configuration file at {{code|LocalSettings.php}}.&lt;br /&gt;
&lt;br /&gt;
If you wish to run a wiki on the root of a domain, you need to set &amp;lt;code&amp;gt;$wgScriptPath&amp;lt;/code&amp;gt; empty like so:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgScriptPath = &amp;quot;&amp;quot;;&lt;br /&gt;
$wgArticlePath = &amp;quot;/$1&amp;quot;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Extensions===&lt;br /&gt;
Extensions are placed under the {{code|/extensions}} directory. Common extensions are bundled with the base installation of MediaWiki but not enabled by default. Bundled extensions can be enabled by adding a {{code|wfLoadExtension(&#039;extension&#039;)}} call to the {{code|LocalSettings.php}} file.&lt;br /&gt;
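For example, enabling two of the bundled extensions looks like this (any bundled extension name works the same way):&lt;br /&gt;

```php
// LocalSettings.php: enable extensions bundled with the MediaWiki tarball
wfLoadExtension( 'ParserFunctions' );
wfLoadExtension( 'SyntaxHighlight_GeSHi' );
```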
&lt;br /&gt;
There are some extensions that this wiki requires:&lt;br /&gt;
&lt;br /&gt;
;Intersection&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-intersection.git&lt;br /&gt;
:Generates dynamic lists&lt;br /&gt;
;Scribunto&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-Scribunto.git&lt;br /&gt;
:Generates scripted outputs using Lua&lt;br /&gt;
;Math&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-Math&lt;br /&gt;
:Generates math formulas&lt;br /&gt;
;NativeSvgHandler&lt;br /&gt;
:https://github.com/wikimedia/mediawiki-extensions-NativeSvgHandler.git&lt;br /&gt;
:Embeds SVG files as an image for client-side rendering. Requires appending {{code|http://www.w3.org/tr/rec-rdf-syntax/}} to {{code|$validNamespaces}} in {{code|UploadBase.php}}.&lt;br /&gt;
;WikiSEO&lt;br /&gt;
:https://github.com/octfx/wiki-seo/&lt;br /&gt;
:Enables custom SEO meta keyword and description tags on the wiki.&lt;br /&gt;
;Score&lt;br /&gt;
:https://www.mediawiki.org/wiki/Extension:Score&lt;br /&gt;
: Lets you embed LilyPond music notation into your Wiki&lt;br /&gt;
&lt;br /&gt;
===Skins===&lt;br /&gt;
Skins are placed under the {{code|/skins}} directory. Modern skins are loaded with the {{code|wfLoadSkin(&#039;skin&#039;)}} call in {{code|LocalSettings.php}}, which reads the skin&#039;s {{code|skin.json}} manifest file in the skin&#039;s directory. The manifest file contains the skin&#039;s name, autoload class files, as well as which resource modules to load, including the stylesheets and JavaScript files that are part of the skin.&lt;br /&gt;
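A minimal {{code|skin.json}} manifest looks roughly like the sketch below. The names and file paths here are placeholders, not the Reader skin&#039;s actual manifest:&lt;br /&gt;

```json
{
	"name": "Example",
	"ValidSkinNames": { "example": "Example" },
	"AutoloadClasses": { "SkinExample": "SkinExample.php" },
	"ResourceModules": {
		"skins.example": {
			"styles": [ "resources/screen.css" ],
			"scripts": [ "resources/example.js" ]
		}
	}
}
```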
&lt;br /&gt;
MediaWiki&#039;s guide on skinning is relatively up to date, albeit a little confusing at first.&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org/wiki/Manual:Skinning_Part_2&lt;br /&gt;
&lt;br /&gt;
The Reader skin used on this wiki:&lt;br /&gt;
&lt;br /&gt;
*https://git.steamr.com/leo/mediawiki-reader-skin&lt;br /&gt;
&lt;br /&gt;
The {{code|OutputPage}} object handles all HTML generation as well as the linking of JavaScript and CSS modules. You can use it to inject HTML or scripts into the page through its method calls. See: https://www.mediawiki.org/wiki/Manual:OutputPage.php&lt;br /&gt;
&lt;br /&gt;
===Transcluded Pages===&lt;br /&gt;
There are some pages that are used by the MediaWiki software itself, including:&lt;br /&gt;
&lt;br /&gt;
*[[MediaWiki:Aboutsite]]&lt;br /&gt;
*[[MediaWiki:Disclaimers]]&lt;br /&gt;
*[[MediaWiki:Privacy]]&lt;br /&gt;
*[[MediaWiki:Toolbox]]&lt;br /&gt;
*[[MediaWiki:Sidebar]]&lt;br /&gt;
&lt;br /&gt;
Some resources are also loaded from pages including:&lt;br /&gt;
&lt;br /&gt;
*[[MediaWiki:Geshi.css]]&lt;br /&gt;
*[[MediaWiki:Common.css]]&lt;br /&gt;
*[[MediaWiki:Common.js]]&lt;br /&gt;
&lt;br /&gt;
Skins should be able to handle the contents of these MediaWiki pages, which are typically shown somewhere on the page. Of course, custom skins can also reference their own set of pages, as the bootstrap/reader skin I&#039;m currently using does:&lt;br /&gt;
&lt;br /&gt;
*[[Bootstrap:Footer]]&lt;br /&gt;
*[[Bootstrap:Sidebar]]&lt;br /&gt;
*[[Bootstrap:Jumbotron]]&lt;br /&gt;
&lt;br /&gt;
==Tasks==&lt;br /&gt;
&lt;br /&gt;
===Enabling Visual Editor===&lt;br /&gt;
The Visual Editor has been included out of the box since MediaWiki 1.35. &#039;&#039;&#039;The following steps are no longer required.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Visual Editor is an extension that enables the WYSIWYG editor. This extension requires Parsoid in order to properly save changes. &lt;br /&gt;
&lt;br /&gt;
Installing the Visual Editor extension is simple:&lt;br /&gt;
&lt;br /&gt;
#Download the extension to the extensions directory {{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = $ cd extensions&lt;br /&gt;
$ wget https://extdist.wmflabs.org/dist/extensions/VisualEditor-REL1_34-74116a7.tar.gz&lt;br /&gt;
$ tar -xzf VisualEditor*gz&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
#Edit LocalSettings.php and enable the extension. {{Highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = wfLoadExtension(&#039;VisualEditor&#039;);&lt;br /&gt;
$wgDefaultUserOptions[&#039;visualeditor-enable&#039;] = 1;&lt;br /&gt;
$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = array(&lt;br /&gt;
    // URL to the Parsoid instance&lt;br /&gt;
    // Use port 8142 if you use the Debian package&lt;br /&gt;
    &#039;url&#039; =&amp;gt; &#039;http://parsoid:8000&#039;,&lt;br /&gt;
    // Parsoid &amp;quot;domain&amp;quot;, see below (optional)&lt;br /&gt;
    &#039;domain&#039; =&amp;gt; &#039;wiki&#039;,&lt;br /&gt;
    # // Parsoid &amp;quot;prefix&amp;quot;, see below (optional)&lt;br /&gt;
    # &#039;prefix&#039; =&amp;gt; &#039;localhost&#039;&lt;br /&gt;
);&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Getting Parsoid running is slightly trickier. I recommend running Parsoid in a Docker container to simplify installation. There is one built at thenets/parsoid which works just fine; a clone of that project is available at https://git.steamr.com/docker/parsoid. The container takes environment variables to configure which domains Parsoid should work with. An example docker-compose file with everything working is given below.&lt;br /&gt;
&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = yaml&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
services:&lt;br /&gt;
  wiki:&lt;br /&gt;
    image: registry.steamr.com/docker/mediawiki:1.34&lt;br /&gt;
    labels:&lt;br /&gt;
      - &amp;quot;traefik.enable=true&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.port=8888&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.docker.network=traefik&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.frontend.rule=Host:wiki.steamr.com&amp;quot;&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - db-net&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8888&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - VIRTUAL_HOST=wiki.home.steamr.com&lt;br /&gt;
      - APP_ROOT=/mediawiki&lt;br /&gt;
      - DOCUMENT_ROOT=/mediawiki&lt;br /&gt;
      - DB_HOST=db&lt;br /&gt;
      - DB_DATABASE=wiki&lt;br /&gt;
      - DB_USERNAME=wiki&lt;br /&gt;
      - DB_PASSWORD=wiki&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/volumes/wiki/config:/config&lt;br /&gt;
      - /var/volumes/wiki/extensions:/extensions&lt;br /&gt;
      - /var/volumes/wiki/skins:/skins&lt;br /&gt;
      - /var/volumes/wiki/images:/mediawiki/images&lt;br /&gt;
&lt;br /&gt;
  parsoid:&lt;br /&gt;
    image: registry.steamr.com/docker/parsoid:latest&lt;br /&gt;
    labels:&lt;br /&gt;
      - &amp;quot;traefik.enable=true&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.port=8000&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.docker.network=traefik&amp;quot;&lt;br /&gt;
      - &amp;quot;traefik.frontend.rule=Host:parsoid.steamr.com&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - PARSOID_DOMAIN_wiki=http://wiki:8888/api.php&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8000&amp;quot;&lt;br /&gt;
    restart: always&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
The Visual Editor should be functional at this point. If the editor does not start and shows no visible error message, check the JavaScript console for errors. There are certain skin requirements that must be met for the Visual Editor to work, outlined at https://www.mediawiki.org/wiki/VisualEditor/Skin_requirements.&lt;br /&gt;
&lt;br /&gt;
===Inserting a custom script in {{code|&amp;lt;head&amp;gt;}}===&lt;br /&gt;
If using a custom skin, use the {{code|OutputPage}} object and call {{code|addHeadItem(&#039;name&#039;, &#039;&amp;lt;script&amp;gt;...&amp;lt;/script&amp;gt;&#039;)}} to inject a custom script block within the document head.&lt;br /&gt;
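A sketch of what that call looks like from skin code, assuming {{code|$out}} is the skin&#039;s {{code|OutputPage}} instance; the item name and script URL are placeholders:&lt;br /&gt;

```php
// Sketch only: $out is assumed to be the skin's OutputPage instance.
// Html::element() builds the tag so we avoid hand-writing markup.
$out->addHeadItem(
	'site-analytics', // unique item name (placeholder)
	Html::element( 'script', [ 'src' => 'https://example.org/a.js' ], '' )
);
```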
&lt;br /&gt;
Alternatively, write a specific OutputPageBeforeHTML hook, and from there call addInlineScript.&lt;br /&gt;
&lt;br /&gt;
===Adding a custom button to WikiEditor Toolbar===&lt;br /&gt;
To add a custom button to the WikiEditor toolbar next to the existing bold and italic buttons, edit the {{code|MediaWiki:Common.js}} file. This file can only be changed by {{code|interface administrator}} users. Membership in this group can be assigned at [[Special:UserRights/username]].&lt;br /&gt;
&lt;br /&gt;
The {{code|Common.js}} file used to add the Code and SyntaxHighlight templates used on this wiki is given below.&lt;br /&gt;
&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = js&lt;br /&gt;
| code = var customizeToolbar = function () {&lt;br /&gt;
	$(&#039;#wpTextbox1&#039;).wikiEditor(&#039;addToToolbar&#039;, {&lt;br /&gt;
		&#039;section&#039;: &#039;main&#039;,&lt;br /&gt;
		&#039;group&#039;: &#039;format&#039;,&lt;br /&gt;
		&#039;tools&#039;: {&lt;br /&gt;
			&#039;code&#039;: {&lt;br /&gt;
				label: &#039;code&#039;,&lt;br /&gt;
				type: &#039;button&#039;,&lt;br /&gt;
				oouiIcon: &#039;code&#039;,&lt;br /&gt;
				action: {&lt;br /&gt;
					type: &#039;encapsulate&#039;,&lt;br /&gt;
					options: {&lt;br /&gt;
						pre: &amp;quot;{{code|&amp;quot;,&lt;br /&gt;
						post: &amp;quot;}}&amp;quot;&lt;br /&gt;
					}&lt;br /&gt;
				}&lt;br /&gt;
			}&lt;br /&gt;
		}&lt;br /&gt;
	});&lt;br /&gt;
	$(&#039;#wpTextbox1&#039;).wikiEditor(&#039;addToToolbar&#039;, {&lt;br /&gt;
		&#039;section&#039;: &#039;main&#039;,&lt;br /&gt;
		&#039;group&#039;: &#039;format&#039;,&lt;br /&gt;
		&#039;tools&#039;: {&lt;br /&gt;
			&#039;terminal&#039;: {&lt;br /&gt;
				label: &#039;highlight&#039;,&lt;br /&gt;
				type: &#039;button&#039;,&lt;br /&gt;
				oouiIcon: &#039;tag&#039;,&lt;br /&gt;
				action: {&lt;br /&gt;
					type: &#039;encapsulate&#039;,&lt;br /&gt;
					options: {&lt;br /&gt;
						pre: &amp;quot;&amp;lt;nowiki&amp;gt;{{&amp;lt;/nowiki&amp;gt;highlight{{!}}lang=terminal{{!}}code=\n&amp;quot;,&lt;br /&gt;
&amp;lt;nowiki&amp;gt;						post: &amp;quot;}}&amp;quot;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
					}&lt;br /&gt;
				}&lt;br /&gt;
			}&lt;br /&gt;
		}&lt;br /&gt;
	});&lt;br /&gt;
};&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Documentation for this can be found at:&lt;br /&gt;
&lt;br /&gt;
*https://www.mediawiki.org/wiki/Extension:WikiEditor/Toolbar_customization#Basic_setup&lt;br /&gt;
&lt;br /&gt;
===Remove the &#039;Retrieved from&#039; footer message===&lt;br /&gt;
There are a few ways to hide the &#039;Retrieved from&#039; message that appears at the end of every article.&lt;br /&gt;
&lt;br /&gt;
#Edit the [[MediaWiki:Retrievedfrom]] page with an Administrator account. Comment out or remove the contents to suppress the message.&lt;br /&gt;
#Hide the content with CSS by adding to [[MediaWiki:Common.css]] the following rule: {{highlight&lt;br /&gt;
| lang = css&lt;br /&gt;
| code = /* hide the &amp;quot;Retrieved from&amp;quot; message */&lt;br /&gt;
.printfooter { display: none; &amp;lt;nowiki&amp;gt;}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
#Edit the template that your skin is using. It&#039;ll look something like: {{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = {{#html-printfooter}}&amp;lt;div class=&amp;quot;printfooter&amp;quot;&amp;gt;{{{html-printfooter}}}&amp;lt;/div&amp;gt;{{/html-printfooter}}&lt;br /&gt;
}}. Either remove the line or comment it out using mustache &amp;lt;code&amp;gt;{{! ... }}&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
===How to promote a user into a sysop, bureaucrat, or interface-admin group===&lt;br /&gt;
Certain actions on the Wiki are restricted to specific user groups. The default permissions are listed at https://www.mediawiki.org/wiki/Manual:User_rights. To add a user to a specific group:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ mysql -u $User -p$Password $Database&lt;br /&gt;
&lt;br /&gt;
## In MySQL prompt, determine the user&#039;s user_id&lt;br /&gt;
MariaDB [wiki]&amp;gt;&amp;lt;nowiki&amp;gt; SELECT user_id, user_name FROM user WHERE user_name = &#039;leo&#039;;&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
| user_id | user_name |&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
|       1 | Leo       |&lt;br /&gt;
+---------+-----------+&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
## Add the user into the desired group using the user_id above.&lt;br /&gt;
MariaDB [wiki]&amp;gt; INSERT INTO `user_groups` VALUES (1, &#039;sysop&#039;), (1, &#039;bureaucrat&#039;), (1, &#039;interface-admin&#039;);&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Running the Docker container under a different prefix using Traefik ===&lt;br /&gt;
The official Docker image, as well as my customized version based on it, expects the wiki to be served from the domain root. However, if you want to serve the wiki from a &#039;subdirectory&#039; like &amp;lt;code&amp;gt;/wiki&amp;lt;/code&amp;gt;, you need to do a few things:&lt;br /&gt;
&lt;br /&gt;
#In Traefik, your router should have the following rules: &amp;lt;code&amp;gt;traefik.http.routers.wiki-https.rule=(Host(`my-wiki-server.tld`) &amp;amp;&amp;amp; PathPrefix(`/wiki`))&amp;lt;/code&amp;gt;.&lt;br /&gt;
#In your Docker container, you need to symlink the &amp;lt;code&amp;gt;/wiki&amp;lt;/code&amp;gt; directory to &amp;lt;code&amp;gt;/var/www/html&amp;lt;/code&amp;gt;. &amp;lt;code&amp;gt;ln -s /var/www/html /var/www/html/wiki&amp;lt;/code&amp;gt;&lt;br /&gt;
#In your MediaWiki &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; configuration, you need to:&lt;br /&gt;
##Set &amp;lt;code&amp;gt;$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = [&#039;url&#039; =&amp;gt; &#039;&amp;lt;nowiki&amp;gt;http://localhost:8080/wiki/rest.php&#039;&amp;lt;/nowiki&amp;gt;];&amp;lt;/code&amp;gt; so that Parsoid continues to work.&lt;br /&gt;
##Set &amp;lt;code&amp;gt;$wgScriptPath = &amp;quot;/wiki&amp;quot;;&amp;lt;/code&amp;gt;&lt;br /&gt;
## Set &amp;lt;code&amp;gt;$wgArticlePath = &amp;quot;/wiki/$1&amp;quot;;&amp;lt;/code&amp;gt;&lt;br /&gt;
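Putting the &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; pieces of the steps above together (the localhost:8080 address is from my setup and may differ in yours):&lt;br /&gt;

```php
// LocalSettings.php fragment for serving the wiki under the /wiki prefix
$wgScriptPath  = "/wiki";
$wgArticlePath = "/wiki/$1";
$wgVirtualRestConfig['modules']['parsoid'] = [
    // Parsoid served by the same container, now under the prefix
    'url' => 'http://localhost:8080/wiki/rest.php',
];
```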
&lt;br /&gt;
===Running MediaWiki behind a reverse proxy===&lt;br /&gt;
When running MediaWiki behind a reverse proxy such as Squid, nginx, or Traefik, edits made by users will appear to originate from the reverse proxy server. &lt;br /&gt;
&lt;br /&gt;
To fix this issue, you need to tell MediaWiki which address ranges are from the reverse proxy and to have it use the X-Forwarded-For header instead. Since MediaWiki 1.35, this can be accomplished by adding [https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:$wgUsePrivateIPs $wgUsePrivateIPs] and [https://www.mediawiki.org/wiki/Manual:$wgCdnServersNoPurge $wgCdnServersNoPurge] in your &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; file:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgUsePrivateIPs = true;&lt;br /&gt;
$wgUseCdn = true;&lt;br /&gt;
$wgCdnServersNoPurge = [];&lt;br /&gt;
$wgCdnServersNoPurge[] = &amp;quot;172.18.0.0/16&amp;quot;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
Replace 172.18.0.0/16 with the CIDR of your reverse proxy networks (this CIDR is my internal Traefik network for my Docker stack).&lt;br /&gt;
&lt;br /&gt;
=== Adding the Score extension ===&lt;br /&gt;
The Score extension allows the rendering of music notation using the Lilypond project as the renderer. I ran into a series of hurdles getting this to work for MediaWiki 1.39.&lt;br /&gt;
&lt;br /&gt;
Issues are: &lt;br /&gt;
&lt;br /&gt;
#The extension highly recommends the use of ShellBox to isolate Lilypond because Lilypond is insecure and may allow remote code execution. Since this wiki is editable by the public, this has to be done. You&#039;ll have to set up a ShellBox instance (as a separate container). MediaWiki/Wikipedia has their own container image which doesn&#039;t work outside their infrastructure, so I had to make my own.&lt;br /&gt;
# Lilypond has issues and breaks in my environment. I had to:&lt;br /&gt;
#*Modify &amp;lt;code&amp;gt;/usr/bin/lilypond&amp;lt;/code&amp;gt; to include &amp;lt;code&amp;gt;export PATH=&amp;quot;/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin&amp;quot;&amp;amp;nbsp;&amp;lt;/code&amp;gt;otherwise it will fail silently in the extension&#039;s script. If you enable verbose mode on lilypond, you will see &amp;lt;code&amp;gt;warning: g_spawn_sync failed (0): gs: Failed to execute child process “gs” (No such file or directory)&amp;lt;/code&amp;gt;.&lt;br /&gt;
#*Symlink the lilypond fonts because on Debian Bullseye, the package versions for Lilypond and Lilypond-fonts are mismatched so the paths are broken (What the heck Debian?)&lt;br /&gt;
#*I tried to enable SVG, which works after some tweaking to the generator script, but the output is a full page in size and can&#039;t be easily trimmed.&lt;br /&gt;
#The Score extension&#039;s included shell script is questionable...&lt;br /&gt;
#* The fuzzy PNG image apparently comes from the Ghostscript .ps to .png conversion. I&#039;m not sure why this is even done separately when Lilypond also generates the PNGs. I hacked around this by copying &#039;file.png&#039; over &#039;file-page1.png&#039; and commenting out the &amp;lt;code&amp;gt;runGhostscript&amp;lt;/code&amp;gt; function call in the &amp;lt;code&amp;gt;generatePngAndMidi.sh&amp;lt;/code&amp;gt; script. That keeps the PNG from being shrunk down and fuzzy.&lt;br /&gt;
After overcoming all the aforementioned issues, I think I&#039;ve finally got it working.&lt;br /&gt;
&lt;br /&gt;
* Use the following Dockerfiles&lt;br /&gt;
* Add the following docker-compose entries&lt;br /&gt;
* In the volume, run: {{Highlight&lt;br /&gt;
| code = $ git clone https://gerrit.wikimedia.org/r/mediawiki/libs/Shellbox shellbox&lt;br /&gt;
&lt;br /&gt;
## In the fpm container, run as the shellbox user:&lt;br /&gt;
# su shellbox&lt;br /&gt;
$ cd /srv/shellbox&lt;br /&gt;
$ /usr/local/bin/composer install --no-dev&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
* Enable the plugin:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wfLoadExtension( &#039;Score&#039; );&lt;br /&gt;
$wgScoreTrim = true;&lt;br /&gt;
$wgScoreUseSvg = false;&lt;br /&gt;
$wgShellboxUrl = &#039;http://shellbox/shellbox&#039;;&lt;br /&gt;
$wgShellboxSecretKey = &#039;secret_key&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
* You have to tweak [[MediaWiki:Common.css]] so that the embedded images aren&#039;t too big. I made them 4em max-height as I intend to only embed single staff snippets for my notes.&lt;br /&gt;
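A minimal sketch of such a rule for [[MediaWiki:Common.css]] (the &amp;lt;code&amp;gt;.mw-ext-score&amp;lt;/code&amp;gt; selector is an assumption based on the wrapper class the Score extension emits; check your page&#039;s HTML to confirm):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = /* Keep embedded Score snippets at single-staff height */&lt;br /&gt;
.mw-ext-score img {&lt;br /&gt;
	max-height: 4em;&lt;br /&gt;
	width: auto;&lt;br /&gt;
}&lt;br /&gt;
| lang = css&lt;br /&gt;
}}&lt;br /&gt;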
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
===Scribunto Lua Failures===&lt;br /&gt;
If templates cause this error:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = Lua error: Internal error: The interpreter exited with status 127.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This likely means that you do not have Lua installed or it is not in the PATH. You will need to specify the Lua path in {{code|LocalSettings.php}} with this line:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgScribuntoEngineConf[&#039;luastandalone&#039;][&#039;luaPath&#039;] = &amp;quot;/usr/bin/lua5.1&amp;quot;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Database Import Incomplete ===&lt;br /&gt;
Database imports from MySQL 5.7.27 to MariaDB 10.4.7 seem to fail. Imports only complete if the database dump was made without {{code|Enclose export in a transaction}} enabled in phpMyAdmin, but subsequent edits on the destination wiki will result in this error message:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = The revision #0 of the page named &amp;quot;some-article&amp;quot; does not exist.&lt;br /&gt;
&lt;br /&gt;
This is usually caused by following an outdated history link to a page that has been deleted. Details can be found in the deletion log.&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Solution====&lt;br /&gt;
It turns out the destination database server (mariadb:10.4.7, in docker) was not set up properly after being upgraded from MariaDB-10.1. On start up, it showed the following error messages:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = 2019-09-01 21:14:44 0 [Note] Server socket created on IP: &#039;::&#039;.&lt;br /&gt;
2019-09-01 21:14:44 0 [Warning] &#039;user&#039; entry &#039;root@localhost.localdomain&#039; ignored in --skip-name-resolve mode.&lt;br /&gt;
2019-09-01 21:14:44 0 [Warning] &#039;proxies_priv&#039; entry &#039;@% root@localhost.localdomain&#039; ignored in --skip-name-resolve mode.&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] Missing system table mysql.roles_mapping; please run mysql_upgrade to create it&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] Incorrect definition of table mysql.event: expected column &#039;sql_mode&#039; at position 14 to have type set(&#039;REAL_AS_FLOAT&#039;,&#039;PIPES_AS_CONCAT&#039;,&#039;ANSI_QUOTES&#039;,&#039;IGNORE_SPACE&#039;,&#039;IGNORE_BAD_TABLE_OPTIONS&#039;,&#039;ONLY_FULL_GROUP_BY&#039;,&#039;NO_UNSIGNED_SUBTRACTION&#039;,&#039;NO_DIR_IN_CREATE&#039;,&#039;POSTGRESQL&#039;,&#039;ORACLE&#039;,&#039;MSSQL&#039;,&#039;DB2&#039;,&#039;MAXDB&#039;,&#039;NO_KEY_OPTIONS&#039;,&#039;NO_TABLE_OPTIONS&#039;,&#039;NO_FIELD_OPTIONS&#039;,&#039;MYSQL323&#039;,&#039;MYSQL40&#039;,&#039;ANSI&#039;,&#039;NO_AUTO_VALUE_ON_ZERO&#039;,&#039;NO_BACKSLASH_ESCAPES&#039;,&#039;STRICT_TRANS_TABLES&#039;,&#039;STRICT_ALL_TABLES&#039;,&#039;NO_ZERO_IN_DATE&#039;,&#039;NO_ZERO_DATE&#039;,&#039;INVALID_DATES&#039;,&#039;ERROR_FOR_DIVISION_BY_ZERO&#039;,&#039;TRADITIONAL&#039;,&#039;NO_AUTO_CREATE_USER&#039;,&#039;HIGH_NOT_PRECEDENCE&#039;,&#039;NO_ENGINE_SUBSTITUTION&#039;,&#039;PAD_CHAR_TO_FULL_LENGTH&#039;,&#039;EMPTY_STRING_IS_NULL&#039;,&#039;SIMULTANEOUS_ASSIGNMENT&#039;), found type set(&#039;REAL_AS_FLOAT&#039;,&#039;PIPES_AS_CONCAT&#039;,&#039;ANSI_QUOTES&#039;,&#039;IGNORE_SPACE&#039;,&#039;IGNORE_BAD_TABLE_OPTIONS&#039;,&#039;ONLY_FULL_GROUP_BY&#039;,&#039;NO_UNSIGNED_SUBTRACTION&#039;,&#039;NO_DIR_IN_CREATE&#039;,&#039;POSTGRESQL&#039;,&#039;ORACLE&#039;,&#039;MSSQL&#039;,&#039;DB2&#039;,&#039;MAXDB&#039;,&#039;NO_KEY_OPTIONS&#039;,&#039;NO_TABLE_OPTIONS&#039;,&#039;NO_FIELD_OPTIONS&#039;,&#039;MYSQL323&#039;,&#039;MYSQL40&#039;,&#039;ANSI&#039;,&#039;NO_AUTO_VALU&lt;br /&gt;
2019-09-01 21:14:44 0 [ERROR] mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.&lt;br /&gt;
2019-09-01 21:14:44 6 [Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table &#039;mysql.gtid_slave_pos&#039; doesn&#039;t exist&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] Reading of all Master_info entries succeeded&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] Added new Master_info &#039;&#039; to hash table&lt;br /&gt;
2019-09-01 21:14:44 0 [Note] mysqld: ready for connections.&lt;br /&gt;
Version: &#039;10.4.7-MariaDB-1:10.4.7+maria~bionic&#039;  socket: &#039;/var/run/mysqld/mysqld.sock&#039;  port: 3306  mariadb.org binary distribution&lt;br /&gt;
2019-09-01 21:14:46 0 [Note] InnoDB: Buffer pool(s) load completed at 190901 21:14:46&lt;br /&gt;
2019-09-01 21:14:49 8 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.&lt;br /&gt;
2019-09-01 21:14:49 8 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
2019-09-01 21:15:13 9 [ERROR] Transaction not registered for MariaDB 2PC, but transaction is active&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Running {{code|mysql_upgrade}} fixed these errors and a subsequent database import was successful.&lt;br /&gt;
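For reference, the upgrade can be run inside the MariaDB container; a sketch, assuming the root password is available in &amp;lt;code&amp;gt;$MYSQL_ROOT_PASSWORD&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Safe to re-run; already-upgraded tables are skipped&lt;br /&gt;
# mysql_upgrade -u root -p$MYSQL_ROOT_PASSWORD&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;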
&lt;br /&gt;
===Math Extension===&lt;br /&gt;
Using the latest Math extension, formulas constantly return errors similar to:&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = text&lt;br /&gt;
| code = Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response (&amp;quot;Math extension cannot connect to Restbase.&amp;quot;) from server &amp;quot;https://wikimedia.org/api/rest_v1/&amp;quot;:): {\displaystyle V=IR&amp;lt;nowiki&amp;gt;}&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
I installed Mathoid and tried to set {{code|$wgMathFullRestbaseURL}} to the service to no avail, since {{code|$wgMathFullRestbaseURL}} requires the Restbase API rather than the Mathoid API. I did not want to install Restbase for a simple wiki, and requiring Restbase would make hosting on a shared hosting environment tricky.&lt;br /&gt;
&lt;br /&gt;
====Solution====&lt;br /&gt;
&amp;lt;s&amp;gt;It turns out that Math extension versions 1.30 and prior work.&amp;lt;/s&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Save yourself the headache and use MathML, then set the RestbaseURL and MathML URL to Wikipedia&#039;s.&lt;br /&gt;
{{highlight&lt;br /&gt;
| lang = php&lt;br /&gt;
| code = $wgDefaultUserOptions[&#039;math&#039;] = &#039;mathml&#039;;&lt;br /&gt;
$wgMathFullRestbaseURL = &#039;https://en.wikipedia.org/api/rest_&#039;;&lt;br /&gt;
$wgMathMathMLUrl = &#039;https://mathoid-beta.wmflabs.org/&#039;;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Visual Editor===&lt;br /&gt;
&lt;br /&gt;
====All templates are puzzle pieces====&lt;br /&gt;
All templates within the Visual Editor view appear as puzzle pieces. Investigate by looking at the Parsoid logs. The non-obvious issue in my case was that template expansion was failing because I had referenced the unencrypted HTTP URL rather than the HTTPS one. My 301 HTTPS redirect rule caused Parsoid to fail, which resulted in templates being rendered as puzzle pieces.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = {&amp;quot;name&amp;quot;:&amp;quot;parsoid&amp;quot;,&amp;quot;hostname&amp;quot;:&amp;quot;c2774ece52ab&amp;quot;,&amp;quot;pid&amp;quot;:9,&amp;quot;level&amp;quot;:40,&amp;quot;logType&amp;quot;:&amp;quot;warn&amp;quot;,&amp;quot;wiki&amp;quot;:&amp;quot;wiki$0&amp;quot;,&amp;quot;title&amp;quot;:&amp;quot;Kubernetes&amp;quot;,&amp;quot;oldId&amp;quot;:3829,&amp;quot;reqId&amp;quot;:null,&amp;quot;userAgent&amp;quot;:&amp;quot;VisualEditor-MediaWiki/1.34.1&amp;quot;,&amp;quot;msg&amp;quot;:&amp;quot;non-200 response: 301 &amp;lt;!DOCTYPE HTML PUBLIC \&amp;quot;-//IETF//DTD HTML 2.0//EN\&amp;quot;&amp;gt;\n&amp;lt;html&amp;gt;&amp;lt;head&amp;gt;\n&amp;lt;title&amp;gt;301 Moved Permanently&amp;lt;/title&amp;gt;\n&amp;lt;/head&amp;gt;&amp;lt;body&amp;gt;\n&amp;lt;h1&amp;gt;Moved Permanently&amp;lt;/h1&amp;gt;\n&amp;lt;p&amp;gt;The document has moved &amp;lt;a href=\&amp;quot;https://leo.leung.xyz/wiki/api.php\&amp;quot;&amp;gt;here&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;\n&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;\n&amp;quot;,&amp;quot;longMsg&amp;quot;:&amp;quot;non-200 response:\n301\n&amp;lt;!DOCTYPE HTML PUBLIC \&amp;quot;-//IETF//DTD HTML 2.0//EN\&amp;quot;&amp;gt;\n&amp;lt;html&amp;gt;&amp;lt;head&amp;gt;\n&amp;lt;title&amp;gt;301 Moved Permanently&amp;lt;/title&amp;gt;\n&amp;lt;/head&amp;gt;&amp;lt;body&amp;gt;\n&amp;lt;h1&amp;gt;Moved Permanently&amp;lt;/h1&amp;gt;\n&amp;lt;p&amp;gt;The document has moved &amp;lt;a href=\&amp;quot;https://leo.leung.xyz/wiki/api.php\&amp;quot;&amp;gt;here&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;\n&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;\n&amp;quot;,&amp;quot;levelPath&amp;quot;:&amp;quot;warn&amp;quot;,&amp;quot;time&amp;quot;:&amp;quot;2020-05-24T13:02:41.718Z&amp;quot;,&amp;quot;v&amp;quot;:0}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
If you are using the docker image I referenced above, the {{Code|docker-compose.yml}} file should have the environment variable {{Code|PARSOID_DOMAIN_domain}} reference the appropriate https version of the URL.&lt;br /&gt;
&lt;br /&gt;
====Error contacting the Parsoid/RESTBase server (HTTP 412)====&lt;br /&gt;
A private wiki had this error. The error was caused by a traefik middleware used for authentication. I believe the script used for the middleware was generating output which somehow interfered with the rest.php request. Disabling (and later fixing) the middleware fixed this issue.&lt;br /&gt;
&lt;br /&gt;
===The DynamicPageList extension doesn&#039;t sort by lastedit properly===&lt;br /&gt;
As mentioned in [https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:DynamicPageList_(Wikimedia)#ordermethod the DynamicPageList extension documentation], &#039;lastedit&#039; sort method doesn&#039;t actually work as you&#039;d expect:&lt;br /&gt;
{{Quote|quote=It should be noted, that lastedit really sorts by the last time the page was touched. In some cases this is not equivalent to the last edit (for example, this includes permission changes, creation or deletion of linked pages, and alteration of contained templates).|source=https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:DynamicPageList_(Wikimedia)#ordermethod}}&lt;br /&gt;
Worse yet, editing &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; causes all pages to be &#039;touched&#039;, turning the generated list into a random shuffle of every page in the Wiki and rendering it completely useless.&lt;br /&gt;
&lt;br /&gt;
A fix would be to tweak the query for &#039;lastedit&#039; in &amp;lt;code&amp;gt;DynamicPageListHooks.php&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = // From:&lt;br /&gt;
case &#039;lastedit&#039;:&lt;br /&gt;
	$sqlSort = &#039;page_touched&#039;;&lt;br /&gt;
&lt;br /&gt;
// To:&lt;br /&gt;
case &#039;lastedit&#039;:&lt;br /&gt;
	$fields[&#039;rc_timestamp&#039;] = &#039;MAX(recentchanges.rc_timestamp)&#039;;&lt;br /&gt;
	$tables[&#039;recentchanges&#039;] = &#039;recentchanges&#039;;&lt;br /&gt;
	$join[&#039;recentchanges&#039;] = [&#039;LEFT JOIN&#039;, &#039;recentchanges.rc_title = page_title&#039;];&lt;br /&gt;
	$options[&#039;GROUP BY&#039;] = &amp;quot;page_id&amp;quot;;&lt;br /&gt;
	$sqlSort = &#039;rc_timestamp&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Updating from 1.34 to 1.35===&lt;br /&gt;
Version 1.35 includes a built-in PHP based parser to replace Parsoid and bundles the VisualEditor extension.&lt;br /&gt;
&lt;br /&gt;
I had issues getting the 1.34 Docker container to work after updating it to 1.35 because the wiki wasn&#039;t able to reach its own REST API via localhost within the container. Only after setting the server name to the public IP address in /etc/hosts could the wiki reach its REST API and have the visual editor work properly. This is stupid, however, because REST API traffic will leave the container, go out to my ISP, then come right back into the same container via the reverse proxy.&lt;br /&gt;
&lt;br /&gt;
Changes I had to make for VisualEditor to work:&lt;br /&gt;
&lt;br /&gt;
# Add wiki.home.steamr.com to my external IP address in /etc/hosts.&lt;br /&gt;
# Add to nginx.conf after the &amp;lt;code&amp;gt;location /&amp;lt;/code&amp;gt; section: {{Highlight&lt;br /&gt;
| code = location /rest.php/ {&lt;br /&gt;
        try_files $uri $uri/ /rest.php?$query_string;&lt;br /&gt;
}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Updating from 1.35 to 1.37===&lt;br /&gt;
I tried updating from 1.35 to 1.37 using the official Docker image and had issues getting Parsoid and the VisualEditor to work. Using my existing &amp;lt;code&amp;gt;LocalSettings.php&amp;lt;/code&amp;gt; config, I ran into 404 errors when launching the VisualEditor. This was resolved by defining the URL in &amp;lt;code&amp;gt;$wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;]&amp;lt;/code&amp;gt; to point to the Docker container itself: &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgVirtualRestConfig[&#039;modules&#039;][&#039;parsoid&#039;] = array(&lt;br /&gt;
    // URL to the Parsoid instance&lt;br /&gt;
    &#039;url&#039; =&amp;gt; &#039;http://localhost:8080/rest.php&#039;,&lt;br /&gt;
);&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
This, however, resulted in another error (HTTP 400); the web browser developer console showed &amp;quot;The requested relative path (...) did not match any known handler&amp;quot;, even though I saw the rest.php calls land on the container&#039;s web server. This was eventually fixed by loading the Parsoid extension in LocalSettings.php. My LocalSettings.php includes the following lines:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wfLoadExtension(&#039;VisualEditor&#039;);&lt;br /&gt;
wfLoadExtension( &#039;Parsoid&#039;, &#039;vendor/wikimedia/parsoid/extension.json&#039; );&lt;br /&gt;
$wgDefaultUserOptions[&#039;visualeditor-enable&#039;] = 1;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Updating from 1.37 to 1.39===&lt;br /&gt;
This upgrade surfaced skin deprecation notices. To hide them, I added the following lines to LocalSettings.php:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $wgShowExceptionDetails = false;&lt;br /&gt;
$wgDeprecationReleaseLimit = &#039;1.30&#039;;&lt;br /&gt;
| lang = php&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Corrupt objectcache table ===&lt;br /&gt;
For some odd reason, my Wiki&#039;s objectcache table randomly gets corrupted. You can recover by dropping and re-creating the objectcache table.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # mariadb -u root -p$MYSQL_ROOT_PASSWORD&lt;br /&gt;
MariaDB [(none)]&amp;gt; use wiki&lt;br /&gt;
MariaDB [wiki]&amp;gt;  DROP TABLE IF EXISTS `objectcache`;&lt;br /&gt;
MariaDB [wiki]&amp;gt;  CREATE TABLE `objectcache` (&lt;br /&gt;
  `keyname` varbinary(255) NOT NULL DEFAULT &#039;&#039;,&lt;br /&gt;
  `value` mediumblob DEFAULT NULL,&lt;br /&gt;
  `exptime` binary(14) NOT NULL,&lt;br /&gt;
  `modtoken` varbinary(17) NOT NULL DEFAULT &#039;00000000000000000&#039;,&lt;br /&gt;
  `flags` int(10) unsigned DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`keyname`),&lt;br /&gt;
  KEY `exptime` (`exptime`)&lt;br /&gt;
) ENGINE=InnoDB DEFAULT CHARSET=binary;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Mariadb 12 ===&lt;br /&gt;
Mariadb 12 is not supported by MediaWiki (as of the time of writing, MediaWiki 1.44). I had some docker-compose files which referenced &amp;lt;code&amp;gt;mariadb:latest&amp;lt;/code&amp;gt;, which recently switched over to version 12, and this caused MediaWiki to think the database is in read-only mode. This resulted in the following warning when trying to edit a page:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = The primary database server is running in read-only mode.&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
The fix is to pin the Docker image back to Mariadb 11. Fortunately, everything still works even after Mariadb 12&#039;s database migration.{{Navbox Web}}&lt;br /&gt;
[[Category:WebApp]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Proxmox&amp;diff=7705</id>
		<title>Proxmox</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Proxmox&amp;diff=7705"/>
		<updated>2025-08-11T20:48:42Z</updated>

		<summary type="html">&lt;p&gt;Leo: PCIe bus error resolved&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proxmox Virtual Environment (VE) is an open source virtualization management platform for deploying and managing KVM-based virtual machines and LXC containers. The platform itself runs on a Debian-based distribution and supports storage backends such as ZFS and Ceph. &lt;br /&gt;
&lt;br /&gt;
In terms of functionality, Proxmox VE is similar to VMware ESXi with some overlap with vSphere. You can have a cluster of Proxmox servers controlled and managed through a singular web console, command line tools, or via the provided REST API.&lt;br /&gt;
&lt;br /&gt;
==Installation==&lt;br /&gt;
Installation is as simple as installing any other Linux distribution. Download the Proxmox iso image from https://www.proxmox.com/en/downloads. After installation, you should be able to access the web console via HTTPS on port 8006.&lt;br /&gt;
&lt;br /&gt;
==Tasks and How-Tos==&lt;br /&gt;
&lt;br /&gt;
===Remove the subscription nag===&lt;br /&gt;
If your Proxmox server has no active subscription, you will be nagged every time you log in to the web interface. This can be disabled by running the following as root in the server&#039;s console:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # sed -Ezi.bak &amp;quot;s/(Ext.Msg.show\(\{\s+title: gettext\(&#039;No valid sub)/void\(\{ \/\/\1/g&amp;quot; /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
You will have to reapply this after each Proxmox update.&lt;br /&gt;
&lt;br /&gt;
There is also an ansible role which does this at: https://github.com/FuzzyMistborn/ansible-role-proxmox-nag-removal/blob/master/tasks/remove-nag.yml&lt;br /&gt;
&lt;br /&gt;
===Make a VM or LXC start automatically at boot===&lt;br /&gt;
You can enable automatic startup and the startup order for VMs and LXCs under the &#039;Options&#039; panel. [[File:Proxmox start VM at boot.png|alt=Proxmox start VM or container at boot|none|thumb|Proxmox start VM or container at boot]]&lt;br /&gt;
&lt;br /&gt;
===Adding additional LXC templates===&lt;br /&gt;
Proxmox officially supports about a dozen different Linux distributions and provides up to date LXC templates through their repositories. These LXC templates can be downloaded using the &amp;lt;code&amp;gt;pveam&amp;lt;/code&amp;gt; tool or through the Proxmox web interface.&lt;br /&gt;
&lt;br /&gt;
====Adding templates using the web interface====&lt;br /&gt;
[[File:Proxmox CT Templates.png|alt=Download additional Proxmox CT Templates via the web interface|thumb|Download additional Proxmox CT Templates via the web interface]]&lt;br /&gt;
Go to your local storage and click on CT Templates. Click on the Templates button to see available templates.&lt;br /&gt;
&lt;br /&gt;
====Adding templates using the pveam utility====&lt;br /&gt;
The Proxmox VE Appliance Manager (&amp;lt;code&amp;gt;pveam&amp;lt;/code&amp;gt;) tool manages container templates from Proxmox&#039;s repository. More information at https://pve.proxmox.com/pve-docs/pveam.1.html.&lt;br /&gt;
&lt;br /&gt;
Run &amp;lt;code&amp;gt;pveam available&amp;lt;/code&amp;gt; to list all available templates, then run &amp;lt;code&amp;gt;pveam download $storage $template&amp;lt;/code&amp;gt; to download a template to a storage pool. For example:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = root@server:~# pveam available&lt;br /&gt;
mail            proxmox-mailgateway-6.4-standard_6.4-1_amd64.tar.gz&lt;br /&gt;
mail            proxmox-mailgateway-7.0-standard_7.0-1_amd64.tar.gz&lt;br /&gt;
system          almalinux-8-default_20210928_amd64.tar.xz&lt;br /&gt;
system          alpine-3.12-default_20200823_amd64.tar.xz&lt;br /&gt;
system          alpine-3.13-default_20210419_amd64.tar.xz&lt;br /&gt;
system          alpine-3.14-default_20210623_amd64.tar.xz&lt;br /&gt;
system          alpine-3.15-default_20211202_amd64.tar.xz&lt;br /&gt;
system          archlinux-base_20210420-1_amd64.tar.gz&lt;br /&gt;
&lt;br /&gt;
root@server:~# pveam download local fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
downloading http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz to /var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
--2021-12-29 15:17:28--  http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
Resolving download.proxmox.com (download.proxmox.com)... 144.217.225.162, 2607:5300:203:7dc2::162&lt;br /&gt;
Connecting to download.proxmox.com (download.proxmox.com){{!}}144.217.225.162{{!}}:80... connected.&lt;br /&gt;
HTTP request sent, awaiting response... 200 OK&lt;br /&gt;
Length: 89702020 (86M) [application/octet-stream]&lt;br /&gt;
Saving to: &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz.tmp.1053836&#039;&lt;br /&gt;
     0K ........ ........ ........ ........ 37% 2.59M 21s&lt;br /&gt;
 32768K ........ ........ ........ ........ 74% 13.6M 5s&lt;br /&gt;
 65536K ........ ........ .....            100% 20.5M=16s&lt;br /&gt;
2021-12-29 15:17:44 (5.42 MB/s) - &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz.tmp.1053836&#039; saved [89702020/89702020]&lt;br /&gt;
calculating checksum...OK, checksum verified&lt;br /&gt;
download of &#039;http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz&#039; to &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz&#039; finished&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Using Terraform with Proxmox===&lt;br /&gt;
You can use Terraform to manage VMs on Proxmox. More information on setting this up is on the [[Terraform]] page. Setup is relatively straightforward and the Terraform provider for Proxmox is feature rich.&lt;br /&gt;
&lt;br /&gt;
On a related note, you can also use Packer to build VM images. The Packer-Proxmox plugin supports keyboard autotyping which makes automatically building VMs all the more simple.&lt;br /&gt;
&lt;br /&gt;
=== Install SSL certificates ===&lt;br /&gt;
Out of the box, Proxmox will generate a self signed SSL certificate for its management interfaces. This is sufficient from a security standpoint but will result in a SSL certificate warning every time you connect. You may wish to install a signed certificate from a trusted CA to correct this issue.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; Changing the SSL certificates prevented my SPICE clients from connecting. I had to revert back to the self-generated certificates for them to work again. If you are using SPICE for remote desktop, you may need to do further testing before rolling this out into production.&lt;br /&gt;
&lt;br /&gt;
To install a new set of SSL certificates, log in to Proxmox via SSH and copy your SSL private key and certificates to &amp;lt;code&amp;gt;/etc/pve/nodes/$node/pve-ssl.{pem,key}&amp;lt;/code&amp;gt;. Replace &#039;$node&#039; with the hostname of your Proxmox machine:&lt;br /&gt;
&lt;br /&gt;
* The &amp;lt;code&amp;gt;pve-ssl.pem&amp;lt;/code&amp;gt; file should contain your full certificate chain. This can be generated by concatenating your primary certificate, followed by any intermediate certificates.&lt;br /&gt;
* The &amp;lt;code&amp;gt;pve-ssl.key&amp;lt;/code&amp;gt; file should contain your private key without a password.&lt;br /&gt;
&lt;br /&gt;
Restart the pveproxy service to apply. If you have any problems, check the service&#039;s status with &amp;lt;code&amp;gt;systemctl status pveproxy&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cp fullchain.pem /etc/pve/nodes/&amp;lt;node&amp;gt;/pve-ssl.pem&lt;br /&gt;
# cp private-key.pem /etc/pve/nodes/&amp;lt;node&amp;gt;/pve-ssl.key&lt;br /&gt;
# systemctl restart pveproxy&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
=== Reduce ZFS ARC memory requirement ===&lt;br /&gt;
By default, ZFS&#039;s ARC can use up to 50% of the host&#039;s memory. If the VMs backed by a ZFS storage pool are assigned more than 50% of your system&#039;s available memory, you will need to reduce the amount of memory assigned to the ARC to prevent out-of-memory issues.&lt;br /&gt;
&lt;br /&gt;
The minimum memory assigned to ARC as per Proxmox&#039;s documentation is 2GB + 1GB per TB of storage.&lt;br /&gt;
&lt;br /&gt;
To change the ARC max size to 3 GB: &amp;lt;code&amp;gt;echo &amp;quot;$[3 * 1024*1024*1024]&amp;quot; &amp;gt; /sys/module/zfs/parameters/zfs_arc_max&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To make this permanent, edit &amp;lt;code&amp;gt;/etc/modprobe.d/zfs.conf&amp;lt;/code&amp;gt; and add &amp;lt;code&amp;gt;options zfs zfs_arc_max=3221225472&amp;lt;/code&amp;gt;&lt;br /&gt;
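If the root filesystem is on ZFS, the module option is read from the initramfs at boot, so it likely needs to be regenerated as well; a sketch:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # echo &amp;quot;options zfs zfs_arc_max=3221225472&amp;quot; &amp;gt; /etc/modprobe.d/zfs.conf&lt;br /&gt;
# update-initramfs -u&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;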
&lt;br /&gt;
=== Enable QEMU VNC server on a guest VM ===&lt;br /&gt;
There are a few options when you want to remote desktop into a VM, such as running an RDP/VNC server within the VM or leveraging the SPICE display/connection option built into Proxmox. Each has its own advantages and disadvantages.&lt;br /&gt;
&lt;br /&gt;
I have recently started using the built-in VNC server that can be enabled in the KVM/QEMU config. The benefit of this approach is that you can control the VM during the boot process, since you see and control the VM just as you would in the Proxmox web interface.&lt;br /&gt;
&lt;br /&gt;
To enable VNC on a particular VM:&lt;br /&gt;
&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;/etc/pve/local/qemu-server/&amp;lt;vmid&amp;gt;.conf&amp;lt;/code&amp;gt; and add the following &amp;lt;code&amp;gt;args&amp;lt;/code&amp;gt; line to enable the VNC server on port 5901 (the number after the colon is the VNC display number; add 5900 to get the TCP port, so display 1 is TCP port 5901). You may change this number if you have other VNC servers running. &amp;lt;code&amp;gt;args: -vnc 0.0.0.0:1,password=on&amp;lt;/code&amp;gt;&lt;br /&gt;
# In the same .conf file, add the following line to set the VNC password when the VM starts up. &amp;lt;code&amp;gt;hookscript: local:snippets/set_vnc_password.sh&amp;lt;/code&amp;gt;&lt;br /&gt;
# Create the snippet by creating a file at &amp;lt;code&amp;gt;/var/lib/vz/snippets/set_vnc_password.sh&amp;lt;/code&amp;gt;. Put the following in the script. Note the VNC password is set in plain text here.{{Highlight&lt;br /&gt;
| code = #!/bin/bash&lt;br /&gt;
&lt;br /&gt;
if [[ &amp;quot;$2&amp;quot; != &amp;quot;post-start&amp;quot; ]] ; then&lt;br /&gt;
        echo &amp;quot;Not at post-start stage. Skipping&amp;quot;&lt;br /&gt;
        exit 0&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
VM_ID=$1&lt;br /&gt;
VNC_PASSWORD=myvncpassword&lt;br /&gt;
&lt;br /&gt;
expect -c &amp;quot;&lt;br /&gt;
set timeout 5&lt;br /&gt;
spawn qm monitor $VM_ID&lt;br /&gt;
expect \&amp;quot;qm&amp;gt;\&amp;quot;&lt;br /&gt;
send \&amp;quot;set_password vnc $VNC_PASSWORD -d vnc2\r\&amp;quot;&lt;br /&gt;
expect \&amp;quot;qm&amp;gt;\&amp;quot;&lt;br /&gt;
send \&amp;quot;exit\r\&amp;quot;&lt;br /&gt;
exit&lt;br /&gt;
&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo&lt;br /&gt;
echo &amp;quot;VNC password set.&amp;quot;&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
# Test this by booting the VM. You should be able to connect to the Proxmox server on port 5901 using VNC and authenticate using the password set in the script.&lt;br /&gt;
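Because the VNC password travels in the clear, one way to test from a remote machine is to tunnel the port over SSH first; a sketch, assuming a generic &amp;lt;code&amp;gt;vncviewer&amp;lt;/code&amp;gt; client and a host named &#039;proxmox-host&#039;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ ssh -L 5901:localhost:5901 root@proxmox-host&lt;br /&gt;
$ vncviewer localhost:5901&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;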
&lt;br /&gt;
==== VNC server listening on a local socket ====&lt;br /&gt;
Alternatively, you may have the VNC server listen on a local socket. To do so, adjust the args with the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Listen on a socket (worked for me)&lt;br /&gt;
args: -object secret,id=vncpass,data=password -vnc unix:/var/run/qemu-server/$vmid-secondary.vnc,password-secret=vncpass&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
You may need to use the same snippet script above to set the password since libvirt/Proxmox may set a random passphrase on the VNC server.&lt;br /&gt;
&lt;br /&gt;
With the VNC server running and listening on a socket, you may expose the socket as a TCP port using something like &amp;lt;code&amp;gt;socat&amp;lt;/code&amp;gt;:{{Highlight&lt;br /&gt;
| code = # socat tcp4-listen:5915,fork,reuseaddr unix-connect:/var/run/qemu-server/115-secondary.vnc&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Attach a disk from another VM ===&lt;br /&gt;
To attach a disk to another VM, you must first detach it from the source VM. To do so, navigate to the VM&#039;s hardware panel, select the disk to detach, then click on &#039;Disk Action&#039; and then &#039;Detach&#039;. Then, go to &#039;Disk Action&#039; -&amp;gt; &#039;Reassign Disk&#039; and select the destination VM.&lt;br /&gt;
&lt;br /&gt;
==== The manual process ====&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; Not recommended now. This was done before the reassign disk was a feature. These steps are here for legacy reasons.&lt;br /&gt;
&lt;br /&gt;
If you are using ZFS for the VM disks, you can attach a disk from the source VM to another target VM by renaming the disk&#039;s VM ID to the target VM. You can run &amp;lt;code&amp;gt;zfs list&amp;lt;/code&amp;gt; on the Proxmox server to see what disks are available. Once the disk is renamed, the disk can be made visible as an unused disk by running &amp;lt;code&amp;gt;qm rescan&amp;lt;/code&amp;gt; to rescan and update the VM configs. From the Proxmox console, you should then be able to attach the unused disk under the hardware page.&lt;br /&gt;
&lt;br /&gt;
For example, to attach a VM disk which was originally connected to VM 101 to VM 102, I had to rename the disk &amp;lt;code&amp;gt;vm-101-disk-0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;vm-102-disk-1&amp;lt;/code&amp;gt; by running: &amp;lt;code&amp;gt;zfs rename data/vm-101-disk-0 data/vm-102-disk-1&amp;lt;/code&amp;gt;.&lt;br /&gt;
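Putting the manual steps together; a sketch using the same example IDs:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # zfs list&lt;br /&gt;
# zfs rename data/vm-101-disk-0 data/vm-102-disk-1&lt;br /&gt;
# qm rescan&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;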
&lt;br /&gt;
=== Renumber VM IDs ===&lt;br /&gt;
Use the following script to renumber a VM&#039;s ID. This script supports VMs stored on both LVM and ZFS. &#039;&#039;&#039;Power off the VM before proceeding&#039;&#039;&#039;. Proxmox should pick up the renamed VM automatically.&lt;br /&gt;
&lt;br /&gt;
Be aware that renumbering a VM may cause cloud-init to run again, as it may think it&#039;s on a new instance. This can have unintended side effects, such as wiping your system&#039;s SSH host keys and resetting account passwords.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = #!/bin/bash&lt;br /&gt;
Old=&amp;quot;$1&amp;quot;&lt;br /&gt;
New=&amp;quot;$2&amp;quot;&lt;br /&gt;
&lt;br /&gt;
usage(){&lt;br /&gt;
        echo &amp;quot;Usage: $0 old-vmid new-vmid&amp;quot;&lt;br /&gt;
        exit 2&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
if [ -z &amp;quot;$Old&amp;quot; ] {{!}}{{!}} ! expr &amp;quot;$Old&amp;quot; : &#039;^[0-9]\+$&#039; &amp;gt;/dev/null ; then&lt;br /&gt;
        echo &amp;quot;Error: Invalid old-vmid&amp;quot;&lt;br /&gt;
        usage&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
if [ ! -f /etc/pve/qemu-server/$Old.conf ] ; then&lt;br /&gt;
        echo &amp;quot;Error: $Old is not a valid VM ID&amp;quot;&lt;br /&gt;
        exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ -z &amp;quot;$New&amp;quot; ] {{!}}{{!}} ! expr &amp;quot;$New&amp;quot; : &#039;^[0-9]\+$&#039; &amp;gt;/dev/null ; then&lt;br /&gt;
        echo &amp;quot;Error: Invalid new-vmid&amp;quot;&lt;br /&gt;
        usage&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks in the config:&amp;quot;&lt;br /&gt;
cat /etc/pve/qemu-server/$Old.conf {{!}} grep -i -- vm-$Old- {{!}} awk -F, &#039;{print $1}&#039; {{!}} awk &#039;{print $2}&#039;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks on LVM:&amp;quot;&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks on ZFS:&amp;quot;&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Will execute the following:&amp;quot;&lt;br /&gt;
echo&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} while read lv vg ; do&lt;br /&gt;
        echo lvrename $vg/$lv $vg/$(echo $lv {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039; {{!}} while read i ; do&lt;br /&gt;
        echo zfs rename $i $(echo $i {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
echo sed -i &amp;quot;s/$Old/$New/g&amp;quot; /etc/pve/qemu-server/$Old.conf&lt;br /&gt;
echo mv /etc/pve/qemu-server/$Old.conf /etc/pve/qemu-server/$New.conf&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo&lt;br /&gt;
read -p &amp;quot;Proceed with the rename? (y/n) &amp;quot; answer&lt;br /&gt;
case $answer in&lt;br /&gt;
  [yY]* ) echo &amp;quot;Proceeding...&amp;quot;;;&lt;br /&gt;
  * ) echo &amp;quot;Aborting...&amp;quot;; exit;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task: Rename LVS data volumes&amp;quot;&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} while read lv vg ; do&lt;br /&gt;
        lvrename $vg/$lv $vg/$(echo $lv {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task: Rename ZFS datasets&amp;quot;&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039; {{!}} while read i ; do&lt;br /&gt;
        zfs rename $i $(echo $i {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Renumbering IDs&amp;quot;&lt;br /&gt;
sed -i &amp;quot;s/$Old/$New/g&amp;quot; /etc/pve/qemu-server/$Old.conf&lt;br /&gt;
mv /etc/pve/qemu-server/$Old.conf /etc/pve/qemu-server/$New.conf&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
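One caveat worth knowing about the script above: the final &amp;lt;code&amp;gt;sed -i &amp;quot;s/$Old/$New/g&amp;quot;&amp;lt;/code&amp;gt; replaces every occurrence of the old ID anywhere in the config file, not just in disk names. A hypothetical config value that merely contains those digits gets rewritten too:&lt;br /&gt;

```shell
# Blanket substitution, as the script performs it; "memory: 1016" contains
# "101" and is silently rewritten as well.
Old=101
New=102
echo "memory: 1016" | sed "s/$Old/$New/g"   # → memory: 1026
```

If your VM IDs are short numbers, review the commands the script prints before answering &#039;y&#039;.&lt;br /&gt;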
&lt;br /&gt;
=== Remove a single node cluster ===&lt;br /&gt;
If you created a single node cluster and want to undo it, run:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop pve-cluster corosync&lt;br /&gt;
# pmxcfs -l&lt;br /&gt;
# rm /etc/corosync/*&lt;br /&gt;
# rm /etc/pve/corosync.conf&lt;br /&gt;
# killall pmxcfs&lt;br /&gt;
# systemctl start pve-cluster&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
This was copied from: https://forum.proxmox.com/threads/proxmox-ve-6-removing-cluster-configuration.56259/&lt;br /&gt;
&lt;br /&gt;
=== Run Docker in LXC ===&lt;br /&gt;
Docker works in Proxmox LXC, but only if the nesting option is enabled. Docker works on either privileged or unprivileged LXC, but each has its own caveats:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Feature&lt;br /&gt;
!Privileged&lt;br /&gt;
!Unprivileged&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;NFS&#039;&#039;&#039;&lt;br /&gt;
|Can mount NFS; requires nfs=1 enabled in LXC configuration&lt;br /&gt;
|Cannot mount NFS unless you use a bind mountpoint&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;Security&#039;&#039;&#039;&lt;br /&gt;
|UIDs are not mapped; may be dangerous&lt;br /&gt;
|UIDs/GIDs are mapped&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;OverlayFS&#039;&#039;&#039;&lt;br /&gt;
|ZFS backing filesystem for &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; should have no issues.&lt;br /&gt;
|&amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; should use ext4 or XFS (or another non-ZFS filesystem; more on this below)&lt;br /&gt;
|}&lt;br /&gt;
It&#039;s recommended that you do not run Docker in LXC unless you have a good reason such as: a) hardware passthrough on systems that cannot support it through VMs, or b) tight memory constraints where containerized workloads benefit from shared memory.&lt;br /&gt;
&lt;br /&gt;
==== Issues with /var/lib/docker on a ZFS backed storage ====&lt;br /&gt;
Docker uses OverlayFS to handle the container layers. On ZFS-backed storage, OverlayFS requires additional privileges to work properly, so in an unprivileged LXC, certain Docker image pulls may fail if the image data is stored on ZFS. To work around this, back &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; with a different filesystem by creating a zvol and formatting it accordingly (see https://du.nkel.dev/blog/2021-03-25_proxmox_docker/). Alternatively, put the &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; volume on non-ZFS storage such as an LVM-backed store.  &lt;br /&gt;
&lt;br /&gt;
This issue manifests when pulling some (but not all) container images, with errors similar to: &lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@nixos:~]# docker-compose pull&lt;br /&gt;
[+] Pulling 8/10&lt;br /&gt;
 ⠋ pihole 9 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿] 23.91MB/23.91MB Pulling                                                                 &lt;br /&gt;
   ✔ f03b40093957 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8063479210c7 Pull complete                                                                                        &lt;br /&gt;
   ✔ 4f4fb700ef54 Pull complete                                                                                        &lt;br /&gt;
   ✔ 061a6a1d9010 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8b1e64a56394 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8dabcf07e578 Pull complete                                                                                        &lt;br /&gt;
   ⠿ bdec3efaf98a Extracting      [==================================================&amp;gt;]  23.91MB/23.91MB               &lt;br /&gt;
   ✔ 40cba0bade6e Download complete                                                                                    &lt;br /&gt;
   ✔ 9b797b6be3f3 Download complete                                                                                    &lt;br /&gt;
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unlinkat /var/cache/apt/archives: invalid argument&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Switching &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; to an ext4-backed filesystem fixed the issue. This can be done with the following steps:&lt;br /&gt;
&lt;br /&gt;
# Create the zvol:  &amp;lt;code&amp;gt;zfs create -V 8G rpool/data/ctvol-111-disk-1&amp;lt;/code&amp;gt;&lt;br /&gt;
# Format the zvol: &amp;lt;code&amp;gt;mkfs.ext4 /dev/zvol/rpool/data/ctvol-111-disk-1&amp;lt;/code&amp;gt;&lt;br /&gt;
# Edit the LXC config under &amp;lt;code&amp;gt;/etc/pve/nodes/pve/lxc/111.conf&amp;lt;/code&amp;gt; and add the following mountpoint config: &lt;br /&gt;
#* &amp;lt;code&amp;gt;mp0: /dev/zvol/rpool/data/ctvol-111-disk-1,mp=/var/lib/docker&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Install NVIDIA driver in Proxmox ===&lt;br /&gt;
The NVIDIA Linux driver requires the following dependencies:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # apt install gcc make pve-headers&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
As part of the installation, the installer will ensure nouveau is disabled. It&#039;ll blacklist the nouveau kernel module for you, but you will have to reboot (or &amp;lt;code&amp;gt;rmmod&amp;lt;/code&amp;gt; it) before running the installer again.&lt;br /&gt;
&lt;br /&gt;
Once you have the NVIDIA drivers installed, to hardware accelerate your VM:&lt;br /&gt;
&lt;br /&gt;
* Set the graphics card of the VM to VirGL GPU&lt;br /&gt;
* Install the needed dependencies: &amp;lt;code&amp;gt;apt install libgl1 libegl1&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Setup LXC with NVIDIA GPU ===&lt;br /&gt;
You can share your NVIDIA GPU with one or more LXC containers in Proxmox. This is possible only if the following conditions are met:&lt;br /&gt;
&lt;br /&gt;
# NVIDIA drivers are installed on Proxmox. {{Highlight&lt;br /&gt;
| code = # wget https://us.download.nvidia.com/XFree86/Linux-x86_64/535.54.03/NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
# chmod 755 NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
# ./NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# The &amp;lt;u&amp;gt;exact same&amp;lt;/u&amp;gt; NVIDIA drivers are installed in the LXC. (Basically, run the same thing as above. If you&#039;re using NixOS, you might have to do some overrides or match the drivers on Proxmox to what was pulled in when you do nixos-rebuild switch).&lt;br /&gt;
# The LXC is allowed to access the NVIDIA devices using the &amp;lt;code&amp;gt;lxc.cgroup2.devices.allow&amp;lt;/code&amp;gt; setting. Do a listing of all the &amp;lt;code&amp;gt;/dev/nvidia*&amp;lt;/code&amp;gt; devices and ensure that you catch all of their device major numbers.{{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 508:* rwm&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
# The NVIDIA devices are passed through to the LXC using the &amp;lt;code&amp;gt;lxc.mount.entry&amp;lt;/code&amp;gt; setting. Ensure that you bind in all the &amp;lt;code&amp;gt;/dev/nvidia*&amp;lt;/code&amp;gt; devices that are appropriate for your card.{{Highlight&lt;br /&gt;
| code = lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
# Start the LXC. You should be able to run &#039;&amp;lt;code&amp;gt;nvidia-smi&amp;lt;/code&amp;gt;&#039; from inside the container and have it see the status of the card. If it doesn&#039;t, double-check everything above: ensure the driver versions match (nvidia-smi will tell you if they don&#039;t) and that you have allowed the appropriate device numbers.&lt;br /&gt;
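To find the device major numbers for the &amp;lt;code&amp;gt;lxc.cgroup2.devices.allow&amp;lt;/code&amp;gt; lines, you can stat the device nodes on the host. A sketch, demonstrated on &amp;lt;code&amp;gt;/dev/null&amp;lt;/code&amp;gt; since the NVIDIA nodes only exist where the driver is loaded:&lt;br /&gt;

```shell
# Print the major:minor numbers (in hex) of a character device.
# On the PVE host, run this against /dev/nvidia* and /dev/dri/* instead.
stat -c '%t:%T' /dev/null   # → 1:3
```

The major number (before the colon, converted to decimal) is what goes into the &amp;lt;code&amp;gt;c NNN:* rwm&amp;lt;/code&amp;gt; allow rule.&lt;br /&gt;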
&lt;br /&gt;
==== How to set up Jellyfin on a Ubuntu based LXC ====&lt;br /&gt;
I used to run Jellyfin under a Docker VM with an NVIDIA GPU using PCIe passthrough, but I wanted to try running Jellyfin on another machine which doesn&#039;t support IOMMU. To work around this, I ran Jellyfin within an LXC. &lt;br /&gt;
&lt;br /&gt;
What I did:&lt;br /&gt;
&lt;br /&gt;
# Create the LXC as a privileged container. Two reasons for this: 1) the device passthrough more or less requires it, and 2) I need to mount my Jellyfin media files via NFS&lt;br /&gt;
# Set up the LXC with an Ubuntu image. Install Jellyfin as per the https://jellyfin.org/docs/general/installation/linux/#ubuntu instructions. Additionally, install the &amp;lt;code&amp;gt;jellyfin-ffmpeg5 libnvidia-decode-525 libnvidia-encode-525 nvidia-utils-525&amp;lt;/code&amp;gt; packages for GPU support. I used version 525 as that&#039;s the most recent stable version offered by NVIDIA. Note that because we&#039;re using these pre-packaged binaries, the driver you install on the PVE host must match this exact version.&lt;br /&gt;
# Look at the specific versions of the NVIDIA packages that were installed in the previous step. You can also do an &amp;lt;code&amp;gt;apt search libnvidia-encode&amp;lt;/code&amp;gt; to see what versions are available. Download and install the NVIDIA drivers on the Proxmox server itself, ensuring you get the exact version that matches the packages you installed in the LXC environment.&lt;br /&gt;
# Manually edit the LXC config under /etc/pve/lxc/&amp;lt;id&amp;gt;.conf and add: {{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}} /dev/dri/card1 and /dev/dri/renderD129 are the devices that appeared after installing the NVIDIA drivers on the PVE, which made it obvious that they were what I wanted. If you&#039;re unsure, you could try unloading the NVIDIA driver and seeing which devices disappear.&lt;br /&gt;
# Restart the LXC environment and see if nvidia-smi works. If it doesn&#039;t, ensure your version numbers match (nvidia-smi will tell you if they don&#039;t) and that you have the appropriate devices bind-mounted into the LXC.&lt;br /&gt;
# Continue with the jellyfin installation as usual. Run: &amp;lt;code&amp;gt;wget -O- https://repo.jellyfin.org/install-debuntu.sh | sudo bash&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== How to run Docker on a NixOS based LXC ====&lt;br /&gt;
I later moved Jellyfin to a NixOS LXC running Docker. This was done to simplify the Jellyfin update process and to leverage my existing docker-compose infrastructure. The NixOS LXC is privileged with nesting enabled (both are required by Docker).&lt;br /&gt;
&lt;br /&gt;
Here is the NixOS config to get Docker installed and with the NVIDIA drivers and runtime set up:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = virtualisation.docker = {&lt;br /&gt;
    enable = true;&lt;br /&gt;
    enableNvidia = true;&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  # Make sure opengl is enabled&lt;br /&gt;
  hardware.opengl = {&lt;br /&gt;
    enable = true;&lt;br /&gt;
    driSupport = true;&lt;br /&gt;
    driSupport32Bit = true;&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  # NVIDIA drivers are unfree.&lt;br /&gt;
  nixpkgs.config.allowUnfreePredicate = pkg:&lt;br /&gt;
    builtins.elem (lib.getName pkg) [&lt;br /&gt;
      &amp;quot;nvidia-x11&amp;quot;&lt;br /&gt;
      &amp;quot;nvidia-settings&amp;quot;&lt;br /&gt;
      &amp;quot;nvidia-persistenced&amp;quot;&lt;br /&gt;
    ];&lt;br /&gt;
&lt;br /&gt;
  # Tell Xorg to use the nvidia driver&lt;br /&gt;
  services.xserver.videoDrivers = [&amp;quot;nvidia&amp;quot;];&lt;br /&gt;
&lt;br /&gt;
  hardware.nvidia = {&lt;br /&gt;
      package = config.boot.kernelPackages.nvidiaPackages.production;&lt;br /&gt;
   };&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
After doing a NixOS rebuild, Docker works and containers using the NVIDIA runtime also work.&lt;br /&gt;
&lt;br /&gt;
==== Troubleshooting NVIDIA LXC issues ====&lt;br /&gt;
&lt;br /&gt;
===== Missing nvidia-uvm =====&lt;br /&gt;
You also need the CUDA device &amp;lt;code&amp;gt;nvidia-uvm&amp;lt;/code&amp;gt; passed through. &lt;br /&gt;
&lt;br /&gt;
If this device is missing, the CUDA-enabled ffmpeg bundled with Jellyfin will fail with this error message:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [AVHWDeviceContext @ 0x55eea5d88f80] cu-&amp;gt;cuInit(0) failed -&amp;gt; CUDA_ERROR_UNKNOWN: unknown error&lt;br /&gt;
Device creation failed: -542398533.&lt;br /&gt;
Failed to set value &#039;cuda=cu:0&#039; for option &#039;init_hw_device&#039;: Generic error in an external library&lt;br /&gt;
Error parsing global options: Generic error in an external library&lt;br /&gt;
| lang = text&lt;br /&gt;
}}If the device isn&#039;t missing but something is misconfigured, strace would show the process attempting to read but failing:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = openat(AT_FDCWD, &amp;quot;/dev/nvidia-uvm&amp;quot;, O_RDWR{{!}}O_CLOEXEC) = -1 EPERM (Operation not permitted)&lt;br /&gt;
openat(AT_FDCWD, &amp;quot;/dev/nvidia-uvm&amp;quot;, O_RDWR) = -1 EPERM (Operation not permitted)&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====== The fix ======&lt;br /&gt;
If the devices &amp;lt;code&amp;gt;/dev/nvidia-uvm&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;/dev/nvidia-uvm-tools&amp;lt;/code&amp;gt; aren&#039;t available on the PVE host (i.e. they&#039;re missing), then you will have to run this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Get the device number&lt;br /&gt;
# grep nvidia-uvm /proc/devices {{!}} awk &#039;{print $1}&#039;&lt;br /&gt;
&lt;br /&gt;
## Then create the device node using the device number above (508 for me, but this can change!)&lt;br /&gt;
#  mknod -m 666 /dev/nvidia-uvm c 508 0&lt;br /&gt;
#  mknod -m 666 /dev/nvidia-uvm-tools c 508 1&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
See also: https://old.reddit.com/r/qnap/comments/s7bbv6/fix_for_missing_nvidiauvm_device_devnvidiauvm/&lt;br /&gt;
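For reference, the grep/awk pipeline above simply pulls the first column out of the matching &amp;lt;code&amp;gt;/proc/devices&amp;lt;/code&amp;gt; line. Given a sample line (the major number is allocated dynamically and can change between boots):&lt;br /&gt;

```shell
# Sample /proc/devices entry piped through the same extraction as above.
echo "508 nvidia-uvm" | grep nvidia-uvm | awk '{print $1}'   # → 508
```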
&lt;br /&gt;
If you&#039;re using LXC, you also have to edit the LXC config file (at &amp;lt;code&amp;gt;/etc/pve/nodes/pve/lxc/###.conf&amp;lt;/code&amp;gt;) and add the following. Adjust the device numbers as needed.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 510:* rwm&lt;br /&gt;
lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Randomly losing NVIDIA device access ====&lt;br /&gt;
Whenever you change an LXC setting (such as a memory or CPU change), access to the NVIDIA GPUs from within the LXC breaks. Any existing processes with the NVIDIA GPU opened will continue to work, but any new processes trying to access the GPU will fail. The only fix is to restart the LXC.&lt;br /&gt;
&lt;br /&gt;
=== Replace a failed ZFS mirror boot device ===&lt;br /&gt;
If you are using ZFS mirror as the OS boot device and need to replace one of the disks in the mirror, keep in mind the following:&lt;br /&gt;
&lt;br /&gt;
* Each disk in the ZFS mirror has 3 partitions: 1. BIOS boot, 2. grub, 3. ZFS block device. When you replace a disk, you will have to recreate this partition scheme before you can replace the ZFS device using the 3rd partition.&lt;br /&gt;
* Grub is installed on the 2nd partition on all disks in the ZFS mirror. If you are replacing the disk that your BIOS uses to boot, you should still  be able to boot Proxmox by booting off the other disks from the boot menu. After replacing the failed disk, you can reinstall the grub bootloader.&lt;br /&gt;
&lt;br /&gt;
The steps to replace a failed disk in a ZFS mirror that&#039;s used as the boot device on Proxmox are:&lt;br /&gt;
&lt;br /&gt;
# Remove the failed device and replace it with the new device&lt;br /&gt;
# Once you see the new device on your system (with lsblk), set up the partition table: &amp;lt;code&amp;gt;sgdisk &amp;lt;healthy bootable device&amp;gt; -R &amp;lt;new device&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Randomize the GUIDs on the new device: &amp;lt;code&amp;gt;sgdisk -G &amp;lt;new device&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Replace the failed device in your zpool. Check the status with &amp;lt;code&amp;gt;zpool status&amp;lt;/code&amp;gt; and then replace the failed device with &amp;lt;code&amp;gt;zpool replace -f &amp;lt;pool&amp;gt; &amp;lt;old zfs partition&amp;gt; &amp;lt;new zfs partition&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Reinstall the bootloader on the new device by formatting its boot partition: &amp;lt;code&amp;gt;proxmox-boot-tool format /dev/sda2&amp;lt;/code&amp;gt;&lt;br /&gt;
# Then reinstall grub: &amp;lt;code&amp;gt;proxmox-boot-tool init /dev/sda2&amp;lt;/code&amp;gt;&lt;br /&gt;
See Proxmox&#039;s sysadmin documentation on this topic at: https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_proxmox_boot_setup&lt;br /&gt;
&lt;br /&gt;
=== Migrating VMs from a Proxmox OS disk ===&lt;br /&gt;
I wanted to reinstall Proxmox on another disk on the same machine while preserving the VMs and containers. This required installing a clean copy of Proxmox on the new drive and then migrating all existing VMs and containers to the new install from the existing disk.&lt;br /&gt;
&lt;br /&gt;
I was able to get this all done by:&lt;br /&gt;
&lt;br /&gt;
* On the old install (while it&#039;s still running), tar up &amp;lt;code&amp;gt;/etc/pve&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;/var/lib/rrdcached/db&amp;lt;/code&amp;gt; from the old disk. You have to do this while the old install is running since the &amp;lt;code&amp;gt;.conf&amp;lt;/code&amp;gt; files disappear when the system&#039;s off (&amp;lt;code&amp;gt;/etc/pve&amp;lt;/code&amp;gt; is a FUSE mount provided by pmxcfs and only exists while that service is running). Restore both to the new Proxmox install.&lt;br /&gt;
* Add the old disk back as storage on the new Proxmox install. You do not and should not need to format anything. For LVM storage, just add a LVM storage and select the appropriate device. Your VM disks should appear and you should then be able to migrate it / attach it to the VMs which you imported in the previous step.&lt;br /&gt;
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
===LXC container does not start networking===&lt;br /&gt;
I have a Fedora-based LXC container which takes a long time to start up. After the tty finally appears, the only network adapter isn&#039;t up. &amp;lt;code&amp;gt;systemd-networkd&amp;lt;/code&amp;gt; is also in a failed state with a bad exit code and status &amp;lt;code&amp;gt;226/NAMESPACE&amp;lt;/code&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
This was resolved by enabling nesting on this container with the following commands: &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # pct set $CTID -features nesting=1 &lt;br /&gt;
# pct reboot $CTID&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== PCIe passthrough fails due to platform RMRR ===&lt;br /&gt;
This is covered in more detail at [[HP DL380 G6#PCIe passthrough fails due to platform RMRR requirement|HP_DL380_page on PCIe_passthrough_failing]]. Basically, the workaround is to install a patched kernel that doesn&#039;t respect the RMRR restrictions.&lt;br /&gt;
&lt;br /&gt;
If you want to build your own patched kernel, you might be interested in this repo: https://github.com/sarrchri/relax-intel-rmrr&lt;br /&gt;
&lt;br /&gt;
=== NVIDIA driver crashes with KVM using VirGL GPU ===&lt;br /&gt;
On Proxmox 7.4 with NVIDIA driver version 535.54.03 installed, starting a virtual machine with a VirGL GPU device results in KVM segfaulting:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [Wed Jul 26 11:13:38 2023] kvm[559117]: segfault at 0 ip 00007ff13a39ca23 sp 00007ffd4eb76450 error 4 in libnvidia-eglcore.so.535.54.03[7ff1398bd000+164e000]&lt;br /&gt;
[Wed Jul 26 11:13:38 2023] Code: 48 8b 45 10 44 8b a0 04 0d 00 00 e9 44 fe ff ff 0f 1f 80 00 00 00 00 48 8b 77 08 48 89 df e8 e4 4d 78 ff 48 8b 7d 10 48 89 de &amp;lt;48&amp;gt; 8b 07 ff 90 98 01 00 00 48 8b 43 38 e9 f2 fd ff ff 0f 1f 00 48&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
A reboot of the Proxmox server did not help.&lt;br /&gt;
&lt;br /&gt;
Fix: I re-installed the NVIDIA drivers, this time answering &#039;yes&#039; to installing the 32-bit libraries. Either that was the fix, or re-installing simply recompiled the driver against the currently running kernel, which fixed the issue.&lt;br /&gt;
&lt;br /&gt;
=== Network interface reset: Detected Hardware Unit Hang ===&lt;br /&gt;
I have a machine with an Intel Corporation 82579V Gigabit network adapter which periodically stutters, especially when the server is under heavy load (mainly I/O with some CPU), with the following kernel message being triggered:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [Fri Aug 18 00:05:03 2023] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:&lt;br /&gt;
                             TDH                  &amp;lt;1d&amp;gt;&lt;br /&gt;
                             TDT                  &amp;lt;77&amp;gt;&lt;br /&gt;
                             next_to_use          &amp;lt;77&amp;gt;&lt;br /&gt;
                             next_to_clean        &amp;lt;1c&amp;gt;&lt;br /&gt;
                           buffer_info[next_to_clean]:&lt;br /&gt;
                             time_stamp           &amp;lt;105642dfd&amp;gt;&lt;br /&gt;
                             next_to_watch        &amp;lt;1d&amp;gt;&lt;br /&gt;
                             jiffies              &amp;lt;1056437b0&amp;gt;&lt;br /&gt;
                             next_to_watch.status &amp;lt;0&amp;gt;&lt;br /&gt;
                           MAC Status             &amp;lt;40080083&amp;gt;&lt;br /&gt;
                           PHY Status             &amp;lt;796d&amp;gt;&lt;br /&gt;
                           PHY 1000BASE-T Status  &amp;lt;3800&amp;gt;&lt;br /&gt;
                           PHY Extended Status    &amp;lt;3000&amp;gt;&lt;br /&gt;
                           PCI Status             &amp;lt;10&amp;gt;&lt;br /&gt;
[Fri Aug 18 00:05:03 2023] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
When this happens, all networking pauses temporarily and may cause connections to break.&lt;br /&gt;
&lt;br /&gt;
Some online research suggests the e1000e driver has a bug which, in conjunction with TCP segmentation offloading (TSO), may cause the interface to hang. The workaround is to disable hardware TSO, which you can do with &amp;lt;code&amp;gt;ethtool&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ethtool -K eno1 tso off gso off&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
This change will revert on the next reboot. If it helps with your situation, you can make it persistent by editing &amp;lt;code&amp;gt;/etc/network/interfaces&amp;lt;/code&amp;gt; with a &amp;lt;code&amp;gt;post-up&amp;lt;/code&amp;gt; command:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = auto eno1&lt;br /&gt;
iface eno1 inet static&lt;br /&gt;
  address 10.1.2.2&lt;br /&gt;
  netmask 255.255.252.0&lt;br /&gt;
  gateway 10.1.1.1&lt;br /&gt;
  post-up /sbin/ethtool -K eno1 tso off gso off&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== NVIDIA PCIe Bus Error ===&lt;br /&gt;
On my X99-E WS + Xeon E5-1660 + NVIDIA Quadro P400 machine running kernel 6.8.12-8-pve, which is currently my primary Proxmox box, I&#039;m now seeing a never-ending stream of correctable errors from the GPU, which is plugged into the first slot. Kernel messages with these errors look like this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = root@pve:~# dmesg -T {{!}} tail&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] pcieport 0000:00:03.0: AER: Correctable error message received from 0000:05:00.0&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0:   device [10de:1cb3] error status/mask=00000001/0000a000&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0:    [ 0] RxErr                  (First)&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] pcieport 0000:00:03.0: AER: Correctable error message received from 0000:05:00.0&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0:   device [10de:1cb3] error status/mask=00000001/0000a000&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0:    [ 0] RxErr                  (First)&lt;br /&gt;
&lt;br /&gt;
root@pve:~# lspci -nnn {{!}} grep 05:00&lt;br /&gt;
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)&lt;br /&gt;
05:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
One possible fix is to disable PCIe Active State Power Management by adding &amp;lt;code&amp;gt;pcie_aspm=off&amp;lt;/code&amp;gt; to the kernel boot arguments in &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt;, then apply it to grub by running &amp;lt;code&amp;gt;update-grub&amp;lt;/code&amp;gt; before rebooting.&lt;br /&gt;
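For reference, the relevant line in &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt; would end up looking something like this (a hypothetical example assuming the stock &amp;lt;code&amp;gt;quiet&amp;lt;/code&amp;gt; cmdline; keep whatever options you already have):&lt;br /&gt;

```shell
# /etc/default/grub (fragment) — append pcie_aspm=off to the existing options
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"
# then apply with: update-grub && reboot
```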
&lt;br /&gt;
A side note: adding &amp;lt;code&amp;gt;acpi_enforce_resources=no&amp;lt;/code&amp;gt; to the kernel args and rebooting did not help. There are also suggestions to add &amp;lt;code&amp;gt;pci=noaer&amp;lt;/code&amp;gt;, which turns off error reporting, but I did not try this after disabling ASPM.{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:LinuxUtilities]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7704</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7704"/>
		<updated>2025-08-04T05:18:06Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic based IP cameras. This is a good alternative firmware for the Wyze Cam (v2, v3, but not the v4) as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Installation using Cloner ===&lt;br /&gt;
If you &#039;brick&#039; your T31-based camera through a bad update, you can use the Ingenic USB Cloner tool to reflash the firmware. More information at https://github.com/themactep/thingino-firmware/wiki/Ingenic-USB-Cloner&lt;br /&gt;
&lt;br /&gt;
In short, the cloner tool can reprogram the flash memory without the device needing to boot into an operating system. This works only if you put the T31 SoC into USB-boot mode (think of it as DFU mode on iOS devices), which happens when the flash memory isn&#039;t bootable. The simplest way to achieve this is to open the camera and short the flash memory chip&#039;s pins on power-up so the SoC can&#039;t read it.&lt;br /&gt;
&lt;br /&gt;
Steps to use cloner:&lt;br /&gt;
&lt;br /&gt;
# Download cloner for Windows at: https://github.com/gtxaspec/ingenic-cloner-profiles/releases/download/legacy/cloner-2.5.43-windows_thingino.7z&lt;br /&gt;
# Install the cloner app and the USB driver.&lt;br /&gt;
# Open the camera and locate the flash chip. This is typically an 8-pin chip labelled something like 25Q64 or 25Q128. See the thingino-firmware wiki linked above for pictures. You need to short pins 5 and 6 (the pins on the corner opposite the embossed dot on the chip).&lt;br /&gt;
# Plug the camera into your PC while shorting the pins, then release the short.&lt;br /&gt;
# If this is working, you should see a new device in device manager called &#039;usb cloner device&#039; [[File:Ingenic usb cloner device.png|alt=Ingenic usb cloner device|center|thumb|227x227px|Ingenic usb cloner device]]&lt;br /&gt;
# Run the cloner utility. &lt;br /&gt;
# Click &#039;Config&#039;. Set the platform to T, t31x, and select the board t31x_sfc_nor_writer_full.cfg.&lt;br /&gt;
# Click the Policy tab, select full_image, SFC_NOR at offset 0x0 with the full Thingino firmware file as the attribute. Save this policy under a new name.&lt;br /&gt;
# Click Start. The cloner program is now listening for a new cloner device to become available.&lt;br /&gt;
# Repeat steps 3-4, or trigger the device to reappear (for example, by attaching it to a VM and then detaching it).&lt;br /&gt;
# The cloner program should automatically erase and reprogram the camera once it becomes available. It will reboot the camera automatically.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files over SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can just copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# Once this is done, you should see a wireless access point named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and go to http://172.16.0.1/ to complete the setup. After you enter your password and Wi-Fi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino working with Frigate and NVIDIA hardware acceleration can be a little temperamental: a specific combination of configuration settings eventually causes the detection and recording processes to misbehave and crash. When this happens, ffmpeg pegs the GPU at 100% and rapidly consumes as much system memory as it can until the OOM killer terminates it. While this is happening, the preview shows corrupt frames like this:&lt;br /&gt;
[[File:Frigate misbehaving.jpg|alt=Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.|none|thumb|Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.]]&lt;br /&gt;
A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use hwaccel_args: preset-nvidia, or ffmpeg goes off the rails and crashes.&lt;br /&gt;
# If you want to use hwaccel_args: preset-nvidia, then don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Support ASIX AX88179 ==&lt;br /&gt;
I bought a cheap USB 3.0 gigabit adapter from Amazon, the &amp;quot;TP-Link USB to Ethernet Adapter (UE306)&amp;quot;, for CAD$14. It comes with an AX88179 chip, which isn&#039;t supported by Thingino out of the box. To make this work, you will need to rebuild the firmware with this feature enabled.&lt;br /&gt;
&lt;br /&gt;
The Thingino build is quite straightforward. Follow the instructions outlined at https://github.com/themactep/thingino-firmware/wiki/Building-from-sources.&lt;br /&gt;
&lt;br /&gt;
In summary, the steps are:&lt;br /&gt;
&lt;br /&gt;
# On an Ubuntu based system, install the dependencies: &amp;lt;code&amp;gt;apt install bison cmake flex gawk libncurses u-boot-tools git bash dialog&amp;lt;/code&amp;gt;. Ensure that you have at least 7 GB of storage available to store the repo and the output directory. The build script puts everything into &amp;lt;code&amp;gt;$HOME/output&amp;lt;/code&amp;gt;, so if you don&#039;t want that (e.g. if your home directory is on a slow NFS server), symlink it somewhere before you start.&lt;br /&gt;
# Clone the git repo: &amp;lt;code&amp;gt;git clone --depth=1 --recurse-submodules --shallow-submodules &amp;lt;nowiki&amp;gt;https://github.com/themactep/thingino-firmware&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Go into the firmware and then run the &amp;lt;code&amp;gt;user-menu.sh&amp;lt;/code&amp;gt; script. &lt;br /&gt;
# Next, enable the ASIX AX88179 module. &lt;br /&gt;
## Select &amp;quot;&amp;lt;code&amp;gt;Main Menu&amp;lt;/code&amp;gt;&amp;quot;, &lt;br /&gt;
## On the &#039;&amp;lt;code&amp;gt;Thingino Buildroot&amp;lt;/code&amp;gt;&#039; menu, select &#039;&amp;lt;code&amp;gt;menuconfig&amp;lt;/code&amp;gt;&#039;, then &#039;&amp;lt;code&amp;gt;menuconfig&amp;lt;/code&amp;gt;&#039;, then select your device (&amp;lt;code&amp;gt;wyze_cam3_t31x_gc2053_rtl8189ftv&amp;lt;/code&amp;gt;), &lt;br /&gt;
## On the &#039;&amp;lt;code&amp;gt;Buildroot xxxxxx Configuration&amp;lt;/code&amp;gt;&#039; menu, you can select settings specific to this build. Select &#039;&amp;lt;code&amp;gt;External options&amp;lt;/code&amp;gt;&#039;, then &amp;lt;code&amp;gt;Thingino Firmware&amp;lt;/code&amp;gt;, then &amp;lt;code&amp;gt;USB Networking options&amp;lt;/code&amp;gt;, and enable &#039;&amp;lt;code&amp;gt;ASIX AX88179_178A USB&amp;lt;/code&amp;gt;&#039;.&lt;br /&gt;
## Exit out until you see the main &#039;&amp;lt;code&amp;gt;Thingino Buildroot&amp;lt;/code&amp;gt;&#039; menu again and then select &#039;&amp;lt;code&amp;gt;make fast&amp;lt;/code&amp;gt;&#039; to compile.&lt;br /&gt;
# Your firmware should be placed in the &amp;lt;code&amp;gt;~/output&amp;lt;/code&amp;gt; directory. You can do an OTA update, or copy the .bin file and manually flash it through the web interface. If you somehow chose something bad and the device boot-loops, you&#039;ll have to use the cloner program to recover.&lt;br /&gt;
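The non-interactive parts of the steps above can be condensed into a shell sketch. This is a dry run by default (set RUN=1 to execute); the package list is copied verbatim from the wiki instructions, so names may vary by Ubuntu release:

```shell
# Dry-run sketch of the Thingino build setup; prints each command unless RUN=1.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

run sudo apt install bison cmake flex gawk libncurses u-boot-tools git bash dialog
run git clone --depth=1 --recurse-submodules --shallow-submodules https://github.com/themactep/thingino-firmware
run ./thingino-firmware/user-menu.sh  # interactive: pick device, enable AX88179, then 'make fast'
```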
&lt;br /&gt;
If everything works out, you should see &amp;lt;code&amp;gt;eth0&amp;lt;/code&amp;gt; after the device comes online. Enable &amp;lt;code&amp;gt;eth0&amp;lt;/code&amp;gt; in the web interface and then reboot.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7703</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7703"/>
		<updated>2025-08-04T05:14:32Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Support ASIX AX88179 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic-based IP cameras. It is a good alternative firmware for the Wyze Cam (v2 and v3, but not the v4), as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Installation using Cloner ===&lt;br /&gt;
If you &#039;brick&#039; your T31-based camera through a bad update, you can use the Ingenic USB Cloner tool to reflash the firmware. More information at https://github.com/themactep/thingino-firmware/wiki/Ingenic-USB-Cloner&lt;br /&gt;
&lt;br /&gt;
In short, the cloner tool can reprogram the flash memory without the device needing to boot into an operating system. This works only if you put the T31 SoC into USB-boot mode (think of it as DFU mode on iOS devices), which happens when the flash memory isn&#039;t bootable. The simplest way to achieve this is to open the camera and short the flash memory chip&#039;s pins on power-up so the SoC can&#039;t read it.&lt;br /&gt;
&lt;br /&gt;
Steps to use cloner:&lt;br /&gt;
&lt;br /&gt;
# Download cloner for Windows at: https://github.com/gtxaspec/ingenic-cloner-profiles/releases/download/legacy/cloner-2.5.43-windows_thingino.7z&lt;br /&gt;
# Install the cloner app and the USB driver.&lt;br /&gt;
# Open the camera and locate the flash chip. This is typically an 8-pin chip labelled something like 25Q64 or 25Q128. See the thingino-firmware wiki linked above for pictures. You need to short pins 5 and 6 (the pins on the corner opposite the embossed dot on the chip).&lt;br /&gt;
# Plug the camera into your PC while shorting the pins, then release the short.&lt;br /&gt;
# If this is working, you should see a new device in device manager called &#039;usb cloner device&#039; [[File:Ingenic usb cloner device.png|alt=Ingenic usb cloner device|center|thumb|227x227px|Ingenic usb cloner device]]&lt;br /&gt;
# Run the cloner utility. &lt;br /&gt;
# Click &#039;Config&#039;. Set the platform to T, t31x, and select the board t31x_sfc_nor_writer_full.cfg.&lt;br /&gt;
# Click the Policy tab, select full_image, SFC_NOR at offset 0x0 with the full Thingino firmware file as the attribute. Save this policy under a new name.&lt;br /&gt;
# Click Start. The cloner program is now listening for a new cloner device to become available.&lt;br /&gt;
# Repeat steps 3-4, or trigger the device to reappear (for example, by attaching it to a VM and then detaching it).&lt;br /&gt;
# The cloner program should automatically erase and reprogram the camera once it becomes available. It will reboot the camera automatically.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files over SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can just copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# Once this is done, you should see a wireless access point named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and go to http://172.16.0.1/ to complete the setup. After you enter your password and Wi-Fi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino working with Frigate and NVIDIA hardware acceleration can be a little temperamental: a specific combination of configuration settings eventually causes the detection and recording processes to misbehave and crash. When this happens, ffmpeg pegs the GPU at 100% and rapidly consumes as much system memory as it can until the OOM killer terminates it. While this is happening, the preview shows corrupt frames like this:&lt;br /&gt;
[[File:Frigate misbehaving.jpg|alt=Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.|none|thumb|Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.]]&lt;br /&gt;
A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use hwaccel_args: preset-nvidia, or ffmpeg goes off the rails and crashes.&lt;br /&gt;
# If you want to use hwaccel_args: preset-nvidia, then don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Support ASIX AX88179 ==&lt;br /&gt;
I bought a cheap USB 3.0 gigabit adapter from Amazon, the &amp;quot;TP-Link USB to Ethernet Adapter (UE306)&amp;quot;, for CAD$14. It comes with an AX88179 chip, which isn&#039;t supported by Thingino out of the box. To make this work, you will need to rebuild the firmware with this feature enabled.&lt;br /&gt;
&lt;br /&gt;
The Thingino build is quite straightforward. Follow the instructions outlined at https://github.com/themactep/thingino-firmware/wiki/Building-from-sources. The entire build process is mostly automated already.&lt;br /&gt;
&lt;br /&gt;
In summary, the steps are:&lt;br /&gt;
&lt;br /&gt;
# On an Ubuntu based system, install the dependencies: &amp;lt;code&amp;gt;apt install bison cmake flex gawk libncurses u-boot-tools git bash dialog&amp;lt;/code&amp;gt;&lt;br /&gt;
# Clone the git repo: &amp;lt;code&amp;gt;git clone --depth=1 --recurse-submodules --shallow-submodules &amp;lt;nowiki&amp;gt;https://github.com/themactep/thingino-firmware&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Go into the firmware and then run the &amp;lt;code&amp;gt;user-menu.sh&amp;lt;/code&amp;gt; script. &lt;br /&gt;
# Next, enable the ASIX AX88179 module. &lt;br /&gt;
## Select &amp;quot;&amp;lt;code&amp;gt;Main Menu&amp;lt;/code&amp;gt;&amp;quot;, &lt;br /&gt;
## On the &#039;&amp;lt;code&amp;gt;Thingino Buildroot&amp;lt;/code&amp;gt;&#039; menu, select &#039;&amp;lt;code&amp;gt;menuconfig&amp;lt;/code&amp;gt;&#039;, then &#039;&amp;lt;code&amp;gt;menuconfig&amp;lt;/code&amp;gt;&#039;, then select your device (&amp;lt;code&amp;gt;wyze_cam3_t31x_gc2053_rtl8189ftv&amp;lt;/code&amp;gt;), &lt;br /&gt;
## On the &#039;&amp;lt;code&amp;gt;Buildroot xxxxxx Configuration&amp;lt;/code&amp;gt;&#039; menu, you can select settings specific to this build. Select &#039;&amp;lt;code&amp;gt;External options&amp;lt;/code&amp;gt;&#039;, then &amp;lt;code&amp;gt;Thingino Firmware&amp;lt;/code&amp;gt;, then &amp;lt;code&amp;gt;USB Networking options&amp;lt;/code&amp;gt;, and enable &#039;&amp;lt;code&amp;gt;ASIX AX88179_178A USB&amp;lt;/code&amp;gt;&#039;.&lt;br /&gt;
## Exit out until you see the main &#039;&amp;lt;code&amp;gt;Thingino Buildroot&amp;lt;/code&amp;gt;&#039; menu again and then select &#039;&amp;lt;code&amp;gt;make fast&amp;lt;/code&amp;gt;&#039; to compile.&lt;br /&gt;
# Your firmware should be placed in the &amp;lt;code&amp;gt;~/output&amp;lt;/code&amp;gt; directory. You can do an OTA update, or copy the .bin file and manually flash it through the web interface. If you somehow chose something bad and the device boot-loops, you&#039;ll have to use the cloner program to recover.&lt;br /&gt;
&lt;br /&gt;
If everything works out, you should see &amp;lt;code&amp;gt;eth0&amp;lt;/code&amp;gt; after the device comes online. Enable &amp;lt;code&amp;gt;eth0&amp;lt;/code&amp;gt; in the web interface and then reboot.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7702</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7702"/>
		<updated>2025-08-04T05:11:21Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Installation using Cloner */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic-based IP cameras. It is a good alternative firmware for the Wyze Cam (v2 and v3, but not the v4), as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Installation using Cloner ===&lt;br /&gt;
If you &#039;brick&#039; your T31-based camera through a bad update, you can use the Ingenic USB Cloner tool to reflash the firmware. More information at https://github.com/themactep/thingino-firmware/wiki/Ingenic-USB-Cloner&lt;br /&gt;
&lt;br /&gt;
In short, the cloner tool can reprogram the flash memory without the device needing to boot into an operating system. This works only if you put the T31 SoC into USB-boot mode (think of it as DFU mode on iOS devices), which happens when the flash memory isn&#039;t bootable. The simplest way to achieve this is to open the camera and short the flash memory chip&#039;s pins on power-up so the SoC can&#039;t read it.&lt;br /&gt;
&lt;br /&gt;
Steps to use cloner:&lt;br /&gt;
&lt;br /&gt;
# Download cloner for Windows at: https://github.com/gtxaspec/ingenic-cloner-profiles/releases/download/legacy/cloner-2.5.43-windows_thingino.7z&lt;br /&gt;
# Install the cloner app and the USB driver.&lt;br /&gt;
# Open the camera and locate the flash chip. This is typically an 8-pin chip labelled something like 25Q64 or 25Q128. See the thingino-firmware wiki linked above for pictures. You need to short pins 5 and 6 (the pins on the corner opposite the embossed dot on the chip).&lt;br /&gt;
# Plug the camera into your PC while shorting the pins, then release the short.&lt;br /&gt;
# If this is working, you should see a new device in device manager called &#039;usb cloner device&#039; [[File:Ingenic usb cloner device.png|alt=Ingenic usb cloner device|center|thumb|227x227px|Ingenic usb cloner device]]&lt;br /&gt;
# Run the cloner utility. &lt;br /&gt;
# Click &#039;Config&#039;. Set the platform to T, t31x, and select the board t31x_sfc_nor_writer_full.cfg.&lt;br /&gt;
# Click the Policy tab, select full_image, SFC_NOR at offset 0x0 with the full Thingino firmware file as the attribute. Save this policy under a new name.&lt;br /&gt;
# Click Start. The cloner program is now listening for a new cloner device to become available.&lt;br /&gt;
# Repeat steps 3-4, or trigger the device to reappear (for example, by attaching it to a VM and then detaching it).&lt;br /&gt;
# The cloner program should automatically erase and reprogram the camera once it becomes available. It will reboot the camera automatically.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files over SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can just copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# Once this is done, you should see a wireless access point named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and go to http://172.16.0.1/ to complete the setup. After you enter your password and Wi-Fi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino working with Frigate and NVIDIA hardware acceleration can be a little temperamental: a specific combination of configuration settings eventually causes the detection and recording processes to misbehave and crash. When this happens, ffmpeg pegs the GPU at 100% and rapidly consumes as much system memory as it can until the OOM killer terminates it. While this is happening, the preview shows corrupt frames like this:&lt;br /&gt;
[[File:Frigate misbehaving.jpg|alt=Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.|none|thumb|Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.]]&lt;br /&gt;
A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use hwaccel_args: preset-nvidia, or ffmpeg goes off the rails and crashes.&lt;br /&gt;
# If you want to use hwaccel_args: preset-nvidia, then don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Support ASIX AX88179 ==&lt;br /&gt;
I bought a cheap USB 3.0 gigabit adapter from Amazon, the &amp;quot;TP-Link USB to Ethernet Adapter (UE306)&amp;quot;. It comes with an AX88179 chip, which isn&#039;t supported by Thingino out of the box. To make this work, you will need to rebuild the firmware with this feature enabled.&lt;br /&gt;
&lt;br /&gt;
The Thingino build is quite straightforward. Follow the instructions outlined at https://github.com/themactep/thingino-firmware/wiki/Building-from-sources. In summary, the steps are:&lt;br /&gt;
&lt;br /&gt;
# On an Ubuntu based system, install the dependencies: apt install bison cmake flex gawk libncurses u-boot-tools git bash dialog&lt;br /&gt;
# Clone the git repo: git clone --depth=1 --recurse-submodules --shallow-submodules &amp;lt;nowiki&amp;gt;https://github.com/themactep/thingino-firmware&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
# Go into the firmware and then run the user-menu.sh script. &lt;br /&gt;
# Next, enable the ASIX AX88179 module. &lt;br /&gt;
## Select &amp;quot;Main Menu&amp;quot;, &lt;br /&gt;
## On the &#039;Thingino Buildroot&#039; menu, select &#039;menuconfig&#039;, then &#039;menuconfig&#039;, then select your device (wyze_cam3_t31x_gc2053_rtl8189ftv), &lt;br /&gt;
## On the &#039;Buildroot xxxxxx Configuration&#039; menu, you can select settings specific to this build. Select &#039;External options&#039;, then Thingino Firmware, then USB Networking options, and enable &#039;ASIX AX88179_178A USB&#039;.&lt;br /&gt;
## Exit out until you see the main &#039;Thingino Buildroot&#039; menu again and then select &#039;make fast&#039; to compile.&lt;br /&gt;
# Your firmware should be placed in the ~/output directory. You can do an OTA update, or copy the .bin file and manually flash it through the web interface. If you somehow chose something bad and the device boot-loops, you&#039;ll have to use the cloner program to recover.&lt;br /&gt;
&lt;br /&gt;
If everything works out, you should see eth0 after the device comes online. Enable eth0 in the web interface and then reboot.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=File:Ingenic_usb_cloner_device.png&amp;diff=7701</id>
		<title>File:Ingenic usb cloner device.png</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=File:Ingenic_usb_cloner_device.png&amp;diff=7701"/>
		<updated>2025-08-04T02:53:39Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Ingenic usb cloner device&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7700</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7700"/>
		<updated>2025-08-03T05:13:34Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic-based IP cameras. It is a good alternative firmware for the Wyze Cam (v2 and v3, but not the v4), as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Installation using Cloner ===&lt;br /&gt;
If you &#039;brick&#039; your T31-based camera through a bad update, you can use the Ingenic USB Cloner tool to reflash the firmware. More information at https://github.com/themactep/thingino-firmware/wiki/Ingenic-USB-Cloner&lt;br /&gt;
&lt;br /&gt;
In short, the cloner tool can reprogram the flash memory without the device needing to boot into an operating system. This works only if you put the T31 SoC into USB-boot mode (think of it as DFU mode on iOS devices), which happens when the flash memory isn&#039;t bootable. The simplest way to achieve this is to open the camera and short the flash memory chip&#039;s pins on power-up so the SoC can&#039;t read it.&lt;br /&gt;
&lt;br /&gt;
Steps to use cloner:&lt;br /&gt;
&lt;br /&gt;
# Download cloner for Windows at: https://github.com/gtxaspec/ingenic-cloner-profiles/releases/download/legacy/cloner-2.5.43-windows_thingino.7z&lt;br /&gt;
# Open the camera and locate the flash chip. This is typically an 8-pin chip labelled something like 25Q64 or 25Q128. See the thingino-firmware wiki linked above for pictures. You need to short pins 5 and 6 (the pins on the corner opposite the embossed dot on the chip).&lt;br /&gt;
# Plug the camera into your PC while shorting the pins, then release the short.&lt;br /&gt;
# If this is working, you should see a new device in device manager&lt;br /&gt;
# Run the cloner utility&lt;br /&gt;
# To flash new firmware, select t31x_sfc_nor_writer_full.cfg; under Policy, select full_image, SFC_NOR at offset 0x0 with the full Thingino firmware file as the attribute.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files over SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can just copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# You should see a wireless access point show up once this is done named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and then go to http://172.16.0.1/ to complete the setup. After you enter your password/wifi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino working with Frigate and NVIDIA hardware acceleration seems to be a little temperamental: a specific combination of configuration settings leads the detection and recording processes to eventually misbehave and crash. When this happens, ffmpeg uses 100% of the GPU and rapidly eats up as much system memory as it can before OOM kills it. While this happens, the preview shows corrupt frames like this:&lt;br /&gt;
[[File:Frigate misbehaving.jpg|alt=Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.|none|thumb|Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.]]&lt;br /&gt;
A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use &amp;lt;code&amp;gt;hwaccel_args: preset-nvidia&amp;lt;/code&amp;gt;, or else ffmpeg goes off the rails and crashes.&lt;br /&gt;
# Conversely, if you want to use &amp;lt;code&amp;gt;hwaccel_args: preset-nvidia&amp;lt;/code&amp;gt;, don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Proxmox&amp;diff=7699</id>
		<title>Proxmox</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Proxmox&amp;diff=7699"/>
		<updated>2025-08-03T04:27:00Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Troubleshooting */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proxmox Virtual Environment (VE) is an open-source virtualization management platform that allows deployment and management of KVM-based virtual machines and LXC containers. The platform itself runs on a Debian-based distribution and supports filesystems such as ZFS and Ceph. &lt;br /&gt;
&lt;br /&gt;
In terms of functionality, Proxmox VE is similar to VMware ESXi, with some overlap with vSphere. You can have a cluster of Proxmox servers controlled and managed through a single web console, command line tools, or the provided REST API.&lt;br /&gt;
&lt;br /&gt;
==Installation==&lt;br /&gt;
Installation is as simple as installing any other Linux distribution. Download the Proxmox ISO image from https://www.proxmox.com/en/downloads. After installation, you should be able to access the web console via HTTPS on port 8006.&lt;br /&gt;
&lt;br /&gt;
==Tasks and How-Tos==&lt;br /&gt;
&lt;br /&gt;
===Remove the subscription nag===&lt;br /&gt;
If your Proxmox server has no active subscription, you will be nagged every time you log in to the web interface. This can be disabled by running the following as root in the server&#039;s console:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # sed -Ezi.bak &amp;quot;s/(Ext.Msg.show\(\{\s+title: gettext\(&#039;No valid sub)/void\(\{ \/\/\1/g&amp;quot; /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
You will have to do this again after updating Proxmox. &lt;br /&gt;
&lt;br /&gt;
There is also an ansible role which does this at: https://github.com/FuzzyMistborn/ansible-role-proxmox-nag-removal/blob/master/tasks/remove-nag.yml&lt;br /&gt;
&lt;br /&gt;
===Make a VM or LXC start automatically at boot===&lt;br /&gt;
You can enable automatic startup and the startup order for VMs and LXCs under the &#039;Options&#039; panel. [[File:Proxmox start VM at boot.png|alt=Proxmox start VM or container at boot|none|thumb|Proxmox start VM or container at boot]]&lt;br /&gt;
&lt;br /&gt;
===Adding additional LXC templates===&lt;br /&gt;
Proxmox officially supports about a dozen different Linux distributions and provides up to date LXC templates through their repositories. These LXC templates can be downloaded using the &amp;lt;code&amp;gt;pveam&amp;lt;/code&amp;gt; tool or through the Proxmox web interface.&lt;br /&gt;
&lt;br /&gt;
====Adding templates using the web interface====&lt;br /&gt;
[[File:Proxmox CT Templates.png|alt=Download additional Proxmox CT Templates via the web interface|thumb|Download additional Proxmox CT Templates via the web interface]]&lt;br /&gt;
Go to your local storage and click on CT Templates. Click on the Templates button to see available templates.&lt;br /&gt;
&lt;br /&gt;
====Adding templates using the pveam utility====&lt;br /&gt;
The Proxmox VE Appliance Manager (&amp;lt;code&amp;gt;pveam&amp;lt;/code&amp;gt;) tool is available to manage container templates from Proxmox&#039;s repository. More information at https://pve.proxmox.com/pve-docs/pveam.1.html.&lt;br /&gt;
&lt;br /&gt;
Run &amp;lt;code&amp;gt;pveam available&amp;lt;/code&amp;gt; to list all available templates, then run &amp;lt;code&amp;gt;pveam download $storage $template&amp;lt;/code&amp;gt; to download a template to a storage pool. For example:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = root@server:~# pveam available&lt;br /&gt;
mail            proxmox-mailgateway-6.4-standard_6.4-1_amd64.tar.gz&lt;br /&gt;
mail            proxmox-mailgateway-7.0-standard_7.0-1_amd64.tar.gz&lt;br /&gt;
system          almalinux-8-default_20210928_amd64.tar.xz&lt;br /&gt;
system          alpine-3.12-default_20200823_amd64.tar.xz&lt;br /&gt;
system          alpine-3.13-default_20210419_amd64.tar.xz&lt;br /&gt;
system          alpine-3.14-default_20210623_amd64.tar.xz&lt;br /&gt;
system          alpine-3.15-default_20211202_amd64.tar.xz&lt;br /&gt;
system          archlinux-base_20210420-1_amd64.tar.gz&lt;br /&gt;
&lt;br /&gt;
root@server:~# pveam download local fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
downloading http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz to /var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
--2021-12-29 15:17:28--  http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz&lt;br /&gt;
Resolving download.proxmox.com (download.proxmox.com)... 144.217.225.162, 2607:5300:203:7dc2::162&lt;br /&gt;
Connecting to download.proxmox.com (download.proxmox.com){{!}}144.217.225.162{{!}}:80... connected.&lt;br /&gt;
HTTP request sent, awaiting response... 200 OK&lt;br /&gt;
Length: 89702020 (86M) [application/octet-stream]&lt;br /&gt;
Saving to: &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz.tmp.1053836&#039;&lt;br /&gt;
     0K ........ ........ ........ ........ 37% 2.59M 21s&lt;br /&gt;
 32768K ........ ........ ........ ........ 74% 13.6M 5s&lt;br /&gt;
 65536K ........ ........ .....            100% 20.5M=16s&lt;br /&gt;
2021-12-29 15:17:44 (5.42 MB/s) - &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz.tmp.1053836&#039; saved [89702020/89702020]&lt;br /&gt;
calculating checksum...OK, checksum verified&lt;br /&gt;
download of &#039;http://download.proxmox.com/images/system/fedora-35-default_20211111_amd64.tar.xz&#039; to &#039;/var/lib/vz/template/cache/fedora-35-default_20211111_amd64.tar.xz&#039; finished&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Using Terraform with Proxmox===&lt;br /&gt;
You can use Terraform to manage VMs on Proxmox. More information on how to set this up is on the [[Terraform]] page. Setup is relatively straightforward, and the Terraform plugin for Proxmox is feature-rich.&lt;br /&gt;
&lt;br /&gt;
On a related note, you can also use Packer to build VM images. The Packer-Proxmox plugin supports keyboard autotyping which makes automatically building VMs all the more simple.&lt;br /&gt;
&lt;br /&gt;
=== Install SSL certificates ===&lt;br /&gt;
Out of the box, Proxmox generates a self-signed SSL certificate for its management interfaces. This is sufficient from a security standpoint but will result in an SSL certificate warning every time you connect. You may wish to install a signed certificate from a trusted CA to correct this.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; Changing the SSL certificates prevented my SPICE clients from connecting. I had to revert back to the self-generated certificates for them to work again. If you are using SPICE for remote desktop, you may need to do further testing before rolling this out into production.&lt;br /&gt;
&lt;br /&gt;
To install a new set of SSL certificates, log in to Proxmox via SSH and copy your SSL private key and certificates to &amp;lt;code&amp;gt;/etc/pve/nodes/$node/pve-ssl.{pem,key}&amp;lt;/code&amp;gt;. Replace &#039;$node&#039; with the hostname of your Proxmox machine:&lt;br /&gt;
&lt;br /&gt;
* The &amp;lt;code&amp;gt;pve-ssl.pem&amp;lt;/code&amp;gt; file should contain your full certificate chain. This can be generated by concatenating your primary certificate, followed by any intermediate certificates.&lt;br /&gt;
* The &amp;lt;code&amp;gt;pve-ssl.key&amp;lt;/code&amp;gt; file should contain your private key without a password.&lt;br /&gt;
&lt;br /&gt;
Restart the pveproxy service to apply. If you have any problems, check the service&#039;s status with &amp;lt;code&amp;gt;systemctl status pveproxy&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cp fullchain.pem /etc/pve/nodes/&amp;lt;node&amp;gt;/pve-ssl.pem&lt;br /&gt;
# cp private-key.pem /etc/pve/nodes/&amp;lt;node&amp;gt;/pve-ssl.key&lt;br /&gt;
# systemctl restart pveproxy&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
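Before restarting pveproxy, it can be worth confirming that the certificate and key actually belong together. A minimal sketch, assuming the filenames from the example above: both commands print the SHA-256 digest of the public key, and the two digests must be identical.&lt;br /&gt;

```shell
# Hypothetical filenames matching the example above; adjust to yours.
# A certificate/key pair is valid only if both embed the same public key.
cert_pub=$(openssl x509 -in fullchain.pem -pubkey -noout | sha256sum)
key_pub=$(openssl pkey -in private-key.pem -pubout | sha256sum)
if [ "$cert_pub" = "$key_pub" ]; then echo "key matches certificate"; fi
```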
=== Reduce ZFS ARC memory requirement ===&lt;br /&gt;
By default, ZFS&#039;s ARC uses 50% of the host&#039;s memory. If you are running VMs using a ZFS storage pool that exceeds 50% of your system&#039;s available memory, you will need to tweak the amount of memory assigned to ARC to prevent out of memory issues.&lt;br /&gt;
&lt;br /&gt;
The minimum memory assigned to ARC as per Proxmox&#039;s documentation is 2GB + 1GB per TB of storage.&lt;br /&gt;
&lt;br /&gt;
To change the ARC max size to 3 GB: &amp;lt;code&amp;gt;echo &amp;quot;$[3 * 1024*1024*1024]&amp;quot; &amp;gt; /sys/module/zfs/parameters/zfs_arc_max&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To make this permanent, edit &amp;lt;code&amp;gt;/etc/modprobe.d/zfs.conf&amp;lt;/code&amp;gt; and add &amp;lt;code&amp;gt;options zfs zfs_arc_max=3221225472&amp;lt;/code&amp;gt;&lt;br /&gt;
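The byte value for the modprobe option can be computed from a target size in GiB; a small sketch using the 3 GiB figure from the example above:&lt;br /&gt;

```shell
# Compute zfs_arc_max in bytes from a target ARC size in GiB.
arc_gib=3
arc_bytes=$(( arc_gib * 1024 * 1024 * 1024 ))
echo "options zfs zfs_arc_max=${arc_bytes}"   # prints: options zfs zfs_arc_max=3221225472
```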
&lt;br /&gt;
=== Enable QEMU VNC server on a guest VM ===&lt;br /&gt;
There are a few solutions when you want to remote desktop into a VM such as running a RDP/VNC server within the VM or leverage the SPICE display/connection option built into Proxmox. These have their own advantages and disadvantages.&lt;br /&gt;
&lt;br /&gt;
I have recently started using the built-in VNC server that can be enabled in the KVM/QEMU config. The benefit of this approach is that you can control the VM during the boot process, seeing and controlling it just as you would in the Proxmox web interface.&lt;br /&gt;
&lt;br /&gt;
To enable VNC on a particular VM:&lt;br /&gt;
&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;/etc/pve/local/qemu-server/&amp;lt;vmid&amp;gt;.conf&amp;lt;/code&amp;gt; and add the following &amp;lt;code&amp;gt;args&amp;lt;/code&amp;gt; line to enable the VNC server on port 5901 (the number after the colon is the VNC display number; add 5900 to it to get the TCP port, so display 1 is really TCP port 5901). You may change this number if you have other VNC servers running. &amp;lt;code&amp;gt;args: -vnc 0.0.0.0:1,password=on&amp;lt;/code&amp;gt;&lt;br /&gt;
# In the same .conf file, add the following line to set the VNC password when the VM starts up. &amp;lt;code&amp;gt;hookscript: local:snippets/set_vnc_password.sh&amp;lt;/code&amp;gt;&lt;br /&gt;
# Create the snippet by creating a file at &amp;lt;code&amp;gt;/var/lib/vz/snippets/set_vnc_password.sh&amp;lt;/code&amp;gt;. Put the following in the script. Note the VNC password is set in plain text here.{{Highlight&lt;br /&gt;
| code = #!/bin/bash&lt;br /&gt;
&lt;br /&gt;
if [[ &amp;quot;$2&amp;quot; != &amp;quot;post-start&amp;quot; ]] ; then&lt;br /&gt;
        echo &amp;quot;Not at post-start stage. Skipping&amp;quot;&lt;br /&gt;
        exit 0&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
VM_ID=$1&lt;br /&gt;
VNC_PASSWORD=myvncpassword&lt;br /&gt;
&lt;br /&gt;
expect -c &amp;quot;&lt;br /&gt;
set timeout 5&lt;br /&gt;
spawn qm monitor $VM_ID&lt;br /&gt;
expect \&amp;quot;qm&amp;gt;\&amp;quot;&lt;br /&gt;
send \&amp;quot;set_password vnc $VNC_PASSWORD -d vnc2\r\&amp;quot;&lt;br /&gt;
expect \&amp;quot;qm&amp;gt;\&amp;quot;&lt;br /&gt;
send \&amp;quot;exit\r\&amp;quot;&lt;br /&gt;
exit&lt;br /&gt;
&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo&lt;br /&gt;
echo &amp;quot;VNC password set.&amp;quot;&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
# Test this by booting the VM. You should be able to connect to the Proxmox server on port 5901 using VNC and authenticate using the password set in the script.&lt;br /&gt;
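The display-to-port mapping used in step 1 is simply 5900 plus the display number:&lt;br /&gt;

```shell
# VNC display N listens on TCP port 5900 + N, so "-vnc 0.0.0.0:1"
# means TCP port 5901.
display=1
echo $(( 5900 + display ))   # prints: 5901
```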
&lt;br /&gt;
==== VNC server listening on a local socket ====&lt;br /&gt;
Alternatively, you may have the VNC server listen on a local socket. To do so, adjust the args with the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Listen on a socket (worked for me)&lt;br /&gt;
args: -object secret,id=vncpass,data=password -vnc unix:/var/run/qemu-server/$vmid-secondary.vnc,password-secret=vncpass&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
You may need to use the same snippet script above to set the password since libvirt/Proxmox may set a random passphrase on the VNC server.&lt;br /&gt;
&lt;br /&gt;
With the VNC server running and listening on a socket, you may expose the socket as a TCP port using something like &amp;lt;code&amp;gt;socat&amp;lt;/code&amp;gt;:{{Highlight&lt;br /&gt;
| code = # socat tcp4-listen:5915,fork,reuseaddr unix-connect:/var/run/qemu-server/115-secondary.vnc&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Attach a disk from another VM ===&lt;br /&gt;
To attach a disk to another VM, you must first detach it from the source VM. To do so, navigate to the VM&#039;s hardware panel, select the disk to detach, then click &#039;Disk Action&#039; and then &#039;Detach&#039;. Then go to &#039;Disk Action&#039; -&amp;gt; &#039;Reassign Disk&#039; and select the destination VM.&lt;br /&gt;
&lt;br /&gt;
==== The manual process ====&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This is no longer recommended; it predates the Reassign Disk feature. These steps are kept here for legacy reasons.&lt;br /&gt;
&lt;br /&gt;
If you are using ZFS for the VM disks, you can attach a disk from the source VM to another target VM by renaming the disk&#039;s VM ID to the target VM. You can run &amp;lt;code&amp;gt;zfs list&amp;lt;/code&amp;gt; on the Proxmox server to see what disks are available. Once the disk is renamed, the disk can be made visible as an unused disk by running &amp;lt;code&amp;gt;qm rescan&amp;lt;/code&amp;gt; to rescan and update the VM configs. From the Proxmox console, you should then be able to attach the unused disk under the hardware page.&lt;br /&gt;
&lt;br /&gt;
For example, to attach a VM disk which was originally connected to VM 101 to VM 102, I had to rename the disk &amp;lt;code&amp;gt;vm-101-disk-0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;vm-102-disk-1&amp;lt;/code&amp;gt; by running: &amp;lt;code&amp;gt;zfs rename data/vm-101-disk-0 data/vm-102-disk-1&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Renumber VM IDs ===&lt;br /&gt;
Use the following script to renumber a VM&#039;s ID. This script supports VMs stored on both LVM and ZFS. &#039;&#039;&#039;Power off the VM before proceeding&#039;&#039;&#039;. Proxmox should pick up the renamed VM automatically.&lt;br /&gt;
&lt;br /&gt;
Be aware that renumbering a VM may cause Cloud-Init to re-run again as it may think it&#039;s on a new instance. This could have unintended side effects such as having your system&#039;s SSH host keys wiped and accounts&#039; passwords reset.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = #!/bin/bash&lt;br /&gt;
Old=&amp;quot;$1&amp;quot;&lt;br /&gt;
New=&amp;quot;$2&amp;quot;&lt;br /&gt;
&lt;br /&gt;
usage(){&lt;br /&gt;
        echo &amp;quot;Usage: $0 old-vmid new-vmid&amp;quot;&lt;br /&gt;
        exit 2&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
if [ -z &amp;quot;$Old&amp;quot; ] {{!}}{{!}} ! expr &amp;quot;$Old&amp;quot; : &#039;^[0-9]\+$&#039; &amp;gt;/dev/null ; then&lt;br /&gt;
        echo &amp;quot;Error: Invalid old-vmid&amp;quot;&lt;br /&gt;
        usage&lt;br /&gt;
fi&lt;br /&gt;
if [ ! -f /etc/pve/qemu-server/$Old.conf ] ; then&lt;br /&gt;
        echo &amp;quot;Error: $Old is not a valid VM ID&amp;quot;&lt;br /&gt;
        exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ -z &amp;quot;$New&amp;quot; ] {{!}}{{!}} ! expr &amp;quot;$New&amp;quot; : &#039;^[0-9]\+$&#039; &amp;gt;/dev/null ; then&lt;br /&gt;
        echo &amp;quot;Error: Invalid new-vmid&amp;quot;&lt;br /&gt;
        usage&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks in the config:&amp;quot;&lt;br /&gt;
cat /etc/pve/qemu-server/$Old.conf {{!}} grep -i -- vm-$Old- {{!}} awk -F, &#039;{print $1}&#039; {{!}} awk &#039;{print $2}&#039;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks on LVM:&amp;quot;&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot;&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Found these disks on ZFS:&amp;quot;&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Will execute the following:&amp;quot;&lt;br /&gt;
echo&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} while read lv vg ; do&lt;br /&gt;
        echo lvrename $vg/$lv $vg/$(echo $lv {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039; {{!}} while read i ; do&lt;br /&gt;
        echo zfs rename $i $(echo $i {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
echo sed -i &amp;quot;s/$Old/$New/g&amp;quot; /etc/pve/qemu-server/$Old.conf&lt;br /&gt;
echo mv /etc/pve/qemu-server/$Old.conf /etc/pve/qemu-server/$New.conf&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo&lt;br /&gt;
read -p &amp;quot;Proceed with the rename? (y/n) &amp;quot; answer&lt;br /&gt;
case $answer in&lt;br /&gt;
  [yY]* ) echo &amp;quot;Proceeding...&amp;quot;;;&lt;br /&gt;
  * ) echo &amp;quot;Aborting...&amp;quot;; exit;;&lt;br /&gt;
esac&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task: Rename LVS data volumes&amp;quot;&lt;br /&gt;
lvs --noheadings -o lv_name,vg_name {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} while read lv vg ; do&lt;br /&gt;
        lvrename $vg/$lv $vg/$(echo $lv {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task: Rename ZFS datasets&amp;quot;&lt;br /&gt;
zfs list {{!}} grep &amp;quot;vm-$Old-&amp;quot; {{!}} awk &#039;{print $1}&#039; {{!}} while read i ; do&lt;br /&gt;
        zfs rename $i $(echo $i {{!}} sed &amp;quot;s/$Old/$New/g&amp;quot;)&lt;br /&gt;
done&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Renumbering IDs&amp;quot;&lt;br /&gt;
sed -i &amp;quot;s/$Old/$New/g&amp;quot; /etc/pve/qemu-server/$Old.conf&lt;br /&gt;
mv /etc/pve/qemu-server/$Old.conf /etc/pve/qemu-server/$New.conf&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Remove a single node cluster ===&lt;br /&gt;
If you created a single node cluster and want to undo it, run:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop pve-cluster corosync&lt;br /&gt;
# pmxcfs -l&lt;br /&gt;
# rm /etc/corosync/*&lt;br /&gt;
# rm /etc/pve/corosync.conf&lt;br /&gt;
# killall pmxcfs&lt;br /&gt;
# systemctl start pve-cluster&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
This was copied from: https://forum.proxmox.com/threads/proxmox-ve-6-removing-cluster-configuration.56259/&lt;br /&gt;
&lt;br /&gt;
=== Run Docker in LXC ===&lt;br /&gt;
Docker works in Proxmox LXC, but only if the nesting option is enabled. Docker runs in either privileged or unprivileged LXCs, but each has its own caveats:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Feature&lt;br /&gt;
!Privileged&lt;br /&gt;
!Unprivileged&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;NFS&#039;&#039;&#039;&lt;br /&gt;
|Can mount NFS; requires nfs=1 enabled in LXC configuration&lt;br /&gt;
|Cannot mount NFS unless you use a bind mountpoint&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;Security&#039;&#039;&#039;&lt;br /&gt;
|UIDs are not mapped; may be dangerous&lt;br /&gt;
|UIDs/GIDs are mapped&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;OverlayFS&#039;&#039;&#039;&lt;br /&gt;
|ZFS backing filesystem for &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; should have no issues.&lt;br /&gt;
|&amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; should use EXT or XFS (or some non-ZFS filesystem, more on this below.)&lt;br /&gt;
|}&lt;br /&gt;
It&#039;s recommended that you do not run Docker in LXC unless you have a good reason such as: a) hardware passthrough on systems that cannot support it through VMs, or b) tight memory constraints where containerized workloads benefit from shared memory.&lt;br /&gt;
&lt;br /&gt;
==== Issues with /var/lib/docker on a ZFS backed storage ====&lt;br /&gt;
Docker uses OverlayFS to handle the container layers. On ZFS-backed storage, this requires additional privileges to work properly, so in unprivileged LXCs, certain Docker image pulls may fail if the image data is stored on ZFS. To work around this issue, use a different filesystem for &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; by making a zvol and then formatting it with a different filesystem (see https://du.nkel.dev/blog/2021-03-25_proxmox_docker/). Alternatively, put the &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; volume on non-ZFS-backed storage such as LVM.  &lt;br /&gt;
&lt;br /&gt;
This issue manifests as failures when pulling some (but not all) container images, with errors similar to: &lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@nixos:~]# docker-compose pull&lt;br /&gt;
[+] Pulling 8/10&lt;br /&gt;
 ⠋ pihole 9 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿] 23.91MB/23.91MB Pulling                                                                 &lt;br /&gt;
   ✔ f03b40093957 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8063479210c7 Pull complete                                                                                        &lt;br /&gt;
   ✔ 4f4fb700ef54 Pull complete                                                                                        &lt;br /&gt;
   ✔ 061a6a1d9010 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8b1e64a56394 Pull complete                                                                                        &lt;br /&gt;
   ✔ 8dabcf07e578 Pull complete                                                                                        &lt;br /&gt;
   ⠿ bdec3efaf98a Extracting      [==================================================&amp;gt;]  23.91MB/23.91MB               &lt;br /&gt;
   ✔ 40cba0bade6e Download complete                                                                                    &lt;br /&gt;
   ✔ 9b797b6be3f3 Download complete                                                                                    &lt;br /&gt;
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unlinkat /var/cache/apt/archives: invalid argument&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Switching &amp;lt;code&amp;gt;/var/lib/docker&amp;lt;/code&amp;gt; to an ext4-backed filesystem fixed the issue. This can be done with the following steps:&lt;br /&gt;
&lt;br /&gt;
# Create the zvol:  &amp;lt;code&amp;gt;zfs create -V 8G rpool/data/ctvol-111-disk-1&amp;lt;/code&amp;gt;&lt;br /&gt;
# Format the zvol: &amp;lt;code&amp;gt;mkfs.ext4 /dev/zvol/rpool/data/ctvol-111-disk-1&amp;lt;/code&amp;gt;&lt;br /&gt;
# Edit the LXC config under &amp;lt;code&amp;gt;/etc/pve/nodes/pve/lxc/111.conf&amp;lt;/code&amp;gt; and add the following mountpoint config: &lt;br /&gt;
#* &amp;lt;code&amp;gt;mp0: /dev/zvol/rpool/data/ctvol-111-disk-1,mp=/var/lib/docker&amp;lt;/code&amp;gt;&lt;br /&gt;
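The steps above as a single sequence (a sketch only: the pool name rpool/data, container ID 111, and 8G size come from the example and must be adjusted for your setup):&lt;br /&gt;

```shell
# Sketch: requires a ZFS dataset named rpool/data and an existing CT 111.
CTID=111
zfs create -V 8G rpool/data/ctvol-${CTID}-disk-1
mkfs.ext4 /dev/zvol/rpool/data/ctvol-${CTID}-disk-1
# Append the mountpoint to the LXC config (restart the CT afterwards).
echo "mp0: /dev/zvol/rpool/data/ctvol-${CTID}-disk-1,mp=/var/lib/docker" \
  >> /etc/pve/nodes/pve/lxc/${CTID}.conf
```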
&lt;br /&gt;
=== Install NVIDIA driver in Proxmox ===&lt;br /&gt;
The NVIDIA Linux driver requires the following dependencies:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # apt install gcc make pve-headers&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
As part of the installation, the driver installer will ensure nouveau is disabled. It will blacklist the nouveau kernel module for you, but you will have to reboot (or &amp;lt;code&amp;gt;rmmod&amp;lt;/code&amp;gt; it) before running the installer again.&lt;br /&gt;
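For reference, the installer&#039;s nouveau blacklist amounts to a small modprobe config like the following (the path is an assumption; the installer normally writes this file for you, so this is only a manual equivalent):&lt;br /&gt;

```shell
# Manual equivalent of the installer's nouveau blacklist. Written to /tmp
# for illustration; the real target is a file under /etc/modprobe.d/.
printf 'blacklist nouveau\noptions nouveau modeset=0\n' > /tmp/disable-nouveau.conf
cat /tmp/disable-nouveau.conf
```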
&lt;br /&gt;
Once you have the NVIDIA drivers installed, to hardware accelerate your VM:&lt;br /&gt;
&lt;br /&gt;
* Set the graphics card of the VM to VirGL GPU&lt;br /&gt;
* Install the needed dependencies: &amp;lt;code&amp;gt;apt install libgl1 libegl1&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Setup LXC with NVIDIA GPU ===&lt;br /&gt;
You can share your NVIDIA GPU with one or more LXCs in Proxmox. This is possible only if the following conditions are met:&lt;br /&gt;
&lt;br /&gt;
# NVIDIA drivers are installed on Proxmox. {{Highlight&lt;br /&gt;
| code = # wget https://us.download.nvidia.com/XFree86/Linux-x86_64/535.54.03/NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
# chmod 755 NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
# ./NVIDIA-Linux-x86_64-535.54.03.run&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# The &amp;lt;u&amp;gt;exact same&amp;lt;/u&amp;gt; NVIDIA drivers are installed in the LXC. (Basically, run the same thing as above. If you&#039;re using NixOS, you might have to do some overrides or match the drivers on Proxmox to what was pulled in when you do nixos-rebuild switch).&lt;br /&gt;
# The LXC is allowed to read the NVIDIA devices using the &amp;lt;code&amp;gt;lxc.cgroup2.devices.allow&amp;lt;/code&amp;gt; setting. Do a listing of all the &amp;lt;code&amp;gt;/dev/nvidia*&amp;lt;/code&amp;gt; devices and ensure that you catch all the device major numbers.{{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 508:* rwm&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
# The NVIDIA devices are passed through to the LXC using the &amp;lt;code&amp;gt;lxc.mount.entry&amp;lt;/code&amp;gt; setting. Ensure that you capture all the &amp;lt;code&amp;gt;/dev/nvidia*&amp;lt;/code&amp;gt; devices that are appropriate for your card.{{Highlight&lt;br /&gt;
| code = lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
# Start the LXC. You should be able to run &#039;&amp;lt;code&amp;gt;nvidia-smi&amp;lt;/code&amp;gt;&#039; from inside the container and have it see the status of the card. If it doesn&#039;t, double-check everything above: ensure your driver version numbers match (it will tell you if they don&#039;t) and that you have allowed the appropriate device numbers.&lt;br /&gt;
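A small illustration of where the major numbers in the devices.allow lines come from: stat can report a device node&#039;s major number in hex. /dev/null (major 1) is used as a stand-in here, since the NVIDIA nodes only exist once the driver is loaded; run the same thing against /dev/nvidia0, /dev/nvidiactl, and friends for the real values.&lt;br /&gt;

```shell
# Extract a character device's major number and format a devices.allow
# line for it. /dev/null is always major 1 on Linux.
major_hex=$(stat -c '%t' /dev/null)
echo "c $(( 16#${major_hex} )):* rwm"   # prints: c 1:* rwm
```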
&lt;br /&gt;
==== How to set up Jellyfin on a Ubuntu based LXC ====&lt;br /&gt;
I used to run Jellyfin in a Docker VM with an NVIDIA GPU using PCIe passthrough, but I wanted to try running Jellyfin on another machine which doesn&#039;t support IOMMU. To work around this, I attempted to run Jellyfin within a LXC. &lt;br /&gt;
&lt;br /&gt;
What I did:&lt;br /&gt;
&lt;br /&gt;
# Create the LXC as a privileged container. Two reasons for this: 1) the device passthrough sort of needs it, and 2) I need to mount my Jellyfin media files via NFS.&lt;br /&gt;
# Set up LXC with a Ubuntu image. Install Jellyfin as per the https://jellyfin.org/docs/general/installation/linux/#ubuntu instructions. Additionally, install the &amp;lt;code&amp;gt;jellyfin-ffmpeg5 libnvidia-decode-525 libnvidia-encode-525 nvidia-utils-525&amp;lt;/code&amp;gt; packages for GPU support. I used version 525 as that&#039;s the most recent stable version offered by NVIDIA. Note that because we&#039;re using these pre-packaged binaries, the driver you install on the PVE must match the exact version.&lt;br /&gt;
# Look at the specific versions of the NVIDIA packages that were installed in the previous step. You can also do an &amp;lt;code&amp;gt;apt search libnvidia-encode&amp;lt;/code&amp;gt; to see what versions are available. Download and install the NVIDIA drivers on the Proxmox server itself, ensuring you get the exact version that matches the packages you installed in the LXC environment.&lt;br /&gt;
# Manually edit the LXC config under /etc/pve/lxc/&amp;lt;id&amp;gt;.conf and add: {{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}} /dev/dri/card1 and /dev/dri/renderD129 are the devices that appeared after installing the NVIDIA drivers on the PVE, which made it obvious that they were what I wanted. If you&#039;re unsure, you could try unloading the NVIDIA driver and seeing what disappears.&lt;br /&gt;
# Restart the LXC environment and see if nvidia-smi works. If it doesn&#039;t, ensure your version numbers are matching (it would tell you if it didn&#039;t) and that you have the appropriate device bind mounted into the LXC.&lt;br /&gt;
# Continue with the jellyfin installation as usual. Run: &amp;lt;code&amp;gt;wget -O- https://repo.jellyfin.org/install-debuntu.sh | sudo bash&amp;lt;/code&amp;gt;.&lt;br /&gt;
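Since the packaged binaries in the LXC and the host driver must match exactly, a quick sanity check can help. This is a minimal sketch; the version strings shown are examples, and the comments name where each value would come from:

```shell
# Hypothetical version check: the driver version bundled with the LXC packages
# must equal the driver version on the PVE host, or CUDA/NVENC will fail.
lxc_ver="525.147.05"   # inside the LXC, e.g. from: dpkg-query -W -f '${Version}' libnvidia-decode-525
pve_ver="525.147.05"   # on the PVE host, e.g. from: nvidia-smi --query-gpu=driver_version --format=csv,noheader
if [ "$lxc_ver" = "$pve_ver" ]; then
  echo "driver versions match"
else
  echo "driver version mismatch: reinstall one side"
fi
```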
&lt;br /&gt;
==== How to run Docker on a NixOS based LXC ====&lt;br /&gt;
I later moved Jellyfin into Docker on a NixOS-based LXC. This was done to simplify the Jellyfin update process and to leverage my existing docker-compose infrastructure. The NixOS LXC is privileged with nesting enabled (both are required by Docker).&lt;br /&gt;
&lt;br /&gt;
Here is the NixOS config to get Docker installed and with the NVIDIA drivers and runtime set up:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = virtualisation.docker = {&lt;br /&gt;
    enable = true;&lt;br /&gt;
    enableNvidia = true;&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  # Make sure opengl is enabled&lt;br /&gt;
  hardware.opengl = {&lt;br /&gt;
    enable = true;&lt;br /&gt;
    driSupport = true;&lt;br /&gt;
    driSupport32Bit = true;&lt;br /&gt;
  };&lt;br /&gt;
&lt;br /&gt;
  # NVIDIA drivers are unfree.&lt;br /&gt;
  nixpkgs.config.allowUnfreePredicate = pkg:&lt;br /&gt;
    builtins.elem (lib.getName pkg) [&lt;br /&gt;
      &amp;quot;nvidia-x11&amp;quot;&lt;br /&gt;
      &amp;quot;nvidia-settings&amp;quot;&lt;br /&gt;
      &amp;quot;nvidia-persistenced&amp;quot;&lt;br /&gt;
    ];&lt;br /&gt;
&lt;br /&gt;
  # Tell Xorg to use the nvidia driver&lt;br /&gt;
  services.xserver.videoDrivers = [&amp;quot;nvidia&amp;quot;];&lt;br /&gt;
&lt;br /&gt;
  hardware.nvidia = {&lt;br /&gt;
      package = config.boot.kernelPackages.nvidiaPackages.production;&lt;br /&gt;
   };&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
After doing a NixOS rebuild, Docker works and containers using the NVIDIA runtime also work.&lt;br /&gt;
&lt;br /&gt;
==== Troubleshooting NVIDIA LXC issues ====&lt;br /&gt;
&lt;br /&gt;
===== Missing nvidia-uvm =====&lt;br /&gt;
You also need the CUDA device &amp;lt;code&amp;gt;nvidia-uvm&amp;lt;/code&amp;gt; passed through.&lt;br /&gt;
&lt;br /&gt;
If this device is missing, the ffmpeg bundled with Jellyfin (with CUDA support) will fail with this error message:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [AVHWDeviceContext @ 0x55eea5d88f80] cu-&amp;gt;cuInit(0) failed -&amp;gt; CUDA_ERROR_UNKNOWN: unknown error&lt;br /&gt;
Device creation failed: -542398533.&lt;br /&gt;
Failed to set value &#039;cuda=cu:0&#039; for option &#039;init_hw_device&#039;: Generic error in an external library&lt;br /&gt;
Error parsing global options: Generic error in an external library&lt;br /&gt;
| lang = text&lt;br /&gt;
}}If the device isn&#039;t missing but something is misconfigured, strace will show the process attempting to open it but failing:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = openat(AT_FDCWD, &amp;quot;/dev/nvidia-uvm&amp;quot;, O_RDWR{{!}}O_CLOEXEC) = -1 EPERM (Operation not permitted)&lt;br /&gt;
openat(AT_FDCWD, &amp;quot;/dev/nvidia-uvm&amp;quot;, O_RDWR) = -1 EPERM (Operation not permitted)&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====== The fix ======&lt;br /&gt;
If the devices &amp;lt;code&amp;gt;/dev/nvidia-uvm&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;/dev/nvidia-uvm-tools&amp;lt;/code&amp;gt; aren&#039;t available on the PVE host (i.e. they&#039;re missing), then you will have to run this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Get the device number&lt;br /&gt;
# grep nvidia-uvm /proc/devices {{!}} awk &#039;{print $1}&#039;&lt;br /&gt;
&lt;br /&gt;
## Then create the device node using the device number above (508 for me, but this can change!)&lt;br /&gt;
#  mknod -m 666 /dev/nvidia-uvm c 508 0&lt;br /&gt;
#  mknod -m 666 /dev/nvidia-uvm-tools c 508 1&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
See also: https://old.reddit.com/r/qnap/comments/s7bbv6/fix_for_missing_nvidiauvm_device_devnvidiauvm/&lt;br /&gt;
&lt;br /&gt;
If you&#039;re using LXC, you also have to edit the LXC config file (at &amp;lt;code&amp;gt;/etc/pve/nodes/pve/lxc/###.conf&amp;lt;/code&amp;gt;) and add the following. Adjust the device numbers as needed.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = lxc.cgroup2.devices.allow: c 226:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 195:* rwm&lt;br /&gt;
lxc.cgroup2.devices.allow: c 510:* rwm&lt;br /&gt;
lxc.mount.entry: /dev/dri/card1 dev/dri/card0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/dri/renderD129 dev/dri/renderD128 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=file&lt;br /&gt;
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Randomly losing NVIDIA device access ====&lt;br /&gt;
Whenever you change an LXC setting (such as memory or CPU), it will break access to the NVIDIA GPUs from within the LXC. Any existing processes with the NVIDIA GPU open will continue to work, but any new processes trying to access the GPU will fail. The only fix is to restart the LXC.&lt;br /&gt;
&lt;br /&gt;
=== Replace a failed ZFS mirror boot device ===&lt;br /&gt;
If you are using ZFS mirror as the OS boot device and need to replace one of the disks in the mirror, keep in mind the following:&lt;br /&gt;
&lt;br /&gt;
* Each disk in the ZFS mirror has 3 partitions: 1. BIOS boot, 2. grub, 3. ZFS block device. When you replace a disk, you will have to recreate this partition scheme before you can replace the ZFS device using the 3rd partition.&lt;br /&gt;
* Grub is installed on the 2nd partition on all disks in the ZFS mirror. If you are replacing the disk that your BIOS uses to boot, you should still be able to boot Proxmox by booting off the other disks from the boot menu. After replacing the failed disk, you can reinstall the grub bootloader.&lt;br /&gt;
&lt;br /&gt;
The steps to replace a failed disk in a ZFS mirror that&#039;s used as the boot device on Proxmox are:&lt;br /&gt;
&lt;br /&gt;
# Remove the failed device and replace it with the new device&lt;br /&gt;
# Once you see the new device on your system (with lsblk), set up the partition table: &amp;lt;code&amp;gt;sgdisk &amp;lt;healthy bootable device&amp;gt; -R &amp;lt;new device&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Randomize the GUIDs on the new device: &amp;lt;code&amp;gt;sgdisk -G &amp;lt;new device&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Replace the failed device in your zpool. Check the status with &amp;lt;code&amp;gt;zpool status&amp;lt;/code&amp;gt; and then replace the failed device with &amp;lt;code&amp;gt;zpool replace -f &amp;lt;pool&amp;gt; &amp;lt;old zfs partition&amp;gt; &amp;lt;new zfs partition&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
# Reinstall the bootloader on the new device by formatting its grub partition: &amp;lt;code&amp;gt;proxmox-boot-tool format /dev/sda2&amp;lt;/code&amp;gt;&lt;br /&gt;
# Then reinstall grub: &amp;lt;code&amp;gt;proxmox-boot-tool init /dev/sda2&amp;lt;/code&amp;gt;&lt;br /&gt;
See Proxmox&#039;s sysadmin documentation on this topic at: https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_proxmox_boot_setup&lt;br /&gt;
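The steps above can be condensed into the following sketch. It is not meant to be run verbatim: the device names (sdb as the surviving healthy disk, sda as the new blank disk) and the pool name rpool are examples, so verify each against lsblk and zpool status first.

```shell
# Consolidated sketch of the disk-replacement steps (example device/pool names):
#   sgdisk /dev/sdb -R /dev/sda                 # copy partition table from healthy disk to new disk
#   sgdisk -G /dev/sda                          # randomize GUIDs on the NEW disk
#   zpool status                                # note the failed device's name
#   zpool replace -f rpool OLD_ZFS_DEV /dev/sda3  # OLD_ZFS_DEV = failed device from zpool status
#   proxmox-boot-tool format /dev/sda2          # format the new disk's boot partition
#   proxmox-boot-tool init /dev/sda2            # reinstall the bootloader on it
```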
&lt;br /&gt;
=== Migrating VMs from a Proxmox OS disk ===&lt;br /&gt;
I wanted to reinstall Proxmox on another disk on the same machine while preserving the VMs and containers. This required installing a clean copy of Proxmox on the new drive and then migrating all existing VMs and containers to the new install from the existing disk.&lt;br /&gt;
&lt;br /&gt;
I was able to get this all done by:&lt;br /&gt;
&lt;br /&gt;
* On the old install (while it&#039;s still running), tar up &amp;lt;code&amp;gt;/etc/pve&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;/var/lib/rrdcached/db&amp;lt;/code&amp;gt; from the old disk. You have to do this while the old install is running since the &amp;lt;code&amp;gt;.conf&amp;lt;/code&amp;gt; files seem to disappear when the system is off (I didn&#039;t investigate why due to time constraints). Restore these to the new Proxmox install.&lt;br /&gt;
* Add the old disk back as storage on the new Proxmox install. You do not (and should not) need to format anything. For LVM storage, just add an LVM storage entry and select the appropriate device. Your VM disks should appear, and you should then be able to migrate / attach them to the VMs you imported in the previous step.&lt;br /&gt;
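A sketch of the backup/restore step under these assumptions (the tarball path and the target hostname are examples; the source paths are the ones named above):

```shell
# On the OLD install, while it is still running:
#   tar -czpf /root/pve-config-backup.tar.gz /etc/pve /var/lib/rrdcached/db
# Copy the tarball over and restore on the NEW install:
#   scp /root/pve-config-backup.tar.gz root@new-pve:/root/
#   tar -xzpf /root/pve-config-backup.tar.gz -C /
```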
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
===LXC container does not start networking===&lt;br /&gt;
I have a Fedora-based LXC container which takes a long time to start up. After the tty does start, the only network adapter isn&#039;t up. &amp;lt;code&amp;gt;systemd-networkd&amp;lt;/code&amp;gt; is also in a failed state with a bad exit code and status &amp;lt;code&amp;gt;226/NAMESPACE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This was resolved by enabling nesting on this container with the following commands: &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # pct set $CTID -features nesting=1 &lt;br /&gt;
# pct reboot $CTID&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== PCIe passthrough fails due to platform RMRR ===&lt;br /&gt;
This is covered in more detail at [[HP DL380 G6#PCIe passthrough fails due to platform RMRR requirement|the HP DL380 G6 page on PCIe passthrough failing]]. Basically, the workaround is to install a patched kernel that doesn&#039;t respect the RMRR restrictions.&lt;br /&gt;
&lt;br /&gt;
If you want to build your own patched kernel, you might be interested in this repo: https://github.com/sarrchri/relax-intel-rmrr&lt;br /&gt;
&lt;br /&gt;
=== NVIDIA driver crashes with KVM using VirGL GPU ===&lt;br /&gt;
On Proxmox 7.4 with NVIDIA driver version 535.54.03 installed, starting a virtual machine with a VirGL GPU device results in KVM segfaulting:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [Wed Jul 26 11:13:38 2023] kvm[559117]: segfault at 0 ip 00007ff13a39ca23 sp 00007ffd4eb76450 error 4 in libnvidia-eglcore.so.535.54.03[7ff1398bd000+164e000]&lt;br /&gt;
[Wed Jul 26 11:13:38 2023] Code: 48 8b 45 10 44 8b a0 04 0d 00 00 e9 44 fe ff ff 0f 1f 80 00 00 00 00 48 8b 77 08 48 89 df e8 e4 4d 78 ff 48 8b 7d 10 48 89 de &amp;lt;48&amp;gt; 8b 07 ff 90 98 01 00 00 48 8b 43 38 e9 f2 fd ff ff 0f 1f 00 48&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
A reboot of the Proxmox server did not help.&lt;br /&gt;
&lt;br /&gt;
Fix: I re-installed the NVIDIA drivers but this time answered &#039;yes&#039; to installing the 32-bit libraries. Perhaps that was the fix, or perhaps simply re-installing recompiled the driver against the currently running kernel.&lt;br /&gt;
&lt;br /&gt;
=== Network interface reset: Detected Hardware Unit Hang ===&lt;br /&gt;
I have a machine with an Intel Corporation 82579V Gigabit network adapter which periodically stutters, especially when the server is under heavy load (mainly I/O with some CPU), with the following kernel message:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [Fri Aug 18 00:05:03 2023] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:&lt;br /&gt;
                             TDH                  &amp;lt;1d&amp;gt;&lt;br /&gt;
                             TDT                  &amp;lt;77&amp;gt;&lt;br /&gt;
                             next_to_use          &amp;lt;77&amp;gt;&lt;br /&gt;
                             next_to_clean        &amp;lt;1c&amp;gt;&lt;br /&gt;
                           buffer_info[next_to_clean]:&lt;br /&gt;
                             time_stamp           &amp;lt;105642dfd&amp;gt;&lt;br /&gt;
                             next_to_watch        &amp;lt;1d&amp;gt;&lt;br /&gt;
                             jiffies              &amp;lt;1056437b0&amp;gt;&lt;br /&gt;
                             next_to_watch.status &amp;lt;0&amp;gt;&lt;br /&gt;
                           MAC Status             &amp;lt;40080083&amp;gt;&lt;br /&gt;
                           PHY Status             &amp;lt;796d&amp;gt;&lt;br /&gt;
                           PHY 1000BASE-T Status  &amp;lt;3800&amp;gt;&lt;br /&gt;
                           PHY Extended Status    &amp;lt;3000&amp;gt;&lt;br /&gt;
                           PCI Status             &amp;lt;10&amp;gt;&lt;br /&gt;
[Fri Aug 18 00:05:03 2023] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
When this happens, all networking pauses temporarily and may cause connections to break.&lt;br /&gt;
&lt;br /&gt;
Some online research suggests a bug in the e1000e driver which, in conjunction with TCP segmentation offloading (TSO), may cause the interface to hang. The workaround is to disable hardware TSO, which you can do with &amp;lt;code&amp;gt;ethtool&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ethtool -K eno1 tso off gso off&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
This change will revert on the next reboot. If it helps with your situation, you can make it persistent by editing &amp;lt;code&amp;gt;/etc/network/interfaces&amp;lt;/code&amp;gt; with a &amp;lt;code&amp;gt;post-up&amp;lt;/code&amp;gt; command:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = auto eno1&lt;br /&gt;
iface eno1 inet static&lt;br /&gt;
  address 10.1.2.2&lt;br /&gt;
  netmask 255.255.252.0&lt;br /&gt;
  gateway 10.1.1.1&lt;br /&gt;
  post-up /sbin/ethtool -K eno1 tso off gso off&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== NVidia PCIe Bus Error ===&lt;br /&gt;
On my X99-E WS + Xeon E5-1660 + NVIDIA Quadro P400 machine running kernel 6.8.12-8-pve, which is currently my primary Proxmox box, I&#039;m now seeing a never-ending stream of correctable errors from the GPU. The GPU is plugged into the first slot. Kernel messages with these errors look like this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = root@pve:~# dmesg -T {{!}} tail&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] pcieport 0000:00:03.0: AER: Correctable error message received from 0000:05:00.0&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0:   device [10de:1cb3] error status/mask=00000001/0000a000&lt;br /&gt;
[Sat Aug  2 22:10:12 2025] nvidia 0000:05:00.0:    [ 0] RxErr                  (First)&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] pcieport 0000:00:03.0: AER: Correctable error message received from 0000:05:00.0&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0:   device [10de:1cb3] error status/mask=00000001/0000a000&lt;br /&gt;
[Sat Aug  2 22:10:14 2025] nvidia 0000:05:00.0:    [ 0] RxErr                  (First)&lt;br /&gt;
&lt;br /&gt;
root@pve:~# lspci -nnn {{!}} grep 05:00&lt;br /&gt;
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)&lt;br /&gt;
05:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Some potential solutions involve adding kernel arguments to &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt;, then running &amp;lt;code&amp;gt;update-grub&amp;lt;/code&amp;gt;, and rebooting.&lt;br /&gt;
&lt;br /&gt;
==== What didn&#039;t help ====&lt;br /&gt;
&lt;br /&gt;
* Adding &amp;lt;code&amp;gt;acpi_enforce_resources=no&amp;lt;/code&amp;gt; to the kernel args and rebooting did not help.&lt;br /&gt;
&lt;br /&gt;
What to try instead:&lt;br /&gt;
&lt;br /&gt;
* Try &amp;lt;code&amp;gt;pcie_aspm=off&amp;lt;/code&amp;gt;, which disables PCIe Active State Power Management.&lt;br /&gt;
* Try &amp;lt;code&amp;gt;pci=noaer&amp;lt;/code&amp;gt;, which turns off AER error reporting.&lt;br /&gt;
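For reference, kernel arguments go on the GRUB_CMDLINE_LINUX_DEFAULT line. A minimal sketch of the /etc/default/grub fragment, assuming pcie_aspm=off is the argument being tried (keep any arguments you already have):

```shell
# /etc/default/grub (fragment); after editing, run update-grub and reboot.
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"
```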
{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:LinuxUtilities]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7698</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7698"/>
		<updated>2025-08-03T04:16:33Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic-based IP cameras. It is a good alternative firmware for the Wyze Cam (v2 and v3, but not the v4) as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files via SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# Once this is done, you should see a wireless access point named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and then go to http://172.16.0.1/ to complete the setup. After you enter your password/wifi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino and Frigate working with NVIDIA hardware acceleration seems to be a little temperamental: a specific combination of configuration settings leads the detection and recording processes to eventually misbehave and crash. When this happens, ffmpeg uses 100% GPU and rapidly eats up as much system memory as it can get before OOM kills it. While this happens, the preview shows corrupt frames like this:&lt;br /&gt;
[[File:Frigate misbehaving.jpg|alt=Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.|none|thumb|Frigate + nvidia acceleration + Thingino misbehaving which causes frames to scramble until ffmpeg crashes.]]&lt;br /&gt;
A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use hwaccel_args: preset-nvidia or else ffmpeg goes off the rails and crashes.&lt;br /&gt;
# If you want to use hwaccel_args: preset-nvidia, then don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=File:Frigate_misbehaving.jpg&amp;diff=7697</id>
		<title>File:Frigate misbehaving.jpg</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=File:Frigate_misbehaving.jpg&amp;diff=7697"/>
		<updated>2025-08-03T04:15:22Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Frigate with ffmpeg misbehaving&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7696</id>
		<title>Thingino</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Thingino&amp;diff=7696"/>
		<updated>2025-08-03T03:58:20Z</updated>

		<summary type="html">&lt;p&gt;Leo: Created page with &amp;quot;Thingino is an open-source firmware for Ingenic based IP cameras. This is a good alternative firmware for the Wyze Cam (v2, v3, but not the v4) as it has RTSP support with an up-to-date kernel/OS.  == Installation == Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:  * https://github.com/wltechblog/thingino-installers  Simply find the installer for your camera and follow the instructions provided.  === Wyze ca...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Thingino is an open-source firmware for Ingenic-based IP cameras. It is a good alternative firmware for the Wyze Cam (v2 and v3, but not the v4) as it has RTSP support with an up-to-date kernel/OS.&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
Josh at WL Tech Blog has made Thingino installers for all the supported cameras available from his GitHub repo:&lt;br /&gt;
&lt;br /&gt;
* https://github.com/wltechblog/thingino-installers&lt;br /&gt;
&lt;br /&gt;
Simply find the installer for your camera and follow the instructions provided.&lt;br /&gt;
&lt;br /&gt;
=== Wyze cam v3 ===&lt;br /&gt;
Installation on the Wyze Cam v3 is simple: copy a few files to the SD card and reboot the camera. If you already have wz_mini_hacks installed, you don&#039;t even need to touch the camera; just copy the files via SSH and reboot.&lt;br /&gt;
&lt;br /&gt;
# Download the latest installer files from https://github.com/wltechblog/thingino-installers/tree/main/wyze-cam-3&lt;br /&gt;
# Extract the files and then upload them to the SD card. If you&#039;re using wz_mini_hacks with SSH enabled, you can copy them via SSH: {{Highlight&lt;br /&gt;
| code = $ for i in  factory_t31_ZMC6tiIDQN  thingino-wyze_cam3_t31al_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_atbm6031.bin  thingino-wyze_cam3_t31x_gc2053_rtl8189ftv.bin ; do cat &amp;quot;$i&amp;quot;  {{!}} ssh root@wyze-camera &amp;quot;cat - &amp;gt; /media/mmc/$i&amp;quot; ; done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Reboot the camera and wait 2-3 minutes.&lt;br /&gt;
# Once this is done, you should see a wireless access point named &amp;quot;THINGINO-XXXX&amp;quot;. Connect to it and then go to http://172.16.0.1/ to complete the setup. After you enter your password/wifi settings, you will see the MAC address of the device.&lt;br /&gt;
&lt;br /&gt;
== Frigate integration ==&lt;br /&gt;
Getting Thingino and Frigate working with NVIDIA hardware acceleration seems to be a little temperamental, with frequent issues that cause the detection and recording threads to crash. A few things that I had to do to make it work:&lt;br /&gt;
&lt;br /&gt;
# If you use the sub stream as an input, you shouldn&#039;t use hwaccel_args: preset-nvidia or else ffmpeg goes off the rails and crashes.&lt;br /&gt;
# If you want to use hwaccel_args: preset-nvidia, then don&#039;t use the sub stream as an input. You can still use it as a live stream.&lt;br /&gt;
&lt;br /&gt;
This seems to work for me:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = garage:&lt;br /&gt;
    ffmpeg:&lt;br /&gt;
      hwaccel_args: preset-nvidia&lt;br /&gt;
      output_args:&lt;br /&gt;
        record: preset-record-generic-audio-aac&lt;br /&gt;
      inputs:&lt;br /&gt;
        - path: rtsp://127.0.0.1:8554/wyze-garage?timeout=30&lt;br /&gt;
          input_args: preset-rtsp-restream&lt;br /&gt;
          roles:&lt;br /&gt;
            - record&lt;br /&gt;
            - detect&lt;br /&gt;
            - audio&lt;br /&gt;
    live:&lt;br /&gt;
      stream_name: wyze-garage_sub&lt;br /&gt;
go2rtc:&lt;br /&gt;
  streams:&lt;br /&gt;
    wyze-garage:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch0#timeout=30&lt;br /&gt;
    wyze-garage_sub:&lt;br /&gt;
       - rtsp://thingino:thingino@ip:554/ch1#timeout=30&lt;br /&gt;
&lt;br /&gt;
  webrtc:&lt;br /&gt;
    candidates:&lt;br /&gt;
      - 10.1.2.30:8555&lt;br /&gt;
      - stun:8555&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Inkbird_PTH-9C&amp;diff=7695</id>
		<title>Inkbird PTH-9C</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Inkbird_PTH-9C&amp;diff=7695"/>
		<updated>2025-08-02T05:28:21Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Inkbird PTH-9C is a 3-in-1 CO2/temperature/humidity detector with an LCD display. It has a [[MH-Z19 Carbon Dioxide Sensor|Winsen MH-Z19E]] sensor which is capable of detecting CO2 between 400 and 10,000 ppm, although the firmware in this device only reports up to 5,000 ppm.&lt;br /&gt;
&lt;br /&gt;
=== Wireless support? ===&lt;br /&gt;
[[File:PTH-9C Board.jpg|alt=PTH-9C Board|thumb|PTH-9C Board]]&lt;br /&gt;
The PTH-9C does not come with any wireless capabilities, unlike the PTH-9CW model. However, it appears that the main board is identical to the PTH-9CW&#039;s, with some components missing, including the crucial Tuya chip.&lt;br /&gt;
&lt;br /&gt;
Is it possible to solder an ESP8266 chip onto the existing pads and take readings? Likely not. I don&#039;t see any serial traffic on the TX/RX pads, so the firmware on the unmarked IC is likely not configured to send traffic out.&lt;br /&gt;
&lt;br /&gt;
If you really do want to get the CO2 sensor data, read it from the MH-Z19E directly on pin 3 (the center pin of the 5-pin header) at 9600 baud 8N1. The IC on this unit polls the sensor about once per second, with the sensor reporting measurements in this format:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = start, command = 86, high + low, 42, 3 bytes of 0&#039;s, checksum&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The important bits are the high and low bytes, which when read together as &amp;lt;code&amp;gt;0x01F5&amp;lt;/code&amp;gt; give you 501 (ppm).&lt;br /&gt;
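The frame decoding can be sketched in Python. This is a quick check script, not part of the device: the frame bytes come from the capture above, and the checksum rule (invert the sum of bytes 1..7, plus one, mod 256) is the standard MH-Z19 convention.

```python
# Decode one captured MH-Z19 response frame: ff 86 01 f5 42 00 00 00 42
frame = bytes.fromhex("ff8601f54200000042")

def decode(frame: bytes) -> int:
    """Return CO2 ppm from a 9-byte MH-Z19 'read CO2' (0x86) response."""
    assert frame[0] == 0xFF and frame[1] == 0x86, "not a read-CO2 response"
    # Checksum: invert the sum of bytes 1..7, plus one (mod 256).
    checksum = (0xFF - sum(frame[1:8]) % 256 + 1) % 256
    assert checksum == frame[8], "bad checksum"
    return frame[2] * 256 + frame[3]   # high byte * 256 + low byte

print(decode(frame))  # 501
```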
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[MH-Z19 Carbon Dioxide Sensor|The MH-Z19 CO2 sensor]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=File:PTH-9C_Board.jpg&amp;diff=7694</id>
		<title>File:PTH-9C Board.jpg</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=File:PTH-9C_Board.jpg&amp;diff=7694"/>
		<updated>2025-08-02T05:27:39Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;PTH-9C Board&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Inkbird_PTH-9C&amp;diff=7693</id>
		<title>Inkbird PTH-9C</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Inkbird_PTH-9C&amp;diff=7693"/>
		<updated>2025-08-02T05:11:28Z</updated>

		<summary type="html">&lt;p&gt;Leo: Created page with &amp;quot;The Inkbird PTH-9C is a 3-in-1 CO2/temperature/humidity detector with a LCD display. It has a Winsen MH-Z19E sensor which is capable of detecting CO2 between 400 and 10,000 ppm, although the firmware in this device only reports up to 5,000 ppm.  === Wireless support? === The PTH-9C does not come with any wireless capabilities, unlike the PTH-9CW model. However, it appears that the main board is identical to the PTH-9CW with some component...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Inkbird PTH-9C is a 3-in-1 CO2/temperature/humidity detector with an LCD display. It has a [[MH-Z19 Carbon Dioxide Sensor|Winsen MH-Z19E]] sensor which is capable of detecting CO2 between 400 and 10,000 ppm, although the firmware in this device only reports up to 5,000 ppm.&lt;br /&gt;
&lt;br /&gt;
=== Wireless support? ===&lt;br /&gt;
The PTH-9C does not come with any wireless capabilities, unlike the PTH-9CW model. However, it appears that the main board is identical to the PTH-9CW&#039;s, with some components missing, including the crucial Tuya chip.&lt;br /&gt;
&lt;br /&gt;
Is it possible to solder an ESP8266 chip onto the existing pads and take readings? Likely not. I don&#039;t see any serial traffic on the TX/RX pads, so the firmware on the unmarked IC is likely not configured to send traffic out.&lt;br /&gt;
&lt;br /&gt;
If you really do want to get the CO2 sensor data, read it from the MH-Z19E directly from pin 3 (center pin on the 5 pin header) at 9600 baud 8N1. The IC on this unit polls sensor data about every 1 second with the sensor reporting measurements in this format:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = start, command = 86, high + low, 42, 3 bytes of 0&#039;s, checksum&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
ff 86 01 f5 42 000000 42&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The important bits are the high+low values, which when read as &amp;lt;code&amp;gt;0x01F5&amp;lt;/code&amp;gt; gives you 501 (ppm).&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[MH-Z19 Carbon Dioxide Sensor|The MH-Z19 CO2 sensor]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=CloudStack&amp;diff=7692</id>
		<title>CloudStack</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=CloudStack&amp;diff=7692"/>
		<updated>2025-07-10T17:49:48Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Rebuilding UI */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Apache CloudStack is open-source cloud computing software. It is used to deploy an infrastructure-as-a-service (IaaS) platform on virtualization technologies such as KVM, VMware, and Xen. It is similar to OpenStack but significantly simpler to set up and manage (albeit with fewer features).&lt;br /&gt;
&lt;br /&gt;
This page contains my notes on setting up and using CloudStack 4.15. I am by no means a CloudStack expert so take my notes here with a huge grain of salt and feel free to make corrections.&lt;br /&gt;
&lt;br /&gt;
==Installation==&lt;br /&gt;
This installation is based on CloudStack 4.15 using CentOS 8. The setup described below uses KVM and Open vSwitch. I&#039;m basing the design decisions and approach from the installation guide at [http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html#management-server-installation http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html]&lt;br /&gt;
&lt;br /&gt;
===Overview===&lt;br /&gt;
I will have 1 management node and a few bare metal nodes. All nodes will have the same processor (Intel something) and memory (24GB).&lt;br /&gt;
&lt;br /&gt;
Each node will have the same network configuration based on Open vSwitch. There will be only one ethernet connection per node, with the various VLANs trunked to each node. The VLANs are:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Network&lt;br /&gt;
!Vlan&lt;br /&gt;
!Network subnet&lt;br /&gt;
|-&lt;br /&gt;
|Management&lt;br /&gt;
|11, untagged&lt;br /&gt;
|172.19.0.0/20&lt;br /&gt;
|-&lt;br /&gt;
|Storage&lt;br /&gt;
|3205&lt;br /&gt;
|172.22.0.0/24&lt;br /&gt;
|-&lt;br /&gt;
|Guest&lt;br /&gt;
|100 - 200&lt;br /&gt;
|n/a&lt;br /&gt;
|-&lt;br /&gt;
|Public&lt;br /&gt;
|2&lt;br /&gt;
|136.159.1.0/24&lt;br /&gt;
|}&lt;br /&gt;
The network configs for the 4 nodes I&#039;ll be using are listed below. There is also an NFS server used for primary storage. The IPs look odd because this was set up on an existing network.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Node&lt;br /&gt;
!Networks&lt;br /&gt;
|-&lt;br /&gt;
|management&lt;br /&gt;
|Management: 172.19.12.141/20&lt;br /&gt;
Storage: 172.22.0.241/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal1&lt;br /&gt;
|Management: 172.19.12.142/20&lt;br /&gt;
Storage: 172.22.0.242/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal2&lt;br /&gt;
|Management: 172.19.12.143/20&lt;br /&gt;
Storage: 172.22.0.243/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal3&lt;br /&gt;
|Management: 172.19.12.144/20&lt;br /&gt;
Storage: 172.22.0.244/24&lt;br /&gt;
|-&lt;br /&gt;
|netapp1&lt;br /&gt;
|Storage: 172.22.0.19/24&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Switch config===&lt;br /&gt;
For completeness, here&#039;s the configuration of the HP Procurve switch that the nodes are connected to. The switch should have all the guest VLANs defined and tagged.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = config&lt;br /&gt;
&lt;br /&gt;
# Guest VLANs&lt;br /&gt;
vlan 100 name guest100&lt;br /&gt;
vlan 101 name guest101&lt;br /&gt;
...&lt;br /&gt;
vlan 200 name guest200&lt;br /&gt;
interface 1-8 tagged vlan 100-200&lt;br /&gt;
&lt;br /&gt;
# Public, management, storage VLANs&lt;br /&gt;
vlan 2 name public&lt;br /&gt;
vlan 11 name management&lt;br /&gt;
vlan 3205 name storage&lt;br /&gt;
interface 1-8 untagged vlan 11&lt;br /&gt;
interface 1-8 tagged vlan 2,3205&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Node setup===&lt;br /&gt;
Each node will be set up with the following sub-steps.&lt;br /&gt;
&lt;br /&gt;
====CloudStack Repos====&lt;br /&gt;
Install CloudStack repos.{{Highlight&lt;br /&gt;
| code = # cat &amp;gt; /etc/yum.repos.d/cloudstack.repo &amp;lt;&amp;lt;EOF&lt;br /&gt;
[cloudstack]&lt;br /&gt;
name=cloudstack&lt;br /&gt;
baseurl=http://download.cloudstack.org/centos/8/4.15/&lt;br /&gt;
enabled=1&lt;br /&gt;
gpgcheck=0&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Install base packages====&lt;br /&gt;
Install all other dependencies.{{Highlight&lt;br /&gt;
| code = # yum -y install epel-release&lt;br /&gt;
# yum -y install bridge-utils net-tools&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Install Open vSwitch from CentOS Extras:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install \&lt;br /&gt;
http://mirror.centos.org/centos/8/extras/x86_64/os/Packages/centos-release-nfv-openvswitch-1-3.el8.noarch.rpm \&lt;br /&gt;
http://mirror.centos.org/centos/8/extras/x86_64/os/Packages/centos-release-nfv-common-1-3.el8.noarch.rpm&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Disable SELinux====&lt;br /&gt;
The system should have SELinux disabled. Use &amp;lt;code&amp;gt;setenforce&amp;lt;/code&amp;gt; and edit the SELinux config:{{Highlight&lt;br /&gt;
| code = # setenforce 0&lt;br /&gt;
# vi /etc/selinux/config &lt;br /&gt;
## set SELINUX=disabled&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Disable firewalld====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop firewalld&lt;br /&gt;
# systemctl disable firewalld&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Configure Open vSwitch====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # echo &amp;quot;blacklist bridge&amp;quot; &amp;gt;&amp;gt; /etc/modprobe.d/local-blacklist.conf&lt;br /&gt;
# echo &amp;quot;install bridge /bin/false&amp;quot; &amp;gt;&amp;gt; /etc/modprobe.d/local-dontload.conf&lt;br /&gt;
&lt;br /&gt;
# systemctl enable --now openvswitch&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
We will be using network-scripts to configure the Open vSwitch bridges later. I removed NetworkManager but retained network-scripts to ensure NetworkManager doesn&#039;t interfere with my network setup. (The install guide leaves NetworkManager in place.)&lt;br /&gt;
&lt;br /&gt;
I create a &#039;shared&#039; bridge tied to the network interface, called &amp;lt;code&amp;gt;nic0&amp;lt;/code&amp;gt;. This was done to make it easier to change the bridge setup during my testing, but it could be simplified. Each of the physical networks I later set up in CloudStack gets its own bridge, to make it obvious how VMs get connected to the network.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl add-br   nic0&lt;br /&gt;
# ovs-vsctl add-port nic0 enp4s0f0 tag=11 vlan_mode=native-untagged&lt;br /&gt;
# ovs-vsctl set port nic0 trunks=2,11,40-49,3205&lt;br /&gt;
&lt;br /&gt;
# ovs-vsctl add-br management0 nic0 11&lt;br /&gt;
# ovs-vsctl add-br cloudbr0 nic0 2&lt;br /&gt;
# ovs-vsctl add-br cloudbr1 nic0 100&lt;br /&gt;
# ovs-vsctl add-br storage0 nic0 3205&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}The node&#039;s management IP address needs to be removed from the primary network interface and then assigned to the management0 interface. If you&#039;re doing this on a node remotely, this might interrupt your connection.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ip addr del 172.19.12.141/20 dev enp4s0f0&lt;br /&gt;
# ip addr add 172.19.12.141/20 dev management0&lt;br /&gt;
# ip route add default via 172.19.0.3&lt;br /&gt;
# ip addr add 172.22.0.241/24 dev storage0&lt;br /&gt;
&lt;br /&gt;
# ip link set management0 up&lt;br /&gt;
# ip link set storage0 up&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Network configuration====&lt;br /&gt;
Once the Open vSwitch bridges are set up, configure the interfaces as follows:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Network Interface&lt;br /&gt;
!Role&lt;br /&gt;
!Configuration&lt;br /&gt;
|-&lt;br /&gt;
|enp4s0f0&lt;br /&gt;
|primary NIC in the host&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|nic0&lt;br /&gt;
|OVS bridge that connects the other bridges to the physical NIC&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|cloudbr0&lt;br /&gt;
|public traffic&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|cloudbr1&lt;br /&gt;
|guest traffic&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|management0&lt;br /&gt;
|management traffic&lt;br /&gt;
|up on boot; assigned with management IP&lt;br /&gt;
|-&lt;br /&gt;
|storage0&lt;br /&gt;
|storage traffic&lt;br /&gt;
|up on boot; assigned with storage network IP&lt;br /&gt;
|-&lt;br /&gt;
|cloud0&lt;br /&gt;
|link local traffic&lt;br /&gt;
|up on boot; assigned 169.254.0.1/16&lt;br /&gt;
|}Network configs are applied using network-scripts. The idea is to have the network interfaces configured automatically when the system boots. For interfaces that require a static IP address, I used the following network-scripts file. Adjust the device name and IP address as required.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat &amp;lt;&amp;lt;EOF &amp;gt; /etc/sysconfig/network-scripts/ifcfg-cloudbr0&lt;br /&gt;
DEVICE=cloudbr0&lt;br /&gt;
TYPE=Bridge&lt;br /&gt;
ONBOOT=yes&lt;br /&gt;
BOOTPROTO=static&lt;br /&gt;
IPV6INIT=no&lt;br /&gt;
IPV6_AUTOCONF=no&lt;br /&gt;
DELAY=5&lt;br /&gt;
IPADDR=172.16.10.2&lt;br /&gt;
GATEWAY=172.16.10.1&lt;br /&gt;
NETMASK=255.255.255.0&lt;br /&gt;
DNS1=8.8.8.8&lt;br /&gt;
DNS2=8.8.4.4&lt;br /&gt;
USERCTL=no&lt;br /&gt;
NM_CONTROLLED=no&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}For devices that don&#039;t require a static IP:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = cat &amp;lt;&amp;lt;EOF &amp;gt; ifcfg-cloudbr0&lt;br /&gt;
DEVICE=cloudbr0&lt;br /&gt;
TYPE=OVSBridge&lt;br /&gt;
DEVICETYPE=ovs&lt;br /&gt;
ONBOOT=yes&lt;br /&gt;
BOOTPROTO=none&lt;br /&gt;
HOTPLUG=no&lt;br /&gt;
NM_CONTROLLED=no&lt;br /&gt;
EOF&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Once configured, verify that your node comes up with the proper network settings on a reboot.&lt;br /&gt;
&lt;br /&gt;
===Management node setup===&lt;br /&gt;
On the management node, set up the network configs and the CloudStack management packages.&lt;br /&gt;
&lt;br /&gt;
====Setup Storage====&lt;br /&gt;
If you intend to use the management server as the primary and secondary storage, you will need to set up an NFS server. If you intend to use an external NFS server as the primary storage, you can skip this step.{{Highlight&lt;br /&gt;
| code = # mkdir -p /export/primary /export/secondary&lt;br /&gt;
# yum -y install nfs-utils&lt;br /&gt;
# cat &amp;gt; /etc/exports &amp;lt;&amp;lt;EOF&lt;br /&gt;
/export/secondary *(rw,async,no_root_squash,no_subtree_check)&lt;br /&gt;
/export/primary *(rw,async,no_root_squash,no_subtree_check)&lt;br /&gt;
EOF&lt;br /&gt;
# systemctl enable --now nfs-server&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====CloudStack management services====&lt;br /&gt;
Install MySQL. MariaDB isn&#039;t supported and the installation fails with it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # rpm -ivh http://repo.mysql.com/mysql80-community-release-el8.rpm&lt;br /&gt;
# yum -y install mysql-server&lt;br /&gt;
# yum -y install mysql-connector-python&lt;br /&gt;
&lt;br /&gt;
## edit /etc/my.cnf to have the following lines.&lt;br /&gt;
cat &amp;gt;&amp;gt; /etc/my.cnf &amp;lt;&amp;lt;EOF&lt;br /&gt;
[mysqld]&lt;br /&gt;
innodb_rollback_on_timeout=1&lt;br /&gt;
innodb_lock_wait_timeout=600&lt;br /&gt;
max_connections=350&lt;br /&gt;
log-bin=mysql-bin&lt;br /&gt;
binlog-format = &#039;ROW&#039;&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
# systemctl enable --now mysqld&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Setup CloudStack.{{Highlight&lt;br /&gt;
| code = # yum -y install cloudstack-management&lt;br /&gt;
&lt;br /&gt;
# cloudstack-setup-databases cloud:password@localhost --deploy-as=root&lt;br /&gt;
# cloudstack-setup-management&lt;br /&gt;
# systemctl enable --now cloudstack-management&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}After starting &amp;lt;code&amp;gt;cloudstack-management&amp;lt;/code&amp;gt; for the first time, it might take 2-10 minutes for the database to be set up completely. During this time, the web interface won&#039;t be responsive. In the meantime, you will need to seed the system VM images to the secondary storage. If you are using an external NFS server for your secondary storage, adjust the mount point in the following command accordingly.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Seed the systemvm into secondary storage&lt;br /&gt;
# /usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /export/secondary -u https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-kvm.qcow2.bz2 -h kvm -F&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
We will continue the setup process via the web interface after setting up a bare metal node.&lt;br /&gt;
&lt;br /&gt;
===Bare metal node setup===&lt;br /&gt;
You should set up at least one bare metal node which will be used to set up your first zone and pod.&lt;br /&gt;
&lt;br /&gt;
On a bare metal node, set up everything outlined in the [[CloudStack#Node setup|Node setup]] section above. The node should have the CloudStack repos, Open vSwitch, SELinux/firewalld, and the networking configured. The agent node must have virtualization enabled in the CPU, and KVM should be installed; you should be able to find &amp;lt;code&amp;gt;/dev/kvm&amp;lt;/code&amp;gt; on the system.&lt;br /&gt;
&lt;br /&gt;
====CloudStack Agent====&lt;br /&gt;
To set up the node, install the &amp;lt;code&amp;gt;cloudstack-agent&amp;lt;/code&amp;gt; package.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Configure &amp;lt;code&amp;gt;qemu&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;libvirtd&amp;lt;/code&amp;gt;. If you need some starting configs, try the following:{{Highlight&lt;br /&gt;
| code = ## edit /etc/libvirt/qemu.conf &lt;br /&gt;
vnc_listen=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
## edit /etc/libvirt/libvirtd.conf&lt;br /&gt;
listen_tcp = 1&lt;br /&gt;
tcp_port = &amp;quot;16509&amp;quot;&lt;br /&gt;
listen_tls = 0&lt;br /&gt;
tls_port = &amp;quot;16514&amp;quot;&lt;br /&gt;
auth_tcp = &amp;quot;none&amp;quot;&lt;br /&gt;
mdns_adv = 0&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}The CloudStack install guide instructs you to add &amp;lt;code&amp;gt;--listen&amp;lt;/code&amp;gt; to the libvirtd arguments, but this will prevent libvirtd from starting under systemd. Instead, skip this step entirely; the CloudStack agent will configure this for you when you add the node to a zone.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## The install guide suggests editing /etc/sysconfig/libvirtd to use the listen flag.&lt;br /&gt;
## However, this only works if you&#039;re not using systemd or are using the libvirtd-tcp socket.&lt;br /&gt;
## I skipped this step since the agent will configure this later on.&lt;br /&gt;
LIBVIRTD_ARGS=&amp;quot;--listen&amp;quot;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Start the CloudStack agent and verify that it is running. At this point, libvirtd should also be running (it&#039;s a service dependency). &lt;br /&gt;
&lt;br /&gt;
With the agent running on the node, you should now be able to add the node to the CloudStack cluster. This process will rewrite the &amp;lt;code&amp;gt;libvirtd.conf&amp;lt;/code&amp;gt; file and should set &amp;lt;code&amp;gt;listen_tcp=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;listen_tls=1&amp;lt;/code&amp;gt; for you (so that libvirt traffic such as migrations is carried over TLS rather than plain TCP). &lt;br /&gt;
&lt;br /&gt;
====Allow sudo access====&lt;br /&gt;
Ensure that &amp;lt;code&amp;gt;/etc/sudoers&amp;lt;/code&amp;gt; does not require a TTY. In the older documentation, CloudStack requires that the &#039;cloud&#039; user be able to sudo, with the addition of &amp;lt;code&amp;gt;Defaults:cloud !requiretty&amp;lt;/code&amp;gt;. However, looking at the installation on the CentOS 8 box, the agent actually runs as root, so perhaps root needs to be able to sudo? &lt;br /&gt;
&lt;br /&gt;
===Setting up your first zone===&lt;br /&gt;
At this point, you should have at least one bare metal host, and your management node should be up and serving the CloudStack web UI at http://cloudstack:8080/client. Log in using the default &amp;lt;code&amp;gt;admin&amp;lt;/code&amp;gt; / &amp;lt;code&amp;gt;password&amp;lt;/code&amp;gt; credentials.&lt;br /&gt;
&lt;br /&gt;
You will be greeted with a setup wizard. I have had no luck with it, and it&#039;s better to ignore it. Instead, navigate to Infrastructure -&amp;gt; Zones and manually set up your first zone. &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Description&lt;br /&gt;
!Screenshot&lt;br /&gt;
|-&lt;br /&gt;
|There are 3 types of zones that you can create:&lt;br /&gt;
&lt;br /&gt;
#&#039;&#039;&#039;Basic zone&#039;&#039;&#039; - All guest VMs are placed on a single shared flat network. There are no isolation or security policies in place to prevent guest VMs from seeing each other.&lt;br /&gt;
#&#039;&#039;&#039;Advanced zone&#039;&#039;&#039; - Guest VMs can be placed in one or more VLAN-based networks. Guest networks can either be isolated or L2. Isolated networks (depending on the chosen network offering) come with a virtual router (VR) which offers NAT/SNAT and firewall services and uses one or more public IP addresses. L2 networks are similar but have no virtual router; these services must be provided externally. Tenants can also create a virtual private cloud (VPC). A VPC is like a regular isolated guest network but with additional features. A VPC allows the user to:&lt;br /&gt;
##Create multiple subnets (called tiers) which can route to each other&lt;br /&gt;
##Control network traffic between tiers through network ACLs&lt;br /&gt;
##Associate one or more public IPs with the VPC&lt;br /&gt;
##NAT all subnets out through a single public IP, like an isolated guest network&lt;br /&gt;
##Create a private gateway (and therefore static routes) within the VPC&lt;br /&gt;
##Create a VPN connection to the VPC&lt;br /&gt;
#&#039;&#039;&#039;Advanced zone with security groups&#039;&#039;&#039; - Guest VMs are placed on a shared network that is publicly routable. There is no concept of a &#039;public&#039; network because the guest network itself is public. As a result, there is no ability to create any other kind of guest network or VPC. The only benefit is the ability to define security groups per VM (implemented via IPTables on the bare metal host). Because enabling security groups in a zone prevents that zone from creating isolated guest networks or VPCs, the security group feature only appears useful in an environment where guests only need to connect to the internet.&lt;br /&gt;
&lt;br /&gt;
Be aware of each type&#039;s limitations before continuing.&lt;br /&gt;
&lt;br /&gt;
We will be creating an advanced network zone.&lt;br /&gt;
|[[File:CloudStack - New Zone 1.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|We will add the DNS resolvers for the zone and specify the hypervisor type (KVM). &lt;br /&gt;
Empty the guest CIDR since we&#039;re going to allow users to specify their own.&lt;br /&gt;
|[[File:CloudStack - New Zone 2.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|When using the advanced zone, you need to specify the physical networks for the management, storage, and public networks. &lt;br /&gt;
These should correspond to the physical network devices on the hypervisor. Recall that in the previous step where we set up the Open vSwitch bridges, we created the following bridges for each role:&lt;br /&gt;
&lt;br /&gt;
*management - management0&lt;br /&gt;
*storage - storage0&lt;br /&gt;
*public - cloudbr0&lt;br /&gt;
*guest - cloudbr1&lt;br /&gt;
|&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:CloudStack - New Zone 3.png&lt;br /&gt;
File:CloudStack - New Zone 3a.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Specify the public network. The addresses defined here populate the &#039;Public IP&#039; pool.&lt;br /&gt;
&lt;br /&gt;
All isolated guest networks and VPCs will use one of the addresses defined in this pool for SNAT/NAT. The addresses specified here should therefore be reachable from the internet.&lt;br /&gt;
|[[File:CloudStack - New Zone 4.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Create a new pod. &lt;br /&gt;
&lt;br /&gt;
The pod network here should cover your management network subnet. The reserved IP addresses here will be used by system VMs that require access to the management network. &lt;br /&gt;
|[[File:CloudStack - New Zone 5.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify the guest network VLAN range.&lt;br /&gt;
&lt;br /&gt;
Because we&#039;re using VLAN as an isolation method, this range specifies what VLANs the guest networks will use over the guest physical network.&lt;br /&gt;
|[[File:CloudStack - New Zone 6.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify the storage network.&lt;br /&gt;
&lt;br /&gt;
The reserved start/end IPs will be used by system VMs that require access to the primary storage. &lt;br /&gt;
&lt;br /&gt;
|If you are assigning static IPs on your bare metal hosts, ensure that the IP range specified here doesn&#039;t overlap with those addresses (I once had CloudStack assign a VM the same IP as a bare metal host)&lt;br /&gt;
|[[File:CloudStack - New Zone 7.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify a cluster name.&lt;br /&gt;
|[[File:CloudStack - New Zone 8.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Add your first bare metal host. &lt;br /&gt;
You must add one host now and can add additional ones later.&lt;br /&gt;
|[[File:CloudStack - New Zone 9.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify your primary storage.&lt;br /&gt;
The server should be accessible from the storage network.&lt;br /&gt;
|[[File:CloudStack - New Zone 10.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify your secondary storage.&lt;br /&gt;
&lt;br /&gt;
You need to have at least one NFS secondary storage that has been seeded with the system VM template. &lt;br /&gt;
&lt;br /&gt;
Secondary storage pools should be accessible from the management network (&#039;&#039;&#039;confirm&#039;&#039;&#039;?)&lt;br /&gt;
|[[File:CloudStack - New Zone 11.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Launch the zone.&lt;br /&gt;
This step might take a few minutes. If all goes well, you can enable the zone shortly after. If you run into any problems, check the logs on the management node at &amp;lt;code&amp;gt;/var/log/cloudstack/management&amp;lt;/code&amp;gt;.&lt;br /&gt;
|[[File:CloudStack - New Zone 12.png|left|thumb]]&lt;br /&gt;
|}&lt;br /&gt;
Once your zone has been enabled, it should automatically start a console proxy VM and a secondary storage VM. You can find these under Infrastructure -&amp;gt; System VMs. If the system VMs are not starting, check that your systemvm template is available in your secondary storage and that the cloud0 bridge on each host is up. You should be able to ping the link-local IP address (the 169.254.x.x address) from the hypervisor.&lt;br /&gt;
&lt;br /&gt;
Once the two system VMs are running, verify that you&#039;re able to create new guest networks or VPCs. These networks should create a virtual router. &lt;br /&gt;
&lt;br /&gt;
==Configuration==&lt;br /&gt;
&lt;br /&gt;
===Service offerings===&lt;br /&gt;
&lt;br /&gt;
====Deployment planner====&lt;br /&gt;
There are a few deployment techniques that can be used. These are set within a compute offering and cannot be changed after it&#039;s been created (&#039;&#039;&#039;really?&#039;&#039;&#039; can we change it via API?). The options are:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Deployment planner&lt;br /&gt;
!Description&lt;br /&gt;
|-&lt;br /&gt;
|First fit&lt;br /&gt;
|Placed on the first host that has sufficient capacity&lt;br /&gt;
|-&lt;br /&gt;
|User dispersing&lt;br /&gt;
|Evenly distributes VMs by account across clusters&lt;br /&gt;
|-&lt;br /&gt;
|User concentrated&lt;br /&gt;
|Opposite of the above.&lt;br /&gt;
|-&lt;br /&gt;
|Implicit dedication&lt;br /&gt;
|Requires or prefers (depending on the planner mode) a dedicated host&lt;br /&gt;
|-&lt;br /&gt;
|Bare metal&lt;br /&gt;
|Requires a bare metal host&lt;br /&gt;
|}&lt;br /&gt;
More information is available in [https://docs.cloudstack.apache.org/en/4.15.2.0/adminguide/service_offerings.html#compute-and-disk-service-offerings CloudStack&#039;s documentation on Compute and Disk Service Offerings].&lt;br /&gt;
&lt;br /&gt;
===Enable SAML2 authentication===&lt;br /&gt;
Enable the SAML2 plugin by setting &amp;lt;code&amp;gt;saml2.enabled=true&amp;lt;/code&amp;gt; under Global Settings. &lt;br /&gt;
&lt;br /&gt;
Set up SAML authentication by specifying the following settings:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Setting&lt;br /&gt;
!Description&lt;br /&gt;
!Example value&lt;br /&gt;
|-&lt;br /&gt;
|saml2.default.idpid&lt;br /&gt;
|The URL of the identity provider. This is likely obtained from the metadata URL and set by the SAML2 plugin every time CloudStack starts.&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://sts.windows.net/c609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.idp.metadata.url&lt;br /&gt;
|The metadata XML URL&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://login.microsoftonline.com/609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/federationmetadata/2007-06/federationmetadata.xml?appid=c5b8df24-xxx-xxx-xxx-xxxxxxxxxxxx&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.sp.id&lt;br /&gt;
|The identifier string for this application&lt;br /&gt;
|cloudstack-test.my-organization.tld&lt;br /&gt;
|-&lt;br /&gt;
|saml2.redirect.url&lt;br /&gt;
|The redirect URL using your cloudstack domain.&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://cloudstack-test.my-organization.tld/client&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.user.attribute&lt;br /&gt;
|The attribute to use as the username. &lt;br /&gt;
If you&#039;re not sure what&#039;s available, look at the management logs after a login attempt.&lt;br /&gt;
|For Azure AD, use the email address attribute: &amp;lt;nowiki&amp;gt;http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
Restart the management server. To grant a user access, create the user and enable SSO for them. The user&#039;s username must match the value obtained from the &amp;lt;code&amp;gt;saml2.user.attribute&amp;lt;/code&amp;gt; field.&lt;br /&gt;
&lt;br /&gt;
====Bugs====&lt;br /&gt;
&lt;br /&gt;
=====SAML Request being rejected by Azure AD=====&lt;br /&gt;
If you are using Azure AD, you may have issues authenticating because the SAML request ID that&#039;s generated might begin with a number. When this happens, you will get an error similar to: &amp;lt;code&amp;gt;AADSTS7500529: The value &#039;692rv91k6dgmdas33vr3b2keahr4lqjv&#039; is not a valid SAML ID. The ID must not begin with a number.&amp;lt;/code&amp;gt;. For more information, see: https://github.com/apache/cloudstack/issues/5548&lt;br /&gt;
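To illustrate the constraint (a sketch of the xsd:ID rule only; the actual fix for CloudStack is tracked in the linked issue), SAML request IDs are XML NCName values, which must not begin with a digit. Prefixing the random part with an underscore or letter always yields a valid ID:

```python
import secrets

def make_saml_id():
    # xsd:ID values (XML NCNames) must not start with a digit; a leading
    # underscore guarantees validity regardless of the random portion.
    return "_" + secrets.token_hex(16)

saml_id = make_saml_id()
print(saml_id[0].isdigit())  # prints False
```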
&lt;br /&gt;
=====Users cannot login via SSO=====&lt;br /&gt;
Users that will be using SAML for authentication need to have their CloudStack accounts created with SSO enabled. There seems to be a bug in the CloudStack web UI where a user&#039;s SAML IdP ID isn&#039;t settable (it gets set to &#039;0&#039;). A work-around is to create and authorize users via CloudMonkey.&lt;br /&gt;
&lt;br /&gt;
The steps to add a new user are:&lt;br /&gt;
&lt;br /&gt;
#Create the user:  &amp;lt;code&amp;gt;create user firstname=First lastname=User email=user1@ucalgary.ca username=user1@ucalgary.ca account=RCS state=enabled password=asdf&amp;lt;/code&amp;gt;&lt;br /&gt;
#Find the user&#039;s ID: &amp;lt;code&amp;gt;list users domainid=&amp;lt;tab&amp;gt; filter=username,id&amp;lt;/code&amp;gt;&lt;br /&gt;
#Authorize the user: &amp;lt;code&amp;gt;authorize samlsso enable=true entityid=&amp;lt;nowiki&amp;gt;https://sts.windows.net/c609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/&amp;lt;/nowiki&amp;gt; userid=user-id&amp;lt;/code&amp;gt;&lt;br /&gt;
#Verify that the user is enabled for SSO: &amp;lt;code&amp;gt;list samlauthorization filter=userid,idpid,status&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When authorizing a user, the entityid must be the URL of the identity provider. The trailing slash is mandatory.&lt;br /&gt;
&lt;br /&gt;
===Enable SSL===&lt;br /&gt;
A few things to note about enabling SSL:&lt;br /&gt;
&lt;br /&gt;
*If you added hosts via IP address, enabling SSL will likely break the management-to-client connection. You might need to re-add the hosts so that the certificates all match up.&lt;br /&gt;
* On CloudStack 4.16, the button to upload a new certificate in the SSL dialog box does not work. This is fixed in 4.16.1.&lt;br /&gt;
&lt;br /&gt;
==== Preparing your SSL certificates ====&lt;br /&gt;
First, generate a private key and certificate signing request, then obtain your SSL certificate from a certificate authority. For a typical CloudStack installation, you should obtain SSL certificates for both your management server and your console proxy.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # openssl genrsa -out server.key 4096&lt;br /&gt;
# openssl req -new -sha256 \&lt;br /&gt;
         -key server.key \&lt;br /&gt;
         -subj &amp;quot;/C=CA/ST=Alberta/O=Steamr/CN=cloudstack-test.example.com&amp;quot; \&lt;br /&gt;
         -reqexts SAN \&lt;br /&gt;
         -extensions SAN \&lt;br /&gt;
         -config &amp;lt;(cat /etc/pki/tls/openssl.cnf &amp;lt;(printf &amp;quot;[SAN]\nsubjectAltName=DNS:cloudstack-test-console.example.com&amp;quot;)) \&lt;br /&gt;
         -out server.csr&lt;br /&gt;
## With the server.csr file, upload it to your certificate authority to obtain a signed certificate.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Your certificate authority should have given you your signed certificate as well as the root and any intermediate certificates in X.509 (.crt) format. If you need to self-sign this certificate signing request, do the following:{{Highlight&lt;br /&gt;
| code = ## Run the following only if you want to self sign your certificate&lt;br /&gt;
## Make your root CA&lt;br /&gt;
# openssl genrsa -des3 -out rootCA.key 4096&lt;br /&gt;
# openssl req -x509 -new -subj &amp;quot;/C=CA/ST=Alberta/O=Steamr/CN=example.com&amp;quot; -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.crt&lt;br /&gt;
&lt;br /&gt;
## Sign the certificate&lt;br /&gt;
# openssl x509 -req -in server.csr -CA rootCA.crt -CAkey rootCA.key -CAcreateserial -out server.crt -days 500 -sha256&lt;br /&gt;
&lt;br /&gt;
## Check the certificate&lt;br /&gt;
# openssl x509 -in server.crt -text -noout&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Next, you need to convert your certificate into PKCS12 format and your private key into PKCS8 format. These are the only formats that work with the CloudStack management server. We place the PKCS12 keystore file at &amp;lt;code&amp;gt;/etc/cloudstack/management/ssl_keystore.pkcs12&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Combine Files&lt;br /&gt;
# cat server.key server.crt intermediate.crt root.crt &amp;gt; combined.crt&lt;br /&gt;
&lt;br /&gt;
## Create keystore&lt;br /&gt;
## You may use &#039;password&#039; as the password&lt;br /&gt;
# openssl pkcs12 -in combined.crt -export -out combined.pkcs12&lt;br /&gt;
&lt;br /&gt;
## Import keystore&lt;br /&gt;
## Provide the same password above. Eg. &#039;password&#039;&lt;br /&gt;
# keytool -importkeystore -srckeystore combined.pkcs12 -srcstoretype PKCS12 -destkeystore /etc/cloudstack/management/ssl_keystore.pkcs12 -deststoretype pkcs12&lt;br /&gt;
&lt;br /&gt;
## Convert the private key into PKCS8 format&lt;br /&gt;
## Provide the same password above. Eg. &#039;password&#039;&lt;br /&gt;
# openssl pkcs8 -topk8 -in server.key -out server.pkcs8.encrypted.key&lt;br /&gt;
# openssl pkcs8 -in server.pkcs8.encrypted.key -out server.pkcs8.key&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Upload SSL certificates ====&lt;br /&gt;
You can upload SSL certificates to CloudStack under Infrastructure -&amp;gt; Summary by clicking the &#039;SSL Certificates&#039; button. Provide the root certificate authority, the certificate, the private key (in PKCS8 format), and the domain that the certificate applies to. Wildcard domains should be specified as &amp;lt;code&amp;gt;*.example.com&amp;lt;/code&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Alternatively, you may use the CloudMonkey tool to upload certificates using the file parameter passing feature like so:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=1 name=root certificate=@root.crt&lt;br /&gt;
# cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=2 name=intermediate1 certificate=@intermediate.crt&lt;br /&gt;
# cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=3 privatekey=@server.pkcs8.key certificate=@domain.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Enabling HTTPS ====&lt;br /&gt;
Next, you will need to enable HTTPS on both the management console and console proxy.&lt;br /&gt;
&lt;br /&gt;
Note that you may enable the HTTPS setting only after at least one certificate has been uploaded. If the server has no certificates, the option is ignored.&lt;br /&gt;
&lt;br /&gt;
===== Enable HTTPS on the management console =====&lt;br /&gt;
The management console can be configured by editing &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; with the following lines. Set the keystore password to the same password you used above to import it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = https.enable=true&lt;br /&gt;
https.port=8443&lt;br /&gt;
https.keystore=/etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
https.keystore.password=password&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Restart the management server for this to apply.&lt;br /&gt;
{{Info&lt;br /&gt;
| title = Why port 8443?&lt;br /&gt;
| message = Because CloudStack runs under a non-root account, it can only bind to high-numbered ports (above 1024).&lt;br /&gt;
&lt;br /&gt;
You can still have CloudStack visible on port 443 if you use an [[IPTables#Mapping incoming traffic to a different internal port|IPTables rule]].&lt;br /&gt;
}}&lt;br /&gt;
Confirm that you are able to reach your management console via HTTPS.&lt;br /&gt;
&lt;br /&gt;
==== Enable HTTPS on the console proxy ====&lt;br /&gt;
If you enable SSL on the management console, you will also need to enable SSL for the console proxies for the VNC web sockets to work properly. If your management console certificate (from the previous sections) contains a Subject Alternative Name (SAN) or is a wildcard certificate that covers your console proxy&#039;s DNS name, SSL for the console proxy should already work. If your certificates do not include the console proxy&#039;s DNS name, you will need to obtain another SSL certificate, add it to the SSL keystore, and upload it to CloudStack using the same instructions above.&lt;br /&gt;
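As a quick sanity check of whether a certificate name covers the console proxy&#039;s hostname, note that a wildcard matches exactly one DNS label. Below is a minimal, illustrative matcher (the hostnames are examples; this is not a substitute for a TLS client&#039;s full RFC 6125 validation):&lt;br /&gt;

```python
def name_matches(pattern, hostname):
    """Check a certificate name (possibly a wildcard) against a hostname.

    Illustrative only: a wildcard label matches exactly one DNS label,
    so *.example.com covers a.example.com but not a.b.example.com.
    """
    p = pattern.lower().split('.')
    h = hostname.lower().split('.')
    if len(p) != len(h):
        return False
    return all(pl == '*' or pl == hl for pl, hl in zip(p, h))

# A wildcard covers the hyphenated console proxy hostname only when the
# proxy shares the same parent domain as the wildcard:
print(name_matches('*.cloudstack.example.com', '10-1-1-1.cloudstack.example.com'))  # True
print(name_matches('*.example.com', '10-1-1-1.cloudstack.example.com'))             # False
```

Real clients apply more rules (for instance, wildcards are only honoured in the leftmost label), but this captures why a management console wildcard certificate must share the console proxy&#039;s parent domain.&lt;br /&gt;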
&lt;br /&gt;
==== Renewing SSL certificate ====&lt;br /&gt;
To renew an SSL certificate, you&#039;ll have to update the keystore and also re-upload the certificate via CloudMonkey or the management console (under Summary -&amp;gt; Certificates).&lt;br /&gt;
&lt;br /&gt;
In a folder containing your certificate (&amp;lt;code&amp;gt;server.crt&amp;lt;/code&amp;gt;), intermediate and root certificates (&amp;lt;code&amp;gt;intermediate.crt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;root.crt&amp;lt;/code&amp;gt;), and also your private key (&amp;lt;code&amp;gt;server.key&amp;lt;/code&amp;gt;), run the following to update your SSL keystore and upload the certificates via cmk:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat server.key server.crt intermediate.crt root.crt &amp;gt; combined.crt&lt;br /&gt;
&lt;br /&gt;
## Note the keystore location that&#039;s defined in your configs&lt;br /&gt;
# grep https.keystore /etc/cloudstack/management/server.properties&lt;br /&gt;
https.keystore=/etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
https.keystore.password=xxxxxxx&lt;br /&gt;
&lt;br /&gt;
## Create keystore and import your certificate into it.&lt;br /&gt;
# openssl pkcs12 -in combined.crt -export -out combined.pkcs12&lt;br /&gt;
# keytool -importkeystore -srckeystore combined.pkcs12 -srcstoretype PKCS12 -destkeystore ssl_keystore.pkcs12 -deststoretype pkcs12&lt;br /&gt;
&lt;br /&gt;
## Move the keystore over the existing one as defined in the server config. You may want to backup the old one just in case.&lt;br /&gt;
# mv /etc/cloudstack/management/ssl_keystore.pkcs12 /etc/cloudstack/management/ssl_keystore.pkcs12-org&lt;br /&gt;
# mv ssl_keystore.pkcs12 /etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
&lt;br /&gt;
## Convert your key to pkcs8 if you haven&#039;t already done so. Use the same password for both commands.&lt;br /&gt;
# openssl pkcs8 -topk8 -in server.key -out server.pkcs8.key-encrypted&lt;br /&gt;
# openssl pkcs8 -in server.pkcs8.key-encrypted -out server.pkcs8.key&lt;br /&gt;
&lt;br /&gt;
## Upload your certificate&lt;br /&gt;
# for domain in $(openssl x509 -in server.crt -text -noout {{!}} grep DNS: {{!}} tr -d , {{!}} sed &#039;s/DNS://g&#039;) ; do&lt;br /&gt;
        echo &amp;quot;Uploading domain for $domain&amp;quot;&lt;br /&gt;
&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=1 name=root certificate=@root.crt&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=2 name=intermediate1 certificate=@intermediate.crt&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=3 privatekey=@server.pkcs8.key certificate=@server.crt&lt;br /&gt;
done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Tasks ==&lt;br /&gt;
===Re-add existing KVM bare metal host===&lt;br /&gt;
Once a host has been added to CloudStack, the CloudStack agent will have generated public/private keys and configured itself to talk to the management node. If you need to remove and re-add a host, you will need to clean up the agent first. Based on my experience, I had to do the following:&lt;br /&gt;
&lt;br /&gt;
#Before removing the host from CloudStack, drain it of all VMs. &amp;lt;code&amp;gt;virsh list&amp;lt;/code&amp;gt; should be empty. If not and you&#039;ve removed the host from the management server already, manually kill each VM with &amp;lt;code&amp;gt;virsh destroy&amp;lt;/code&amp;gt;.&lt;br /&gt;
#&amp;lt;code&amp;gt;systemctl stop cloudstack-agent&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;rm -rf /etc/cloudstack/agent/cloud*&amp;lt;/code&amp;gt;&lt;br /&gt;
#Unmount any primary storage with &amp;lt;code&amp;gt;umount /mnt/*&amp;lt;/code&amp;gt; and clean up the mount points with &amp;lt;code&amp;gt;rmdir /mnt/*&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;systemctl stop libvirtd&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;rm -rf /var/lib/libvirt/qemu&amp;lt;/code&amp;gt;&lt;br /&gt;
#You may need to edit &amp;lt;code&amp;gt;/etc/sysconfig/libvirtd&amp;lt;/code&amp;gt; so that it doesn&#039;t use the listen flag, as that flag can prevent libvirtd (and subsequently cloudstack-agent) from starting.&lt;br /&gt;
#Edit &amp;lt;code&amp;gt;/etc/cloudstack/agent/agent.properties&amp;lt;/code&amp;gt; and remove the keystore passphrase, any UUIDs, cluster/pod/zone, and the host. You should keep the guid or regenerate it with uuidgen. You should also keep the public/private/guest network devices set.&lt;br /&gt;
#Restart with &amp;lt;code&amp;gt;systemctl start cloudstack-agent&amp;lt;/code&amp;gt; (libvirt should come up automatically as it&#039;s a dependency). Ensure that it comes up OK.&lt;br /&gt;
&lt;br /&gt;
You may then re-add the host back to CloudStack.&lt;br /&gt;
&lt;br /&gt;
===Building RPMs===&lt;br /&gt;
To build the RPM packages from scratch, you&#039;ll need to install a bunch of dependencies and then run the build script. For more information, see:&lt;br /&gt;
&lt;br /&gt;
*&amp;lt;nowiki&amp;gt;https://docs.cloudstack.apache.org/en/4.15.2.0/installguide/building_from_source.html#building-rpms-from-source&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum groupinstall &amp;quot;Development Tools&amp;quot;&lt;br /&gt;
# yum install java-11-openjdk-devel genisoimage mysql mysql-server createrepo&lt;br /&gt;
# yum install epel-release&lt;br /&gt;
&lt;br /&gt;
# curl -sL https://rpm.nodesource.com/setup_12.x {{!}} sudo bash -&lt;br /&gt;
# yum install nodejs&lt;br /&gt;
&lt;br /&gt;
# cat &amp;lt;&amp;lt;EOF &amp;gt; /etc/yum.repos.d/mysql.repo&lt;br /&gt;
[mysql-community]&lt;br /&gt;
name=MySQL Community connectors&lt;br /&gt;
baseurl=http://repo.mysql.com/yum/mysql-connectors-community/el/$releasever/$basearch/&lt;br /&gt;
gpgkey=http://repo.mysql.com/RPM-GPG-KEY-mysql&lt;br /&gt;
enabled=1&lt;br /&gt;
gpgcheck=1&lt;br /&gt;
EOF&lt;br /&gt;
# yum -y install mysql-connector-python&lt;br /&gt;
&lt;br /&gt;
## Enable the PowerTools repository (provides some build dependencies)&lt;br /&gt;
# dnf config-manager --set-enabled powertools&lt;br /&gt;
&lt;br /&gt;
# yum install jpackage-utils maven&lt;br /&gt;
&lt;br /&gt;
# git clone https://github.com/apache/cloudstack.git&lt;br /&gt;
# cd cloudstack&lt;br /&gt;
# git checkout 4.15&lt;br /&gt;
&lt;br /&gt;
# cd packaging&lt;br /&gt;
# sh package.sh --distribution centos8&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Rebuilding UI ===&lt;br /&gt;
CloudStack&#039;s web interface is bundled with the pre-built cloudstack-ui package. If you need to make any custom changes to the UI, you can follow the build instructions from the [https://github.com/apache/cloudstack/tree/main/ui README file under the ui directory.]&lt;br /&gt;
&lt;br /&gt;
Building the UI is a straightforward process once you have all the necessary software dependencies in place. Once the UI is built, you can then install it on the management server and have it served instead of the &#039;stock&#039; bundled UI.&lt;br /&gt;
&lt;br /&gt;
==== Building the UI ====&lt;br /&gt;
You will need a server with npm installed along with a copy of the CloudStack repo. The simplest way I&#039;ve found to accomplish this is to create the following Docker image and then run the build process within the container.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = FROM rockylinux/rockylinux:8.10&lt;br /&gt;
&lt;br /&gt;
RUN curl -sL https://rpm.nodesource.com/setup_16.x {{!}} bash -&lt;br /&gt;
RUN yum -y install nodejs&lt;br /&gt;
| lang = dockerfile&lt;br /&gt;
}}&lt;br /&gt;
Clone the CloudStack repo (I placed it in /tmp in this example) and then run the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # docker build -t cs-ui-builder --progress=plain --no-cache .&lt;br /&gt;
&lt;br /&gt;
# docker run --rm -ti \&lt;br /&gt;
        -v /tmp/cloudstack:/cloudstack \&lt;br /&gt;
        -v /tmp/npm:/root/.npm cs-ui-builder \&lt;br /&gt;
        bash -c &amp;quot;cd /cloudstack/ui; npm install; npm run build&amp;quot;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
We make a volume for the .npm directory to help speed up subsequent builds as the npm dependencies are cached, but this is entirely optional.&lt;br /&gt;
&lt;br /&gt;
==== Installing the UI ====&lt;br /&gt;
# Copy the &amp;lt;code&amp;gt;dist/&amp;lt;/code&amp;gt; directory to &amp;lt;code&amp;gt;/usr/share/cloudstack-management/webapp/&amp;lt;/code&amp;gt;&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; and make sure that &amp;lt;code&amp;gt;webapp.dir&amp;lt;/code&amp;gt; is set to: &amp;lt;code&amp;gt;webapp.dir=/usr/share/cloudstack-management/webapp&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Restart the CloudStack management service and then reload the console page. Ensure that the Vue app isn&#039;t being served from your browser&#039;s cache.&lt;br /&gt;
&lt;br /&gt;
==== Installing and using CloudStack&#039;s prebuilt UI ====&lt;br /&gt;
The cloudstack-ui package contains the prebuilt CloudStack UI. The UI files are installed under &amp;lt;code&amp;gt;/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
As of CloudStack 4.18, when I tried using this prebuilt package, I had to do the following things:&lt;br /&gt;
&lt;br /&gt;
* Edit &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; and set &amp;lt;code&amp;gt;webapp.dir=/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;&lt;br /&gt;
* In &amp;lt;code&amp;gt;/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;, run: &amp;lt;code&amp;gt;find . -type d -exec chmod -v o+x {} \;&amp;lt;/code&amp;gt; because the directories aren&#039;t executable by the &#039;cloud&#039; user.&lt;br /&gt;
* Create /usr/share/cloudstack-ui/WEB-INF and place this [https://github.com/apache/cloudstack/blob/main/client/src/main/webapp/WEB-INF/web.xml web.xml file] within. Otherwise, requests to the API break.&lt;br /&gt;
&lt;br /&gt;
===Usage server===&lt;br /&gt;
Install &amp;lt;code&amp;gt;cloudstack-usage&amp;lt;/code&amp;gt;. Start it and restart the management server. Set &amp;lt;code&amp;gt;enable.usage.server=true&amp;lt;/code&amp;gt; in global settings.&lt;br /&gt;
&lt;br /&gt;
The usage data will be stored in the usage database on your management server. Metrics are gathered daily and can be viewed through CloudMonkey. There is no option to view this data in the management console.&lt;br /&gt;
&lt;br /&gt;
The collected data is coarse, but it should be sufficient to determine an account&#039;s or a VM&#039;s resource utilization over a period of a day or more, which is good enough to implement a rough billing / showback amount.&lt;br /&gt;
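As an illustration of the kind of rough showback you can derive from the usage data (the record tuples and rates below are made up; the real records come from the &amp;lt;code&amp;gt;listUsageRecords&amp;lt;/code&amp;gt; API and carry more context such as account, zone, and offering):&lt;br /&gt;

```python
# Made-up (usage type, raw usage hours) pairs standing in for real
# listUsageRecords output; the rates are an assumed showback rate card.
records = [
    ('RUNNING_VM', 720.0),
    ('VOLUME', 1440.0),
    ('RUNNING_VM', 360.0),
]
rates = {'RUNNING_VM': 0.05, 'VOLUME': 0.01}

# Sum per-type usage hours against the rate card.
totals = {}
for usage_type, hours in records:
    totals[usage_type] = totals.get(usage_type, 0.0) + hours * rates[usage_type]

print(round(totals['RUNNING_VM'], 2))  # 54.0
print(round(sum(totals.values()), 2))  # 68.4
```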
&lt;br /&gt;
===Adding some Linux templates===&lt;br /&gt;
You can add the &amp;quot;Generic Cloud&amp;quot; qcow2 disk images as templates to CloudStack.&lt;br /&gt;
&lt;br /&gt;
Because these cloud images use cloud-init, you will need to provide some custom userdata when deploying them. Userdata only works when the VM is deployed on a network that offers the &amp;quot;User Data&amp;quot; service. If you can&#039;t use userdata or if you want the VMs to come up with a specific root password, you can use [[virt-customize]] to set the root password on the qcow2 file.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Distro&lt;br /&gt;
!Type&lt;br /&gt;
!URL&lt;br /&gt;
|-&lt;br /&gt;
|Rocky Linux 8.4&lt;br /&gt;
|CentOS 8&lt;br /&gt;
|https://download.rockylinux.org/pub/rocky/8.4/images/Rocky-8-GenericCloud-8.4-20210620.0.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|CentOS 8.4&lt;br /&gt;
|CentOS 8&lt;br /&gt;
|https://cloud.centos.org/centos/8/x86_64/images/CentOS-8-GenericCloud-8.4.2105-20210603.0.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|Fedora 34&lt;br /&gt;
|Fedora Linux (64 bit)&lt;br /&gt;
|https://download.fedoraproject.org/pub/fedora/linux/releases/34/Cloud/x86_64/images/Fedora-Cloud-Base-34-1.2.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|Ubuntu Server 21.04&lt;br /&gt;
|&lt;br /&gt;
|http://cloud-images.ubuntu.com/hirsute/current/hirsute-server-cloudimg-amd64.img&lt;br /&gt;
Create a 10 GB qcow2 overlay backed by the downloaded image with qemu-img:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;qemu-img create -F qcow2 -b cloudimg-amd64.img -f qcow2 cloudimg-amd64.qcow2 10G&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
Here&#039;s an example of a cloud-init configuration which you would put in the userdata field when deploying a VM:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = #cloud-config&lt;br /&gt;
hostname: vm01&lt;br /&gt;
manage_etc_hosts: true&lt;br /&gt;
users:&lt;br /&gt;
  - name: vmadm&lt;br /&gt;
    sudo: ALL=(ALL) NOPASSWD:ALL&lt;br /&gt;
    groups: users, admin&lt;br /&gt;
    home: /home/vmadm&lt;br /&gt;
    shell: /bin/bash&lt;br /&gt;
    lock_passwd: false&lt;br /&gt;
ssh_pwauth: true&lt;br /&gt;
disable_root: false&lt;br /&gt;
chpasswd:&lt;br /&gt;
  list: {{!}}&lt;br /&gt;
    vmadm:vmadm&lt;br /&gt;
  expire: false&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Importing a VMware Virtual Machine ===&lt;br /&gt;
To import a VMware virtual machine:&lt;br /&gt;
&lt;br /&gt;
# Copy the virtual machine&#039;s .vmdk disk file to a CloudStack node&lt;br /&gt;
# Convert the .vmdk into qcow2 format using the qemu-img convert command. Eg. &amp;lt;code&amp;gt;qemu-img convert -f vmdk -O qcow2 linux.vmdk linux.qcow2&amp;lt;/code&amp;gt;&lt;br /&gt;
# Run the file command on the qcow disk and make a note of its size (in bytes).&lt;br /&gt;
# Create a new virtual machine in CloudStack. Use an ISO and not a template. Set the size of the VM&#039;s ROOT disk to match the disk size noted from the previous step.&lt;br /&gt;
# Start and stop the VM to ensure the virtual disk is created. Make a note of the virtual disk&#039;s ID.&lt;br /&gt;
# Copy the converted qcow disk over the existing virtual disk image in the primary storage.&lt;br /&gt;
# Restart the VM in CloudStack.&lt;br /&gt;
&lt;br /&gt;
Some things to note with this process:&lt;br /&gt;
&lt;br /&gt;
* The disk subsystem might differ between KVM and VMware. As a result, you may need to [[Rebuilding the initial ramdisk|rebuild the initrd file]] so that it has the necessary drivers to boot properly.&lt;br /&gt;
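For step 4 above, the ROOT disk size is entered in GB while &amp;lt;code&amp;gt;file&amp;lt;/code&amp;gt; reports the virtual size in bytes; round up so the new disk is never smaller than the image. A small sketch of that conversion:&lt;br /&gt;

```python
def bytes_to_gib_rounded_up(size_bytes):
    """Round a byte count up to whole GiB so the target disk fits the image."""
    gib = 1024 ** 3
    return (size_bytes + gib - 1) // gib

# A disk reported as exactly 10 GiB, and one a single byte larger:
print(bytes_to_gib_rounded_up(10737418240))  # 10
print(bytes_to_gib_rounded_up(10737418241))  # 11
```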
&lt;br /&gt;
=== Increasing the management console&#039;s timeout ===&lt;br /&gt;
The default timeout is 30 minutes. You may adjust the number of minutes in the &amp;lt;code&amp;gt;session.timeout&amp;lt;/code&amp;gt; value stored in &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = session.timeout=60&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Restart the cloudstack-management service to apply.&lt;br /&gt;
&lt;br /&gt;
=== Upgrade CloudStack ===&lt;br /&gt;
Before upgrading CloudStack, review the upgrade instructions from CloudStack&#039;s documentation. For 4.17 to 4.18, see:  https://docs.cloudstack.apache.org/en/4.18.0.0/upgrading/upgrade/upgrade-4.17.html&lt;br /&gt;
&lt;br /&gt;
In a nutshell, upgrading CloudStack for KVM hosts requires the following steps:&lt;br /&gt;
&lt;br /&gt;
# Before upgrading, load the next systemvm template image. System templates are available from: http://download.cloudstack.org/systemvm/. The systemvm template for KVM should be named something like: &amp;lt;code&amp;gt;systemvm-kvm-4.18.0&amp;lt;/code&amp;gt;. When adding the template, specify qcow2 as its format.&lt;br /&gt;
# Backup your CloudStack and usage database. {{Highlight&lt;br /&gt;
| code = $ mysqldump -u root -p -R cloud &amp;gt; cloud-backup_$(date +%Y-%m-%d-%H%M%S)&lt;br /&gt;
$ mysqldump -u root -p cloud_usage &amp;gt; cloud_usage-backup_$(date +%Y-%m-%d-%H%M%S)&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# If you have outstanding system packages to upgrade, do so now (excluding CloudStack packages) and reboot.&lt;br /&gt;
# Stop the CloudStack management server. Manually unmount any CloudStack mounts; I had to do this when I upgraded from 4.16 to 4.17 since a stale mount prevented CloudStack from starting. Upgrade the cloudstack-management and cloudstack-common packages. Restart CloudStack. {{Highlight&lt;br /&gt;
| code = # systemctl stop cloudstack-management&lt;br /&gt;
# umount /var/cloudstack/mnt/*&lt;br /&gt;
# yum -y update cloudstack\*&lt;br /&gt;
# systemctl start cloudstack-management&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Ensure things are running. Watch the logs with &amp;lt;code&amp;gt;tail -f /var/log/cloudstack/management/*log&amp;lt;/code&amp;gt;. Verify that the management server can still communicate with the hosts.&lt;br /&gt;
# For each CloudStack host, drain it of VMs, stop the cloudstack-agent service, do a full upgrade, and reboot. {{Highlight&lt;br /&gt;
| code = # systemctl stop cloudstack-agent&lt;br /&gt;
# yum -y update cloudstack-agent&lt;br /&gt;
&lt;br /&gt;
## Restart the service or reboot just to make sure the host can come up by itself&lt;br /&gt;
## reboot&lt;br /&gt;
# systemctl restart cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Updating the system VMs ====&lt;br /&gt;
Check your list of virtual routers under Infrastructure -&amp;gt; Virtual routers. Update any VMs that are marked with &#039;requires upgrade&#039;. Do so by selecting the VM and clicking on the &#039;upgrade router to use newer template&#039; button.&lt;br /&gt;
&lt;br /&gt;
If the virtual router doesn&#039;t start up properly after performing an upgrade, make sure that the VM is running on a node with an appropriate CloudStack agent version. Virtual routers that land on a node with an older version of the agent won&#039;t start properly.&lt;br /&gt;
&lt;br /&gt;
==== Other upgrade notes ====&lt;br /&gt;
Things to watch out for:&lt;br /&gt;
&lt;br /&gt;
* Don&#039;t upgrade CloudStack packages on a server until you stop the CloudStack services. I&#039;ve had issues in the past where something with Java gets corrupted if you upgrade while the Java processes are still running, which then causes an odd class loader issue that leaves the service unable to start after the upgrade.&lt;br /&gt;
* System VM template: I upgraded the CloudStack management server to 4.16.1 using a custom compiled RPM package. However, the management server didn&#039;t start and inspecting the logs show that it was expecting a system VM template at &amp;lt;code&amp;gt;/usr/share/cloudstack-management/templates/systemvm/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/code&amp;gt;. This is easily fixed by downloading the template and restarting the management server.  &amp;lt;code&amp;gt;wget &amp;lt;nowiki&amp;gt;http://download.cloudstack.org/systemvm/4.16/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/nowiki&amp;gt; -O /usr/share/cloudstack-management/templates/systemvm/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Traefik ===&lt;br /&gt;
&lt;br /&gt;
==== Using Traefik for SSL termination ====&lt;br /&gt;
With the console proxy served using SSL, we can put a reverse proxy with a valid certificate in front of both the management UI and the console proxy service VMs. This allows us to &#039;mask&#039; the self-signed certificate with Traefik&#039;s ability to request a proper certificate from Let&#039;s Encrypt.&lt;br /&gt;
&lt;br /&gt;
In my test version of CloudStack, I&#039;ve set up Traefik with the following configs. I updated the console proxy to use a dynamic URL by setting &amp;lt;code&amp;gt;consoleproxy.url.domain&amp;lt;/code&amp;gt; to something like &amp;lt;code&amp;gt;*.cloudstack-test.example.com&amp;lt;/code&amp;gt;. CloudStack&#039;s console proxy service will replace the &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt; with the system VM&#039;s hyphenated IP address (e.g. 10.1.1.1 becomes 10-1-1-1). We&#039;ll tell Traefik to reverse proxy these domains for both HTTPS and WSS on ports 443 and 8080 respectively. My dynamic Traefik configs to make this happen look like the following:{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  serversTransports:&lt;br /&gt;
    ignorecert:&lt;br /&gt;
      insecureSkipVerify: true&lt;br /&gt;
&lt;br /&gt;
  routers:&lt;br /&gt;
    cloudstack:&lt;br /&gt;
      rule: Host(`cloudstack-test.example.com`)&lt;br /&gt;
      service: cloudstack-poc&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
      middlewares:&lt;br /&gt;
        - https-redirect&lt;br /&gt;
&lt;br /&gt;
    cloudstack-https:&lt;br /&gt;
      rule: Host(`cloudstack-test.example.com`)&lt;br /&gt;
      service: cloudstack-poc&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
    cloudstack-pub-ip-136-159-1-100:&lt;br /&gt;
      rule: Host(`136-159-1-1.cloudstack-test.example.com`)&lt;br /&gt;
      service: 136-159-1-100&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
    cloudstack-pub-ip-136-159-1-100-ws:&lt;br /&gt;
      rule: Host(`136-159-1-1.cloudstack-test.example.com`)&lt;br /&gt;
      service: 136-159-1-100-ws&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - httpws&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
		&lt;br /&gt;
  services:&lt;br /&gt;
    cloudstack-poc:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://172.19.12.141:8080&amp;quot;&lt;br /&gt;
&lt;br /&gt;
    136-159-1-100:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;https://136.159.1.100&amp;quot;&lt;br /&gt;
        serversTransport: ignorecert&lt;br /&gt;
&lt;br /&gt;
    136-159-1-100-ws:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;https://136.159.1.100:8080&amp;quot;&lt;br /&gt;
        serversTransport: ignorecert&lt;br /&gt;
&lt;br /&gt;
  middlewares:&lt;br /&gt;
    https-redirect:&lt;br /&gt;
      redirectscheme:&lt;br /&gt;
        scheme: https&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}And the following static Traefik configuration:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = entryPoints:&lt;br /&gt;
  http:&lt;br /&gt;
    address: &amp;quot;:80&amp;quot;&lt;br /&gt;
  https:&lt;br /&gt;
    address: &amp;quot;:443&amp;quot;&lt;br /&gt;
  httpws:&lt;br /&gt;
    address: &amp;quot;:8080&amp;quot;&lt;br /&gt;
&lt;br /&gt;
certificatesResolvers:&lt;br /&gt;
  letsencrypt:&lt;br /&gt;
    acme:&lt;br /&gt;
      email: user@example.com&lt;br /&gt;
      storage: &amp;quot;/config/acme.json&amp;quot;&lt;br /&gt;
      httpChallenge:&lt;br /&gt;
        entryPoint: http&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
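The per-IP routers above follow CloudStack&#039;s dotted-to-hyphenated hostname translation. A small helper showing the mapping (the domain suffix is taken from the example above and is otherwise arbitrary):&lt;br /&gt;

```python
def console_proxy_host(ip, suffix):
    """Mimic how a consoleproxy.url.domain wildcard is filled in: the
    system VM IP has its dots replaced with hyphens (e.g. 10.1.1.1
    becomes 10-1-1-1) and is prepended to the domain suffix."""
    return ip.replace('.', '-') + '.' + suffix

print(console_proxy_host('136.159.1.100', 'cloudstack-test.example.com'))
# 136-159-1-100.cloudstack-test.example.com
```

This is why a single wildcard certificate for &amp;lt;code&amp;gt;*.cloudstack-test.example.com&amp;lt;/code&amp;gt; can cover every console proxy hostname Traefik needs to serve.&lt;br /&gt;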
&lt;br /&gt;
=== Change guest VM CPU flags ===&lt;br /&gt;
The default CPU flags that guest VMs see are set to qemu64-compatible features. The qemu64 feature set covers a very small subset of the features that modern CPUs have, which makes guest VMs compatible with nearly all available CPUs at the cost of reduced features. The feature flags in qemu64 are: &amp;lt;code&amp;gt;fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl xtopology cpuid tsc_known_freq pni cx16 x2apic hypervisor lahf_lm cpuid_fault pti&amp;lt;/code&amp;gt;&lt;br /&gt;
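To see concretely what a guest loses under qemu64, you can diff the host&#039;s flags (from &amp;lt;code&amp;gt;/proc/cpuinfo&amp;lt;/code&amp;gt;) against the list above. The flag sets below are truncated samples for illustration, not complete CPU definitions:&lt;br /&gt;

```python
# Truncated sample flag sets; read the real host flags from /proc/cpuinfo.
qemu64_flags = set('fpu de pse tsc msr pae mmx fxsr sse sse2 syscall nx lm'.split())
host_flags = set('fpu de pse tsc msr pae mmx fxsr sse sse2 syscall nx lm aes avx avx2'.split())

# Features present on the host but hidden from a qemu64 guest:
print(sorted(host_flags - qemu64_flags))  # ['aes', 'avx', 'avx2']
```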
&lt;br /&gt;
For virtualized workloads that require additional feature sets, you can edit the CloudStack agent to use a different guest CPU mode. Select one of:&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;custom&#039;&#039;&#039;: This is the default mode and uses the x86_qemu64 feature set defined in &amp;lt;code&amp;gt;/usr/share/libvirt/cpu_map/x86_qemu64.xml&amp;lt;/code&amp;gt;. You may select a different CPU map by specifying &amp;lt;code&amp;gt;guest.cpu.model&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &#039;&#039;&#039;host-model&#039;&#039;&#039;: Uses a CPU model compatible with your host. Most feature flags are available. Guest CPUs will identify themselves as a generic CPU of that family, such as &amp;lt;code&amp;gt;Intel Xeon Processor (Icelake)&amp;lt;/code&amp;gt; (note the lack of &#039;(R)&#039; after the Intel and Xeon brands and no specific CPU model number).&lt;br /&gt;
* &#039;&#039;&#039;host-passthrough&#039;&#039;&#039;: Uses CPU passthrough; feature flags match the host exactly. Migrations only work between matching CPUs and may still fail when using this mode. Guest CPUs will identify themselves as the underlying CPU of the hypervisor (such as &amp;lt;code&amp;gt;Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
For CloudStack clusters with identical CPUs, it&#039;s recommended to use host-model. I&#039;ve tried using host-passthrough on matching hosts with an Intel Xeon Silver 4316, and migrations sometimes failed, with the VM requiring a reset to be brought back up.&lt;br /&gt;
&lt;br /&gt;
For more information, see: http://docs.cloudstack.apache.org/en/4.15.0.0/installguide/hypervisor/kvm.html#install-and-configure-the-agent&lt;br /&gt;
&lt;br /&gt;
To change the CPU mode, add the appropriate line to the agent properties file and restart the agent:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Matching host model&lt;br /&gt;
# echo &amp;quot;guest.cpu.mode=host-model&amp;quot; &amp;gt;&amp;gt; /etc/cloudstack/agent/agent.properties&lt;br /&gt;
# systemctl restart cloudstack-agent.service&lt;br /&gt;
&lt;br /&gt;
## Passthrough&lt;br /&gt;
# echo &amp;quot;guest.cpu.mode=host-passthrough&amp;quot; &amp;gt;&amp;gt; /etc/cloudstack/agent/agent.properties&lt;br /&gt;
# systemctl restart cloudstack-agent.service&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Using Open vSwitch and DPDK ===&lt;br /&gt;
Getting DPDK working with Open vSwitch is relatively straightforward. You need to install the DPDK packages, configure the kernel to use hugepages and IO passthrough, enable the vfio driver on your network interfaces for DPDK support, reconfigure Open vSwitch to use the DPDK device, and enable DPDK on the CloudStack agent.&lt;br /&gt;
&lt;br /&gt;
There are some existing resources that might help.&lt;br /&gt;
&lt;br /&gt;
*https://www.shapeblue.com/openvswitch-with-dpdk-support-on-cloudstack/&lt;br /&gt;
* https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/ovs-dpdk_end_to_end_troubleshooting_guide/configure_and_test_lacp_bonding_with_open_vswitch_dpdk&lt;br /&gt;
&lt;br /&gt;
Install DPDK tools:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install dpdk dpdk-tools&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reconfigure your kernel by editing &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt; and adding the following. Adjust &amp;lt;code&amp;gt;isolcpus&amp;lt;/code&amp;gt; for the CPUs you have available; here I isolated all but 4 of the 80 vCPUs. I am also using 16 1 GB huge pages; adjust this according to how much memory your system has (and the performance you&#039;re seeing).{{Highlight&lt;br /&gt;
| code = # vi /etc/default/grub&lt;br /&gt;
## default_hugepagesz=1GB hugepagesz=1G hugepages=16 iommu=pt intel_iommu=on isolcpus=1-19,21-39,41-59,61-79 intel_pstate=disable nosoftlockup&lt;br /&gt;
&lt;br /&gt;
# grub2-mkconfig -o /boot/grub2/grub.cfg&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}You can also configure huge pages by sysctl (optional if you set it in the kernel cmdline){{Highlight&lt;br /&gt;
| code = # echo &#039;vm.nr_hugepages=16&#039; &amp;gt; /etc/sysctl.d/hugepages.conf&lt;br /&gt;
# sysctl -w vm.nr_hugepages=16&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Load the &amp;lt;code&amp;gt;vfio-pci&amp;lt;/code&amp;gt; kernel module on boot&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # echo vfio-pci &amp;gt; /etc/modules-load.d/vfio-pci.conf&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Reboot the machine. When it comes back, verify that you have hugepages and vfio-pci loaded, and that IOMMU is working.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat /proc/cmdline {{!}} grep iommu=pt&lt;br /&gt;
# cat /proc/cmdline {{!}} grep intel_iommu=on&lt;br /&gt;
# dmesg {{!}} grep -e DMAR -e IOMMU&lt;br /&gt;
# grep HugePages_ /proc/meminfo&lt;br /&gt;
# lsmod {{!}} grep vfio-pci&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Set the network interfaces you wish to use DPDK on to the vfio-pci driver. This is done using the &amp;lt;code&amp;gt;dpdk-devbind.py&amp;lt;/code&amp;gt; script that&#039;s provided by the DPDK tools package.{{Highlight&lt;br /&gt;
| code = # modprobe vfio-pci&lt;br /&gt;
# dpdk-devbind.py --bind=vfio-pci ens2f0&lt;br /&gt;
# dpdk-devbind.py --bind=vfio-pci ens2f1&lt;br /&gt;
## Verify&lt;br /&gt;
# dpdk-devbind.py --status&lt;br /&gt;
&lt;br /&gt;
Network devices using DPDK-compatible driver&lt;br /&gt;
============================================&lt;br /&gt;
0000:31:00.0 &#039;Ethernet Controller X710 for 10GBASE-T 15ff&#039; drv=vfio-pci unused=i40e&lt;br /&gt;
0000:31:00.1 &#039;Ethernet Controller X710 for 10GBASE-T 15ff&#039; drv=vfio-pci unused=i40e&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Enable DPDK on Open vSwitch. pmd-cpu-mask defines which cores are used for datapath packet processing. The dpdk-lcore-mask defines the cores that non-datapath OVS-DPDK threads, such as handler and revalidator threads, run on. These two masks should not overlap. For more information on these parameters, see: https://developers.redhat.com/blog/2017/06/28/ovs-dpdk-parameters-dealing-with-multi-numa&amp;lt;nowiki/&amp;gt;.{{Highlight&lt;br /&gt;
| code = &lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true&lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x00000001  &lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0x17c0017c                   &lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=&amp;quot;1024&amp;quot;&lt;br /&gt;
&lt;br /&gt;
## Verify&lt;br /&gt;
# ovs-vsctl get Open_vSwitch . dpdk_initialized&lt;br /&gt;
# ovs-vsctl get Open_vSwitch . dpdk_version&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}If Open vSwitch is already configured to use these interfaces by name, you just need to change each interface&#039;s type to dpdk and set its PCI address.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl set interface ens2f0 type=dpdk&lt;br /&gt;
# ovs-vsctl set interface ens2f0 options:dpdk-devargs=0000:31:00.0&lt;br /&gt;
# ovs-vsctl set interface ens2f1 type=dpdk&lt;br /&gt;
# ovs-vsctl set interface ens2f1 options:dpdk-devargs=0000:31:00.1&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The bridge that these interfaces are connected to must also have its datapath_type updated:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl set bridge nic0 datapath_type=netdev&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Restart Open vSwitch for these changes to apply properly and confirm that it&#039;s working:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl restart openvswitch&lt;br /&gt;
# ovs-vsctl show&lt;br /&gt;
...&lt;br /&gt;
        Port bond0&lt;br /&gt;
            Interface ens2f1&lt;br /&gt;
                type: dpdk&lt;br /&gt;
                options: {dpdk-devargs=&amp;quot;0000:31:00.1&amp;quot;}&lt;br /&gt;
            Interface ens2f0&lt;br /&gt;
                type: dpdk&lt;br /&gt;
                options: {dpdk-devargs=&amp;quot;0000:31:00.0&amp;quot;}&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Update the CloudStack agent so that this host has the DPDK capability. Edit &amp;lt;code&amp;gt;/etc/cloudstack/agent/agent.properties&amp;lt;/code&amp;gt;. Note that the keyword is &amp;lt;code&amp;gt;openvswitch.dpdk.enabled&amp;lt;/code&amp;gt; (enabled ending with -ed). The example from ShapeBlue&#039;s blog post is wrong.{{Highlight&lt;br /&gt;
| code = network.bridge.type=openvswitch&lt;br /&gt;
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver&lt;br /&gt;
openvswitch.dpdk.enabled=true&lt;br /&gt;
openvswitch.dpdk.ovs.path=/var/run/openvswitch/&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Restart the CloudStack agent for this capability to be visible to the management server. You should then be able to call &amp;lt;code&amp;gt;list hosts filter=capabilities,name&amp;lt;/code&amp;gt; and have the host list dpdk as a capability. Eg:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = (localcloud) 🐱 &amp;gt; list hosts filter=capabilities,name&lt;br /&gt;
count = 22&lt;br /&gt;
host:&lt;br /&gt;
+-------------------+----------+&lt;br /&gt;
{{!}}   CAPABILITIES    {{!}}   NAME   {{!}}&lt;br /&gt;
+-------------------+----------+&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs9      {{!}}&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs10     {{!}}&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs11     {{!}}&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
If you don&#039;t see this, double check your agent configs and restart it again.&lt;br /&gt;
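&lt;br /&gt;
On a systemd-based host, restarting the agent typically looks like this (the service name is an assumption; adjust it to match your packaging):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl restart cloudstack-agent&lt;br /&gt;
# systemctl status cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;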
&lt;br /&gt;
For VMs to take advantage of DPDK, you must either set extraconfig on the virtual machine or create a new compute service offering. Extraconfig might get overwritten whenever the VM is updated, so it&#039;s not a reliable solution. Extraconfig is a URL-encoded config; you cannot use single quotes in it or the VM deployment will break. Eg:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = (localcloud) 🐱 &amp;gt; update virtualmachine extraconfig=dpdk-hugepages:%0A%3CmemoryBacking%3E%0A%20%20%20%3Chugepages%3E%0A%20%20%20%20%3C/hugepages%3E%0A%3C/memoryBacking%3E%0A%0Adpdk-numa:%0A%3Ccpu%20mode=%22host-passthrough%22%3E%0A%20%20%20%3Cnuma%3E%0A%20%20%20%20%20%20%20%3Ccell%20id=%220%22%20cpus=%220%22%20memory=%229437184%22%20unit=%22KiB%22%20memAccess=%22shared%22/%3E%0A%20%20%20%3C/numa%3E%0A%3C/cpu%3E%0A%0Adpdk-interface-queue:%0A%3Cdriver%20name=%22vhost%22%20queues=%22128%22/%3E id=af64cc80-a4e4-4c17-9c7d-c34ed234dc6a&lt;br /&gt;
virtualmachine = map[account:RCS affinitygroup:[] cpunumber:2 cpuspeed:1000 cpuused:5.88% created:2022-05-03T13:16:02-0600 details:map[Message.ReservedCapacityFreed.Flag:false dpdk-hugepages:a extraconfig-dpdk-hugepages:&amp;lt;memoryBacking&amp;gt;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Troubleshooting ====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2022-05-05T22:35:28.312Z{{!}}281704{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.312Z{{!}}281705{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281706{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281707{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281708{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Check the agent logs for issues from qemu. I had defined an invalid property which prevented the VM from starting.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@cs10 agent]# grep qemu agent.log&lt;br /&gt;
org.libvirt.LibvirtException: internal error: process exited while connecting to monitor: 2022-05-05T22:35:52.060450Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
2022-05-05T22:35:52.060633Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
2022-05-05T22:35:52.060817Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Tools==&lt;br /&gt;
&lt;br /&gt;
===CloudMonkey===&lt;br /&gt;
Get started:&lt;br /&gt;
&lt;br /&gt;
*Download from: https://github.com/apache/cloudstack-cloudmonkey/releases/tag/6.1.0&lt;br /&gt;
*Documentation at: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+cloudmonkey+CLI&lt;br /&gt;
&lt;br /&gt;
When you first run CloudMonkey, you will need to set the CloudStack instance URL and credentials and then run sync.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ cmk&lt;br /&gt;
&amp;gt; set url http://172.19.12.141:8080/client/api&lt;br /&gt;
&amp;gt; set username admin&lt;br /&gt;
&amp;gt; set password password&lt;br /&gt;
&amp;gt; sync&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The settings are then saved to &amp;lt;code&amp;gt;~/.cmk/config&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The sync command fetches all the available API calls that your account can use. Once that is done, you can then use tab completion while in the CloudMonkey CLI.&lt;br /&gt;
&lt;br /&gt;
====Cheat sheet====&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!What&lt;br /&gt;
!Command&lt;br /&gt;
|-&lt;br /&gt;
|Change output format&lt;br /&gt;
|&amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;set display table|json&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Create compute offering&lt;br /&gt;
|&amp;lt;code&amp;gt;create serviceoffering name=rcs.c2 displaytext=Medium cpunumber=2 cpuspeed=750 memory=2048 storagetype=shared provisioningtype=thin offerha=false limitcpuuse=false isvolatile=false issystem=false deploymentplanner=UserDispersingPlanner cachemode=none customized=false&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Add a new host&lt;br /&gt;
|&amp;lt;code&amp;gt;add host clusterid=XX podid=XX zoneid=XX hypervisor=KVM password=**** username=root url=&amp;lt;nowiki&amp;gt;http://bm01&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
====Automate zone deployments====&lt;br /&gt;
There is an example script on how to automate a basic zone deployment at: https://github.com/apache/cloudstack-cloudmonkey/wiki/Usage&lt;br /&gt;
&lt;br /&gt;
=== Terraform ===&lt;br /&gt;
The Terraform CloudStack provider works for the most part. However, for CloudStack 4.16, you&#039;ll need to recompile it from scratch because the distributed binaries don&#039;t work properly (resulting in deployments hanging indefinitely). To build the Terraform provider, I will use Docker:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # git clone https://github.com/apache/cloudstack-terraform-provider.git&lt;br /&gt;
# cd cloudstack-terraform-provider&lt;br /&gt;
# git clone https://github.com/tetra12/cloudstack-go.git&lt;br /&gt;
# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; go.mod&lt;br /&gt;
replace github.com/apache/cloudstack-go/v2 =&amp;gt; ./cloudstack-go&lt;br /&gt;
exclude github.com/apache/cloudstack-go/v2 v2.11.0&lt;br /&gt;
EOF&lt;br /&gt;
# docker run --rm -ti -v /home/me/cloudstack-terraform-provider/:/build golang bash &lt;br /&gt;
&amp;gt; cd /build&lt;br /&gt;
&amp;gt; go build&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Copy the resulting binary to your Terraform plugins path. Because I had run &amp;lt;code&amp;gt;terraform init&amp;lt;/code&amp;gt;, the provider was placed in my Terraform directory under &amp;lt;code&amp;gt;.terraform/providers/registry.terraform.io/cloudstack/cloudstack/0.4.0/linux_amd64/terraform-provider-cloudstack_v0.4.0&amp;lt;/code&amp;gt;. Edit the metadata file in the same directory as the provider executable and remove the file hash so that Terraform runs the recompiled provider.&lt;br /&gt;
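&lt;br /&gt;
For example, copying the freshly built binary over the one fetched during init might look like the following (the paths and the built binary&#039;s name are from my environment and may differ in yours):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cp cloudstack-terraform-provider/cloudstack-terraform-provider \&lt;br /&gt;
    .terraform/providers/registry.terraform.io/cloudstack/cloudstack/0.4.0/linux_amd64/terraform-provider-cloudstack_v0.4.0&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;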
&lt;br /&gt;
See also: [[Terraform#CloudStack]]&lt;br /&gt;
&lt;br /&gt;
=== Packer ===&lt;br /&gt;
The Packer CloudStack provider also works for the most part, but is limited in that it cannot enter keyboard inputs. Any OS deployments will require some sort of manual inputs or require that the ISO media you use is completely automated. I also had to compile the provider manually since the default plugin that&#039;s fetched by packer doesn&#039;t quite work due to API changes.&lt;br /&gt;
&lt;br /&gt;
See also: [[Packer#CloudStack]]&lt;br /&gt;
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
When you run into issues, check the logs in &amp;lt;code&amp;gt;/var/log/cloudstack/&amp;lt;/code&amp;gt;. There&#039;s typically a stacktrace which gets generated whenever you encounter an error.&lt;br /&gt;
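&lt;br /&gt;
For example, you can watch for new stack traces on a management server as they happen (the exact file under &amp;lt;code&amp;gt;/var/log/cloudstack/&amp;lt;/code&amp;gt; depends on whether the host runs the management server or the agent):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # tail -f /var/log/cloudstack/management/management-server.log {{!}} grep -B2 -A10 ERROR&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;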
&lt;br /&gt;
===Can&#039;t create shared network in an advanced zone using Open vSwitch===&lt;br /&gt;
Whenever I try creating a shared network in an advanced zone that is using OVS, the step fails with: &amp;quot;Unable to convert network offering with specified id to network profile&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The stack trace shows that the [https://github.com/apache/cloudstack/blob/bf6266188c89a5487383f216333ae10e878d2c10/plugins/network-elements/ovs/src/main/java/com/cloud/network/guru/OvsGuestNetworkGuru.java#L99 OVS guest network guru] isn&#039;t able to design the network because the zone isn&#039;t capable of handling this network offering.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2021-09-28 16:36:26,416 DEBUG [c.c.a.ApiServer] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) CIDRs from which account &#039;Acct[76a1585d-1bf6-11ec-a3c5-8f3e88f01ab1-admin]&#039; is allowed to perform API calls: 0.0.0.0/0,::/0&lt;br /&gt;
2021-09-28 16:36:26,439 DEBUG [c.c.u.AccountManagerImpl] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Access granted to Acct[76a1585d-1bf6-11ec-a3c5-8f3e88f01ab1-admin] to [Network Offering [7-Guest-DefaultSharedNetworkOffering] by AffinityGroupAccessChecker&lt;br /&gt;
2021-09-28 16:36:26,517 DEBUG [c.c.n.g.BigSwitchBcfGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network, the physical isolation type is not BCF_SEGMENT&lt;br /&gt;
2021-09-28 16:36:26,521 DEBUG [o.a.c.n.c.m.ContrailGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,524 DEBUG [c.c.n.g.NiciraNvpGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,527 DEBUG [o.a.c.n.o.OpendaylightGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,530 DEBUG [c.c.n.g.OvsGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,536 DEBUG [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) GRE: VLAN&lt;br /&gt;
2021-09-28 16:36:26,536 DEBUG [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) GRE: VXLAN&lt;br /&gt;
2021-09-28 16:36:26,536 INFO  [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,539 INFO  [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,543 DEBUG [o.a.c.n.g.SspGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) SSP not configured to be active&lt;br /&gt;
2021-09-28 16:36:26,546 DEBUG [c.c.n.g.BrocadeVcsGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,549 DEBUG [o.a.c.e.o.NetworkOrchestrator] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Releasing lock for Acct[76a0531f-1bf6-11ec-a3c5-8f3e88f01ab1-system]&lt;br /&gt;
2021-09-28 16:36:26,624 DEBUG [c.c.u.d.T.Transaction] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Rolling back the transaction: Time = 172 Name =  qtp1816147548-400; called by -TransactionLegacy.rollback:888-TransactionLegacy.removeUpTo:831-TransactionLegacy.close:655-Transaction.execute:38-Transaction.execute:47-NetworkOrches&lt;br /&gt;
trator.createGuestNetwork:2572-NetworkOrchestrator.createGuestNetwork:2327-NetworkServiceImpl$4.doInTransaction:1502-NetworkServiceImpl$4.doInTransaction:1450-Transaction.execute:40-NetworkServiceImpl.commitNetwork:1450-NetworkServiceImpl.createGuestNetwork:1366&lt;br /&gt;
2021-09-28 16:36:26,667 ERROR [c.c.a.ApiServer] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) unhandled exception executing api command: [Ljava.lang.String;@69a8823d&lt;br /&gt;
com.cloud.utils.exception.CloudRuntimeException: Unable to convert network offering with specified id to network profile&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.setupNetwork(NetworkOrchestrator.java:739)&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator$10.doInTransaction(NetworkOrchestrator.java:2634)&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator$10.doInTransaction(NetworkOrchestrator.java:2572)&lt;br /&gt;
        at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:50)&lt;br /&gt;
        at com.cloud.utils.db.Transaction.execute(Transaction.java:40)&lt;br /&gt;
        at com.cloud.utils.db.Transaction.execute(Transaction.java:47)&lt;br /&gt;
...&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Possible answer====&lt;br /&gt;
The guest network was set up with GRE isolation. This however isn&#039;t supported with KVM as the hypervisor (see [https://events.static.linuxfound.org/sites/events/files/slides/CloudStack%20Collab%20Hypervisor.pdf this presentation]). After re-creating the zone with the guest physical network set up with just VLAN isolation, I was able to create a regular shared guest network that all tenants within the zone can see and use.&lt;br /&gt;
&lt;br /&gt;
To make the shared network SNAT out, I created another shared network offering that also has SourceNat and StaticNat.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ cmk list serviceofferings issystem=true name=&#039;System Offering For Software Router&#039;&lt;br /&gt;
$ cmk create networkoffering \&lt;br /&gt;
name=SharedNetworkOfferingWithSourceNatService displaytext=&amp;quot;Shared Network Offering with Source NAT Service&amp;quot; traffictype=GUEST guestiptype=shared conservemode=true specifyvlan=true specifyipranges=true \&lt;br /&gt;
serviceofferingid=307b14d8-afd1-43ea-948c-ffe882cd5926 \&lt;br /&gt;
supportedservices=Dhcp,Dns,Firewall,SourceNat,StaticNat,PortForwarding \&lt;br /&gt;
serviceProviderList[0].service=Dhcp serviceProviderList[0].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[1].service=Dns serviceProviderList[1].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[2].service=Firewall serviceProviderList[2].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[3].service=SourceNat serviceProviderList[3].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[4].service=StaticNat serviceProviderList[4].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[5].service=PortForwarding serviceProviderList[5].provider=VirtualRouter \&lt;br /&gt;
servicecapabilitylist[0].service=SourceNat servicecapabilitylist[0].capabilitytype=SupportedSourceNatTypes servicecapabilitylist[0].capabilityvalue=peraccount&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Using this network offering, I was able to create a shared network in the advanced networking zone that has a NAT service which is visible to all accounts. The only issue with this approach is that there isn&#039;t a way to create a port forwarding rule for a specific VM because the account that owns this network is &#039;system&#039;.&lt;br /&gt;
&lt;br /&gt;
=== Libvirtd can&#039;t start due to expired certificate ===&lt;br /&gt;
For some reason, the host stopped renewing its agent certificates with the management server. As a result, libvirtd will not start. I only noticed this when rebooting an affected node after migrating all the VMs off of it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # libvirtd -l&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: info : libvirt version: 8.0.0, package: 22.module+el8.9.0+1405+b6048078 (infrastructure@rockylinux.org, 2023-07-31-18:01:38, )&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: info : hostname: cs1&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: error : virNetTLSContextCheckCertTimes:142 : The server certificate /etc/pki/libvirt/servercert.pem has expired&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Note that the certificate file &amp;lt;code&amp;gt;/etc/pki/libvirt/servercert.pem&amp;lt;/code&amp;gt; symlinks to &amp;lt;code&amp;gt;/etc/cloudstack/agent/cloud.crt&amp;lt;/code&amp;gt;. The certificates and private keys under &amp;lt;code&amp;gt;/etc/cloudstack/agent/cloud*&amp;lt;/code&amp;gt; are generated by the CloudStack management server and then sent to and saved by the agent&amp;lt;ref&amp;gt;Certificate saved by the agent: https://github.com/apache/cloudstack/blob/cb62ce67671699fa01564b3b4b0d3d83eb3d5acb/agent/src/main/java/com/cloud/agent/Agent.java#L671&amp;lt;/ref&amp;gt;.&lt;br /&gt;
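&lt;br /&gt;
You can confirm the symlink and check the certificate&#039;s expiry date with openssl, assuming it&#039;s available on the host:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ls -l /etc/pki/libvirt/servercert.pem&lt;br /&gt;
# openssl x509 -enddate -noout -in /etc/cloudstack/agent/cloud.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;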
&lt;br /&gt;
Since the node is already out of service, the easiest fix here is to [[CloudStack#Re-add existing KVM bare metal host|re-add this KVM bare metal host]] back into CloudStack again.&lt;br /&gt;
&lt;br /&gt;
==Open-ended questions==&lt;br /&gt;
===Compute offerings with &#039;unlimited&#039; CPU cycles?===&lt;br /&gt;
Compute offerings require an MHz value assigned. Why is this? Can we just assign a VM entire cores?&lt;br /&gt;
&lt;br /&gt;
- If you read the docs, CPU (in MHz) only has an effect if the CPU cap is selected. In all other cases, the value here is something akin to &#039;cpu shares&#039;.&lt;br /&gt;
&lt;br /&gt;
- However, if you put in a huge number like 9999, the deployment will fail.&lt;br /&gt;
&lt;br /&gt;
===How to implement showback?===&lt;br /&gt;
Is there a way to implement showback based on resources consumed by account?&lt;br /&gt;
&lt;br /&gt;
===Monitoring resources?===&lt;br /&gt;
Is there a way to monitor resource usage by account, node? Any good way to push VMs into a CMDB like ServiceNow?&lt;br /&gt;
&lt;br /&gt;
===NetApp integration?===&lt;br /&gt;
Is it possible to do guest VM snapshots by leveraging NetApp?&lt;br /&gt;
&lt;br /&gt;
===Backups?===&lt;br /&gt;
The only backup plugins available are &#039;dummy&#039;, which does nothing, and &#039;veeam&#039;, which only supports VMware + Veeam. If you&#039;re using KVM, there doesn&#039;t seem to be any way to easily back up or restore VMs.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:LinuxUtilities]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Traefik&amp;diff=7691</id>
		<title>Traefik</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Traefik&amp;diff=7691"/>
		<updated>2025-07-02T16:44:28Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Certificates with intermediate signing certificates */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Traefik is an open source HTTP reverse proxy and load balancer written in golang. It can be integrated into various infrastructures such as Docker and Kubernetes or used as a standalone service. Traefik comes with lots of automation features such as dynamic configuration through container labels and automatic SSL certificate renewals using the ACME protocol.&lt;br /&gt;
&lt;br /&gt;
==Docker integration==&lt;br /&gt;
Traefik can be used as the reverse proxy for all web-related services on a Docker host. The added benefit of this approach is that newly added web services can be configured automatically through Docker labels, and SSL certificates are obtained automatically provided that the DNS names already resolve.&lt;br /&gt;
&lt;br /&gt;
Extra care is required when using Traefik with Docker since Traefik needs access to the Docker socket in order to see container events for its automatic configuration. Because access to the Docker socket is pretty much equivalent to owning the Docker host, from a security standpoint it&#039;s best to have an intermediate container proxy the Docker socket so that it can only be read from but not written to. More on this setup below.&lt;br /&gt;
&lt;br /&gt;
===Setting up Traefik with Docker Compose===&lt;br /&gt;
Create a new &#039;traefik&#039; Docker network. This network will be used by any other containers on the host that need to be reverse proxied by Traefik.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # docker network create traefik&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Next, we will set up the Traefik docker-compose stack. Traefik should be configured to work with Docker. We&#039;ll also restrict Traefik from exposing containers unless they are explicitly labeled.  Review the [https://doc.traefik.io/traefik/providers/docker/ Traefik Docker documentation] for all the available Docker related options.&lt;br /&gt;
&lt;br /&gt;
Included here is the socket-proxy which we&#039;ll use to restrict what Traefik can do to Docker in case it gets compromised.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
&lt;br /&gt;
services:&lt;br /&gt;
  traefik:&lt;br /&gt;
    image: traefik:latest&lt;br /&gt;
    restart: always&lt;br /&gt;
    command: --web --docker --docker.watch --docker.exposedbydefault=false --providers.docker.endpoint=tcp://socket-proxy:2375&lt;br /&gt;
    volumes:&lt;br /&gt;
      - ./config/traefik.yml:/traefik.yml&lt;br /&gt;
      - ./config:/config&lt;br /&gt;
      - /var/log/traefik:/var/log/traefik&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - internal&lt;br /&gt;
    environment:&lt;br /&gt;
      - &amp;quot;TZ=MST7MDT,M3.2.0,M11.1.0&amp;quot;&lt;br /&gt;
    ports:&lt;br /&gt;
      - &amp;quot;80:80&amp;quot;&lt;br /&gt;
      - &amp;quot;443:443&amp;quot;&lt;br /&gt;
&lt;br /&gt;
  # https://github.com/Tecnativa/docker-socket-proxy&lt;br /&gt;
  socket-proxy:&lt;br /&gt;
    build:&lt;br /&gt;
      context: ./socket-proxy&lt;br /&gt;
      dockerfile: Dockerfile&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/run/docker.sock:/var/run/docker.sock&lt;br /&gt;
    environment:&lt;br /&gt;
      CONTAINERS: 1&lt;br /&gt;
    networks:&lt;br /&gt;
      - internal&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;2375&amp;quot;&lt;br /&gt;
&lt;br /&gt;
networks:&lt;br /&gt;
  internal:&lt;br /&gt;
  traefik:&lt;br /&gt;
    name: traefik&lt;br /&gt;
    external: true&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
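The compose file above builds the socket proxy from a local &amp;lt;code&amp;gt;./socket-proxy/Dockerfile&amp;lt;/code&amp;gt;, which isn&#039;t shown here. A minimal sketch, assuming you base it directly on the Tecnativa image linked in the comment, could be as simple as:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = FROM tecnativa/docker-socket-proxy:latest&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;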
Bring the entire stack up; once everything is running, you&#039;re done.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # docker-compose up -d&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Using Traefik with your Docker containers ===&lt;br /&gt;
With the Traefik container set up, we can now create containers that take advantage of Traefik as the ingress controller for the Docker host. This is done entirely through container labels.&lt;br /&gt;
Containers should be given the following labels:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Description&lt;br /&gt;
!Label (example)&lt;br /&gt;
|-&lt;br /&gt;
|Enable Traefik for this container&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.enable=true&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Define the Docker network that Traefik needs to use to connect to this container&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.docker.network=traefik&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Define a new service for this container. For example, if your container listens on port 8888, define a new Traefik service that forwards to port 8888.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.services.my-new-service.loadbalancer.server.port=8888&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Expose your container using the service created previously via the HTTPS entrypoint.&lt;br /&gt;
This will make Traefik forward HTTPS data matching your virtual host to this container.&lt;br /&gt;
&lt;br /&gt;
SSL certificates are handled with Let&#039;s Encrypt.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.entrypoints=https&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.service=my-new-service&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.tls=true&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.tls.certresolver=letsencrypt&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|You can also expose your container via HTTP.&lt;br /&gt;
Be sure that your router name is distinct.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.entrypoints=http&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.service=my-new-service&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|If you want your HTTP traffic redirected to the HTTPS version, create a new middleware that redirects to the HTTPS scheme.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.entrypoints=http&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.middlewares=new-service-redirect&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.middlewares.new-service-redirect.redirectscheme.scheme=https&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
If your container exposes multiple services, you can define multiple services and routers in your labels. Just make sure things are named uniquely.&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the configuration for one of my test Wiki containers using Traefik.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wiki:&lt;br /&gt;
    image: mediawiki:1.37&lt;br /&gt;
    labels:&lt;br /&gt;
      - traefik.enable=true&lt;br /&gt;
      - traefik.docker.network=traefik&lt;br /&gt;
      - traefik.http.middlewares.wiki-https-redirect.redirectscheme.scheme=https&lt;br /&gt;
      # http&lt;br /&gt;
      - traefik.http.routers.wiki.entrypoints=http&lt;br /&gt;
      - traefik.http.routers.wiki.rule=Host(`wiki-test.example.com`)&lt;br /&gt;
      - traefik.http.routers.wiki.middlewares=wiki-https-redirect&lt;br /&gt;
      # https&lt;br /&gt;
      - traefik.http.routers.wiki-https.entrypoints=https&lt;br /&gt;
      - traefik.http.routers.wiki-https.rule=Host(`wiki-test.example.com`)&lt;br /&gt;
      - traefik.http.routers.wiki-https.tls=true&lt;br /&gt;
      - traefik.http.routers.wiki-https.tls.certresolver=letsencrypt&lt;br /&gt;
      - traefik.http.routers.wiki-https.service=wiki-https&lt;br /&gt;
      - traefik.http.services.wiki-https.loadbalancer.server.port=8080&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8080&amp;quot;&lt;br /&gt;
    restart: always&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Standalone setup ==&lt;br /&gt;
Traefik can be used as a standalone service to reverse proxy and do SSL termination. The example below will set up a simple reverse proxy for a local web service listening on port 8000. The SSL certificate will automatically be obtained using Let&#039;s Encrypt.&lt;br /&gt;
&lt;br /&gt;
Create the following &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt; configuration file at &amp;lt;code&amp;gt;/etc/traefik/traefik.yml&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = log:&lt;br /&gt;
  level: INFO&lt;br /&gt;
  filepath: &amp;quot;/var/log/traefik.log&amp;quot;&lt;br /&gt;
&lt;br /&gt;
accessLog:&lt;br /&gt;
  filepath: &amp;quot;/var/log/access.log&amp;quot;&lt;br /&gt;
  fields:&lt;br /&gt;
    defaultMode: keep&lt;br /&gt;
    headers:&lt;br /&gt;
      defaultMode: keep&lt;br /&gt;
&lt;br /&gt;
api:&lt;br /&gt;
  dashboard: false&lt;br /&gt;
  insecure: false&lt;br /&gt;
  debug: true&lt;br /&gt;
&lt;br /&gt;
entryPoints:&lt;br /&gt;
  http:&lt;br /&gt;
    address: &amp;quot;:80&amp;quot;&lt;br /&gt;
  https:&lt;br /&gt;
    address: &amp;quot;:443&amp;quot;&lt;br /&gt;
&lt;br /&gt;
providers:&lt;br /&gt;
  file:&lt;br /&gt;
    filename: &amp;quot;/etc/traefik/dynamic_config.yml&amp;quot;&lt;br /&gt;
    watch: true&lt;br /&gt;
&lt;br /&gt;
certificatesResolvers:&lt;br /&gt;
  letsencrypt:&lt;br /&gt;
    acme:&lt;br /&gt;
      email: my-email-address@example.com&lt;br /&gt;
      storage: &amp;quot;/etc/traefik/acme.json&amp;quot;&lt;br /&gt;
      httpChallenge:&lt;br /&gt;
        entryPoint: http&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Create a &amp;lt;code&amp;gt;dynamic_config.yml&amp;lt;/code&amp;gt; file at &amp;lt;code&amp;gt;/etc/traefik/dynamic_config.yml&amp;lt;/code&amp;gt;. This file will define our routes and backend services. You must have a separate router for each of the insecure http and secure https routes to your backend service.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  serversTransports:&lt;br /&gt;
    ignorecert:&lt;br /&gt;
      insecureSkipVerify: true&lt;br /&gt;
&lt;br /&gt;
  routers:&lt;br /&gt;
    testing-http:&lt;br /&gt;
      rule: Host(`testing.example.com`)&lt;br /&gt;
      service: my-testing-service&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
&lt;br /&gt;
    testing-https:&lt;br /&gt;
      rule: Host(`testing.example.com`)&lt;br /&gt;
      service: my-testing-service&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
  services:&lt;br /&gt;
    my-testing-service:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://127.0.0.1:8000&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Start Traefik:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # traefik --configFile=/etc/traefik/traefik.yml&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Tasks==&lt;br /&gt;
&lt;br /&gt;
===Add host header on requests to the backend===&lt;br /&gt;
You can specify custom request headers by injecting a middleware. For example, when reverse proxying HTTP traffic for a website, you may wish to have the reverse proxy provide the &#039;Host&#039; header. To do this, we can add the following dynamic config.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  routers:&lt;br /&gt;
    public:&lt;br /&gt;
      rule: Host(`public.example.com`)&lt;br /&gt;
      service: public&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
      middlewares:&lt;br /&gt;
        - public&lt;br /&gt;
&lt;br /&gt;
  services:&lt;br /&gt;
    public:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://10.1.2.200&amp;quot;&lt;br /&gt;
&lt;br /&gt;
  middlewares:&lt;br /&gt;
    public:&lt;br /&gt;
      headers:&lt;br /&gt;
        customRequestHeaders:&lt;br /&gt;
          Host: &amp;quot;public.example.com&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Add a path prefix ===&lt;br /&gt;
To serve content under a specific path (eg. /downloads) using another container, create the service as you would normally but specify the router rule to include a &amp;lt;code&amp;gt;PathPrefix&amp;lt;/code&amp;gt;. Traefik can also strip out this path prefix so that the underlying server sees only the root path (eg. have /downloads/file become /file) using the stripprefix middleware. &lt;br /&gt;
&lt;br /&gt;
Here is an example &amp;lt;code&amp;gt;docker-compose.yml&amp;lt;/code&amp;gt; definition:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = services:&lt;br /&gt;
  downloads:&lt;br /&gt;
    build:&lt;br /&gt;
      context: ./downloads&lt;br /&gt;
    labels:&lt;br /&gt;
      - traefik.enable=true&lt;br /&gt;
      - traefik.docker.network=traefik&lt;br /&gt;
      # http&lt;br /&gt;
      - traefik.http.routers.downloads.entrypoints=http&lt;br /&gt;
      - &#039;traefik.http.routers.downloads.rule=(Host(`example.com`) &amp;amp;&amp;amp; PathPrefix(`/downloads/`))&#039;&lt;br /&gt;
      - traefik.http.routers.downloads.middlewares=downloads-https-redirect&lt;br /&gt;
      - traefik.http.middlewares.downloads-https-redirect.redirectscheme.scheme=https&lt;br /&gt;
      # https&lt;br /&gt;
      - traefik.http.routers.downloads-https.entrypoints=https&lt;br /&gt;
      - &#039;traefik.http.routers.downloads-https.rule=(Host(`example.com`) &amp;amp;&amp;amp; PathPrefix(`/downloads/`))&#039;&lt;br /&gt;
      - traefik.http.routers.downloads-https.tls=true&lt;br /&gt;
      - traefik.http.routers.downloads-https.tls.certresolver=letsencrypt&lt;br /&gt;
      - traefik.http.services.downloads-https.loadbalancer.server.port=8080&lt;br /&gt;
      - traefik.http.routers.downloads-https.service=downloads-https&lt;br /&gt;
      - traefik.http.routers.downloads-https.middlewares=downloads-strip-prefix&lt;br /&gt;
      - traefik.http.middlewares.downloads-strip-prefix.stripprefix.prefixes=/downloads&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8080&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Install a custom certificate ===&lt;br /&gt;
Installing a custom certificate and private key in Traefik is simple:&lt;br /&gt;
&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt; and ensure that it has a dynamic configuration file provider set. {{Highlight&lt;br /&gt;
| code = providers:&lt;br /&gt;
  file:&lt;br /&gt;
    filename: &amp;quot;/config/dynamic_config.yml&amp;quot;&lt;br /&gt;
    watch: true&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
# In the dynamic configuration file, specify the certificate and key files. For example, here is what I would place for a *.example.com wildcard certificate: {{Highlight&lt;br /&gt;
| code = tls:&lt;br /&gt;
  certificates:&lt;br /&gt;
    - certFile: /config/ssl/example.com.crt&lt;br /&gt;
      keyFile: /config/ssl/example.com.key&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
# Place the certificates in the /config/ssl directory.&lt;br /&gt;
&lt;br /&gt;
Restart Traefik if you didn&#039;t previously have the dynamic config file defined in &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt;. Traefik will automatically determine which domains it has certificates for and use them as required.&lt;br /&gt;
&lt;br /&gt;
Your certificates must be set to mode 0600 (that is, not readable by anyone else) for them to be loaded properly.&lt;br /&gt;
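&lt;br /&gt;
As a quick sketch (assuming the &amp;lt;code&amp;gt;/config/ssl&amp;lt;/code&amp;gt; paths from the example above), the permissions can be fixed with:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # chmod 0600 /config/ssl/example.com.crt /config/ssl/example.com.key&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;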
&lt;br /&gt;
==== Certificates with intermediate signing certificates ====&lt;br /&gt;
If your certificate is signed by one or more intermediate certificates, you will likely want to include them in the certificate file you pass to Traefik. A certificate file containing your certificate plus the full set of intermediate and root certificates is called a fullchain certificate. Generate it by concatenating the certificates starting furthest from the root: your server certificate first, then each intermediate working toward the root, and the root certificate last.&lt;br /&gt;
&lt;br /&gt;
For example, if I have an Entrust certificate, concatenate the files together:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat ServerCertificate.crt Intermediate1.crt Intermediate2.crt Root.crt &amp;gt; fullchain.crt&lt;br /&gt;
&lt;br /&gt;
## If your certs don&#039;t end on a new line...&lt;br /&gt;
# for i in ServerCertificate.crt Intermediate1.crt Intermediate2.crt Root.crt ; do cat $i ; echo ; done &amp;gt; fullchain.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Remember to set the mode to 0600 when using the fullchain certificate.&lt;br /&gt;
&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
{{Navbox Web}}&lt;br /&gt;
[[Category:WebApp]]&lt;br /&gt;
[[Category:Linux]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Traefik&amp;diff=7690</id>
		<title>Traefik</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Traefik&amp;diff=7690"/>
		<updated>2025-07-02T16:36:17Z</updated>

		<summary type="html">&lt;p&gt;Leo: Certificates with intermediate certs&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Traefik is an open source HTTP reverse proxy and load balancer written in golang. It can be integrated into various infrastructures such as Docker and Kubernetes or used as a standalone service. Traefik comes with lots of automation features such as dynamic configuration through container labels and automatic SSL certificate renewals using the ACME protocol.&lt;br /&gt;
&lt;br /&gt;
==Docker integration==&lt;br /&gt;
Traefik can be used as the reverse proxy for all web-related services on a Docker host. The added benefit of this approach is that newly added web services can be configured automatically through Docker labels, and SSL certificates are obtained automatically provided that the DNS names already resolve. &lt;br /&gt;
&lt;br /&gt;
Extra care is required when using Traefik with Docker since Traefik needs access to the Docker socket in order to see container events for its automatic configuration. Since access to the Docker socket is effectively equivalent to owning the Docker host, from a security standpoint it&#039;s best to have an intermediate container proxy the Docker socket so that it can only be read from but not written to. More on this setup below. &lt;br /&gt;
&lt;br /&gt;
===Setting up Traefik with Docker Compose===&lt;br /&gt;
Create a new &#039;traefik&#039; Docker network. This network will be used by any other containers on the host that need to be reverse proxied by Traefik.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # docker network create traefik&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Next, we will set up the Traefik docker-compose stack. Traefik should be configured to work with Docker. We&#039;ll also restrict Traefik from exposing containers unless they are explicitly labeled.  Review the [https://doc.traefik.io/traefik/providers/docker/ Traefik Docker documentation] for all the available Docker related options.&lt;br /&gt;
&lt;br /&gt;
Included here is the socket-proxy which we&#039;ll use to restrict what Traefik can do to Docker in case it gets compromised.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
&lt;br /&gt;
services:&lt;br /&gt;
  traefik:&lt;br /&gt;
    image: traefik:latest&lt;br /&gt;
    restart: always&lt;br /&gt;
    command: --providers.docker=true --providers.docker.watch=true --providers.docker.exposedByDefault=false --providers.docker.endpoint=tcp://socket-proxy:2375&lt;br /&gt;
    volumes:&lt;br /&gt;
      - ./config/traefik.yml:/traefik.yml&lt;br /&gt;
      - ./config:/config&lt;br /&gt;
      - /var/log/traefik:/var/log/traefik&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
      - internal&lt;br /&gt;
    environment:&lt;br /&gt;
      - &amp;quot;TZ=MST7MDT,M3.2.0,M11.1.0&amp;quot;&lt;br /&gt;
    ports:&lt;br /&gt;
      - &amp;quot;80:80&amp;quot;&lt;br /&gt;
      - &amp;quot;443:443&amp;quot;&lt;br /&gt;
&lt;br /&gt;
  # https://github.com/Tecnativa/docker-socket-proxy&lt;br /&gt;
  socket-proxy:&lt;br /&gt;
    build:&lt;br /&gt;
      context: ./socket-proxy&lt;br /&gt;
      dockerfile: Dockerfile&lt;br /&gt;
    restart: always&lt;br /&gt;
    volumes:&lt;br /&gt;
      - /var/run/docker.sock:/var/run/docker.sock&lt;br /&gt;
    environment:&lt;br /&gt;
      CONTAINERS: 1&lt;br /&gt;
    networks:&lt;br /&gt;
      - internal&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;2375&amp;quot;&lt;br /&gt;
&lt;br /&gt;
networks:&lt;br /&gt;
  internal:&lt;br /&gt;
  traefik:&lt;br /&gt;
    name: traefik&lt;br /&gt;
    external: true&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Bring the entire stack up; once everything is running, you&#039;re done. &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # docker-compose up -d&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Using Traefik with your Docker containers ===&lt;br /&gt;
With the Traefik container set up, we can now create containers that take advantage of Traefik as the ingress controller for the Docker host. This can be done through container labels alone.  &lt;br /&gt;
Containers should be given the following labels. &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Description&lt;br /&gt;
!Label (example)&lt;br /&gt;
|-&lt;br /&gt;
|Enable Traefik for this container&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.enable=true&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Define the Docker network that Traefik needs to use to connect to this container&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.docker.network=traefik&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Define a new service to this container. For example, if your container is listening on port 8888, define a new Traefik service that listens on port 8888.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.services.my-new-service.loadbalancer.server.port=8888&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Expose your container using the service created previously via the HTTPS entrypoint.&lt;br /&gt;
This will make Traefik forward HTTPS data matching your virtual host to this container.&lt;br /&gt;
&lt;br /&gt;
SSL certificates are handled with Let&#039;s Encrypt.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.entrypoints=https&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.service=my-new-service&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.tls=true&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router-https.tls.certresolver=letsencrypt&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|You can also expose your container via HTTP.&lt;br /&gt;
Be sure that your router name is distinct.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.entrypoints=http&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.service=my-new-service&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|If you want your HTTP traffic redirected to the HTTPS version, create a new middleware that redirects to the HTTPS scheme.&lt;br /&gt;
|&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.entrypoints=http&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.rule=Host(`my-new-service.example.com`)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.routers.new-service-router.middlewares=new-service-redirect&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;traefik.http.middlewares.new-service-redirect.redirectscheme.scheme=https&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
If your container exposes multiple services, you can define multiple services and routers in your labels. Just make sure things are named uniquely.&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the configuration for one of my test Wiki containers using Traefik.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = wiki:&lt;br /&gt;
    image: mediawiki:1.37&lt;br /&gt;
    labels:&lt;br /&gt;
      - traefik.enable=true&lt;br /&gt;
      - traefik.docker.network=traefik&lt;br /&gt;
      - traefik.http.middlewares.wiki-https-redirect.redirectscheme.scheme=https&lt;br /&gt;
      # http&lt;br /&gt;
      - traefik.http.routers.wiki.entrypoints=http&lt;br /&gt;
      - traefik.http.routers.wiki.rule=Host(`wiki-test.example.com`)&lt;br /&gt;
      - traefik.http.routers.wiki.middlewares=wiki-https-redirect&lt;br /&gt;
      # https&lt;br /&gt;
      - traefik.http.routers.wiki-https.entrypoints=https&lt;br /&gt;
      - traefik.http.routers.wiki-https.rule=Host(`wiki-test.example.com`)&lt;br /&gt;
      - traefik.http.routers.wiki-https.tls=true&lt;br /&gt;
      - traefik.http.routers.wiki-https.tls.certresolver=letsencrypt&lt;br /&gt;
      - traefik.http.routers.wiki-https.service=wiki-https&lt;br /&gt;
      - traefik.http.services.wiki-https.loadbalancer.server.port=8080&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8080&amp;quot;&lt;br /&gt;
    restart: always&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Standalone setup ==&lt;br /&gt;
Traefik can be used as a standalone service to reverse proxy and do SSL termination. The example below will set up a simple reverse proxy for a local web service listening on port 8000. The SSL certificate will automatically be obtained using Let&#039;s Encrypt.&lt;br /&gt;
&lt;br /&gt;
Create the following &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt; configuration file at &amp;lt;code&amp;gt;/etc/traefik/traefik.yml&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = log:&lt;br /&gt;
  level: INFO&lt;br /&gt;
  filepath: &amp;quot;/var/log/traefik.log&amp;quot;&lt;br /&gt;
&lt;br /&gt;
accessLog:&lt;br /&gt;
  filepath: &amp;quot;/var/log/access.log&amp;quot;&lt;br /&gt;
  fields:&lt;br /&gt;
    defaultMode: keep&lt;br /&gt;
    headers:&lt;br /&gt;
      defaultMode: keep&lt;br /&gt;
&lt;br /&gt;
api:&lt;br /&gt;
  dashboard: false&lt;br /&gt;
  insecure: false&lt;br /&gt;
  debug: true&lt;br /&gt;
&lt;br /&gt;
entryPoints:&lt;br /&gt;
  http:&lt;br /&gt;
    address: &amp;quot;:80&amp;quot;&lt;br /&gt;
  https:&lt;br /&gt;
    address: &amp;quot;:443&amp;quot;&lt;br /&gt;
&lt;br /&gt;
providers:&lt;br /&gt;
  file:&lt;br /&gt;
    filename: &amp;quot;/etc/traefik/dynamic_config.yml&amp;quot;&lt;br /&gt;
    watch: true&lt;br /&gt;
&lt;br /&gt;
certificatesResolvers:&lt;br /&gt;
  letsencrypt:&lt;br /&gt;
    acme:&lt;br /&gt;
      email: my-email-address@example.com&lt;br /&gt;
      storage: &amp;quot;/etc/traefik/acme.json&amp;quot;&lt;br /&gt;
      httpChallenge:&lt;br /&gt;
        entryPoint: http&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Create a &amp;lt;code&amp;gt;dynamic_config.yml&amp;lt;/code&amp;gt; file at &amp;lt;code&amp;gt;/etc/traefik/dynamic_config.yml&amp;lt;/code&amp;gt;. This file will define our routes and backend services. You must have a separate router for each of the insecure http and secure https routes to your backend service.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  serversTransports:&lt;br /&gt;
    ignorecert:&lt;br /&gt;
      insecureSkipVerify: true&lt;br /&gt;
&lt;br /&gt;
  routers:&lt;br /&gt;
    testing-http:&lt;br /&gt;
      rule: Host(`testing.example.com`)&lt;br /&gt;
      service: my-testing-service&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
&lt;br /&gt;
    testing-https:&lt;br /&gt;
      rule: Host(`testing.example.com`)&lt;br /&gt;
      service: my-testing-service&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
  services:&lt;br /&gt;
    my-testing-service:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://127.0.0.1:8000&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
Start Traefik:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # traefik --configFile=/etc/traefik/traefik.yml&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Tasks==&lt;br /&gt;
&lt;br /&gt;
===Add host header on requests to the backend===&lt;br /&gt;
You can set custom request headers by injecting a middleware. For example, when reverse proxying HTTP traffic for a website, you may want the reverse proxy to set the &#039;Host&#039; header on requests to the backend. To do this, add the following dynamic config.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  routers:&lt;br /&gt;
    public:&lt;br /&gt;
      rule: Host(`public.example.com`)&lt;br /&gt;
      service: public&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
      middlewares:&lt;br /&gt;
        - public&lt;br /&gt;
&lt;br /&gt;
  services:&lt;br /&gt;
    public:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://10.1.2.200&amp;quot;&lt;br /&gt;
&lt;br /&gt;
  middlewares:&lt;br /&gt;
    public:&lt;br /&gt;
      headers:&lt;br /&gt;
        customRequestHeaders:&lt;br /&gt;
          Host: &amp;quot;public.example.com&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Add a path prefix ===&lt;br /&gt;
To serve content under a specific path (e.g. /downloads) using another container, create the service as you normally would, but include a &amp;lt;code&amp;gt;PathPrefix&amp;lt;/code&amp;gt; in the router rule. Traefik can also strip this path prefix so that the underlying server sees only the root path (e.g. /downloads/file becomes /file) using the stripprefix middleware. &lt;br /&gt;
&lt;br /&gt;
Here is an example &amp;lt;code&amp;gt;docker-compose.yml&amp;lt;/code&amp;gt; definition:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = services:&lt;br /&gt;
  downloads:&lt;br /&gt;
    build:&lt;br /&gt;
      context: ./downloads&lt;br /&gt;
    labels:&lt;br /&gt;
      - traefik.enable=true&lt;br /&gt;
      - traefik.docker.network=traefik&lt;br /&gt;
      # http&lt;br /&gt;
      - traefik.http.routers.downloads.entrypoints=http&lt;br /&gt;
      - &#039;traefik.http.routers.downloads.rule=(Host(`example.com`) &amp;amp;&amp;amp; PathPrefix(`/downloads/`))&#039;&lt;br /&gt;
      - traefik.http.routers.downloads.middlewares=downloads-https-redirect&lt;br /&gt;
      - traefik.http.middlewares.downloads-https-redirect.redirectscheme.scheme=https&lt;br /&gt;
      # https&lt;br /&gt;
      - traefik.http.routers.downloads-https.entrypoints=https&lt;br /&gt;
      - &#039;traefik.http.routers.downloads-https.rule=(Host(`example.com`) &amp;amp;&amp;amp; PathPrefix(`/downloads/`))&#039;&lt;br /&gt;
      - traefik.http.routers.downloads-https.tls=true&lt;br /&gt;
      - traefik.http.routers.downloads-https.tls.certresolver=letsencrypt&lt;br /&gt;
      - traefik.http.services.downloads-https.loadbalancer.server.port=8080&lt;br /&gt;
      - traefik.http.routers.downloads-https.service=downloads-https&lt;br /&gt;
      - traefik.http.routers.downloads-https.middlewares=downloads-strip-prefix&lt;br /&gt;
      - traefik.http.middlewares.downloads-strip-prefix.stripprefix.prefixes=/downloads&lt;br /&gt;
    networks:&lt;br /&gt;
      - traefik&lt;br /&gt;
    expose:&lt;br /&gt;
      - &amp;quot;8080&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Install a custom certificate ===&lt;br /&gt;
Installing a custom certificate and private key in Traefik is simple:&lt;br /&gt;
&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt; and ensure that it has a dynamic configuration file provider set. {{Highlight&lt;br /&gt;
| code = providers:&lt;br /&gt;
  file:&lt;br /&gt;
    filename: &amp;quot;/config/dynamic_config.yml&amp;quot;&lt;br /&gt;
    watch: true&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
# In the dynamic configuration file, specify the certificate and key files. For example, here is what I would place for a *.example.com wildcard certificate: {{Highlight&lt;br /&gt;
| code = tls:&lt;br /&gt;
  certificates:&lt;br /&gt;
    - certFile: /config/ssl/example.com.crt&lt;br /&gt;
      keyFile: /config/ssl/example.com.key&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
# Place the certificates in the /config/ssl directory.&lt;br /&gt;
&lt;br /&gt;
Restart Traefik if you didn&#039;t previously have the dynamic config file defined in &amp;lt;code&amp;gt;traefik.yml&amp;lt;/code&amp;gt;. Traefik will automatically determine which domains it has certificates for and use them as required.&lt;br /&gt;
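&lt;br /&gt;
To confirm which certificate Traefik is actually serving for a given name (&amp;lt;code&amp;gt;testing.example.com&amp;lt;/code&amp;gt; below is just a placeholder), you can inspect it with openssl:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # openssl s_client -connect testing.example.com:443 -servername testing.example.com &amp;lt; /dev/null 2&amp;gt;/dev/null {{!}} openssl x509 -noout -subject -issuer -dates&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;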
&lt;br /&gt;
==== Certificates with intermediate signing certificates ====&lt;br /&gt;
If your certificate is signed by one or more intermediate certificates, you will likely want to include them in the certificate file you pass to Traefik. Concatenate the certificates starting furthest from the root: your server certificate first, then each intermediate working toward the root, and the root certificate last.&lt;br /&gt;
&lt;br /&gt;
For example, if I have an Entrust certificate, concatenate the files together:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat ServerCertificate.crt Intermediate1.crt Intermediate2.crt Root.crt &amp;gt; certificate.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
{{Navbox Web}}&lt;br /&gt;
[[Category:WebApp]]&lt;br /&gt;
[[Category:Linux]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Entrust&amp;diff=7689</id>
		<title>Entrust</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Entrust&amp;diff=7689"/>
		<updated>2025-07-02T16:25:23Z</updated>

		<summary type="html">&lt;p&gt;Leo: Sectigo signing certificate info&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Entrust is a certificate authority.&lt;br /&gt;
&lt;br /&gt;
== Sectigo ==&lt;br /&gt;
Google stopped trusting Entrust-signed certificates around June 2024. Entrust has since started signing their certificates through Sectigo. The certificate chain for an EV certificate now looks like this:&lt;br /&gt;
&lt;br /&gt;
* USERTrust RSA Certification Authority&lt;br /&gt;
** [https://paste.steamr.com/view/raw/97e71c82 Sectigo Public Server Authentication Root R46]&lt;br /&gt;
*** [https://paste.steamr.com/view/raw/8a4028f7 Entrust EV TLS Issuing RSA CA 2]&lt;br /&gt;
&lt;br /&gt;
Like Entrust&#039;s crap certs, you&#039;ll have to install the Sectigo and Entrust intermediate certs to make most things work. For some reason, I don&#039;t see the intermediate certs listed on their website, so the links above go to a copy on my pastebin.&lt;br /&gt;
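&lt;br /&gt;
For example, to build a bundle containing the full chain (filenames here are placeholders for the downloaded certs), concatenate your server certificate followed by the intermediates:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat server.crt entrust_ev_tls_issuing_rsa_ca2.crt sectigo_root_r46.crt &amp;gt; fullchain.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;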
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
The Entrust Root Certification Authority (G2) doesn&#039;t appear to be part of the system root CA bundle. As a result, if you try to fetch a resource with &amp;lt;code&amp;gt;wget&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;curl&amp;lt;/code&amp;gt; that uses certificates signed by Entrust, you&#039;ll get an error like the one below.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # wget -O - https://somewhere.ucalgary.ca/&lt;br /&gt;
--2022-03-29 16:58:39--  https://somewhere.ucalgary.ca/&lt;br /&gt;
Resolving somewhere.ucalgary.ca (somewhere.ucalgary.ca)... 10.43.144.134&lt;br /&gt;
Connecting to somewhere.ucalgary.ca (somewhere.ucalgary.ca){{!}}10.43.144.134{{!}}:443... connected.&lt;br /&gt;
ERROR: cannot verify somewhere.ucalgary.ca&#039;s certificate, issued by ‘CN=Entrust Certification Authority - L1K,OU=(c) 2012 Entrust\\, Inc. - for authorized use only,OU=See www.entrust.net/legal-terms,O=Entrust\\, Inc.,C=US’:&lt;br /&gt;
  Unable to locally verify the issuer&#039;s authority.&lt;br /&gt;
To connect to somewhere.ucalgary.ca insecurely, use `--no-check-certificate&#039;.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
To fix this, you&#039;ll need to install the root and chain certificates that are provided by Entrust at https://www.entrust.com/resources/certificate-solutions/tools/root-certificate-downloads.&lt;br /&gt;
&lt;br /&gt;
=== Installing certificates ===&lt;br /&gt;
On RHEL based systems:{{Highlight&lt;br /&gt;
| code = # wget https://web.entrust.com/root-certificates/entrust_l1k.cer -O /usr/share/pki/ca-trust-source/anchors/entrust_l1k.cer&lt;br /&gt;
# wget https://web.entrust.com/root-certificates/entrust_g2_ca.cer -O /usr/share/pki/ca-trust-source/anchors/entrust_g2_ca.cer&lt;br /&gt;
# wget https://web.entrust.com/root-certificates/entrust_l1m_sha2.cer -O  /usr/share/pki/ca-trust-source/anchors/entrust_l1m_sha2.cer&lt;br /&gt;
# update-ca-trust extract&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}On Ubuntu:{{Highlight&lt;br /&gt;
| code = # wget https://web.entrust.com/root-certificates/entrust_l1k.cer   -O /usr/share/ca-certificates/entrust_l1k.cer&lt;br /&gt;
# wget https://web.entrust.com/root-certificates/entrust_g2_ca.cer -O /usr/share/ca-certificates/entrust_g2_ca.cer&lt;br /&gt;
# wget https://web.entrust.com/root-certificates/entrust_l1m_sha2.cer -O  /usr/share/ca-certificates/entrust_l1m_sha2.cer&lt;br /&gt;
# update-ca-certificates&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}On Debian:{{Highlight&lt;br /&gt;
| code = # mkdir -p /usr/local/share/ca-certificates&lt;br /&gt;
# cd /usr/local/share/ca-certificates&lt;br /&gt;
# curl https://web.entrust.com/root-certificates/entrust_l1k.cer  &amp;gt; entrust_l1k.cer&lt;br /&gt;
# curl https://web.entrust.com/root-certificates/entrust_g2_ca.cer &amp;gt; entrust_g2_ca.cer&lt;br /&gt;
# curl https://web.entrust.com/root-certificates/entrust_l1m_sha2.cer &amp;gt;  entrust_l1m_sha2.cer&lt;br /&gt;
# for i in *cer; do openssl x509 -inform PEM  -in $i -outform PEM -out ${i%.cer}.crt ; done&lt;br /&gt;
# update-ca-certificates&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=CloudStack&amp;diff=7688</id>
		<title>CloudStack</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=CloudStack&amp;diff=7688"/>
		<updated>2025-06-18T15:56:54Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Re-add existing KVM bare metal host */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Apache CloudStack is open-source cloud computing software. It is used to deploy an infrastructure-as-a-service (IaaS) platform on virtualization technologies such as KVM, VMware, and Xen. It is similar to OpenStack but significantly simpler to set up and manage (albeit with fewer features).&lt;br /&gt;
&lt;br /&gt;
This page contains my notes on setting up and using CloudStack 4.15. I am by no means a CloudStack expert so take my notes here with a huge grain of salt and feel free to make corrections.&lt;br /&gt;
&lt;br /&gt;
==Installation==&lt;br /&gt;
This installation is based on CloudStack 4.15 using CentOS 8. The setup described below uses KVM and Open vSwitch. I&#039;m basing the design decisions and approach on the installation guide at [http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html#management-server-installation http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html]&lt;br /&gt;
&lt;br /&gt;
===Overview===&lt;br /&gt;
I will have 1 management node and a few bare metal nodes. All nodes will have the same processor (Intel something) and memory (24GB).&lt;br /&gt;
&lt;br /&gt;
Each node will have the same network configuration based on OpenVSwitch. There will be only 1 ethernet connection per node with various VLANs trunked to each node. The VLANs are:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Network&lt;br /&gt;
!Vlan&lt;br /&gt;
!Network subnet&lt;br /&gt;
|-&lt;br /&gt;
|Management&lt;br /&gt;
|11, untagged&lt;br /&gt;
|172.19.0.0/20&lt;br /&gt;
|-&lt;br /&gt;
|Storage&lt;br /&gt;
|3205&lt;br /&gt;
|172.22.0.0/24&lt;br /&gt;
|-&lt;br /&gt;
|Guest&lt;br /&gt;
|100 - 200&lt;br /&gt;
|n/a&lt;br /&gt;
|-&lt;br /&gt;
|Public&lt;br /&gt;
|2&lt;br /&gt;
|136.159.1.0/24&lt;br /&gt;
|}&lt;br /&gt;
The network configs for the 4 nodes I&#039;ll be using are listed below. There is also an NFS server used for primary storage. The reason for the unusual IPs is that this was set up on an existing network.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Node&lt;br /&gt;
!Networks&lt;br /&gt;
|-&lt;br /&gt;
|management&lt;br /&gt;
|Management: 172.19.12.141/20&lt;br /&gt;
Storage: 172.22.0.241/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal1&lt;br /&gt;
|Management: 172.19.12.142/20&lt;br /&gt;
Storage: 172.22.0.242/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal2&lt;br /&gt;
|Management: 172.19.12.143/20&lt;br /&gt;
Storage: 172.22.0.243/24&lt;br /&gt;
|-&lt;br /&gt;
|baremetal3&lt;br /&gt;
|Management: 172.19.12.144/20&lt;br /&gt;
Storage: 172.22.0.244/24&lt;br /&gt;
|-&lt;br /&gt;
|netapp1&lt;br /&gt;
|Storage: 172.22.0.19/24&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===Switch config===&lt;br /&gt;
For completeness, here&#039;s the configuration of the HP Procurve switch that the nodes are connected to. The switch should have all the guest VLANs defined and tagged.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = config&lt;br /&gt;
&lt;br /&gt;
# Guest VLANs&lt;br /&gt;
vlan 100 name guest100&lt;br /&gt;
vlan 101 name guest101&lt;br /&gt;
...&lt;br /&gt;
vlan 200 name guest200&lt;br /&gt;
interface 1-8 tagged vlan 100-200&lt;br /&gt;
&lt;br /&gt;
# Public, management, storage VLANs&lt;br /&gt;
vlan 2 name public&lt;br /&gt;
vlan 11 name management&lt;br /&gt;
vlan 3205 name storage&lt;br /&gt;
interface 1-8 untagged vlan 11&lt;br /&gt;
interface 1-8 tagged vlan 2,3205&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
===Node setup===&lt;br /&gt;
Each node will be set up with the following sub-steps.&lt;br /&gt;
&lt;br /&gt;
====CloudStack Repos====&lt;br /&gt;
Install CloudStack repos.{{Highlight&lt;br /&gt;
| code = # cat &amp;gt; /etc/yum.repos.d/cloudstack.repo &amp;lt;&amp;lt;EOF&lt;br /&gt;
[cloudstack]&lt;br /&gt;
name=cloudstack&lt;br /&gt;
baseurl=http://download.cloudstack.org/centos/8/4.15/&lt;br /&gt;
enabled=1&lt;br /&gt;
gpgcheck=0&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Install base packages====&lt;br /&gt;
Install all other dependencies.{{Highlight&lt;br /&gt;
| code = # yum -y install epel-release&lt;br /&gt;
# yum -y install bridge-utils net-tools&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Install OpenVSwitch from CentOS Extras:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install \&lt;br /&gt;
http://mirror.centos.org/centos/8/extras/x86_64/os/Packages/centos-release-nfv-openvswitch-1-3.el8.noarch.rpm \&lt;br /&gt;
http://mirror.centos.org/centos/8/extras/x86_64/os/Packages/centos-release-nfv-common-1-3.el8.noarch.rpm&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Disable SELinux====&lt;br /&gt;
The system should have SELinux disabled. Use &amp;lt;code&amp;gt;setenforce&amp;lt;/code&amp;gt; and edit the selinux config:{{Highlight&lt;br /&gt;
| code = # setenforce 0&lt;br /&gt;
## set SELINUX=disabled in the config so it persists across reboots&lt;br /&gt;
# sed -i &#039;s/^SELINUX=.*/SELINUX=disabled/&#039; /etc/selinux/config&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Disable firewalld====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop firewalld&lt;br /&gt;
# systemctl disable firewalld&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Configure Open vSwitch====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # echo &amp;quot;blacklist bridge&amp;quot; &amp;gt;&amp;gt; /etc/modprobe.d/local-blacklist.conf&lt;br /&gt;
# echo &amp;quot;install bridge /bin/false&amp;quot; &amp;gt;&amp;gt; /etc/modprobe.d/local-dontload.conf&lt;br /&gt;
&lt;br /&gt;
# systemctl enable --now openvswitch&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
We will be using network-scripts to configure the Open vSwitch bridges later. I removed NetworkManager but retained network-scripts to ensure NetworkManager doesn&#039;t interfere with my network setup. The install guide leaves NetworkManager around.&lt;br /&gt;
&lt;br /&gt;
I create a &#039;shared&#039; bridge that&#039;s tied to the network interface called &amp;lt;code&amp;gt;nic0&amp;lt;/code&amp;gt;. This was done to make it easier to change the bridge setup during my testing, but it could be simplified. Each of the physical networks I later set up in CloudStack gets its own individual bridge to make it obvious how VMs are connected to the network.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl add-br   nic0&lt;br /&gt;
# ovs-vsctl add-port nic0 enp4s0f0 tag=11 vlan_mode=native-untagged&lt;br /&gt;
# ovs-vsctl set port nic0 trunks=2,11,40-49,3205&lt;br /&gt;
&lt;br /&gt;
# ovs-vsctl add-br management0 nic0 11&lt;br /&gt;
# ovs-vsctl add-br cloudbr0 nic0 2&lt;br /&gt;
# ovs-vsctl add-br cloudbr1 nic0 100&lt;br /&gt;
# ovs-vsctl add-br storage0 nic0 3205&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}The node&#039;s management IP address needs to be removed from the primary network interface and then assigned to the management0 interface. If you&#039;re doing this on a node remotely, this might interrupt your connection.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ip addr del 172.19.12.141/20 dev enp4s0f0&lt;br /&gt;
# ip addr add 172.19.12.141/20 dev management0&lt;br /&gt;
# ip route add default via 172.19.0.3&lt;br /&gt;
# ip addr add 172.22.0.241/24 dev storage0&lt;br /&gt;
&lt;br /&gt;
# ip link set management0 up&lt;br /&gt;
# ip link set storage0 up&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Network configuration====&lt;br /&gt;
Once the Open vSwitch bridges are set up, configure the interfaces as follows:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Network Interface&lt;br /&gt;
!Role&lt;br /&gt;
!Configuration&lt;br /&gt;
|-&lt;br /&gt;
|enp4s0f0&lt;br /&gt;
|primary NIC in the host&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|nic0&lt;br /&gt;
|OVS bridge that connects the other bridges to the NIC&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|cloudbr0&lt;br /&gt;
|public traffic&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|cloudbr1&lt;br /&gt;
|guest traffic&lt;br /&gt;
|up on boot; no IP&lt;br /&gt;
|-&lt;br /&gt;
|management0&lt;br /&gt;
|management traffic&lt;br /&gt;
|up on boot; assigned with management IP&lt;br /&gt;
|-&lt;br /&gt;
|storage0&lt;br /&gt;
|storage traffic&lt;br /&gt;
|up on boot; assigned with storage network IP&lt;br /&gt;
|-&lt;br /&gt;
|cloud0&lt;br /&gt;
|link local traffic&lt;br /&gt;
|up on boot; assigned 169.254.0.1/16&lt;br /&gt;
|}Network configs are applied using network-scripts. The idea is to have the network interfaces configured automatically when the system boots. For interfaces that require a static IP address, I used the following network-scripts file. Adjust the device name and IP address as required.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat &amp;lt;&amp;lt;EOF &amp;gt; /etc/sysconfig/network-scripts/ifcfg-cloudbr0&lt;br /&gt;
DEVICE=cloudbr0&lt;br /&gt;
TYPE=Bridge&lt;br /&gt;
ONBOOT=yes&lt;br /&gt;
BOOTPROTO=static&lt;br /&gt;
IPV6INIT=no&lt;br /&gt;
IPV6_AUTOCONF=no&lt;br /&gt;
DELAY=5&lt;br /&gt;
IPADDR=172.16.10.2&lt;br /&gt;
GATEWAY=172.16.10.1&lt;br /&gt;
NETMASK=255.255.255.0&lt;br /&gt;
DNS1=8.8.8.8&lt;br /&gt;
DNS2=8.8.4.4&lt;br /&gt;
USERCTL=no&lt;br /&gt;
NM_CONTROLLED=no&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}For devices that don&#039;t require a static IP:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = cat &amp;lt;&amp;lt;EOF &amp;gt; ifcfg-cloudbr0&lt;br /&gt;
DEVICE=cloudbr0&lt;br /&gt;
TYPE=OVSBridge&lt;br /&gt;
DEVICETYPE=ovs&lt;br /&gt;
ONBOOT=yes&lt;br /&gt;
BOOTPROTO=none&lt;br /&gt;
HOTPLUG=no&lt;br /&gt;
NM_CONTROLLED=no&lt;br /&gt;
EOF&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Once configured, verify that your node comes up with the proper network settings on a reboot.&lt;br /&gt;
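As a sanity check after the reboot, something like the following should show all bridges with their ports and the expected addresses (a quick sketch; adjust interface names to match your setup):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## List the OVS bridges and their ports&lt;br /&gt;
# ovs-vsctl show&lt;br /&gt;
&lt;br /&gt;
## Confirm the management and storage IPs came up&lt;br /&gt;
# ip addr show management0&lt;br /&gt;
# ip addr show storage0&lt;br /&gt;
&lt;br /&gt;
## Confirm the default route points at the management gateway&lt;br /&gt;
# ip route show default&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;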
&lt;br /&gt;
===Management node setup===&lt;br /&gt;
On the management node, set up the network configs and the CloudStack management packages.&lt;br /&gt;
&lt;br /&gt;
====Setup Storage====&lt;br /&gt;
If you intend to use the management server as the primary and secondary storage, you will need to set up a NFS server. If you intend to use an external NFS server as the primary storage, you can skip this step.{{Highlight&lt;br /&gt;
| code = # mkdir -p /export/primary /export/secondary&lt;br /&gt;
# yum -y install nfs-utils&lt;br /&gt;
# cat &amp;gt; /etc/exports &amp;lt;&amp;lt;EOF&lt;br /&gt;
/export/secondary *(rw,async,no_root_squash,no_subtree_check)&lt;br /&gt;
/export/primary *(rw,async,no_root_squash,no_subtree_check)&lt;br /&gt;
EOF&lt;br /&gt;
# systemctl enable --now nfs-server&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
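Before moving on, it&#039;s worth confirming that both exports are visible (&amp;lt;code&amp;gt;showmount&amp;lt;/code&amp;gt; ships with nfs-utils); the output should look something like this:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # showmount -e localhost&lt;br /&gt;
Export list for localhost:&lt;br /&gt;
/export/primary   *&lt;br /&gt;
/export/secondary *&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;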
&lt;br /&gt;
====CloudStack management services====&lt;br /&gt;
Install MySQL. MariaDB isn&#039;t supported and the installation fails with it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # rpm -ivh http://repo.mysql.com/mysql80-community-release-el8.rpm&lt;br /&gt;
# yum -y install mysql-server&lt;br /&gt;
# yum -y install mysql-connector-python&lt;br /&gt;
&lt;br /&gt;
## edit /etc/my.cnf to have the following lines.&lt;br /&gt;
cat &amp;gt;&amp;gt; /etc/my.cnf &amp;lt;&amp;lt;EOF&lt;br /&gt;
[mysqld]&lt;br /&gt;
innodb_rollback_on_timeout=1&lt;br /&gt;
innodb_lock_wait_timeout=600&lt;br /&gt;
max_connections=350&lt;br /&gt;
log-bin=mysql-bin&lt;br /&gt;
binlog-format = &#039;ROW&#039;&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
# systemctl enable --now mysqld&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Setup CloudStack.{{Highlight&lt;br /&gt;
| code = # yum -y install cloudstack-management&lt;br /&gt;
&lt;br /&gt;
# cloudstack-setup-databases cloud:password@localhost --deploy-as=root&lt;br /&gt;
# cloudstack-setup-management&lt;br /&gt;
# systemctl enable --now cloudstack-management&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}After starting &amp;lt;code&amp;gt;cloudstack-management&amp;lt;/code&amp;gt; for the first time, it might take 2-10 minutes for the database to set up completely. During this time, the web interface won&#039;t be responsive. In the meantime, you will need to seed the system VM images to the secondary storage. If you are using an external NFS server for your secondary storage, adjust the mount point in the following command accordingly.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Seed the systemvm into secondary storage&lt;br /&gt;
# /usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /export/secondary -u https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-kvm.qcow2.bz2 -h kvm -F&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
We will continue the setup process via the web interface after setting up a bare metal node.&lt;br /&gt;
&lt;br /&gt;
===Bare metal node setup===&lt;br /&gt;
You should set up at least one bare metal node which will be used to set up your first zone and pod.&lt;br /&gt;
&lt;br /&gt;
On a bare metal node, set up everything outlined in the [[CloudStack#Node setup|Node setup]] section above. The node should have the CloudStack repos, Open vSwitch, SELinux/firewalld, and the networking configured. The agent node must have virtualization enabled on the CPU and KVM should be installed. You should be able to find &amp;lt;code&amp;gt;/dev/kvm&amp;lt;/code&amp;gt; on the system.&lt;br /&gt;
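A quick check that KVM is usable might look like this (a sketch; the CPU flag is &amp;lt;code&amp;gt;vmx&amp;lt;/code&amp;gt; on Intel and &amp;lt;code&amp;gt;svm&amp;lt;/code&amp;gt; on AMD):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Count CPU virtualization flags; should be non-zero&lt;br /&gt;
# grep -c -e vmx -e svm /proc/cpuinfo&lt;br /&gt;
&lt;br /&gt;
## The KVM device node should exist&lt;br /&gt;
# ls -l /dev/kvm&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;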
&lt;br /&gt;
====CloudStack Agent====&lt;br /&gt;
To set up the node, install the &amp;lt;code&amp;gt;cloudstack-agent&amp;lt;/code&amp;gt; package.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Configure &amp;lt;code&amp;gt;qemu&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;libvirtd&amp;lt;/code&amp;gt;. If you need some starting configs, try the following:{{Highlight&lt;br /&gt;
| code = ## edit /etc/libvirt/qemu.conf &lt;br /&gt;
vnc_listen=0.0.0.0&lt;br /&gt;
&lt;br /&gt;
## edit /etc/libvirt/libvirtd.conf&lt;br /&gt;
listen_tcp = 1&lt;br /&gt;
tcp_port = &amp;quot;16509&amp;quot;&lt;br /&gt;
listen_tls = 0&lt;br /&gt;
tls_port = &amp;quot;16514&amp;quot;&lt;br /&gt;
auth_tcp = &amp;quot;none&amp;quot;&lt;br /&gt;
mdns_adv = 0&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}The CloudStack install guide instructs you to add &amp;lt;code&amp;gt;--listen&amp;lt;/code&amp;gt; to the libvirtd arguments, but this will prevent libvirtd from starting under systemd. Instead, you should skip this step entirely because the CloudStack agent will configure this for you when you add the node to a zone.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## The install guide suggests editing /etc/sysconfig/libvirtd to use the listen flag.&lt;br /&gt;
## However, this only works if you&#039;re not using systemd or using the libvirtd-tcp socket.&lt;br /&gt;
## I skipped this step since the agent will configure this later on.&lt;br /&gt;
LIBVIRTD_ARGS=&amp;quot;--listen&amp;quot;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Start the CloudStack agent and verify that it is running. At this point, libvirtd should also be running (it&#039;s a service dependency).&lt;br /&gt;
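These steps look like the following (assuming the systemd unit names from the packages installed above):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl enable --now cloudstack-agent&lt;br /&gt;
&lt;br /&gt;
## Both the agent and libvirtd should report active (running)&lt;br /&gt;
# systemctl status cloudstack-agent libvirtd&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;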
&lt;br /&gt;
With the agent running on the node, you should now be able to add the node to the CloudStack cluster. This process will rewrite the &amp;lt;code&amp;gt;libvirtd.conf&amp;lt;/code&amp;gt; file and should set &amp;lt;code&amp;gt;listen_tcp=0&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;listen_tls=1&amp;lt;/code&amp;gt; for you (so that libvirt traffic such as migrations goes over TLS rather than plain TCP). &lt;br /&gt;
&lt;br /&gt;
====Allow sudo access====&lt;br /&gt;
Ensure that &amp;lt;code&amp;gt;/etc/sudoers&amp;lt;/code&amp;gt; does not require TTY. In the older documentation, CloudStack requires that the &#039;cloud&#039; user be able to sudo with the addition of &amp;lt;code&amp;gt;Defaults:cloud !requiretty&amp;lt;/code&amp;gt;. However, looking at the installation on the CentOS 8 box, the agent actually runs as root, so perhaps root needs to be able to sudo? &lt;br /&gt;
&lt;br /&gt;
===Setting up your first zone===&lt;br /&gt;
At this point in the process, you should have at least one bare metal host, and your management node should be up and serving the CloudStack web UI at http://cloudstack:8080/client. Log in using the default &amp;lt;code&amp;gt;admin&amp;lt;/code&amp;gt; / &amp;lt;code&amp;gt;password&amp;lt;/code&amp;gt; credentials.&lt;br /&gt;
&lt;br /&gt;
You will be greeted with a setup wizard. I have had no luck with this and it&#039;s better to ignore it. Instead, navigate to Infrastructure -&amp;gt; zones and manually set up your first zone. &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Description&lt;br /&gt;
!Screenshot&lt;br /&gt;
|-&lt;br /&gt;
|There are 3 types of zones that you can create:&lt;br /&gt;
&lt;br /&gt;
#&#039;&#039;&#039;Basic zone&#039;&#039;&#039; - All guest VMs are placed on a single shared flat network. There is no isolation or security policies in place to prevent guest VMs from seeing each other.&lt;br /&gt;
#&#039;&#039;&#039;Advanced zone&#039;&#039;&#039; - Guest VMs can be placed in one or more VLAN-based networks. Guest networks can either be isolated or L2. Isolated networks (depending on the chosen network offering) come with a virtual router (VR) which offers NAT/SNAT and firewall services and uses one or more public IP addresses. L2 networks are similar but don&#039;t have a virtual router; these services must instead be offered externally. Tenants can also create something called a virtual private cloud (VPC). A VPC is like a regular isolated guest network but with additional features. A VPC allows the user to:&lt;br /&gt;
##Create multiple subnets (called tiers) which can route to each other&lt;br /&gt;
##Control network traffic between tiers through Network ACLs&lt;br /&gt;
##Associate one or more public IPs to the VPC&lt;br /&gt;
##NAT all subnets out through a single public IP, like an isolated guest network&lt;br /&gt;
##Create a private gateway (and therefore static routes) within the VPC&lt;br /&gt;
##Create a VPN connection to the VPC&lt;br /&gt;
#&#039;&#039;&#039;Advanced zone with security groups&#039;&#039;&#039; - Guest VMs are placed on a shared network that is publicly routable. There is no concept of a &#039;public&#039; network because the guest network should also be public. As a result, there is no ability to create any other kind of guest networks or VPCs. The only benefit here is the ability to define security groups per-VM (which is implemented via IPTables on the bare metal host). Because enabling security groups in a zone will restrict that zone from being able to create isolated guest networks or VPCs, the security group feature only appears useful in an environment where guests only need to connect to the internet.&lt;br /&gt;
&lt;br /&gt;
Be aware of each type&#039;s limitations before continuing.&lt;br /&gt;
&lt;br /&gt;
We will be creating an advanced network zone.&lt;br /&gt;
|[[File:CloudStack - New Zone 1.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|We will add the DNS resolvers for the zone and specify the hypervisor type (KVM). &lt;br /&gt;
Empty the guest CIDR since we&#039;re going to allow users to specify their own.&lt;br /&gt;
|[[File:CloudStack - New Zone 2.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|When using the advanced zone, you need to specify the physical networks for the management, storage, and public networks. &lt;br /&gt;
These should correspond to the physical network devices on the hypervisor. Recall that in the previous step where we set up the Open vSwitch bridges, we created the following bridges for each role:&lt;br /&gt;
&lt;br /&gt;
*management - management0&lt;br /&gt;
*storage - storage0&lt;br /&gt;
*public - cloudbr0&lt;br /&gt;
*guest - cloudbr1&lt;br /&gt;
|&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:CloudStack - New Zone 3.png&lt;br /&gt;
File:CloudStack - New Zone 3a.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Specify the public network. The addresses defined here populate the &#039;Public IP&#039; pool.&lt;br /&gt;
&lt;br /&gt;
All isolated guest networks and all VPCs will use one of the addresses defined in this pool for the SNAT/NAT. The addresses specified here should therefore be accessible from the internet.&lt;br /&gt;
|[[File:CloudStack - New Zone 4.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Create a new pod. &lt;br /&gt;
&lt;br /&gt;
The pod network here should cover your management network subnet. The reserved IP addresses here will be used by system VMs that require access to the management network. &lt;br /&gt;
|[[File:CloudStack - New Zone 5.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify the guest network VLAN range.&lt;br /&gt;
&lt;br /&gt;
Because we&#039;re using VLAN as an isolation method, this range specifies what VLANs the guest networks will use over the guest physical network.&lt;br /&gt;
|[[File:CloudStack - New Zone 6.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify the storage network.&lt;br /&gt;
&lt;br /&gt;
The reserved start/end IPs will be used by system VMs that require access to the primary storage. &lt;br /&gt;
&lt;br /&gt;
|If you are assigning static IPs on your bare metal hosts, ensure that the reserved addresses don&#039;t overlap with the IP range specified here (I once had CloudStack assign a VM the same IP as a bare metal host)&lt;br /&gt;
|[[File:CloudStack - New Zone 7.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify a cluster name.&lt;br /&gt;
|[[File:CloudStack - New Zone 8.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Add your first bare metal host. &lt;br /&gt;
You must add one host now and can add additional ones later.&lt;br /&gt;
|[[File:CloudStack - New Zone 9.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify your primary storage.&lt;br /&gt;
The server should be accessible from the storage network.&lt;br /&gt;
|[[File:CloudStack - New Zone 10.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Specify your secondary storage.&lt;br /&gt;
&lt;br /&gt;
You need to have at least one NFS secondary storage that has been seeded with the system VM template. &lt;br /&gt;
&lt;br /&gt;
Secondary storage pools should be accessible from the management network (&#039;&#039;&#039;confirm&#039;&#039;&#039;?)&lt;br /&gt;
|[[File:CloudStack - New Zone 11.png|left|thumb]]&lt;br /&gt;
|-&lt;br /&gt;
|Launch the zone.&lt;br /&gt;
This step might take a few minutes. If all goes well, you can then enable the zone shortly after. If you run into any problems, check the logs on the management node at /var/log/cloudstack/management.&lt;br /&gt;
|[[File:CloudStack - New Zone 12.png|left|thumb]]&lt;br /&gt;
|}&lt;br /&gt;
Once your zone has been enabled, it should automatically start a Console Proxy VM and secondary storage VM. You can find this under Infrastructure -&amp;gt; System VMs. If for some reason the System VMs are not starting, check that your systemvm template is available in your secondary storage and that the cloud0 bridge on each host is up. You should be able to ping the link local IP address (the 169.254.x.x address) from the hypervisor.&lt;br /&gt;
&lt;br /&gt;
Once the two system VMs are running, verify that you&#039;re able to create new guest networks or VPCs. These networks should create a virtual router. &lt;br /&gt;
&lt;br /&gt;
==Configuration==&lt;br /&gt;
&lt;br /&gt;
===Service offerings===&lt;br /&gt;
&lt;br /&gt;
====Deployment planner====&lt;br /&gt;
There are a few deployment techniques that can be used. These are set within a compute offering and cannot be changed after it&#039;s been created (&#039;&#039;&#039;really?&#039;&#039;&#039; can we change it via API?). The options are:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Deployment planner&lt;br /&gt;
!Description&lt;br /&gt;
|-&lt;br /&gt;
|First fit&lt;br /&gt;
|Placed on the first host that has sufficient capacity&lt;br /&gt;
|-&lt;br /&gt;
|User dispersing&lt;br /&gt;
|Evenly distributes VMs by account across clusters&lt;br /&gt;
|-&lt;br /&gt;
|User concentrated&lt;br /&gt;
|Opposite of the above.&lt;br /&gt;
|-&lt;br /&gt;
|Implicit dedication&lt;br /&gt;
|requires or prefers (depending on planner mode) a dedicated host&lt;br /&gt;
|-&lt;br /&gt;
|Bare metal&lt;br /&gt;
|requires a bare metal host&lt;br /&gt;
|}&lt;br /&gt;
More information from [https://docs.cloudstack.apache.org/en/4.15.2.0/adminguide/service_offerings.html#compute-and-disk-service-offerings CloudStack&#039;s documentation on Compute and Disk Service Offerings].&lt;br /&gt;
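As a sketch, a compute offering with a specific planner can be created via CloudMonkey (the &amp;lt;code&amp;gt;deploymentplanner&amp;lt;/code&amp;gt; parameter and planner name here are assumed from the createServiceOffering API; verify against your CloudStack version):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cmk create serviceoffering name=dispersed-small displaytext=&amp;quot;1 vCPU, 1GB, user dispersing&amp;quot; cpunumber=1 cpuspeed=1000 memory=1024 deploymentplanner=UserDispersingPlanner&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;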
&lt;br /&gt;
===Enable SAML2 authentication===&lt;br /&gt;
Enable the SAML2 plugin by setting &amp;lt;code&amp;gt;saml2.enabled=true&amp;lt;/code&amp;gt; under Global Settings. &lt;br /&gt;
&lt;br /&gt;
Set up SAML authentication by specifying the following settings:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Setting&lt;br /&gt;
!Description&lt;br /&gt;
!Example value&lt;br /&gt;
|-&lt;br /&gt;
|saml2.default.idpid&lt;br /&gt;
|The URL of the identity provider. This is likely obtained from the metadata URL and set by the SAML2 plugin every time CloudStack starts.&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://sts.windows.net/c609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.idp.metadata.url&lt;br /&gt;
|The metadata XML URL&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://login.microsoftonline.com/609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/federationmetadata/2007-06/federationmetadata.xml?appid=c5b8df24-xxx-xxx-xxx-xxxxxxxxxxxx&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.sp.id&lt;br /&gt;
|The identifier string for this application&lt;br /&gt;
|cloudstack-test.my-organization.tld&lt;br /&gt;
|-&lt;br /&gt;
|saml2.redirect.url&lt;br /&gt;
|The redirect URL using your cloudstack domain.&lt;br /&gt;
|&amp;lt;nowiki&amp;gt;https://cloudstack-test.my-organization.tld/client&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|saml2.user.attribute&lt;br /&gt;
|The attribute to use as the username. &lt;br /&gt;
If you&#039;re not sure what&#039;s available, look at the management logs after a login attempt.&lt;br /&gt;
|For Azure AD, use the email address attribute: &amp;lt;nowiki&amp;gt;http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
Restart the management server. To allow a user access, create the user and enable SSO. The user&#039;s username must match the value that&#039;s obtained from the saml2.user.attribute field.&lt;br /&gt;
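The restart plus a quick look at the logs (log file path assumed from the default packaging):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl restart cloudstack-management&lt;br /&gt;
&lt;br /&gt;
## Watch for the SAML metadata being fetched on startup&lt;br /&gt;
# tail -f /var/log/cloudstack/management/management-server.log&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;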
&lt;br /&gt;
====Bugs====&lt;br /&gt;
&lt;br /&gt;
=====SAML Request being rejected by Azure AD=====&lt;br /&gt;
If you are using Azure AD, you may have issues authenticating because the SAML request ID that&#039;s generated might begin with a number. When this happens, you will get an error similar to: &amp;lt;code&amp;gt;AADSTS7500529: The value &#039;692rv91k6dgmdas33vr3b2keahr4lqjv&#039; is not a valid SAML ID. The ID must not begin with a number.&amp;lt;/code&amp;gt;. For more information, see: https://github.com/apache/cloudstack/issues/5548&lt;br /&gt;
&lt;br /&gt;
=====Users cannot login via SSO=====&lt;br /&gt;
Users that will be using SAML for authentication will need to have their CloudStack accounts created with SSO enabled. There seems to be a bug with the CloudStack web UI where a user&#039;s SAML IdPID isn&#039;t settable (it gets set to a &#039;0&#039;). A work-around would be to create and authorize users via CloudMonkey.&lt;br /&gt;
&lt;br /&gt;
The steps on adding a new user are:&lt;br /&gt;
&lt;br /&gt;
#Create the user:  &amp;lt;code&amp;gt;create user firstname=First lastname=User email=user1@ucalgary.ca username=user1@ucalgary.ca account=RCS state=enabled password=asdf&amp;lt;/code&amp;gt;&lt;br /&gt;
#Find the user&#039;s ID: &amp;lt;code&amp;gt;list users domainid=&amp;lt;tab&amp;gt; filter=username,id&amp;lt;/code&amp;gt;&lt;br /&gt;
#Authorize the user: &amp;lt;code&amp;gt;authorize samlsso enable=true entityid=&amp;lt;nowiki&amp;gt;https://sts.windows.net/c609a0ec-xxx-xxx-xxx-xxxxxxxxxxxx/&amp;lt;/nowiki&amp;gt; userid=user-id&amp;lt;/code&amp;gt;&lt;br /&gt;
#Verify that the user is enabled for SSO: &amp;lt;code&amp;gt;list samlauthorization filter=userid,idpid,status&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When authorizing a user, the entityid must be the URL of the identity provider. The trailing slash is also mandatory.&lt;br /&gt;
&lt;br /&gt;
===Enable SSL===&lt;br /&gt;
A few things to note about enabling SSL:&lt;br /&gt;
&lt;br /&gt;
*If you added hosts via IP address, enabling SSL would likely break the management-to-client connection. You might need to re-add the host so that the certificates all match up.&lt;br /&gt;
* On CloudStack 4.16, the button to upload a new certificate in the SSL dialog box does not work. This is fixed in 4.16.1.&lt;br /&gt;
&lt;br /&gt;
==== Preparing your SSL certificates ====&lt;br /&gt;
First, generate a private key and certificate signing request, then obtain your SSL certificate from a certificate authority. For a typical CloudStack installation, you should obtain SSL certificates for both your management server and your console proxy.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # openssl genrsa -out server.key 4096&lt;br /&gt;
# openssl req -new -sha256 \&lt;br /&gt;
         -key server.key \&lt;br /&gt;
         -subj &amp;quot;/C=CA/ST=Alberta/O=Steamr/CN=cloudstack-test.example.com&amp;quot; \&lt;br /&gt;
         -reqexts SAN \&lt;br /&gt;
         -extensions SAN \&lt;br /&gt;
         -config &amp;lt;(cat /etc/pki/tls/openssl.cnf &amp;lt;(printf &amp;quot;[SAN]\nsubjectAltName=DNS:cloudstack-test-console.example.com&amp;quot;)) \&lt;br /&gt;
         -out server.csr&lt;br /&gt;
## With the server.csr file, upload it to your Certificate Authority to obtain a signed certificate.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Your certificate authority should have given you your signed certificate as well as the root and any other intermediate certificates in an X.509 (.crt) format. If you need to self-sign this certificate signing request, do the following:{{Highlight&lt;br /&gt;
| code = ## Run the following only if you want to self sign your certificate&lt;br /&gt;
## Make your root CA&lt;br /&gt;
# openssl genrsa -des3 -out rootCA.key 4096&lt;br /&gt;
# openssl req -x509 -new -subj &amp;quot;/C=CA/ST=Alberta/O=Steamr/CN=example.com&amp;quot; -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.crt&lt;br /&gt;
&lt;br /&gt;
## Sign the certificate&lt;br /&gt;
# openssl x509 -req -in server.csr -CA rootCA.crt -CAkey rootCA.key -CAcreateserial -out server.crt -days 500 -sha256&lt;br /&gt;
&lt;br /&gt;
## Check the certificate&lt;br /&gt;
# openssl x509 -in server.crt -text -noout&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Next, you need to convert your certificate into a PKCS12 format and your private key into a PKCS8 format. This is the only format that works with the CloudStack management server. We place the PKCS12 keystore file at &amp;lt;code&amp;gt;/etc/cloudstack/management/ssl_keystore.pkcs12&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Combine Files&lt;br /&gt;
# cat server.key server.crt intermediate.crt root.crt &amp;gt; combined.crt&lt;br /&gt;
&lt;br /&gt;
## Create keystore&lt;br /&gt;
## You may use &#039;password&#039; as the password&lt;br /&gt;
# openssl pkcs12 -in combined.crt -export -out combined.pkcs12&lt;br /&gt;
&lt;br /&gt;
## Import keystore&lt;br /&gt;
## Provide the same password above. Eg. &#039;password&#039;&lt;br /&gt;
# keytool -importkeystore -srckeystore combined.pkcs12 -srcstoretype PKCS12 -destkeystore /etc/cloudstack/management/ssl_keystore.pkcs12 -deststoretype pkcs12&lt;br /&gt;
&lt;br /&gt;
## Convert the private key into PKCS8 format&lt;br /&gt;
## Provide the same password above. Eg. &#039;password&#039;&lt;br /&gt;
# openssl pkcs8 -topk8 -in server.key -out server.pkcs8.encrypted.key&lt;br /&gt;
# openssl pkcs8 -in server.pkcs8.encrypted.key -out server.pkcs8.key&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Upload SSL certificates ====&lt;br /&gt;
You can upload SSL certificates to CloudStack under Infrastructure -&amp;gt; Summary by clicking on the &#039;SSL Certificates&#039; button. Provide the root certificate authority, the certificate, the private key (in PKCS8 format), and the domain that the certificate applies to. Wildcard domains should be specified as &amp;lt;code&amp;gt;*.example.com&amp;lt;/code&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Alternatively, you may use the CloudMonkey tool to upload certificates using the file parameter passing feature like so:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=1 name=root certificate=@root.crt&lt;br /&gt;
# cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=2 name=intermediate1 certificate=@intermediate.crt&lt;br /&gt;
# cmk upload customcertificate domainsuffix=cloudstack.steamr.com id=3 privatekey=@server.pkcs8.key certificate=@domain.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Enabling HTTPS ====&lt;br /&gt;
Next, you will need to enable HTTPS on both the management console and console proxy.&lt;br /&gt;
&lt;br /&gt;
Note that you may enable the HTTPS setting only after at least one certificate has been uploaded. If the server has no certificates, the option is ignored.&lt;br /&gt;
&lt;br /&gt;
===== Enable HTTPS on the management console =====&lt;br /&gt;
The management console can be configured by editing &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; with the following lines. Set the keystore password to the same password you used above to import it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = https.enable=true&lt;br /&gt;
https.port=8443&lt;br /&gt;
https.keystore=/etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
https.keystore.password=password&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Restart the management server for this to apply.&lt;br /&gt;
{{Info&lt;br /&gt;
| title = Why port 8443?&lt;br /&gt;
| message = Because CloudStack runs under a non-root account, it can only bind to high port numbers (&amp;gt; 1024). &lt;br /&gt;
&lt;br /&gt;
You can still have CloudStack visible on port 443 if you use an [[IPTables#Mapping incoming traffic to a different internal port|IPTables rule]].&lt;br /&gt;
}}&lt;br /&gt;
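As a sketch, such a redirect rule could look like the following (assuming iptables is available and the management server listens on 8443):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Redirect incoming port 443 traffic to CloudStack on 8443&lt;br /&gt;
# iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-ports 8443&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;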
Confirm that you are able to reach your management console via HTTPS.&lt;br /&gt;
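For example, with curl (replace the hostname with your management server&#039;s; &amp;lt;code&amp;gt;-k&amp;lt;/code&amp;gt; skips verification, which is useful if you self-signed):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # curl -kv https://cloudstack-test.example.com:8443/client&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;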
&lt;br /&gt;
==== Enable HTTPS on the console proxy ====&lt;br /&gt;
If you enable SSL on the management console, you will also need to enable SSL on the console proxies for the VNC web sockets to work properly. If your management console certificate (from the previous sections) contains a Subject Alternative Name (SAN) or is a wildcard certificate that includes your console proxy&#039;s DNS name, SSL for the console proxy should already work. If your certificates do not include the console proxy&#039;s DNS name, you will need to obtain another SSL certificate, add it to the SSL keystore, and upload it to CloudStack using the same instructions above.&lt;br /&gt;
&lt;br /&gt;
==== Renewing SSL certificate ====&lt;br /&gt;
To renew an SSL certificate, you&#039;ll have to ensure that the keystore is updated and also upload the certificate via CloudMonkey or the management console (under Summary -&amp;gt; Certificates).&lt;br /&gt;
&lt;br /&gt;
In a folder containing your certificate (&amp;lt;code&amp;gt;server.crt&amp;lt;/code&amp;gt;), intermediate and root certificates (&amp;lt;code&amp;gt;intermediate.crt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;root.crt&amp;lt;/code&amp;gt;), and also your private key (&amp;lt;code&amp;gt;server.key&amp;lt;/code&amp;gt;), run the following to update your SSL keystore and upload the certificates via cmk:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat server.key server.crt intermediate.crt root.crt &amp;gt; combined.crt&lt;br /&gt;
&lt;br /&gt;
## Note the keystore location that&#039;s defined in your configs&lt;br /&gt;
# grep https.keystore /etc/cloudstack/management/server.properties&lt;br /&gt;
https.keystore=/etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
https.keystore.password=xxxxxxx&lt;br /&gt;
&lt;br /&gt;
## Create keystore and import your certificate into it.&lt;br /&gt;
# openssl pkcs12 -in combined.crt -export -out combined.pkcs12&lt;br /&gt;
# keytool -importkeystore -srckeystore combined.pkcs12 -srcstoretype PKCS12 -destkeystore ssl_keystore.pkcs12 -deststoretype pkcs12&lt;br /&gt;
&lt;br /&gt;
## Move the keystore over the existing one as defined in the server config. You may want to backup the old one just in case.&lt;br /&gt;
# mv /etc/cloudstack/management/ssl_keystore.pkcs12 /etc/cloudstack/management/ssl_keystore.pkcs12-org&lt;br /&gt;
# mv ssl_keystore.pkcs12 /etc/cloudstack/management/ssl_keystore.pkcs12&lt;br /&gt;
&lt;br /&gt;
## Convert your key to pkcs8 if you haven&#039;t already done so. Use the same password for both commands.&lt;br /&gt;
# openssl pkcs8 -topk8 -in server.key -out server.pkcs8.key-encrypted&lt;br /&gt;
# openssl pkcs8 -in server.pkcs8.key-encrypted -out server.pkcs8.key&lt;br /&gt;
&lt;br /&gt;
## Upload your certificate&lt;br /&gt;
# for domain in $(openssl x509 -in server.crt -text -noout {{!}} grep DNS: {{!}} tr -d , {{!}} sed &#039;s/DNS://g&#039;) ; do&lt;br /&gt;
        echo &amp;quot;Uploading domain for $domain&amp;quot;&lt;br /&gt;
&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=1 name=root certificate=@root.crt&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=2 name=intermediate1 certificate=@intermediate.crt&lt;br /&gt;
        cmk upload customcertificate domainsuffix=$domain id=3 privatekey=@server.pkcs8.key certificate=@server.crt&lt;br /&gt;
done&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Tasks ==&lt;br /&gt;
===Re-add existing KVM bare metal host===&lt;br /&gt;
Once a host has been added to CloudStack, the CloudStack agent will have generated some public/private keys and configured itself to talk to the management node. If you need to remove and re-add a host, you will need to clean up the agent before re-adding the host to CloudStack. Based on my experience, I had to do the following:&lt;br /&gt;
&lt;br /&gt;
#Before removing the host from CloudStack, drain it of all VMs. &amp;lt;code&amp;gt;virsh list&amp;lt;/code&amp;gt; should be empty. If not and you&#039;ve removed the host from the management server already, manually kill each VM with &amp;lt;code&amp;gt;virsh destroy&amp;lt;/code&amp;gt;.&lt;br /&gt;
#&amp;lt;code&amp;gt;systemctl stop cloudstack-agent&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;rm -rf /etc/cloudstack/agent/cloud*&amp;lt;/code&amp;gt;&lt;br /&gt;
#unmount any primary storages with &amp;lt;code&amp;gt;umount /mnt/*&amp;lt;/code&amp;gt; and clean up with &amp;lt;code&amp;gt;rmdir /mnt/*&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;systemctl stop libvirtd&amp;lt;/code&amp;gt;&lt;br /&gt;
#&amp;lt;code&amp;gt;rm -rf /var/lib/libvirt/qemu&amp;lt;/code&amp;gt;&lt;br /&gt;
#You may need to edit &amp;lt;code&amp;gt;/etc/sysconfig/libvirtd&amp;lt;/code&amp;gt; to remove the listen flag, as it can prevent libvirtd (and subsequently cloudstack-agent) from starting.&lt;br /&gt;
#Edit &amp;lt;code&amp;gt;/etc/cloudstack/agent/agent.properties&amp;lt;/code&amp;gt; and remove the keystore passphrase, any UUIDs, cluster/pod/zone, and the host. You should keep the guid or regenerate it with uuidgen. You should also keep the public/private/guest network devices set.&lt;br /&gt;
#Restart with &amp;lt;code&amp;gt;systemctl start cloudstack-agent&amp;lt;/code&amp;gt; (libvirt should come up automatically as it&#039;s a dependency). Ensure that it comes up OK.&lt;br /&gt;
&lt;br /&gt;
You may then re-add the host back to CloudStack.&lt;br /&gt;
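&lt;br /&gt;
As a rough sketch (not verbatim from my shell history), the cleanup on the host boils down to:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop cloudstack-agent&lt;br /&gt;
# rm -rf /etc/cloudstack/agent/cloud*&lt;br /&gt;
# umount /mnt/* &amp;amp;&amp;amp; rmdir /mnt/*&lt;br /&gt;
# systemctl stop libvirtd&lt;br /&gt;
# rm -rf /var/lib/libvirt/qemu&lt;br /&gt;
## Clean up agent.properties (keep the guid and network devices), then:&lt;br /&gt;
# systemctl start cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;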
&lt;br /&gt;
===Building RPMs===&lt;br /&gt;
To build the RPM packages from scratch, you&#039;ll need to install a bunch of dependencies and then run the build script. For more information, see:&lt;br /&gt;
&lt;br /&gt;
*&amp;lt;nowiki&amp;gt;https://docs.cloudstack.apache.org/en/4.15.2.0/installguide/building_from_source.html#building-rpms-from-source&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum groupinstall &amp;quot;Development Tools&amp;quot;&lt;br /&gt;
# yum install java-11-openjdk-devel genisoimage mysql mysql-server createrepo&lt;br /&gt;
# yum install epel-release&lt;br /&gt;
&lt;br /&gt;
# curl -sL https://rpm.nodesource.com/setup_12.x {{!}} sudo bash -&lt;br /&gt;
# yum install nodejs&lt;br /&gt;
&lt;br /&gt;
# cat &amp;lt;&amp;lt;EOF &amp;gt; /etc/yum.repos.d/mysql.repo&lt;br /&gt;
[mysql-community]&lt;br /&gt;
name=MySQL Community connectors&lt;br /&gt;
baseurl=http://repo.mysql.com/yum/mysql-connectors-community/el/$releasever/$basearch/&lt;br /&gt;
gpgkey=http://repo.mysql.com/RPM-GPG-KEY-mysql&lt;br /&gt;
enabled=1&lt;br /&gt;
gpgcheck=1&lt;br /&gt;
EOF&lt;br /&gt;
# yum -y install mysql-connector-python&lt;br /&gt;
&lt;br /&gt;
## Enable the PowerTools repository&lt;br /&gt;
# dnf config-manager --set-enabled powertools&lt;br /&gt;
&lt;br /&gt;
# yum install jpackage-utils maven&lt;br /&gt;
&lt;br /&gt;
# git clone https://github.com/apache/cloudstack.git&lt;br /&gt;
# cd cloudstack&lt;br /&gt;
# git checkout 4.15&lt;br /&gt;
&lt;br /&gt;
# cd packaging&lt;br /&gt;
# sh package.sh --distribution centos8&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Rebuilding UI ===&lt;br /&gt;
Instructions to rebuild the UI are available on the [https://github.com/apache/cloudstack/tree/main/ui README file under the ui directory.]&lt;br /&gt;
&lt;br /&gt;
In summary, to rebuild the CloudStack UI, such as to test a new feature or bug fix, you must compile the VueJS + Ant Design files and have the CloudStack management server serve the resulting files. This is accomplished by doing the following:&lt;br /&gt;
&lt;br /&gt;
# On a server with npm installed, clone the CloudStack git repo and checkout the branch with the UI fix/feature&lt;br /&gt;
# Navigate to &amp;lt;code&amp;gt;cloudstack/ui/&amp;lt;/code&amp;gt;&lt;br /&gt;
# Run &amp;lt;code&amp;gt;npm install&amp;lt;/code&amp;gt;&lt;br /&gt;
# Run &amp;lt;code&amp;gt;npm run build&amp;lt;/code&amp;gt;&lt;br /&gt;
# Copy the &amp;lt;code&amp;gt;dist/&amp;lt;/code&amp;gt; directory to &amp;lt;code&amp;gt;/usr/share/cloudstack-management/webapp/&amp;lt;/code&amp;gt;&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; and make sure that &amp;lt;code&amp;gt;webapp.dir&amp;lt;/code&amp;gt; is set to: &amp;lt;code&amp;gt;webapp.dir=/usr/share/cloudstack-management/webapp&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Restart the CloudStack management service and then reload the console page. Ensure that the vue app isn&#039;t cached.&lt;br /&gt;
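&lt;br /&gt;
The steps above can be sketched as follows (the branch name is a placeholder; use the branch with your fix):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ git clone https://github.com/apache/cloudstack.git&lt;br /&gt;
$ cd cloudstack &amp;amp;&amp;amp; git checkout my-ui-fix-branch&lt;br /&gt;
$ cd ui&lt;br /&gt;
$ npm install&lt;br /&gt;
$ npm run build&lt;br /&gt;
# cp -r dist/* /usr/share/cloudstack-management/webapp/&lt;br /&gt;
# systemctl restart cloudstack-management&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;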
&lt;br /&gt;
==== Installing and using CloudStack&#039;s prebuilt UI ====&lt;br /&gt;
The cloudstack-ui package contains the prebuilt CloudStack UI. The UI files are placed under &amp;lt;code&amp;gt;/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
As of CloudStack 4.18, when I tried using this prebuilt package, I had to do the following things:&lt;br /&gt;
&lt;br /&gt;
* Edit &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt; and set &amp;lt;code&amp;gt;webapp.dir=/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;&lt;br /&gt;
* In &amp;lt;code&amp;gt;/usr/share/cloudstack-ui&amp;lt;/code&amp;gt;, run: &amp;lt;code&amp;gt;find . -type d -exec chmod -v o+x {} \;&amp;lt;/code&amp;gt; because the directories aren&#039;t executable by the &#039;cloud&#039; user.&lt;br /&gt;
* Create &amp;lt;code&amp;gt;/usr/share/cloudstack-ui/WEB-INF&amp;lt;/code&amp;gt; and place this [https://github.com/apache/cloudstack/blob/main/client/src/main/webapp/WEB-INF/web.xml web.xml file] within. Otherwise, requests to the API break.&lt;br /&gt;
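&lt;br /&gt;
As a sketch, these fixes amount to the following (assuming the package&#039;s default paths):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cd /usr/share/cloudstack-ui&lt;br /&gt;
# find . -type d -exec chmod -v o+x {} \;&lt;br /&gt;
# mkdir WEB-INF&lt;br /&gt;
## Place the web.xml from the CloudStack repo into WEB-INF/, then set&lt;br /&gt;
## webapp.dir=/usr/share/cloudstack-ui in server.properties and restart&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;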
&lt;br /&gt;
===Usage server===&lt;br /&gt;
Install &amp;lt;code&amp;gt;cloudstack-usage&amp;lt;/code&amp;gt;. Start it and restart the management server. Set &amp;lt;code&amp;gt;enable.usage.server=true&amp;lt;/code&amp;gt; in global settings.&lt;br /&gt;
&lt;br /&gt;
The usage data will be stored in the usage database on your management server. Metrics are gathered daily and can be viewed through Cloud Monkey. There is no option to view this data in the management console.&lt;br /&gt;
&lt;br /&gt;
The collected data is coarse in nature, but it should be sufficient for you to determine an account&#039;s or VM&#039;s resource utilization over a period of a day or more, and good enough to implement a rough billing / showback amount.&lt;br /&gt;
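&lt;br /&gt;
For example, to pull a month of usage records with CloudMonkey (dates and account name are placeholders; account filtering may also require a domainid):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ cmk list usagerecords startdate=2023-01-01 enddate=2023-01-31 account=myaccount&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;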
&lt;br /&gt;
===Adding some Linux templates===&lt;br /&gt;
You can add the &amp;quot;Generic Cloud&amp;quot; qcow2 disk images as templates to CloudStack.&lt;br /&gt;
&lt;br /&gt;
Because these cloud images use cloud-init, you will need to provide some custom userdata when deploying them. Userdata will only work when the VM is deployed on a network that offers the &amp;quot;User Data&amp;quot; service offering. If you can&#039;t use userdata or if you want the VMs to come up with a specific root password, you can use [[virt-customize]] to set the root password on the qcow2 file.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Distro&lt;br /&gt;
!Type&lt;br /&gt;
!URL&lt;br /&gt;
|-&lt;br /&gt;
|Rocky Linux 8.4&lt;br /&gt;
|CentOS 8&lt;br /&gt;
|https://download.rockylinux.org/pub/rocky/8.4/images/Rocky-8-GenericCloud-8.4-20210620.0.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|CentOS 8.4&lt;br /&gt;
|CentOS 8&lt;br /&gt;
|https://cloud.centos.org/centos/8/x86_64/images/CentOS-8-GenericCloud-8.4.2105-20210603.0.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|Fedora 34&lt;br /&gt;
|Fedora Linux (64 bit)&lt;br /&gt;
|https://download.fedoraproject.org/pub/fedora/linux/releases/34/Cloud/x86_64/images/Fedora-Cloud-Base-34-1.2.x86_64.qcow2&lt;br /&gt;
|-&lt;br /&gt;
|Ubuntu Server 21.04&lt;br /&gt;
|&lt;br /&gt;
|http://cloud-images.ubuntu.com/hirsute/current/hirsute-server-cloudimg-amd64.img&lt;br /&gt;
The .img file is already in qcow2 format; create a resized qcow2 overlay backed by it with qemu-img:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;qemu-img create -F qcow2 -b cloudimg-amd64.img -f qcow2 cloudimg-amd64.qcow2 10G&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
Here&#039;s an example of a cloud-init configuration which you would put in the userdata field when deploying a VM:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = #cloud-config&lt;br /&gt;
hostname: vm01&lt;br /&gt;
manage_etc_hosts: true&lt;br /&gt;
users:&lt;br /&gt;
  - name: vmadm&lt;br /&gt;
    sudo: ALL=(ALL) NOPASSWD:ALL&lt;br /&gt;
    groups: users, admin&lt;br /&gt;
    home: /home/vmadm&lt;br /&gt;
    shell: /bin/bash&lt;br /&gt;
    lock_passwd: false&lt;br /&gt;
ssh_pwauth: true&lt;br /&gt;
disable_root: false&lt;br /&gt;
chpasswd:&lt;br /&gt;
  list: {{!}}&lt;br /&gt;
    vmadm:vmadm&lt;br /&gt;
  expire: false&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Importing a VMware Virtual Machine ===&lt;br /&gt;
To import a VMware virtual machine:&lt;br /&gt;
&lt;br /&gt;
# Copy the virtual machine&#039;s .vmdk disk file to a CloudStack node&lt;br /&gt;
# Convert the .vmdk into qcow2 format using the qemu-img convert command. Eg. &amp;lt;code&amp;gt;qemu-img convert -f vmdk -O qcow2 linux.vmdk linux.qcow2&amp;lt;/code&amp;gt;&lt;br /&gt;
# Run the file command on the qcow disk and make a note of its size (in bytes).&lt;br /&gt;
# Create a new virtual machine in CloudStack. Use an ISO and not a template. Set the size of the VM&#039;s ROOT disk to match the disk size noted from the previous step.&lt;br /&gt;
# Start and stop the VM to ensure the virtual disk is created. Make a note of the virtual disk&#039;s ID.&lt;br /&gt;
# Copy the converted qcow disk over the existing virtual disk image in the primary storage.&lt;br /&gt;
# Restart the VM in CloudStack.&lt;br /&gt;
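&lt;br /&gt;
A minimal sketch of the conversion and size check (file names and the storage path are placeholders):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # qemu-img convert -f vmdk -O qcow2 linux.vmdk linux.qcow2&lt;br /&gt;
## Note the virtual size (in bytes) for the new VM&#039;s ROOT disk&lt;br /&gt;
# file linux.qcow2&lt;br /&gt;
## After the first start/stop cycle, overwrite the ROOT disk image&lt;br /&gt;
# cp linux.qcow2 /path/to/primary/storage/&amp;lt;virtual-disk-id&amp;gt;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;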
&lt;br /&gt;
Some things to note with this process:&lt;br /&gt;
&lt;br /&gt;
* The disk subsystem might differ between KVM and VMware. As a result, you may need to [[Rebuilding the initial ramdisk|rebuild the initrd file]] so that it has the necessary drivers to boot properly.&lt;br /&gt;
&lt;br /&gt;
=== Increasing the management console&#039;s timeout ===&lt;br /&gt;
The default timeout is 30 minutes. You may adjust the number of minutes in the &amp;lt;code&amp;gt;session.timeout&amp;lt;/code&amp;gt; value stored in &amp;lt;code&amp;gt;/etc/cloudstack/management/server.properties&amp;lt;/code&amp;gt;.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = session.timeout=60&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Restart the cloudstack-management service to apply.&lt;br /&gt;
&lt;br /&gt;
=== Upgrade CloudStack ===&lt;br /&gt;
Before upgrading CloudStack, review the upgrade instructions from CloudStack&#039;s documentation. For 4.17 to 4.18, see:  https://docs.cloudstack.apache.org/en/4.18.0.0/upgrading/upgrade/upgrade-4.17.html&lt;br /&gt;
&lt;br /&gt;
In a nutshell, upgrading CloudStack for KVM hosts requires the following steps:&lt;br /&gt;
&lt;br /&gt;
# Before upgrading, load the next systemvm template image. System templates are available from: http://download.cloudstack.org/systemvm/. The systemvm template for KVM should be named something like: &amp;lt;code&amp;gt;systemvm-kvm-4.18.0&amp;lt;/code&amp;gt;. When adding the template, specify qcow2 as its format.&lt;br /&gt;
# Backup your CloudStack and usage database. {{Highlight&lt;br /&gt;
| code = $ mysqldump -u root -p -R cloud &amp;gt; cloud-backup_$(date +%Y-%m-%d-%H%M%S)&lt;br /&gt;
$ mysqldump -u root -p cloud_usage &amp;gt; cloud_usage-backup_$(date +%Y-%m-%d-%H%M%S)&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# If you have outstanding system packages to upgrade, do so now (excluding CloudStack packages) and reboot.&lt;br /&gt;
# Stop the CloudStack Management server. Manually unmount any CloudStack mounts; I had to do this when I upgraded from 4.16 to 4.17, since a stale mount prevented CloudStack from starting. Upgrade the cloudstack-management and cloudstack-common packages. Restart CloudStack. {{Highlight&lt;br /&gt;
| code = # systemctl stop cloudstack-management&lt;br /&gt;
# umount /var/cloudstack/mnt/*&lt;br /&gt;
# yum -y update cloudstack\*&lt;br /&gt;
# systemctl start cloudstack-management&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
# Ensure things are running. Watch the logs with &amp;lt;code&amp;gt;tail -f /var/log/cloudstack/management/*log&amp;lt;/code&amp;gt;. Verify that the management server can still communicate with the hosts.&lt;br /&gt;
# For each CloudStack host: drain it of VMs, stop the cloudstack-agent service, do a full upgrade, and reboot. {{Highlight&lt;br /&gt;
| code = # systemctl stop cloudstack-agent&lt;br /&gt;
# yum -y update cloudstack-agent&lt;br /&gt;
&lt;br /&gt;
## Restart the service or reboot just to make sure the host can come up by itself&lt;br /&gt;
## reboot&lt;br /&gt;
# systemctl restart cloudstack-agent&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Updating the system VMs ====&lt;br /&gt;
Check your list of virtual routers under Infrastructure -&amp;gt; Virtual routers. Update any VMs that are marked with &#039;requires upgrade&#039;. Do so by selecting the VM and clicking on the &#039;upgrade router to use newer template&#039; button.&lt;br /&gt;
&lt;br /&gt;
If the virtual router doesn&#039;t start up properly after performing an upgrade, make sure that the VM is running on a node with an appropriate CloudStack agent version. Virtual routers that land on a node with an older version of the agent won&#039;t start properly.&lt;br /&gt;
&lt;br /&gt;
==== Other upgrade notes ====&lt;br /&gt;
Things to watch out for:&lt;br /&gt;
&lt;br /&gt;
* Don&#039;t upgrade CloudStack packages on a server until you stop the CloudStack services. I&#039;ve had issues in the past where something with Java gets corrupted if you try to do an upgrade while the Java processes are still running, leading to an odd class loader issue that leaves the service unable to start after the upgrade.&lt;br /&gt;
* System VM template: I upgraded the CloudStack management server to 4.16.1 using a custom compiled RPM package. However, the management server didn&#039;t start, and inspecting the logs showed that it was expecting a system VM template at &amp;lt;code&amp;gt;/usr/share/cloudstack-management/templates/systemvm/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/code&amp;gt;. This is easily fixed by downloading the template and restarting the management server: &amp;lt;code&amp;gt;wget &amp;lt;nowiki&amp;gt;http://download.cloudstack.org/systemvm/4.16/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/nowiki&amp;gt; -O /usr/share/cloudstack-management/templates/systemvm/systemvmtemplate-4.16.1-kvm.qcow2.bz2&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Traefik ===&lt;br /&gt;
&lt;br /&gt;
==== Using Traefik for SSL termination ====&lt;br /&gt;
With the console proxy served using SSL, we can put a reverse proxy in front of both the management UI and the console proxy service VMs with a valid certificate. This allows us to &#039;mask&#039; the self-signed certificate with Traefik&#039;s ability to request a proper certificate from Let&#039;s Encrypt.&lt;br /&gt;
&lt;br /&gt;
In my test version of CloudStack, I&#039;ve set up Traefik with the following configs. I updated the console proxy to use a dynamic URL by setting &amp;lt;code&amp;gt;consoleproxy.url.domain&amp;lt;/code&amp;gt; to something like &amp;lt;code&amp;gt;*.cloudstack-test.example.com&amp;lt;/code&amp;gt;. CloudStack&#039;s console proxy service will replace the &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt; with the system VM&#039;s IP address (Eg. 10.1.1.1 becomes 10-1-1-1). We&#039;ll tell Traefik to reverse proxy these domains for both HTTPS and WSS on ports 443 and 8080 respectively. My dynamic Traefik configs to make this happen look like the following:{{Highlight&lt;br /&gt;
| code = http:&lt;br /&gt;
  serversTransports:&lt;br /&gt;
    ignorecert:&lt;br /&gt;
      insecureSkipVerify: true&lt;br /&gt;
&lt;br /&gt;
  routers:&lt;br /&gt;
    cloudstack:&lt;br /&gt;
      rule: Host(`cloudstack-test.example.com`)&lt;br /&gt;
      service: cloudstack-poc&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - http&lt;br /&gt;
      middlewares:&lt;br /&gt;
        - https-redirect&lt;br /&gt;
&lt;br /&gt;
    cloudstack-https:&lt;br /&gt;
      rule: Host(`cloudstack-test.example.com`)&lt;br /&gt;
      service: cloudstack-poc&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
    cloudstack-pub-ip-136-159-1-100:&lt;br /&gt;
      rule: Host(`136-159-1-1.cloudstack-test.example.com`)&lt;br /&gt;
      service: 136-159-1-100&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - https&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
    cloudstack-pub-ip-136-159-1-100-ws:&lt;br /&gt;
      rule: Host(`136-159-1-1.cloudstack-test.example.com`)&lt;br /&gt;
      service: 136-159-1-100-ws&lt;br /&gt;
      entrypoints:&lt;br /&gt;
        - httpws&lt;br /&gt;
      tls:&lt;br /&gt;
        certresolver: letsencrypt&lt;br /&gt;
&lt;br /&gt;
  services:&lt;br /&gt;
    cloudstack-poc:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;http://172.19.12.141:8080&amp;quot;&lt;br /&gt;
&lt;br /&gt;
    136-159-1-100:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;https://136.159.1.100&amp;quot;&lt;br /&gt;
        serversTransport: ignorecert&lt;br /&gt;
&lt;br /&gt;
    136-159-1-100-ws:&lt;br /&gt;
      loadBalancer:&lt;br /&gt;
        servers:&lt;br /&gt;
          - url: &amp;quot;https://136.159.1.100:8080&amp;quot;&lt;br /&gt;
        serversTransport: ignorecert&lt;br /&gt;
&lt;br /&gt;
  middlewares:&lt;br /&gt;
    https-redirect:&lt;br /&gt;
      redirectscheme:&lt;br /&gt;
        scheme: https&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}And the following static Traefik configuration:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = entryPoints:&lt;br /&gt;
  http:&lt;br /&gt;
    address: &amp;quot;:80&amp;quot;&lt;br /&gt;
  https:&lt;br /&gt;
    address: &amp;quot;:443&amp;quot;&lt;br /&gt;
  httpws:&lt;br /&gt;
    address: &amp;quot;:8080&amp;quot;&lt;br /&gt;
&lt;br /&gt;
certificatesResolvers:&lt;br /&gt;
  letsencrypt:&lt;br /&gt;
    acme:&lt;br /&gt;
      email: user@example.com&lt;br /&gt;
      storage: &amp;quot;/config/acme.json&amp;quot;&lt;br /&gt;
      httpChallenge:&lt;br /&gt;
        entryPoint: http&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Change guest VM CPU flags ===&lt;br /&gt;
The default CPU flags that guest VMs see are set to qemu64-compatible features. The qemu64 feature set covers a very small subset of the features that modern CPUs have, which makes the guest VM compatible with nearly all available CPUs at the cost of reduced features. The feature flags in qemu64 are: &amp;lt;code&amp;gt;fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl xtopology cpuid tsc_known_freq pni cx16 x2apic hypervisor lahf_lm cpuid_fault pti&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For virtualized workloads that require additional feature sets, you can edit the CloudStack agent to use a different guest CPU mode. Select one of:&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;custom&#039;&#039;&#039;: This is the default mode and uses the x86_qemu64 feature set defined in &amp;lt;code&amp;gt;/usr/share/libvirt/cpu_map/x86_qemu64.xml&amp;lt;/code&amp;gt;. You may select a different CPU map by specifying &amp;lt;code&amp;gt;guest.cpu.model&amp;lt;/code&amp;gt;.&lt;br /&gt;
* &#039;&#039;&#039;host-model&#039;&#039;&#039;: Uses a CPU model compatible with your host. Most feature flags are available. Guest CPUs will identify themselves as a generic CPU of that family, such as &amp;lt;code&amp;gt;Intel Xeon Processor (Icelake)&amp;lt;/code&amp;gt; (note the lack of &#039;(R)&#039; after the Intel and Xeon brands and no specific CPU model number).&lt;br /&gt;
* &#039;&#039;&#039;host-passthrough&#039;&#039;&#039;: Uses CPU passthrough; feature flags match the host exactly. Migrations only work between matching CPUs and may still fail in this mode. Guest CPUs will identify themselves as the underlying CPU of the hypervisor (such as &amp;lt;code&amp;gt;Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz&amp;lt;/code&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
For CloudStack clusters with identical CPUs, it&#039;s recommended to use host-model. I&#039;ve tried using host-passthrough on matching hosts with an Intel Xeon Silver 4316, and migrations sometimes failed, requiring a reset to bring the VM back up.&lt;br /&gt;
&lt;br /&gt;
For more information, see: http://docs.cloudstack.apache.org/en/4.15.0.0/installguide/hypervisor/kvm.html#install-and-configure-the-agent&lt;br /&gt;
&lt;br /&gt;
To change the CPU mode, you simply need to add the appropriate line into the agent properties file and restart the agent:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Matching host model&lt;br /&gt;
# echo &amp;quot;guest.cpu.mode=host-model&amp;quot; &amp;gt;&amp;gt; /etc/cloudstack/agent/agent.properties&lt;br /&gt;
# systemctl restart cloudstack-agent.service&lt;br /&gt;
&lt;br /&gt;
## Passthrough&lt;br /&gt;
# echo &amp;quot;guest.cpu.mode=host-passthrough&amp;quot; &amp;gt;&amp;gt; /etc/cloudstack/agent/agent.properties&lt;br /&gt;
# systemctl restart cloudstack-agent.service&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
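To verify the mode took effect, check the CPU model reported inside a freshly started guest; with host-model or host-passthrough you should see a Xeon family or exact model rather than a generic QEMU CPU:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ grep &amp;quot;model name&amp;quot; /proc/cpuinfo {{!}} sort -u&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;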
&lt;br /&gt;
=== Using Open vSwitch and DPDK ===&lt;br /&gt;
Getting DPDK working with Open vSwitch is relatively straightforward. You need to install the DPDK packages, configure the kernel to use hugepages and IO passthrough, enable the vfio driver on your network interfaces for DPDK support, reconfigure Open vSwitch to use the DPDK device, and enable DPDK on the CloudStack agent.&lt;br /&gt;
&lt;br /&gt;
There are some existing resources that might help.&lt;br /&gt;
&lt;br /&gt;
*https://www.shapeblue.com/openvswitch-with-dpdk-support-on-cloudstack/&lt;br /&gt;
* https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/ovs-dpdk_end_to_end_troubleshooting_guide/configure_and_test_lacp_bonding_with_open_vswitch_dpdk&lt;br /&gt;
&lt;br /&gt;
Install DPDK tools:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum -y install dpdk dpdk-tools&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Reconfigure your kernel by editing &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt; and adding the following. Adjust &amp;lt;code&amp;gt;isolcpus&amp;lt;/code&amp;gt; depending on the CPUs available; I assigned 4 cores out of 80 vCPUs. I am also using 16 1GB huge pages; adjust this according to how much memory your system has (and the performance you&#039;re seeing).{{Highlight&lt;br /&gt;
| code = # vi /etc/default/grub&lt;br /&gt;
## default_hugepagesz=1GB hugepagesz=1G hugepages=16 iommu=pt intel_iommu=on isolcpus=1-19,21-39,41-59,61-79 intel_pstate=disable nosoftlockup&lt;br /&gt;
&lt;br /&gt;
# grub2-mkconfig -o /boot/grub2/grub.cfg&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}You can also configure huge pages by sysctl (optional if you set it in the kernel cmdline){{Highlight&lt;br /&gt;
| code = # echo &#039;vm.nr_hugepages=16&#039; &amp;gt; /etc/sysctl.d/hugepages.conf&lt;br /&gt;
# sysctl -w vm.nr_hugepages=16&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Load the &amp;lt;code&amp;gt;vfio-pci&amp;lt;/code&amp;gt; kernel module on boot&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # echo vfio-pci &amp;gt; /etc/modules-load.d/vfio-pci.conf&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Reboot the machine. When it comes back, verify that you have hugepages and vfio-pci loaded, and that IOMMU is working.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # cat /proc/cmdline {{!}} grep iommu=pt&lt;br /&gt;
# cat /proc/cmdline {{!}} grep intel_iommu=on&lt;br /&gt;
# dmesg {{!}} grep -e DMAR -e IOMMU&lt;br /&gt;
# grep HugePages_ /proc/meminfo&lt;br /&gt;
# lsmod {{!}} grep vfio-pci&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Set the network interfaces you wish to use DPDK on to the vfio-pci driver. This is done using the &amp;lt;code&amp;gt;dpdk-devbind.py&amp;lt;/code&amp;gt; script that&#039;s provided by the DPDK tools package.{{Highlight&lt;br /&gt;
| code = # modprobe vfio-pci&lt;br /&gt;
# dpdk-devbind.py --bind=vfio-pci ens2f0&lt;br /&gt;
# dpdk-devbind.py --bind=vfio-pci ens2f1&lt;br /&gt;
## Verify&lt;br /&gt;
# dpdk-devbind.py --status&lt;br /&gt;
&lt;br /&gt;
Network devices using DPDK-compatible driver&lt;br /&gt;
============================================&lt;br /&gt;
0000:31:00.0 &#039;Ethernet Controller X710 for 10GBASE-T 15ff&#039; drv=vfio-pci unused=i40e&lt;br /&gt;
0000:31:00.1 &#039;Ethernet Controller X710 for 10GBASE-T 15ff&#039; drv=vfio-pci unused=i40e&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}Enable DPDK on Open vSwitch. pmd-cpu-mask defines which cores are used for datapath packet processing. The dpdk-lcore-mask defines the cores on which non-datapath OVS-DPDK threads, such as handler and revalidator threads, run. These two masks should not overlap. For more information on these parameters, see: https://developers.redhat.com/blog/2017/06/28/ovs-dpdk-parameters-dealing-with-multi-numa&amp;lt;nowiki/&amp;gt;.{{Highlight&lt;br /&gt;
| code = # ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true&lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x00000001&lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0x17c0017c&lt;br /&gt;
# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=&amp;quot;1024&amp;quot;&lt;br /&gt;
&lt;br /&gt;
## Verify&lt;br /&gt;
# ovs-vsctl get Open_vSwitch . dpdk_initialized&lt;br /&gt;
# ovs-vsctl get Open_vSwitch . dpdk_version&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}If Open vSwitch is already configured to use these interfaces by name, you will just need to change the interface type to dpdk and set its PCI address.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl set interface ens2f0 type=dpdk&lt;br /&gt;
# ovs-vsctl set interface ens2f0 options:dpdk-devargs=0000:31:00.0&lt;br /&gt;
# ovs-vsctl set interface ens2f1 type=dpdk&lt;br /&gt;
# ovs-vsctl set interface ens2f1 options:dpdk-devargs=0000:31:00.1&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The bridge that these interfaces are connected to must also have its datapath_type updated:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ovs-vsctl set bridge nic0 datapath_type=netdev&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Restart Open vSwitch for these settings to apply and confirm that it&#039;s working:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl restart openvswitch&lt;br /&gt;
# ovs-vsctl show&lt;br /&gt;
...&lt;br /&gt;
        Port bond0&lt;br /&gt;
            Interface ens2f1&lt;br /&gt;
                type: dpdk&lt;br /&gt;
                options: {dpdk-devargs=&amp;quot;0000:31:00.1&amp;quot;}&lt;br /&gt;
            Interface ens2f0&lt;br /&gt;
                type: dpdk&lt;br /&gt;
                options: {dpdk-devargs=&amp;quot;0000:31:00.0&amp;quot;}&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Update the CloudStack agent so that this host has the DPDK capability. Edit &amp;lt;code&amp;gt;/etc/cloudstack/agent/agent.properties&amp;lt;/code&amp;gt;. Note that the keyword is &amp;lt;code&amp;gt;openvswitch.dpdk.enabled&amp;lt;/code&amp;gt; (enabled ending with -ed). The example from ShapeBlue&#039;s blog post is wrong.{{Highlight&lt;br /&gt;
| code = network.bridge.type=openvswitch&lt;br /&gt;
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver&lt;br /&gt;
openvswitch.dpdk.enabled=true&lt;br /&gt;
openvswitch.dpdk.ovs.path=/var/run/openvswitch/&lt;br /&gt;
| lang = text&lt;br /&gt;
}}Restart the CloudStack agent for this capability to be visible by the management server. You should be able to call &amp;lt;code&amp;gt;list hosts filter=capabilities,name&amp;lt;/code&amp;gt; and have the host list dpdk as a capability. Eg:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = (localcloud) 🐱 &amp;gt; list hosts filter=capabilities,name&lt;br /&gt;
count = 22&lt;br /&gt;
host:&lt;br /&gt;
+-------------------+----------+&lt;br /&gt;
{{!}}   CAPABILITIES    {{!}}   NAME   {{!}}&lt;br /&gt;
+-------------------+----------+&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs9      {{!}}&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs10     {{!}}&lt;br /&gt;
{{!}} hvm,snapshot,dpdk {{!}} cs11     {{!}}&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
If you don&#039;t see this, double-check your agent configuration and restart the agent again.&lt;br /&gt;
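&lt;br /&gt;
If the capability still does not appear, restarting the agent and tailing its log is a quick way to confirm the new config was picked up (a minimal sketch; the service and log file names assume a standard RPM-based agent install):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl restart cloudstack-agent&lt;br /&gt;
# tail -f /var/log/cloudstack/agent/agent.log&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;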
&lt;br /&gt;
For VMs to take advantage of DPDK, you must either set &amp;lt;code&amp;gt;extraconfig&amp;lt;/code&amp;gt; on the virtual machine or create a new compute service offering. The extraconfig value may get overwritten whenever the VM is updated, so it&#039;s not a reliable solution. It is a URL-encoded config, and it must not contain single quotes or the VM deployment will break. E.g.:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = (localcloud) 🐱 &amp;gt; update virtualmachine extraconfig=dpdk-hugepages:%0A%3CmemoryBacking%3E%0A%20%20%20%3Chugepages%3E%0A%20%20%20%20%3C/hugepages%3E%0A%3C/memoryBacking%3E%0A%0Adpdk-numa:%0A%3Ccpu%20mode=%22host-passthrough%22%3E%0A%20%20%20%3Cnuma%3E%0A%20%20%20%20%20%20%20%3Ccell%20id=%220%22%20cpus=%220%22%20memory=%229437184%22%20unit=%22KiB%22%20memAccess=%22shared%22/%3E%0A%20%20%20%3C/numa%3E%0A%3C/cpu%3E%0A%0Adpdk-interface-queue:%0A%3Cdriver%20name=%22vhost%22%20queues=%22128%22/%3E id=af64cc80-a4e4-4c17-9c7d-c34ed234dc6a&lt;br /&gt;
virtualmachine = map[account:RCS affinitygroup:[] cpunumber:2 cpuspeed:1000 cpuused:5.88% created:2022-05-03T13:16:02-0600 details:map[Message.ReservedCapacityFreed.Flag:false dpdk-hugepages:a extraconfig-dpdk-hugepages:&amp;lt;memoryBacking&amp;gt;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Troubleshooting ====&lt;br /&gt;
If a VM with a DPDK interface fails to start, the Open vSwitch log may show only the vhost-user connection being torn down repeatedly:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2022-05-05T22:35:28.312Z{{!}}281704{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.312Z{{!}}281705{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281706{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281707{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
2022-05-05T22:35:28.313Z{{!}}281708{{!}}netdev_dpdk{{!}}INFO{{!}}vHost Device &#039;/var/run/openvswitch/csdpdk-1&#039; connection has been destroyed&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Check the agent logs for issues from qemu. I had defined an invalid property which prevented the VM from starting.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@cs10 agent]# grep qemu agent.log&lt;br /&gt;
org.libvirt.LibvirtException: internal error: process exited while connecting to monitor: 2022-05-05T22:35:52.060450Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
2022-05-05T22:35:52.060633Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
2022-05-05T22:35:52.060817Z qemu-kvm: -netdev vhost-user,chardev=charnet0,queues=256,id=hostnet0: you are asking more queues than supported: 128&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==Tools==&lt;br /&gt;
&lt;br /&gt;
===CloudMonkey===&lt;br /&gt;
Get started:&lt;br /&gt;
&lt;br /&gt;
*Download from: https://github.com/apache/cloudstack-cloudmonkey/releases/tag/6.1.0&lt;br /&gt;
*Documentation at: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+cloudmonkey+CLI&lt;br /&gt;
&lt;br /&gt;
When you first run CloudMonkey, you will need to set the CloudStack instance URL and credentials and then run sync.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ cmk&lt;br /&gt;
&amp;gt; set url http://172.19.12.141:8080/client/api&lt;br /&gt;
&amp;gt; set username admin&lt;br /&gt;
&amp;gt; set password password&lt;br /&gt;
&amp;gt; sync&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The settings are then saved to &amp;lt;code&amp;gt;~/.cmk/config&amp;lt;/code&amp;gt;.&lt;br /&gt;
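&lt;br /&gt;
For reference, the saved config is a plain INI file. A rough sketch of what it contains (illustrative only; the exact keys and sections vary between CloudMonkey versions):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [core]&lt;br /&gt;
profile = localcloud&lt;br /&gt;
&lt;br /&gt;
[localcloud]&lt;br /&gt;
url = http://172.19.12.141:8080/client/api&lt;br /&gt;
username = admin&lt;br /&gt;
password = password&lt;br /&gt;
output = json&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;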
&lt;br /&gt;
The sync command fetches all the available API calls that your account can use. Once that is done, you can then use tab completion while in the CloudMonkey CLI.&lt;br /&gt;
&lt;br /&gt;
====Cheat sheet====&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!What&lt;br /&gt;
!Command&lt;br /&gt;
|-&lt;br /&gt;
|Change output format&lt;br /&gt;
|&amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;set display table|json&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Create compute offering&lt;br /&gt;
|&amp;lt;code&amp;gt;create serviceoffering name=rcs.c2 displaytext=Medium cpunumber=2 cpuspeed=750 memory=2048 storagetype=shared provisioningtype=thin offerha=false limitcpuuse=false isvolatile=false issystem=false deploymentplanner=UserDispersingPlanner cachemode=none customized=false&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|Add a new host&lt;br /&gt;
|&amp;lt;code&amp;gt;add host clusterid=XX podid=XX zoneid=XX hypervisor=KVM password=**** username=root url=&amp;lt;nowiki&amp;gt;http://bm01&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
====Automate zone deployments====&lt;br /&gt;
There is an example script on how to automate a basic zone deployment at: https://github.com/apache/cloudstack-cloudmonkey/wiki/Usage&lt;br /&gt;
&lt;br /&gt;
=== Terraform ===&lt;br /&gt;
The Terraform CloudStack provider works for the most part. However, for CloudStack 4.16, you&#039;ll need to compile it from source because the distributed binaries don&#039;t work properly (deployments hang indefinitely). To build the provider, I used Docker:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # git clone https://github.com/apache/cloudstack-terraform-provider.git&lt;br /&gt;
# cd cloudstack-terraform-provider&lt;br /&gt;
# git clone https://github.com/tetra12/cloudstack-go.git&lt;br /&gt;
# cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; go.mod&lt;br /&gt;
replace github.com/apache/cloudstack-go/v2 =&amp;gt; ./cloudstack-go&lt;br /&gt;
exclude github.com/apache/cloudstack-go/v2 v2.11.0&lt;br /&gt;
EOF&lt;br /&gt;
# docker run --rm -ti -v /home/me/cloudstack-terraform-provider/:/build golang bash &lt;br /&gt;
&amp;gt; cd /build&lt;br /&gt;
&amp;gt; go build&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Copy the resulting binary to your Terraform plugins path. Since I had already run &amp;lt;code&amp;gt;terraform init&amp;lt;/code&amp;gt;, the provider was placed in my Terraform directory under &amp;lt;code&amp;gt;.terraform/providers/registry.terraform.io/cloudstack/cloudstack/0.4.0/linux_amd64/terraform-provider-cloudstack_v0.4.0&amp;lt;/code&amp;gt;. Edit the metadata file in the same directory as the provider executable and remove the file hash so that Terraform will run the replaced provider.&lt;br /&gt;
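&lt;br /&gt;
An alternative to patching the plugin metadata (not something I tested, but standard Terraform CLI functionality) is a &amp;lt;code&amp;gt;dev_overrides&amp;lt;/code&amp;gt; block in &amp;lt;code&amp;gt;~/.terraformrc&amp;lt;/code&amp;gt;, which points Terraform at a locally built provider binary and skips checksum verification entirely:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = provider_installation {&lt;br /&gt;
  dev_overrides {&lt;br /&gt;
    &amp;quot;cloudstack/cloudstack&amp;quot; = &amp;quot;/home/me/cloudstack-terraform-provider&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
  # Fall back to normal installation for all other providers&lt;br /&gt;
  direct {}&lt;br /&gt;
}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;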
&lt;br /&gt;
See also: [[Terraform#CloudStack]]&lt;br /&gt;
&lt;br /&gt;
=== Packer ===&lt;br /&gt;
The Packer CloudStack provider also works for the most part, but it cannot send keyboard input. Any OS deployment therefore requires either some manual input or ISO media that is completely automated. I also had to compile the provider manually, since the default plugin fetched by Packer doesn&#039;t quite work due to API changes.&lt;br /&gt;
&lt;br /&gt;
See also: [[Packer#CloudStack]]&lt;br /&gt;
&lt;br /&gt;
==Troubleshooting==&lt;br /&gt;
When you run into issues, check the logs in &amp;lt;code&amp;gt;/var/log/cloudstack/&amp;lt;/code&amp;gt;. There&#039;s typically a stacktrace which gets generated whenever you encounter an error.&lt;br /&gt;
&lt;br /&gt;
===Can&#039;t create shared network in an advanced zone using Open vSwitch===&lt;br /&gt;
Whenever I try creating a shared network in an advanced zone that is using OVS, the step fails with: &amp;quot;Unable to convert network offering with specified id to network profile&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Stack trace shows that the [https://github.com/apache/cloudstack/blob/bf6266188c89a5487383f216333ae10e878d2c10/plugins/network-elements/ovs/src/main/java/com/cloud/network/guru/OvsGuestNetworkGuru.java#L99 OVS guest network guru] refuses to design the network because the zone isn&#039;t capable of handling this network offering.&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2021-09-28 16:36:26,416 DEBUG [c.c.a.ApiServer] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) CIDRs from which account &#039;Acct[76a1585d-1bf6-11ec-a3c5-8f3e88f01ab1-admin]&#039; is allowed to perform API calls: 0.0.0.0/0,::/0&lt;br /&gt;
2021-09-28 16:36:26,439 DEBUG [c.c.u.AccountManagerImpl] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Access granted to Acct[76a1585d-1bf6-11ec-a3c5-8f3e88f01ab1-admin] to [Network Offering [7-Guest-DefaultSharedNetworkOffering] by AffinityGroupAccessChecker&lt;br /&gt;
2021-09-28 16:36:26,517 DEBUG [c.c.n.g.BigSwitchBcfGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network, the physical isolation type is not BCF_SEGMENT&lt;br /&gt;
2021-09-28 16:36:26,521 DEBUG [o.a.c.n.c.m.ContrailGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,524 DEBUG [c.c.n.g.NiciraNvpGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,527 DEBUG [o.a.c.n.o.OpendaylightGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,530 DEBUG [c.c.n.g.OvsGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,536 DEBUG [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) GRE: VLAN&lt;br /&gt;
2021-09-28 16:36:26,536 DEBUG [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) GRE: VXLAN&lt;br /&gt;
2021-09-28 16:36:26,536 INFO  [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,539 INFO  [c.c.n.g.DirectNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,543 DEBUG [o.a.c.n.g.SspGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) SSP not configured to be active&lt;br /&gt;
2021-09-28 16:36:26,546 DEBUG [c.c.n.g.BrocadeVcsGuestNetworkGuru] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Refusing to design this network&lt;br /&gt;
2021-09-28 16:36:26,549 DEBUG [o.a.c.e.o.NetworkOrchestrator] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Releasing lock for Acct[76a0531f-1bf6-11ec-a3c5-8f3e88f01ab1-system]&lt;br /&gt;
2021-09-28 16:36:26,624 DEBUG [c.c.u.d.T.Transaction] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) Rolling back the transaction: Time = 172 Name =  qtp1816147548-400; called by -TransactionLegacy.rollback:888-TransactionLegacy.removeUpTo:831-TransactionLegacy.close:655-Transaction.execute:38-Transaction.execute:47-NetworkOrches&lt;br /&gt;
trator.createGuestNetwork:2572-NetworkOrchestrator.createGuestNetwork:2327-NetworkServiceImpl$4.doInTransaction:1502-NetworkServiceImpl$4.doInTransaction:1450-Transaction.execute:40-NetworkServiceImpl.commitNetwork:1450-NetworkServiceImpl.createGuestNetwork:1366&lt;br /&gt;
2021-09-28 16:36:26,667 ERROR [c.c.a.ApiServer] (qtp1816147548-400:ctx-291672d1 ctx-3f19296a) (logid:83c45c2a) unhandled exception executing api command: [Ljava.lang.String;@69a8823d&lt;br /&gt;
com.cloud.utils.exception.CloudRuntimeException: Unable to convert network offering with specified id to network profile&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.setupNetwork(NetworkOrchestrator.java:739)&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator$10.doInTransaction(NetworkOrchestrator.java:2634)&lt;br /&gt;
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator$10.doInTransaction(NetworkOrchestrator.java:2572)&lt;br /&gt;
        at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:50)&lt;br /&gt;
        at com.cloud.utils.db.Transaction.execute(Transaction.java:40)&lt;br /&gt;
        at com.cloud.utils.db.Transaction.execute(Transaction.java:47)&lt;br /&gt;
...&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
====Possible answer====&lt;br /&gt;
The guest network was set up with GRE isolation. This however isn&#039;t supported with KVM as the hypervisor (see [https://events.static.linuxfound.org/sites/events/files/slides/CloudStack%20Collab%20Hypervisor.pdf this presentation]). After re-creating the zone with the guest physical network set up with just VLAN isolation, I was able to create a regular shared guest network that all tenants within the zone can see and use.&lt;br /&gt;
&lt;br /&gt;
To make the shared network SNAT out, I created another shared network offering that also has SourceNat and StaticNat.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = $ cmk list serviceofferings issystem=true name=&#039;System Offering For Software Router&#039;&lt;br /&gt;
$ cmk create networkoffering \&lt;br /&gt;
name=SharedNetworkOfferingWithSourceNatService displaytext=&amp;quot;Shared Network Offering with Source NAT Service&amp;quot; traffictype=GUEST guestiptype=shared conservemode=true specifyvlan=true specifyipranges=true \&lt;br /&gt;
serviceofferingid=307b14d8-afd1-43ea-948c-ffe882cd5926 \&lt;br /&gt;
supportedservices=Dhcp,Dns,Firewall,SourceNat,StaticNat,PortForwarding \&lt;br /&gt;
serviceProviderList[0].service=Dhcp serviceProviderList[0].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[1].service=Dns serviceProviderList[1].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[2].service=Firewall serviceProviderList[2].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[3].service=SourceNat serviceProviderList[3].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[4].service=StaticNat serviceProviderList[4].provider=VirtualRouter \&lt;br /&gt;
serviceProviderList[5].service=PortForwarding serviceProviderList[5].provider=VirtualRouter \&lt;br /&gt;
servicecapabilitylist[0].service=SourceNat servicecapabilitylist[0].capabilitytype=SupportedSourceNatTypes servicecapabilitylist[0].capabilityvalue=peraccount&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Using this network offering, I was able to create a shared network in the advanced networking zone that has a NAT service and is visible to all accounts. The only issue with this approach is that there isn&#039;t a way to create a port forwarding rule for a specific VM, because the account that owns this network is &#039;system&#039;.&lt;br /&gt;
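&lt;br /&gt;
One thing to watch for: a freshly created network offering starts out in the Disabled state and won&#039;t be selectable until it is enabled. A sketch of doing this from CloudMonkey (the UUID is whatever &amp;lt;code&amp;gt;list networkofferings&amp;lt;/code&amp;gt; returns for your offering):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = (localcloud) 🐱 &amp;gt; list networkofferings name=SharedNetworkOfferingWithSourceNatService filter=id,state&lt;br /&gt;
(localcloud) 🐱 &amp;gt; update networkoffering id=&amp;lt;uuid&amp;gt; state=Enabled&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;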
&lt;br /&gt;
=== Libvirtd can&#039;t start due to expired certificate ===&lt;br /&gt;
For some reason, the host stopped renewing its agent certificates with the management server. As a result, libvirtd will not start. I only noticed this when I rebooted an affected node after migrating all the VMs off it.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # libvirtd -l&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: info : libvirt version: 8.0.0, package: 22.module+el8.9.0+1405+b6048078 (infrastructure@rockylinux.org, 2023-07-31-18:01:38, )&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: info : hostname: cs1&lt;br /&gt;
2023-11-30 18:16:25.116+0000: 39448: error : virNetTLSContextCheckCertTimes:142 : The server certificate /etc/pki/libvirt/servercert.pem has expired&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Note that the certificate file &amp;lt;code&amp;gt;/etc/pki/libvirt/servercert.pem&amp;lt;/code&amp;gt; symlinks to &amp;lt;code&amp;gt;/etc/cloudstack/agent/cloud.crt&amp;lt;/code&amp;gt;. The certificates and private keys under &amp;lt;code&amp;gt;/etc/cloudstack/agent/cloud*&amp;lt;/code&amp;gt; are generated by the CloudStack management server and then sent to and saved by the agent&amp;lt;ref&amp;gt;Certificate saved by the agent: https://github.com/apache/cloudstack/blob/cb62ce67671699fa01564b3b4b0d3d83eb3d5acb/agent/src/main/java/com/cloud/agent/Agent.java#L671&amp;lt;/ref&amp;gt;.&lt;br /&gt;
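&lt;br /&gt;
To confirm that the certificate really has expired (rather than libvirtd failing for some other reason), you can inspect the agent certificate directly:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # openssl x509 -noout -subject -enddate -in /etc/cloudstack/agent/cloud.crt&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;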
&lt;br /&gt;
Since the node is already out of service, the easiest fix here is to [[CloudStack#Re-add existing KVM bare metal host|re-add this KVM bare metal host]] back into CloudStack again.&lt;br /&gt;
&lt;br /&gt;
==Open-ended questions==&lt;br /&gt;
===Compute offerings with &#039;unlimited&#039; CPU cycles?===&lt;br /&gt;
Compute offerings require an MHz value to be assigned. Why is this? Can we just assign entire cores to a VM?&lt;br /&gt;
&lt;br /&gt;
- If you read the docs, the CPU (in MHz) value only has an effect if CPU cap is selected. In all other cases, it acts as something akin to &#039;CPU shares&#039;.&lt;br /&gt;
&lt;br /&gt;
- However, if you put in a huge number like 9999, deployment will fail.&lt;br /&gt;
&lt;br /&gt;
===How to implement showback?===&lt;br /&gt;
Is there a way to implement showback based on resources consumed by account?&lt;br /&gt;
&lt;br /&gt;
===Monitoring resources?===&lt;br /&gt;
Is there a way to monitor resource usage by account, node? Any good way to push VMs into a CMDB like ServiceNow?&lt;br /&gt;
&lt;br /&gt;
===NetApp integration?===&lt;br /&gt;
Is it possible to do guest VM snapshots by leveraging NetApp?&lt;br /&gt;
&lt;br /&gt;
===Backups?===&lt;br /&gt;
The only backup plugins available are &#039;dummy&#039;, which does nothing, and &#039;veeam&#039;, which only supports VMware + Veeam. If you&#039;re using KVM, there doesn&#039;t seem to be an easy way to back up or restore VMs.&amp;lt;br /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:LinuxUtilities]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=FreeIPA&amp;diff=7687</id>
		<title>FreeIPA</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=FreeIPA&amp;diff=7687"/>
		<updated>2025-06-10T04:15:24Z</updated>

		<summary type="html">&lt;p&gt;Leo: FreeIPA 4.12 upgrade failure&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Installation ==&lt;br /&gt;
Here are my notes as I fumble my way setting up FreeIPA.&lt;br /&gt;
&lt;br /&gt;
=== Docker ===&lt;br /&gt;
There is an official Docker container that has a complete FreeIPA installation. This container uses systemd to start up FreeIPA along with the other related services such as OpenLDAP, Bind, and Kerberos. See more at: https://github.com/freeipa/freeipa-container&lt;br /&gt;
&lt;br /&gt;
If you are using Docker, you &#039;&#039;&#039;must disable cgroup v2&#039;&#039;&#039; (this is enabled by default on RHEL9 and above). More about this in the Troubleshooting section below.&lt;br /&gt;
&lt;br /&gt;
Use the following &amp;lt;code&amp;gt;docker-compose.yml&amp;lt;/code&amp;gt; stack to quickly get started with FreeIPA:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = version: &#039;3.3&#039;&lt;br /&gt;
&lt;br /&gt;
services:&lt;br /&gt;
&lt;br /&gt;
  freeipa:&lt;br /&gt;
    image: freeipa/freeipa-server:rocky-8&lt;br /&gt;
    restart: unless-stopped&lt;br /&gt;
    tty: true&lt;br /&gt;
    stdin_open: true&lt;br /&gt;
    hostname: ipa&lt;br /&gt;
    domainname: home.steamr.com&lt;br /&gt;
    extra_hosts:&lt;br /&gt;
      - &amp;quot;ipa.home.steamr.com:10.1.2.12&amp;quot;&lt;br /&gt;
    environment:&lt;br /&gt;
      - IPA_SERVER_HOSTNAME=ipa.home.steamr.com&lt;br /&gt;
      - IPA_SERVER_IP=10.1.2.12&lt;br /&gt;
      - DNS=10.1.0.8&lt;br /&gt;
      - TZ=America/Edmonton&lt;br /&gt;
    command:&lt;br /&gt;
      - ipa-server-install&lt;br /&gt;
      - --realm=home.steamr.com&lt;br /&gt;
      - --domain=home.steamr.com&lt;br /&gt;
      - --ds-password=xxxxxxxxxx&lt;br /&gt;
      - --admin-password=xxxxxxxxxx&lt;br /&gt;
      - --no-host-dns&lt;br /&gt;
      - --setup-dns&lt;br /&gt;
      - --auto-forwarders&lt;br /&gt;
      - --allow-zone-overlap&lt;br /&gt;
      - --no-dnssec-validation&lt;br /&gt;
      - --unattended&lt;br /&gt;
    sysctls:&lt;br /&gt;
      - net.ipv6.conf.all.disable_ipv6=0&lt;br /&gt;
    volumes:&lt;br /&gt;
      - ./data:/data&lt;br /&gt;
      - ./logs:/var/logs&lt;br /&gt;
      - /sys/fs/cgroup:/sys/fs/cgroup:ro&lt;br /&gt;
    tmpfs:&lt;br /&gt;
      - /run&lt;br /&gt;
      - /var/cache&lt;br /&gt;
      - /tmp&lt;br /&gt;
    cap_add:&lt;br /&gt;
      - SYS_TIME&lt;br /&gt;
    ports:&lt;br /&gt;
      - &amp;quot;10.1.2.12:80:80/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:443:443/tcp&amp;quot;&lt;br /&gt;
      # DNS&lt;br /&gt;
      - &amp;quot;10.1.2.12:53:53/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:53:53/udp&amp;quot;&lt;br /&gt;
      # LDAP(S)&lt;br /&gt;
      - &amp;quot;10.1.2.12:389:389/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:636:636/tcp&amp;quot;&lt;br /&gt;
      # Kerberos&lt;br /&gt;
      - &amp;quot;10.1.2.12:88:88/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:464:464/tcp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:88:88/udp&amp;quot;&lt;br /&gt;
      - &amp;quot;10.1.2.12:464:464/udp&amp;quot;&lt;br /&gt;
| lang = yaml&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Samba integration ==&lt;br /&gt;
There are three methods for using FreeIPA with Samba. I eventually settled on method #2.&lt;br /&gt;
&lt;br /&gt;
# Configure Samba to use FreeIPA as a simple LDAP server, using &#039;&#039;&#039;ldapsam&#039;&#039;&#039; as the passdb backend. This requires a schema change to include the sambaSAMAccount, sambaGroupMapping, and sambaSID object classes. There is a DNA (distributed numeric assignment) plugin that can be used to update these fields.&lt;br /&gt;
# Configure Samba to use FreeIPA using &#039;&#039;&#039;ipasam&#039;&#039;&#039; as the passdb backend. This requires the &amp;lt;code&amp;gt;ipasam.so&amp;lt;/code&amp;gt; module installed on the samba servers.&lt;br /&gt;
# Configure Samba to use &#039;&#039;&#039;Kerberos&#039;&#039;&#039;. This does not seem to allow users to authenticate with passwords.&lt;br /&gt;
&lt;br /&gt;
=== Method 1: ldapsam ===&lt;br /&gt;
{{Warning|I didn&#039;t get this to work|I wasn&#039;t able to fully get this method to work. If you are using OpenLDAP only, this way of integrating Samba does work. The issue I had was getting the DNA plugin to work as advertised. &lt;br /&gt;
&lt;br /&gt;
I eventually settled on the ipasam method in the section below.}}&lt;br /&gt;
To use ldapsam, we need to make some changes to the FreeIPA LDAP server by adding sambaSAMAccount and sambaGroupMapping as a default user object class and group object class. &lt;br /&gt;
&lt;br /&gt;
You can either set this in the FreeIPA web interface under configuration, or run:{{Highlight&lt;br /&gt;
| code = # ldapmodify &amp;lt;&amp;lt;EOF&lt;br /&gt;
dn: cn=ipaConfig,cn=etc,dc=home,dc=steamr,dc=com&lt;br /&gt;
changetype: modify&lt;br /&gt;
add: ipaUserObjectClasses&lt;br /&gt;
ipaUserObjectClasses: sambaSAMAccount&lt;br /&gt;
-&lt;br /&gt;
add: ipaGroupObjectClasses&lt;br /&gt;
ipaGroupObjectClasses: sambaGroupMapping&lt;br /&gt;
EOF&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}We will then need to configure the DNA (distributed numeric assignment) plugin to update these attributes whenever something related to the object (such as the password) changes. This can be done by adding DNA configuration entries into LDAP.{{Highlight&lt;br /&gt;
| code = ldapadd &amp;lt;&amp;lt;EOF&lt;br /&gt;
dn: cn=SambaSid,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: extensibleObject&lt;br /&gt;
dnatype: sambaSID&lt;br /&gt;
dnaprefix: S-1-5-21-2049073866-1371207509-1214748462&lt;br /&gt;
dnainterval: 1&lt;br /&gt;
dnamagicregen: assign&lt;br /&gt;
dnafilter: ({{!}}(objectclass=sambasamaccount)(objectclass=sambagroupmapping))&lt;br /&gt;
dnascope: dc=home,dc=steamr,dc=com&lt;br /&gt;
cn: SambaSid&lt;br /&gt;
dnanextvalue: 2&lt;br /&gt;
&lt;br /&gt;
dn: cn=sambaGroupType,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: extensibleObject&lt;br /&gt;
cn: sambaGroupType&lt;br /&gt;
dnatype: sambaGroupType&lt;br /&gt;
dnainterval: 1&lt;br /&gt;
dnamagicregen: assign&lt;br /&gt;
dnafilter: (objectClass=sambagroupmapping)&lt;br /&gt;
dnascope: dc=home,dc=steamr,dc=com&lt;br /&gt;
dnanextvalue: 2&lt;br /&gt;
EOF&lt;br /&gt;
| lang = bash&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
An ldapsam user entry should have fields like the following. In my setup, however, I don&#039;t see these.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = dn: uid=guest2, ou=People,dc=quenya,dc=org&lt;br /&gt;
sambaLMPassword: 878D8014606CDA29677A44EFA1353FC7&lt;br /&gt;
sambaPwdMustChange: 2147483647&lt;br /&gt;
sambaPrimaryGroupSID: S-1-5-21-2447931902-1787058256-3961074038-513&lt;br /&gt;
sambaNTPassword: 552902031BEDE9EFAAD3B435B51404EE&lt;br /&gt;
sambaPwdLastSet: 1010179124&lt;br /&gt;
sambaLogonTime: 0&lt;br /&gt;
objectClass: sambaSamAccount&lt;br /&gt;
uid: guest2&lt;br /&gt;
sambaKickoffTime: 2147483647&lt;br /&gt;
sambaAcctFlags: [UX         ]&lt;br /&gt;
sambaLogoffTime: 2147483647&lt;br /&gt;
sambaSID: S-1-5-21-2447931902-1787058256-3961074038-5006&lt;br /&gt;
sambaPwdCanChange: 0&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Configure the Samba server ====&lt;br /&gt;
You can either use a dedicated bind credential that&#039;s shared across all your Samba servers, or use the machine&#039;s cifs service account to authenticate to the LDAP server.&lt;br /&gt;
&lt;br /&gt;
I tried the following using the admin account as the bind DN (&#039;&#039;&#039;using the admin account like this is probably a bad idea; I was just testing&#039;&#039;&#039;):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
	# freeipa configurations&lt;br /&gt;
	passdb backend = ldapsam:ldap://home.steamr.com&lt;br /&gt;
	ldap admin dn = uid=admin,cn=users,cn=accounts,dc=home,dc=steamr,dc=com&lt;br /&gt;
	ldapsam:trusted = yes&lt;br /&gt;
	ldap suffix = cn=accounts,dc=home,dc=steamr,dc=com&lt;br /&gt;
	ldap user suffix = cn=users,cn=accounts&lt;br /&gt;
	ldap machine suffix = cn=computers,cn=accounts&lt;br /&gt;
	ldap group suffix = cn=groups,cn=accounts&lt;br /&gt;
	ldap passwd sync = only&lt;br /&gt;
	ldap ssl = no&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Run &amp;lt;code&amp;gt;smbpasswd -w password&amp;lt;/code&amp;gt; to set your bind credential passwords.  &lt;br /&gt;
&lt;br /&gt;
=== Method 2: ipasam ===&lt;br /&gt;
See: https://bgstack15.wordpress.com/2017/05/10/samba-share-with-freeipa-auth/&lt;br /&gt;
&lt;br /&gt;
Install the AD trust components on the FreeIPA server: install the &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; package and run &amp;lt;code&amp;gt;ipa-adtrust-install --add-sids&amp;lt;/code&amp;gt;. This adds the additional ipasam attributes, such as &amp;lt;code&amp;gt;ipaNTHash&amp;lt;/code&amp;gt;, to user objects.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-adtrust-install --add-sids&lt;br /&gt;
## Answer yes to overwrite smb.conf&lt;br /&gt;
## Answer yes to install slapi-nis&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Ensure that the system hostname is set to the FQDN, otherwise this process will fail. If you are using an external DNS server, ensure that the additional service records are present. If you are running in a Docker container, set the hostname to the full FQDN (e.g. ipa.example.com rather than just &#039;ipa&#039;).&lt;br /&gt;
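&lt;br /&gt;
Both conditions can be checked up front (a quick sanity check; substitute your own domain):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Must print the full FQDN, e.g. ipa.home.steamr.com&lt;br /&gt;
# hostname -f&lt;br /&gt;
## The Kerberos and LDAP SRV records should resolve&lt;br /&gt;
# dig +short SRV _kerberos._udp.home.steamr.com&lt;br /&gt;
# dig +short SRV _ldap._tcp.home.steamr.com&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;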
&lt;br /&gt;
Next, create a new user and then change the user&#039;s password. This should populate the &amp;lt;code&amp;gt;ipaNTHash&amp;lt;/code&amp;gt; attribute. {{Highlight&lt;br /&gt;
| code = # ipa user-add leo --first=Leo --last=Leung&lt;br /&gt;
# ipa passwd leo&lt;br /&gt;
# ipa group-add-member smbgrp --users=leo&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
When modifying smb.conf, the service account or bind DN must have access to these new ipasam attributes. You need to create a new permission, privilege, and role, and grant the account access. Create them in the web interface or run:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa permission-add &amp;quot;CIFS server can read user passwords&amp;quot;  --attrs={ipaNTHash,ipaNTSecurityIdentifier} --type=user --right={read,search,compare} --bindtype=permission&lt;br /&gt;
# ipa privilege-add &amp;quot;CIFS server privilege&amp;quot;&lt;br /&gt;
# ipa privilege-add-permission &amp;quot;CIFS server privilege&amp;quot; --permission=&amp;quot;CIFS server can read user passwords&amp;quot;&lt;br /&gt;
# ipa role-add &amp;quot;CIFS server&amp;quot;&lt;br /&gt;
# ipa role-add-privilege &amp;quot;CIFS server&amp;quot; --privilege=&amp;quot;CIFS server privilege&amp;quot;&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Then, add your service account or bind DN to the &#039;CIFS server&#039; role. For example:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
# ipa role-add-member &amp;quot;CIFS server&amp;quot; --services=cifs/dnas.home.steamr.com&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Configure the Samba server ====&lt;br /&gt;
Generate a keytab file for samba.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # kinit -kt /etc/krb5.keytab&lt;br /&gt;
## Note: we ran this in the previous step.&lt;br /&gt;
## ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
# ipa-getkeytab -s ipa.home.steamr.com -p cifs/dnas.home.steamr.com -k /etc/samba/samba.keytab&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Then, tweak smb.conf.{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
passdb backend = ipasam:ldap://ipa.home.steamr.com&lt;br /&gt;
ldapsam:trusted = yes&lt;br /&gt;
ldap suffix = dc=home,dc=steamr,dc=com&lt;br /&gt;
ldap user suffix = cn=users,cn=accounts&lt;br /&gt;
ldap machine suffix = cn=computers,cn=accounts&lt;br /&gt;
ldap group suffix = cn=groups,cn=accounts&lt;br /&gt;
ldap ssl = no&lt;br /&gt;
idmap config * : backend = tdb  &lt;br /&gt;
create krb5 conf = No &lt;br /&gt;
dedicated keytab file = FILE:/etc/samba/samba.keytab&lt;br /&gt;
kerberos method = dedicated keytab&lt;br /&gt;
| lang = text&lt;br /&gt;
}}If you get &amp;lt;code&amp;gt;NT_STATUS_BAD_TOKEN_TYPE&amp;lt;/code&amp;gt;, you need to disable MS-PAC in the FreeIPA settings, either globally or specifically for this cifs service account.&lt;br /&gt;
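&lt;br /&gt;
A sketch of disabling the PAC for just this cifs service rather than globally (verify the option name against your FreeIPA version; &amp;lt;code&amp;gt;--pac-type&amp;lt;/code&amp;gt; sets the &amp;lt;code&amp;gt;ipakrbauthzdata&amp;lt;/code&amp;gt; attribute):&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa service-mod cifs/dnas.home.steamr.com --pac-type=NONE&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;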
&lt;br /&gt;
The ipasam passdb provider is available from the &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; package. However, this package also pulls in a ton of other IPA dependencies which aren&#039;t needed if you just want to run a Samba server that talks to IPA rather than the entire FreeIPA server. For a bare minimal Samba server, you can simply copy (or extract from the &amp;lt;code&amp;gt;ipa-server-trust-ad&amp;lt;/code&amp;gt; package) the &amp;lt;code&amp;gt;ipasam.so&amp;lt;/code&amp;gt; file to &amp;lt;code&amp;gt;/usr/lib64/samba/pdb/ipasam.so&amp;lt;/code&amp;gt; with this set of commands:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # yum download ipa-server-trust-ad&lt;br /&gt;
# mkdir x &amp;amp;&amp;amp; cd x&lt;br /&gt;
# rpm2cpio ../ipa-server-trust-ad*rpm {{!}} cpio -id ./usr/lib64/samba/pdb/ipasam.so&lt;br /&gt;
# cp ./usr/lib64/samba/pdb/ipasam.so /usr/lib64/samba/pdb/ipasam.so&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Method 3: Kerberos ===&lt;br /&gt;
&lt;br /&gt;
This method is similar to the ipasam method above and you will need to set up the server in the same way. However, the way you configure Samba is different. &lt;br /&gt;
&lt;br /&gt;
==== On the Samba server ====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Join the samba server to FreeIPA&lt;br /&gt;
# ipa-client-install&lt;br /&gt;
&lt;br /&gt;
## Then add the client to samba.&lt;br /&gt;
# ipa-client-samba&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Note that when adding the samba client, the IPA server must be able to resolve the A record for the Samba server being added, or else the command will fail.&lt;br /&gt;
&lt;br /&gt;
This should automatically set up the cifs service accounts for this particular samba server, get the samba keytab file in &amp;lt;code&amp;gt;/etc/samba/samba.keytab&amp;lt;/code&amp;gt;, and then tweak the smb.conf file to use this keytab file. &lt;br /&gt;
&lt;br /&gt;
The smb.conf file now looks like:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [global]&lt;br /&gt;
    # Limit number of forked processes to avoid SMBLoris attack&lt;br /&gt;
    max smbd processes = 1000&lt;br /&gt;
    # Use dedicated Samba keytab. The key there must be synchronized&lt;br /&gt;
    # with Samba tdb databases or nothing will work&lt;br /&gt;
    dedicated keytab file = FILE:/etc/samba/samba.keytab&lt;br /&gt;
    kerberos method = dedicated keytab&lt;br /&gt;
    # Set up logging per machine and Samba process&lt;br /&gt;
    log file = /var/log/samba/log.%m&lt;br /&gt;
    log level = 1&lt;br /&gt;
    # We force &#039;member server&#039; role to allow winbind automatically&lt;br /&gt;
    # discover what is supported by the domain controller side&lt;br /&gt;
    server role = member server&lt;br /&gt;
    realm = HOME.STEAMR.COM&lt;br /&gt;
    netbios name = DNAS&lt;br /&gt;
    workgroup = HOME&lt;br /&gt;
    # Local writable range for IDs not coming from IPA or trusted domains&lt;br /&gt;
    idmap config * : range = 0 - 0&lt;br /&gt;
    idmap config * : backend = tdb&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
    idmap config HOME : range = 100000 - 299999&lt;br /&gt;
    idmap config HOME : backend = sss&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
# Default homes share&lt;br /&gt;
[homes]&lt;br /&gt;
    read only = no&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
You then need to start winbind and smb. For some reason, this method doesn&#039;t seem to work for me: winbind gets stuck with &amp;quot;&amp;lt;code&amp;gt;wb_parent_idmap_setup_lookupname_done: Lookup domain name &#039;home&#039; failed &#039;NT_STATUS_DOMAIN_CONTROLLER_NOT_FOUND&#039;&amp;lt;/code&amp;gt;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
=== Debug samba issues ===&lt;br /&gt;
Use &amp;lt;code&amp;gt;smbclient&amp;lt;/code&amp;gt; to help debug issues. This utility is provided by the &amp;lt;code&amp;gt;samba-client&amp;lt;/code&amp;gt; package. You can then test authentication by running: &amp;lt;code&amp;gt;smbclient -d 10 -U leo //dnas/home&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tasks ==&lt;br /&gt;
&lt;br /&gt;
=== Join a computer to a FreeIPA server ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;ipa-client-install&amp;lt;/code&amp;gt; command to add a computer to a FreeIPA server. This should also automatically add a computer account, generate a keytab file, and tweak sssd to use FreeIPA as an authentication mechanism.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-client-install -U -p admin -w $Password --server ipa.home.steamr.com --domain home.steamr.com --force-join --no-ntp --fixed-primary&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
=== Error: did not receive Kerberos credentials ===&lt;br /&gt;
Tools such as &#039;ipa&#039; use your session&#039;s Kerberos tickets for authentication. If you don&#039;t have any tickets or if your tickets have expired, you may get an &amp;lt;code&amp;gt;ipa: ERROR: did not receive Kerberos credentials&amp;lt;/code&amp;gt; error. Fix this by running:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ## Renew/obtain Kerberos tickets for &#039;admin&#039;&lt;br /&gt;
# kinit admin&lt;br /&gt;
Password for admin@HOME.STEAMR.COM:  ****&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Verify that your tickets are available with &amp;lt;code&amp;gt;klist&amp;lt;/code&amp;gt;:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # klist&lt;br /&gt;
Ticket cache: FILE:/tmp/krb5cc_0&lt;br /&gt;
Default principal: admin@STEAMR.COM&lt;br /&gt;
&lt;br /&gt;
Valid starting     Expires            Service principal&lt;br /&gt;
03/06/22 14:44:35  03/07/22 14:39:47  krbtgt/STEAMR.COM@STEAMR.COM&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== Container issues ===&lt;br /&gt;
&lt;br /&gt;
* Don&#039;t mount /var/log because the container image symlinks everything into /data. If you do mount /var/log, make sure you create the expected directories or else the installer will fail.&lt;br /&gt;
* Error with &amp;lt;code&amp;gt;AssertionError: Another instance named &#039;HOME-STEAMR-COM&#039; may already exist&amp;lt;/code&amp;gt;. I can&#039;t figure out what&#039;s causing lib389 to think there&#039;s another instance. I built a container image on top of this image with the assertion patched out. This seemed to fix the issue.&lt;br /&gt;
* {{Highlight&lt;br /&gt;
| code = FROM freeipa/freeipa-server:rocky-8&lt;br /&gt;
RUN sed &#039;s/assert_c(len(insts)/# assert_c(len(insts)/&#039; -i /usr/lib/python3.6/site-packages/lib389/instance/setup.py&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Can&#039;t find the ipa-adtrust-install package ====&lt;br /&gt;
The FreeIPA packages are under a different app stream repo. Enable it by running &amp;lt;code&amp;gt;dnf -y module enable idm:DL1&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== sssd: Decrypt integrity check failed ===&lt;br /&gt;
After recreating the FreeIPA server, I uninstalled and reinstalled the FreeIPA client on a machine. &amp;lt;code&amp;gt;kinit&amp;lt;/code&amp;gt; works as expected, but sssd authentication fails with the following error in &amp;lt;code&amp;gt;/var/log/sssd/&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;[krb5_child[29691]] [get_and_save_tgt] (0x0020): [RID#6] 1725: [-1765328353][Decrypt integrity check failed]&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The fix here is to wipe out all the caches. &amp;lt;code&amp;gt;sss_cache -E&amp;lt;/code&amp;gt; isn&#039;t sufficient. You have to stop sssd and delete all the databases:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # systemctl stop sssd&lt;br /&gt;
# rm -rf /var/lib/sss/db/*&lt;br /&gt;
# systemctl start sssd&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
=== File permission issues ===&lt;br /&gt;
I ran into issues starting FreeIPA in a Docker container. Symptoms include the following issues in the subsections below.&lt;br /&gt;
&lt;br /&gt;
==== Bind / named doesn&#039;t start ====&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = ipa named-pkcs11[5407]: LDAP error: Invalid credentials: bind to LDAP server failed&lt;br /&gt;
ipa named-pkcs11[5407]: couldn&#039;t establish connection in LDAP connection pool: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: dynamic database &#039;ipa&#039; configuration failed: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: loading configuration: permission denied&lt;br /&gt;
ipa named-pkcs11[5407]: exiting (due to fatal error)&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
Ignore this for now; it is likely just a symptom of other issues.&lt;br /&gt;
&lt;br /&gt;
==== Samba doesn&#039;t start:  Error: Invalid credentials ====&lt;br /&gt;
When trying to start smb, I get the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [2023/01/24 05:17:46.170883,  0, pid=5124] ipa_sam.c:4945(bind_callback)&lt;br /&gt;
  bind_callback: cannot perform interactive SASL bind with GSSAPI. LDAP security error is 49&lt;br /&gt;
[2023/01/24 05:17:46.171155,  0, pid=5124] ../../source3/lib/smbldap.c:1059(smbldap_connect_system)&lt;br /&gt;
  failed to bind to server ldapi://%2fvar%2frun%2fslapd-HOME-STEAMR-COM.socket with dn=&amp;quot;[Anonymous bind]&amp;quot; Error: Invalid credentials&lt;br /&gt;
        (unknown)&lt;br /&gt;
[2023/01/24 05:17:46.171419,  1, pid=5124] ../../source3/lib/smbldap.c:1272(get_cached_ldap_connect)&lt;br /&gt;
  Connection to LDAP server failed for the 1 try!&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The fix was to stop FreeIPA with &amp;lt;code&amp;gt;ipactl stop&amp;lt;/code&amp;gt;, then &amp;lt;code&amp;gt;rm /run/samba/krb5cc_samba&amp;lt;/code&amp;gt;, and restart FreeIPA with &amp;lt;code&amp;gt;ipactl start&amp;lt;/code&amp;gt;. If the error recurs after restarting FreeIPA, then you have other issues that are preventing Samba from starting.&lt;br /&gt;
&lt;br /&gt;
==== Tomcat doesn&#039;t start: status=5/NOTINSTALLED ====&lt;br /&gt;
When trying to start Tomcat, you get exit code 5, as reported by systemd:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = pki-tomcatd@pki-tomcat.service: Control process exited, code=exited, status=5/NOTINSTALLED&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Failed with result &#039;exit-code&#039;.&lt;br /&gt;
Failed to start PKI Tomcat Server pki-tomcat.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The exit code 5 being &#039;NOTINSTALLED&#039; is a red herring. You likely have permission issues that are preventing the user running Tomcat (pkiuser) from accessing some certs or configs. Go through your Docker volumes and chown any directory called &#039;pki&#039; to pkiuser.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # chown -R 17:17 etc/pki etc/sysconfig/pki var/lib/pki var/lib/ipa/pki-ca&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
==== Tomcat still doesn&#039;t start: start-post operation timed out. Terminating. ====&lt;br /&gt;
Tomcat doesn&#039;t start because the start-post operation times out:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = pki-tomcatd@pki-tomcat.service: start-post operation timed out. Terminating.&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Control process exited, code=killed, status=15/TERM&lt;br /&gt;
pki-tomcatd@pki-tomcat.service: Failed with result &#039;timeout&#039;.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
There&#039;s likely still something Tomcat can&#039;t write to.&lt;br /&gt;
&lt;br /&gt;
It seemed to get better when I changed the ownership of /data/var/lib/pki/pki-tomcat/logs/ and /var/log/pki/pki-tomcat to pkiuser.&lt;br /&gt;
&lt;br /&gt;
To help troubleshoot, &amp;lt;code&amp;gt;su&amp;lt;/code&amp;gt; to pkiuser, run Tomcat as per the service file, and see if you get any stack traces.&lt;br /&gt;
&lt;br /&gt;
==== IPA Web GUI reports &amp;quot;Your session has expired. Please re-login&amp;quot; ====&lt;br /&gt;
Review the logs for dirsrv and see if there are any errors. &lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # tail -f /var/log/dirsrv/slapd-*/access&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
In this instance, this was caused when I accidentally removed &amp;lt;code&amp;gt;/etc/dirsrv/ds.keytab&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Configuring Samba issues ===&lt;br /&gt;
When running &amp;lt;code&amp;gt;ipa-adtrust-install --add-sids&amp;lt;/code&amp;gt;, you get:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # ipa-adtrust-install --add-sids&lt;br /&gt;
...&lt;br /&gt;
Configuring CIFS&lt;br /&gt;
  [1/25]: validate server hostname&lt;br /&gt;
  [error] ValueError: Host reports different name than configured: &#039;ipa&#039; versus &#039;ipa.home.steamr.com&#039;. Samba requires to have the same hostname or Kerberos principal &#039;cifs/ipa.home.steamr.com&#039; will not be found in Samba keytab.&lt;br /&gt;
Unexpected error - see /var/log/ipaserver-adtrust-install.log for details:&lt;br /&gt;
ValueError: Host reports different name than configured: &#039;ipa&#039; versus &#039;ipa.home.steamr.com&#039;. Samba requires to have the same hostname or Kerberos principal &#039;cifs/ipa.home.steamr.com&#039; will not be found in Samba keytab.&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Ensure that your hostname is the FQDN.&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # hostname&lt;br /&gt;
ipa&lt;br /&gt;
# hostname -f&lt;br /&gt;
ipa.home.steamr.com&lt;br /&gt;
&lt;br /&gt;
## You need to fix the hostname so that this is what you get:&lt;br /&gt;
# hostname ipa.home.steamr.com&lt;br /&gt;
# hostname&lt;br /&gt;
ipa.home.steamr.com&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Once the hostname is fixed, try the command again.&lt;br /&gt;
&lt;br /&gt;
=== DNS is missing A/AAAA entries for hosts ===&lt;br /&gt;
When trying to add a new host, the DNS record silently gets skipped. Possible issues:&lt;br /&gt;
&lt;br /&gt;
* I think this is related to this error when running ipa-client-install: &amp;lt;code&amp;gt;Could not update DNS SSHFP records.&amp;lt;/code&amp;gt;&lt;br /&gt;
* Possibly bad DNS entries were detected during install and the DNS tasks were skipped? {{Highlight&lt;br /&gt;
| code = Hostname (fc37.home.steamr.com) does not have A/AAAA record.&lt;br /&gt;
Failed to update DNS records.&lt;br /&gt;
Missing A/AAAA record(s) for host fc37.home.steamr.com: 10.1.2.32.&lt;br /&gt;
Incorrect reverse record(s):&lt;br /&gt;
10.1.2.32 is pointing to fc35.home.steamr.com. instead of fc37.home.steamr.com.&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Without this DNS entry set, other issues will crop up later on:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = [root@ipa /]# ipa service-add cifs/dnas.home.steamr.com&lt;br /&gt;
ipa: ERROR: Host &#039;dnas.home.steamr.com&#039; does not have corresponding DNS A/AAAA record&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
The quick work-around would be to add the DNS entry manually and try again.&lt;br /&gt;
&lt;br /&gt;
=== FreeIPA doesn&#039;t start under Docker: Failed to allocate manager object ===&lt;br /&gt;
When trying to run FreeIPA under Docker, you get the following message almost immediately on startup:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = Detected virtualization docker.&lt;br /&gt;
Detected architecture x86-64.&lt;br /&gt;
Failed to create /init.scope control group: Read-only file system&lt;br /&gt;
Failed to allocate manager object: Read-only file system&lt;br /&gt;
[!!!!!!] Failed to allocate manager object.&lt;br /&gt;
Exiting PID 1...&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
You&#039;re likely running this under a Docker host that has cgroup v2 enabled. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Possible workaround&#039;&#039;&#039; (though I couldn&#039;t get it to work before I reverted to Rocky Linux 8; after downgrading I realized that FreeIPA takes a long time before it appears functional): Disable cgroup v2 by adding &amp;lt;code&amp;gt;systemd.unified_cgroup_hierarchy=0&amp;lt;/code&amp;gt; as a kernel argument in &amp;lt;code&amp;gt;/etc/default/grub&amp;lt;/code&amp;gt;. Rebuild the grub configs with &amp;lt;code&amp;gt;grub2-mkconfig -o /boot/grub2/grub.cfg&amp;lt;/code&amp;gt; and reboot. Bring the container up as usual.&lt;br /&gt;
&lt;br /&gt;
Alternatively, use Podman (except that I can&#039;t because I use docker-compose for all my setups).&lt;br /&gt;
&lt;br /&gt;
=== Failed to authenticate to CA REST API ===&lt;br /&gt;
After suffering a brief power outage, my FreeIPA stack stopped working. A contributing factor might have been my auto-update mechanism, which pulled in the most recent version of the freeipa/freeipa:rocky-9 container image. Looking at the container logs, I see that the container begins to shut down after the upgrade command fails. Looking at the &amp;lt;code&amp;gt;/var/log/ipaupgrade.log&amp;lt;/code&amp;gt; log file, I see the following error:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = 2025-06-10T03:54:37Z DEBUG The ipa-server-upgrade command failed, exception: RemoteRetrieveError: Failed to authenticate to CA REST API&lt;br /&gt;
2025-06-10T03:54:37Z ERROR Unexpected error - see /var/log/ipaupgrade.log for details:&lt;br /&gt;
RemoteRetrieveError: Failed to authenticate to CA REST API&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
My first step was to try downgrading. The second oldest container image I had was FreeIPA 4.11, and attempting to downgrade to this version didn&#039;t work as the data had already been migrated to 4.12.&lt;br /&gt;
&lt;br /&gt;
Some searching turned up a [https://access.redhat.com/solutions/7122683 Red Hat knowledgebase article] which states that this is a bug with ipa-server 4.12.2-14. The fix is to change the following files:&lt;br /&gt;
&lt;br /&gt;
* Add &amp;lt;code&amp;gt;/etc/pki/pki-tomcat/Catalina/localhost/rewrite.config&amp;lt;/code&amp;gt; (copy it from &amp;lt;code&amp;gt;/usr/share/pki/server/conf/Catalina/localhost/rewrite.config&amp;lt;/code&amp;gt;)&lt;br /&gt;
* Edit &amp;lt;code&amp;gt;/etc/pki/pki-tomcat/server.xml&amp;lt;/code&amp;gt; to include the following before the closing &amp;lt;code&amp;gt;&amp;lt;/Host&amp;gt;&amp;lt;/code&amp;gt; tag near the bottom of the file: {{Highlight&lt;br /&gt;
| code = &amp;lt;Valve className=&amp;quot;org.apache.catalina.valves.rewrite.RewriteValve&amp;quot;/&amp;gt;&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Because both files are in &amp;lt;code&amp;gt;/etc&amp;lt;/code&amp;gt;, these two files should be in the data volume that&#039;s mounted into the FreeIPA container. Edit both files from the data volume and then try restarting the container. It should start properly.&lt;br /&gt;
&lt;br /&gt;
Enterprise software? FreeIPA feels like it was put together with duct tape.&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* https://bgstack15.wordpress.com/2017/05/10/samba-share-with-freeipa-auth/ - Older guide walking through how to set up FreeIPA and Samba&lt;br /&gt;
* https://blog.cubieserver.de/2018/synology-nas-samba-nfs-and-kerberos-with-freeipa-ldap/&lt;br /&gt;
* https://www.freeipa.org/page/Howto/Integrating_a_Samba_File_Server_With_IPA - Samba integration using kerberos (not ipasam)&lt;br /&gt;
* https://freeipa-users.redhat.narkive.com/ez2uKpFS/authenticate-samba-3-or-4-with-freeipa&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{Navbox Linux}}&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:Networking]]&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7686</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7686"/>
		<updated>2025-03-29T03:09:30Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and rotated with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows based app written in C# that is able to interact with the door signs. My goal is to have some Linux based device that can periodically update these displays (such as the weather or calendar events on an hourly basis). In order to do that, I will need to figure out how exactly this device talks via the USB port.&lt;br /&gt;
&lt;br /&gt;
There are some GitHub projects that already try to interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, the documentation on the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands and messages ===&lt;br /&gt;
Here are some of the commands supported by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display and 1 for response data returned by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Redraws the specified slide (0 ~ 4)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Redraw slide response&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFE&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Command next slide&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Command previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Check slide data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|status (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Response to the check slide call. Status is one of:&lt;br /&gt;
&lt;br /&gt;
* 0xFF - Slide is blank or unwritten&lt;br /&gt;
* 0xFE - Slide has data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x02&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide-canvas&lt;br /&gt;
(2 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Begin sending data for a specific slide and canvas. All data for the slide must be rewritten as everything is wiped.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* slide: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Response to the slide-canvas command.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x12&amp;lt;/code&amp;gt;&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Send image bitmap data by column.&lt;br /&gt;
Data is always 18 bytes long. 2 for column number, 16 for bitmap data.&lt;br /&gt;
&lt;br /&gt;
Bitmap data here is based on the canvas setting in order to get that extra bit for color.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes, big-endian. (0 ~ 295) (0x00,0x00 ~ 0x01,0x27)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
This is how you write data for the full 296x128 pixel display.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Response indicating whether the image column data was received. Data should be 0x01&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x05&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide-canvas-column&lt;br /&gt;
(4 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Slide: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x05&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x12&amp;lt;/code&amp;gt;&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Bitmap data received from a get command. The full response message is always 24 bytes long.&lt;br /&gt;
The 18 bytes of payload data use the same format as the send-image-column command.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x06&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|IAP mode&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x07&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x08&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Power off&lt;br /&gt;
|}&lt;br /&gt;
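The framing above (header 0xBB, command type, command ID, data length, data, checksum, 0x7E terminator) can be sketched in Python. Note that most checksums in the table are marked ?? and are not documented; the three known values (0xFF for next slide, 0x00 for previous slide, 0x08 for power off) happen to match a byte sum of the command ID, data length, and data modulo 256, so the sketch below guesses that algorithm. The command type is 0x00 in every known example, so whether it belongs in the sum is unverified; treat this as an observation to test against real hardware, not a confirmed spec.

```python
def build_frame(cmd_id: int, data: bytes, cmd_type: int = 0x00) -> bytes:
    """Build one EZ Door Sign serial frame.

    Checksum is a guess: a byte sum of command ID + data length + data
    bytes, mod 256. It reproduces the three checksums the table does
    document, but it is not confirmed by Santek.
    """
    checksum = (cmd_id + len(data) + sum(data)) & 0xFF
    return bytes([0xBB, cmd_type, cmd_id, len(data)]) + data + bytes([checksum, 0x7E])

# "Next slide" command from the table: BB 00 00 01 FE FF 7E
next_slide = build_frame(0x00, bytes([0xFE]))

# "Power off" command from the table: BB 00 07 01 00 08 7E
power_off = build_frame(0x07, bytes([0x00]))
```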
=== Writing bitmap information ===&lt;br /&gt;
The display is tri-color and uses 2 bits per pixel to denote color. The 2 bits are written across two separate &#039;canvases&#039; as part of the write operation. When writing an image:&lt;br /&gt;
&lt;br /&gt;
# Call the start write command (&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;) and specify the desired slide and &amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt; canvas&lt;br /&gt;
# Write the bitmap with command &amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;. This command operates column by column, from left to right, top to bottom. Each column holds 128 pixels (as the display is 128 pixels high). Black and red pixels should be 0. White should be 1. (see table below)&lt;br /&gt;
# Call the start write command (&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;) again but this time set canvas to &amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
# Write the bitmap again with command &amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt; as in step 2, but this time, set the bitmap to 0 for black and white and 1 for red.&lt;br /&gt;
&lt;br /&gt;
You must write to all addresses. The memory for the slide appears to get wiped when a write is initiated. Skipping any of the columns for either canvas will cause the image to become corrupt.&lt;br /&gt;
&lt;br /&gt;
If you call the 0x03 begin write command but don&#039;t actually write anything, the slide data becomes uninitialized and appears completely red. The check slide command (&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;) also returns &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; rather than &amp;lt;code&amp;gt;0xFE&amp;lt;/code&amp;gt;, denoting that there is no data in this slide.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
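The two-canvas encoding and the 18-byte column payload can be sketched in Python. This is an illustrative sketch of my reading of the tables: canvas 0 carries the "white" bit, canvas 1 carries the "red" bit, and each column packs its 128 pixels into 16 bytes behind a 2-byte big-endian column number. The MSB-first bit order within each byte is an assumption I have not verified against the display.

```python
# (canvas 0 bit, canvas 1 bit) per pixel color, per the table above.
COLOR_BITS = {
    "black": (0, 0),
    "white": (1, 0),
    "red":   (0, 1),
}

def pack_column(column: int, pixels: list) -> tuple:
    """Pack one 128-pixel column into the two 18-byte payloads
    (2-byte big-endian column number + 16 bytes of bitmap) that the
    0x04 send-image-column command expects, one payload per canvas.
    MSB-first bit order within each byte is an assumption."""
    assert len(pixels) == 128, "display is 128 pixels high"
    payloads = []
    for canvas in (0, 1):
        bitmap = bytearray(16)
        for i, color in enumerate(pixels):
            if COLOR_BITS[color][canvas]:
                bitmap[i // 8] |= 0x80 >> (i % 8)  # assumed MSB-first
        payloads.append(column.to_bytes(2, "big") + bytes(bitmap))
    return tuple(payloads)

# An all-white column at the rightmost position (column 295 = 0x01,0x27):
# canvas 0 is all 1s, canvas 1 is all 0s.
c0, c1 = pack_column(295, ["white"] * 128)
```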
For entirely black-and-white pictures, you still need to write all 0&#039;s on the second canvas. Otherwise, the slide will be rendered completely red. It&#039;s likely any unwritten data is erased to 0xFF on the flash.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7685</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7685"/>
		<updated>2025-03-29T03:07:49Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and rotated with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows based app written in C# that is able to interact with the door signs. My goal is to have some Linux based device that can periodically update these displays (such as the weather or calendar events on an hourly basis). In order to do that, I will need to figure out how exactly this device talks via the USB port.&lt;br /&gt;
&lt;br /&gt;
There are some GitHub projects that already try to interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, the documentation on the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands and messages ===&lt;br /&gt;
Here are some of the supported commands by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display, and 1 for a response returned by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Redraws the specified slide (0 ~ 4)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Redraw slide response&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFE&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Command next slide&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Command previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Check slide data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|status (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Response to the check slide call. Status is one of:&lt;br /&gt;
&lt;br /&gt;
* 0xFF - Slide is blank or unwritten&lt;br /&gt;
* 0xFE - Slide has data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x02&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide-canvas&lt;br /&gt;
(2 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Begin sending data for a specific slide and canvas. All data for the slide must be rewritten as everything is wiped.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* slide: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Response to the slide-canvas command.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x12&amp;lt;/code&amp;gt;&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|send image bitmap data by column.&lt;br /&gt;
Data is always 18 bytes long. 2 for column number, 16 for bitmap data.&lt;br /&gt;
&lt;br /&gt;
Bitmap data here is based on the canvas setting in order to get that extra bit for color.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes in decimal. (0 ~ 295) (0x00,0x00 ~ 0x01,0x27)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
Iterating over all 296 columns covers the entire 296x128 pixel display.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Did the send-image-column data succeed? Data should be 0x01.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x05&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;&lt;br /&gt;
|slide-canvas-column&lt;br /&gt;
(4 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Slide: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x05&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x12&amp;lt;/code&amp;gt;&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|&amp;lt;code&amp;gt;??&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|Bitmap data received from a get command. The response message is always 24 bytes long.&lt;br /&gt;
The 18 bytes of payload data are in the same format as the send-image-column command.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x06&amp;lt;/code&amp;gt;&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|IAP mode&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;0xBB&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x07&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x08&amp;lt;/code&amp;gt;&lt;br /&gt;
|&amp;lt;code&amp;gt;0x7E&amp;lt;/code&amp;gt;&lt;br /&gt;
|power off&lt;br /&gt;
|}&lt;br /&gt;
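The framing above can be exercised with a small helper. This is a sketch, not official tooling: the table marks most checksums as ??, but the rows with known values (0xFF for next slide, 0x00 for previous slide, site+1 for redraw, 0x08 for power off) are all consistent with taking the low byte of the sum of the bytes between the header and the checksum, which is the assumption used here.

```python
# Build a command frame: header (0xBB), command type, command ID,
# data length, data bytes, checksum, ending (0x7E).
# ASSUMPTION: checksum = low byte of the sum of everything between
# the header and the checksum (this matches every known value in
# the table above, e.g. 0x08 for power off).

HEADER = 0xBB
ENDING = 0x7E

def build_frame(cmd_type, cmd_id, data):
    body = [cmd_type, cmd_id, len(data)] + list(data)
    checksum = sum(body) % 256
    return bytes([HEADER] + body + [checksum, ENDING])

print(build_frame(0x00, 0x07, [0x00]).hex())  # power off -> bb00070100087e
```

Every row with a known checksum checks out against this rule, but verify against the official C# app before trusting it.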
=== Writing bitmap information ===&lt;br /&gt;
The display is tri-color and uses 2 bits per pixel to denote color. The 2 bits are written across two separate &#039;canvases&#039; as part of the write operation. When writing an image:&lt;br /&gt;
&lt;br /&gt;
# Call the start write command (&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;) and specify the desired slide and &amp;lt;code&amp;gt;0x00&amp;lt;/code&amp;gt; canvas&lt;br /&gt;
# Write the bitmap with command &amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt;. This command operates column by column, from left to right, top to bottom. Each column holds 128 pixels (as the display is 128 pixels high). Black and red pixels should be 0; white should be 1 (see table below).&lt;br /&gt;
# Call the start write command (&amp;lt;code&amp;gt;0x03&amp;lt;/code&amp;gt;) again but this time set canvas to &amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;&lt;br /&gt;
# Write the bitmap again with command &amp;lt;code&amp;gt;0x04&amp;lt;/code&amp;gt; as in step 2, but this time, set the bitmap to 0 for black and white and 1 for red.&lt;br /&gt;
&lt;br /&gt;
You must write to all addresses. The memory for the slide appears to get wiped when a write is initiated. Skipping any of the columns in either canvas will cause the image to become corrupt.&lt;br /&gt;
&lt;br /&gt;
If you call the 0x03 begin write command but don&#039;t actually write anything, the slide data becomes uninitialized and appears completely red. The check slide command (&amp;lt;code&amp;gt;0x01&amp;lt;/code&amp;gt;) also returns &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; rather than &amp;lt;code&amp;gt;0xFE&amp;lt;/code&amp;gt; denoting that there is no data at this slide.&lt;br /&gt;
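Putting the steps above together, a full slide update looks roughly like this. It assumes two hypothetical helpers that are not part of any official tooling: send(frame), which writes raw bytes to the serial port, and build_frame(cmd_type, cmd_id, data), which frames a command as described in the table above.

```python
# Write a complete slide: for each canvas, begin the write (0x03),
# stream all 296 columns (0x04), then redraw the slide (0x00).
# Skipping any column corrupts the image, so every column is sent.

def write_slide(send, build_frame, slide, canvas0_columns, canvas1_columns):
    # canvasN_columns: 296 entries of 16 bytes of bitmap data each
    for canvas, columns in ((0, canvas0_columns), (1, canvas1_columns)):
        send(build_frame(0x00, 0x03, [slide, canvas]))  # begin write
        for col, bitmap in enumerate(columns):
            hi, lo = divmod(col, 256)  # column number as 2 big-endian bytes
            send(build_frame(0x00, 0x04, [hi, lo] + list(bitmap)))
    send(build_frame(0x00, 0x00, [slide]))  # redraw the slide
```

Whether the device needs delays or an acknowledgement between columns is not covered here; the GitHub projects linked above are a better reference for timing.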
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
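The color table above can be turned into a small encoder that produces the two 16-byte canvas bitmaps for one 128-pixel column. The bit ordering within each byte is not documented on this page, so most-significant-bit-first is an assumption here.

```python
# Encode one 128-pixel column into the two canvas bitmaps:
# canvas 0 is 1 for white and 0 for black or red;
# canvas 1 is 1 for red and 0 for black or white.
# ASSUMPTION: bits are packed most-significant-bit first.

def encode_column(pixels):
    assert len(pixels) == 128, "the display is 128 pixels high"
    canvas0 = bytearray(16)
    canvas1 = bytearray(16)
    for i, color in enumerate(pixels):
        byte_index, bit = divmod(i, 8)
        mask = 0x80 // (2 ** bit)  # MSB-first within each byte
        if color == "white":
            canvas0[byte_index] += mask
        elif color == "red":
            canvas1[byte_index] += mask
        # black leaves both bits at 0
    return bytes(canvas0), bytes(canvas1)

c0, c1 = encode_column(["white"] * 8 + ["red"] * 8 + ["black"] * 112)
print(c0.hex()[:4], c1.hex()[:4])  # ff00 00ff
```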
For pictures that are only black and white, you can effectively ignore setting canvas=1.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7684</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7684"/>
		<updated>2025-03-27T23:50:46Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Commands */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and cycled with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows-based app written in C# that can interact with the door signs. My goal is to have a Linux-based device periodically update these displays (for example, with the weather or calendar events on an hourly basis). To do that, I need to figure out exactly how this device talks over the USB port.&lt;br /&gt;
&lt;br /&gt;
There are already some GitHub projects that interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, the documentation on the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands and messages ===&lt;br /&gt;
Here are some of the supported commands by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display, and 1 for a response returned by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|Redraws the specified slide (0 ~ 4)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|Redraw slide response&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFE&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x7E&lt;br /&gt;
|next slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x00&lt;br /&gt;
|0x7E&lt;br /&gt;
|previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|slide (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|Check if the slide is blank. If the returned status byte is 254 (0xFE), then isBlank is false.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|status (1 byte)&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|If the status code returned is 0xFE, the slide is not blank.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x03&lt;br /&gt;
|0x02&lt;br /&gt;
|slide-canvas&lt;br /&gt;
(2 bytes)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|Begin sending data for a specific slide and canvas. All data for the slide must be rewritten.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* slide: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x03&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|Response to the slide-canvas command.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x04&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|send image bitmap data by column.&lt;br /&gt;
Data is always 18 bytes long. 2 for column number, 16 for bitmap data.&lt;br /&gt;
&lt;br /&gt;
Bitmap data here is based on the canvas setting in order to get that extra bit for color.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes in decimal. (0 ~ 295) (0x00,0x00 ~ 0x01,0x27)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
Iterating over all 296 columns covers the entire 296x128 pixel display.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x04&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|Did the send-image-column data succeed? Data should be 0x01.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x05&lt;br /&gt;
|0x04&lt;br /&gt;
|slide-canvas-column&lt;br /&gt;
(4 bytes)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Slide: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x05&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
(18 bytes)&lt;br /&gt;
|??&lt;br /&gt;
|0x7E&lt;br /&gt;
|Bitmap data received from a get command. The response message is always 24 bytes long.&lt;br /&gt;
The 18 bytes of payload data are in the same format as the send-image-column command.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|&lt;br /&gt;
|0x06&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|IAP mode&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x07&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x08&lt;br /&gt;
|0x7E&lt;br /&gt;
|power off&lt;br /&gt;
|}&lt;br /&gt;
=== Writing bitmap information ===&lt;br /&gt;
The display is tri-color and uses 2 bits per pixel to denote color. The 2 bits are written across two separate &#039;canvases&#039; as part of the write operation. When writing an image:&lt;br /&gt;
&lt;br /&gt;
# Set the canvas to 0&lt;br /&gt;
# Write the bitmap (column by column, from left to right, top to bottom). Each column holds 128 pixels (as the display is 128 pixels high). Black and red pixels should be 0; white should be 1 (see table below).&lt;br /&gt;
# Set the canvas to 1&lt;br /&gt;
# Write the bitmap like in step 2, but set the bitmap to 0 for black and white and 1 for red.&lt;br /&gt;
&lt;br /&gt;
You must write to all addresses. The memory for the slide appears to get wiped when a write is initiated. Skipping any of the columns in either canvas will cause the image to become corrupt.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
For pictures that are only black and white, you can effectively ignore setting canvas=1.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7683</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7683"/>
		<updated>2025-03-27T07:05:31Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Commands */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and cycled with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows-based app written in C# that can interact with the door signs. My goal is to have a Linux-based device periodically update these displays (for example, with the weather or calendar events on an hourly basis). To do that, I need to figure out exactly how this device talks over the USB port.&lt;br /&gt;
&lt;br /&gt;
There are already some GitHub projects that interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, the documentation on the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands ===&lt;br /&gt;
Here are some of the supported commands by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display, and 1 for a response returned by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x03&lt;br /&gt;
|0x02&lt;br /&gt;
|site-canvas&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|set image head data. This specifies which canvas you are overwriting.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* site: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x03&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|did set image head data succeed? Data should be 0x01&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x04&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|send image bitmap data by column.&lt;br /&gt;
Data is always 18 bytes long. 2 for column number, 16 for bitmap data.&lt;br /&gt;
&lt;br /&gt;
Bitmap data here is based on the canvas setting in order to get that extra bit for color.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes in decimal. (0 ~ 295) (0x00,0x00 ~ 0x01,0x27)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
Iterating over all 296 columns covers the entire 296x128 pixel display.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x04&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|Did the send-image-column data succeed? Data should be 0x01.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x05&lt;br /&gt;
|0x04&lt;br /&gt;
|site-canvas-column&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Site: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x05&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Bitmap data received from a get command. The response message is always 24 bytes long.&lt;br /&gt;
The 18 bytes of payload data are in the same format as the send-image-column command.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|&lt;br /&gt;
|0x06&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|IAP mode&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x07&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x08&lt;br /&gt;
|0x7E&lt;br /&gt;
|power off&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFE&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x7E&lt;br /&gt;
|next slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x00&lt;br /&gt;
|0x7E&lt;br /&gt;
|previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|site+1&lt;br /&gt;
|0x7E&lt;br /&gt;
|Redraws the slide (0 ~ 4)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|Redrew the slide ok.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Check if the site is blank. If the returned 4th byte is 254, then isBlank is false.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Status given back is 0xFE, meaning the site is not blank.&lt;br /&gt;
|}&lt;br /&gt;
The checksum is calculated by summing everything.&lt;br /&gt;
&lt;br /&gt;
When reading data...&lt;br /&gt;
&lt;br /&gt;
* Read and discard bytes until you see 187 (0xBB, the header byte)&lt;br /&gt;
* ???&lt;br /&gt;
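A minimal reader for the notes above: resynchronize on the 187 (0xBB) header, then consume the rest of the frame based on the length byte. Here read_byte is a stand-in for reading one byte from the serial port (e.g. with pyserial); this is a sketch of the resync idea, not a verified parser.

```python
# Read one frame: discard bytes until the 0xBB (187) header, then read
# command type, command ID, data length, the data, the checksum, and
# the 0x7E terminator.

def read_frame(read_byte):
    while read_byte() != 0xBB:  # resynchronize on the header byte (187)
        pass
    cmd_type = read_byte()
    cmd_id = read_byte()
    length = read_byte()
    data = [read_byte() for _ in range(length)]
    checksum = read_byte()
    ending = read_byte()
    assert ending == 0x7E, "frame did not end with 0x7E"
    return cmd_type, cmd_id, data, checksum

# Canned "is the site blank" response: BB 01 01 01 FE 01 7E,
# preceded by a junk byte that gets discarded.
stream = iter([0x42, 0xBB, 0x01, 0x01, 0x01, 0xFE, 0x01, 0x7E])
print(read_frame(lambda: next(stream)))  # (1, 1, [254], 1)
```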
&lt;br /&gt;
=== Color data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;site-canvas-row&amp;lt;/code&amp;gt; data that you read and write using the commands above determines the color. Because each canvas only holds one bit per pixel, the second canvas is effectively how you define an additional bit per pixel.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
For pictures that are only black and white, you can effectively ignore setting canvas=1.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7682</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7682"/>
		<updated>2025-03-27T06:54:59Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Commands */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and cycled with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows-based app written in C# that can interact with the door signs. My goal is to have a Linux-based device periodically update these displays (for example, with the weather or calendar events on an hourly basis). To do that, I need to figure out exactly how this device talks over the USB port.&lt;br /&gt;
&lt;br /&gt;
There are already some GitHub projects that interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, the documentation on the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands ===&lt;br /&gt;
Here are some of the supported commands by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display, and 1 for a response returned by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x03&lt;br /&gt;
|0x02&lt;br /&gt;
|site-canvas&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|set image head data. This specifies which canvas you are overwriting.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* site: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x03&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|did set image head data succeed? Data should be 0x01&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x04&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|send image bitmap data by column.&lt;br /&gt;
Data is always 18 bytes long. 2 for column number, 16 for bitmap data.&lt;br /&gt;
&lt;br /&gt;
Bitmap data here is based on the canvas setting in order to get that extra bit for color.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes in decimal. (0 ~ 295) (0x00,0x00 ~ 0x01,0x27)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
Iterating over all 296 columns covers the entire 296x128 pixel display.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x04&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|Did the send-image-column data succeed? Data should be 0x01.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x05&lt;br /&gt;
|0x04&lt;br /&gt;
|site-canvas-column&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Site: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x05&lt;br /&gt;
|0x12&lt;br /&gt;
|column-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Bitmap data received from a get command. The response message is always 24 bytes long.&lt;br /&gt;
The 18 bytes of payload data are in the same format as the send-image-column command.&lt;br /&gt;
&lt;br /&gt;
* Column: 2 bytes. (0 ~ 295)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|&lt;br /&gt;
|0x06&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|IAP mode&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x07&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x08&lt;br /&gt;
|0x7E&lt;br /&gt;
|power off&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFE&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x7E&lt;br /&gt;
|next slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x00&lt;br /&gt;
|0x7E&lt;br /&gt;
|previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|site+1&lt;br /&gt;
|0x7E&lt;br /&gt;
|Redraws the slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Check if the site is blank. If the 4th byte is 254, then isBlank is false.&lt;br /&gt;
|}&lt;br /&gt;
The checksum is calculated by summing everything.&lt;br /&gt;
&lt;br /&gt;
When reading data...&lt;br /&gt;
&lt;br /&gt;
* Read and discard bytes until you see 187 (0xBB, the header byte)&lt;br /&gt;
* ???&lt;br /&gt;
&lt;br /&gt;
=== Color data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;site-canvas-row&amp;lt;/code&amp;gt; data that you read and write using the commands above determines the color. Because each canvas only holds one bit per pixel, the second canvas is effectively how you define an additional bit per pixel.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
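The table above can be expressed as a small lookup; a sketch (the function and constant names here are mine, not from the Santek app):&lt;br /&gt;

```python
# Map a pixel's (canvas 0 bit, canvas 1 bit) pair to its color,
# per the table above.
COLORS = {
    (0, 0): "black",
    (1, 0): "white",
    (0, 1): "red",
}

def pixel_color(canvas0_bit: int, canvas1_bit: int) -> str:
    # (1, 1) is not listed in the table; its behavior is unknown,
    # so it deliberately raises KeyError here.
    return COLORS[(canvas0_bit, canvas1_bit)]
```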
For black-and-white-only pictures, you can effectively skip setting canvas=1.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7681</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7681"/>
		<updated>2025-03-27T06:02:24Z</updated>

		<summary type="html">&lt;p&gt;Leo: /* Commands */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and rotated with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows-based app written in C# that can interact with the door signs. My goal is to have some Linux-based device periodically update these displays (for example, with the weather or calendar events on an hourly basis). To do that, I need to figure out exactly how this device talks over the USB port.&lt;br /&gt;
&lt;br /&gt;
There are already some GitHub projects that interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, their documentation of the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
Use 9600 baud. No parity.&lt;br /&gt;
&lt;br /&gt;
The display has to be on. Ensure that the blue LED is lit before trying to talk to it via serial.&lt;br /&gt;
&lt;br /&gt;
=== Commands ===&lt;br /&gt;
Here are some of the commands supported by the display.&lt;br /&gt;
&lt;br /&gt;
Command Type is 0 for a command sent to the display, and 1 for response data returned by the display.&lt;br /&gt;
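As a sketch, a frame with this layout can be built in Python as follows. The checksum rule (low byte of the sum of command type, command ID, data length, and data bytes) is my inference from the fixed frames in the table below, e.g. power off: 0x00 + 0x07 + 0x01 + 0x00 = 0x08:&lt;br /&gt;

```python
def build_frame(cmd_id: int, data: bytes, cmd_type: int = 0x00) -> bytes:
    """Build a command frame: header, type, ID, length, data, checksum, ending."""
    body = bytes([cmd_type, cmd_id, len(data)]) + data
    checksum = sum(body) & 0xFF  # assumed: low byte of the body sum
    return bytes([0xBB]) + body + bytes([checksum, 0x7E])

# Reproduces the fixed frames from the table below:
build_frame(0x07, b"\x00")  # power off  -> bb 00 07 01 00 08 7e
build_frame(0x00, b"\xfe")  # next slide -> bb 00 00 01 fe ff 7e
```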
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x03&lt;br /&gt;
|0x02&lt;br /&gt;
|site-canvas&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Set image head data. This specifies which canvas you are overwriting.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* site: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x04&lt;br /&gt;
|0x12&lt;br /&gt;
|row-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Send image row data.&lt;br /&gt;
Data is always 18 bytes long: 2 bytes for the row number, 16 for bitmap data.&lt;br /&gt;
Bitmap data here is written to the canvas selected by the set-image-head command above.&lt;br /&gt;
&lt;br /&gt;
* Row: 2 bytes, binary-coded decimal (e.g. 0x01, 0x27 = row 127)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x05&lt;br /&gt;
|0x04&lt;br /&gt;
|site-canvas-row&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Site: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Row: 2 bytes, binary-coded decimal (e.g. 0x01, 0x27 = row 127)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
See the response payload below.&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x01&lt;br /&gt;
|0x05&lt;br /&gt;
|0x12&lt;br /&gt;
|row-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Bitmap data received from a get command. The response message is always 24 bytes long.&lt;br /&gt;
There are 18 bytes of payload data, in the same format as the send-image-row command.&lt;br /&gt;
&lt;br /&gt;
* Row: 2 bytes, binary-coded decimal (e.g. 0x01, 0x27 = row 127)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|&lt;br /&gt;
|0x06&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|iap model&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x07&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x08&lt;br /&gt;
|0x7E&lt;br /&gt;
|power off&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFE&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x7E&lt;br /&gt;
|next slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x00&lt;br /&gt;
|0x7E&lt;br /&gt;
|previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|site+1&lt;br /&gt;
|0x7E&lt;br /&gt;
|Redraws the slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Queries whether the site is blank. If the 4th byte of the response is 254, then isBlank is false.&lt;br /&gt;
|}&lt;br /&gt;
The checksum is calculated by summing the command type, command ID, data length, and all data bytes, modulo 256. For example, the power-off command's checksum is 0x00 + 0x07 + 0x01 + 0x00 = 0x08.&lt;br /&gt;
&lt;br /&gt;
When reading data...&lt;br /&gt;
&lt;br /&gt;
* read and discard bytes until you see the 0xBB (187) header byte&lt;br /&gt;
* ???&lt;br /&gt;
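The 18-byte payload for send-image-row can be assembled as in the sketch below. The binary-coded-decimal row encoding is inferred from the 0x01, 0x27 = 127 example in the table; MSB-first bit order within each bitmap byte is an assumption I have not verified against the device:&lt;br /&gt;

```python
def encode_row_number(row: int) -> bytes:
    """Encode a row number as 2 BCD bytes (inferred: 127 -> 0x01, 0x27)."""
    return bytes.fromhex(f"{row:04d}")  # decimal digits reused as hex nibbles

def pack_row(pixels) -> bytes:
    """Pack 128 pixel bits into 16 bytes, MSB-first (assumed bit order)."""
    assert len(pixels) == 128
    out = bytearray(16)
    for i, bit in enumerate(pixels):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)

# 2-byte row number + 16-byte bitmap = the 18-byte data payload.
payload = encode_row_number(127) + pack_row([0] * 128)
```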
&lt;br /&gt;
=== Color data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;site-canvas-row&amp;lt;/code&amp;gt; data that you read/write using the commands above determines the color. Each canvas stores only one bit per pixel, so the second canvas effectively provides an additional bit per pixel.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!&lt;br /&gt;
!Canvas = 0&lt;br /&gt;
!Canvas = 1&lt;br /&gt;
!Color&lt;br /&gt;
|-&lt;br /&gt;
! rowspan=&amp;quot;3&amp;quot; |Pixel bit value&lt;br /&gt;
|0&lt;br /&gt;
|0&lt;br /&gt;
|Black&lt;br /&gt;
|-&lt;br /&gt;
|1&lt;br /&gt;
|0&lt;br /&gt;
|White&lt;br /&gt;
|-&lt;br /&gt;
|0&lt;br /&gt;
|1&lt;br /&gt;
|Red&lt;br /&gt;
|}&lt;br /&gt;
For black-and-white-only pictures, you can effectively skip setting canvas=1.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7680</id>
		<title>Santek EZ Door Sign</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Santek_EZ_Door_Sign&amp;diff=7680"/>
		<updated>2025-03-27T05:09:05Z</updated>

		<summary type="html">&lt;p&gt;Leo: Initial content&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Santek EZ Door Sign is a cordless e-ink display intended to be mounted outside of offices or rooms. It is capable of storing 5 custom images that can be displayed and rotated with the side button.&lt;br /&gt;
&lt;br /&gt;
The 2.9&amp;quot; variant which I obtained has a resolution of 296 x 128 pixels and is capable of displaying black, white, and red.&lt;br /&gt;
&lt;br /&gt;
== Serial protocol ==&lt;br /&gt;
&lt;br /&gt;
=== Background ===&lt;br /&gt;
Santek provides a Windows-based app written in C# that can interact with the door signs. My goal is to have some Linux-based device periodically update these displays (for example, with the weather or calendar events on an hourly basis). To do that, I need to figure out exactly how this device talks over the USB port.&lt;br /&gt;
&lt;br /&gt;
There are already some GitHub projects that interface with this device, including https://github.com/m3m0r7/ez-door-sign in PHP and https://github.com/kenichi884/ezsign.py in Python. However, their documentation of the serial protocol is a bit lacking. Fortunately, the protocol doesn&#039;t seem too complicated.&lt;br /&gt;
&lt;br /&gt;
=== Serial configuration ===&lt;br /&gt;
9600 baud?&lt;br /&gt;
&lt;br /&gt;
Expects CH343 device id&lt;br /&gt;
&lt;br /&gt;
=== Commands ===&lt;br /&gt;
Here are some of the commands supported by the display.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Header&lt;br /&gt;
!Command&lt;br /&gt;
Type&lt;br /&gt;
!Command&lt;br /&gt;
ID&lt;br /&gt;
!Data&lt;br /&gt;
Length&lt;br /&gt;
!Data&lt;br /&gt;
!Chksum&lt;br /&gt;
!Ending&lt;br /&gt;
!Notes&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x03&lt;br /&gt;
|0x02&lt;br /&gt;
|site-canvas&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|set image head data. This specifies which canvas you are overwriting.&lt;br /&gt;
Data is always 2 bytes long, containing:&lt;br /&gt;
&lt;br /&gt;
* site: slide number (0 through 4)&lt;br /&gt;
* canvas: Color canvas (0 or 1)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x04&lt;br /&gt;
|0x12&lt;br /&gt;
|row-bitmap&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|send image row data.&lt;br /&gt;
Data is always 18 bytes long. 2 for row number, 16 for bitmap data.&lt;br /&gt;
Bitmap data here is based on the canvas setting (row above)&lt;br /&gt;
&lt;br /&gt;
* Row: 2 bytes, binary-coded decimal (e.g. 0x01, 0x27 = row 127)&lt;br /&gt;
* Bitmap: 16 bytes. One bit per pixel makes 128 pixels&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x05&lt;br /&gt;
|0x04&lt;br /&gt;
|site-canvas-row&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Get image bitmap from display memory.&lt;br /&gt;
Data is always 4 bytes long.&lt;br /&gt;
&lt;br /&gt;
* Site: Slide number (0 through 4)&lt;br /&gt;
* Canvas: Color canvas (0 or 1)&lt;br /&gt;
* Row: 2 bytes, binary-coded decimal (e.g. 0x01, 0x27 = row 127)&lt;br /&gt;
&lt;br /&gt;
To retrieve the actual color, you have to read both canvases.&lt;br /&gt;
&lt;br /&gt;
Data that&#039;s returned is...???&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|&lt;br /&gt;
|0x06&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|0x7E&lt;br /&gt;
|iap model&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x07&lt;br /&gt;
|0x01&lt;br /&gt;
|0x00&lt;br /&gt;
|0x08&lt;br /&gt;
|0x7E&lt;br /&gt;
|power off&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFE&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x7E&lt;br /&gt;
|next slide&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0xFF&lt;br /&gt;
|0x00&lt;br /&gt;
|0x7E&lt;br /&gt;
|previous slide (called &#039;up&#039; in C# code)&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|site+1&lt;br /&gt;
|0x7E&lt;br /&gt;
|&amp;quot;assign site&amp;quot;. Likely redraws the given slide number, e.g. slide=0&lt;br /&gt;
|-&lt;br /&gt;
|0xBB&lt;br /&gt;
|0x00&lt;br /&gt;
|0x01&lt;br /&gt;
|0x01&lt;br /&gt;
|site&lt;br /&gt;
|variable&lt;br /&gt;
|0x7E&lt;br /&gt;
|Queries whether the site is blank. If the 4th byte of the response is 254, then isBlank is false.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
When reading data...&lt;br /&gt;
&lt;br /&gt;
* read and discard bytes until you see the 0xBB (187) header byte&lt;br /&gt;
* ???&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
	<entry>
		<id>https://leo.leung.xyz/wiki/index.php?title=Govee_H5054&amp;diff=7679</id>
		<title>Govee H5054</title>
		<link rel="alternate" type="text/html" href="https://leo.leung.xyz/wiki/index.php?title=Govee_H5054&amp;diff=7679"/>
		<updated>2025-03-22T05:25:29Z</updated>

		<summary type="html">&lt;p&gt;Leo: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Govee H5054 is a wireless water sensor that detects water leaks. It uses a 433 MHz radio to transmit events.&lt;br /&gt;
&lt;br /&gt;
== RTL 433 ==&lt;br /&gt;
See the source code here: https://github.com/merbanan/rtl_433/blob/master/src/devices/govee.c&lt;br /&gt;
&lt;br /&gt;
Here&#039;s a sample output from rtl_433 for this device:&lt;br /&gt;
{{highlight|lang=text|code=&lt;br /&gt;
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ &lt;br /&gt;
time      : 2025-03-21 18:13:09&lt;br /&gt;
model     : Govee-Water  id        : 14029&lt;br /&gt;
event     : Button Press detect_wet: 0             Raw Code  : 36cd30547e22  Integrity : CRC&lt;br /&gt;
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ &lt;br /&gt;
time      : 2025-03-21 18:13:11&lt;br /&gt;
model     : Govee-Water  id        : 14029&lt;br /&gt;
Battery level: 0.050     Battery   : 1860 mV       event     : Battery Report Raw Code  : 36cd310507c7 Integrity : CRC&lt;br /&gt;
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ &lt;br /&gt;
time      : 2025-03-21 18:13:12&lt;br /&gt;
model     : Govee-Water  id        : 14029&lt;br /&gt;
Battery level: 0.050     Battery   : 1860 mV       event     : Battery Report Raw Code  : 36cd310507c7 Integrity : CRC&lt;br /&gt;
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ &lt;br /&gt;
time      : 2025-03-21 18:13:14&lt;br /&gt;
model     : Govee-Water  id        : 14029&lt;br /&gt;
event     : Button Press detect_wet: 0             Raw Code  : 36cd30547e22  Integrity : CRC&lt;br /&gt;
&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== Home assistant integration ==&lt;br /&gt;
The rtl_433 utility is able to listen for and decode specific 433 MHz radio transmissions using an RTL-SDR USB dongle. It supports sending decoded messages to various destinations, including MQTT. As a result, rtl_433 can effectively act as a gateway between 433 MHz radio transmissions and our Home Assistant MQTT broker.&lt;br /&gt;
&lt;br /&gt;
For my personal setup, I have a Nooelec RTL-SDR USB dongle plugged into an OpenWRT router. The rtl_433 package is available in OpenWRT 24&#039;s package repos. You can run rtl_433 under procd within OpenWRT by creating an init script at /etc/init.d/rtl_433 with the following contents:&lt;br /&gt;
&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = #!/bin/sh /etc/rc.common&lt;br /&gt;
USE_PROCD=1&lt;br /&gt;
START=95&lt;br /&gt;
STOP=01&lt;br /&gt;
&lt;br /&gt;
start_service() {&lt;br /&gt;
    procd_open_instance&lt;br /&gt;
    procd_set_param command /usr/bin/rtl_433 -C si -R 231 -F &amp;quot;mqtt://mqtt-broker-server:1883,devices=rtl_433/Govee-Water[/id]&amp;quot;&lt;br /&gt;
    procd_close_instance&lt;br /&gt;
}&lt;br /&gt;
| lang = text&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
If you are not using OpenWRT or if you want to run the command manually, run the following:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # rtl_433 -C si  -R 231 -F mqtt://mqtt-broker-server:1883,devices=rtl_433/Govee-Water[/id]&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}&lt;br /&gt;
Breaking the command down, the arguments are as follows: &lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;-C si&amp;lt;/code&amp;gt;: Convert units in the decoded output to SI units&lt;br /&gt;
* &amp;lt;code&amp;gt;-R 231&amp;lt;/code&amp;gt;: Decode only Govee H5054 messages. You can list all the supported protocols by passing -R help.&lt;br /&gt;
* &amp;lt;code&amp;gt;-F&amp;lt;/code&amp;gt; specifies the output. In our case, we want to use MQTT without authentication using the topic &amp;lt;code&amp;gt;rtl_433/Govee-Water/&amp;lt;device-id&amp;gt;&amp;lt;/code&amp;gt;. rtl_433 will replace &amp;lt;code&amp;gt;[/id]&amp;lt;/code&amp;gt; with the actual device id.&lt;br /&gt;
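If you want to post-process events yourself instead of (or alongside) MQTT, rtl_433 can also emit one JSON object per event with -F json. A short sketch of picking out the fields shown in the sample output above; the sample line and its exact key names are my assumption of what rtl_433 emits for this device, mirrored from the text output:&lt;br /&gt;

```python
import json

# A hypothetical event line as rtl_433 -F json might emit it, with values
# mirroring the sample text output above.
line = ('{"time": "2025-03-21 18:13:09", "model": "Govee-Water", '
        '"id": 14029, "event": "Button Press", "detect_wet": 0}')

event = json.loads(line)
if event.get("model") == "Govee-Water":
    print(f"sensor {event['id']}: {event['event']} (wet={event['detect_wet']})")
    # prints: sensor 14029: Button Press (wet=0)
```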
&lt;br /&gt;
Once rtl_433 is running, we will need to get Home Assistant configured for these sensors. Add the MQTT integration in Home Assistant. &lt;br /&gt;
&lt;br /&gt;
Next, use the auto-discovery script at: https://github.com/mekaneck/Govee-Water-Leak-Home-Assistant-Autodiscovery. The installation involves copying and pasting the script into the Home Assistant configuration file. The script contains a list of devices that should be added. You will need to determine the random 5 digit device ID that the sensors are set to from the factory, their serial number (optional but helpful for you to identify the sensors later on), and a friendly name. Save the configuration file, restart Home Assistant, and then run the script (under Settings -&amp;gt; Automation -&amp;gt; Scripts). After running the script, you should see the sensors appear under the MQTT integration.&lt;br /&gt;
&lt;br /&gt;
If you need help determining the 5 digit device ID for each sensor, you can run rtl_433 without the &amp;lt;code&amp;gt;-F&amp;lt;/code&amp;gt; flag (so that it outputs directly to standard out) and then press the sensor button.  You should be able to see the device id in the output. For example, 14029:&lt;br /&gt;
{{Highlight&lt;br /&gt;
| code = # rtl_433 -C si&lt;br /&gt;
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _&lt;br /&gt;
time      : 2025-03-21 21:49:08&lt;br /&gt;
model     : Govee-Water  id        : 14029&lt;br /&gt;
event     : Button Press detect_wet: 0             Raw Code  : 36cd30547e22  Integrity : CRC&lt;br /&gt;
| lang = terminal&lt;br /&gt;
}}The battery values are only reported when the device is first powered on, when the device is chirping due to low battery power, or about once per day. If Home Assistant isn&#039;t reporting the battery readings, try reseating the battery.&lt;br /&gt;
&lt;br /&gt;
== Other notes ==&lt;br /&gt;
The device will begin chirping when the battery voltage drops below 2 volts (verified with a bench power supply). Every time it chirps, it will broadcast its battery reading values.&lt;/div&gt;</summary>
		<author><name>Leo</name></author>
	</entry>
</feed>