How to hot-swap SATA disks on Linux

From Leo's Notes
Last edited on 10 October 2023, at 21:38.

When running Linux on a server or NAS device, you might want to hot-swap disks without bringing the system down. On machines without a RAID controller, this is a little trickier because you have to tell Linux to remove and add the device manually. This page covers how to hot-swap SATA devices on Linux.

Removing a Drive

If a device becomes unresponsive, or if you simply want to remove a SATA disk from a running system:

  1. Ensure the disk is unmounted.
  2. Ensure the disk isn't used as swap or by an LVM volume group.
  3. Remove the disk from the system by running echo 1 > /sys/block/sdX/device/delete. This should also power off the drive. If this doesn't work, try using hdparm to put the device into its lowest power mode with hdparm -Y /dev/sdX, then try again.

Once the drive spins down, you can disconnect the power and SATA connector.
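
The checks in steps 1 and 2 can be scripted so the delete is only issued when the disk is idle. The following is a minimal sketch, not a tested tool: the function names disk_in_use and remove_disk, and the SYSFS_ROOT, MOUNTS_FILE, and SWAPS_FILE variables, are made up for this example. It only looks at /proc/mounts and /proc/swaps, so LVM membership still has to be checked by hand.

```shell
#!/bin/sh
# Illustrative overrides (not standard variables) so the logic can be
# tried against a copy of the files instead of the live system.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}
SWAPS_FILE=${SWAPS_FILE:-/proc/swaps}

# disk_in_use sdX: succeed if sdX or one of its partitions is listed
# as mounted or as active swap.
disk_in_use() {
    grep -Eq "^/dev/$1[0-9]*[[:space:]]" "$MOUNTS_FILE" "$SWAPS_FILE"
}

# remove_disk sdX: refuse if the disk is busy, otherwise ask the kernel
# to detach it. LVM membership is NOT checked here.
remove_disk() {
    if disk_in_use "$1"; then
        echo "refusing: /dev/$1 is still mounted or used as swap" >&2
        return 1
    fi
    echo 1 > "$SYSFS_ROOT/block/$1/device/delete"
}
```

Run as root, remove_disk sdX would then replace the bare echo in step 3.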

Rescanning SATA Bus

Depending on the chipset and SATA controller, a newly connected drive may not show up on the system automatically. In that case, you need to trigger a rescan of the bus manually.

Determine which controller the disk is attached to, then trigger a rescan on it. Listing the scsi_host directory shows the PCI device and ATA port behind each host, which should give you a clue as to which host your disk is on:

# ls -al /sys/class/scsi_host
lrwxrwxrwx  1 root root 0 Feb 18 22:21 host0 -> ../../devices/pci0000:00/0000:00:06.0/ata1/host0/scsi_host/host0/
lrwxrwxrwx  1 root root 0 Feb 18 22:21 host1 -> ../../devices/pci0000:00/0000:00:06.0/ata2/host1/scsi_host/host1/
lrwxrwxrwx  1 root root 0 Feb 18 22:21 host2 -> ../../devices/pci0000:00/0000:00:04.0/0000:01:06.0/ata3/host2/scsi_host/host2/
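
If a disk is still attached (for example, the one you are about to replace), its host can be read from the disk's own sysfs path, which tells you which host to rescan when a new disk goes into the same port. This is a minimal sketch; host_of_disk and SYSFS_ROOT are names invented for this example, and it assumes the resolved path contains a single hostN component, as in the listing above.

```shell
#!/bin/sh
# Illustrative override (not a standard variable) for trying this out
# against a fake directory tree.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# host_of_disk sdX: resolve the /sys/block/sdX symlink and print the
# hostN component of the resulting path.
host_of_disk() {
    readlink -f "$SYSFS_ROOT/block/$1" | tr '/' '\n' | grep -E '^host[0-9]+$' | head -n 1
}
```

Note the host before removing the old disk; once the disk is deleted, its entry under /sys/block is gone.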

Trigger a rescan by writing three dashes to the host's scan file. Each dash acts as a wildcard, and the three fields represent the channel, SCSI target ID, and LUN, respectively. Replace hostN with the host you identified above:

# echo "- - -" > /sys/class/scsi_host/hostN/scan
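
The three fields written to the scan file do not have to be wildcards; a number in any position narrows the scan to that channel, target ID, or LUN. The sketch below wraps this in a hypothetical scan_host function (the name and the SYSFS_ROOT variable are assumptions for this example) that falls back to a dash for any field you leave out.

```shell
#!/bin/sh
# Illustrative override (not a standard variable) for trying this out
# against a fake directory tree.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# scan_host HOST [CHANNEL] [TARGET] [LUN]
# Omitted fields default to "-", the wildcard, so "scan_host host0"
# writes "- - -" and "scan_host host0 0 0 0" writes "0 0 0".
scan_host() {
    echo "${2:--} ${3:--} ${4:--}" > "$SYSFS_ROOT/class/scsi_host/$1/scan"
}
```

For a single newly connected SATA disk the full wildcard scan is usually all you need; the explicit form is shown only to make the field order concrete.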

If you're not sure which controller your disk is connected to, you can rescan all of them:

# for arg in /sys/class/scsi_host/*/scan; do echo "- - -" > "$arg"; done

Once the rescan is triggered, run dmesg to see if a new disk was attached to the system.