CPU Frequency Scaling

From Leo's Notes
Last edited on 31 January 2023, at 18:39.

The CPU clock frequency scales up and down with CPU usage as a way to reduce power consumption. On Linux, this behavior is controlled by the CPU frequency scaling governor.

The CPU frequency also scales down in response to high temperature, or while the CPU is running certain instructions such as AVX-512.

Linux's CPU frequency governor

You can list the CPU governors available on your system by running the command below; cpupower frequency-info --policy shows the one currently in use.

# cpupower frequency-info --governors
analyzing CPU 0:
  available cpufreq governors: performance powersave

To see the governor currently set on each CPU core, read the scaling_governor files under /sys/devices/system/cpu:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
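
On machines with many cores, the per-core listing gets long. A minimal sketch that condenses it into one line per governor is shown here; the function name governor_summary and its optional cpu_root argument are hypothetical, added only so the function can be pointed at a test directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Count how many cores use each scaling governor.
# cpu_root (hypothetical parameter) defaults to the real sysfs location.
governor_summary() (
    cpu_root="${1:-/sys/devices/system/cpu}"
    # cpu[0-9]* skips non-core entries such as cpufreq/ and cpuidle/.
    cat "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null \
        | sort | uniq -c | awk '{print $2 ": " $1}'
)
```

Calling governor_summary with no argument reads the live system. Empty output means no scaling_governor files were found, which usually indicates that no cpufreq driver is loaded.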

Types of scaling governors

Scaling governors are policies that affect the CPU frequency scaling algorithm.

See: https://www.kernel.org/doc/html/v4.14/admin-guide/pm/cpufreq.html

Governor       A very brief description
performance    Run the CPU at the highest frequency, as defined within the scaling_max_freq limit.
powersave      Run the CPU at the lowest frequency, as defined within the scaling_min_freq limit.
ondemand       Dynamically sets frequency proportional to the load of the CPUs.
conservative   Like ondemand, but frequency changes happen in smaller steps.
userspace      Run the CPU at user specified frequency, controlled by writing to the scaling_setspeed attribute.
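
Switching governors is usually done with cpupower frequency-set -g followed by the governor name, run as root. The same effect can be had by writing to each core's scaling_governor file. The sketch below does that under some assumptions: the function name set_governor and the cpu_root argument are hypothetical, and the check against scaling_available_governors only runs where the driver exposes that file.

```shell
#!/bin/sh
# Set the scaling governor on every core by writing to sysfs.
# Needs root on a real system. cpu_root is a hypothetical test parameter.
set_governor() (
    governor="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    status=0
    found=0
    for policy in "$cpu_root"/cpu[0-9]*/cpufreq; do
        [ -d "$policy" ] || continue
        found=1
        # Refuse governors the driver does not offer, if the list is readable.
        avail="$policy/scaling_available_governors"
        if [ -r "$avail" ] && ! grep -qw -- "$governor" "$avail"; then
            echo "$policy: governor '$governor' is not available" >&2
            status=1
            continue
        fi
        echo "$governor" > "$policy/scaling_governor" || status=1
    done
    # No cpufreq directories at all means there was nothing to set.
    [ "$found" -eq 1 ] || status=1
    return "$status"
)
```

For example, set_governor performance would request the performance governor on all cores. The function returns non-zero if any core rejected the write or if no cpufreq directory was found.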

Intel Turbo Boost frequency

Some Intel CPUs support a turbo boost frequency where a CPU can clock faster than its stock speed depending on available power and thermal conditions. Refer to the CPU's specification to determine a particular CPU's turbo boost frequency.
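
Whether turbo is currently permitted by the kernel can usually be read from sysfs. This is a minimal sketch, assuming either the intel_pstate driver's no_turbo file or the generic cpufreq boost file is present; the function name turbo_state and the cpu_root argument are hypothetical. It does not report the turbo frequency itself, which still has to come from the CPU's specification.

```shell
#!/bin/sh
# Report whether turbo/boost is enabled, disabled, or unknown.
# cpu_root is a hypothetical parameter that defaults to the real sysfs tree.
turbo_state() (
    cpu_root="${1:-/sys/devices/system/cpu}"
    if [ -r "$cpu_root/intel_pstate/no_turbo" ]; then
        # intel_pstate: no_turbo=0 means turbo is allowed, 1 means disabled.
        if [ "$(cat "$cpu_root/intel_pstate/no_turbo")" = "0" ]; then
            echo enabled
        else
            echo disabled
        fi
    elif [ -r "$cpu_root/cpufreq/boost" ]; then
        # Drivers such as acpi-cpufreq: boost=1 means enabled.
        if [ "$(cat "$cpu_root/cpufreq/boost")" = "1" ]; then
            echo enabled
        else
            echo disabled
        fi
    else
        echo unknown
    fi
)
```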

I have observed on some compute servers that, under full load with temperatures in the mid-70s Celsius, the CPU runs somewhere between the base frequency and the turbo boost frequency. I'm not sure why this is the case.

Some helpful commands

Description                       Command
Get all CPU core frequencies      grep MHz /proc/cpuinfo
Get average CPU core frequency    grep MHz /proc/cpuinfo | awk '{print $4}' | awk '{temp+=$1;n++} END{printf("%f", temp/n);}'
Get max CPU core frequency        grep MHz /proc/cpuinfo | awk '{print $4}' | sort -rn | head -n 1
Get min CPU core frequency        grep MHz /proc/cpuinfo | awk '{print $4}' | sort -n | head -n 1
Get average CPU temperature       cat /sys/class/thermal/thermal_zone*/temp | awk '{temp+=$1;n++} END{printf("%f", temp/n/1000);}'
Get max CPU temperature           cat /sys/class/thermal/thermal_zone*/temp | sort -rn | head -n 1 | awk '{printf("%f", $1/1000);}'
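
The frequency one-liners above can also be folded into a single awk pass. This is a sketch, not a replacement: the function name freq_stats and its optional file argument are hypothetical, and it assumes /proc/cpuinfo contains "cpu MHz" lines, which is the case on x86 but not on every architecture.

```shell
#!/bin/sh
# Print core count plus min, average and max frequency in MHz.
# The optional argument (hypothetical) is a cpuinfo-style file to read.
freq_stats() {
    awk -F: '
        /^cpu MHz/ {
            f = $2 + 0                      # numeric value after the colon
            if (n == 0 || f < min) min = f
            if (n == 0 || f > max) max = f
            sum += f
            n++
        }
        END {
            if (n == 0) { print "no cpu MHz lines found"; exit 1 }
            printf("cores=%d min=%.3f avg=%.3f max=%.3f\n", n, min, sum / n, max)
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Running watch -n 1 with a small script that calls freq_stats is a convenient way to see the spread while a job is running.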

Issues

My CPU is stuck at a very low clock speed

I've had servers stuck at around 100 MHz, which is well below the minimum clock speed of the CPU. I wrote about it in Troubleshooting a Slow Linux System as well.

Some tips if this happens to you are:

  • Do a BIOS update. A bug with the BIOS could make the CPU think there isn't enough power, for example.
  • Check your power supplies. A failing PSU could cause the CPU to throttle.
  • Check CPU temperature. A blocked inlet or insufficient airflow could be the cause.
  • Try to power cycle the server. If it uses a SMASH-CLP DRAC, try to reset the /system1/pwrmanagement1 target (this seemed to help on a problematic server).

My CPU clocks down under load

If your CPU clocks down while under load, check if it is:

  1. a thermal issue
  2. a power issue
  3. due to heavy SIMD instructions such as AVX2 or AVX-512, which can cause the core to clock down while they execute. See: https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency/56861355#56861355
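
For the thermal case, Intel x86 systems typically expose per-core throttle counters in sysfs. The sketch below lists cores with a non-zero count; the function name throttle_events and the cpu_root argument are hypothetical, and the thermal_throttle directory is an assumption that depends on the CPU and kernel. The counters accumulate since boot, so compare two readings taken under load rather than relying on a single value.

```shell
#!/bin/sh
# List cores whose thermal throttle counter is non-zero.
# cpu_root is a hypothetical parameter that defaults to the real sysfs tree.
throttle_events() (
    cpu_root="${1:-/sys/devices/system/cpu}"
    for f in "$cpu_root"/cpu[0-9]*/thermal_throttle/core_throttle_count; do
        [ -r "$f" ] || continue
        count=$(cat "$f")
        if [ "$count" -gt 0 ]; then
            cpu=${f#"$cpu_root"/}       # strip the root prefix
            echo "${cpu%%/*}: $count"   # keep only the cpuN component
        fi
    done
)
```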

CPU clocking down to the minimum clock speed

At RCS, the servers are configured to use the performance CPU governor in order to eke as much performance as we can out of the hardware. Because of the governor setting, we expect the CPU frequency to sit somewhere between the base CPU frequency and the turbo boost frequency. However, I have observed some idle nodes at 1 GHz, the minimum limit for these processors.

The minimum clock speed can be found with cpupower. A properly working node shows the following output, including the governor in use.

# cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1000 MHz - 3.70 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 1000 MHz and 3.70 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 3.10 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

On nodes that clock to the minimum 1 GHz, cpupower reports that no cpufreq driver is active and does not show a governor:

# cpupower frequency-info
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
  CPUs which run at the same hardware frequency: Not Available
  CPUs which need to have their frequency coordinated by software: Not Available
  maximum transition latency:  Cannot determine or is not supported.
Not Available
  available cpufreq governors: Not Available
  Unable to determine current policy
  current CPU frequency: Unable to call hardware
  current CPU frequency:  Unable to call to kernel
  boost state support:
    Supported: yes
    Active: yes
# grep MHz /proc/cpuinfo
cpu MHz         : 999.870
cpu MHz         : 999.851
cpu MHz         : 999.849
cpu MHz         : 999.870
cpu MHz         : 1000.130
cpu MHz         : 999.914
cpu MHz         : 999.870
cpu MHz         : 999.926
cpu MHz         : 1000.030
cpu MHz         : 999.845
cpu MHz         : 999.967
cpu MHz         : 999.921
...
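
A node in this state can be detected without parsing cpupower output, because the cpufreq directory is missing from sysfs when no driver is active. This is a minimal sketch suitable for a node health check; the function name check_cpufreq_driver and the cpu_root argument are hypothetical, and it only inspects cpu0 on the assumption that all cores share the same driver.

```shell
#!/bin/sh
# Succeed and print the driver name if a cpufreq driver is active on cpu0;
# fail otherwise. cpu_root is a hypothetical parameter for testing.
check_cpufreq_driver() (
    cpu_root="${1:-/sys/devices/system/cpu}"
    driver_file="$cpu_root/cpu0/cpufreq/scaling_driver"
    if [ -r "$driver_file" ]; then
        echo "ok: $(cat "$driver_file")"
    else
        echo "no cpufreq driver active"
        return 1
    fi
)
```

On the working node shown earlier this would print ok: intel_pstate, while on the affected nodes it would return a non-zero status.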

See Also