Benchmark for APM X-Gene

From Leo's Notes
Last edited on 10 August 2017, at 23:31.

APM X-Gene has a 64-bit ARM processor.

Original APM Linux

The following benchmark was run on the original Linux kernel that shipped with the box. This kernel detected all eight cores and ran slightly faster, even in the single-copy run.

Results Overview

Hardware          APM X-Gene
Memory            16 GB DDR3 ECC
Disk              1 x 500 GB SATA
Operating System  Fedora 22 aarch64
Score             926.9 (1 copy) / 3402.7 (8 copies)

Raw Output

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: GNU/Linux
   OS: GNU/Linux -- 3.12.0-mustang_sw_1.12.09-beta -- #1 SMP Thu Jun 12 10:27:15 PDT 2014
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 1: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 2: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 3: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 4: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 5: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 6: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   CPU 7: APM X-Gene Mustang board @ 2.40GHz (0.0 bogomips)
   17:28:54 up 27 min,  3 users,  load average: 1.39, 0.87, 0.61; runlevel 3

Benchmark Run: Fri Jun 26 2015 17:28:54 - 17:56:56
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       17763947.5 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2114.7 MWIPS (9.6 s, 7 samples)
Execl Throughput                               2459.8 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        468331.1 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          130247.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1151363.8 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1067484.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 192167.1 lps   (10.0 s, 7 samples)
Process Creation                               7660.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4207.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1634.0 lpm   (60.0 s, 2 samples)
System Call Overhead                        1438914.1 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   17763947.5   1522.2
Double-Precision Whetstone                       55.0       2114.7    384.5
Execl Throughput                                 43.0       2459.8    572.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     468331.1   1182.7
File Copy 256 bufsize 500 maxblocks            1655.0     130247.8    787.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1151363.8   1985.1
Pipe Throughput                               12440.0    1067484.4    858.1
Pipe-based Context Switching                   4000.0     192167.1    480.4
Process Creation                                126.0       7660.0    607.9
Shell Scripts (1 concurrent)                     42.4       4207.7    992.4
Shell Scripts (8 concurrent)                      6.0       1634.0   2723.3
System Call Overhead                          15000.0    1438914.1    959.3
System Benchmarks Index Score                                         926.9

Benchmark Run: Fri Jun 26 2015 17:56:56 - 18:25:05
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      141540380.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    16919.0 MWIPS (9.6 s, 7 samples)
Execl Throughput                              13012.0 lps   (29.8 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        579580.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          155434.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1673229.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             8441190.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1463165.9 lps   (10.0 s, 7 samples)
Process Creation                              32854.4 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  16740.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2477.4 lpm   (60.1 s, 2 samples)
System Call Overhead                        7650246.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  141540380.6  12128.6
Double-Precision Whetstone                       55.0      16919.0   3076.2
Execl Throughput                                 43.0      13012.0   3026.1
File Copy 1024 bufsize 2000 maxblocks          3960.0     579580.8   1463.6
File Copy 256 bufsize 500 maxblocks            1655.0     155434.8    939.2
File Copy 4096 bufsize 8000 maxblocks          5800.0    1673229.9   2884.9
Pipe Throughput                               12440.0    8441190.9   6785.5
Pipe-based Context Switching                   4000.0    1463165.9   3657.9
Process Creation                                126.0      32854.4   2607.5
Shell Scripts (1 concurrent)                     42.4      16740.8   3948.3
Shell Scripts (8 concurrent)                      6.0       2477.4   4128.9
System Call Overhead                          15000.0    7650246.7   5100.2
System Benchmarks Index Score                                        3402.7
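
The INDEX column and the final score can be reproduced from the BASELINE and RESULT columns. The following is a minimal sketch, assuming UnixBench's usual scheme: each index is RESULT / BASELINE x 10, and the overall score is the geometric mean of the indices. It uses the single-copy table above. The names RUN_1_COPY, index_value, and index_score are illustrative and are not part of UnixBench.

```python
import math

# (BASELINE, RESULT) pairs copied from the single-copy index table above.
RUN_1_COPY = {
    "Dhrystone 2 using register variables": (116700.0, 17763947.5),
    "Double-Precision Whetstone": (55.0, 2114.7),
    "Execl Throughput": (43.0, 2459.8),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 468331.1),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 130247.8),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 1151363.8),
    "Pipe Throughput": (12440.0, 1067484.4),
    "Pipe-based Context Switching": (4000.0, 192167.1),
    "Process Creation": (126.0, 7660.0),
    "Shell Scripts (1 concurrent)": (42.4, 4207.7),
    "Shell Scripts (8 concurrent)": (6.0, 1634.0),
    "System Call Overhead": (15000.0, 1438914.1),
}


def index_value(baseline, result):
    """Per-test index: result relative to the baseline system, scaled by 10."""
    return result / baseline * 10.0


def index_score(run):
    """Overall score: geometric mean of the per-test index values."""
    logs = [math.log(index_value(b, r)) for b, r in run.values()]
    return math.exp(sum(logs) / len(logs))


for name, (baseline, result) in RUN_1_COPY.items():
    print(f"{name:<40} {index_value(baseline, result):8.1f}")
print(f"{'System Benchmarks Index Score':<40} {index_score(RUN_1_COPY):8.1f}")
```

Because the score is a geometric mean, a large swing in a single test moves the total less than it would with an arithmetic mean.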

Fedora 23

The benchmark was run again on a Fedora 23 install, this time with the TianoCore UEFI BIOS installed.

Results Overview

Hardware          APM X-Gene
Memory            16 GB DDR3 ECC
Disk              1 x 500 GB SATA
Operating System  Fedora 23 aarch64
Score             890.0 (1 copy) / 3436.9 (8 copies)

Raw Output

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: csa2: GNU/Linux
   OS: GNU/Linux -- 4.2.8-300.fc23.aarch64 -- #1 SMP Mon Dec 21 06:10:24 UTC 2015
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   17:32:18 up 22:39,  1 user,  load average: 0.13, 1.36, 1.31; runlevel 3

Benchmark Run: Fri Jan 08 2016 17:32:18 - 18:00:14
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       17767160.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2177.1 MWIPS (9.6 s, 7 samples)
Execl Throughput                               2373.3 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        545523.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          160952.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1286987.6 KBps  (30.0 s, 2 samples)
Pipe Throughput                              727293.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 117655.4 lps   (10.0 s, 7 samples)
Process Creation                               5668.4 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4352.4 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1841.7 lpm   (60.0 s, 2 samples)
System Call Overhead                        1537463.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   17767160.3   1522.5
Double-Precision Whetstone                       55.0       2177.1    395.8
Execl Throughput                                 43.0       2373.3    551.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     545523.3   1377.6
File Copy 256 bufsize 500 maxblocks            1655.0     160952.3    972.5
File Copy 4096 bufsize 8000 maxblocks          5800.0    1286987.6   2218.9
Pipe Throughput                               12440.0     727293.2    584.6
Pipe-based Context Switching                   4000.0     117655.4    294.1
Process Creation                                126.0       5668.4    449.9
Shell Scripts (1 concurrent)                     42.4       4352.4   1026.5
Shell Scripts (8 concurrent)                      6.0       1841.7   3069.6
System Call Overhead                          15000.0    1537463.6   1025.0
System Benchmarks Index Score                                         890.0

Benchmark Run: Fri Jan 08 2016 18:00:14 - 18:28:09
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      142091681.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    17414.9 MWIPS (9.6 s, 7 samples)
Execl Throughput                              11301.6 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks       1032947.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          286632.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       2569016.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             5662756.1 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1090607.6 lps   (10.0 s, 7 samples)
Process Creation                              15654.0 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  17548.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2488.6 lpm   (60.0 s, 2 samples)
System Call Overhead                        7592134.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  142091681.3  12175.8
Double-Precision Whetstone                       55.0      17414.9   3166.4
Execl Throughput                                 43.0      11301.6   2628.3
File Copy 1024 bufsize 2000 maxblocks          3960.0    1032947.0   2608.5
File Copy 256 bufsize 500 maxblocks            1655.0     286632.5   1731.9
File Copy 4096 bufsize 8000 maxblocks          5800.0    2569016.9   4429.3
Pipe Throughput                               12440.0    5662756.1   4552.1
Pipe-based Context Switching                   4000.0    1090607.6   2726.5
Process Creation                                126.0      15654.0   1242.4
Shell Scripts (1 concurrent)                     42.4      17548.7   4138.9
Shell Scripts (8 concurrent)                      6.0       2488.6   4147.6
System Call Overhead                          15000.0    7592134.7   5061.4
System Benchmarks Index Score                                        3436.9
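
One way to read the two runs together is to divide each 8-copy RESULT by its single-copy RESULT. The sketch below does that for a handful of the Fedora 23 figures above. The helper speedup and the table name FEDORA_23 are hypothetical, and only some of the tests are included to keep it short.

```python
# (1-copy RESULT, 8-copy RESULT) pairs copied from the Fedora 23 output above.
FEDORA_23 = {
    "Dhrystone 2 using register variables": (17767160.3, 142091681.3),
    "Double-Precision Whetstone": (2177.1, 17414.9),
    "File Copy 1024 bufsize 2000 maxblocks": (545523.3, 1032947.0),
    "Pipe Throughput": (727293.2, 5662756.1),
    "Process Creation": (5668.4, 15654.0),
    "System Call Overhead": (1537463.6, 7592134.7),
}
COPIES = 8  # number of parallel copies in the second run


def speedup(single, parallel):
    """Throughput of the parallel run relative to the single-copy run."""
    return parallel / single


for name, (single, parallel) in FEDORA_23.items():
    s = speedup(single, parallel)
    # Ideal scaling would be a speedup equal to the number of copies.
    print(f"{name:<40} {s:5.2f}x  ({s / COPIES:4.0%} of ideal)")
```

On these figures Dhrystone and Whetstone scale almost linearly with the eight cores, while the 1024-byte file copy gains less than 2x.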

Fedora 26

This run used Fedora 26 with the most recent firmware and BIOS (TianoCore / SlimPro 3.06.25).

The drop in score relative to Fedora 23 (from 890.0 / 3436.9 to 723.0 / 2765.1) could be related to the BIOS update or to Fedora 26 itself.

Results Overview

Hardware          APM X-Gene
Memory            16 GB DDR3 ECC
Disk              1 x 500 GB SATA
Operating System  Fedora 26 aarch64
Score             723.0 (1 copy) / 2765.1 (8 copies)

Raw Output

   BYTE UNIX Benchmarks (Version 5.1.3)

   System: csa2: GNU/Linux
   OS: GNU/Linux -- 4.11.9-300.fc26.aarch64 -- #1 SMP Wed Jul 5 16:15:00 UTC 2017
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   14:45:47 up 17 min,  1 user,  load average: 0.46, 0.18, 0.12; runlevel 2017-07-17

Benchmark Run: Mon Jul 17 2017 14:45:47 - 15:13:43
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       18892194.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2471.9 MWIPS (9.6 s, 7 samples)
Execl Throughput                               1755.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        374673.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          102829.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        963569.3 KBps  (30.0 s, 2 samples)
Pipe Throughput                              676573.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  94482.3 lps   (10.0 s, 7 samples)
Process Creation                               5147.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3200.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1478.3 lpm   (60.0 s, 2 samples)
System Call Overhead                        1080252.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   18892194.7   1618.9
Double-Precision Whetstone                       55.0       2471.9    449.4
Execl Throughput                                 43.0       1755.7    408.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     374673.9    946.1
File Copy 256 bufsize 500 maxblocks            1655.0     102829.9    621.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     963569.3   1661.3
Pipe Throughput                               12440.0     676573.4    543.9
Pipe-based Context Switching                   4000.0      94482.3    236.2
Process Creation                                126.0       5147.9    408.6
Shell Scripts (1 concurrent)                     42.4       3200.9    754.9
Shell Scripts (8 concurrent)                      6.0       1478.3   2463.9
System Call Overhead                          15000.0    1080252.6    720.2
System Benchmarks Index Score                                         723.0

Benchmark Run: Mon Jul 17 2017 15:13:43 - 15:41:42
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      150729262.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    19767.7 MWIPS (9.6 s, 7 samples)
Execl Throughput                              12385.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        531394.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          143088.9 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1530740.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                             5364483.4 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 821279.1 lps   (10.0 s, 7 samples)
Process Creation                              26902.7 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  15055.4 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2095.9 lpm   (60.1 s, 2 samples)
System Call Overhead                        3121065.5 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  150729262.4  12916.0
Double-Precision Whetstone                       55.0      19767.7   3594.1
Execl Throughput                                 43.0      12385.7   2880.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     531394.2   1341.9
File Copy 256 bufsize 500 maxblocks            1655.0     143088.9    864.6
File Copy 4096 bufsize 8000 maxblocks          5800.0    1530740.2   2639.2
Pipe Throughput                               12440.0    5364483.4   4312.3
Pipe-based Context Switching                   4000.0     821279.1   2053.2
Process Creation                                126.0      26902.7   2135.1
Shell Scripts (1 concurrent)                     42.4      15055.4   3550.8
Shell Scripts (8 concurrent)                      6.0       2095.9   3493.2
System Call Overhead                          15000.0    3121065.5   2080.7
System Benchmarks Index Score                                        2765.1
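
To see where the decrease noted for Fedora 26 comes from, the 8-copy INDEX values can be compared test by test against the Fedora 23 run. This is a rough sketch: percent_change and INDEX_8_COPIES are illustrative names, and the figures are copied from the two 8-copy index tables above.

```python
# (Fedora 23 INDEX, Fedora 26 INDEX) for the 8-copy runs, copied from above.
INDEX_8_COPIES = {
    "Dhrystone 2 using register variables": (12175.8, 12916.0),
    "Double-Precision Whetstone": (3166.4, 3594.1),
    "Execl Throughput": (2628.3, 2880.4),
    "File Copy 1024 bufsize 2000 maxblocks": (2608.5, 1341.9),
    "File Copy 256 bufsize 500 maxblocks": (1731.9, 864.6),
    "File Copy 4096 bufsize 8000 maxblocks": (4429.3, 2639.2),
    "Pipe Throughput": (4552.1, 4312.3),
    "Pipe-based Context Switching": (2726.5, 2053.2),
    "Process Creation": (1242.4, 2135.1),
    "Shell Scripts (1 concurrent)": (4138.9, 3550.8),
    "Shell Scripts (8 concurrent)": (4147.6, 3493.2),
    "System Call Overhead": (5061.4, 2080.7),
}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means slower)."""
    return (new - old) / old * 100.0


# Sort from the largest regression to the largest improvement.
changes = sorted(
    (percent_change(old, new), name) for name, (old, new) in INDEX_8_COPIES.items()
)
for change, name in changes:
    print(f"{change:+7.1f}%  {name}")
print(f"{percent_change(3436.9, 2765.1):+7.1f}%  System Benchmarks Index Score")
```

On these figures the Dhrystone and Whetstone indices are slightly higher on Fedora 26, while System Call Overhead and the file copy tests fall the most. These numbers alone cannot separate the effect of the BIOS update from that of Fedora 26.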