Dell OpenManage

From Leo's Notes
Last edited on 2 November 2020, at 17:01.

This page will focus on Dell OpenManage on Linux only.

Installation

Install Dell's OpenManage programs from their repository at http://linux.dell.com/repo/hardware/. You can either install the packages manually or run their bootstrap script to set up the repository.

## From http://linux.dell.com/repo/hardware/DSU_16.12.00/
# wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash 

## Install everything
# yum install srvadmin-\*

## Or -- just the CLI tools:
# yum install srvadmin-\*-cli

Once installed, the binaries can be found in /opt/dell/srvadmin/bin. You may want to append this path to your PATH variable.

# PATH=$PATH:/opt/dell/srvadmin/bin
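The assignment above only lasts for the current shell and adds a duplicate entry each time it is run. A minimal sketch of an idempotent version follows; path_append is a hypothetical helper name, not something shipped with OpenManage.

```shell
# path_append: hypothetical helper (not part of OpenManage).
# Appends a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/dell/srvadmin/bin
export PATH
```

Placing these lines in a shell startup file such as ~/.bashrc (an assumption about your setup) would make the change persistent.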

Some queries, such as hardware information about storage systems, need the data engine service to be running. If it is not, start it manually by running /opt/dell/srvadmin/sbin/dsm_sa_datamgrd.

Usage

Open Manage Report Utility (omreport)

Run omreport -? for a list of available options on your system.

Storage System

To check the physical disks behind your server's RAID controller, run:

# omreport storage pdisk controller=0
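The full pdisk listing is long, so for monitoring it can help to reduce it to one line per disk. The sketch below assumes the "Field : value" layout and the ID, Status, State field order seen in the sample outputs on this page; pdisk_summary is a hypothetical helper name.

```shell
# pdisk_summary: hypothetical helper. Reads "omreport storage pdisk" output on
# stdin and prints "ID<TAB>Status<TAB>State" for every physical disk.
# Assumes each record lists ID, then Status, then State (as in the samples here).
pdisk_summary() {
    awk -F' +: ' '
        { sub(/[ \t\r]+$/, "") }          # drop trailing whitespace, if any
        $1 == "ID"     { id = $2 }
        $1 == "Status" { status = $2 }
        $1 == "State"  { printf "%s\t%s\t%s\n", id, status, $2 }
    '
}
```

It would be used as: omreport storage pdisk controller=0 | pdisk_summary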

Open Manage Configuration Utility (omconfig)

Creating a new vdisk

Suppose you need to create a new vdisk after inserting a new disk into an empty slot on your server. Use omconfig to set up the disk.

Find the disks using:

# omreport storage pdisk controller=0 | grep -E '^(ID|Status|Capacity|Sector Size|Bus|Power|Media)'

If you're not sure what controller ID to use, you can find it by running omreport storage controller.

# omreport storage controller
 Controller  PERC 5/i Integrated(Embedded)

Controller
ID                                            : 0
Status                                        : Ok
Name                                          : PERC 5/i Integrated
Slot ID                                       : Embedded
State                                         : Ready
Firmware Version                              : 5.2.2-0072
Minimum Required Firmware Version             : Not Applicable
Driver Version                                : 06.810.09.00-rh1
Minimum Required Driver Version               : Not Applicable
Storport Driver Version                       : Not Applicable
Minimum Required Storport Driver Version      : Not Applicable
Number of Connectors                          : 2
Rebuild Rate                                  : 30%
BGI Rate                                      : 30%
Check Consistency Rate                        : 30%
Reconstruct Rate                              : 30%
Alarm State                                   : Not Applicable
Cluster Mode                                  : Not Applicable
SCSI Initiator ID                             : Not Applicable
Cache Memory Size                             : 256 MB
Patrol Read Mode                              : Auto
Patrol Read State                             : Stopped
Patrol Read Rate                              : 30%
Patrol Read Iterations                        : 48
Abort Check Consistency on Error              : Not Applicable
Allow Revertible Hot Spare and Replace Member : Not Applicable
Load Balance                                  : Not Applicable
Auto Replace Member on Predictive Failure     : Not Applicable
Redundant Path view                           : Not Applicable
CacheCade Capable                             : Not Applicable
Persistent Hot Spare                          : Not Applicable
Encryption Capable                            : Not Applicable
Encryption Key Present                        : Not Applicable
Encryption Mode                               : Not Applicable
Preserved Cache                               : Not Applicable
T10 Protection Information Capable            : No
Non-RAID HDD Disk Cache Policy                : Not Applicable

# omreport storage pdisk controller=0 | grep -E '^(ID|Status|Capacity|Sector Size|Bus|Power|Media)'
ID                              : 0:1:6
Status                          : Ready
Power Status                    : Not Applicable
Bus Protocol                    : SATA
Media                           : HDD
Capacity                        : 1,396.75 GB (1499748892672 bytes)
Sector Size                     : 512B

To create the vdisk using the disk above:

# omconfig storage controller controller=0 action=createvdisk raid=r0 size=max pdisk=0:1:6 writepolicy=fwb name=disk06
Command successful!

The available write policies are described in the Write Policy section below.

If you need additional disks, append them separated by commas. For example:

# omconfig storage controller controller=0 action=createvdisk raid=r10 size=max pdisk=0:1:2,0:1:3,0:1:4,0:1:5,0:1:6,0:1:7 writepolicy=fwb stripesize=64kb name=uga
Command successful!
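The comma-separated pdisk= list can be assembled from omreport output rather than typed by hand. This is only a sketch: ready_pdisks is a hypothetical helper, and it assumes that disks available for a new vdisk show "State : Ready" in the full (ungrepped) pdisk listing.

```shell
# ready_pdisks: hypothetical helper. Reads full "omreport storage pdisk" output
# on stdin and prints the IDs of disks whose State is "Ready", joined by commas.
ready_pdisks() {
    awk -F' +: ' '
        { sub(/[ \t\r]+$/, "") }          # drop trailing whitespace, if any
        $1 == "ID" { id = $2 }
        $1 == "State" && $2 == "Ready" {
            sep = (list == "") ? "" : ","
            list = list sep id
        }
        END { if (list != "") print list }
    '
}
```

Run it as omreport storage pdisk controller=0 | ready_pdisks and review the printed list before pasting it into a createvdisk command.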

Verify:

root@nsb:/# omreport storage vdisk controller=0
List of Virtual Disks on Controller PERC H710 Mini (Embedded)

ID                                : 1
Status                            : Ok
Name                              : uga
State                             : Ready
Hot Spare Policy violated         : Not Assigned
Encrypted                         : No
Layout                            : RAID-10
Size                              : 836.63 GB (898319253504 bytes)
T10 Protection Information Status : No
Associated Fluid Cache State      : Not Applicable
Device Name                       : /dev/sdb
Bus Protocol                      : SAS
Media                             : HDD
Read Policy                       : Adaptive Read Ahead
Write Policy                      : Force Write Back
Cache Policy                      : Not Applicable
Stripe Element Size               : 64 KB
Disk Cache Policy                 : Enabled
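After verifying, you usually want the block device that the new vdisk maps to. A small sketch that pulls the Device Name for a vdisk by its name, assuming the field layout shown above; vdisk_device is a hypothetical helper name.

```shell
# vdisk_device: hypothetical helper. Reads "omreport storage vdisk" output on
# stdin and prints the Device Name of the vdisk whose Name matches $1.
vdisk_device() {
    awk -F' +: ' -v want="$1" '
        { sub(/[ \t\r]+$/, "") }          # drop trailing whitespace, if any
        $1 == "Name" { name = $2 }
        $1 == "Device Name" && name == want { print $2; exit }
    '
}
```

With the output above, omreport storage vdisk controller=0 | vdisk_device uga would print /dev/sdb.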

Deleting a vdisk

To delete a vdisk, first determine the vdisk ID using omreport:

# omreport storage vdisk controller=0 | grep -E '^(ID|Name)'
ID                                : 0
Name                              : one
ID                                : 1
Name                              : two
ID                                : 2
Name                              : three
ID                                : 3
Name                              : four
ID                                : 4
Name                              : five
ID                                : 5
Name                              : six

To delete six:

# omconfig storage vdisk action=deletevdisk controller=0 vdisk=5
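Since deleting the wrong ID is costly, the name-to-ID lookup can be scripted instead of read off by eye. A minimal sketch, assuming each vdisk record lists ID before Name as in the output above; vdisk_id_by_name is a hypothetical helper name.

```shell
# vdisk_id_by_name: hypothetical helper. Reads "omreport storage vdisk" output
# on stdin and prints the ID of the vdisk named $1. Exits non-zero if no vdisk
# has that name.
vdisk_id_by_name() {
    awk -F' +: ' -v want="$1" '
        { sub(/[ \t\r]+$/, "") }          # drop trailing whitespace, if any
        $1 == "ID" { id = $2 }
        $1 == "Name" && $2 == want { print id; found = 1; exit }
        END { exit !found }
    '
}
```

For the listing above, omreport storage vdisk controller=0 | vdisk_id_by_name six prints 5, which is the value passed to vdisk= in the delete command.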

Wiping Foreign Configs

To wipe foreign configs on all drives attached to a controller:

# omreport storage pdisk controller=0
ID                              : 1:0:5
Status                          : Non-Critical
Name                            : Physical Disk 1:0:5
State                           : Foreign
...

# omconfig storage controller action=clearforeignconfig controller=0
Command successful!

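If this still doesn't work, try wiping the drive manually by plugging it into a drive dock ('toaster') or another computer and then running the commands below. They overwrite the start and end of /dev/sdb, so confirm the device name first: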

## Wipe first 1MB
# dd bs=512 if=/dev/zero of=/dev/sdb count=2048
## Wipe last 1MB
# dd bs=512 if=/dev/zero of=/dev/sdb count=2048 seek=$((`blockdev --getsz /dev/sdb` - 2048))
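The seek arithmetic is easy to get wrong, so it can be rehearsed on a scratch file before touching a real drive. The sketch below applies the same two dd calls to a regular file; wipe_ends is a hypothetical helper, and it uses the file size from wc -c in place of blockdev --getsz, which only applies to block devices.

```shell
# wipe_ends: hypothetical helper. Zeroes the first and last 1 MiB (2048 sectors
# of 512 bytes) of a regular file that stands in for a disk.
wipe_ends() {
    target=$1
    sectors=$(( $(wc -c < "$target") / 512 ))    # size in 512-byte sectors
    [ "$sectors" -ge 4096 ] || return 1          # need at least 2 MiB
    # conv=notrunc stops dd from truncating the file after the written region
    dd bs=512 if=/dev/zero of="$target" count=2048 conv=notrunc 2>/dev/null
    dd bs=512 if=/dev/zero of="$target" count=2048 \
       seek=$((sectors - 2048)) conv=notrunc 2>/dev/null
}
```

On a 4 MiB scratch file this leaves the middle 2 MiB untouched, which is a quick way to check that the seek offset lands where expected.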

If this still doesn't work, you might need to reboot the server. I had this exact issue: after rebooting into the RAID BIOS screen, the drive was shown as 'OK' even though OpenManage had reported it as 'Foreign' before the reboot.

For reference, a drive in the foreign state cannot be used to create a vdisk:

# omreport storage pdisk controller=0 | grep -E '^(ID|Status|Capacity|Sector Size|Bus|Power|Media|State)'
ID                              : 0:0:0
Status                          : Non-Critical
State                           : Foreign
Power Status                    : Not Applicable
Bus Protocol                    : SATA
Media                           : HDD
Capacity                        : 698.13 GB (749606010880 bytes)
Sector Size                     : 512B

# omconfig storage controller controller=0 action=createvdisk raid=r0 size=max pdisk=0:0:0 writepolicy=fwb name=one
The virtual disk cannot be created on the physical disks you selected. Possible reasons include: Insufficient disk space, available disks are not initialized, incorrect number of disks selected, unsupported mix of SAS and SATA type disks, unsupported mix of SSD and HDD type disks, unsupported mix of 512Bytes and 4KBytes sector size disks, unsupported mix of PI capable and incapable type disks,  controller restrictions or unsupported configuration.

Write Policy

You can change a vdisk's write policy using omconfig.

# omconfig storage vdisk action=changepolicy writepolicy=fwb controller=0 vdisk=0

Available options for writepolicy= are:

Policy  Description
wb      Write Back
wt      Write Through
wc      Write Cache
nwc     No Write Cache
fwb     Force Write Back (even with no battery)
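If omconfig is called from a script, the abbreviation can be validated before it is passed on. A small sketch using only the values listed above; describe_writepolicy is a hypothetical helper name.

```shell
# describe_writepolicy: hypothetical helper. Maps a writepolicy= abbreviation to
# the description from the table above. Unknown values return non-zero.
describe_writepolicy() {
    case "$1" in
        wb)  echo "Write Back" ;;
        wt)  echo "Write Through" ;;
        wc)  echo "Write Cache" ;;
        nwc) echo "No Write Cache" ;;
        fwb) echo "Force Write Back (even with no battery)" ;;
        *)   echo "unknown write policy: $1" >&2; return 1 ;;
    esac
}
```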

Troubleshooting

Upgrading PERC 5/i Firmware

omreport will report a controller as degraded if the installed firmware version does not match the version expected by the running OpenManage release. When the versions are mismatched, omreport shows something similar to:

# omreport storage controller
...
State                                         : Degraded
Firmware Version                              : 5.1.1-0040
Latest Available Firmware Version             : 5.2.2-0072
...
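This mismatch can be spotted across several controllers with a short filter. The sketch assumes that "Firmware Version" appears before "Latest Available Firmware Version" in each controller record, as in the output above; firmware_outdated is a hypothetical helper name.

```shell
# firmware_outdated: hypothetical helper. Reads "omreport storage controller"
# output on stdin and prints one line per controller whose installed firmware
# differs from the latest available version reported by omreport.
firmware_outdated() {
    awk -F' +: ' '
        { sub(/[ \t\r]+$/, "") }          # drop trailing whitespace, if any
        $1 == "ID" { id = $2 }
        $1 == "Firmware Version" { installed = $2 }
        $1 == "Latest Available Firmware Version" && $2 != installed {
            printf "controller %s: installed %s, latest %s\n", id, installed, $2
        }
    '
}
```

Controllers that are up to date show no "Latest Available Firmware Version" line in the samples on this page, so they produce no output.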

To upgrade this firmware:

  1. Search for and download the updated firmware binary from Dell's website.
  2. Install the 32-bit dependencies for the firmware binary.
  3. Stop any Dell OpenManage services (such as dataeng or the Server Administrator web service).
  4. Run the update package as root.
  5. Reboot the machine.

## 1. Download the firmware update.
# ls ; chmod 755 *BIN
SAS-RAID_Firmware_8VM7T_LN32_5.2.2-0076_A11.BIN

## 2. Dependencies
# yum install compat-libstdc++-33.i686 libstdc++.i686 libxml2.i686

## 3. Stop OpenManage services.
# /etc/init.d/dataeng stop
Stopping Systems Management Data Engine:
Stopping dsm_sa_snmpd:                                     [  OK  ]
Stopping dsm_sa_eventmgrd:                                 [  OK  ]
Stopping dsm_sa_datamgrd:                                  [  OK  ]
# omconfig system webserver action=stop

## 4. Run update
# ./SAS-RAID_Firmware_8VM7T_LN32_5.2.2-0076_A11.BIN
Collecting inventory...
.........
Running validation...

PERC 5/E Adapter Controller 1

The version of this Update Package is newer than the currently installed version.
Software application name: PERC 5/E Adapter Controller 1 Firmware
Package version: 5.2.2-0076
Installed version: 5.1.1-0040

PERC 5/E Adapter Controller 2

The version of this Update Package is newer than the currently installed version.
Software application name: PERC 5/E Adapter Controller 2 Firmware
Package version: 5.2.2-0076
Installed version: 5.1.1-0040


Continue? Y/N:y
Executing update...
WARNING: DO NOT STOP THIS PROCESS OR INSTALL OTHER DELL PRODUCTS WHILE UPDATE IS IN PROGRESS.
THESE ACTIONS MAY CAUSE YOUR SYSTEM TO BECOME UNSTABLE!
...........................................................................
The operation was successful.
Would you like to reboot your system now?
Continue? Y/N:n
The system should be restarted for the update to take effect.

Fatal firmware error

I had to hard reboot a machine that had the following issue. The PERC 5/i firmware was out of date, which may have led to this issue.

The storage device was not writable after this point, which caused applications to hang.

## Note the latest firmware version vs. the actual installed firmware.
# omreport storage controller
List of Controllers in the system

Controllers
ID                                            : 0
Status                                        : Non-Critical
Name                                          : PERC 5/i Integrated
Slot ID                                       : Embedded
State                                         : Degraded
Firmware Version                              : 5.1.1-0040
Latest Available Firmware Version             : 5.2.2-0072
Driver Version                                : 06.810.09.00-rh1
Minimum Required Driver Version               : Not Applicable
Storport Driver Version                       : Not Applicable
Minimum Required Storport Driver Version      : Not Applicable
Number of Connectors                          : 2
Rebuild Rate                                  : 30%
BGI Rate                                      : 30%
Check Consistency Rate                        : 30%
Reconstruct Rate                              : 30%
Alarm State                                   : Not Applicable
Cluster Mode                                  : Not Applicable
SCSI Initiator ID                             : Not Applicable
Cache Memory Size                             : 256 MB
Patrol Read Mode                              : Auto
Patrol Read State                             : Stopped
Patrol Read Rate                              : 30%
Patrol Read Iterations                        : 506
Abort Check Consistency on Error              : Not Applicable
Allow Revertible Hot Spare and Replace Member : Not Applicable
Load Balance                                  : Not Applicable
Auto Replace Member on Predictive Failure     : Not Applicable
Redundant Path view                           : Not Applicable
CacheCade Capable                             : Not Applicable
Persistent Hot Spare                          : Not Applicable
Encryption Capable                            : Not Applicable
Encryption Key Present                        : Not Applicable
Encryption Mode                               : Not Applicable
Preserved Cache                               : Not Applicable
T10 Protection Information Capable            : No

## Kernel messages that lead up to the failure.
# dmesg
May 12 06:03:36 nse kernel: megaraid_sas 0000:0f:0e.0: 91302 (547884100s/0x0020/DEAD) - Fatal firmware error: Line 1014 in ../../raid/verdeMain.c
May 12 06:03:36 nse kernel:
May 12 06:03:36 nse kernel: megaraid_sas 0000:0f:0e.0: wait adp restart
May 12 06:03:36 nse kernel: megaraid_sas 0000:0f:0e.0: moving cmd[0]:ffff880127fb62c0:1:(null) the defer queue as internal
May 12 06:03:36 nse kernel: megaraid_sas 0000:0f:0e.0: fwState=f0000000, stage:1
May 12 06:03:36 nse kernel: megaraid_sas 0000:0f:0e.0: FW detected to be in faultstate, restarting it...
May 12 06:03:39 nse kernel: megaraid_sas 0000:0f:0e.0: pcidata = 30400
May 12 06:03:39 nse kernel: megaraid_sas 0000:0f:0e.0: FW restarted successfully,initiating next stage...
May 12 06:03:39 nse kernel: megaraid_sas 0000:0f:0e.0: HBA recovery state machine,state 2 starting...
May 12 06:04:09 nse kernel: megaraid_sas 0000:0f:0e.0: Waiting for FW to come to ready state
May 12 06:06:45 nse kernel: INFO: task dsm_sa_datamgrd:2590 blocked for more than 120 seconds.
May 12 06:06:45 nse kernel:      Not tainted 2.6.32-642.6.2.el6.x86_64 #1
May 12 06:06:45 nse kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 12 06:06:45 nse kernel: dsm_sa_datamg D 0000000000000003     0  2590      1 0x00000080
May 12 06:06:45 nse kernel: ffff88012c8dbc58 0000000000000082 0000000000000000 ffffffff8112ee4a
May 12 06:06:45 nse kernel: 0000000000000000 ffffffffa03bcf5f 0007826d92c2eca4 ffff8801081209f0
May 12 06:06:45 nse kernel: 000000102c8dbe48 000000017df868b4 ffff8801284345f8 ffff88012c8dbfd8
May 12 06:06:45 nse kernel: Call Trace:
May 12 06:06:45 nse kernel: [<ffffffff8112ee4a>] ? generic_file_buffered_write+0x1da/0x2e0
May 12 06:06:45 nse kernel: [<ffffffffa03bcf5f>] ? ext4_dirty_inode+0x4f/0x60 [ext4]
May 12 06:06:45 nse kernel: [<ffffffffa0307ed5>] megasas_issue_blocked_cmd+0x115/0x200 [megaraid_sas]
May 12 06:06:45 nse kernel: [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
May 12 06:06:45 nse kernel: [<ffffffffa030dd46>] megasas_mgmt_fw_ioctl+0x466/0x9d0 [megaraid_sas]
May 12 06:06:46 nse kernel: [<ffffffff8113ea24>] ? __pagevec_free+0x44/0x90
May 12 06:06:46 nse kernel: [<ffffffffa030e480>] megasas_mgmt_ioctl_fw+0x1d0/0x240 [megaraid_sas]
May 12 06:06:46 nse kernel: [<ffffffffa0310940>] megasas_mgmt_ioctl+0x30/0x50 [megaraid_sas]
May 12 06:06:46 nse kernel: [<ffffffff811af562>] vfs_ioctl+0x22/0xa0
May 12 06:06:46 nse kernel: [<ffffffff8115f410>] ? unmap_region+0x110/0x130
May 12 06:06:46 nse kernel: [<ffffffff811af704>] do_vfs_ioctl+0x84/0x580
May 12 06:06:46 nse kernel: [<ffffffff8115d5ee>] ? remove_vma+0x6e/0x90
May 12 06:06:46 nse kernel: [<ffffffff8115fb97>] ? do_munmap+0x317/0x3b0
May 12 06:06:46 nse kernel: [<ffffffff811afc81>] sys_ioctl+0x81/0xa0
May 12 06:06:46 nse kernel: [<ffffffff810ee25e>] ? __audit_syscall_exit+0x25e/0x290
May 12 06:06:46 nse kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
May 12 06:08:46 nse kernel: INFO: task dsm_sa_datamgrd:2590 blocked for more than 120 seconds.
May 12 06:08:46 nse kernel:      Not tainted 2.6.32-642.6.2.el6.x86_64 #1
May 12 06:08:46 nse kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 12 06:08:46 nse kernel: dsm_sa_datamg D 0000000000000003     0  2590      1 0x00000080
May 12 06:08:46 nse kernel: ffff88012c8dbc58 0000000000000082 0000000000000000 ffffffff8112ee4a
May 12 06:08:46 nse kernel: 0000000000000000 ffffffffa03bcf5f 0007826d92c2eca4 ffff8801081209f0
May 12 06:08:46 nse kernel: 000000102c8dbe48 000000017df868b4 ffff8801284345f8 ffff88012c8dbfd8
May 12 06:08:46 nse kernel: Call Trace:
May 12 06:08:46 nse kernel: [<ffffffff8112ee4a>] ? generic_file_buffered_write+0x1da/0x2e0
May 12 06:08:46 nse kernel: [<ffffffffa03bcf5f>] ? ext4_dirty_inode+0x4f/0x60 [ext4]
May 12 06:08:46 nse kernel: [<ffffffffa0307ed5>] megasas_issue_blocked_cmd+0x115/0x200 [megaraid_sas]
May 12 06:08:46 nse kernel: [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
May 12 06:08:46 nse kernel: [<ffffffffa030dd46>] megasas_mgmt_fw_ioctl+0x466/0x9d0 [megaraid_sas]
May 12 06:08:46 nse kernel: [<ffffffff8113ea24>] ? __pagevec_free+0x44/0x90
May 12 06:08:46 nse kernel: [<ffffffffa030e480>] megasas_mgmt_ioctl_fw+0x1d0/0x240 [megaraid_sas]
May 12 06:08:46 nse kernel: [<ffffffffa0310940>] megasas_mgmt_ioctl+0x30/0x50 [megaraid_sas]
May 12 06:08:46 nse kernel: [<ffffffff811af562>] vfs_ioctl+0x22/0xa0
May 12 06:08:46 nse kernel: [<ffffffff8115f410>] ? unmap_region+0x110/0x130
May 12 06:08:46 nse kernel: [<ffffffff811af704>] do_vfs_ioctl+0x84/0x580
May 12 06:08:46 nse kernel: [<ffffffff8115d5ee>] ? remove_vma+0x6e/0x90
May 12 06:08:46 nse kernel: [<ffffffff8115fb97>] ? do_munmap+0x317/0x3b0
May 12 06:08:46 nse kernel: [<ffffffff811afc81>] sys_ioctl+0x81/0xa0
May 12 06:08:46 nse kernel: [<ffffffff810ee25e>] ? __audit_syscall_exit+0x25e/0x290
May 12 06:08:46 nse kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
May 12 06:10:09 nse kernel: megaraid_sas 0000:0f:0e.0: adapter not ready
May 12 06:10:09 nse kernel: megaraid_sas 0000:0f:0e.0: Kill HBA is called
May 12 06:10:10 nse kernel: megaraid_sas 0000:0f:0e.0: Controller in crit error
May 12 06:10:10 nse kernel: megaraid_sas 0000:0f:0e.0: Controller in crit error
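Most of the log above is hung-task noise; the lines that matter are the few megaraid_sas messages. A minimal sketch of a filter for them follows. fw_faults is a hypothetical helper name and the patterns are taken only from the messages seen in this incident, so other failures may be worded differently.

```shell
# fw_faults: hypothetical helper. Reads kernel log lines on stdin and keeps only
# the megaraid_sas messages that marked the failure in the log above.
fw_faults() {
    grep -E 'megaraid_sas .*(Fatal firmware error|Kill HBA|Controller in crit error)'
}
```

It can be fed from dmesg or from a syslog file, for example fw_faults < /var/log/messages (the path is an assumption about where your system logs kernel messages).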

Non-Critical Disk Status

After replacing a failed disk with a non-Dell-branded disk, the disk status is reported as 'Non-Critical'. As a result, even though the disk is healthy, the array is presented as degraded.

ID                              : 0:1:8
Status                          : Non-Critical
Name                            : Physical Disk 0:1:8
State                           : Online
Power Status                    : Spun Up
Bus Protocol                    : SAS
Media                           : HDD
Part of Cache Pool              : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted               : No
Revision                        : HPD6
Driver Version                  : Not Applicable
Model Number                    : Not Applicable
T10 PI Capable                  : No
Certified                       : No
Encryption Capable              : No
Encrypted                       : Not Applicable
Progress                        : Not Applicable
Mirror Set ID                   : Not Applicable
Capacity                        : 1,117.25 GB (1199638052864 bytes)
Used RAID Disk Space            : 1,117.25 GB (1199638052864 bytes)
Available RAID Disk Space       : 0.00 GB (0 bytes)
Hot Spare                       : No
Vendor ID                       : HP
Product ID                      : EG1200FDNJT
Serial No.                      : L0G5L5XH
Part Number                     : Not Available
Negotiated Speed                : 6.00 Gbps
Capable Speed                   : 6.00 Gbps
PCIe Negotiated Link Width      : Not Applicable
PCIe Maximum Link Width         : Not Applicable
Sector Size                     : 512B
Device Write Cache              : Not Applicable
Manufacture Day                 : Not Available
Manufacture Week                : Not Available
Manufacture Year                : Not Available
SAS Address                     : 5000CCA0720A2891
Non-RAID HDD Disk Cache Policy  : Not Applicable
Disk Cache Policy               : Not Applicable
Sub Vendor                      : Not Available
Available Spare                 : Not Available
Cryptographic Erase Capable     : No

ID                          : 0:1
Status                      : Non-Critical
Name                        : Backplane
State                       : Degraded
Connector                   : 0
Target ID                   : Not Applicable
Configuration               : Not Applicable
Firmware Version            : 1.03
Downstream Firmware Version : Not Applicable
Service Tag                 : Not Applicable
Express Service Code        : Not Applicable
Asset Tag                   : Not Applicable
Asset Name                  : Not Applicable
Backplane Part Number       : Not Applicable
Split Bus Part Number       : Not Applicable
Enclosure Part Number       : Not Applicable
SAS Address                 : 5C8CA0A0F1E43B00
Enclosure Alarm             : Not Applicable


See Also