About Virtual Disks, SCSI and Linux Device Numbering
Note: This section applies only to situations where you decide to reconfigure VM disk devices in an operational NAS (not generally recommended).
Because of the way Linux assigns device names inside a virtual machine environment, deleting a virtual disk drive can cause the remaining Linux devices to be renamed the next time the SoftNAS VM is rebooted, resulting in a misconfiguration and a faulted pool.
Therefore, it is recommended to avoid deleting virtual hard disks in an operational production configuration. As you will see below, if you are going to alter disk device configurations (e.g., delete an existing virtual disk device), be sure to carefully document the original configuration before making any changes, so you have a reliable record of how virtual disks, SCSI devices, and Linux device names are each allocated.
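One way to record the SCSI-to-Linux mapping before making any change is to read the persistent symlinks under /dev/disk/by-path, a standard udev location on modern Linux that encodes each disk's SCSI address in the link name. The sketch below is illustrative only; the helper name `scsi_device_map` is ours, and it assumes the by-path directory exists on the VM:

```python
import os

def scsi_device_map(by_path_dir="/dev/disk/by-path"):
    """Return {by-path link name: kernel device name} for whole SCSI disks.

    Each link such as 'pci-...-scsi-0:0:1:0 -> ../../sdb' records which
    SCSI address currently owns which /dev/sdX name, so the mapping can
    be compared before and after a reboot.
    """
    mapping = {}
    for name in sorted(os.listdir(by_path_dir)):
        # Keep whole-disk SCSI entries; skip partition links like '...-part1'.
        if "scsi" in name and "part" not in name:
            target = os.readlink(os.path.join(by_path_dir, name))
            mapping[name] = os.path.basename(target)  # e.g. 'sdb'
    return mapping

if __name__ == "__main__" and os.path.isdir("/dev/disk/by-path"):
    for link, dev in scsi_device_map().items():
        print(f"{link} -> /dev/{dev}")
```

Saving this output before reconfiguring gives you exactly the kind of record the example below relies on.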
Example:
Original Configuration
VHD 2 - (SCSI 0:1) - /dev/sdb - Pool 1, drive 1 (mirror disk 0)
VHD 3 - (SCSI 0:2) - /dev/sdc - Pool 1, drive 2 (mirror disk 1)
VHD 4 - (SCSI 0:3) - /dev/sdd - Pool 2, drive 1 (RAID)
VHD 5 - (SCSI 0:4) - /dev/sde - Pool 2, drive 2
VHD 6 - (SCSI 0:5) - /dev/sdf - Pool 2, drive 3
VHD 7 - (SCSI 0:6) - /dev/sdg - Pool 2, drive 4
VHD 8 - (SCSI 0:7) - /dev/sdh - Pool 3, drive 5
After Reconfiguration/Reboot
VHD 2 - (removed)
VHD 3 - (removed)
VHD 4 - (SCSI 0:1) - /dev/sdb - Pool 2, drive 1 (RAID - FAULTED!)
VHD 5 - (SCSI 0:2) - /dev/sdc - Pool 2, drive 2
VHD 6 - (SCSI 0:3) - /dev/sdd - Pool 2, drive 3
VHD 7 - (SCSI 0:4) - /dev/sde - Pool 2, drive 4
VHD 8 - (SCSI 0:5) - /dev/sdf - Pool 3, drive 5
Note: The above example is based on VMware, but the same kind of issue can occur with other hypervisors.
When Linux is rebooted, it sees only five disk drives. Linux device naming is based on the disk devices actually presented (not on their SCSI device numbers), so the corresponding Linux device names are different after the reboot. This causes Pool 2 to become FAULTED, because its disk drives are no longer at the device names it expects (they have shifted by two places, due to the two virtual disks that were removed).
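The shift described above can be modeled in a few lines of Python. This is purely a simulation for illustration; `linux_names` and the bare SCSI target numbers are our own constructs, not a kernel API. It captures the key rule: names are handed out in enumeration order, counting only the disks present:

```python
import string

def linux_names(scsi_targets):
    """Assign /dev/sdX names in ascending SCSI-target order, counting only
    the disks actually present (as Linux does). In this example sda is the
    boot disk, so data disks start at 'b'."""
    letters = string.ascii_lowercase[1:]
    return {t: f"/dev/sd{letters[i]}"
            for i, t in enumerate(sorted(scsi_targets))}

original = linux_names([1, 2, 3, 4, 5, 6, 7])  # SCSI 0:1 .. 0:7 all present
after = linux_names([3, 4, 5, 6, 7])           # VHD 2 and VHD 3 removed

# Pool 2's first disk (SCSI 0:3) shifts from /dev/sdd to /dev/sdb:
print(original[3], "->", after[3])  # /dev/sdd -> /dev/sdb
```

Every remaining disk shifts two names earlier, which is exactly why ZFS finds Pool 2's members in unexpected places.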
The good news is that no data loss has occurred - this is just a SCSI-device-to-Linux-device mapping problem. It can be corrected by adding the two missing disks back and ensuring the SCSI device numbers match the original SCSI devices. Here's the repaired configuration:
VHD 4 - (SCSI 0:3) - /dev/sdd - Pool 2, drive 1 (RAID)
VHD 5 - (SCSI 0:4) - /dev/sde - Pool 2, drive 2
VHD 6 - (SCSI 0:5) - /dev/sdf - Pool 2, drive 3
VHD 7 - (SCSI 0:6) - /dev/sdg - Pool 2, drive 4
VHD 8 - (SCSI 0:7) - /dev/sdh - Pool 3, drive 5
VHD 2 - (SCSI 0:1) - /dev/sdb - (new disk or restored deleted VMDK)
VHD 3 - (SCSI 0:2) - /dev/sdc - (new disk or restored deleted VMDK)
Now Pool 2 is okay and everything is operational again, because the configuration hole has been plugged with two drives assigned to the same SCSI device numbers. Linux once again sees the same disk layout it had previously, so the five data devices receive the Linux device names the pools expect.
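The same toy model shows why refilling the SCSI slots restores the original names (again a simulation of the naming rule, not a real kernel API; `linux_names` is our own illustrative helper):

```python
import string

def linux_names(scsi_targets):
    """Assign /dev/sdX names in ascending SCSI-target order, counting only
    the disks actually present (sda is the boot disk in this example, so
    data disks start at 'b')."""
    letters = string.ascii_lowercase[1:]
    return {t: f"/dev/sd{letters[i]}"
            for i, t in enumerate(sorted(scsi_targets))}

faulted = linux_names([3, 4, 5, 6, 7])          # after the two deletions
repaired = linux_names([1, 2, 3, 4, 5, 6, 7])   # SCSI 0:1 and 0:2 refilled

# Pool 2's first disk (SCSI 0:3) moves back to its expected name:
print(faulted[3], "->", repaired[3])  # /dev/sdb -> /dev/sdd
```

Note that it does not matter what the two replacement disks contain; what matters is that they occupy SCSI 0:1 and 0:2 so that the later disks enumerate back into their original positions.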
It is important to realize that Linux names devices based on the actual number of devices presented - NOT the SCSI numbering - so be careful not to create holes in the device numbering scheme once your pools are established. (You can always replace a drive within a pool; this assigns a new /dev/sdX device name to the pool, which is fine.)
Finally, if you do find yourself with such a "hole" in your configuration and are unable to reconfigure the disks properly at the VM layer, you can use the Storage Pool Import function to recover from the pool configuration problem (see the storage pool "Import" feature in the User Reference Guide for more details).