Wednesday, December 3, 2014

VMKcore partitions on ESXi hosts with non-local Disks

When running ESXi from local storage, a VMKcore partition is created during install.
If your ESXi Host 5.0/5.1/5.5 experiences a Purple Screen Of Death (PSOD), it hopefully creates a diagsnostic coredump. This coredump contains useful information for root cause analysis.

When a PSOD should occur, you can retrieve the dump information using the esxcfg-dumppart command: esxcfg-dumppart –log <ESX dump file> or esxcfg-dumppart –L <ESX dump file> from a shell session.

If there is no available disk partition for a coredump on your ESXi host, such as in Auto-Deploy or "USB/Memcard installs" where there is no local disks, you will get the following error message:

No vmkcore disk partition is available and no network coredump server has been configured. Host core dumps cannot be saved.

In such configuration cases, it is better to move the core dumps to a datastore. This has to be a VMFS volume, which rules out NFS. Since the vmkcore dump partition has to be available at boot time, software iSCSI is ruled out too. Only hardware iSCSI or FC LUNs are possible.

Setting VMKcore partition
The following steps are needed to configure the vmkcore partition. In my example I’m using a 10GB LUN provisioned by iSCSI.
Create the LUN
On my shared storage I created a 10GB iSCSI target and assigned it to my ESXi host. Then on the ESXi host you add the iSCSI target. Do a rescan and then add the iSCSI target like you would normally add a new datastore by pressing the “Add Storage” option in the Storage menu on the configuration tab. Choose to add a Disk/LUN and name it something like: vmkcore-esx01. After a rescan the LUN should be available in your storage view.
Change the partition type
Now the datastore needs to have the disk type changed. To do this you will have to logon to the ESXi host using tech support mode. After you are logged in, list all partitions using the fdisk -l command. You will now see a list of partitions in which you should search for your 10GB disk. In my case it looked like:
Disk /dev/disks/naa.5000144f33903730: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Now to change the partition type, run fdisk /dev/disks/naa.5000144f33903730 (copy it from your fdisk list). In fdisk hit “t” to change the partition’s ID, then hit “fc” to change the partition type to VMKcore. Now hit “w” to write the partition table and exit fdisk.
Set and activate the partition
The last step is now to tell ESXi to use a new vmkcore partition using the following command. First we double check for suitable vmkcore partitions:
esxcfg-dumppart –f
 
If the fdisk action went well, you should now see the /dev/disks/naa.5000144f33903730 partition again in the list. To set the partition use the following command:
esxcfg-dumppart -s naa.5000144f33903730:1
If you receive the message: “Unable to set dump partition naa.5000144f33903730:1. 
Use the command switch -a to activate the partition using the following command:
esxcfg-dumppart -a naa.5000144f33903730:1

No comments:

Post a Comment