Recovering from accidental VMFS Datastore deletion

When you forcefully remove a Datastore while ESXi servers are still connected to it, all sorts of weird and wonderful things can happen. We recently had a team member accidentally install ESXi onto a LUN, blowing away the Datastore that was on it. The ESXi servers become unstable, and the VMs that were running from the Datastore go into a zombie state: they may still respond to pings because they are running from memory, but they are not fully there because the disks have been removed from under them. The vCenter server exhibits high DB load as the ESXi servers try to update the statuses of VMs which aren’t actually there anymore.

It’s a mess. It took us an entire day to discover what had really happened, because the person who did it didn’t even realise he had done it, so we had to check LUN presentation and everything else. It had us stumped and thinking the VMFS filesystem had somehow become corrupted, until VMware support jumped onto our servers through WebEx and found the problem.

Here’s how you clean up the mess.

  1. Turn off HA and DRS at the cluster level, because they get in the way.
  2. Remove any greyed out VMs from vCenter Server.
  3. Log into each ESXi server individually and remove the “Unknown” greyed out VMs from Inventory.
  4. SSH into each server, one at a time. Run /sbin/services.sh stop, cd to /opt/vmware/vmware/uninstallers and uninstall both aam and vpxa, then run /sbin/services.sh start. Add the server back into the cluster/vCenter Server; doing this will resync the ESXi server’s inventory with vCenter Server.
  5. Reboot each ESXi server in turn to remove any zombie processes/VMs, which will otherwise still be running in the background.
  6. Now clean up references to the old Datastore. Go into Inventory -> Datastores. If you still see the old Datastore there, click on it, open the Virtual Machines tab, and remove everything listed there from Inventory. Since none of these VMs exist anymore, it’s a safe operation. The Datastore should disappear after that.
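
For reference, step 4 can be sketched as a small shell script to run on one host at a time. This is only a rough sketch under a few assumptions: the uninstaller file names are not given above, so the two placeholder variables below are hypothetical and must be replaced with whatever you actually find in the uninstallers directory. The script only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of step 4 for ONE host. Prints the commands by default;
# set DRY_RUN=0 to really run them (on the host itself, as root).

# Directory as given in step 4.
UNINSTALL_DIR="/opt/vmware/vmware/uninstallers"

# ASSUMPTION: the uninstaller file names are not given in the steps above.
# These are hypothetical placeholders; substitute the real aam and vpxa
# uninstaller scripts found in $UNINSTALL_DIR.
AAM_UNINSTALLER="${AAM_UNINSTALLER:-./AAM_UNINSTALLER_PLACEHOLDER}"
VPXA_UNINSTALLER="${VPXA_UNINSTALLER:-./VPXA_UNINSTALLER_PLACEHOLDER}"

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_agents() {
    run /sbin/services.sh stop            # stop the management services
    run cd "$UNINSTALL_DIR" || return 1   # give up if the directory is missing
    run "$AAM_UNINSTALLER"                # remove the aam agent
    run "$VPXA_UNINSTALLER"               # remove the vpxa agent
    run /sbin/services.sh start           # start the services again
}

reset_agents
```

Run it once in the default dry-run mode and check the printed commands before doing it for real. Adding the host back into the cluster is still done from vCenter Server afterwards.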

It’s not a pretty process. If you can, disconnect any fibre cables from the server before installing ESXi. In the case of blade servers, just be very, very careful about which disk you install to.
