While attempting to free disk space in a datastore on a VMWare host (ESX-based), I deleted a running VM instance hosting an Oracle RDBMS. Stupidity is still my constant companion. Nothing happened immediately, but that night the VMWare host was hit by a power outage. In the morning, the inventory showed "Unknown (inaccessible)" where the instance name used to be. Yes, I had blown my foot right off.
In the datastore holding the VM instance files, all that was left were two .vmdk files (with "-flat" in their names), a .vswp file and a .vmx.lck file. Hmm, so the virtual hard disks were still present; was it possible to resurrect the VM?
I had another VM instance of the same OS, still safe in a different datastore. Perhaps I could copy the files that were missing and edit them to fit the orphaned disks. You can't do that via the VSphere client (at least not the 5.1 version I was using), so I had to resort to the command line by SSH'ing into the VMWare box. Luckily SSH access had been left enabled.
I copied the following files, renaming them to use the name of the destroyed VM:
I edited the .vmx file so that its references to .vmdk files cited the orphaned disks (the ones with "-flat" in their names). The .vmxf file just seemed to contain the name of the VM instance, which I changed to match the destroyed VM.
I dropped the "Unknown (inaccessible)" entry from the inventory. To add the newly repaired(?) VM instance, one has to use the datastore browser, right-click on the .vmx file and select the appropriate menu entry. It asks for a new name and, lo, the instance was now listed in the inventory. Moment of truth: power up the instance. I was prompted with a question asking if the VM had been moved or copied. I chose copied, and ...
The boot failed, saying it could not find the disks. Damn!
I looked more closely at the working VM's .vmx file. Ah, the disk references were not to the .vmdk files with "-flat" in their names, but to .vmdk files without that suffix. Inside those files, luckily readable as plain text, were the references to the "-flat" versions, which obviously hold the disk contents. I copied good versions of these .vmdk control files to the broken VM's datastore and edited the names inside them.
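I did the name edit by hand in a text editor, but it could be sketched in Python roughly as below. This is only an illustration: the function name `retarget_extent` and the donor file name are made up, and it assumes the control file has a single `RW` extent line of the form shown in the example descriptor in this post.

```python
import re

# Matches an extent line such as: RW 104857600 VMFS "Oracle-12K-flat.vmdk"
# Group 1 keeps everything before the quoted name, group 2 anything after it.
EXTENT_RE = re.compile(r'^(\s*RW\s+\d+\s+\S+\s+)"[^"]*"(.*)$')


def retarget_extent(descriptor_text, flat_name):
    """Return the descriptor text with each RW extent line pointing at flat_name.

    Only the quoted file name is changed; the sector count and every other
    line are passed through untouched.
    """
    out = []
    for line in descriptor_text.splitlines():
        m = EXTENT_RE.match(line)
        if m:
            line = f'{m.group(1)}"{flat_name}"{m.group(2)}'
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical donor descriptor copied from the healthy VM.
    donor = 'version=1\nRW 104857600 VMFS "Donor-flat.vmdk"\n'
    print(retarget_extent(donor, "Oracle-12K-flat.vmdk"))
```

Working on the text rather than the file keeps the sketch easy to try on a copy before touching anything in the datastore.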
The .vmx file lines specifying the disks now looked like:
scsi0:0.present = "TRUE"
scsi0:0.fileName = "Oracle-12K.vmdk"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:1.present = "TRUE"
scsi0:1.fileName = "Oracle-12K_1.vmdk"
scsi0:1.deviceType = "scsi-hardDisk"
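My first attempt failed because the .vmx pointed straight at the "-flat" data files. A small check along these lines would have caught that before power-on. It is a sketch, not a full .vmx parser: the helper names `disk_files` and `flat_references` are invented here, and it assumes every setting is a single `key = "value"` line like the ones above. Note that it lists every `.fileName` entry, so a CD-ROM device would appear in the map too.

```python
import re

# One .vmx setting per line, e.g.: scsi0:0.fileName = "Oracle-12K.vmdk"
VMX_LINE_RE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def disk_files(vmx_text):
    """Map each device key (e.g. 'scsi0:0') to its fileName value."""
    disks = {}
    for line in vmx_text.splitlines():
        m = VMX_LINE_RE.match(line)
        if m and m.group(1).endswith(".fileName"):
            device = m.group(1)[: -len(".fileName")]
            disks[device] = m.group(2)
    return disks


def flat_references(vmx_text):
    """Return the devices whose fileName points at a -flat.vmdk data file."""
    return [dev for dev, name in disk_files(vmx_text).items()
            if name.endswith("-flat.vmdk")]


if __name__ == "__main__":
    broken = 'scsi0:0.present = "TRUE"\nscsi0:0.fileName = "Oracle-12K-flat.vmdk"\n'
    print(flat_references(broken))
```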
As an example, the Oracle-12K.vmdk file looks like this:
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=5261d6d1
parentCID=ffffffff
isNativeSnapshot="no"
createType="vmfs"

# Extent description
RW 104857600 VMFS "Oracle-12K-flat.vmdk"

# The Disk Data Base
#DDB

ddb.virtualHWVersion = "7"
ddb.longContentID = "c60e06ca4f84b33deeffff5d5261d6d1"
ddb.uuid = "60 00 C2 91 07 d5 52 f1-65 5f dd 68 bd b3 f9 cb"
ddb.geometry.cylinders = "6527"
ddb.geometry.heads = "255"
ddb.geometry.sectors = "63"
ddb.thinProvisioned = "1"
ddb.adapterType = "lsilogic"
ddb.toolsVersion = "8300"
As you can see, these control files also contain disk sizes and things like cylinder information so, since the broken VM's disks were of different sizes, my expectations that this would work were low.
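I did not recalculate any of these numbers, but for the curious, the figures in the example descriptor hang together as follows. The extent size is counted in 512-byte sectors, and the cylinder count in the example is the sector count divided by heads times sectors per track, rounded down. The rounding rule is my inference from the example values rather than anything documented here, and the function names are made up for illustration.

```python
SECTOR_BYTES = 512        # extent sizes in the descriptor are in 512-byte sectors
HEADS = 255               # from ddb.geometry.heads in the example descriptor
SECTORS_PER_TRACK = 63    # from ddb.geometry.sectors in the example descriptor


def extent_sectors(flat_size_bytes):
    """Sector count for the RW extent line, given the -flat file size in bytes."""
    if flat_size_bytes % SECTOR_BYTES:
        raise ValueError("flat file size is not a whole number of sectors")
    return flat_size_bytes // SECTOR_BYTES


def cylinders(sectors):
    """Cylinder count, assuming it is simply rounded down (an inference)."""
    return sectors // (HEADS * SECTORS_PER_TRACK)


if __name__ == "__main__":
    size = 50 * 1024 ** 3             # a 50 GiB flat file, matching the example
    sectors = extent_sectors(size)
    print(sectors, cylinders(sectors))
```

For a real disk, the byte size could come from `os.path.getsize()` on the "-flat" file or from `ls -l` on the host.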
I deleted the inventory entry and re-added it to make sure the changes were seen. I then hit the power-on button. Wow, it boots! And even better, the database came up. With one bound I was free!