Welcome to edgylogic, drive:activated visitors. This is my new home on the net.

Articles

  • Recovering VMware snapshot after parent changed

    UPDATE (21/05/2010): I've been alerted to the ridiculous amount of comment spam this page has gotten; apologies to those who were further spammed by the email notifications. I have therefore disabled the email and commenting features, and all future comments will be moderated. Damn spammers have to ruin everything, grrrr.

    Scroll down to the problem or solution section below if you want to cut to the chase. 

    I upgraded my Kubuntu installation to Gutsy today - of course, it wasn't as smooth as it should've been. First I had to work out how to do it - the instructions were brief, screenshots confusing, and the process just didn't feel natural. The 'version upgrade' button only appears after you have satisfied certain conditions, conditions that you don't know. It just magically appears when it wants to, after pressing a special sequence of buttons.

    Then the 'distribution upgrade' process crashed and packages wouldn't install. It ended up working after a few tries.

    For some stupid reason, they still haven't fixed the 'failed to set xfermode' bug that heaps of people have encountered, which really cripples the system because it doesn't boot at all. In fact, the upgrade removes the workaround for it too - adding irqpoll to the end of the kernel line for the appropriate entry in /boot/grub/menu.lst.

    Plus they introduced a new bug by adding tablet settings into /etc/X11/xorg.conf by default, even if no tablet exists, tripping up the system. And did I mention that the network connection is flaky and standby/hibernate still doesn't work? Linux is still Linux it seems.

    Anyway, it all worked out in the end after some googling, so I went to install VMware Server on it so I could run my virtual machines there as well as in Windows. There is no package available for it, so follow the instructions here; however, use this patch instead.

    Once all that was working, I ran the VMware Console, about to run my Windows Server 2003 Standard Edition virtual machine, when I thought, hmm..., I don't want this VMware instance fudging with the Windows VMware instance, so I'll create a new virtual machine, and link it to the existing virtual hard disk.

    Problem

    All sounded cool, until I accidentally linked to the base parent hard disk, and not the latest snapshot. So once I booted it, not only did I not have the latest changes, but when I re-linked it to the latest snapshot, it wouldn't boot anymore. Instead I got the error message, "Cannot open the disk ... Reason: The parent virtual disk has been modified since the child was created."

    Did I mention that the virtual machine housed the test instance for this website, including the changes I had been working on all weekend, and I had no other backup?

    After a few minutes of cursing and swearing, banging on tables, wondering wtf I had done, and pondering redoing all those changes again, I did what every self-respecting nerd does when they're stuck - turn to Google.

    Solution

    I found these links:

    Here is my solution, which is basically a rewrite of the process in the last link above, with a few more details. I used Linux to do the recovery, mainly because it had commands that I needed. I assume you have some Linux command line knowledge, as all this will be performed in the terminal.

    1. Make a copy of the virtual machine folder in case you screw up.
    2. Look at the size of the snapshot virtual hard disk. If it is more than 2GB and you're running a 32-bit OS, or it is more than the amount of memory that you have available, the following method will probably not work. You're welcome to try though.

      The virtual hard disk files all end in .vmdk. The snapshot one has -xxxxxx on the end of the file name, indicating the snapshot number. For example, if my virtual machine was called Windows Server 2003 Standard Edition, my base parent virtual disk will be named Windows Server 2003 Standard Edition.vmdk, and my snapshot may be named Windows Server 2003 Standard Edition-000002.vmdk.
    3. Find out the CID of the base parent virtual hard disk. Because this virtual hard disk will most likely be larger than 2GB, you won't be able to open it in nano, vi etc. As we only need to read from it, we can use a Linux command to print out only the first 20 or so lines.
      head --lines=20 {base parent vmdk path}

      Replace {base parent vmdk path} with the path to the base parent virtual hard disk file, e.g.
      head --lines=20 /media/sda1/"Virtual Machines"/"Windows Server 2003 Standard Edition"/"Windows Server 2003 Standard Edition.vmdk"
      The CID is the 8-character random string on the line starting with CID=. Write this down somewhere.
    4. Now open up the snapshot virtual hard disk in a text editor, and change the parentCID (not CID) to the CID you recorded in the previous step. Then save. You can use nano, vi or some other Linux editor, e.g.
      sudo nano {snapshot vmdk path}
      Make sure to sudo the command, and also be patient - it could take a few minutes, during which the console may remain black; it is loading.

      I chose to do this in Windows instead, using Editpad Lite which is amazingly fast.
    5. That's it, your virtual machine should now start up again.
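
Steps 3 and 4 can also be scripted so that the large file never has to be opened in an editor. The following is a minimal sketch, not a tested recovery tool. The helper names vmdk_field and set_parent_cid are made up for illustration, GNU grep and dd are assumed, and the descriptor text is assumed to sit within the first 4096 bytes of the file. Because the new value is the same length as the old one, it is overwritten in place instead of rewriting the whole file. Run it only on a copy.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any VMware tool.

# Print the value of a descriptor field (e.g. CID or parentCID) found in the
# first 4096 bytes of a vmdk file.
vmdk_field() {
    # $1 = field name, $2 = vmdk path
    head -c 4096 "$2" | LC_ALL=C grep -a -m1 "^$1=" | cut -d= -f2 | tr -d '\r'
}

# Overwrite the snapshot's parentCID in place with a new 8-character value.
set_parent_cid() {
    # $1 = snapshot vmdk path, $2 = new parentCID
    old=$(vmdk_field parentCID "$1")
    [ "${#old}" -eq 8 ] && [ "${#2}" -eq 8 ] || { echo "unexpected CID length" >&2; return 1; }
    # Byte offset of the "parentCID=..." text within the file.
    offset=$(head -c 4096 "$1" | LC_ALL=C grep -a -b -o "parentCID=$old" | head -n 1 | cut -d: -f1)
    [ -n "$offset" ] || { echo "parentCID not found" >&2; return 1; }
    # conv=notrunc keeps everything after the patched bytes untouched.
    printf 'parentCID=%s' "$2" | dd of="$1" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Usage (paths are placeholders):
#   cid=$(vmdk_field CID "{base parent vmdk path}")
#   set_parent_cid "{snapshot vmdk path}" "$cid"
```

Matching on the full parentCID=... text rather than the bare 8 characters keeps the patch from touching the snapshot's own CID line.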

    Further explanation

    If you're interested, here's a deeper look into what you just did. At the beginning of each vmdk file is a disk descriptor section, which contains the properties of that virtual hard disk in text. The CID is a random unique identifier that identifies a particular state of the virtual disk - each time a change is made to the virtual hard disk, the CID changes.

    In normal operation, the CID property of the base parent virtual hard disk is synced with the parentCID property of the snapshot virtual hard disk to show that the two files work together. The snapshot has to work with the base parent to be useful, as it only contains the differences from the base parent virtual hard disk. It is important to note that it is the snapshot's parentCID property that is synced with the base parent's CID property, not just the two CID properties in the virtual hard disks - the two virtual hard disks are in a parent-child relationship.

    When you start up the base parent virtual hard disk on its own however, changes are made to that virtual hard disk without being in sync with the snapshot, so the CIDs no longer match.

    And when the CIDs no longer match, VMware complains because the snapshot is out of sync and the changes in the snapshot may not apply properly to the base parent anymore, possibly resulting in data corruption.

    By forcing the CIDs to match again, you effectively trick VMware into thinking it was never out of sync.
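
The comparison described here can be approximated from the command line. This is a rough sketch under assumptions: check_link is a made-up helper name, GNU grep is assumed, and the descriptor is assumed to appear within the first 4096 bytes of each file. It only compares the two values; it says nothing about whether the data inside the disks is still consistent.

```shell
#!/bin/sh
# Sketch only: check_link is a hypothetical helper, not a VMware command.

# Compare a snapshot's parentCID with its parent's CID.
# Prints "in sync" or "mismatch (<parentCID> != <CID>)"; returns 0 only on a match.
check_link() {
    # $1 = parent vmdk path, $2 = snapshot vmdk path
    parent_cid=$(head -c 4096 "$1" | LC_ALL=C grep -a -m1 '^CID=' | cut -d= -f2 | tr -d '\r')
    child_parent=$(head -c 4096 "$2" | LC_ALL=C grep -a -m1 '^parentCID=' | cut -d= -f2 | tr -d '\r')
    if [ -n "$parent_cid" ] && [ "$parent_cid" = "$child_parent" ]; then
        echo "in sync"
    else
        echo "mismatch ($child_parent != $parent_cid)"
        return 1
    fi
}

# Usage (paths are placeholders):
#   check_link "{base parent vmdk path}" "{snapshot vmdk path}"
```

With more than one snapshot, the same check would be repeated for each parent-child pair in the chain.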

    Depending on how complex your virtual machine is though, it may be worth recreating your virtual machine after recovering your data because it won't be known where the corruption is, if any. If you did anything to the base parent virtual hard disk before realising and shutting down, e.g. copied files around, the risk of corruption is higher.

     

Comments

1.

Hi Samuel --

One suggestion: instead of opening the snapshot file to replace the parentCID number (which, as you point out, doesn't work if the snapshot is >2GB), use command line utilities to make the change.

I found my parent CID from the base vmdk with:

grep --text -m2 CID= {base vmdk}

and the "wrong" parent CID in the snapshot vmdk:

grep --text -m2 CID= {snapshot vmdk}

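Then replaced the parentCID in the snapshot using a sed command (write to a new file, because redirecting onto the input file truncates it before sed reads it):

sed -e 's/parentCID={wrong CID}/parentCID={right CID}/' {snapshot vmdk} > {fixed snapshot vmdk}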

That should get it done!

Posted by Ian, 14th December 2007 8:54 AM
 
2.

Good idea Ian.

Gotta admit that thought never really crossed my mind as my snapshots were small enough. My Linux command/regexp skills aren't that awesome, so I had no idea about the sed command, but I'm kicking myself for not using grep to find the parentCID and CID lines - so obvious now.

Thanks for the tip!

Posted by [edgylogic] Sam, 16th December 2007 2:33 AM
 
3.

Thank you!

That certainly saved me from my own stupidity. Even before I had a chance to lose any sleep.

From now on my snapshots are going to experience very short lives.

Test and commit shall be the new motto.

Posted by Oliver, 2nd January 2008 9:52 PM
 
4.

fantastic advice

Posted by Francis, 10th January 2008 7:22 AM
 
5.

You friggen rock!  You saved my 6 hours of a night shift and 2 secs of stupidity!

Posted by Justin, 24th January 2008 12:54 AM
 
6.

Thank you! Great manual!

Posted by Lorenz, 18th February 2008 7:32 PM
 
7.

I would like to say thank you very much! This manual was very helpful. Now i will live longer.

if you have windows 32bit system you can open and save big files with the program "winhex". It is very fast - i tried it out because i had not linux on my notebook.

Posted by fallermax, 20th February 2008 3:04 AM
 
8.

What a day.. This really really saved me. Now I'll have to re-do our backup policy, keep everybody out of our vmware, but most of all CONGRATULATE you for your skills and knowledge. This saved me and now I have a much better understanding of those freaking snapshots. You are the MEN!

Posted by Lucas Violini, 5th March 2008 10:34 AM
 
9.

The outline of the fix is this:

1) BACK EVERYTHING UP

2) lookup the CID of the parent disk image

3) lookup the (incorrect) parentCID of the curdled snapshot

  (you'll need both to make the sed command as restrictive as possible)

4) KEEPING THE BACKUP, remove the original of the curdled snapshot file

5) pipe just the beginning of the curdled snapshot through sed to change the parentCID

     and save that as the beginning of the reconstructed snapshot

6) append the rest of the curdled snapshot to the end of the reconstructed snapshot

dd is the tool for snipping pieces of a HUGE file

And here's how it looks in practice:

[root@build12 virtual_machines]# cp -R sea-cm-winvm01 /backup

[root@build12 virtual_machines]# cd sea-cm-winvm01

[root@build12 sea-cm-winvm01]# head -10 /backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk

KDMV

# Disk DescriptorFile

version=1

CID=0d55cd6c

parentCID=b1ce363c                           <-- INCORRECT PARENT CID

createType="monolithicSparse"

parentFileNameHint="sea-cm-winvm01.vmdk"

# Extent description

RW 83886080 SPARSE "sea-cm-winvm01-000001.vmdk"

[root@build12 sea-cm-winvm01]# head -10 /backup/sea-cm-winvm01/sea-cm-winvm01.vmdk

KDMV

# Disk DescriptorFile

version=1

CID=d68511e8                                 <-- CORRECT PARENT CID

parentCID=ffffffff

createType="monolithicSparse"

# Extent description

RW 83886080 SPARSE "sea-cm-winvm01.vmdk"

[root@build12 sea-cm-winvm01]# rm sea-cm-winvm01-000001.vmdk

[root@build12 sea-cm-winvm01]# dd if=/backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk count=10 | sed 's/parentCID=b1ce363c/parentCID=d68511e8/' >sea-cm-winvm01-000001.vmdk

10+0 records in

10+0 records out

5120 bytes (5.1 kB) copied, 0.00722415 seconds, 709 kB/s

[root@build12 sea-cm-winvm01]# dd if=/backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk skip=10 seek=10 of=sea-cm-winvm01-000001.vmdk oflag=append

75301238+0 records in

75301238+0 records out

38554233856 bytes (39 GB) copied, 716.488 seconds, 53.8 MB/s

Posted by Mike Slass, 11th March 2008 10:39 AM
 
10.

Thanks Mike for that - the solutions for this problem are getting more and more streamlined :) The one thing I'd probably add is to pipe the head commands through grep to pick out only the CID and parentCID lines. A bash script anyone? (Although I'd rather do it line by line just to be sure; it's worth understanding how VMWare works underneath anyway.)

You gotta wonder why VMWare hasn't automated a solution for this yet given how common it seems to happen. Then again, I'm not sure if I want to use their solution, given their track record with VMWare Converter - it's extremely slow, and often randomly fails for no obvious reason.

Posted by [edgylogic] Sam, 12th March 2008 5:11 PM
 
11.

THANK YOU .... YOU JUST SAVED ME WITH YOUR BLOG...

I used "010 Editor" to edit the 30G file, which was very fast.. no loading time even.

Posted by WeSam, 16th April 2008 3:19 AM
 
13.

Thank you so much, you saved my life!

Posted by OMG, 26th June 2008 4:29 PM
 
14.

A MILLION THANKS!

I messed up the VMDKs of our main production server after attaching the main VMDK to another virtual machine to add some Windows files. When I attached the HD to the original virtual machine, it didn't boot any more and came up with the dreaded "parent modified..." message.

Fixed it on ESX server 3.5 from the console, with "head --lines=20" and "nano", following your instructions. Worked perfectly! The main file was 137Gb and there were 3 snapshot files, about 10Gb each. The snapshots were linked from last to first and then to the main file (3->2->1->original)

After fixing the CIDs, the machine worked fine, even after having written and then deleted some files inside the VMDK.

You are a Star!

Angel, Santiago de Compostela, Spain.

Posted by redfive, 28th June 2008 8:59 PM
 
15.

I solved the issue by following your steps.

But I didn't work in Linux.

I made a simple tool for Windows.

Main Code:

try
{
    txtResult.Clear();
    StreamReader sr = new StreamReader(txtPath.Text);
    decimal Up = nudLines.Value;
    decimal i = 0;
    while (i < Up)
    {
        txtResult.Text += sr.ReadLine();
        txtResult.AppendText("\r\n");
        i++;
    }
    sr.Close();
}
catch (Exception ex)
{
    MessageBox.Show(ex.ToString());
}

Posted by Tang, 31st July 2008 11:37 AM
 
16.

Great solution!!

It saved me a lot of time, because I don't have to reinstall the whole system.

Thankyou very much.

Posted by Martin, 3rd August 2008 7:56 PM
 
17.

I googled, found you, and you just saved my day. Quick, comprehensive, and easy.

Thanks a lot!

Posted by untill, 21st August 2008 8:18 PM
 
18.

Thank you. You saved my life. I moved our primary domain controller only to find it wouldn't start up. AHH.

Your fix did the trick. In ESX 3.5 the files you mention are much smaller now and the main disk is called ***flat.vmdk

Posted by Mark Fitzwater, 2nd September 2008 4:27 PM
 
19.

Guys... to sum it up : THANK YOU!!!

I too had the bad luck of a non-booting VM.

This page contains more relevant info than the rest of the web...

Again... THANKS, you guys saved me weeks of work!!

Bert

Posted by Bert De Ridder, 19th September 2008 5:49 AM
 
20.

Perfect, this saved my bacon.  We had the issue described but the problem occurred during a VCB backup.

Posted by Matt, 8th October 2008 1:31 PM
 
21.

You saved my day. 2 weeks of work were in that snapshot, then I just clicked an old "Copy of ".vmx file.

I had more adrenaline than blood in me. If you are ever looking for someone to marry you... ;-)

Thanx Ruediger

Posted by Ruediger, 9th November 2008 11:49 PM
 
22.

Thanks for your post. It got us through quite a pickle last night when ESXi blew up a VM during a snapshot deletion. Great stuff!

Posted by T. Lucas, 21st November 2008 4:30 AM
 
23.

Thanks for many hours saved

Posted by WT, 27th November 2008 4:04 PM
 
24.

For me this was the most useful blog entry since the beginning of blogs!

Somehow the CID's got messed up with the vmware-mount.pl command, so be careful with this and make a backup before using this command!

Thanks a lot!

Posted by Wolfgang Fiedler, 7th January 2009 7:23 AM
 
25.

Just a quick note. You can also get the Parent CID from the vmware.log It will say something like "Content ID mismatch (f6c96825 != f6c96826)."

Posted by puck, 23rd January 2009 8:46 AM
 
26.

After reading all of the Techno Babble, I finally came to an article that I can understand!!! Thank you many times!!!

Posted by Ken, 26th January 2009 11:48 AM
 
27.

I moved my VM (including snapshot) to a different blade, got this error message and had no idea what to do. VMWare forums not really that helpful or clear!

Thanks to your great article I'm up and running again.

Thanks very much, you saved my bacon >8)

Posted by Marcos, 10th March 2009 9:13 PM
 
28.

And here's one more sucker you saved with this article! Thank you very much!! Yesterday i noticed my vm-harddisk (60Gb) had grown to use 160 Gb (no typo..) of diskspace. Of course i didn't backup at that time because of lack of backup-facilities/diskspace at that moment.. So went further and further from home... :0)

Anyway, thnx once more!!

regards,

Peter

Posted by Peter, 14th March 2009 9:33 PM
 
29.

guys, i am new to vmware and have an esx 3i server running with 3 vm's. one of my colleagues has tried to clean up some snapshots and is now getting this error. how do i edit/access the vmdk files? is there a way to do this from the Infrastructure client or can i run the linux commands on the actual VMware server itself? pretty desperate - this is (or was) a live server. i have data backups but don't want to rebuild if the fix here is valid for my situation.

Posted by damian, 6th April 2009 12:47 AM
 
30.

You are a genius! Thank you

Posted by ricky, 30th April 2009 9:33 AM
 
31.

I use the following to fix this problem:

1.  putty into the host

2.  run vmware-cmd -l to find the path of the bad VM

3. cd /path/to/vm/

4. cat NAMEOFTHEDISK-xxxxxx.vmdk (for hard disk 1)

5. (A) cat NAMEOFPARENTDISK.vmdk (shown in the previous command's output for parentFileNameHint)

5. (B) keep running cat parent.vmdk until you have displayed each snapshot, its parent --> to the base.vmdk disk

example...

[root@VMHost01 Server1]# cat SERVER1-000001.vmdk

# Disk DescriptorFile

version=1

CID=fe498eca

parentCID=66ed665b

createType="vmfsSparse"

parentFileNameHint="SERVER1.vmdk"

# Extent description

RW 35358082 VMFSSPARSE "SERVER1-000001-delta.vmdk"

# The Disk Data Base

#DDB

[root@VMHost01 Server1]# cat SERVER1.vmdk

# Disk DescriptorFile

version=1

CID=66ed665b

parentCID=ffffffff

createType="vmfs"

# Extent description

RW 35358082 VMFS "SERVER1-flat.vmdk"

# The Disk Data Base

#DDB

ddb.adapterType = "buslogic"

ddb.geometry.sectors = "63"

ddb.geometry.heads = "255"

ddb.geometry.cylinders = "2200"

ddb.uuid = "60 00 C2 9e 7c 4c 5e c4-ea f5 d8 1e 6c 36 06 40"

ddb.geometry.biosSectors = "63"

ddb.geometry.biosHeads = "255"

ddb.geometry.biosCylinders = "2200"

ddb.toolsVersion = "7299"

ddb.virtualHWVersion = "4"

6.  Notice the CID and ParentCID entries of the output:

server1-000001.vmdk

CID=fe498eca

parentCID=332a8cca   <---- THIS ONE IS NOT POINTING TO...

server1.vmdk

CID=66ed665b    <---- THIS ONE

parentCID=ffffffff

7.  run the following:

nano server1-000001.vmdk

edit the parentCID by overwriting 332a8cca with 66ed665b

Do CTRL+X and then answer 'Y' to save the changes

8.  now go back and show the output again (use the cat commands like before) each parentCID should be pointing to the parent file that VMWare expects as listed in the parentFileNameHint.

9.  once this is completed if you do not need the snapshot you should also go to the VMClient and go to snapshot manager and delete all snapshots.  If there are no snapshots to delete, create one, then immediately delete it.  This should remove all snapshots.

*** NOTE *** if you have to create a snapshot, you may want to check its CID/ParentCID for all disks to make sure VMWare didn't do something stupid like create a snapshot file with a CID and ParentCID pointing to itself.  If that occurs, just fix the pointers like before and then delete all snapshots.

This works for me 100% of the time when I have any of the corruptRedo log errors, Parent had been modified errors, bad CID/ParentCID issues, or VM in stuck state due to failure to consolidate snapshots after VCB backups

Posted by Jones, 27th May 2009 1:37 AM
 
32.

Thanks for your help guys !

You saved me :)

Particulary regarding the tool to edit a 60GB vmdk file very quickly with no delay !! (010 Editor... great tool !)

Posted by Marc BENISTY, 18th June 2009 12:52 AM
 
33.

You are a true Saint.  After accidently clicking on the vmdk file while backing up my Mac, I got the dreaded error.   5 hours into the ordeal, I finally got things back up and running.  It took hours to back everything up first.  Then, I couldn't find a good editor that could handle a 14Gig snapshot on the Mac.   I finally found 0xED which Rocked!!!

www.suavetech.com/.../0xed.html

Great text editor loads the file instantly  -a quick hex edit of the Parent CID on the snapshot file and I was back up and running.   Keep up the good work and thanks for posting such a concise solution.    Going to bed now...

Posted by George K, 20th July 2009 6:53 PM
 
34.

I'm having a crisis with the same issue!  I need to get the Outlook data off of this silly VMWare Fusion windows XP partition.

Posted by Aaron, 9th August 2009 12:44 AM
 
35.

Thanks for this article!! You saved my ass, thank you!

Posted by Jeff, 27th August 2009 3:03 PM
 
36.

Thanks a lot for googleing around and cutting the information down to the essential point - you saved me really much time rescuing my VM :o)

And as it seems, I'm not the only stupid one who screws up links to parent VM-Disks etc. :o)

Posted by Sepp, 9th September 2009 11:05 PM
 
37.

Hi all,

just a maybe stupid question:

can I recover a VM with snapshots from the main *.vmdk,  *flat.vmdk and only *delta.vmdk (without the descriptor file *.vmdk)?

more precise, I have:

server.vmdk

server-flat.vmdk

server-000002-delta.vmdk

server-000003-delta.vmdk

I successfully recovered the VM from only *flat.vmdk, however, without snapshots - so is it possible to recover from *delta.vmdk all the rest of the files, like *.vmsn, *.vmdk (for *delta.vmdk)?

Posted by lemonadecowboy, 3rd October 2009 12:37 AM
 
38.

Hi,

i follow your steps but when i try to start vm i get "Failed to retrieve disk information for: xxx.vmdk" Success

and i can't startup :(

can you help me?

Posted by Rui, 4th November 2009 12:42 AM
 
39.

In case anyone needs more background to solve similar problems see sanbarrow.com/sickbay.html

Posted by Ulli, 30th November 2009 9:40 AM
 
40.

Thank You very very much.

I was trying to be slick by using a common VMDK on a RAM disk and run multiple concurrent copies of VMs each with their own vmx and snapshots. That part had been working for months.

Then I set one of my vmx's to use the base vmdk as independent-nonpersistent. That broke the entire chain and nothing would boot! I got this sick sick feeling in my stomach. Then I read your blog and I was hopeful. I updated the CID in my snapshots and it all worked.

Thank You, Thank You, Thank You.

Posted by Jonathan Marianu, 2nd February 2010 5:03 PM
 
41.

If you have a large amount of snapshots finding where is the broken CID can take a while.

Here is a script that will do that check and many others for you.

http://vmutils.blogspot.com/

And here is a video to help you avoiding surprises with the snapshots.

www.youtube.com/watch

Posted by Mr. TSE, 26th February 2010 8:45 AM
 
42.

Works like a charm saved my tons of work

Posted by Du-Wayne, 28th March 2010 1:23 PM
 
43.

I am about to try this now, wish me luck!

Posted by Toby, 7th April 2010 6:20 PM
 
44.

Unfortunately this didn't work for me, I had to use this method rdowell.blogspot.com/.../parent-virtual-disk-has-been-modified.html

Posted by Toby, 7th April 2010 6:43 PM
 
45.

Thanks, this saved me.  BTW, another JGSoft product, PowerGrep (from the makers of EditPad), lets you find and edit the CID very easily.

Posted by Steven T. Cramer, 23rd August 2010 3:09 AM
 
46.

Dude! You saved my life with this article. Thank you so much!

Posted by Jeroen, 10th September 2010 10:00 PM
 

New comments have been disabled for this post.

If you have something to ask about this post, drop me a message.