6

I want to recreate a dynamically allocated qcow2 image in order to shrink it. Is it sufficient that all unnecessary files have been deleted, or do I also need to fill the space formerly occupied by those files with zeros? In other words, is qemu-img filesystem-aware?

Josh
  • 61

5 Answers

4

Yes, you do need to zero-fill the filesystem if you want to recover the space used by deleted files. And no, qemu-img isn't fs-aware.

I forgot to do this for one VM image I created today (a minimal Debian Sid image for my OpenStack cloud at work) and it ended up being almost 900MB, even with "-c" for qcow2 compression.

I recreated it after running "dd if=/dev/zero of=/root/zero ; rm -f /root/zero ; shutdown -h now" inside the guest, and the image size shrank to about 335MB. That's a lot less (worthless) data to copy around whenever I start up a new instance.

There were a lot of deleted files, because the VM started out as Debian Squeeze and was apt-get upgraded to Sid.
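A minimal sketch of the zero-fill step as a reusable function. The function name, the filler file name and the optional size cap are my own (hypothetical) additions for illustration; the core is the same dd/rm sequence as above, with a sync in between so the zeros actually reach the virtual disk.

```shell
#!/bin/sh
# zero_fill DIR [MAX_MIB]
# Overwrite free space on the filesystem holding DIR with zeros, then delete
# the filler file. MAX_MIB is a hypothetical cap that is handy for a trial run;
# leave it out to fill all free space, as in the answer above.
zero_fill() {
    dir=$1
    max_mib=${2:-}
    filler="$dir/zero.fill"
    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$filler" bs=1M count="$max_mib" 2>/dev/null
    else
        # dd is expected to stop with "No space left on device" here
        dd if=/dev/zero of="$filler" bs=1M 2>/dev/null || true
    fi
    sync            # flush the zeros to the (virtual) disk before deleting
    rm -f "$filler"
}

# Inside the guest (as root):
#   zero_fill /root && shutdown -h now
# On the host afterwards (file names are placeholders):
#   qemu-img convert -O qcow2 -c old.qcow2 new.qcow2
```

Each filesystem in the guest needs its own pass, since the filler file only covers the filesystem it is written to.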

cas
  • 6,841
3

See also: virt-sparsify, a utility which can zero-fill filesystems inside disk images (supporting various formats):

http://libguestfs.org/virt-sparsify.1.html
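For illustration, a small wrapper around a typical invocation. The function name and the file names are hypothetical; --convert and --compress are options described on the man page linked above. The guest should be shut down first.

```shell
#!/bin/sh
# sparsify_to_qcow2 INPUT OUTPUT
# Copy INPUT to a sparsified, compressed qcow2 file OUTPUT.
# (Hypothetical helper name; the guest must not be running.)
sparsify_to_qcow2() {
    command -v virt-sparsify >/dev/null 2>&1 || {
        echo "virt-sparsify not found (it is part of the libguestfs tools)" >&2
        return 127
    }
    virt-sparsify --convert qcow2 --compress "$1" "$2"
}

# Example (placeholder file names):
#   sparsify_to_qcow2 original_image.qcow2 sparse_image.qcow2
```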

2

I'm using zerofree (apt-get install zerofree) for this task:

Zerofree finds the unallocated blocks with non-zero value content in an ext2, ext3 or ext4 file-system and fills them with zeroes

After that you can shrink your image (on newer systems the command is qemu-img rather than kvm-img):
kvm-img convert -O qcow2 original_image.qcow2 deduplicated_image.qcow2
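A rough sketch of running the same two steps from the host instead of inside the guest, assuming the guest is shut down, /dev/nbd0 is free, and the partition holds an ext2/3/4 filesystem (zerofree wants it unmounted or mounted read-only). The function names and the DRY_RUN switch are hypothetical; with DRY_RUN=1 the commands are only printed, which is worth doing first since all of this runs as root.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# zerofree_image IMAGE PARTITION_NUMBER OUTPUT
# Zero the free blocks of one partition inside IMAGE, then write a
# compacted copy to OUTPUT.
zerofree_image() {
    image=$1
    part=$2
    output=$3
    dev=/dev/nbd0        # assumption: this nbd device is not in use

    run modprobe nbd max_part=16 || return
    run qemu-nbd --connect "$dev" "$image" || return
    # The partition nodes can take a moment to appear after connecting.
    run zerofree -v "${dev}p${part}"
    status=$?
    # Always disconnect, even if zerofree failed.
    run qemu-nbd --disconnect "$dev"
    [ "$status" -eq 0 ] || return "$status"
    run qemu-img convert -O qcow2 "$image" "$output"
}

# Dry run first (placeholder file names):
#   DRY_RUN=1 zerofree_image original_image.qcow2 1 deduplicated_image.qcow2
```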

ThorstenS
  • 3,170
0

Just to amplify on phoeagon's excellent suggestion which, I agree, is both quicker and better than sdelete. The reason for this is that e2image, ntfsclone and partclone all have file system knowledge (partclone is the utility on which Clonezilla is ultimately based). The whole disk is not scanned; only the blocks actually in use are saved to the new image.

sdelete under Windows expands a sparse image to its full size, and is super slow. Zerofree under Linux on ext4 partitions does not expand disk images (but is still quite slow).

Here's an example session for cloning and shrinking a VirtualBox VDI disk image to a QEMU qcow2 image on a Linux host.

These commands are done as root; if done incorrectly, they could wipe your image file or your host! Be very careful.

The example converts an MBR-based Windows 10 VDI image to qcow2 (to show two popular virtualisation formats).

# Check file system size
vboximg-mount -i "$(/bin/pwd)/win10.vdi" --list

# Create a (big) sparse qcow2 to accommodate the partitions found in the command above
qemu-img create -f qcow2 new.qcow2 300G

# Mount the vdi disk in ./mount with the VirtualBox mount utility.
# We don't mount the individual partitions in ./mount
vboximg-mount -i "$(/bin/pwd)/win10.vdi" --rw mount

# Connect the new qcow2 image, in this instance to /dev/nbd0
modprobe nbd max_part=16
qemu-nbd --connect /dev/nbd0 ./new.qcow2

# Change to the vdi mounted directory
cd mount

# Copy the mbr with the partition table
# ./vhdd is the raw device.
# For EFI based boot systems, you'd be better to use sgdisk:
#   sgdisk --backup=../part_table.bin ./vhdd
#   sgdisk --load-backup=../part_table.bin /dev/nbd0
#   sgdisk -G /dev/nbd0
# But for mbr:
dd if=./vhdd of=/dev/nbd0 bs=512 count=1

# Check the image has our partition table and make sure its device nodes have been set up by a kernel re-read.
fdisk /dev/nbd0
ls -la /dev/nbd0*
# vbox partitions are zero based (vol0,vol1,vol2); nbd based partitions start at 1 (nbd0p1,nbd0p2,nbd0p3).

# Copy the first partition properly with dd
# It's only 100MB.
dd if=./vol0 of=/dev/nbd0p1 bs=4M status=progress

# Copy the third partition properly with dd
dd if=./vol2 of=/dev/nbd0p3 bs=4M status=progress

# Use ntfsclone or partclone.ntfs to copy the main partition
partclone.ntfs -d -b -s ./vol1 -o /dev/nbd0p2

# Get out of the ./mount dir
cd ..

# Umount vdi
umount mount

# Disconnect the nbd device
qemu-nbd --disconnect /dev/nbd0
# Remove the module
sleep 2
rmmod nbd

mv new.qcow2 win10.qcow2
# Change the image to not be root
chown user_name: win10.qcow2

If you are new to all this, you will find that images swell back to a certain size (though usually less than the size from which they were originally reduced), because operating systems, especially Windows, create and delete a lot of temporary files.
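To keep an eye on that growth, something like the following can report how much space an image file really occupies on the host. It is only a sketch: the function name is made up, and it assumes a Linux host where stat accepts -c (GNU coreutils or busybox). "qemu-img info" reports similar figures as "virtual size" and "disk size".

```shell
#!/bin/sh
# image_usage FILE
# Print the apparent size of FILE and the space actually allocated for it
# on the host filesystem, both in bytes. (Hypothetical helper name.)
image_usage() {
    apparent=$(stat -c '%s' "$1") || return 1
    blocks=$(stat -c '%b' "$1")     # number of allocated blocks
    blksize=$(stat -c '%B' "$1")    # size in bytes of each of those blocks
    printf 'apparent=%s allocated=%s\n' "$apparent" "$((blocks * blksize))"
}

# Example (placeholder file name):
#   image_usage win10.qcow2
```

When the allocated figure has crept well above what the guest reports as used space, it is probably time to repeat the clone.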

Thanks once again to phoeagon for pointing me in this direction (I haven't got the rep to up-vote his answer).

rahrah
  • 1
0

Personally, I think it works better to clone the disk using Clonezilla or Symantec Ghost. It's a lot quicker than filling up the drive with zeros. It also avoids growing the image even more.

I have done this with Ghost and Windows guests countless times. It's actually quicker if the used space is smaller than the space to be zeroed. Also, you can use qemu-nbd to mount the images and run Clonezilla from the host, avoiding the hassle of Clonezilla-within-guest. Either way it's always much quicker than sdelete/dd in my experience. (Also, I often end up with no space available on the host for a full zero-out-guest-disk operation, so filling up the available space in the guest is seldom feasible for me.)

phoeagon
  • 101