Xen guest and loop device being busy

I had accidentally given the same UUID to two guests and didn’t realise it until I tried to start the second one. I generated a new UUID, but the guest still failed to start, this time with a different error claiming that its disk image was in use by another guest.

# xm create node6
Using config file "/etc/xen/node6".
Error: Device 768 (vbd) could not be connected.
File /data/guests/node6.img is loopback-mounted through /dev/loop5,
which is mounted in a guest domain,
and so cannot be mounted now
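
Since the error names /dev/loop5, one more check that could be made at this point is to ask the kernel which loop devices are backed by the image. This is only a sketch, not something from the original session: it assumes util-linux's losetup is available in dom0 and that `losetup -a` prints lines shaped like `/dev/loop5: [0811]:161693706 (/data/guests/node6.img)`. The helper name loops_for_image is made up for illustration.

```shell
# loops_for_image IMAGE (hypothetical helper)
# Reads `losetup -a`-style lines on stdin and prints the loop devices whose
# backing file, shown in parentheses as the last field, is exactly IMAGE.
# Assumes the path has no spaces and that nothing follows the parentheses.
loops_for_image() {
    awk -v img="($1)" '$NF == img { sub(/:$/, "", $1); print $1 }'
}

# Possible usage as root:
#   losetup -a | loops_for_image /data/guests/node6.img
# On util-linux versions that support it, `losetup -j FILE` gives a similar answer.
```

If a device turns up here while no guest is using the image, `losetup -d` on that device is the usual way to detach it.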

That was odd, as I was sure the disk image was not used by another guest. After a couple more tries, I used the ‘lsof’ command to check which processes were using the guests’ images. I keep all the disk images under /data/guests:

# lsof +D /data/guests/ | grep node6
qemu-dm  6250 root    6u   REG   8,17 4194304000 161693706 /data/guests/node6.img
qemu-dm  7121 root    6u   REG   8,17 4194304000 161693706 /data/guests/node6.img
qemu-dm  8906 root    6u   REG   8,17 4194304000 161693706 /data/guests/node6.img
qemu-dm 11262 root    6u   REG   8,17 4194304000 161693706 /data/guests/node6.img

Four different processes were holding the disk image of the node6 guest open. The first one was left over from the first attempt to boot the guest with the wrong UUID, and the others from my attempts to boot the guest after I changed the UUID. As none of these processes corresponded to a running instance of the guest, I killed all of them:

# for i in $(lsof +D /data/guests/ | grep node6 | awk '{print $2}'); do kill -9 "$i"; done
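
A slightly more careful variant of the loop above would match on the exact image path instead of grepping for "node6" anywhere in the line, and would print the PIDs for review before anything is killed. The following is a minimal sketch under that assumption; pids_for_image is a hypothetical helper name, and it relies on the image path being the last column of the lsof output, as in the listing above.

```shell
# pids_for_image IMAGE (hypothetical helper)
# Reads lsof-style output on stdin and prints each PID (column 2) once,
# keeping only rows whose NAME column (last field) is exactly IMAGE.
pids_for_image() {
    awk -v img="$1" '$NF == img && !seen[$2]++ { print $2 }'
}

# Possible usage: review the list first, then kill.
#   lsof +D /data/guests/ | pids_for_image /data/guests/node6.img
```

Matching the whole path avoids catching an unrelated file such as node60.img, and seeing the list first makes it easier to try a plain kill before falling back to kill -9.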

As a side note, if one of these had been the last, successful attempt to run the guest, I would have had to identify the running instance by checking the PIDs. The following command would return the PIDs of the failed attempts to start the guest:

# lsof +D /data/guests/ | grep node6 | grep -v \
"$(ps aux | grep node6 | grep -v grep | awk '{print $2}')" | awk '{print $2}'
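
One caveat with filtering by `grep -v` on a PID is that it matches substrings: a running PID of 625 would also hide 6250, or any line whose size or inode number happens to contain those digits. A hedged sketch of a stricter filter, working on a plain list of PIDs (one per line) and using a made-up helper name exclude_pid:

```shell
# exclude_pid KEEP (hypothetical helper)
# Reads one PID per line on stdin and prints every PID except KEEP.
# -x matches whole lines only and -F disables regex interpretation,
# so excluding 625 does not also drop 6250.
exclude_pid() {
    grep -vxF -- "$1"
}

# Possible usage, with 11262 standing in for the PID of the running instance:
#   lsof +D /data/guests/ | grep node6 | awk '{print $2}' | exclude_pid 11262
```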

Once I killed the processes, I checked again with lsof and everything looked good:

# lsof +D /data/guests/
COMMAND  PID USER   FD   TYPE DEVICE       SIZE      NODE NAME
qemu-dm 3345 root    6u   REG   8,17 4194304000 161693699 /data/guests/node1.img
qemu-dm 3633 root    6u   REG   8,17 4194304000 161693702 /data/guests/node2.img
qemu-dm 3788 root    6u   REG   8,17 4194304000 161693703 /data/guests/node3.img
qemu-dm 3945 root    6u   REG   8,17 4194304000 161693704 /data/guests/node4.img
qemu-dm 4154 root    6u   REG   8,17 4194304000 161693705 /data/guests/node5.img
qemu-dm 4513 root    6u   REG   8,17 4194304000 161693708 /data/guests/node8.img
qemu-dm 4996 root    6u   REG   8,17 4194304000 161693707 /data/guests/node7.img

But trying to boot the guest again gave the same error. I then powered off all of the running guests (node1-5, 7, 8) and tried to boot node6. It failed again with the same error. I was a bit puzzled, as there was no zombie instance listed by ‘xm list’ and no zombie guest ID listed by XenStore when running ‘xenstore-list backend/vbd’. Oddly, though, in my previous experience backend/vbd shouldn’t be present at all when no guests are running, yet it was. The only explanation I could think of was a “ghost” zombie instance keeping the disk image busy, since node6 had become a zombie at some point during my attempt to start it with the wrong UUID. My last try before rebooting the whole system was to remove the whole backend/vbd:

# xenstore-rm backend/vbd

Once I did, I restarted the xend daemon and tried to start node6 once again. It worked! I then booted the rest of the guests and every one of them was happy.
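
Removing the whole backend/vbd subtree is a blunt tool. A narrower check that could be tried first is to compare the domain IDs that have a vbd backend entry against the domains xend actually knows about. The sketch below assumes `xenstore-list backend/vbd` prints one domain ID per line and that `xm list` prints a header line followed by rows with the ID in the second column. The helper name orphan_domids and the temporary file names are hypothetical.

```shell
# orphan_domids XS_FILE XM_FILE (hypothetical helper)
#   XS_FILE: saved output of `xenstore-list backend/vbd` (one domain ID per line)
#   XM_FILE: saved output of `xm list` (header line, then Name ID Mem ...)
# Prints IDs that have a vbd backend entry but no matching domain.
# Assumes XM_FILE is not empty (xm list always shows at least Domain-0).
orphan_domids() {
    awk 'NR == FNR { if (FNR > 1) live[$2] = 1; next }
         NF && !($1 in live) { print $1 }' "$2" "$1"
}

# Possible usage as root:
#   xenstore-list backend/vbd > /tmp/xs.txt
#   xm list > /tmp/xm.txt
#   orphan_domids /tmp/xs.txt /tmp/xm.txt
```

Any ID it prints could then be removed on its own with xenstore-rm on backend/vbd/ID rather than dropping the whole subtree. In the situation described above this might still have come back empty, since no zombie ID was listed.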

I still can’t determine the exact cause, but my guess is that it was the “ghost” zombie guest left behind by the failed startup attempt with the wrong UUID while another guest instance was running with the same UUID.

Note: It’s really funny how “kills”, “zombies”, “daemons” and “ghosts” go along with computers 😛
