Kernel KVM virtualization development
From: "Brian J. Murrell" <brian@interlinx.bc.ca>
To: kvm@vger.kernel.org
Subject: disk corruption after virsh destroy
Date: Tue, 02 Jul 2013 10:40:11 -0400
Message-ID: <kquoo5$a80$1@ger.gmane.org>

I have a cluster of VMs set up with shared virtio-scsi disks.  The 
purpose of sharing a disk is that if a VM goes down, another can pick up 
and mount the (ext4) filesystem on the shared disk and provide service 
from it.

But just to be super clear, only one VM ever has a filesystem mounted at 
a time even though multiple VMs technically can access the device at the 
same time.  Before mounting a filesystem, a VM makes absolutely sure 
that no other node has it mounted.

That said, what I am finding is that when a node dies and another 
node tries to mount the (ext4) filesystem, it is found dirty and needs 
an fsck.

My understanding is that with ext{3,4}, this should not be the case and 
indeed it is my experience, on real hardware with coherent disk caching 
(i.e. no non-battery-backed caching disk controllers lying to the O/S 
about what has been written to physical disk) that this is the case. 
That is, a node failing does not leave an ext{3,4} filesystem dirty such 
that it needs an fsck.

So, clearly, somewhere between the KVM VM and the physical disk, there 
is a cache that is resulting in the guest O/S believing data is being 
written to physical disk that is not actually being written there.  To 
that end, I have set "cache=none" on these shared disks, but this does 
not seem to have fixed the problem.

Here is my KVM commandline.  Please bear with the unfortunate line 
wrapping since my MUA (Thunderbird) doesn't allow for one to specify 
lines which shouldn't be wrapped.  I have tried to ameliorate that by 
indenting all of the lines that start command line options with two spaces.

/usr/bin/qemu-kvm -name wtm-60vm5 -S -M pc-0.14 -enable-kvm -m 8192 \
   -smp 1,sockets=1,cores=1,threads=1 \
   -uuid 5cbc2568-e32d-11e2-9c1f-001e67293bea -no-user-config \
   -nodefaults \
   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/wtm-60vm5.monitor,server,nowait \
   -mon chardev=charmonitor,id=monitor,mode=control \
   -rtc base=utc -no-shutdown \
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
   -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 \
   -drive file=/var/lib/libvirt/images/wtm-60vm5.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=node1-root \
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
   -drive file=/dev/vg_00/disk1,if=none,id=drive-scsi0-0-0-0,format=raw,serial=disk1,cache=none \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
   -drive file=/dev/vg_00/disk2,if=none,id=drive-scsi0-0-0-1,format=raw,serial=disk2,cache=none \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1 \
   -drive file=/dev/vg_00/disk3,if=none,id=drive-scsi0-0-0-2,format=raw,serial=disk3,cache=none \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi0-0-0-2,id=scsi0-0-0-2 \
   -drive file=/dev/vg_00/disk4,if=none,id=drive-scsi0-0-0-3,format=raw,serial=disk4,cache=none \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=3,drive=drive-scsi0-0-0-3,id=scsi0-0-0-3 \
   -drive file=/dev/vg_00/disk5,if=none,id=drive-scsi0-0-0-4,format=raw,serial=disk5,cache=none \
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=4,drive=drive-scsi0-0-0-4,id=scsi0-0-0-4 \
   -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=27 \
   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:60:d9:05,bus=pci.0,addr=0x3 \
   -netdev tap,fd=31,id=hostnet1,vhost=on,vhostfd=32 \
   -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:60:a7:05,bus=pci.0,addr=0x8 \
   -chardev pty,id=charserial0 \
   -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:3 \
   -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 \
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7

Clearly it's the 5 scsi disks which are yielding corruption when a VM is 
destroyed with "virsh destroy".
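
As a sanity check on which drives actually carry "cache=none", the 
-drive options can be pulled out of the command line mechanically.  This 
is only a minimal Python sketch: the helper name drive_cache_modes is my 
own invention (not part of QEMU or libvirt), and the example string is an 
abbreviated copy of two of the -drive options above.  It only reports 
what is written on the command line; what an unset cache= means depends 
on the QEMU version, so it makes no claim about that.

```python
import shlex


def drive_cache_modes(cmdline):
    """Map each -drive id to its cache= value, or '(not set)' if absent.

    Hypothetical helper for illustration; it only parses text and does
    not talk to QEMU or libvirt.
    """
    args = shlex.split(cmdline)
    modes = {}
    # Pair every argument with the one that follows it, so that each
    # "-drive" flag is matched with its option string.
    for flag, value in zip(args, args[1:]):
        if flag != "-drive":
            continue
        opts = dict(kv.split("=", 1) for kv in value.split(",") if "=" in kv)
        key = opts.get("id", opts.get("file", "?"))
        modes[key] = opts.get("cache", "(not set)")
    return modes


# Abbreviated copy of two -drive options from the command line above.
example = (
    "qemu-kvm "
    "-drive file=/var/lib/libvirt/images/wtm-60vm5.qcow2,if=none,"
    "id=drive-virtio-disk0,format=qcow2,serial=node1-root "
    "-drive file=/dev/vg_00/disk1,if=none,id=drive-scsi0-0-0-0,"
    "format=raw,serial=disk1,cache=none"
)

if __name__ == "__main__":
    for drive_id, mode in drive_cache_modes(example).items():
        print(f"{drive_id}: cache={mode}")
```

For the two drives in the example this reports no cache= setting on the 
qcow2 root disk and cache=none on the shared raw disk, which matches what 
I intended.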

Any ideas on what I need to do to ensure that writes at the guest O/S 
layer which are to be sent to physical disk actually make it to physical 
disk on the host?
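
To be concrete about what I mean by a write that is "to be sent to 
physical disk": inside the guest, an application would write and then 
explicitly flush, roughly as in the Python sketch below.  The function 
name durable_write is made up for illustration, and the directory fsync 
assumes a Linux guest.  This only shows the guest-side request; whether 
that flush is honoured all the way down to the host's physical disk is 
exactly what I am asking about.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path and ask the kernel to flush it to the device.

    Hypothetical helper for illustration. os.fsync() only requests the
    flush; it cannot verify what the layers below the guest actually do.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file data and metadata
    finally:
        os.close(fd)

    # Also flush the containing directory so the directory entry for a
    # newly created file is persisted (Linux behaviour assumed).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "marker")
        durable_write(target, b"written and flushed\n")
        print(os.path.getsize(target), "bytes written to", target)
```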

Of course, I am happy to provide any additional information, debugging, 
etc. that may be needed.

Cheers,
b.


Thread overview: 5+ messages
2013-07-02 14:40 Brian J. Murrell [this message]
2013-07-02 15:26 ` disk corruption after virsh destroy Brian J. Murrell
2013-07-03  8:47 ` Stefan Hajnoczi
2013-07-06 13:03   ` Bernd Schubert
2013-07-15  1:23     ` Stefan Hajnoczi
