From: Bernd Schubert <bernd.schubert@fastmail.fm>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "Brian J. Murrell" <brian@interlinx.bc.ca>, kvm@vger.kernel.org
Subject: Re: disk corruption after virsh destroy
Date: Sat, 06 Jul 2013 15:03:07 +0200
Message-ID: <51D8158B.6020504@fastmail.fm>
In-Reply-To: <20130703084726.GB17434@stefanha-thinkpad.muc.redhat.com>

On 07/03/2013 10:47 AM, Stefan Hajnoczi wrote:
> On Tue, Jul 02, 2013 at 10:40:11AM -0400, Brian J. Murrell wrote:
>> I have a cluster of VMs setup with shared virtio-scsi disks.  The
>> purpose of sharing a disk is that if a VM goes down, another can
>> pick up and mount the (ext4) filesystem on the shared disk and provide
>> service to it.
>>
>> But just to be super clear, only one VM ever has a filesystem
>> mounted at a time even though multiple VMs technically can access
>> the device at the same time.  A VM mounting a filesystem ensures
>> absolutely that no other node has it mounted before mounting it.
>>
>> That said, what I am finding is that when a node dies and
>> another node tries to mount the (ext4) filesystem, it is found dirty
>> and needs an fsck.
>>
>> My understanding is that with ext{3,4}, this should not be the case
>> and indeed it is my experience, on real hardware with coherent disk
>> caching (i.e. no non-battery-backed caching disk controllers lying
>> to the O/S about what has been written to physical disk) that this
>> is the case. That is, a node failing does not leave an ext{3,4}
>> filesystem dirty such that it needs an fsck.
>>
>> So, clearly, somewhere between the KVM VM and the physical disk,
>> there is a cache that is resulting in the guest O/S believing data
>> is being written to physical disk that is not actually being written
>> there.  To that end, I have ensured that on these shared disks that
>> I set "cache=none", but this does not seem to have fixed the
>> problem.
> 
> I expect journal replay and possibly fsck when an ext4 file system was
> left in a mounted state and with I/O pending (e.g. due to power
> failure).
> 
> A few questions:
> 
> 1. Is the guest mounting the file system with barrier=0?  barrier=1 is
>    the default.
> 
> 2. Do the physical disks have a volatile write cache enabled?  If yes,
>    the guest should use barrier=1.  If the physical disks have a
>    non-volatile write cache or the write cache is disabled, then
>    barrier=0 is okay.

Er, why? As far as I understood Brian, the physical disks have not
been reset, so their cache should be irrelevant?
If the VM needs barrier=1, then there must be some VM caching involved,
but Brian tried to disable that. At least in the past that worked fine
with the emulated LSI SCSI controller. That way I simulated shared
storage, and as long as I used the raw disk format and cache=none there
was never any corruption (although qcow2 didn't work and introduced
issues).
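
To double-check that a shared disk really runs with cache=none and the
raw format, one option is to read the settings back from the libvirt
domain XML (for example the output of "virsh dumpxml"). The following is
only a rough sketch in Python: the function name disk_cache_report and
the sample XML (including its device paths) are made up for
illustration, and it assumes the usual <driver type=... cache=...>
layout of libvirt disk elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real one would come from `virsh dumpxml <domain>`.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/example/shared0'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/root.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disk_cache_report(domain_xml):
    """Return one dict per <disk> with its target, format and cache mode."""
    root = ET.fromstring(domain_xml)
    report = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        report.append({
            "target": target.get("dev") if target is not None else None,
            "format": driver.get("type") if driver is not None else None,
            # No cache attribute means the hypervisor default is used.
            "cache": driver.get("cache", "default") if driver is not None else "default",
            "shareable": disk.find("shareable") is not None,
        })
    return report


if __name__ == "__main__":
    for entry in disk_cache_report(SAMPLE_XML):
        print(entry)
```

A shared disk that shows up with a cache mode other than "none", or
with a format other than "raw", would be the first thing I would look
at.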

Brian, maybe you could figure out the pattern of the corruption? I need
to add a verify mode to ql-fstest, but with some network file systems on
top of ext4 it should also work.

https://bitbucket.org/aakef/ql-fstest
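
Independent of ql-fstest, a very small sketch of how the corruption
pattern could be located: fill a file with a per-block pattern derived
from the block index, fsync it, and later report which blocks no longer
match. This is not how ql-fstest works; the names block_pattern,
write_pattern and find_bad_blocks and the 4096-byte block size are
assumptions chosen for illustration.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size, adjust to the file system


def block_pattern(index):
    """Deterministic BLOCK_SIZE-byte pattern for a given block index."""
    digest = hashlib.sha256(str(index).encode()).digest()  # 32 bytes
    return digest * (BLOCK_SIZE // len(digest))


def write_pattern(path, num_blocks):
    """Write num_blocks patterned blocks and force them to stable storage."""
    with open(path, "wb") as f:
        for i in range(num_blocks):
            f.write(block_pattern(i))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, num_blocks):
    """Return the indices of blocks whose content differs from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(num_blocks):
            # A short read (truncated file) also counts as a mismatch.
            if f.read(BLOCK_SIZE) != block_pattern(i):
                bad.append(i)
    return bad
```

The idea would be to run write_pattern() in the guest, kill the guest
with virsh destroy after the fsync has returned, and then run
find_bad_blocks() from the node that takes over. Whether the bad blocks
are contiguous, zero-filled or contain old data should hint at which
layer lost the writes.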


Cheers,
Bernd



Thread overview: 5+ messages
2013-07-02 14:40 disk corruption after virsh destroy Brian J. Murrell
2013-07-02 15:26 ` Brian J. Murrell
2013-07-03  8:47 ` Stefan Hajnoczi
2013-07-06 13:03   ` Bernd Schubert [this message]
2013-07-15  1:23     ` Stefan Hajnoczi
