* Possible RBD inconsistencies with kvm+Windows 7
@ 2012-02-03 18:19 Josh Pieper
  2012-02-03 19:55 ` Josh Durgin
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Pieper @ 2012-02-03 18:19 UTC
  To: ceph-devel

I have a Windows 7 guest running under kvm/libvirt with RBD as a
backend to a cluster of 3 OSDs.  With this setup, I am seeing behavior
that looks suspiciously like disk corruption in the guest VM executing
some of our workloads.

For instance, in one occurrence, there is a Python function that
recursively deletes a large directory tree while the disk is otherwise
loaded.  For us, this occasionally fails because the OS reported that
all the files in the directory were deleted, but then reports the
directory is not empty when going to remove it.  In another, a simple
test application writes new files to a directory every 50ms, then
after 6s verifies that at least 3 files were written, also while the
disk is under heavy load.
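
A minimal sketch of the second workload described above, for anyone who wants to try it on their own guest. The real test application is not shown in this thread, so the function name, file names, and parameters here are hypothetical; only the 50ms interval, the 6s window, and the 3-file minimum come from the description.

```python
import os
import tempfile
import time


def write_then_count(directory, interval_s=0.05, duration_s=6.0, minimum=3):
    """Write one small file every interval_s seconds for duration_s seconds,
    then check that at least `minimum` files are visible in the directory.

    Returns (files_written, files_visible, passed).
    """
    deadline = time.monotonic() + duration_s
    written = 0
    while time.monotonic() < deadline:
        # Hypothetical naming scheme; the original application's is unknown.
        path = os.path.join(directory, "probe_%06d.txt" % written)
        with open(path, "w") as handle:
            handle.write("x")
        written += 1
        time.sleep(interval_s)
    # Ask the filesystem what it believes is in the directory now.
    visible = len(os.listdir(directory))
    return written, visible, visible >= minimum
```

Calling write_then_count() on an empty scratch directory on the RBD-backed disk, while something else loads that disk, approximates the scenario. A mismatch between the written and visible counts would be the interesting result.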

We have never ever seen these failures on bare metal, or on kvm
instances backed by an LVM volume in years of operation, but they
happen every couple of hours with RBD.  Unfortunately, I have been
unsuccessful when attempting to create synthetic test cases to
demonstrate the inconsistent RBD behavior.
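
For illustration, a synthetic version of the recursive-delete case might look like the sketch below. It is not the function from the actual workload; the tree shape, file sizes, and names are assumptions, and a test of this kind may well fail to reproduce the problem, as noted above.

```python
import os
import tempfile


def build_tree(root, depth=3, fanout=4, files_per_dir=5):
    """Populate root with a nested tree of small files (hypothetical shape)."""
    for index in range(files_per_dir):
        with open(os.path.join(root, "f%d.dat" % index), "wb") as handle:
            handle.write(b"\0" * 1024)
    if depth > 0:
        for index in range(fanout):
            child = os.path.join(root, "d%d" % index)
            os.mkdir(child)
            build_tree(child, depth - 1, fanout, files_per_dir)


def delete_tree(root):
    """Delete the tree bottom-up.

    os.rmdir raises OSError if the OS still reports entries in a directory,
    which is the symptom described in this message.
    """
    for current, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.remove(os.path.join(current, name))
        for name in dirnames:
            os.rmdir(os.path.join(current, name))
    os.rmdir(root)
```

Running build_tree() followed by delete_tree() in a loop, with other I/O hitting the same disk, is the closest simple analogue of the failing workload.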

Has anyone else seen similar inconsistent RBD behavior, or have ideas
how to diagnose further?

For reference, I am running ceph 0.41, qemu-kvm 1.0 on Ubuntu 11.10
amd64.

Regards,
Josh


* Re: Possible RBD inconsistencies with kvm+Windows 7
  2012-02-03 18:19 Possible RBD inconsistencies with kvm+Windows 7 Josh Pieper
@ 2012-02-03 19:55 ` Josh Durgin
  2012-02-03 20:15   ` Josh Pieper
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Durgin @ 2012-02-03 19:55 UTC
  To: Josh Pieper; +Cc: ceph-devel

On 02/03/2012 10:19 AM, Josh Pieper wrote:
> I have a Windows 7 guest running under kvm/libvirt with RBD as a
> backend to a cluster of 3 OSDs.  With this setup, I am seeing behavior
> that looks suspiciously like disk corruption in the guest VM executing
> some of our workloads.
>
> For instance, in one occurrence, there is a Python function that
> recursively deletes a large directory tree while the disk is otherwise
> loaded.  For us, this occasionally fails because the OS reported that
> all the files in the directory were deleted, but then reports the
> directory is not empty when going to remove it.  In another, a simple
> test application writes new files to a directory every 50ms, then
> after 6s verifies that at least 3 files were written, also while the
> disk is under heavy load.
>
> We have never ever seen these failures on bare metal, or on kvm
> instances backed by a LVM volume in years of operation, but they
> happen every couple of hours with RBD.  Unfortunately, I have been
> unsuccessful when attempting to create synthetic test cases to
> demonstrate the inconsistent RBD behavior.
>
> Has anyone else seen similar inconsistent RBD behavior, or have ideas
> how to diagnose further?

What fs are your osds using? A while ago there was a bug in ext4's
fiemap that sometimes caused incorrect reads - if you set
filestore_fiemap_threshold larger than your object size, you can test
whether fiemap is the problem.
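
As a rough sketch of picking such a value: the snippet below assumes the image uses 4 MiB objects (RBD object order 22) and that the option lives in the [osd] section of ceph.conf. Both are assumptions to check against your own setup, and the helper names are made up for this example.

```python
RBD_OBJECT_ORDER = 22  # assumption: 4 MiB objects; check your image's order


def fiemap_threshold_for(order=RBD_OBJECT_ORDER):
    """Return a threshold in bytes that is larger than one object of this order."""
    object_size = 1 << order
    # Any value above the object size satisfies the suggestion; doubling
    # the object size just makes the margin obvious.
    return 2 * object_size


def render_osd_fragment(order=RBD_OBJECT_ORDER):
    """Render a ceph.conf fragment; the [osd] section placement is assumed."""
    return "[osd]\n    filestore fiemap threshold = %d\n" % fiemap_threshold_for(order)


print(render_osd_fragment())
```

If the problem still occurs with the larger threshold in place, fiemap is probably not the cause.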

Are you using the rbd_writeback_window option? If so, does the
corruption occur without it?

In any case, a log of this occurring with debug_ms=1 and debug_rbd=20 
from qemu will tell us if there are out-of-order operations happening.

>
> For reference, I am running ceph 0.41, qemu-kvm 1.0 on ubuntu 11.10
> amd64.
>
> Regards,
> Josh



* Re: Possible RBD inconsistencies with kvm+Windows 7
  2012-02-03 19:55 ` Josh Durgin
@ 2012-02-03 20:15   ` Josh Pieper
  2012-04-02 14:58     ` Josh Pieper
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Pieper @ 2012-02-03 20:15 UTC
  To: Josh Durgin; +Cc: ceph-devel

Josh Durgin wrote:
> On 02/03/2012 10:19 AM, Josh Pieper wrote:
> >I have a Windows 7 guest running under kvm/libvirt with RBD as a
> >backend to a cluster of 3 OSDs.  With this setup, I am seeing behavior
> >that looks suspiciously like disk corruption in the guest VM executing
> >some of our workloads.
> >
> >For instance, in one occurrence, there is a Python function that
> >recursively deletes a large directory tree while the disk is otherwise
> >loaded.  For us, this occasionally fails because the OS reported that
> >all the files in the directory were deleted, but then reports the
> >directory is not empty when going to remove it.  In another, a simple
> >test application writes new files to a directory every 50ms, then
> >after 6s verifies that at least 3 files were written, also while the
> >disk is under heavy load.
> >
> >We have never ever seen these failures on bare metal, or on kvm
> >instances backed by a LVM volume in years of operation, but they
> >happen every couple of hours with RBD.  Unfortunately, I have been
> >unsuccessful when attempting to create synthetic test cases to
> >demonstrate the inconsistent RBD behavior.
> >
> >Has anyone else seen similar inconsistent RBD behavior, or have ideas
> >how to diagnose further?
> 
> What fs are your osds using? A while ago there was a bug in ext4's
> fiemap that sometimes caused incorrect reads - if you set
> filestore_fiemap_threshold larger than your object size, you can test
> whether fiemap is the problem.

The OSDs are using xfs.  In my testing with 0.40, btrfs had incredible
performance problems after a day or so of operation.  The last I
heard, ext4 could potentially have data loss due to its limited xattr
support.

> Are you using the rbd_writeback_window option? If so, does the
> corruption occur without it?

Yes I was.  In prior tests, performance was abysmal without it.  I
will test without it, but our runs will load the system very
differently when they are going so slowly.

> In any case, a log of this occurring with debug_ms=1 and
> debug_rbd=20 from qemu will tell us if there are out-of-order
> operations happening.

Great, I will attempt to record some.

Regards,
Josh


* Re: Possible RBD inconsistencies with kvm+Windows 7
  2012-02-03 20:15   ` Josh Pieper
@ 2012-04-02 14:58     ` Josh Pieper
  2012-04-02 15:17       ` Josh Durgin
  0 siblings, 1 reply; 5+ messages in thread
From: Josh Pieper @ 2012-04-02 14:58 UTC
  To: Josh Durgin; +Cc: ceph-devel

Josh Pieper wrote:
> Josh Durgin wrote:
> > What fs are your osds using? A while ago there was a bug in ext4's
> > fiemap that sometimes caused incorrect reads - if you set
> > filestore_fiemap_threshold larger than your object size, you can test
> > whether fiemap is the problem.
> 
> The OSDs are using xfs.  In my testing with 0.40, btrfs had incredible
> performance problems after a day or so of operation.  The last I
> heard, ext4 could potentially have data loss due to its limited xattr
> support.
> 
> > Are you using the rbd_writeback_window option? If so, does the
> > corruption occur without it?
> 
> Yes I was.  In prior tests, performance was abysmal without it.  I
> will test without it, but our runs will load the system very
> differently when they are going so slowly.
> 
> > In any case, a log of this occurring with debug_ms=1 and
> > debug_rbd=20 from qemu will tell us if there are out-of-order
> > operations happening.
> 
> Great, I will attempt to record some.

Response much delayed.

I have finally gotten around to doing more tests here, now with ceph
0.44.1, although the kvm version is still the same at 1.0.

Disabling the rbd_writeback_window option definitely makes all the
problems clear up.  With it on, I can trigger a failure approximately
2 or 3 times per day, whereas with it off, I have been problem free
for a week now.

I have not yet managed to get our kvm to run with the appropriate
logging parameters.  For various reasons it is a lot easier for our
kvm instances to run through libvirt.  I have been passing the
rbd_writeback_window option by just appending a
":rbd_writeback_window=x" to the filename in my libvirt xml file.
Doing the same thing with debug_rbd didn't appear to get the option to
the right place no matter which form I tried.  Is there any secret
easy way to get kvm/qemu rbd debugging options turned on when invoked
through libvirt?

Regards,
Josh Pieper


* Re: Possible RBD inconsistencies with kvm+Windows 7
  2012-04-02 14:58     ` Josh Pieper
@ 2012-04-02 15:17       ` Josh Durgin
  0 siblings, 0 replies; 5+ messages in thread
From: Josh Durgin @ 2012-04-02 15:17 UTC
  To: Josh Pieper; +Cc: ceph-devel

On 04/02/2012 07:58 AM, Josh Pieper wrote:
> Josh Pieper wrote:
>> Josh Durgin wrote:
>>> What fs are your osds using? A while ago there was a bug in ext4's
>>> fiemap that sometimes caused incorrect reads - if you set
>>> filestore_fiemap_threshold larger than your object size, you can test
>>> whether fiemap is the problem.
>>
>> The OSDs are using xfs.  In my testing with 0.40, btrfs had incredible
>> performance problems after a day or so of operation.  The last I
>> heard, ext4 could potentially have data loss due to its limited xattr
>> support.
>>
>>> Are you using the rbd_writeback_window option? If so, does the
>>> corruption occur without it?
>>
>> Yes I was.  In prior tests, performance was abysmal without it.  I
>> will test without it, but our runs will load the system very
>> differently when they are going so slowly.
>>
>>> In any case, a log of this occurring with debug_ms=1 and
>>> debug_rbd=20 from qemu will tell us if there are out-of-order
>>> operations happening.
>>
>> Great, I will attempt to record some.
>
> Response much delayed.
>
> I have finally gotten around to doing more tests here, now with ceph
> 0.44.1, although the kvm version is still the same at 1.0.
>
> Disabling the rbd_writeback_window option definitely makes all the
> problems clear up.  With it on, I can trigger a failure approximately
> 2 or 3 times per day, whereas with it off, I have been problem free
> for a week now.

That's good to hear. If this does turn out to be a request ordering 
problem from rbd_writeback_window, rbd caching should fix it.
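
To make "request ordering problem" concrete, here is a toy model only. It says nothing about how librbd or rbd_writeback_window actually behave; it just shows that two writes to the same offset leave different final contents depending on the order in which they are applied. The offsets and payload strings are invented for the example.

```python
def apply_writes(writes):
    """Apply (offset, data) writes in the given order to a dict-backed 'disk'."""
    disk = {}
    for offset, data in writes:
        disk[offset] = data
    return disk


# Order in which a hypothetical guest issued two writes to the same block.
issued = [(0, "entry present"), (0, "entry deleted")]

in_order = apply_writes(issued)
reordered = apply_writes(list(reversed(issued)))

print(in_order[0])   # entry deleted
print(reordered[0])  # entry present
```

If something like the second case happened to filesystem metadata, the guest could see a directory entry it had already removed, which would fit the "directory is not empty" symptom.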

> I have not yet managed to get our kvm to run with the appropriate
> logging parameters.  For various reasons it is a lot easier for our
> kvm instances to run through libvirt.  I have been passing the
> rbd_writeback_window option by just appending a
> ":rbd_writeback_window=x" to the filename in my libvirt xml file.
> Doing the same thing with debug_rbd didn't appear to get the option to
> the right place no matter which form I tried.  Is there any secret
> easy way to get kvm/qemu rbd debugging options turned on when invoked
> through libvirt?

You probably just need to add 'log_to_stderr=1', or 'log_file=path/to/file'.
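
A small sketch of assembling the resulting colon-separated string, assuming the usual qemu form of rbd:pool/image followed by :key=value pairs. The pool name, image name, log path, and helper function are placeholders invented for the example, and values that themselves contain ':' are not handled.

```python
def rbd_drive_spec(pool, image, options):
    """Build 'rbd:pool/image:key=value:...' from a list of (key, value) pairs."""
    parts = ["rbd:%s/%s" % (pool, image)]
    parts.extend("%s=%s" % (key, value) for key, value in options)
    return ":".join(parts)


spec = rbd_drive_spec(
    "rbd", "win7",  # hypothetical pool and image names
    [
        ("debug_ms", 1),
        ("debug_rbd", 20),
        ("log_file", "/tmp/qemu-rbd.log"),  # placeholder path
    ],
)
print(spec)  # rbd:rbd/win7:debug_ms=1:debug_rbd=20:log_file=/tmp/qemu-rbd.log
```

When going through libvirt, the part after "rbd:" would be what gets appended to the filename in the xml, in the same way as the rbd_writeback_window option mentioned above. Whether libvirt hands it through unchanged is worth confirming by looking at the resulting qemu command line.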

Thanks!
Josh
