All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Christoph Hellwig <hch@lst.de>
Cc: xen-devel@lists.xensource.com, qemu-devel@nongnu.org,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: [Xen-devel] Re: [Qemu-devel] [PATCH] qemu and qemu-xen: support empty write barriers in xen_disk
Date: Wed, 24 Nov 2010 10:18:40 -0800	[thread overview]
Message-ID: <4CED5700.9070204@goop.org> (raw)
In-Reply-To: <20101124165835.GF31124@lst.de>

On 11/24/2010 08:58 AM, Christoph Hellwig wrote:
> I had the discussion with Jeremy in Boston before, but let's repeat it
> here:
>
>  - is there actually any pre-existing xen backend that does properly
>    implement empty barries.  Back then we couldn't find any.
>  - if this is a new concept to Xen please do not define an empty
>    barrier primitive, but a new flush cache primitive.  That one
>    maps natively to the qemu I/O layer, and with recent Linux, NetBSD,
>    Windows, or Solaris guest will be a lot faster than a barrier
>    which drains the queue.
>
> Note that what your patch implements actually is a rather inefficient
> implementation of the latter.  You do none of the queue draining which
> the in-kernel blkback implementation does by submitting the old-style
> barrier bio.  While most filesystem do not care you introduce a quite
> subtile chance of data corruption for reiserfs, or ext4 with
> asynchronous journal commits on pre-2.6.37 kernels.

Yeah, I agree.  I think semantically empty WRITE_BARRIERs are supposed
to work, as evidenced by the effort blkback makes in trying to
specifically support them.  I haven't traced through to find out why it
EIOs them regardless - it seems to be coming from deeper in the block
subsystem (is there a "no payload" flag or something missing?).

But in the future, I think defining an unordered FLUSH operator like
Linux wants is a useful thing to do and implement (especially since it
amounts to standardising the ?BSD extension).  I'm not sure of their
precise semantics (esp WRT ordering), but I think its already OK.

(BTW, in case it wasn't clear, we're seriously considering - but not yet
committed to - using qemu as the primary PV block backend for Xen
instead of submitting the existing blkback code for upstream.  We still
need to do some proper testing and measuring to make sure it stacks up
OK, and work out how it would fit together with the rest of the
management stack.  But so far it looks promising.)

    J

WARNING: multiple messages have this Message-ID (diff)
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Christoph Hellwig <hch@lst.de>
Cc: xen-devel@lists.xensource.com, qemu-devel@nongnu.org,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: Re: [Qemu-devel] [PATCH] qemu and qemu-xen: support empty write barriers in xen_disk
Date: Wed, 24 Nov 2010 10:18:40 -0800	[thread overview]
Message-ID: <4CED5700.9070204@goop.org> (raw)
In-Reply-To: <20101124165835.GF31124@lst.de>

On 11/24/2010 08:58 AM, Christoph Hellwig wrote:
> I had the discussion with Jeremy in Boston before, but let's repeat it
> here:
>
>  - is there actually any pre-existing xen backend that does properly
>    implement empty barries.  Back then we couldn't find any.
>  - if this is a new concept to Xen please do not define an empty
>    barrier primitive, but a new flush cache primitive.  That one
>    maps natively to the qemu I/O layer, and with recent Linux, NetBSD,
>    Windows, or Solaris guest will be a lot faster than a barrier
>    which drains the queue.
>
> Note that what your patch implements actually is a rather inefficient
> implementation of the latter.  You do none of the queue draining which
> the in-kernel blkback implementation does by submitting the old-style
> barrier bio.  While most filesystem do not care you introduce a quite
> subtile chance of data corruption for reiserfs, or ext4 with
> asynchronous journal commits on pre-2.6.37 kernels.

Yeah, I agree.  I think semantically empty WRITE_BARRIERs are supposed
to work, as evidenced by the effort blkback makes in trying to
specifically support them.  I haven't traced through to find out why it
EIOs them regardless - it seems to be coming from deeper in the block
subsystem (is there a "no payload" flag or something missing?).

But in the future, I think defining an unordered FLUSH operator like
Linux wants is a useful thing to do and implement (especially since it
amounts to standardising the ?BSD extension).  I'm not sure of their
precise semantics (esp WRT ordering), but I think its already OK.

(BTW, in case it wasn't clear, we're seriously considering - but not yet
committed to - using qemu as the primary PV block backend for Xen
instead of submitting the existing blkback code for upstream.  We still
need to do some proper testing and measuring to make sure it stacks up
OK, and work out how it would fit together with the rest of the
management stack.  But so far it looks promising.)

    J

  reply	other threads:[~2010-11-24 18:20 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-24 13:08 [Qemu-devel] [PATCH] qemu and qemu-xen: support empty write barriers in xen_disk Stefano Stabellini
2010-11-24 13:08 ` Stefano Stabellini
2010-11-24 14:23 ` [Qemu-devel] " Gerd Hoffmann
2010-11-24 16:29   ` Kevin Wolf
2010-11-24 16:46     ` Stefano Stabellini
2010-12-14 18:43   ` [Xen-devel] " Ian Jackson
2010-12-14 18:43     ` Ian Jackson
2010-11-24 16:58 ` Christoph Hellwig
2010-11-24 18:18   ` Jeremy Fitzhardinge [this message]
2010-11-24 18:18     ` Jeremy Fitzhardinge
2010-11-24 18:44     ` [Xen-devel] " Christoph Hellwig
2010-11-24 18:44       ` Christoph Hellwig
2010-11-24 19:55       ` [Xen-devel] " Stefano Stabellini
2010-11-24 19:55         ` Stefano Stabellini
2010-11-26 11:03       ` [Xen-devel] " Kevin Wolf
2010-11-26 11:03         ` Kevin Wolf
2010-11-26 13:29         ` [Xen-devel] " Christoph Hellwig
2010-11-26 13:29           ` Christoph Hellwig
2010-11-26 15:25         ` [Xen-devel] " Stefano Stabellini
2010-11-26 15:25           ` Stefano Stabellini
     [not found]     ` <m2n.s.1PLKM9-000VR4@chiark.greenend.org.uk>
2010-11-25 19:30       ` [Xen-devel] " Ian Jackson
2010-11-25 19:30         ` Ian Jackson
2010-11-25 19:46         ` [Xen-devel] " Jeremy Fitzhardinge
2010-11-25 19:46           ` Jeremy Fitzhardinge
2010-11-25 23:30           ` [Xen-devel] " Christoph Hellwig
2010-11-25 23:30             ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CED5700.9070204@goop.org \
    --to=jeremy@goop.org \
    --cc=hch@lst.de \
    --cc=qemu-devel@nongnu.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.