From: Sage Weil <sage@newdream.net>
To: Kevin Wolf <kwolf@redhat.com>
Cc: ceph-devel@vger.kernel.org, Christian Brunner <chb@muc.de>,
	Qemu-devel@nongnu.org
Subject: Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
Date: Fri, 22 Oct 2010 09:22:00 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.1010220916000.24595@cobra.newdream.net>
In-Reply-To: <4CC14DD3.6060504@redhat.com>

On Fri, 22 Oct 2010, Kevin Wolf wrote:
> [ Adding qemu-devel to CC again ]
> 
> On 21.10.2010 20:59, Sage Weil wrote:
> > On Thu, 21 Oct 2010, Christian Brunner wrote:
> >> Hi,
> >>
> >> Is there a flush operation in librados? I guess the only way to
> >> handle this would be to wait until all aio requests are finished?
> 
> That's not the semantics of bdrv_flush, you don't need to wait for
> running requests. You just need to make sure that all completed requests
> are safe on disk so that they would persist even in case of a
> crash/power failure.

Okay, in that case we're fine.  librados doesn't declare a write committed 
until it is safely on disk on multiple backend nodes.  There is a 
mechanism to get an ack sooner, but the qemu storage driver does not use 
it.  
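
To illustrate why an empty flush is enough for such a backend, here is a
minimal sketch. It assumes a simplified, hypothetical driver table
(SketchDriver, sketch_flush and writethrough_flush are made-up names, not
QEMU's actual block driver interface). It only shows the two outcomes
discussed in this thread: -ENOTSUP when a driver has no flush callback,
and success from a no-op flush when every completed write is already
durable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block driver table.
 * This is NOT QEMU's real BlockDriver structure. */
typedef struct SketchDriver {
    const char *name;
    int (*flush)(void *opaque);   /* NULL if the driver has no flush */
} SketchDriver;

/* No-op flush for a backend that only reports a write as complete once
 * it is durable: there is nothing left to push to stable storage. */
static int writethrough_flush(void *opaque)
{
    (void)opaque;
    return 0;
}

/* Generic entry point: a driver without a flush callback cannot promise
 * durability, so the caller gets -ENOTSUP. */
static int sketch_flush(const SketchDriver *drv, void *opaque)
{
    assert(drv != NULL);
    if (drv->flush == NULL) {
        return -ENOTSUP;
    }
    return drv->flush(opaque);
}

/* Two example drivers: one without flush, one with the no-op flush. */
static const SketchDriver drv_without_flush = { "no-flush", NULL };
static const SketchDriver drv_writethrough  = { "writethrough", writethrough_flush };
```

In this sketch, sketch_flush(&drv_without_flush, NULL) yields -ENOTSUP,
while sketch_flush(&drv_writethrough, NULL) yields 0.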

> > There is no flush currently.  But librados does no caching, so in this 
> > case at least silently upgrading to cache=writethrough should work.
> 
> You're making sure that the data can't be cached in the server's page
> cache or volatile disk cache either, e.g. by using O_SYNC for the image
> file? If so, upgrading would be safe.

Right.

> > If that's a problem, we can implement a flush.  Just let us know.
> 
> Presumably providing a writeback mode with explicit flushes could
> improve performance. Upgrading to writethrough is not a correctness
> problem, though, so it's your decision if you want to implement it.

So is a bdrv_flush generated when, e.g., the guest filesystem issues a 
barrier, or would otherwise normally ask a SATA disk to flush its cache?

sage



> Kevin
> 
> >> ---------- Forwarded message ----------
> >> From: Kevin Wolf <kwolf@redhat.com>
> >> Date: 2010/10/21
> >> Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
> >> To: Christian Brunner <chb@muc.de>, Laurent Vivier
> >> <Laurent@vivier.eu>, MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> >> Cc: Qemu-devel@nongnu.org
> >>
> >>
> >> Hi all,
> >>
> >> I'm currently looking into adding a return value to qemu's bdrv_flush
> >> function and I noticed that your block drivers (nbd, rbd and sheepdog)
> >> don't implement bdrv_flush at all. bdrv_flush is going to return
> >> -ENOTSUP for any block driver not implementing this, effectively
> >> breaking these three drivers for anything but cache=unsafe.
> >>
> >> Is there a specific reason why your drivers don't implement this? I
> >> think I remember that one of the drivers always provides
> >> cache=writethrough semantics. It would be okay to silently "upgrade" to
> >> cache=writethrough, so in this case I'd just need to add an empty
> >> bdrv_flush implementation.
> >>
> >> Otherwise, we really cannot allow any option except cache=unsafe because
> >> that's the semantics provided by the driver.
> >>
> >> In any case, I think it would be a good idea to implement a real
> >> bdrv_flush function to allow the write-back cache modes cache=off and
> >> cache=writeback in order to improve performance over writethrough.
> >>
> >> Is this possible with your protocols, or can the protocol be changed to
> >> consider this? Any hints on how to proceed?
> >>
> >> Kevin
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >>
