qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	pingfank@linux.vnet.ibm.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC 00/13] aio: drop io_flush()
Date: Fri, 12 Apr 2013 12:04:01 +0200	[thread overview]
Message-ID: <20130412100401.GD3426@dhcp-200-207.str.redhat.com> (raw)
In-Reply-To: <20130412094936.GA4458@stefanha-thinkpad.redhat.com>

Am 12.04.2013 um 11:49 hat Stefan Hajnoczi geschrieben:
> On Fri, Apr 12, 2013 at 10:02:02AM +0200, Kevin Wolf wrote:
> > Am 11.04.2013 um 17:44 hat Stefan Hajnoczi geschrieben:
> > > Here's my entry to the "let's get rid of io_flush()" effort.  It's based on
> > > Paolo's insight about bdrv_drain_all() that the block layer already has a
> > > tracked_requests list.  io_flush() is redundant since the block layer already
> > > knows if requests are pending.
> > 
> > Except when there are requests that don't come from the guest, but are
> > issued internally. In this case, block.c doesn't know about them, but
> > only the block driver does, so we need a .bdrv_drain callback to tell
> > the block layer about these.
> > 
> > The one specific case that comes to mind is the QED timer for resetting
> > the dirty bit. I think you need to have the .bdrv_drain callback before
> > you can start ignoring .io_flush.
> 
> I think .bdrv_drain() might come in handy in the future if we need it.
> 
> Regarding the QED timer, it does not hook into bdrv_drain_all() today.
> The semantics are unchanged with this patch series applied.
> 
> The timer runs while the guest is running (vm_clock).  It doesn't fire
> during synchronous I/O since qemu_aio_wait() does not dispatch timers.

Okay, I think I need to be more precise. The scenario I have in mind is
the following one:

1. The timer triggers
2. QED sends bdrv_aio_flush() and returns
2b. The flush callback is called and we progress to qed_write_header
    (optional, but maybe having a write instead of flush makes the
    problem more visible) and return again
3. Someone calls bdrv_drain_all()

Before removing .io_flush:

4. bdrv_drain_all() just waits until the header update is complete
   because the raw-posix driver will return io_flush = 1 as long as the
   update is in flight. It will return only then.

After removing .io_flush:

4. bdrv_drain_all() sees that the QED block device doesn't have any
   requests in flight. The raw-posix driver isn't even queried because
   it's an anonymous BDS. bdrv_drain_all() returns, but the request is
   still in flight. Oops.

This specific case could be fixed as well by querying all BDSes instead
of only the named ones, but in general it's not necessary that a block
driver calls into a lower-level BDS for its background operations that
must be stopped.

Or actually, if you want to avoid .bdrv_drain for now, the patch that I
started when I thought a bit about this, had a default .bdrv_drain
implementation that just forwarded the request to bs->file if it wasn't
implemented by a block driver. For the QED case, this would work.

Kevin

  reply	other threads:[~2013-04-12 10:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-11 15:44 [Qemu-devel] [RFC 00/13] aio: drop io_flush() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 01/13] block: stop relying on io_flush() in bdrv_drain_all() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 02/13] dataplane/virtio-blk: check exit conditions before aio_poll() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 03/13] aio: stop using .io_flush() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 04/13] block/curl: drop curl_aio_flush() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 05/13] block/gluster: drop qemu_gluster_aio_flush_cb() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 06/13] block/iscsi: drop iscsi_process_flush() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 07/13] block/linux-aio: drop qemu_laio_completion_cb() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 08/13] block/nbd: drop nbd_have_request() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 09/13] block/rbd: drop qemu_rbd_aio_flush_cb() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 10/13] block/sheepdog: drop have_co_req() and aio_flush_request() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 11/13] dataplane/virtio-blk: drop flush_true() and flush_io() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 12/13] thread-pool: drop thread_pool_active() Stefan Hajnoczi
2013-04-11 15:44 ` [Qemu-devel] [RFC 13/13] aio: drop io_flush argument Stefan Hajnoczi
2013-04-12  8:02 ` [Qemu-devel] [RFC 00/13] aio: drop io_flush() Kevin Wolf
2013-04-12  9:49   ` Stefan Hajnoczi
2013-04-12 10:04     ` Kevin Wolf [this message]
2013-04-12 10:27       ` Paolo Bonzini
2013-04-12 12:06         ` Stefan Hajnoczi
2013-04-12 12:22           ` Kevin Wolf
2013-04-12 13:41             ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130412100401.GD3426@dhcp-200-207.str.redhat.com \
    --to=kwolf@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=pingfank@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).