From: Kevin Wolf <kwolf@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
John Snow <jsnow@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
Prasad Pandit <ppandit@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] vl.c/exit: pause cpus before closing block devices
Date: Tue, 8 Aug 2017 13:56:52 +0200
Message-ID: <20170808115652.GH4850@dhcp-200-186.str.redhat.com>
In-Reply-To: <5a5542b7-bc68-c032-24ea-12821b8b6a1a@redhat.com>

On 08.08.2017 at 13:04, Paolo Bonzini wrote:
> On 08/08/2017 12:02, Kevin Wolf wrote:
> > On 04.08.2017 at 13:46, Paolo Bonzini wrote:
> >> On 04/08/2017 11:58, Stefan Hajnoczi wrote:
> >>>> the root cause of this bug is related to this as well:
> >>>> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02945.html
> >>>>
> >>>> From commit 99723548 we started assuming (incorrectly?) that blk_
> >>>> functions always WILL have an attached BDS, but this is not always true,
> >>>> for instance, flushing the cache from an empty CDROM.
> >>>>
> >>>> Paolo, can we move the flight counter increment outside of the
> >>>> block-backend layer, is that safe?
> >>> I think the bdrv_inc_in_flight(blk_bs(blk)) needs to be fixed
> >>> regardless of the throttling timer issue discussed below. BB cannot
> >>> assume that the BDS graph is non-empty.
> >>
> >> Can we make bdrv_aio_* return NULL (even temporarily) if there is no
> >> attached BDS? That would make it much easier to fix.
> >
> > Would the proper fix be much more complicated than the following? I must
> > admit that I don't fully understand the current state of affairs with
> > respect to threading, AioContext etc. so I may well be missing
> > something.
>
> Not much, but it's not complete either. The issues I see are that: 1)
> blk_drain_all does not take the new counter into account;

Ok, I think this does the trick:

    void blk_drain_all(void)
    {
        BlockBackend *blk = NULL;

        bdrv_drain_all_begin();
        while ((blk = blk_all_next(blk)) != NULL) {
            blk_drain(blk);
        }
        bdrv_drain_all_end();
    }

> 2) bdrv_drain_all callers need to be audited to see if they should be
> blk_drain_all (or more likely, only device BlockBackends should be drained).

qmp_transaction() is unclear to me. It should be changed in some way
anyway because it uses bdrv_drain_all() rather than a begin/end pair.

do_vm_stop() and vm_stop_force_state() probably want blk_drain_all().

xen_invalidate_map_cache() - wtf? Looks like the wrong layer to do this,
but I guess blk_drain_all(), too.

block_migration_cleanup() is just lazy and really means a blk_drain()
for its own BlockBackends. blk_drain_all() as the simple conversion.

migration/savevm: Migration wants blk_drain_all() to get the devices
quiesced.

qemu-io: blk_drain_all(), too.

Hm, looks like there won't be many callers of bdrv_drain_all() left. :-)

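To illustrate why a begin/end pair matters for a caller like
qmp_transaction(), here is a toy model, not QEMU code. ToyNode and every
toy_* function are hypothetical names invented for this sketch, and
holding requests back in a queue during the drained section is an
assumption of the model. It only shows that a one-shot drain guarantees
quiescence at the instant it returns, while a begin/end pair keeps new
requests out for the whole section.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a block node; this is not QEMU's BlockDriverState. */
typedef struct ToyNode {
    unsigned quiesce_counter;  /* > 0 while inside a drained section */
    unsigned in_flight;        /* requests currently being processed */
    unsigned queued;           /* requests held back by a drained section */
} ToyNode;

/* Submit one request; returns true if it started immediately. */
static bool toy_submit(ToyNode *n)
{
    if (n->quiesce_counter > 0) {
        n->queued++;           /* assumption: held back rather than failed */
        return false;
    }
    n->in_flight++;
    return true;
}

/* Enter a drained section: keep new requests out, finish running ones. */
static void toy_drained_begin(ToyNode *n)
{
    n->quiesce_counter++;
    n->in_flight = 0;          /* models waiting for all completions */
}

/* Leave a drained section; the outermost end restarts held-back requests. */
static void toy_drained_end(ToyNode *n)
{
    assert(n->quiesce_counter > 0);
    if (--n->quiesce_counter == 0) {
        n->in_flight += n->queued;
        n->queued = 0;
    }
}

/* One-shot drain: quiescent only at the instant it returns. */
static void toy_drain(ToyNode *n)
{
    toy_drained_begin(n);
    toy_drained_end(n);
}
```

After toy_drain() a new toy_submit() starts right away, so whatever the
caller does next can race with it. Between toy_drained_begin() and
toy_drained_end() the same submit is held back until the section ends.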
> > Note that my blk_drain() implementation doesn't necessarily drain
> > blk_bs(blk) completely, but only those requests that came from the
> > specific BlockBackend. I think this is what the callers want, but
> > if otherwise, it shouldn't be hard to change.
>
> Yes, this should be what they want.

Apparently not; block jobs don't complete with it any more. I haven't
checked in detail, but it makes sense that they can have a BH (e.g. for
block_job_defer_to_main_loop) without a request being in flight.

So I'm including an unconditional bdrv_drain() again. Or I guess,
calling aio_poll() unconditionally and including its return value
in the loop condition would be the cleaner approach?

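A rough sketch of the two loop shapes, as a self-contained toy model
rather than the actual patch. ToyCtx, ToyBackend and all toy_* functions
are made up for illustration: toy_poll() stands in for a non-blocking
aio_poll() and the pending array for scheduled BHs. The toy assumes every
in-flight request has a completion callback pending, which real code
cannot assume (it would block in the poll instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not QEMU types. */
typedef struct ToyCtx ToyCtx;
typedef void ToyCallback(ToyCtx *ctx, void *opaque);

struct ToyCtx {
    ToyCallback *pending[8];   /* deferred callbacks, think BHs */
    void *opaque[8];
    size_t n_pending;
};

typedef struct ToyBackend {
    ToyCtx *ctx;
    unsigned in_flight;        /* requests issued through this backend */
} ToyBackend;

static void toy_schedule(ToyCtx *ctx, ToyCallback *cb, void *opaque)
{
    assert(ctx->n_pending < 8);
    ctx->pending[ctx->n_pending] = cb;
    ctx->opaque[ctx->n_pending] = opaque;
    ctx->n_pending++;
}

/* Run at most one pending callback; return whether progress was made. */
static bool toy_poll(ToyCtx *ctx)
{
    if (ctx->n_pending == 0) {
        return false;
    }
    ToyCallback *cb = ctx->pending[0];
    void *opaque = ctx->opaque[0];
    for (size_t i = 1; i < ctx->n_pending; i++) {
        ctx->pending[i - 1] = ctx->pending[i];
        ctx->opaque[i - 1] = ctx->opaque[i];
    }
    ctx->n_pending--;          /* dequeue first so cb may schedule more */
    cb(ctx, opaque);
    return true;
}

/* Completion of a request: drops the backend's in-flight counter. */
static void toy_request_complete(ToyCtx *ctx, void *opaque)
{
    ToyBackend *blk = opaque;
    (void)ctx;
    assert(blk->in_flight > 0);
    blk->in_flight--;
}

/* Deferred job completion: no request is in flight while it is pending. */
static void toy_job_complete(ToyCtx *ctx, void *opaque)
{
    (void)ctx;
    *(bool *)opaque = true;
}

/* Variant 1: only the request counter decides whether to poll. */
static void toy_drain_counter_only(ToyBackend *blk)
{
    while (blk->in_flight > 0) {
        if (!toy_poll(blk->ctx)) {
            break;             /* toy only: real code would block here */
        }
    }
}

/* Variant 2: poll unconditionally, keep going while there is progress. */
static void toy_drain_poll_always(ToyBackend *blk)
{
    bool progress;
    do {
        progress = toy_poll(blk->ctx);
        /* Toy invariant: an in-flight request implies a pending callback. */
        assert(progress || blk->in_flight == 0);
    } while (progress || blk->in_flight > 0);
}
```

With only a job completion pending and in_flight at zero, the first
variant returns without running it, which matches the symptom above. The
second variant runs it because the poll result is part of the loop
condition.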
Kevin