qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Emanuele Giuseppe Esposito <eesposit@redhat.com>,
	qemu-block@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
	John Snow <jsnow@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Fam Zheng <fam@euphon.net>,
	qemu-devel@nongnu.org
Subject: Re: [RFC PATCH v2 0/8] Removal of AioContext lock, bs->parents and ->children: new rwlock
Date: Tue, 24 May 2022 09:08:40 +0100
Message-ID: <YoySiI+ReM2O8WEs@stefanha-x1.localdomain>
In-Reply-To: <67993f7d-bc84-9929-0a28-10a441c3d5bd@redhat.com>

On Tue, May 24, 2022 at 09:55:39AM +0200, Paolo Bonzini wrote:
> On 5/22/22 17:06, Stefan Hajnoczi wrote:
> > However, I hit on a problem that I think Emanuele and Paolo have already
> > pointed out: draining is GS & IO. This might have worked under the 1 IOThread
> > model but it does not make sense for multi-queue. It is possible to submit I/O
> > requests in drained sections. How can multiple threads be in drained sections
> > simultaneously and possibly submit further I/O requests in their drained
> > sections? Those sections wouldn't be "drained" in any useful sense of the word.
> 
> Yeah, that works only if the drained sections are well-behaved.
> 
> "External" sources of I/O are fine; they are disabled using is_external, and
> don't drain themselves I think.

I/O requests for a given BDS may be executing in multiple AioContexts,
so how do you call aio_disable_external() on all relevant AioContexts?

We covered this below but I wanted to reply here in case someone else
reads this part without reading the rest.

> Mirror is the only I/O user of drain, and it's fine because it never submits
> I/O to the drained BDS.
>
> Drained sections in the main thread can be special cased to allow I/O
> (wrlock in this series would also allow I/O).
> 
> So I think that the "cooperation from all relevant places" that Kevin
> mentioned is already there, except for coroutine commands in the monitor.
> Those are a bad idea in my opinion and I'd rather revert commit eb94b81a94
> ("block: Convert 'block_resize' to coroutine") until we have a clearer idea
> of how to handle them.
> 
> I agree that it's basically impossible to review the change.  On the other
> hand, there's already a substantial amount of faith involved in the
> correctness of the current code.
> 
> In particular the AioContext lock does absolutely nothing to protect
> coroutines in the main thread against graph changes---both from the monitor
> (including BHs as in "block: Fix BB.root changing across bdrv_next()") and
> from BDS coroutines.  The former are unprotected; the latter are protected
> by drain only: using drain to protect against graph writes would be a matter
> of extending *existing* faith to the multi-iothread case.
> 
> Once the deadlock is broken, we can proceed to remove the AioContext lock
> and then introduce actual coroutine-based locking.
> 
> > Possible steps for AioContext removal
> > -------------------------------------
> > I also wanted to share my assumptions about multi-queue and AioContext removal.
> > Please let me know if anything seems wrong or questionable:
> > 
> > - IO code can execute in any thread that has an AioContext.
> > - Multiple threads may execute IO code at the same time.
> > - GS code only executes under the BQL.
> > 
> > For AioContext removal this means:
> > 
> > - bdrv_get_aio_context() becomes mostly meaningless since there is no need for
> >    a special "home" AioContext.
> 
> Correct.  bdrv_set_aio_context() remains useful as a way to set a home
> AioContext for sockets.
> 
> > - bdrv_coroutine_enter() becomes mostly meaningless because there is no need to
> >    run a coroutine in the BDS's AioContext.
> > - aio_disable_external(bdrv_get_aio_context(bs)) no longer works because many
> >    threads/AioContexts may submit new I/O requests. BlockDevOps.drained_begin()
> >    may be used instead (e.g. to temporarily disable ioeventfds on a multi-queue
> >    virtio-blk device).
> 
> This is a change that can be done independent of this work.
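
To make the BlockDevOps.drained_begin() idea above concrete, here is a toy
sketch in plain C. It is not QEMU code: ToyDevice, TOY_NUM_QUEUES and the
toy_*() functions are hypothetical names, and a bool per queue stands in for
an ioeventfd. The only point it illustrates is that the device itself stops
all of its queues, so it no longer matters which AioContext polls which
queue. It has no locking, so it assumes begin/end are called from one thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_QUEUES 4 /* hypothetical queue count */

/* Toy stand-in for a multi-queue device; not a QEMU type. */
typedef struct ToyDevice {
    bool ioeventfd_enabled[TOY_NUM_QUEUES]; /* one flag per queue */
    unsigned drained_count;                 /* nesting depth of drained sections */
} ToyDevice;

void toy_device_init(ToyDevice *dev)
{
    dev->drained_count = 0;
    for (size_t i = 0; i < TOY_NUM_QUEUES; i++) {
        dev->ioeventfd_enabled[i] = true;
    }
}

/* Outermost begin stops every queue from accepting new requests. */
void toy_drained_begin(ToyDevice *dev)
{
    if (dev->drained_count++ == 0) {
        for (size_t i = 0; i < TOY_NUM_QUEUES; i++) {
            dev->ioeventfd_enabled[i] = false;
        }
    }
}

/* Outermost end re-enables every queue. */
void toy_drained_end(ToyDevice *dev)
{
    assert(dev->drained_count > 0);
    if (--dev->drained_count == 0) {
        for (size_t i = 0; i < TOY_NUM_QUEUES; i++) {
            dev->ioeventfd_enabled[i] = true;
        }
    }
}

/* True if a new request on this queue would currently be accepted. */
bool toy_queue_accepts_io(const ToyDevice *dev, size_t queue)
{
    return queue < TOY_NUM_QUEUES && dev->ioeventfd_enabled[queue];
}
```

Nested sections only re-enable the queues when the outermost
toy_drained_end() runs, which is the behaviour I would expect from a real
drained_begin()/drained_end() pair as well.
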
> 
> > - AIO_WAIT_WHILE() simplifies to
> > 
> >      while ((cond)) {
> >          aio_poll(qemu_get_current_aio_context(), true);
> >          ...
> >      }
> > 
> >    and the distinction between home AioContext and non-home context is
> >    eliminated. AioContext unlocking is dropped.
> 
> (I'll reply on this from elsewhere in the thread).
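
For readers who want to see the shape of the simplified loop quoted above,
here is a toy model in plain C. It is not QEMU code: ToyAioContext,
toy_aio_poll() and toy_wait_while_in_flight() are hypothetical names, and an
integer counter stands in for real completions. It only shows that the
waiter polls its own context, with no home/non-home distinction and no
AioContext unlocking.

```c
#include <stdbool.h>

/* Toy stand-in for an event loop; not QEMU's AioContext. */
typedef struct ToyAioContext {
    int pending; /* completions that polling can still deliver */
} ToyAioContext;

/* Toy aio_poll(): deliver at most one completion, report whether progress was made. */
bool toy_aio_poll(ToyAioContext *ctx, int *in_flight)
{
    if (ctx->pending == 0) {
        return false;
    }
    ctx->pending--;
    (*in_flight)--;
    return true;
}

/*
 * Simplified wait loop: always poll the caller's own context until the
 * condition (*in_flight > 0) becomes false. Returns the number of polls
 * that made progress.
 */
int toy_wait_while_in_flight(ToyAioContext *current, int *in_flight)
{
    int polls = 0;

    while (*in_flight > 0) {
        if (!toy_aio_poll(current, in_flight)) {
            break; /* toy-only guard: a real blocking poll would sleep here */
        }
        polls++;
    }
    return polls;
}
```

The early break exists only so the toy cannot spin forever; it is not part
of the proposal.
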
> 
> > Does this make sense? I haven't seen these things in recent patch series.
> 
> I agree, and yeah all these are blocked on protecting graph modifications.
> 
> In parallel to the block layer discussions, it's possible to work on
> introducing a request queue lock in virtio-blk and virtio-scsi.  That's the
> only thing that relies on the AioContext lock outside the block layer.

What would the request queue lock protect in virtio-blk? In virtio-scsi
I guess a lock is needed to protect SCSI target emulation state?

Stefan