qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, Wen Congyang <wencongyang2@huawei.com>,
	Xie Changlong <xiechanglong.d@gmail.com>,
	qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Max Reitz <mreitz@redhat.com>, John Snow <jsnow@redhat.com>
Subject: Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Date: Tue, 13 Jul 2021 14:10:22 +0100	[thread overview]
Message-ID: <YO2QvuBqbw58fuo/@stefanha-x1.localdomain> (raw)
In-Reply-To: <629fb077-9d0a-7c33-0b2e-d055c0493005@redhat.com>

On Mon, Jul 12, 2021 at 10:41:46AM +0200, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 08/07/2021 15:04, Stefan Hajnoczi wrote:
> > On Thu, Jul 08, 2021 at 01:32:12PM +0200, Paolo Bonzini wrote:
> > > On 08/07/21 12:36, Stefan Hajnoczi wrote:
> > > > > What is very clear from this patch is that it is strictly related
> > > > > to the bdrv_* and lower-level calls, because they also internally
> > > > > check or even use the AioContext lock. Therefore, in order to make
> > > > > it work, I temporarily added some aio_context_acquire()/release()
> > > > > pairs around the functions that still assert for it or assume it is
> > > > > held and temporarily unlock it (unlock()/lock()).
> > > > 
> > > > Sounds like the issue is that this patch series assumes AioContext locks
> > > > are no longer required for calling the blk_*()/bdrv_*() APIs? That is
> > > > not the case yet, so you then had to add those aio_context_acquire()
> > > > calls back elsewhere. This approach introduces unnecessary risk. I think we
> > > > should wait until blk_*()/bdrv_*() no longer requires the caller to hold
> > > > the AioContext lock before applying this series.
> > > 
> > > In general I'm in favor of pushing the lock further down into smaller and
> > > smaller critical sections; it's a good approach to make further audits
> > > easier until it's "obvious" that the lock is unnecessary.  I haven't yet
> > > reviewed Emanuele's patches to see if this is what he's doing where he's
> > > adding the acquire/release calls, but that's my understanding of both his
> > > cover letter and your reply.
> > 
> > The problem is the unnecessary risk. We know what the goal is for
> > blk_*()/bdrv_*() but it's not quite there yet. Does making changes in
> > block jobs help solve the final issues with blk_*()/bdrv_*()?
> 
> Correct me if I am wrong, but it seems to me that the bdrv_*()/blk_*()
> operations mostly take care of building, modifying and walking the BDS
> graph. So since graph nodes can belong to different AioContexts, it makes
> sense that we have a lock when modifying the graph, right?
> 
> If so, we can simply try to replace the AioContext lock with a graph
> lock, or something like that. But I am not sure about this.

Block graph manipulation (all_bdrv_states and friends) requires the BQL.
It has always been this way.
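
To illustrate what I mean, here is a sketch of my own (the function name is
made up, it is not code from this series) of a graph-change helper asserting
that invariant with qemu_mutex_iothread_locked():

    #include "qemu/osdep.h"
    #include "qemu/main-loop.h"
    #include "block/block_int.h"

    /* Hypothetical helper, not a function from the tree. */
    static void example_graph_change(BlockDriverState *bs)
    {
        /* Graph manipulation is a main-loop/BQL-only operation. */
        assert(qemu_mutex_iothread_locked());

        /* ... attach/detach children of bs, walk all_bdrv_states, etc. ... */
    }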

This raises the question: if block graph manipulation already happens under
the BQL and BlockDriver callbacks no longer need the AioContext lock, why
are aio_context_acquire() calls still needed in block jobs?
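
For concreteness, this is roughly the pattern I am questioning. It is a
simplified sketch with illustrative names rather than a specific call site
from the tree:

    #include "qemu/osdep.h"
    #include "block/aio.h"
    #include "sysemu/block-backend.h"

    /* Simplified sketch of the pattern in question; names are illustrative. */
    static int example_job_flush(BlockBackend *blk)
    {
        AioContext *ctx = blk_get_aio_context(blk);
        int ret;

        aio_context_acquire(ctx);
        ret = blk_flush(blk);       /* blk_*() call made while holding ctx */
        aio_context_release(ctx);

        return ret;
    }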

AIO_WAIT_WHILE() requires that the AioContext is acquired, according to
its documentation, but I'm not sure that is still true. AIO_WAIT_WHILE()
itself uses thread-safe/atomic primitives, so as long as the condition
being waited for is thread-safe too, it should work without the
AioContext lock.
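
Here is a minimal sketch of that case, using a hypothetical flag of my own
rather than code from the tree:

    #include "qemu/osdep.h"
    #include "qemu/atomic.h"
    #include "block/aio-wait.h"

    /* Hypothetical flag, cleared with qatomic_set(&busy, false) elsewhere. */
    static bool busy = true;

    static void example_wait_until_idle(AioContext *ctx)
    {
        /*
         * The documented contract is that ctx is acquired by the caller.
         * The condition below only reads an atomic, so the wait itself
         * should not depend on the AioContext lock.
         */
        AIO_WAIT_WHILE(ctx, qatomic_read(&busy));
    }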

Back to my comment about unnecessary risk: pushing the lock down is a
strategy for exploring the problem, but I'm not sure those intermediate
commits need to be merged into qemu.git/master, given the time required
to review them and the risk of introducing (temporary) bugs.
Maybe there's a benefit to this patch series that I've missed?

Stefan

Thread overview: 33+ messages
2021-07-07 16:58 [RFC PATCH 0/6] job: replace AioContext lock with job_mutex Emanuele Giuseppe Esposito
2021-07-07 16:58 ` [RFC PATCH 1/6] job: use getter/setters instead of accessing the Job fields directly Emanuele Giuseppe Esposito
2021-07-07 16:58 ` [RFC PATCH 2/6] job: _locked functions and public job_lock/unlock for next patch Emanuele Giuseppe Esposito
2021-07-08 10:50   ` Stefan Hajnoczi
2021-07-12  8:43     ` Emanuele Giuseppe Esposito
2021-07-13 13:32       ` Stefan Hajnoczi
2021-07-07 16:58 ` [RFC PATCH 3/6] job: minor changes to simplify locking Emanuele Giuseppe Esposito
2021-07-08 10:55   ` Stefan Hajnoczi
2021-07-12  8:43     ` Emanuele Giuseppe Esposito
2021-07-13 17:56   ` Eric Blake
2021-07-07 16:58 ` [RFC PATCH 4/6] job.h: categorize job fields Emanuele Giuseppe Esposito
2021-07-08 11:02   ` Stefan Hajnoczi
2021-07-12  8:43     ` Emanuele Giuseppe Esposito
2021-07-07 16:58 ` [RFC PATCH 5/6] job: use global job_mutex to protect struct Job Emanuele Giuseppe Esposito
2021-07-08 12:56   ` Stefan Hajnoczi
2021-07-12  8:43     ` Emanuele Giuseppe Esposito
2021-07-07 16:58 ` [RFC PATCH 6/6] jobs: remove unnecessary AioContext aquire/release pairs Emanuele Giuseppe Esposito
2021-07-08 10:36 ` [RFC PATCH 0/6] job: replace AioContext lock with job_mutex Stefan Hajnoczi
2021-07-08 11:32   ` Paolo Bonzini
2021-07-08 12:14     ` Kevin Wolf
2021-07-08 13:04     ` Stefan Hajnoczi
2021-07-12  8:41       ` Emanuele Giuseppe Esposito
2021-07-13 13:10         ` Stefan Hajnoczi [this message]
2021-07-13 15:18           ` Vladimir Sementsov-Ogievskiy
2021-07-13 16:38             ` Stefan Hajnoczi
2021-07-15 12:35               ` Vladimir Sementsov-Ogievskiy
2021-07-15 13:29                 ` Stefan Hajnoczi
2021-07-16 15:23           ` Kevin Wolf
2021-07-19  9:29             ` Stefan Hajnoczi
2021-07-19 14:54               ` Kevin Wolf
2021-07-08 13:09 ` Stefan Hajnoczi
2021-07-12  8:42   ` Emanuele Giuseppe Esposito
2021-07-13 13:27     ` Stefan Hajnoczi
