qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fiona Ebner <f.ebner@proxmox.com>
To: quintela@redhat.com, Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	"open list:Block layer core" <qemu-block@nongnu.org>,
	Michael Roth <michael.roth@amd.com>, Fam Zheng <fam@euphon.net>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Peter Xu <peterx@redhat.com>
Subject: Re: QMP (without OOB) function running in thread different from the main thread as part of aio_poll
Date: Tue, 2 May 2023 12:03:41 +0200	[thread overview]
Message-ID: <df3b995e-884c-8e3c-e0cf-b720ff4fff56@proxmox.com> (raw)
In-Reply-To: <87jzxw9cco.fsf@secure.mitica>

Am 28.04.23 um 18:54 schrieb Juan Quintela:
> Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 28.04.2023 um 10:38 hat Juan Quintela geschrieben:
>>> Kevin Wolf <kwolf@redhat.com> wrote:
>>>>> I am perhaps a bit ingenuous here, but it is there a way to convince
>>>>> qemu that snapshot_save_job_bh *HAS* to run on the main thread?
>>>>
>>>> I believe we're talking about a technicality here. I asked another more
>>>> fundamental question that nobody has answered yet:
>>>>
>>>> Why do you think that it's ok to call bdrv_writev_vmstate() without
>>>> holding the BQL?
>>>
>>> I will say this function starts by bdrv_ (i.e. block layer people) and
>>> endes with _vmstate (i.e. migration people).
>>>
>>> To be honest, I don't know.  That is why I _supposed_ you have an idea.
>>
>> My idea is that bdrv_*() can only be called when you hold the BQL, or
>> for BlockDriverStates in an iothread the AioContext lock.
>>
>> Apparently dropping the BQL in migration code was introduced in Paolo's
>> commit 9b095037527.
> 
> Damn.  I reviewed it, so I am as guilty as the author.
> 10 years later without problems I will not blame that patch.
> 
> I guess we changed something else that broke doing it without the lock.
> 
> But no, I still don't have suggestions/ideas.
>

I do feel like the issue might be very difficult to trigger under normal
circumstances. Depending on the configuration and what you do in the
guest, aio_poll in a vCPU thread does not happen often and I imagine
snapshot-save is also not a super frequent operation for most people. It
still takes me a while to trigger the issue by issuing lots of pflash
writes and running snapshot-save in a loop, I'd guess about 30-60
snapshots. Another reason might be that generated co-wrappers were less
common in the past?

>> I'm not sure what this was supposed to improve in
>> the case of snapshots because the VM is stopped anyway.

Is it? Quoting Juan:> d- snapshots are a completely different beast,
that don't really stop
>    the guest in the same way at that point, and sometimes it shows in
>    this subtle details.

>> Would anything bad happen if we removed the BQL unlock/lock section in
>> qemu_savevm_state() again?
> 
> Dunno.
> 
> For what is worth, I can say that it survives migration-test, but don't
> ask me why/how/...
> 
> Fiona, can you check if it fixes your troubles?
> 

Just removing the single section in qemu_savevm_state() breaks even the
case where snapshot_save_job_bh() is executed in the main thread,
because ram_init_bitmaps() will call qemu_mutex_lock_iothread_impl()
which asserts that it's not already locked.

Also removing the lock/unlock pair in ram_init_bitmaps() seems to work.
But I'm not sure what else a full semantic revert of commit 9b095037527
would entail.

Best Regards,
Fiona



  reply	other threads:[~2023-05-02 10:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-19 14:09 QMP (without OOB) function running in thread different from the main thread as part of aio_poll Fiona Ebner
2023-04-19 16:59 ` Paolo Bonzini
2023-04-20  6:11 ` Markus Armbruster
2023-04-20  6:55   ` Paolo Bonzini
2023-04-26 14:31     ` Fiona Ebner
2023-04-27 11:03       ` Kevin Wolf
2023-04-27 12:27         ` Fiona Ebner
2023-04-27 14:36           ` Juan Quintela
2023-04-27 14:56             ` Peter Xu
2023-04-28  7:53               ` Fiona Ebner
2023-04-28  7:23             ` Fiona Ebner
2023-04-28  7:47               ` Juan Quintela
2023-04-28  8:30                 ` Kevin Wolf
2023-04-28  8:38                   ` Juan Quintela
2023-04-28 12:22                     ` Kevin Wolf
2023-04-28 16:54                       ` Juan Quintela
2023-05-02 10:03                         ` Fiona Ebner [this message]
2023-05-02 10:25                           ` Fiona Ebner
2023-05-02 10:35                             ` Juan Quintela
2023-05-02 12:49                               ` Fiona Ebner
2023-05-02 10:30                           ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=df3b995e-884c-8e3c-e0cf-b720ff4fff56@proxmox.com \
    --to=f.ebner@proxmox.com \
    --cc=armbru@redhat.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=t.lamprecht@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).