From: Fiona Ebner <f.ebner@proxmox.com>
To: quintela@redhat.com, Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
"open list:Block layer core" <qemu-block@nongnu.org>,
Michael Roth <michael.roth@amd.com>, Fam Zheng <fam@euphon.net>,
Stefan Hajnoczi <stefanha@redhat.com>,
Thomas Lamprecht <t.lamprecht@proxmox.com>,
Peter Xu <peterx@redhat.com>
Subject: Re: QMP (without OOB) function running in thread different from the main thread as part of aio_poll
Date: Tue, 2 May 2023 12:03:41 +0200 [thread overview]
Message-ID: <df3b995e-884c-8e3c-e0cf-b720ff4fff56@proxmox.com> (raw)
In-Reply-To: <87jzxw9cco.fsf@secure.mitica>
Am 28.04.23 um 18:54 schrieb Juan Quintela:
> Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 28.04.2023 um 10:38 hat Juan Quintela geschrieben:
>>> Kevin Wolf <kwolf@redhat.com> wrote:
>>>>> I am perhaps a bit ingenuous here, but it is there a way to convince
>>>>> qemu that snapshot_save_job_bh *HAS* to run on the main thread?
>>>>
>>>> I believe we're talking about a technicality here. I asked another more
>>>> fundamental question that nobody has answered yet:
>>>>
>>>> Why do you think that it's ok to call bdrv_writev_vmstate() without
>>>> holding the BQL?
>>>
>>> I will say this function starts by bdrv_ (i.e. block layer people) and
>>> endes with _vmstate (i.e. migration people).
>>>
>>> To be honest, I don't know. That is why I _supposed_ you have an idea.
>>
>> My idea is that bdrv_*() can only be called when you hold the BQL, or
>> for BlockDriverStates in an iothread the AioContext lock.
>>
>> Apparently dropping the BQL in migration code was introduced in Paolo's
>> commit 9b095037527.
>
> Damn. I reviewed it, so I am as guilty as the author.
> 10 years later without problems I will not blame that patch.
>
> I guess we changed something else that broke doing it without the lock.
>
> But no, I still don't have suggestions/ideas.
>
I do feel like the issue might be very difficult to trigger under normal
circumstances. Depending on the configuration and what you do in the
guest, aio_poll in a vCPU thread does not happen often and I imagine
snapshot-save is also not a super frequent operation for most people. It
still takes me a while to trigger the issue by issuing lots of pflash
writes and running snapshot-save in a loop, I'd guess about 30-60
snapshots. Another reason might be that generated co-wrappers were less
common in the past?
>> I'm not sure what this was supposed to improve in
>> the case of snapshots because the VM is stopped anyway.
Is it? Quoting Juan:> d- snapshots are a completely different beast,
that don't really stop
> the guest in the same way at that point, and sometimes it shows in
> this subtle details.
>> Would anything bad happen if we removed the BQL unlock/lock section in
>> qemu_savevm_state() again?
>
> Dunno.
>
> For what is worth, I can say that it survives migration-test, but don't
> ask me why/how/...
>
> Fiona, can you check if it fixes your troubles?
>
Just removing the single section in qemu_savevm_state() breaks even the
case where snapshot_save_job_bh() is executed in the main thread,
because ram_init_bitmaps() will call qemu_mutex_lock_iothread_impl()
which asserts that it's not already locked.
Also removing the lock/unlock pair in ram_init_bitmaps() seems to work.
But I'm not sure what else a full semantic revert of commit 9b095037527
would entail.
Best Regards,
Fiona
next prev parent reply other threads:[~2023-05-02 10:04 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-19 14:09 QMP (without OOB) function running in thread different from the main thread as part of aio_poll Fiona Ebner
2023-04-19 16:59 ` Paolo Bonzini
2023-04-20 6:11 ` Markus Armbruster
2023-04-20 6:55 ` Paolo Bonzini
2023-04-26 14:31 ` Fiona Ebner
2023-04-27 11:03 ` Kevin Wolf
2023-04-27 12:27 ` Fiona Ebner
2023-04-27 14:36 ` Juan Quintela
2023-04-27 14:56 ` Peter Xu
2023-04-28 7:53 ` Fiona Ebner
2023-04-28 7:23 ` Fiona Ebner
2023-04-28 7:47 ` Juan Quintela
2023-04-28 8:30 ` Kevin Wolf
2023-04-28 8:38 ` Juan Quintela
2023-04-28 12:22 ` Kevin Wolf
2023-04-28 16:54 ` Juan Quintela
2023-05-02 10:03 ` Fiona Ebner [this message]
2023-05-02 10:25 ` Fiona Ebner
2023-05-02 10:35 ` Juan Quintela
2023-05-02 12:49 ` Fiona Ebner
2023-05-02 10:30 ` Juan Quintela
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df3b995e-884c-8e3c-e0cf-b720ff4fff56@proxmox.com \
--to=f.ebner@proxmox.com \
--cc=armbru@redhat.com \
--cc=fam@euphon.net \
--cc=kwolf@redhat.com \
--cc=michael.roth@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=stefanha@redhat.com \
--cc=t.lamprecht@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).