qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Reiter <s.reiter@proxmox.com>
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [PATCH] monitor/qmp: fix race on CHR_EVENT_CLOSED without OOB
Date: Mon, 22 Mar 2021 16:26:12 +0100	[thread overview]
Message-ID: <b5c777bf-d8c7-00df-4373-58ab4f06ae2a@proxmox.com> (raw)
In-Reply-To: <20210322110847.cdo477ve2gydab64@wobu-vie.proxmox.com>

On 3/22/21 12:08 PM, Wolfgang Bumiller wrote:
> On Thu, Mar 18, 2021 at 02:35:50PM +0100, Stefan Reiter wrote:
>> If OOB is disabled, events received in monitor_qmp_event will be handled
>> in the main context. Thus, we must not acquire a qmp_queue_lock there,
>> as the dispatcher coroutine holds one over a yield point, where it
>> expects to be rescheduled from the main context. If a CHR_EVENT_CLOSED
>> event is received just then, it can race and block the main thread by
>> waiting on the queue lock.
>>
>> Run monitor_qmp_cleanup_queue_and_resume in a BH on the iohandler
>> thread, so the main thread can always make progress during the
>> reschedule.
>>
>> The delaying of the cleanup is safe, since the dispatcher always moves
>> back to the iothread afterward, and thus the cleanup will happen before
>> it gets to its next iteration.
>>
>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
> 
> This is a tough one. It *may* be fine, but I wonder if we can approach
> this differently:
> 
>  From what I can gather we have the following call stacks & contexts:
> 
> Guarded lock (lock & release):
>    * monitor_qmp_cleanup_queue_and_resume
>      by monitor_qmp_event
>      by file handler (from I/O loop)
>      ^ iohandler_context (assuming that's where the file handling happens...)
>      (after this patch as BH though)
> 
>    * handle_qmp_command
>      a) by the json parser (which is also re-initialized by
>         monitor_qmp_event btw., haven't checked if that can also
>         "trigger" its methods immediately)
>      b) by monitor_qmp_read
>      by file handler (from I/O loop)
>      ^ iohandler_context
> 
> Lock-"returning":
>    * monitor_qmp_requests_pop_any_with_lock
>      by coroutine_fn monitor_qmp_dispatcher_co
>      ^ iohandler_context
> 
> Lock-releasing:
>    * coroutine_fn monitor_qmp_dispatcher_co
>      ^ qemu_aio_context
> 
> The only *weird* thing that immediately pops out here is
> `monitor_qmp_requests_pop_any_with_lock()` keeping a lock while
> switching contexts.

monitor_qmp_dispatcher_co? _pop_any_ doesn't switch contexts...

But yes, that is weird, as I mentioned in my original mail too.

> This is done in order to allow `AIO_WAIT_WHILE` to work while making
> progress on the events, but do we actually already need to be in this
> context for the OOB `monitor_resume()` call or can we defer the context
> switch to after having done that and released the lock?
> `monitor_resume()` itself seems to simply schedule a BH which should
> work regardless if I'm not mistaken. There's also a
> `readline_show_prompt()` call, but that *looks* harmless?

The BH should indeed be harmless since we don't schedule on 
qemu_get_current_aio_context, and the readline_show_prompt call we can 
ignore here since it's guarded with "!monitor_is_qmp(mon)".

> `monitor_resume()` is also called without the lock later on, so even if
> it needs to be in this context at that point for whatever reason, does
> it need the lock?
> 

It doesn't access the queue, so I don't see why it'd need the lock. And 
as you said, it currently works without too, actually, before commit 
88daf0996c ("qmp: Resume OOB-enabled monitor before processing the 
request") it always did so.

I'll cobble together a v2 with this in mind.



  reply	other threads:[~2021-03-22 15:37 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-18 13:35 [PATCH] monitor/qmp: fix race on CHR_EVENT_CLOSED without OOB Stefan Reiter
2021-03-22 11:08 ` Wolfgang Bumiller
2021-03-22 15:26   ` Stefan Reiter [this message]
2021-03-26 14:48   ` Markus Armbruster
2021-03-29  9:49     ` Stefan Reiter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b5c777bf-d8c7-00df-4373-58ab4f06ae2a@proxmox.com \
    --to=s.reiter@proxmox.com \
    --cc=armbru@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=t.lamprecht@proxmox.com \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).