From: Peter Xu <peterx@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
qemu-devel@nongnu.org, "Stefan Hajnoczi" <stefanha@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Date: Thu, 19 Jul 2018 17:46:01 +0800
Message-ID: <20180719094601.GJ4071@xz-mi>
In-Reply-To: <87fu0fh9y9.fsf@dusky.pond.sub.org>

On Thu, Jul 19, 2018 at 11:05:34AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >>
> >> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >>
> >> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
> >> >> > cur_mon as well (via monitor_qmp_dispatch_one()).
>
> Since renamed to monitor_qmp_dispatch().
True.
>
> Further down, we concluded that cur_mon isn't actually used from the I/O
> thread, didn't we?
I think so; if not we should either fix it or apply this patch. :)
>
> >> >> > Let's convert the
> >> >> > cur_mon variable to be a per-thread variable to make sure there won't be
> >> >> > a race between threads when accessing the variable.
> >> >>
> >> >> Hmm... why hasn't the OOB work created such a race already?
> >> >>
> >> >> A monitor reads, parses, dispatches and executes commands, formats and
> >> >> sends replies.
> >> >>
> >> >> Before OOB, all of that ran in the main thread. Any access of cur_mon
> >> >> should therefore be from the main thread. No races.
> >> >>
> >> >> OOB moves read, parse, format and send to an I/O thread. Dispatch and
> >> >> execute remain in the main thread. *Except* for commands executed OOB,
> >> >> dispatch and execute move to the I/O thread, too.
> >> >>
> >> >> Why is this not racy? I guess it relies on careful non-use of cur_mon
> >> >> in any part that may now execute in the I/O thread. Scary...
> >> >
> >> > I think it's because cur_mon is not really used in out-of-band command
> >> > executions - now we only have a few out-of-band enabled commands, and
> >> > IIUC none of them is using cur_mon (for example, in
> >> > qmp_migrate_recover() we don't even call error_report, and the code
> >> > path is quite straightforward to make sure of that). So IIUC cur_mon
> >> > variable is still only touched by main thread for now hence we should
> >> > be safe. However that condition might change in the future when we
> >> > add more out-of-band capable commands.
> >> >
> >> > (not to mention that I don't even know whether there are real users of
> >> > out-of-band if we haven't yet started to support that for libvirt...)
> >>
> >> It's not just the actual OOB commands (there are just two), it's also
> >> the monitor code to read, parse, format and send.
> >
> > My understanding is that read, parse, format, send will not touch
> > cur_mon (it was touched before but some patches in the out-of-band
> > series should have removed the last users when parsing). So IIUC only
> > the dispatcher would touch that now. I didn't consider the callers
> > like net_init_socket() and I'm only considering the monitor code (and
> > those callers should be only in the main thread too after all).
>
> There *is* cur_mon use outside dispatch & execute, e.g.
>
>     void error_vprintf(const char *fmt, va_list ap)
>     {
>         if (cur_mon && !monitor_cur_is_qmp()) {
>             monitor_vprintf(cur_mon, fmt, ap);
>         } else {
>             vfprintf(stderr, fmt, ap);
>         }
>     }
>
> Obviously unsafe to use outside the main thread. Consider:
>
>     bool monitor_cur_is_qmp(void)
>     {
>         return cur_mon && monitor_is_qmp(cur_mon);
>     }
>
>     static inline bool monitor_is_qmp(const Monitor *mon)
>     {
>         return (mon->flags & MONITOR_USE_CONTROL);
>     }
>
> If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
> do), this crashes when the main thread sets cur_mon back to null in
> between.
Yes, but I thought we should not even use error_vprintf() or its
sister functions outside the QMP handlers. For example, parsers
should always use error_setg() or something similar, and never
error_report().
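To make the distinction concrete, here is a minimal sketch of the
errp-style reporting I have in mind for parsers. The names (DemoError,
demo_error_setg(), demo_parse_uint()) are made up for illustration and
are not QEMU's real Error API. The point is only that the failure
travels back to the caller through an out-parameter, so nothing in this
path needs to look at cur_mon:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Store a formatted message in *errp instead of printing it anywhere. */
void demo_error_setg(DemoError **errp, const char *fmt, ...)
{
    va_list ap;
    DemoError *err;

    if (!errp) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* do not overwrite an earlier error */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/*
 * Parse an unsigned decimal number.  Returns 0 on success, -1 on
 * failure.  Failures are reported through errp only; no global
 * "current monitor" state is consulted.
 */
int demo_parse_uint(const char *s, unsigned long *out, DemoError **errp)
{
    char *end;
    unsigned long val;

    if (*s < '0' || *s > '9') {
        demo_error_setg(errp, "expected a digit, got '%s'", s);
        return -1;
    }
    val = strtoul(s, &end, 10);
    if (*end != '\0') {
        demo_error_setg(errp, "trailing garbage '%s'", end);
        return -1;
    }
    *out = val;
    return 0;
}
```

The caller then decides how to present err->msg (and frees it), so
code like this can run in any thread without caring what the main
thread has stored in cur_mon.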
>
> Did the OOB work make things any worse? Let's see.
>
> @cur_mon is null unless the main thread is running monitor code, either
> HMP within monitor_read():
>
>     cur_mon = opaque;
>
>     if (cur_mon->rs) {
>         for (i = 0; i < size; i++)
>             readline_handle_byte(cur_mon->rs, buf[i]);
>     } else {
>         if (size == 0 || buf[size - 1] != 0)
>             monitor_printf(cur_mon, "corrupted command\n");
>         else
>             handle_hmp_command(cur_mon, (char *)buf);
>     }
>
>     cur_mon = old_mon;
>
> or QMP within monitor_qmp_dispatch():
>
>     old_mon = cur_mon;
>     cur_mon = mon;
>
>     rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));
>
>     cur_mon = old_mon;
>
> In both cases, old_mon is always null.
>
> Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
> deeper for QMP", we ran more code for QMP with cur_mon set, namely the
> JSON parser, but that doesn't matter here.
>
> More fine print: there's also qmp_human_monitor_command(), which stacks
> an HMP monitor on top of the QMP monitor. Also doesn't matter here.
>
> The OOB work doesn't add any new races as long as
>
> * it doesn't add assignments to @cur_mon, and
>
> * none of the code it moves out of the main thread accesses @cur_mon.
>
> The first condition obviously holds. The second one isn't obvious, but
> I figure it holds, too.
>
> Okay, I think I've convinced myself the OOB work didn't add
> cur_mon-related races.
Hopefully, yes. Thanks for the double check.
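For what it's worth, here is a small standalone sketch of the behaviour
the patch is after. It is not QEMU code: DemoMonitor, demo_cur_mon,
demo_io_thread() and demo_dispatch() are hypothetical names, and it
assumes a compiler that supports __thread plus POSIX threads. It
repeats the set/restore pattern quoted above and shows that a second
thread keeps seeing NULL in its own copy of the variable, whatever the
dispatching thread has stored in its copy:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the Monitor struct. */
typedef struct DemoMonitor {
    const char *name;
} DemoMonitor;

/* One copy per thread; without an initializer each copy starts as NULL. */
static __thread DemoMonitor *demo_cur_mon;

/* Runs in a second thread and records the value that thread observes. */
void *demo_io_thread(void *opaque)
{
    DemoMonitor **seen = opaque;

    *seen = demo_cur_mon;
    return NULL;
}

/*
 * Same shape as the dispatch code: save, set, do the work, restore.
 * The "work" is starting another thread and waiting for it, purely so
 * we can observe what that thread sees while demo_cur_mon is set here.
 * Returns 0 on success, or the pthread error code.
 */
int demo_dispatch(DemoMonitor *mon, DemoMonitor **seen_by_other)
{
    DemoMonitor *old_mon = demo_cur_mon;
    pthread_t tid;
    int ret;

    demo_cur_mon = mon;

    ret = pthread_create(&tid, NULL, demo_io_thread, seen_by_other);
    if (ret == 0) {
        ret = pthread_join(tid, NULL);
    }

    /* The other thread could not have changed our copy. */
    assert(demo_cur_mon == mon);

    demo_cur_mon = old_mon;
    return ret;
}
```

Calling demo_dispatch(&mon, &seen) from the main thread leaves seen set
to NULL, and the caller's own demo_cur_mon is back to its old value
afterwards. With a plain global instead of __thread, the second thread
would see whatever the dispatching thread had stored at that moment.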
>
> >> >> Should this go into 3.0 to reduce the risk of bugs?
> >> >
> >> > Yes I think it would be good to have that even for 3.0, since it still
> >> > can be seen as a bug fix of existing code.
> >>
> >> Agreed.
> >>
> >> > Regards,
> >> >
> >> >> > Note that thread variables are not initialized to a valid value when new
> >> >> > thread is created.
> >>
> >> Confusing. It sounds like @cur_mon's initial value would be
> >> indeterminate, like an automatic variable's. Not true. Variables with
> >> thread storage duration are initialized when the thread is created.
> >> Since @cur_mon's declaration lacks an initializer, it'll be initialized
> >> to a null pointer. Your sentence is correct when you consider that null
> >> pointer not a valid value.
> >
> > Yes that's what I meant. So how about this?
> >
> > Note that the per-thread @cur_mon variable is not initialized to
> > point to a valid Monitor struct when a new thread is created (the
> > default value will be NULL).
> >
> > Please feel free to tune it up.
>
> I think what the patch really changes is the value of @cur_mon outside
> the main thread: it remains null there now. Before, it depended on what
> the main thread was doing, and therefore could not be used safely.
>
> In other words, the patch makes uses of @cur_mon like the one in
> error_vprintf() shown above safe.
>
> I think that's what we should explain in the commit message. I can try
> rewriting it,
I'd appreciate that.
> but right now I got to run.
Must be lunch time! :)
Regards,
>
> >>
> >> >> > However for our case we don't need to set it up,
> >> >> > since the cur_mon variable is only used in such a pattern:
> >> >> >
> >> >> > old_mon = cur_mon;
> >> >> > cur_mon = xxx;
> >> >> > (do something, read cur_mon if necessary in the stack)
> >
> > [1]
> >
> >> >> > cur_mon = old_mon;
> >> >> >
> >> >> > It plays a role as stack variable, so no need to be initialized at all.
> >> >> > We only need to make sure the variable won't be changed unexpectedly by
> >> >> > other threads.
> >>
> >> Do we need this paragraph? The commit doesn't mess with @cur_mon's
> >> initial value at all...
> >
> > I was trying to explain why we don't need to initialize that variable
> > for each thread. A common idea (at least that's what I have had in
> > mind) is that when we create a new thread we should possibly inherit
> > that @cur_mon variable in a copy-on-write fashion for that new thread.
> > But that's not really necessary for the use case like above (as long
> > as we don't create thread during [1], and that's what we do).
> >
> > If you think the patch explains itself better without these lines,
> > please feel free to drop it.
> >
> >>
> >> >> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >> >> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> >> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> >> > [peterx: touch up commit message a bit]
> >> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >
> > Thanks,
--
Peter Xu