From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Thomas Huth <thuth@redhat.com>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>,
qemu-block@nongnu.org, Eric Blake <eblake@redhat.com>,
Mads Ynddal <mads@ynddal.dk>
Subject: Re: [PATCH] trace/simple: Fix hang when using simpletrace with fork()
Date: Wed, 26 Feb 2025 09:53:53 +0000
Message-ID: <Z77ksR0vcySWC0CS@redhat.com>
In-Reply-To: <2a7c4f21-ee27-4407-8191-dd1f0547990c@redhat.com>
On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote:
> On 26/02/2025 10.15, Daniel P. Berrangé wrote:
> > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote:
> > > When compiling QEMU with --enable-trace-backends=simple, the
> > > iotest 233 is currently hanging. This happens because qemu-nbd
> > > calls trace_init_backends() first - which causes simpletrace to
> > > install its writer thread and the atexit() handler - before
> > > calling fork(). But the simpletrace writer thread is then only
> > > available in the parent process, not in the child process anymore.
> > > Thus when the child process exits, its atexit handler waits forever
> > > on the trace_empty_cond condition to be set by the non-existing
> > > writer thread, so the process never finishes.
> > >
> > > Fix it by installing a pthread_atfork() handler, too, which
> > > makes sure that the trace_writeout_enabled variable gets set
> > > to false again in the child process, so we can use it in the
> > > atexit() handler to check whether we still need to wait on the
> > > writer thread or not.
> > >
> > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > ---
> > > trace/simple.c | 17 ++++++++++++++++-
> > > 1 file changed, 16 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/trace/simple.c b/trace/simple.c
> > > index c0aba00cb7f..269bbda69f1 100644
> > > --- a/trace/simple.c
> > > +++ b/trace/simple.c
> > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
> > >  void st_flush_trace_buffer(void)
> > >  {
> > > -    flush_trace_file(true);
> > > +    flush_trace_file(trace_writeout_enabled);
> > > +}
> > > +
> > > +#ifndef _WIN32
> > > +static void trace_thread_atfork(void)
> > > +{
> > > +    /*
> > > +     * If we fork, the writer thread does not exist in the child, so
> > > +     * make sure to allow st_flush_trace_buffer() to clean up correctly.
> > > +     */
> > > +    g_mutex_lock(&trace_lock);
> > > +    trace_writeout_enabled = false;
> > > +    g_cond_signal(&trace_empty_cond);
> > > +    g_mutex_unlock(&trace_lock);
> > >  }
> > > +#endif
> >
> > This doesn't seem right to me. This is being run in the child and while
> > it may avoid the hang when the child exits, surely it still leaves tracing
> > non-functional in the child as we're lacking the thread to write out the
> > trace data.
>
> Well, you cannot write to the same file from the parent and child at the
> same time, so one of the two needs to be shut up AFAIU. And the simpletrace
> code cannot know which one of the two processes should be allowed to continue
> with the logging, so we either have to disable tracing in one of the two
> processes, or think of something completely different, e.g. using
> pthread_atfork(abort, NULL, NULL) to make people aware that they are not
> allowed to start tracing before calling fork()...? But in that case we still
> need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the
> trace backend before calling fork().
As precedent, in system/vl.c we delay trace_init() until after daemonizing,
which is the simple way to avoid the worst of the danger.
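To illustrate that ordering, here is a minimal standalone sketch, not
QEMU code. All of the late_* names are hypothetical stand-ins, and plain
pthreads is used instead of GLib. The point is only that the writer
thread gets created after fork(), in the process that will actually
flush, so the flush has somebody to wait for:

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the tracing state; not QEMU's symbols. */
static pthread_mutex_t late_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t late_data_cond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t late_empty_cond = PTHREAD_COND_INITIALIZER;
static bool late_pending;

/* Writer thread: waits for data, "writes" it, reports the buffer empty. */
static void *late_writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&late_lock);
    for (;;) {
        while (!late_pending) {
            pthread_cond_wait(&late_data_cond, &late_lock);
        }
        late_pending = false;   /* pretend the records were written out */
        pthread_cond_signal(&late_empty_cond);
    }
    return NULL;
}

/* Stand-in for the backend init: this is what starts the thread. */
static int late_trace_init(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, late_writer, NULL) != 0) {
        return -1;
    }
    return pthread_detach(tid);
}

/* Stand-in for the exit-time flush: blocks until the writer is done. */
static void late_flush(void)
{
    pthread_mutex_lock(&late_lock);
    late_pending = true;
    pthread_cond_signal(&late_data_cond);
    while (late_pending) {
        pthread_cond_wait(&late_empty_cond, &late_lock);
    }
    pthread_mutex_unlock(&late_lock);
}

/* Returns 0 if the child initialized, flushed and exited cleanly. */
int late_init_demo(void)
{
    int status;
    pid_t pid = fork();          /* "daemonize" first ... */

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        alarm(5);                /* turn an unexpected hang into a failure */
        if (late_trace_init() != 0) {   /* ... then start the writer */
            _exit(1);
        }
        late_flush();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Built with -pthread, late_init_demo() should return 0. Moving the
late_trace_init() call above the fork() recreates the kind of hang
reported for iotest 233, since the child would then wait on a thread
that only exists in the parent.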
It would still be nice to have an atfork() handler to fully eliminate the
danger, though.
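For the sake of discussion, a rough sketch of what such a handler could
look like, again standalone and with hypothetical demo_* names rather
than the real trace/simple.c symbols. Note that this uses the full
prepare/parent/child triple so the lock is in a known state across the
fork, which is an assumption on my side and goes beyond the child-only
handler in the patch. The writer thread polls purely to keep the example
short:

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the tracing state; not QEMU's symbols. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_empty_cond = PTHREAD_COND_INITIALIZER;
static bool demo_writer_alive;  /* true only where the writer thread runs */
static bool demo_pending;       /* true while data awaits the writer */

/* Writer thread: drains pending data and reports the buffer empty. */
static void *demo_writer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&demo_lock);
        if (demo_pending) {
            demo_pending = false;
            pthread_cond_signal(&demo_empty_cond);
        }
        pthread_mutex_unlock(&demo_lock);
        usleep(1000);
    }
    return NULL;
}

/* Hold the lock across fork() so the child never inherits it mid-update. */
static void demo_atfork_prepare(void)
{
    pthread_mutex_lock(&demo_lock);
}

static void demo_atfork_parent(void)
{
    pthread_mutex_unlock(&demo_lock);
}

static void demo_atfork_child(void)
{
    demo_writer_alive = false;  /* fork() did not duplicate the writer */
    pthread_mutex_unlock(&demo_lock);
}

/* Stand-in for the atexit() flush: only waits if a writer can answer. */
static void demo_flush(void)
{
    pthread_mutex_lock(&demo_lock);
    while (demo_writer_alive && demo_pending) {
        pthread_cond_wait(&demo_empty_cond, &demo_lock);
    }
    pthread_mutex_unlock(&demo_lock);
}

/* Returns 0 if the forked child flushes and exits without hanging. */
int demo_fork_with_atfork(void)
{
    static bool registered;
    pthread_t tid;
    pid_t pid;
    int status;

    if (!registered) {
        if (pthread_atfork(demo_atfork_prepare, demo_atfork_parent,
                           demo_atfork_child) != 0) {
            return -1;
        }
        registered = true;
    }
    if (pthread_create(&tid, NULL, demo_writer, NULL) != 0) {
        return -1;
    }
    pthread_detach(tid);
    pthread_mutex_lock(&demo_lock);
    demo_writer_alive = true;
    pthread_mutex_unlock(&demo_lock);

    pid = fork();               /* init happened before fork, as in qemu-nbd */
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        alarm(5);               /* turn an unexpected hang into a failure */
        pthread_mutex_lock(&demo_lock);
        demo_pending = true;    /* data that no thread will ever write out */
        pthread_mutex_unlock(&demo_lock);
        demo_flush();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This only avoids the hang. The child still has no writer thread, so
anything it traces is dropped, which is the concern raised earlier in
the thread. Restarting a writer from the child handler would be the next
step, but then both processes write to the same file, so that part is
left open here.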
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Thread overview: 8+ messages
2025-02-26 8:50 [PATCH] trace/simple: Fix hang when using simpletrace with fork() Thomas Huth
2025-02-26 9:15 ` Daniel P. Berrangé
2025-02-26 9:38 ` Thomas Huth
2025-02-26 9:53 ` Daniel P. Berrangé [this message]
2025-02-27 7:05 ` Stefan Hajnoczi
2025-02-26 9:29 ` Kevin Wolf
2025-02-26 9:51 ` Daniel P. Berrangé
2025-02-27 19:30 ` Eric Blake