qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>,
	qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
	Hanna Reitz <hreitz@redhat.com>,
	qemu-block@nongnu.org, Eric Blake <eblake@redhat.com>,
	Mads Ynddal <mads@ynddal.dk>,
	danielhb413@gmail.com
Subject: Re: [PATCH] trace/simple: Fix hang when using simpletrace with fork()
Date: Thu, 27 Feb 2025 15:05:23 +0800	[thread overview]
Message-ID: <20250227070523.GC85709@fedora> (raw)
In-Reply-To: <Z77ksR0vcySWC0CS@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3584 bytes --]

On Wed, Feb 26, 2025 at 09:53:53AM +0000, Daniel P. Berrangé wrote:
> On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote:
> > On 26/02/2025 10.15, Daniel P. Berrangé wrote:
> > > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote:
> > > > When compiling QEMU with --enable-trace-backends=simple , the
> > > > iotest 233 is currently hanging. This happens because qemu-nbd
> > > > calls trace_init_backends() first - which causes simpletrace to
> > > > install its writer thread and the atexit() handler - before
> > > > calling fork(). But the simpletrace writer thread is then only
> > > > available in the parent process, not in the child process anymore.
> > > > Thus when the child process exits, its atexit handler waits forever
> > > > on the trace_empty_cond condition to be set by the non-existing
> > > > writer thread, so the process never finishes.
> > > > 
> > > > Fix it by installing a pthread_atfork() handler, too, which
> > > > makes sure that the trace_writeout_enabled variable gets set
> > > > to false again in the child process, so we can use it in the
> > > > atexit() handler to check whether we still need to wait on the
> > > > writer thread or not.
> > > > 
> > > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > > ---
> > > >   trace/simple.c | 17 ++++++++++++++++-
> > > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/trace/simple.c b/trace/simple.c
> > > > index c0aba00cb7f..269bbda69f1 100644
> > > > --- a/trace/simple.c
> > > > +++ b/trace/simple.c
> > > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
> > > >   void st_flush_trace_buffer(void)
> > > >   {
> > > > -    flush_trace_file(true);
> > > > +    flush_trace_file(trace_writeout_enabled);
> > > > +}
> > > > +
> > > > +#ifndef _WIN32
> > > > +static void trace_thread_atfork(void)
> > > > +{
> > > > +    /*
> > > > +     * If we fork, the writer thread does not exist in the child, so
> > > > +     * make sure to allow st_flush_trace_buffer() to clean up correctly.
> > > > +     */
> > > > +    g_mutex_lock(&trace_lock);
> > > > +    trace_writeout_enabled = false;
> > > > +    g_cond_signal(&trace_empty_cond);
> > > > +    g_mutex_unlock(&trace_lock);
> > > >   }
> > > > +#endif
> > > 
> > > This doesn't seem right to me. This is being run in the child and while
> > > it may avoid the hang when the child exits, surely it still leaves tracing
> > > non-functional in the child as we're lacking the thread to write out the
> > > trace data.
> > 
> > Well, you cannot write to the same file from the parent and child at the
> > same time, so one of both needs to be shut up AFAIU. And the simpletrace
> > code cannot now which one of the two processes should be allowed to continue
> > with the logging, so we either have to disable tracing in one of the two
> > processes, or think of something completely different, e.g. using
> > pthread_atfork(abort, NULL, NULL) to make people aware that they are not
> > allowed to start tracing before calling fork()...? But in that case we still
> > need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the
> > trace backend before calling fork().
> 
> As precedent, in system/vl.c we delay trace_init() until after daemonizing
> which is the simple way to avoid the worst of the danger.

That sounds good to me.

Adding Daniel Henrique Barboza because he fixed a similar issue in
commit 10b6ee1616f9 ("vl.c: do not execute trace_init_backends() before
daemonizing").

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2025-02-27  7:06 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-26  8:50 [PATCH] trace/simple: Fix hang when using simpletrace with fork() Thomas Huth
2025-02-26  9:15 ` Daniel P. Berrangé
2025-02-26  9:38   ` Thomas Huth
2025-02-26  9:53     ` Daniel P. Berrangé
2025-02-27  7:05       ` Stefan Hajnoczi [this message]
2025-02-26  9:29 ` Kevin Wolf
2025-02-26  9:51   ` Daniel P. Berrangé
2025-02-27 19:30 ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250227070523.GC85709@fedora \
    --to=stefanha@redhat.com \
    --cc=berrange@redhat.com \
    --cc=danielhb413@gmail.com \
    --cc=eblake@redhat.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mads@ynddal.dk \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).