public inbox for linux-perf-users@vger.kernel.org
From: Swapnil Sapkal <swapnil.sapkal@amd.com>
To: <peterz@infradead.org>, <mingo@redhat.com>, <acme@kernel.org>,
	<namhyung@kernel.org>, <irogers@google.com>,
	<james.clark@linaro.org>
Cc: <mark.rutland@arm.com>, <alexander.shishkin@linux.intel.com>,
	<jolsa@kernel.org>, <adrian.hunter@intel.com>,
	<ravi.bangoria@amd.com>, <linux-perf-users@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	"Swapnil Sapkal" <swapnil.sapkal@amd.com>
Subject: [PATCH v2 0/3] perf: Fix SIGCHLD vs pause() race with short-lived workloads
Date: Thu, 9 Apr 2026 16:22:46 +0000
Message-ID: <20260409162249.25581-1-swapnil.sapkal@amd.com>

Several perf subcommands (sched stats, lock contention) use the pattern
of forking a workload child, calling evlist__start_workload() to uncork
it, and then calling pause() to wait for a signal (typically SIGCHLD
when the child exits, or SIGINT/SIGTERM from the user).

This pattern has a race condition: if the workload is very short-lived,
the child can exit and deliver SIGCHLD in the window between
evlist__start_workload() and pause(). Since pause() only returns when a
signal is received *while the process is suspended*, and SIGCHLD has
already been delivered and handled by the empty sighandler(), pause()
blocks indefinitely.
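
The lost wakeup can be reproduced outside perf. The following is a
minimal sketch, not code from this series: demo_lost_wakeup(),
on_signal() and the 200 ms watchdog timer are hypothetical names and
values chosen for illustration. It forces the unlucky ordering by
waiting until the SIGCHLD handler has already run, then calls pause()
and reports whether only the watchdog SIGALRM was able to wake it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t sigchld_seen;
static volatile sig_atomic_t sigalrm_seen;

/* Handler that only records which signal arrived. */
static void on_signal(int sig)
{
	if (sig == SIGCHLD)
		sigchld_seen = 1;
	else if (sig == SIGALRM)
		sigalrm_seen = 1;
}

/*
 * Returns 1 if pause() was woken only by the watchdog timer (the SIGCHLD
 * wakeup was lost), 0 otherwise, and -1 on setup errors.
 */
int demo_lost_wakeup(void)
{
	/* Illustrative watchdog so the demo cannot hang forever. */
	struct itimerval watchdog = { .it_value = { .tv_sec = 0, .tv_usec = 200000 } };
	struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000000 };
	struct sigaction sa;
	sigset_t unblock, saved;
	pid_t pid, reaped;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) || sigaction(SIGALRM, &sa, NULL))
		return -1;

	/* Make sure both signals are deliverable, whatever mask we inherited. */
	sigemptyset(&unblock);
	sigaddset(&unblock, SIGCHLD);
	sigaddset(&unblock, SIGALRM);
	if (sigprocmask(SIG_UNBLOCK, &unblock, &saved))
		return -1;

	sigchld_seen = 0;
	sigalrm_seen = 0;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);		/* very short-lived workload */

	/* Stand-in for the race window: SIGCHLD is handled before pause(). */
	while (!sigchld_seen)
		nanosleep(&tick, NULL);

	if (setitimer(ITIMER_REAL, &watchdog, NULL))
		return -1;
	pause();			/* SIGCHLD is gone; only SIGALRM can wake us */

	reaped = waitpid(pid, NULL, 0);
	assert(reaped == pid);
	sigprocmask(SIG_SETMASK, &saved, NULL);

	return sigalrm_seen ? 1 : 0;
}
```

Without the watchdog timer, the pause() call in this sketch would
block until some unrelated signal arrived, which is the hang described
above.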

The fix uses the standard POSIX pattern for this class of bug:

1. Block SIGCHLD, SIGINT, and SIGTERM (via sigprocmask) after
   evlist__prepare_workload() returns but before
   evlist__start_workload(). Blocking after the fork ensures the
   child workload does not inherit a modified signal mask.

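2. Replace pause() with sigsuspend(&oldmask), which atomically
   restores the original mask and suspends the process. A signal
   that arrives while the mask is still in place stays pending and
   is delivered inside sigsuspend(), so there is no window in which
   it can be handled before the process starts waiting.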

3. Restore the original signal mask after sigsuspend() returns.
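
The three steps above can be sketched as a standalone program
fragment. This is an illustration under stated assumptions, not the
patch itself: run_workload_and_wait() and the pipe used to "cork" the
child are hypothetical stand-ins for evlist__prepare_workload() and
evlist__start_workload(), and sighandler() here records the signal
number only so the result can be checked.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;

static void sighandler(int sig)
{
	got_signal = sig;
}

/* Returns the signal number that ended the wait, or -1 on setup errors. */
int run_workload_and_wait(void)
{
	const int sigs[] = { SIGCHLD, SIGINT, SIGTERM };
	struct sigaction sa;
	sigset_t block, oldmask;
	pid_t pid, reaped;
	int cork[2];
	char go = 'x';
	size_t i;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sighandler;
	sigemptyset(&sa.sa_mask);
	sigemptyset(&block);
	for (i = 0; i < sizeof(sigs) / sizeof(sigs[0]); i++) {
		if (sigaction(sigs[i], &sa, NULL))
			return -1;
		sigaddset(&block, sigs[i]);
	}
	got_signal = 0;

	/* Hypothetical stand-in for evlist__prepare_workload(): fork a corked child. */
	if (pipe(cork))
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		close(cork[1]);
		(void)read(cork[0], &go, 1);	/* wait to be uncorked */
		_exit(0);			/* very short-lived workload */
	}
	close(cork[0]);

	/* Step 1: block after the fork, so the child keeps an unmodified mask. */
	if (sigprocmask(SIG_BLOCK, &block, &oldmask))
		return -1;

	/* Hypothetical stand-in for evlist__start_workload(): uncork the child. */
	if (write(cork[1], &go, 1) != 1)
		return -1;
	close(cork[1]);

	/* Step 2: atomically restore the old mask and wait for a signal. */
	sigsuspend(&oldmask);

	/* Step 3: restore the original mask for the rest of the program. */
	sigprocmask(SIG_SETMASK, &oldmask, NULL);

	reaped = waitpid(pid, NULL, 0);
	assert(reaped == pid);
	return got_signal;
}
```

Even if the child exits before the parent reaches sigsuspend(), the
SIGCHLD stays pending because it is blocked, and sigsuspend() returns
as soon as the old mask is installed. In an otherwise idle process the
function is expected to return SIGCHLD.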

All three signals are blocked (not just SIGCHLD) so that an early
Ctrl+C during the remaining setup before sigsuspend() cannot be
consumed and lost, which would cause a hang in system-wide mode where
no SIGCHLD would follow.

Three call sites are affected across two files:
  - perf_sched__schedstat_record() in builtin-sched.c
  - perf_sched__schedstat_live()   in builtin-sched.c
  - __cmd_contention()             in builtin-lock.c

The two pause() sites in builtin-kwork.c are NOT affected because they
do not register SIGCHLD or fork workload children; they only wait for
user-initiated SIGINT/SIGTERM.

Changes since v1:
  - Moved sigprocmask() to after evlist__prepare_workload() so the
    forked child does not inherit a blocked SIGCHLD mask, which would
    break workloads relying on SIGCHLD (e.g., Node.js, Python asyncio).
    (Sashiko review)
  - Block SIGINT and SIGTERM in addition to SIGCHLD to prevent an
    early Ctrl+C during setup from being consumed before sigsuspend().
  - Error paths before sigprocmask no longer need mask restoration
    since the mask is not yet modified at that point.
    (Sashiko review)

Swapnil Sapkal (3):
  perf sched stats: Fix SIGCHLD race in schedstat_record()
  perf sched stats: Fix SIGCHLD race in schedstat_live()
  perf lock contention: Fix SIGCHLD race in __cmd_contention()

 tools/perf/builtin-lock.c  | 18 ++++++++++++++++--
 tools/perf/builtin-sched.c | 30 ++++++++++++++++++++++++++----
 2 files changed, 42 insertions(+), 6 deletions(-)

-- 
2.43.0


Thread overview: 8+ messages
2026-04-09 16:22 Swapnil Sapkal [this message]
2026-04-09 16:22 ` [PATCH v2 1/3] perf sched stats: Fix SIGCHLD race in schedstat_record() Swapnil Sapkal
2026-04-09 16:51   ` sashiko-bot
2026-04-10  4:17     ` Namhyung Kim
2026-04-09 16:22 ` [PATCH v2 2/3] perf sched stats: Fix SIGCHLD race in schedstat_live() Swapnil Sapkal
2026-04-09 17:18   ` sashiko-bot
2026-04-09 16:22 ` [PATCH v2 3/3] perf lock contention: Fix SIGCHLD race in __cmd_contention() Swapnil Sapkal
2026-04-09 17:37   ` sashiko-bot