From: Swapnil Sapkal <swapnil.sapkal@amd.com>
To: Ian Rogers <irogers@google.com>
Cc: <peterz@infradead.org>, <mingo@redhat.com>, <acme@kernel.org>,
<namhyung@kernel.org>, <james.clark@linaro.org>,
<mark.rutland@arm.com>, <alexander.shishkin@linux.intel.com>,
<jolsa@kernel.org>, <adrian.hunter@intel.com>,
<ravi.bangoria@amd.com>, <KPrateek.Nayak@amd.com>,
<linux-perf-users@vger.kernel.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] perf sched stats: Fix SIGCHLD race in schedstat_record()
Date: Thu, 9 Apr 2026 21:59:44 +0530
Message-ID: <f034e394-6597-4fea-be07-69fef438ed29@amd.com>
In-Reply-To: <CAP-5=fWY7X6Zo2xFXHbPtU2Dob=eHguzbkzowHjqyBzsse2+-Q@mail.gmail.com>
Hi Ian,
On 01-04-2026 21:56, Ian Rogers wrote:
> On Tue, Mar 31, 2026 at 11:42 PM Swapnil Sapkal <swapnil.sapkal@amd.com> wrote:
>>
>> When a very short-lived workload is used with 'perf sched stats record',
>> the child process can exit and deliver SIGCHLD between
>> evlist__start_workload() and pause(). Since pause() only returns when a
>> signal is received while suspended, and the SIGCHLD has already been
>> delivered and handled by then, pause() blocks indefinitely.
>>
>> Fix this by blocking SIGCHLD before starting the workload and replacing
>> pause() with sigsuspend(). sigsuspend() atomically unblocks SIGCHLD and
>> suspends the process, ensuring no signal is lost regardless of how
>> quickly the child exits.
>>
>> Assisted-by: Claude:claude-opus-4.6
>> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
>
> Thanks Swapnil! In the Sashiko reviews there were some nits about
> cleanup on error paths, but also one about signals potentially being
> masked in the perf workload:
> https://sashiko.dev/#/patchset/20260401064114.141066-1-swapnil.sapkal%40amd.com
> Could you take a look?
>
Thank you for directing me to the Sashiko reviews. I have addressed the
review comments in v2:
https://lore.kernel.org/all/20260409162249.25581-1-swapnil.sapkal@amd.com/
--
Thanks and Regards,
Swapnil
> Ian
>
>> ---
>> tools/perf/builtin-sched.c | 21 +++++++++++++++++++--
>> 1 file changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
>> index 3f509cfdd58c..eb3702d98fd1 100644
>> --- a/tools/perf/builtin-sched.c
>> +++ b/tools/perf/builtin-sched.c
>> @@ -3807,6 +3807,7 @@ const char *output_name;
>>  static int perf_sched__schedstat_record(struct perf_sched *sched,
>>  					int argc, const char **argv)
>>  {
>> +	sigset_t sigchld_mask, oldmask;
>>  	struct perf_session *session;
>>  	struct target target = {};
>>  	struct evlist *evlist;
>> @@ -3822,6 +3823,15 @@ static int perf_sched__schedstat_record(struct perf_sched *sched,
>>  	signal(SIGCHLD, sighandler);
>>  	signal(SIGTERM, sighandler);
>>
>> +	/*
>> +	 * Block SIGCHLD early so that a short-lived workload cannot deliver
>> +	 * the signal before we are ready to wait for it. sigsuspend() below
>> +	 * will atomically unblock it.
>> +	 */
>> +	sigemptyset(&sigchld_mask);
>> +	sigaddset(&sigchld_mask, SIGCHLD);
>> +	sigprocmask(SIG_BLOCK, &sigchld_mask, &oldmask);
>> +
>>  	evlist = evlist__new();
>>  	if (!evlist)
>>  		return -ENOMEM;
>> @@ -3902,8 +3912,15 @@ static int perf_sched__schedstat_record(struct perf_sched *sched,
>>  	if (argc)
>>  		evlist__start_workload(evlist);
>>
>> -	/* wait for signal */
>> -	pause();
>> +	/*
>> +	 * Use sigsuspend() instead of pause() to avoid a race where a
>> +	 * short-lived workload exits and delivers SIGCHLD before pause()
>> +	 * is entered, causing it to block indefinitely. sigsuspend()
>> +	 * atomically unblocks SIGCHLD (blocked above) and suspends,
>> +	 * ensuring no signal is lost.
>> +	 */
>> +	sigsuspend(&oldmask);
>> +	sigprocmask(SIG_SETMASK, &oldmask, NULL);
>>
>>  	if (reset) {
>>  		err = disable_sched_schedstat();
>> --
>> 2.43.0
>>
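
For readers following the thread, the block-then-sigsuspend() pattern
described in the commit message can be reduced to a small standalone
fragment. This is a minimal sketch and not the code from v2:
run_short_lived_child(), on_sigchld() and got_sigchld are hypothetical
names made up for illustration. It assumes the two review points (mask
left blocked on error paths, mask inherited by the workload) can be
handled by restoring the saved mask in both places.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to store from a handler. */
static volatile sig_atomic_t got_sigchld;

static void on_sigchld(int sig)
{
	(void)sig;
	got_sigchld = 1;
}

/*
 * Hypothetical helper (not from the patch): fork a child that exits
 * immediately, then wait for SIGCHLD without the pause() race.
 * Returns the child's exit status, or -1 on error.
 */
int run_short_lived_child(void)
{
	sigset_t chld_mask, old_mask, wait_mask;
	struct sigaction sa = { .sa_handler = on_sigchld };
	pid_t pid;
	int status;

	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Block SIGCHLD so it stays pending until sigsuspend() runs. */
	sigemptyset(&chld_mask);
	sigaddset(&chld_mask, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &chld_mask, &old_mask) < 0)
		return -1;

	/* Mask used while waiting: the old mask with SIGCHLD unblocked. */
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGCHLD);

	got_sigchld = 0;
	pid = fork();
	if (pid < 0) {
		/* Error path: do not leave SIGCHLD blocked for the caller. */
		sigprocmask(SIG_SETMASK, &old_mask, NULL);
		return -1;
	}
	if (pid == 0) {
		/* Child: the mask is inherited across fork(), so restore it. */
		sigprocmask(SIG_SETMASK, &old_mask, NULL);
		_exit(0);	/* exits almost at once, like a short workload */
	}

	/*
	 * sigsuspend() installs wait_mask and sleeps in one atomic step,
	 * so a SIGCHLD that is already pending is delivered immediately.
	 * Loop on the flag in case another signal wakes us first.
	 */
	while (!got_sigchld)
		sigsuspend(&wait_mask);

	sigprocmask(SIG_SETMASK, &old_mask, NULL);

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_short_lived_child() from main() should return the child's
exit status even though the child exits before the parent starts
waiting. Two choices differ from the patch as posted: the wait loop
checks a flag rather than relying on a single sigsuspend() call, and the
wait mask explicitly removes SIGCHLD in case the caller already had it
blocked. The sketch leaves its SIGCHLD handler installed on return.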
Thread overview: 7+ messages
2026-04-01 6:41 [PATCH 0/3] perf: Fix SIGCHLD vs pause() race with short-lived workloads Swapnil Sapkal
2026-04-01 6:41 ` [PATCH 1/3] perf sched stats: Fix SIGCHLD race in schedstat_record() Swapnil Sapkal
2026-04-01 16:26 ` Ian Rogers
2026-04-09 16:29 ` Swapnil Sapkal [this message]
2026-04-01 6:41 ` [PATCH 2/3] perf sched stats: Fix SIGCHLD race in schedstat_live() Swapnil Sapkal
2026-04-01 6:41 ` [PATCH 3/3] perf lock contention: Fix SIGCHLD race in __cmd_contention() Swapnil Sapkal
2026-04-01 10:55 ` [PATCH 0/3] perf: Fix SIGCHLD vs pause() race with short-lived workloads James Clark