linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Osipenko <digetx@gmail.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexey Gladkov <legion@kernel.org>, Kyle Huey <me@kylehuey.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case
Date: Thu, 6 Jan 2022 00:39:10 +0300	[thread overview]
Message-ID: <5bbb54c4-7504-cd28-5dde-4e5965496625@gmail.com> (raw)
In-Reply-To: <87pmp67y4r.fsf@email.froward.int.ebiederm.org>

05.01.2022 22:58, Eric W. Biederman пишет:
> Dmitry Osipenko <digetx@gmail.com> writes:
> 
>> 14.12.2021 01:53, Eric W. Biederman пишет:
>>> Simplify the code that allows SIGKILL during coredumps to terminate
>>> the coredump.  As far as I can tell I have avoided breaking it
>>> by dumb luck.
>>>
>>> Historically with all of the other threads stopping in exit_mm the
>>> wants_signal loop in complete_signal would find the dumper task and
>>> then complete_signal would wake the dumper task with signal_wake_up.
>>>
>>> After moving the coredump_task_exit above the setting of PF_EXITING in
>>> commit 92307383082d ("coredump: Don't perform any cleanups before
>>> dumping core") wants_signal will consider all of the threads in a
>>> multi-threaded process for waking up, not just the core dumping task.
>>>
>>> Luckily complete_signal short circuits SIGKILL during a coredump marks
>>> every thread with SIGKILL and signal_wake_up.  This code is arguably
>>> buggy however as it tries to skip creating a group exit when is already
>>> present, and it fails that a coredump is in progress.
>>>
>>> Ever since commit 06af8679449d ("coredump: Limit what can interrupt
>>> coredumps") was added dump_interrupted needs not just TIF_SIGPENDING
>>> set on the dumper task but also SIGKILL set in it's pending bitmap.
>>> This means that if the code is ever fixed not to short-circuit and
>>> kill a process after it has already been killed the special case
>>> for SIGKILL during a coredump will be broken.
>>>
>>> Sort all of this out by making the coredump special case more special,
>>> and perform all of the work in prepare_signal and leave the rest of
>>> the signal delivery path out of it.
>>>
>>> In prepare_signal when the process coredumping is sent SIGKILL find
>>> the task performing the coredump and use sigaddset and signal_wake_up
>>> to ensure that task reports fatal_signal_pending.
>>>
>>> Return false from prepare_signal to tell the rest of the signal
>>> delivery path to ignore the signal.
>>>
>>> Update wait_for_dump_helpers to perform a wait_event_killable wait
>>> so that if signal_pending gets set spuriously the wait will not
>>> be interrupted unless fatal_signal_pending is true.
>>>
>>> I have tested this and verified I did not break SIGKILL during
>>> coredumps by accident (before or after this change).  I actually
>>> thought I had and I had to figure out what I had misread that kept
>>> SIGKILL during coredumps working.
>>>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>> ---
>>>  fs/coredump.c   |  4 ++--
>>>  kernel/signal.c | 11 +++++++++--
>>>  2 files changed, 11 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/coredump.c b/fs/coredump.c
>>> index a6b3c196cdef..7b91fb32dbb8 100644
>>> --- a/fs/coredump.c
>>> +++ b/fs/coredump.c
>>> @@ -448,7 +448,7 @@ static void coredump_finish(bool core_dumped)
>>>  static bool dump_interrupted(void)
>>>  {
>>>  	/*
>>> -	 * SIGKILL or freezing() interrupt the coredumping. Perhaps we
>>> +	 * SIGKILL or freezing() interrupted the coredumping. Perhaps we
>>>  	 * can do try_to_freeze() and check __fatal_signal_pending(),
>>>  	 * but then we need to teach dump_write() to restart and clear
>>>  	 * TIF_SIGPENDING.
>>> @@ -471,7 +471,7 @@ static void wait_for_dump_helpers(struct file *file)
>>>  	 * We actually want wait_event_freezable() but then we need
>>>  	 * to clear TIF_SIGPENDING and improve dump_interrupted().
>>>  	 */
>>> -	wait_event_interruptible(pipe->rd_wait, pipe->readers == 1);
>>> +	wait_event_killable(pipe->rd_wait, pipe->readers == 1);
>>>  
>>>  	pipe_lock(pipe);
>>>  	pipe->readers--;
>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>> index 8272cac5f429..7e305a8ec7c2 100644
>>> --- a/kernel/signal.c
>>> +++ b/kernel/signal.c
>>> @@ -907,8 +907,15 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
>>>  	sigset_t flush;
>>>  
>>>  	if (signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP)) {
>>> -		if (!(signal->flags & SIGNAL_GROUP_EXIT))
>>> -			return sig == SIGKILL;
>>> +		struct core_state *core_state = signal->core_state;
>>> +		if (core_state) {
>>> +			if (sig == SIGKILL) {
>>> +				struct task_struct *dumper = core_state->dumper.task;
>>> +				sigaddset(&dumper->pending.signal, SIGKILL);
>>> +				signal_wake_up(dumper, 1);
>>> +			}
>>> +			return false;
>>> +		}
>>>  		/*
>>>  		 * The process is in the middle of dying, nothing to do.
>>>  		 */
>>>
>>
>> Hi,
>>
>> This patch breaks userspace, in particular it breaks gst-plugin-scanner
>> of GStreamer which hangs now on next-20211224. IIUC, this tool builds a
>> registry of good/working GStreamer plugins by loading them and
>> blacklisting those that don't work (crash). Before the hang I see
>> systemd-coredump process running, taking snapshot of gst-plugin-scanner
>> and then gst-plugin-scanner gets stuck.
>>
>> Bisection points at this patch, reverting it restores
>> gst-plugin-scanner. Systemd-coredump still running, but there is no hang
>> anymore and everything works properly as before.
>>
>> I'm seeing this problem on ARM32 and haven't checked other arches.
>> Please fix, thanks in advance.
> 
> 
> I have not yet been able to figure out how to run gst-pluggin-scanner in
> a way that triggers this yet.  In truth I can't figure out how to
> run gst-pluggin-scanner in a useful way.
> 
> I am going to set up some unit tests and see if I can reproduce your
> hang another way, but if you could give me some more information on what
> you are doing to trigger this I would appreciate it.

Thanks, Eric. The distro is Arch Linux, but it's a development
environment where I'm running latest GStreamer from git master. I'll try
to figure out the reproduction steps and get back to you.

  reply	other threads:[~2022-01-05 21:39 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <87a6ha4zsd.fsf@email.froward.int.ebiederm.org>
2021-12-13 22:50 ` [PATCH 0/8] signal: Cleanup of the signal->flags Eric W. Biederman
2022-01-03 21:30   ` [PATCH 00/17] exit: Making task exiting a first class concept Eric W. Biederman
2022-01-03 21:32     ` [PATCH 01/17] exit: Remove profile_task_exit & profile_munmap Eric W. Biederman
2022-01-04  7:38       ` Christoph Hellwig
2022-01-07  3:48       ` Al Viro
2022-01-08 16:10         ` Eric W. Biederman
2022-01-03 21:32     ` [PATCH 02/17] exit: Coredumps reach do_group_exit Eric W. Biederman
2022-01-03 21:32     ` [PATCH 03/17] exit: Fix the exit_code for wait_task_zombie Eric W. Biederman
2022-01-03 21:32     ` [PATCH 04/17] exit: Use the correct exit_code in /proc/<pid>/stat Eric W. Biederman
2022-01-03 21:33     ` [PATCH 05/17] taskstats: Cleanup the use of task->exit_code Eric W. Biederman
2022-01-03 21:33     ` [PATCH 06/17] ptrace: Remove second setting of PT_SEIZED in ptrace_attach Eric W. Biederman
2022-01-03 21:33     ` [PATCH 07/17] ptrace: Remove unused regs argument from ptrace_report_syscall Eric W. Biederman
2022-01-03 21:33     ` [PATCH 08/17] ptrace/m68k: Stop open coding ptrace_report_syscall Eric W. Biederman
2022-01-10 15:26       ` Geert Uytterhoeven
2022-01-10 16:20         ` Al Viro
2022-01-10 16:25           ` Al Viro
2022-01-10 17:54           ` Geert Uytterhoeven
2022-01-10 20:37             ` Al Viro
2022-01-10 21:18               ` Eric W. Biederman
2022-01-11  1:33             ` Michael Schmitz
2022-01-11 22:42               ` Finn Thain
2022-01-12  0:20                 ` Michael Schmitz
2022-01-12  3:32                   ` Finn Thain
2022-01-12  7:54                     ` Michael Schmitz
2022-01-12  7:55                   ` Geert Uytterhoeven
2022-01-12  8:05                     ` Michael Schmitz
2022-01-03 21:33     ` [PATCH 09/17] ptrace: Move setting/clearing ptrace_message into ptrace_stop Eric W. Biederman
2022-01-03 21:33     ` [PATCH 10/17] ptrace: Return the signal to continue with from ptrace_stop Eric W. Biederman
2022-01-03 21:33     ` [PATCH 11/17] ptrace: Separate task->ptrace_code out from task->exit_code Eric W. Biederman
2022-01-03 21:33     ` [PATCH 12/17] signal: Compute the process exit_code in get_signal Eric W. Biederman
2022-01-03 21:33     ` [PATCH 13/17] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2022-01-03 21:33     ` [PATCH 14/17] signal: Remove zap_other_threads Eric W. Biederman
2022-01-03 21:33     ` [PATCH 15/17] signal: Add JOBCTL_WILL_EXIT to mark exiting tasks Eric W. Biederman
2022-01-03 21:33     ` [PATCH 16/17] signal: Record the exit_code when an exit is scheduled Eric W. Biederman
2022-01-03 21:33     ` [PATCH 17/17] signal: Always set SIGNAL_GROUP_EXIT on process exit Eric W. Biederman
2022-03-09  0:13     ` [PATCH 00/13] Removing tracehook.h Eric W. Biederman
2022-03-09 21:05       ` Jens Axboe
2022-03-15 23:18       ` [PATCH 0/2] ptrace: Making the ptrace changes atomic Eric W. Biederman
2022-03-15 23:21         ` [PATCH 1/2] ptrace: Move setting/clearing ptrace_message into ptrace_stop Eric W. Biederman
2022-03-17 17:46           ` Oleg Nesterov
2022-03-17 19:10           ` Kees Cook
2022-03-18 14:44             ` Eric W. Biederman
2022-03-18 17:20               ` Kees Cook
2022-03-15 23:22         ` [PATCH 2/2] ptrace: Return the signal to continue with from ptrace_stop Eric W. Biederman
2022-03-17 18:08           ` Oleg Nesterov
2022-03-17 18:31             ` Eric W. Biederman
2022-03-18 19:43               ` Oleg Nesterov
2022-03-18 14:40             ` Eric W. Biederman
2022-03-17 19:13           ` Kees Cook
2022-03-18 14:52             ` Eric W. Biederman
2022-03-18 17:28               ` Kees Cook
2022-03-28 23:56         ` [GIT PULL] ptrace: Cleanups for v5.18 Eric W. Biederman
2022-03-29  0:03           ` Jens Axboe
2022-03-29  0:33           ` Linus Torvalds
2022-03-29  0:53             ` Stephen Rothwell
2022-03-29  0:58               ` Linus Torvalds
2022-03-29  3:37             ` Eric W. Biederman
2022-03-29  4:49               ` Linus Torvalds
2022-03-29  5:20                 ` Linus Torvalds
2022-03-29  0:35           ` pr-tracker-bot
2022-03-09  0:15     ` [PATCH 00/13] Removing tracehook.h Eric W. Biederman
2022-03-09 20:58       ` Linus Torvalds
2021-12-13 22:53 ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Eric W. Biederman
2022-01-04  6:30   ` Dmitry Osipenko
2022-01-04 16:18     ` Eric W. Biederman
2022-01-05 19:58     ` Eric W. Biederman
2022-01-05 21:39       ` Dmitry Osipenko [this message]
2022-01-08 18:13         ` Eric W. Biederman
2022-01-08 18:15           ` [PATCH 1/2] signal: Have prepare_signal detect coredumps using signal->core_state Eric W. Biederman
2022-01-08 18:15           ` [PATCH 2/2] signal: Make coredump handling explicit in complete_signal Eric W. Biederman
2022-01-11  8:59           ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Dmitry Osipenko
2022-01-11 17:20             ` Eric W. Biederman
2022-01-18 17:30               ` Dmitry Osipenko
2022-01-18 17:52                 ` Eric W. Biederman
2022-01-18 18:01                   ` Dmitry Osipenko
2022-01-04 18:44   ` Linus Torvalds
2022-01-04 19:47     ` Eric W. Biederman
2022-01-08 19:13       ` Heiko Carstens
     [not found]         ` <87ilurwjju.fsf@email.froward.int.ebiederm.org>
     [not found]           ` <87o84juwhg.fsf@email.froward.int.ebiederm.org>
2022-01-10 23:00             ` Olivier Langlois
2022-01-11 17:28               ` Eric W. Biederman
2022-01-11 18:51                 ` Eric W. Biederman
2022-01-11 19:19                   ` Linus Torvalds
2022-01-15  0:12                     ` Eric W. Biederman
2022-01-15 19:23                       ` Olivier Langlois
2022-01-17 16:09                         ` Eric W. Biederman
2022-01-17 18:46                           ` io_uring truncating coredumps Eric W. Biederman
2022-01-18  4:23                             ` Linus Torvalds
2022-01-26 15:06                           ` [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case Olivier Langlois
2021-12-13 22:53 ` [PATCH 2/8] signal: Drop signals received after a fatal signal has been processed Eric W. Biederman
2021-12-13 22:53 ` [PATCH 3/8] signal: Have the oom killer detect coredumps using signal->core_state Eric W. Biederman
2021-12-13 22:53 ` [PATCH 4/8] signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process Eric W. Biederman
2021-12-13 22:53 ` [PATCH 5/8] signal: Remove SIGNAL_GROUP_COREDUMP Eric W. Biederman
2021-12-13 22:53 ` [PATCH 6/8] coredump: Stop setting signal->group_exit_task Eric W. Biederman
2021-12-13 22:53 ` [PATCH 7/8] signal: Rename group_exit_task group_exec_task Eric W. Biederman
2021-12-13 22:53 ` [PATCH 8/8] signal: Remove the helper signal_group_exit Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5bbb54c4-7504-cd28-5dde-4e5965496625@gmail.com \
    --to=digetx@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=legion@kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=me@kylehuey.com \
    --cc=oleg@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).