From: ebiederm@xmission.com (Eric W. Biederman)
To: Qianli Zhao <zhaoqianligood@gmail.com>
Cc: christian@brauner.io, axboe@kernel.dk, oleg@redhat.com,
	tglx@linutronix.de, pcc@google.com, linux-kernel@vger.kernel.org,
	zhaoqianli@xiaomi.com
Subject: Re: [PATCH V2] exit: trigger panic when global init has exited
Date: Fri, 12 Mar 2021 12:23:11 -0600
Message-ID: <m1ft10i640.fsf@fess.ebiederm.org>
In-Reply-To: <1615519478-178620-1-git-send-email-zhaoqianligood@gmail.com> (Qianli Zhao's message of "Fri, 12 Mar 2021 11:24:38 +0800")

Qianli Zhao <zhaoqianligood@gmail.com> writes:

> From: Qianli Zhao <zhaoqianli@xiaomi.com>
>
> When init sub-threads running on different CPUs exit at the same time,
> zap_pid_ns_processes()->BUG() may be triggered.
> Every thread's state is also abnormal after exit (PF_EXITING set, task->mm=NULL, etc.),
> which makes it difficult to parse a coredump from the full dump normally.
> To fix this, when any init thread has been set to SIGNAL_GROUP_EXIT,
> trigger the panic immediately and prevent the other init threads from continuing to exit.
>
> [   24.705376] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
> [   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S         O    4.14.180-perf-g4483caa8ae80-dirty #1
> [   24.705390] kernel BUG at include/linux/pid_namespace.h:98!
>
> PID: 552   CPU: 4   COMMAND: "init"
> PID: 1     CPU: 7   COMMAND: "init"
> core4                           core7
> ...                             sys_exit_group()
>                                 do_group_exit()
>                                    - sig->flags = SIGNAL_GROUP_EXIT
>                                    - zap_other_threads()
>                                 do_exit() //PF_EXITING is set
> ret_to_user()
> do_notify_resume()
> get_signal()
>     - signal_group_exit
>     - goto fatal;
> do_group_exit()
> do_exit() //PF_EXITING is set
>     - panic("Attempted to kill init! exitcode=0x%08x\n")
>                                 exit_notify()
>                                 find_alive_thread() //no alive sub-threads
>                                 zap_pid_ns_processes()//CONFIG_PID_NS is not set
>                                 BUG()
>
> Signed-off-by: Qianli Zhao <zhaoqianli@xiaomi.com>

The changelog is much better, thank you.

As Oleg pointed out, we need to do something like the code below.

diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e35e69a..bc676c06ef9a 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -785,15 +785,16 @@ void __noreturn do_exit(long code)
 		sync_mm_rss(tsk->mm);
 	acct_update_integrals(tsk);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
+	/*
+	 * If the global init has exited, panic immediately to get a
+	 * useable coredump.
+	 */
+	if (unlikely(is_global_init(tsk) &&
+		     (group_dead || (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
+		panic("Attempted to kill init! exitcode=0x%08x\n",
+		      tsk->signal->group_exit_code ?: (int)code);
+	}
 	if (group_dead) {
-		/*
-		 * If the last thread of global init has exited, panic
-		 * immediately to get a useable coredump.
-		 */
-		if (unlikely(is_global_init(tsk)))
-			panic("Attempted to kill init! exitcode=0x%08x\n",
-				tsk->signal->group_exit_code ?: (int)code);
-
 #ifdef CONFIG_POSIX_TIMERS
 		hrtimer_cancel(&tsk->signal->real_timer);
 		exit_itimers(tsk->signal);

There is still a race that could lead to the BUG in zap_pid_ns_processes:
the case where the last two threads of a process both call pthread_exit
(which reaches do_exit, not do_group_exit, in the kernel).

Thread A                            Thread B
do_exit()                           do_exit()

 exit_signals()
   tsk->flags |= PF_EXITING;
 group_dead = false;
                                    exit_signals()
                                      tsk->flags |= PF_EXITING;
 exit_notify()
  forget_original_parent
    find_child_reaper
      reaper = find_alive_thread()
      zap_pid_ns_processes()
         BUG()
                                    group_dead = true;
                                    if (is_global_init())
                                    	panic("Attempted to kill init");

As we are guaranteed to see the panic with my change above, I suggest
we augment it by simply removing the BUG in zap_pid_ns_processes.

Or maybe not, if there is a better way to write the panic code.  I don't
think having pid namespaces compiled out is a particularly common case,
so I favor whatever we can do to keep the code correct and reduce testing.

Eric
