From: Oleg Nesterov <oleg@redhat.com>
To: Enke Chen <enkechen@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Arnd Bergmann <arnd@arndb.de>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Khalid Aziz <khalid.aziz@oracle.com>,
Kate Stewart <kstewart@linuxfoundation.org>,
Helge Deller <deller@gmx.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Christian Brauner <christian@brauner.io>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Dave Martin <Dave.Martin@arm.com>,
Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@s>
Subject: Re: [PATCH v2] kernel/signal: Signal-based pre-coredump notification
Date: Tue, 23 Oct 2018 11:23:48 +0200
Message-ID: <20181023092348.GA14340@redhat.com>
In-Reply-To: <458c04d8-d189-4a26-729a-bb1d1d751534@cisco.com>
On 10/22, Enke Chen wrote:
>
> As the coredump of a process may take time, in certain time-sensitive
> applications it is necessary for a parent process (e.g., a process
> manager) to be notified of a child's imminent death before the coredump
> so that the parent process can act sooner, such as re-spawning an
> application process, or initiating a control-plane fail-over.
Personally I still do not like this feature, but I won't argue.
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -546,6 +546,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>  	struct cred *cred;
>  	int retval = 0;
>  	int ispipe;
> +	bool notify;
>  	struct files_struct *displaced;
>  	/* require nonrelative corefile path and be extra careful */
>  	bool need_suid_safe = false;
> @@ -590,6 +591,15 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>  	if (retval < 0)
>  		goto fail_creds;
>
> +	/*
> +	 * Send the pre-coredump signal to the parent if requested.
> +	 */
> +	read_lock(&tasklist_lock);
> +	notify = do_notify_parent_predump(current);
> +	read_unlock(&tasklist_lock);
> +	if (notify)
> +		cond_resched();
Hmm. I do not understand why we need cond_resched(). And even if we do need it,
why can't we call it unconditionally?

I'd also suggest moving read_lock/unlock(tasklist) into do_notify_parent_predump()
and removing the "task_struct *tsk" argument; tsk is always current.

Yes, do_notify_parent() and do_notify_parent_cldstop() are called with tasklist_lock
held, but there are good reasons for that.
> +static inline int valid_predump_signal(int sig)
> +{
> +	return (sig == SIGCHLD) || (sig == SIGUSR1) || (sig == SIGUSR2);
> +}
I still do not understand why we need to restrict predump_signal.
PR_SET_PREDUMP_SIG can only change the caller's ->predump_signal, so to me
even PR_SET_PREDUMP_SIG(SIGKILL) is fine.

And once again, SIGCHLD/SIGUSR do not queue; this means that PR_SET_PREDUMP_SIG
is pointless if you have 2 or more children.
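
To illustrate the queueing point, here is a minimal userspace sketch (not part
of the patch). It assumes Linux signal semantics; deliveries_after_burst() and
count_delivery() are hypothetical names made up for this example. The helper
blocks a signal, raises it several times while blocked, then unblocks it and
counts how often the handler runs. A standard signal such as SIGUSR1 should
collapse into a single delivery, while a real-time signal should be delivered
once per instance.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries;

static void count_delivery(int sig)
{
	(void)sig;
	deliveries++;
}

/*
 * Hypothetical helper: block `sig`, raise it `n` times while it is
 * blocked, then unblock it and return how many times the handler ran.
 * Returns -1 if the setup calls fail.
 */
static int deliveries_after_burst(int sig, int n)
{
	struct sigaction sa, old_sa;
	sigset_t block, old_mask;
	int i;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = count_delivery;
	sigemptyset(&sa.sa_mask);
	if (sigaction(sig, &sa, &old_sa) < 0)
		return -1;

	sigemptyset(&block);
	sigaddset(&block, sig);
	if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0)
		return -1;

	deliveries = 0;
	for (i = 0; i < n; i++)
		raise(sig);	/* stays pending while blocked */

	/* Unblocking lets whatever is still pending reach the handler. */
	sigprocmask(SIG_UNBLOCK, &block, NULL);

	/* Put the previous handler and signal mask back. */
	sigaction(sig, &old_sa, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return deliveries;
}
```

On Linux, deliveries_after_burst(SIGUSR1, 3) is expected to return 1, while
deliveries_after_burst(SIGRTMIN, 3) is expected to return 3. A parent with
several children dumping core at about the same time would be in the first
situation.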
> +bool do_notify_parent_predump(struct task_struct *tsk)
> +{
> +	struct sighand_struct *sighand;
> +	struct kernel_siginfo info;
> +	struct task_struct *parent;
> +	unsigned long flags;
> +	pid_t pid;
> +	int sig;
> +
> +	parent = tsk->parent;
> +	sighand = parent->sighand;
> +	pid = task_tgid_vnr(tsk);
> +
> +	spin_lock_irqsave(&sighand->siglock, flags);
> +	sig = parent->signal->predump_signal;
> +	if (!valid_predump_signal(sig)) {
> +		spin_unlock_irqrestore(&sighand->siglock, flags);
> +		return false;
> +	}
Why do we need to check parent->signal->predump_signal under ->siglock?
This complicates the code for no reason, afaics.
> +	clear_siginfo(&info);
> +	info.si_pid = pid;
> +	info.si_signo = sig;
> +	if (sig == SIGCHLD)
> +		info.si_code = CLD_PREDUMP;
> +
> +	__group_send_sig_info(sig, &info, parent);
> +	__wake_up_parent(tsk, parent);
Why __wake_up_parent()?

do_notify_parent() does this to wake up the parent sleeping in do_wait(), to
report the event. But predump_signal has nothing to do with wait().

Now. This version sends the signal to ->parent, not ->real_parent. OK, but this
means that real_parent won't be notified if its child is traced.
> +	case PR_SET_PREDUMP_SIG:
> +		if (arg3 || arg4 || arg5)
> +			return -EINVAL;
> +
> +		/* 0 is valid for disabling the feature */
> +		if (arg2 && !valid_predump_signal((int)arg2))
> +			return -EINVAL;
> +		me->signal->predump_signal = (int)arg2;
> +		break;
Again, I do not understand why we need valid_predump_signal(). But even
if we need it, I don't understand why we should check it twice. IOW, why
can't do_notify_parent_predump() simply check ->predump_signal != 0?

Whatever we do, PR_SET_PREDUMP_SIG should validate arg2 anyway. Who else can
change ->predump_signal after that?
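
The "validate once, at the only writer" idea can be modelled in userspace
roughly as follows. This is only a sketch, not kernel code: struct
predump_state, predump_set() and predump_signal_to_send() are hypothetical
names, and SIGRTMAX is used here as an assumed stand-in for the kernel's own
upper bound on signal numbers.

```c
#include <errno.h>
#include <signal.h>

/* Hypothetical stand-in for the per-process field the patch adds. */
struct predump_state {
	int predump_signal;	/* 0 means the feature is disabled */
};

/*
 * Models the PR_SET_PREDUMP_SIG path. It is the only place that writes
 * the field, so the range check lives here and nowhere else.
 */
static int predump_set(struct predump_state *st, unsigned long arg)
{
	if (arg > (unsigned long)SIGRTMAX)
		return -EINVAL;
	st->predump_signal = (int)arg;
	return 0;
}

/*
 * Models the notification path. The stored value was already checked,
 * so a plain non-zero test is sufficient: 0 means nothing to send.
 */
static int predump_signal_to_send(const struct predump_state *st)
{
	return st->predump_signal;
}
```

In this model any valid signal number is accepted by the setter, including
SIGKILL, and an out-of-range argument leaves the stored value untouched.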
> +	case PR_GET_PREDUMP_SIG:
> +		if (arg3 || arg4 || arg5)
> +			return -EINVAL;
> +		error = put_user(me->signal->predump_signal,
> +				 (int __user *)arg2);
To me it would be better to simply return ->predump_signal, iow

	error = me->signal->predump_signal;
	break;

but I won't insist, this is subjective and cosmetic.
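
Both styles already exist among the prctl() "get" operations, which can be
seen from userspace. The sketch below is only an illustration on Linux using
existing options, not the proposed PR_GET_PREDUMP_SIG; the wrapper names
get_dumpable() and get_pdeathsig() are made up for this example.
PR_GET_DUMPABLE returns the value directly, while PR_GET_PDEATHSIG writes it
through a pointer passed in arg2, which is the style the quoted hunk follows.

```c
#include <signal.h>
#include <sys/prctl.h>

/* Value comes back as the prctl() return value (-1 on error). */
static int get_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Value is stored through the pointer passed in arg2; the return
 * value only reports success (0) or failure (-1).
 */
static int get_pdeathsig(int *sig)
{
	return prctl(PR_GET_PDEATHSIG, (unsigned long)sig, 0, 0, 0);
}
```

The direct-return style saves the caller a variable and a pointer argument,
at the cost of sharing the return value between data and the error indication.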
Oleg.