public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Andy Lutomirski" <luto@kernel.org>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	linux-arch@vger.kernel.org,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Al Viro" <viro@ZenIV.linux.org.uk>,
	"Kees Cook" <keescook@chromium.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"the arch\/x86 maintainers" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 10/20] signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved.
Date: Sun, 24 Oct 2021 11:06:55 -0500	[thread overview]
Message-ID: <87sfwqxv8g.fsf@disp2133> (raw)
In-Reply-To: <b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.com> (Andy Lutomirski's message of "Thu, 21 Oct 2021 16:08:58 -0700")

"Andy Lutomirski" <luto@kernel.org> writes:

> On Wed, Oct 20, 2021, at 10:43 AM, Eric W. Biederman wrote:
>> Instead of pretending to send SIGSEGV by calling do_exit(SIGSEGV)
>> call force_sigsegv(SIGSEGV) to force the process to take a SIGSEGV
>> and terminate.
>
> Why?  I realize it's more polite, but is this useful enough to justify
> the need for testing and potential security impacts?

The why is that do_exit as an interface needs to be refactored.

As it exists right now "do_exit" is bad enough that on a couple of older
architectures do_exit in a random location results in being able to
read/write the kernel stack using ptrace.

So to addresses the issues I need to get everything that really
shouldn't be using do_exit to use something else.

>> Update handle_signal to return immediately when save_v86_state fails
>> and kills the process.  Returning immediately without doing anything
>> except killing the process with SIGSEGV is also what signal_setup_done
>> does when setup_rt_frame fails.  Plus it is always ok to return
>> immediately without delivering a signal to a userspace handler when a
>> fatal signal has killed the current process.
>>
>
> I can mostly understand the individual sentences, but I don't
> understand what you're getting it.  If a fatal signal has killed the
> current process and we are guaranteed not to hit the exit-to-usermode
> path, then, sure, it's safe to return unless we're worried that the
> core dump code will explode.
>
> But, unless it's fixed elsewhere in your series, force_sigsegv() is
> itself quite racy, or at least looks racy -- it can race against
> another thread calling sigaction() and changing the action to
> something other than SIG_DFL.  So it does not appear to actually
> reliably kill the caller, especially if exposed to a malicious user
> program.

You are correct about the races.  I have changes in the works to make
the races go away but that is not an excuse for push a change that
is buggy without them.



>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: x86@kernel.org
>> Cc: H Peter Anvin <hpa@zytor.com>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  arch/x86/kernel/signal.c  | 6 +++++-
>>  arch/x86/kernel/vm86_32.c | 2 +-
>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
>> index f4d21e470083..25a230f705c1 100644
>> --- a/arch/x86/kernel/signal.c
>> +++ b/arch/x86/kernel/signal.c
>> @@ -785,8 +785,12 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
>>  	bool stepping, failed;
>>  	struct fpu *fpu = &current->thread.fpu;
>> 
>> -	if (v8086_mode(regs))
>> +	if (v8086_mode(regs)) {
>>  		save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL);
>> +		/* Has save_v86_state failed and killed the process? */
>> +		if (fatal_signal_pending(current))
>> +			return;
>
> This might be an ABI break, or at least it could be if anyone cared
> about vm86.  Imagine this wasn't guarded by if (v8086_mode) and was
> just if (fatal_signal_pending(current)) return; Then all the other
> processing gets skipped if a fatal signal is pending (e.g. from a
> concurrent kill), which could cause visible oddities in a core dump, I
> think.  Maybe it's minor.

I believe it is minor, because the test happens before anything is
written to userspace.  The worst case is a signal gets dequeued and
then not written to userspace.

On a second I am not certain this test is even necessary.  Especially
if the change you suggest be made to save_v86_state is made so that
the kernel is out of v86 state and kernel things can safely happen.

>> +	}
>> 
>>  	/* Are we from a system call? */
>>  	if (syscall_get_nr(current, regs) != -1) {
>> diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
>> index 63486da77272..040fd01be8b3 100644
>> --- a/arch/x86/kernel/vm86_32.c
>> +++ b/arch/x86/kernel/vm86_32.c
>> @@ -159,7 +159,7 @@ void save_v86_state(struct kernel_vm86_regs *regs, 
>> int retval)
>>  	user_access_end();
>>  Efault:
>>  	pr_alert("could not access userspace vm86 info\n");
>> -	do_exit(SIGSEGV);
>> +	force_sigsegv(SIGSEGV);
>
> This causes us to run unwitting kernel code with the vm86 garbage
> still loaded into the relevant architectural areas (see the chunk if
> save_v86_state that's inside preempt_disable()).  So NAK, especially
> since the aforementioned race might cause the exit-to-usermode path to
> actually run with who-knows-what consequences.

Fair.  I suspect it might even make the current do_exit call run
with who-knows-what consequence.

> If you really want to make this change, please arrange for
> save_v86_state() to switch out of vm86 mode *before* anything that
> might fail so that it's guaranteed to at least put the task in a sane
> state.  And write an explicit test case that tests it.  I could help
> with the latter if you do the former.

I do really want to remove this do_exit.  If the error was causes by a
kernel malfunction we could do something like die.

As it is the code is effectively hand rolling die/oops for a userspace
caused condition.  Which is quite nasty from a maintenance point of
view.


I think your suggested changes to save_v86_state are much more robust
than my idea of simply calling force_sig... and expecting the kernel
to exit immediately.   Having to go another pass through the
exit_to_usermode_loop does not look like it is very hard to make
it robust against a kernel in a random state.

I could close the race today by replacing the force_sigsegv(SIGSEGV)
with force_sig(SIGKILL).  And that removes the coredump path from
the equation so is a bit interesting, but it really is unsatisfactory.


I will dig in and see what can be done including writing a test so that
this code path gracefully handles -EFAULT rather than tries to walk
through the rest of the kernel in a problematic state.


This change as proposed does not get this save_v86_state case to using
ordinary mechanisms to handle the problem, so as written it does not
solve the problem it set out to solve.

Eric

  reply	other threads:[~2021-10-24 16:07 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-20 17:32 [PATCH 00/20] exit cleanups Eric W. Biederman
2021-10-20 17:43 ` [PATCH 01/20] exit/doublefault: Remove apparently bogus comment about rewind_stack_do_exit Eric W. Biederman
2021-10-21 16:02   ` Kees Cook
2021-10-20 17:43 ` [PATCH 02/20] exit: Remove calls of do_exit after noreturn versions of die Eric W. Biederman
2021-10-21 16:02   ` Kees Cook
2021-10-21 16:25     ` Eric W. Biederman
2021-10-20 17:43 ` [PATCH 03/20] reboot: Remove the unreachable panic after do_exit in reboot(2) Eric W. Biederman
2021-10-21 16:05   ` Kees Cook
2021-10-20 17:43 ` [PATCH 04/20] signal/sparc32: Remove unreachable do_exit in do_sparc_fault Eric W. Biederman
2021-10-21 16:05   ` Kees Cook
2021-10-20 17:43 ` [PATCH 05/20] signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT Eric W. Biederman
2021-10-21 16:06   ` Kees Cook
2021-10-24  4:24   ` Maciej W. Rozycki
2021-10-25 20:55     ` Eric W. Biederman
2021-10-24 15:27   ` Thomas Bogendoerfer
2021-10-20 17:43 ` [PATCH 06/20] signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL) Eric W. Biederman
2021-10-20 19:57   ` Linus Torvalds
2021-10-27 14:24     ` Rich Felker
2021-10-21 16:08   ` Kees Cook
2021-10-20 17:43 ` [PATCH 07/20] signal/powerpc: On swapcontext failure force SIGSEGV Eric W. Biederman
2021-10-21 16:09   ` Kees Cook
2021-10-20 17:43 ` [PATCH 08/20] signal/sparc: In setup_tsb_params convert open coded BUG into BUG Eric W. Biederman
2021-10-21 16:12   ` Kees Cook
2021-10-20 17:43 ` [PATCH 09/20] signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON Eric W. Biederman
2021-10-21 16:15   ` Kees Cook
2021-10-20 17:43 ` [PATCH 10/20] signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved Eric W. Biederman
2021-10-21 16:16   ` Kees Cook
2021-10-21 17:02     ` Eric W. Biederman
2021-10-21 20:33       ` Kees Cook
2021-10-21 23:08   ` Andy Lutomirski
2021-10-24 16:06     ` Eric W. Biederman [this message]
     [not found]   ` <875ytkygfj.fsf_-_@disp2133>
2021-10-25 21:12     ` [PATCH v2 10/32] " Linus Torvalds
2021-10-25 21:28       ` Eric W. Biederman
2021-10-25 22:25     ` Andy Lutomirski
2021-10-25 23:45       ` Linus Torvalds
2021-10-26  0:21         ` Andy Lutomirski
2021-10-20 17:43 ` [PATCH 11/20] signal/s390: Use force_sigsegv in default_trap_handler Eric W. Biederman
2021-10-21 16:17   ` Kees Cook
2021-10-26  9:38   ` Christian Borntraeger
2021-10-28 15:56     ` Eric W. Biederman
2021-10-29 19:32       ` Eric W. Biederman
2021-10-20 17:43 ` [PATCH 12/20] exit/kthread: Have kernel threads return instead of calling do_exit Eric W. Biederman
2021-10-21 11:12   ` Christoph Hellwig
2021-10-21 15:11     ` Eric W. Biederman
2021-10-21 16:21   ` Kees Cook
2021-10-20 17:43 ` [PATCH 13/20] signal: Implement force_fatal_sig Eric W. Biederman
2021-10-20 20:05   ` Linus Torvalds
2021-10-20 21:25     ` Eric W. Biederman
2021-10-25 22:41     ` Andy Lutomirski
2021-10-25 23:15       ` Linus Torvalds
2021-10-26  4:45         ` Eric W. Biederman
2021-10-26  4:57         ` Eric W. Biederman
2021-10-26 16:15           ` Linus Torvalds
2021-10-28 16:33             ` Eric W. Biederman
2021-10-21 16:24   ` Kees Cook
2021-10-21 16:33     ` Eric W. Biederman
2021-10-21 16:39       ` Kees Cook
2021-10-20 17:44 ` [PATCH 14/20] exit/syscall_user_dispatch: Send ordinary signals on failure Eric W. Biederman
2021-10-21 16:25   ` Kees Cook
2021-10-21 16:37     ` Eric W. Biederman
2021-10-21 16:40       ` Kees Cook
2021-10-21 17:05         ` Eric W. Biederman
2021-10-25 22:32     ` Andy Lutomirski
2021-10-21 16:35   ` Gabriel Krisman Bertazi
2021-10-20 17:44 ` [PATCH 15/20] signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails Eric W. Biederman
2021-10-21 16:34   ` Kees Cook
2021-10-21 16:56     ` Eric W. Biederman
2021-10-20 17:44 ` [PATCH 16/20] signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig Eric W. Biederman
2021-10-21 16:34   ` Kees Cook
2021-10-20 17:44 ` [PATCH 17/20] signal/x86: In emulate_vsyscall force a signal instead of calling do_exit Eric W. Biederman
2021-10-21 16:36   ` Kees Cook
2021-10-20 17:44 ` [PATCH 18/20] exit/rtl8723bs: Replace the macro thread_exit with a simple return 0 Eric W. Biederman
2021-10-21  7:06   ` Greg KH
2021-10-21 15:06     ` Eric W. Biederman
2021-10-21 16:37   ` Kees Cook
2021-10-20 17:44 ` [PATCH 19/20] exit/rtl8712: " Eric W. Biederman
2021-10-21  7:07   ` Greg KH
2021-10-21 16:37   ` Kees Cook
2021-10-20 17:44 ` [PATCH 20/20] exit/r8188eu: " Eric W. Biederman
2021-10-21  7:07   ` Greg KH
2021-10-21 16:37   ` Kees Cook
2021-10-20 21:51 ` [PATCH 21/20] signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV) Eric W. Biederman
2021-10-21  8:09   ` Geert Uytterhoeven
2021-10-21 13:33     ` Eric W. Biederman
2021-10-21  8:32   ` Philippe Mathieu-Daudé

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sfwqxv8g.fsf@disp2133 \
    --to=ebiederm@xmission.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox