All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leonard Crestez <leonard.crestez@nxp.com>
To: Thomas Garnier <thgarnie@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Rik van Riel <riel@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Petr Mladek <pmladek@suse.com>, Miroslav Benes <mbenes@suse.cz>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Arnd Bergmann <arnd@arndb.de>,
	Dave Hansen <dave.hansen@intel.com>,
	David Howells <dhowells@redhat.com>,
	Russell King <linux@armlinux.org.uk>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>, Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Pratyush Anand <panand@redhat.com>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Linux API <linux-api@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	Octavian Purdila <octavian.purdila@nxp.com>
Subject: [kernel-hardening] Re: [PATCH v10 2/3] arm/syscalls: Check address limit on user-mode return
Date: Wed, 19 Jul 2017 17:58:20 +0300	[thread overview]
Message-ID: <1500476300.22834.13.camel@nxp.com> (raw)
In-Reply-To: <CAJcbSZFr9KJTfGfiZo2fThoDkAE-D1OFf2YtELq4P6jX8syesQ@mail.gmail.com>

On Tue, 2017-07-18 at 12:04 -0700, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 10:18 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > On Tue, 2017-07-18 at 09:04 -0700, Thomas Garnier wrote:
> > > On Tue, Jul 18, 2017 at 7:36 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > > > On Wed, 2017-06-14 at 18:12 -0700, Thomas Garnier wrote:
> > > > > 
> > > > > Ensure the address limit is a user-mode segment before returning to
> > > > > user-mode. Otherwise a process can corrupt kernel-mode memory and
> > > > > elevate privileges [1].
> > > > > 
> > > > > The set_fs function sets the TIF_SETFS flag to force a slow path on
> > > > > return. In the slow path, the address limit is checked to be USER_DS if
> > > > > needed.
> > > > > 
> > > > > The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
> > > > > for arm instruction immediate support. The global work mask is too big
> > > > > to used on a single instruction so adapt ret_fast_syscall.
> > > > > 
> > > > > @@ -571,6 +572,10 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
> > > > >        * Update the trace code with the current status.
> > > > >        */
> > > > >       trace_hardirqs_off();
> > > > > +
> > > > > +     /* Check valid user FS if needed */
> > > > > +     addr_limit_user_check();
> > > > > +
> > > > >       do {
> > > > >               if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > > > >                       schedule();
> > > > This patch made it's way into linux-next next-20170717 and it seems to
> > > > cause hangs when booting some boards over NFS (found via bisection). I
> > > > don't know exactly what determines the issue but I can reproduce hangs
> > > > if even if I just boot with init=/bin/bash and do stuff like
> > > > 
> > > > # sleep 1 & sleep 1 & sleep 1 & wait; wait; wait; echo done!
> > > > 
> > > > When this happens sysrq-t shows a sleep task hung in the 'R' state
> > > > spinning in do_work_pending, so maybe there is a potential infinite
> > > > loop here?
> > > > 
> > > > The addr_limit_user_check at the start of do_work_pending will check
> > > > for TIF_FSCHECK once and clear it but the function loops while
> > > > (thread_flags & _TIF_WORK_MASK), so it if TIF_FSCHECK is set again then
> > > > the loop will never terminate. Does this make sense?
> > > 
> > > Yes, it does. Thanks for looking into this.
> > > 
> > > Can you try this change?
> > > 
> > > diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> > > index 3a48b54c6405..bc6ad7789568 100644
> > > --- a/arch/arm/kernel/signal.c
> > > +++ b/arch/arm/kernel/signal.c
> > > @@ -573,12 +573,11 @@ do_work_pending(struct pt_regs *regs, unsigned
> > > int thread_flags, int syscall)
> > >   */
> > >   trace_hardirqs_off();
> > > 
> > > - /* Check valid user FS if needed */
> > > - addr_limit_user_check();
> > > -
> > >   do {
> > >   if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > >   schedule();
> > > + } else if (thread_flags & _TIF_FSCHECK) {
> > > + addr_limit_user_check();
> > >   } else {
> > >   if (unlikely(!user_mode(regs)))
> > >   return 0;
> > This does seem to work, it no longer hangs on boot in my setup. This is
> > obviously only a very superficial test.
> > 
> > The new location of this check seems weird, it's not clear why it
> > should be on an else path. Perhaps it should be moved to right before
> > where current_thread_info()->flags is fetched again?

> I was hitting bug when I tried that.I think that's because you
> basically let the signal handler do pending work before you check the
> flag, that's not a good idea.

> > If the purpose is hardening against buggy kernel code doing bad set_fs
> > calls shouldn't this flag also be checked before looking at
> > TIF_NEED_RESCHED and calling schedule()?
> I am not sure to be honest. I expected schedule to only schedule the
> processor to another task which would be fine given only the current
> task have a bogus fs. I will put it first in case there is an edge
> case scenario I missed.
> 
> What do you think? Let me know and I will look at changes all
> architectures and testing them.

I don't know and I'd rather not guess on security issues. It's better
if someone else reviews the code.

Unless there is a very quick fix maybe this series should be removed or
reverted from linux-next? A diagnosis of "system calls can sometimes
hang on return" seems serious even for linux-next. Since it happens
very rarely in most setups I can easily imagine somebody spending a lot
of time digging at this.

--
Regards,
Leonard

WARNING: multiple messages have this Message-ID (diff)
From: Leonard Crestez <leonard.crestez-3arQi8VN3Tc@public.gmane.org>
To: Thomas Garnier <thgarnie-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Stephen Rothwell <sfr-3FnU+UHB4dNDw9hX6IcOSA@public.gmane.org>
Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"H . Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
	Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Paolo Bonzini <pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Josh Poimboeuf <jpoimboe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Petr Mladek <pmladek-IBi9RG/b67k@public.gmane.org>,
	Miroslav Benes <mbenes-AlSwsSmVLrQ@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	Dave Hansen <dave.hansen-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Russell King <linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
	Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>,
	Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>,
	Pratyush Anand <panand-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH v10 2/3] arm/syscalls: Check address limit on user-mode return
Date: Wed, 19 Jul 2017 17:58:20 +0300	[thread overview]
Message-ID: <1500476300.22834.13.camel@nxp.com> (raw)
In-Reply-To: <CAJcbSZFr9KJTfGfiZo2fThoDkAE-D1OFf2YtELq4P6jX8syesQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, 2017-07-18 at 12:04 -0700, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 10:18 AM, Leonard Crestez <leonard.crestez-3arQi8VN3Tc@public.gmane.org> wrote:
> > On Tue, 2017-07-18 at 09:04 -0700, Thomas Garnier wrote:
> > > On Tue, Jul 18, 2017 at 7:36 AM, Leonard Crestez <leonard.crestez-3arQi8VN3Tc@public.gmane.org> wrote:
> > > > On Wed, 2017-06-14 at 18:12 -0700, Thomas Garnier wrote:
> > > > > 
> > > > > Ensure the address limit is a user-mode segment before returning to
> > > > > user-mode. Otherwise a process can corrupt kernel-mode memory and
> > > > > elevate privileges [1].
> > > > > 
> > > > > The set_fs function sets the TIF_SETFS flag to force a slow path on
> > > > > return. In the slow path, the address limit is checked to be USER_DS if
> > > > > needed.
> > > > > 
> > > > > The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
> > > > > for arm instruction immediate support. The global work mask is too big
> > > > > to used on a single instruction so adapt ret_fast_syscall.
> > > > > 
> > > > > @@ -571,6 +572,10 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
> > > > >        * Update the trace code with the current status.
> > > > >        */
> > > > >       trace_hardirqs_off();
> > > > > +
> > > > > +     /* Check valid user FS if needed */
> > > > > +     addr_limit_user_check();
> > > > > +
> > > > >       do {
> > > > >               if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > > > >                       schedule();
> > > > This patch made it's way into linux-next next-20170717 and it seems to
> > > > cause hangs when booting some boards over NFS (found via bisection). I
> > > > don't know exactly what determines the issue but I can reproduce hangs
> > > > if even if I just boot with init=/bin/bash and do stuff like
> > > > 
> > > > # sleep 1 & sleep 1 & sleep 1 & wait; wait; wait; echo done!
> > > > 
> > > > When this happens sysrq-t shows a sleep task hung in the 'R' state
> > > > spinning in do_work_pending, so maybe there is a potential infinite
> > > > loop here?
> > > > 
> > > > The addr_limit_user_check at the start of do_work_pending will check
> > > > for TIF_FSCHECK once and clear it but the function loops while
> > > > (thread_flags & _TIF_WORK_MASK), so it if TIF_FSCHECK is set again then
> > > > the loop will never terminate. Does this make sense?
> > > 
> > > Yes, it does. Thanks for looking into this.
> > > 
> > > Can you try this change?
> > > 
> > > diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> > > index 3a48b54c6405..bc6ad7789568 100644
> > > --- a/arch/arm/kernel/signal.c
> > > +++ b/arch/arm/kernel/signal.c
> > > @@ -573,12 +573,11 @@ do_work_pending(struct pt_regs *regs, unsigned
> > > int thread_flags, int syscall)
> > >   */
> > >   trace_hardirqs_off();
> > > 
> > > - /* Check valid user FS if needed */
> > > - addr_limit_user_check();
> > > -
> > >   do {
> > >   if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > >   schedule();
> > > + } else if (thread_flags & _TIF_FSCHECK) {
> > > + addr_limit_user_check();
> > >   } else {
> > >   if (unlikely(!user_mode(regs)))
> > >   return 0;
> > This does seem to work, it no longer hangs on boot in my setup. This is
> > obviously only a very superficial test.
> > 
> > The new location of this check seems weird, it's not clear why it
> > should be on an else path. Perhaps it should be moved to right before
> > where current_thread_info()->flags is fetched again?

> I was hitting bug when I tried that.I think that's because you
> basically let the signal handler do pending work before you check the
> flag, that's not a good idea.

> > If the purpose is hardening against buggy kernel code doing bad set_fs
> > calls shouldn't this flag also be checked before looking at
> > TIF_NEED_RESCHED and calling schedule()?
> I am not sure to be honest. I expected schedule to only schedule the
> processor to another task which would be fine given only the current
> task have a bogus fs. I will put it first in case there is an edge
> case scenario I missed.
> 
> What do you think? Let me know and I will look at changes all
> architectures and testing them.

I don't know and I'd rather not guess on security issues. It's better
if someone else reviews the code.

Unless there is a very quick fix maybe this series should be removed or
reverted from linux-next? A diagnosis of "system calls can sometimes
hang on return" seems serious even for linux-next. Since it happens
very rarely in most setups I can easily imagine somebody spending a lot
of time digging at this.

--
Regards,
Leonard

WARNING: multiple messages have this Message-ID (diff)
From: leonard.crestez@nxp.com (Leonard Crestez)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v10 2/3] arm/syscalls: Check address limit on user-mode return
Date: Wed, 19 Jul 2017 17:58:20 +0300	[thread overview]
Message-ID: <1500476300.22834.13.camel@nxp.com> (raw)
In-Reply-To: <CAJcbSZFr9KJTfGfiZo2fThoDkAE-D1OFf2YtELq4P6jX8syesQ@mail.gmail.com>

On Tue, 2017-07-18 at 12:04 -0700, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 10:18 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > On Tue, 2017-07-18 at 09:04 -0700, Thomas Garnier wrote:
> > > On Tue, Jul 18, 2017 at 7:36 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > > > On Wed, 2017-06-14 at 18:12 -0700, Thomas Garnier wrote:
> > > > > 
> > > > > Ensure the address limit is a user-mode segment before returning to
> > > > > user-mode. Otherwise a process can corrupt kernel-mode memory and
> > > > > elevate privileges [1].
> > > > > 
> > > > > The set_fs function sets the TIF_SETFS flag to force a slow path on
> > > > > return. In the slow path, the address limit is checked to be USER_DS if
> > > > > needed.
> > > > > 
> > > > > The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
> > > > > for arm instruction immediate support. The global work mask is too big
> > > > > to used on a single instruction so adapt ret_fast_syscall.
> > > > > 
> > > > > @@ -571,6 +572,10 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
> > > > > ???????* Update the trace code with the current status.
> > > > > ???????*/
> > > > > ??????trace_hardirqs_off();
> > > > > +
> > > > > +?????/* Check valid user FS if needed */
> > > > > +?????addr_limit_user_check();
> > > > > +
> > > > > ??????do {
> > > > > ??????????????if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > > > > ??????????????????????schedule();
> > > > This patch made it's way into linux-next next-20170717 and it seems to
> > > > cause hangs when booting some boards over NFS (found via bisection). I
> > > > don't know exactly what determines the issue but I can reproduce hangs
> > > > if even if I just boot with init=/bin/bash and do stuff like
> > > > 
> > > > # sleep 1 & sleep 1 & sleep 1 & wait; wait; wait; echo done!
> > > > 
> > > > When this happens sysrq-t shows a sleep task hung in the 'R' state
> > > > spinning in do_work_pending, so maybe there is a potential infinite
> > > > loop here?
> > > > 
> > > > The addr_limit_user_check at the start of do_work_pending will check
> > > > for TIF_FSCHECK once and clear it but the function loops while
> > > > (thread_flags & _TIF_WORK_MASK), so it if TIF_FSCHECK is set again then
> > > > the loop will never terminate. Does this make sense?
> > > 
> > > Yes, it does. Thanks for looking into this.
> > > 
> > > Can you try this change?
> > > 
> > > diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> > > index 3a48b54c6405..bc6ad7789568 100644
> > > --- a/arch/arm/kernel/signal.c
> > > +++ b/arch/arm/kernel/signal.c
> > > @@ -573,12 +573,11 @@ do_work_pending(struct pt_regs *regs, unsigned
> > > int thread_flags, int syscall)
> > > ? */
> > > ? trace_hardirqs_off();
> > > 
> > > - /* Check valid user FS if needed */
> > > - addr_limit_user_check();
> > > -
> > > ? do {
> > > ? if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > > ? schedule();
> > > + } else if (thread_flags & _TIF_FSCHECK) {
> > > + addr_limit_user_check();
> > > ? } else {
> > > ? if (unlikely(!user_mode(regs)))
> > > ? return 0;
> > This does seem to work, it no longer hangs on boot in my setup. This is
> > obviously only a very superficial test.
> > 
> > The new location of this check seems weird, it's not clear why it
> > should be on an else path. Perhaps it should be moved to right before
> > where current_thread_info()->flags is fetched again?

> I was hitting bug when I tried that.I think that's because you
> basically let the signal handler do pending work before you check the
> flag, that's not a good idea.

> > If the purpose is hardening against buggy kernel code doing bad set_fs
> > calls shouldn't this flag also be checked before looking at
> > TIF_NEED_RESCHED and calling schedule()?
> I am not sure to be honest. I expected schedule to only schedule the
> processor to another task which would be fine given only the current
> task have a bogus fs. I will put it first in case there is an edge
> case scenario I missed.
> 
> What do you think? Let me know and I will look at changes all
> architectures and testing them.

I don't know and I'd rather not guess on security issues. It's better
if someone else reviews the code.

Unless there is a very quick fix maybe this series should be removed or
reverted from linux-next? A diagnosis of "system calls can sometimes
hang on return" seems serious even for linux-next. Since it happens
very rarely in most setups I can easily imagine somebody spending a lot
of time digging at this.

--
Regards,
Leonard

WARNING: multiple messages have this Message-ID (diff)
From: Leonard Crestez <leonard.crestez@nxp.com>
To: Thomas Garnier <thgarnie@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Rik van Riel" <riel@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Petr Mladek <pmladek@suse.com>, Miroslav Benes <mbenes@suse.cz>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Arnd Bergmann <arnd@arndb.de>,
	Dave Hansen <dave.hansen@intel.com>,
	David Howells <dhowells@redhat.com>,
	Russell King <linux@armlinux.org.uk>,
	Andy Lutomirski <luto@amacapital.net>,
	Will Drewry <wad@chromium.org>, Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	"Pratyush Anand" <panand@redhat.com>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Linux API <linux-api@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	Octavian Purdila <octavian.purdila@nxp.com>
Subject: Re: [PATCH v10 2/3] arm/syscalls: Check address limit on user-mode return
Date: Wed, 19 Jul 2017 17:58:20 +0300	[thread overview]
Message-ID: <1500476300.22834.13.camel@nxp.com> (raw)
In-Reply-To: <CAJcbSZFr9KJTfGfiZo2fThoDkAE-D1OFf2YtELq4P6jX8syesQ@mail.gmail.com>

On Tue, 2017-07-18 at 12:04 -0700, Thomas Garnier wrote:
> On Tue, Jul 18, 2017 at 10:18 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > On Tue, 2017-07-18 at 09:04 -0700, Thomas Garnier wrote:
> > > On Tue, Jul 18, 2017 at 7:36 AM, Leonard Crestez <leonard.crestez@nxp.com> wrote:
> > > > On Wed, 2017-06-14 at 18:12 -0700, Thomas Garnier wrote:
> > > > > 
> > > > > Ensure the address limit is a user-mode segment before returning to
> > > > > user-mode. Otherwise a process can corrupt kernel-mode memory and
> > > > > elevate privileges [1].
> > > > > 
> > > > > The set_fs function sets the TIF_SETFS flag to force a slow path on
> > > > > return. In the slow path, the address limit is checked to be USER_DS if
> > > > > needed.
> > > > > 
> > > > > The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
> > > > > for arm instruction immediate support. The global work mask is too big
> > > > > to used on a single instruction so adapt ret_fast_syscall.
> > > > > 
> > > > > @@ -571,6 +572,10 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
> > > > >        * Update the trace code with the current status.
> > > > >        */
> > > > >       trace_hardirqs_off();
> > > > > +
> > > > > +     /* Check valid user FS if needed */
> > > > > +     addr_limit_user_check();
> > > > > +
> > > > >       do {
> > > > >               if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > > > >                       schedule();
> > > > This patch made it's way into linux-next next-20170717 and it seems to
> > > > cause hangs when booting some boards over NFS (found via bisection). I
> > > > don't know exactly what determines the issue but I can reproduce hangs
> > > > if even if I just boot with init=/bin/bash and do stuff like
> > > > 
> > > > # sleep 1 & sleep 1 & sleep 1 & wait; wait; wait; echo done!
> > > > 
> > > > When this happens sysrq-t shows a sleep task hung in the 'R' state
> > > > spinning in do_work_pending, so maybe there is a potential infinite
> > > > loop here?
> > > > 
> > > > The addr_limit_user_check at the start of do_work_pending will check
> > > > for TIF_FSCHECK once and clear it but the function loops while
> > > > (thread_flags & _TIF_WORK_MASK), so it if TIF_FSCHECK is set again then
> > > > the loop will never terminate. Does this make sense?
> > > 
> > > Yes, it does. Thanks for looking into this.
> > > 
> > > Can you try this change?
> > > 
> > > diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> > > index 3a48b54c6405..bc6ad7789568 100644
> > > --- a/arch/arm/kernel/signal.c
> > > +++ b/arch/arm/kernel/signal.c
> > > @@ -573,12 +573,11 @@ do_work_pending(struct pt_regs *regs, unsigned
> > > int thread_flags, int syscall)
> > >   */
> > >   trace_hardirqs_off();
> > > 
> > > - /* Check valid user FS if needed */
> > > - addr_limit_user_check();
> > > -
> > >   do {
> > >   if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> > >   schedule();
> > > + } else if (thread_flags & _TIF_FSCHECK) {
> > > + addr_limit_user_check();
> > >   } else {
> > >   if (unlikely(!user_mode(regs)))
> > >   return 0;
> > This does seem to work, it no longer hangs on boot in my setup. This is
> > obviously only a very superficial test.
> > 
> > The new location of this check seems weird, it's not clear why it
> > should be on an else path. Perhaps it should be moved to right before
> > where current_thread_info()->flags is fetched again?

> I was hitting bug when I tried that.I think that's because you
> basically let the signal handler do pending work before you check the
> flag, that's not a good idea.

> > If the purpose is hardening against buggy kernel code doing bad set_fs
> > calls shouldn't this flag also be checked before looking at
> > TIF_NEED_RESCHED and calling schedule()?
> I am not sure to be honest. I expected schedule to only schedule the
> processor to another task which would be fine given only the current
> task have a bogus fs. I will put it first in case there is an edge
> case scenario I missed.
> 
> What do you think? Let me know and I will look at changes all
> architectures and testing them.

I don't know and I'd rather not guess on security issues. It's better
if someone else reviews the code.

Unless there is a very quick fix maybe this series should be removed or
reverted from linux-next? A diagnosis of "system calls can sometimes
hang on return" seems serious even for linux-next. Since it happens
very rarely in most setups I can easily imagine somebody spending a lot
of time digging at this.

--
Regards,
Leonard

  reply	other threads:[~2017-07-19 14:58 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-15  1:12 [kernel-hardening] [PATCH v10 1/3] x86/syscalls: Check address limit on user-mode return Thomas Garnier
2017-06-15  1:12 ` Thomas Garnier
2017-06-15  1:12 ` Thomas Garnier
2017-06-15  1:12 ` [kernel-hardening] [PATCH v10 2/3] arm/syscalls: " Thomas Garnier
2017-06-15  1:12   ` Thomas Garnier
2017-06-15  1:12   ` Thomas Garnier
2017-06-20 20:18   ` [kernel-hardening] " Kees Cook
2017-06-20 20:18     ` Kees Cook
2017-06-20 20:18     ` Kees Cook
2017-06-20 20:18     ` Kees Cook
2017-06-20 20:31     ` [kernel-hardening] " Thomas Garnier
2017-06-20 20:31       ` Thomas Garnier
2017-06-20 20:31       ` Thomas Garnier
2017-06-20 20:31       ` Thomas Garnier
2017-06-21  9:08       ` [kernel-hardening] " Will Deacon
2017-06-21  9:08         ` Will Deacon
2017-06-21  9:08         ` Will Deacon
2017-06-21  9:08         ` Will Deacon
2017-07-08 12:10   ` [tip:x86/syscall] " tip-bot for Thomas Garnier
2017-07-18 14:36   ` [kernel-hardening] Re: [PATCH v10 2/3] " Leonard Crestez
2017-07-18 14:36     ` Leonard Crestez
2017-07-18 14:36     ` Leonard Crestez
2017-07-18 14:36     ` Leonard Crestez
2017-07-18 16:04     ` [kernel-hardening] " Thomas Garnier
2017-07-18 16:04       ` Thomas Garnier
2017-07-18 16:04       ` Thomas Garnier
2017-07-18 16:04       ` Thomas Garnier
2017-07-18 17:18       ` [kernel-hardening] " Leonard Crestez
2017-07-18 17:18         ` Leonard Crestez
2017-07-18 17:18         ` Leonard Crestez
2017-07-18 17:18         ` Leonard Crestez
2017-07-18 19:04         ` [kernel-hardening] " Thomas Garnier
2017-07-18 19:04           ` Thomas Garnier
2017-07-18 19:04           ` Thomas Garnier
2017-07-18 19:04           ` Thomas Garnier
2017-07-19 14:58           ` Leonard Crestez [this message]
2017-07-19 14:58             ` Leonard Crestez
2017-07-19 14:58             ` Leonard Crestez
2017-07-19 14:58             ` Leonard Crestez
2017-07-19 16:51             ` [kernel-hardening] " Thomas Garnier
2017-07-19 16:51               ` Thomas Garnier
2017-07-19 16:51               ` Thomas Garnier
2017-07-19 16:51               ` Thomas Garnier
2017-07-19 17:06             ` [kernel-hardening] " Russell King - ARM Linux
2017-07-19 17:06               ` Russell King - ARM Linux
2017-07-19 17:06               ` Russell King - ARM Linux
2017-07-19 17:06               ` Russell King - ARM Linux
2017-07-19 17:20               ` [kernel-hardening] " Thomas Garnier
2017-07-19 17:20                 ` Thomas Garnier
2017-07-19 17:20                 ` Thomas Garnier
2017-07-19 18:35                 ` Russell King - ARM Linux
2017-07-19 18:35                   ` Russell King - ARM Linux
2017-07-19 18:35                   ` Russell King - ARM Linux
2017-07-19 18:50                   ` Thomas Garnier
2017-07-19 18:50                     ` Thomas Garnier
2017-07-19 18:50                     ` Thomas Garnier
2017-06-15  1:12 ` [kernel-hardening] [PATCH v10 3/3] arm64/syscalls: " Thomas Garnier
2017-06-15  1:12   ` Thomas Garnier
2017-06-15  1:12   ` Thomas Garnier
2017-06-21  8:16   ` [kernel-hardening] " Catalin Marinas
2017-06-21  8:16     ` Catalin Marinas
2017-06-21  8:16     ` Catalin Marinas
2017-06-21  8:16     ` Catalin Marinas
2017-06-21 13:57     ` [kernel-hardening] " Thomas Garnier
2017-06-21 13:57       ` Thomas Garnier
2017-06-21 13:57       ` Thomas Garnier
2017-06-21 13:57       ` Thomas Garnier
2017-07-08 12:10   ` [tip:x86/syscall] " tip-bot for Thomas Garnier
2017-06-20 20:24 ` [kernel-hardening] Re: [PATCH v10 1/3] x86/syscalls: " Kees Cook
2017-06-20 20:24   ` Kees Cook
2017-06-20 20:24   ` Kees Cook
2017-06-20 20:24   ` Kees Cook
2017-06-28 17:52   ` [kernel-hardening] " Kees Cook
2017-06-28 17:52     ` Kees Cook
2017-06-28 17:52     ` Kees Cook
2017-06-28 17:52     ` Kees Cook
2017-07-06 20:38     ` [kernel-hardening] " Thomas Garnier
2017-07-06 20:38       ` Thomas Garnier
2017-07-06 20:38       ` Thomas Garnier
2017-07-06 20:38       ` Thomas Garnier
2017-07-06 20:48       ` [kernel-hardening] " Thomas Gleixner
2017-07-06 20:48         ` Thomas Gleixner
2017-07-06 20:48         ` Thomas Gleixner
2017-07-06 20:48         ` Thomas Gleixner
2017-07-06 20:52         ` [kernel-hardening] " Thomas Garnier
2017-07-06 20:52           ` Thomas Garnier
2017-07-06 20:52           ` Thomas Garnier
2017-07-06 20:52           ` Thomas Garnier
2017-07-08 12:09 ` [tip:x86/syscall] " tip-bot for Thomas Garnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1500476300.22834.13.camel@nxp.com \
    --to=leonard.crestez@nxp.com \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=cmetcalf@mellanox.com \
    --cc=dave.hansen@intel.com \
    --cc=dhowells@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mbenes@suse.cz \
    --cc=mingo@redhat.com \
    --cc=octavian.purdila@nxp.com \
    --cc=oleg@redhat.com \
    --cc=panand@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pmladek@suse.com \
    --cc=riel@redhat.com \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=thgarnie@google.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wad@chromium.org \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.