public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Jones <davej@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Richard Guy Briggs <rgb@redhat.com>,
	Eric Paris <eparis@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] context_tracking: Restore previous state in schedule_user
Date: Thu, 4 Dec 2014 02:13:18 +0100	[thread overview]
Message-ID: <20141204011316.GH31369@lerouge> (raw)
In-Reply-To: <CALCETrXSsTa76QGg+MHDAkrr9gJPSqm2zKrcTbU5SV31mWDVuQ@mail.gmail.com>

On Wed, Dec 03, 2014 at 04:38:46PM -0800, Andy Lutomirski wrote:
> On Wed, Dec 3, 2014 at 4:30 PM, Dave Jones <davej@redhat.com> wrote:
> > On Wed, Dec 03, 2014 at 04:04:31PM -0800, Andy Lutomirski wrote:
> >  > On Wed, Dec 3, 2014 at 3:58 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >  > > On Wed, Dec 03, 2014 at 03:18:41PM -0800, Andy Lutomirski wrote:
> >  > >> It appears that some SCHEDULE_USER (asm for schedule_user) callers
> >  > >> in arch/x86/kernel/entry_64.S are called from RCU kernel context,
> >  > >> and schedule_user will return in RCU user context.  This causes RCU
> >  > >> warnings and possible failures.
> >  > >>
> >  > >> This is intended to be a minimal fix suitable for 3.18.
> >  > >>
> >  > >> Reported-by: Dave Jones <davej@redhat.com>
> >  > >> Cc: Oleg Nesterov <oleg@redhat.com>
> >  > >> Cc: Frédéric Weisbecker <fweisbec@gmail.com>
> >  > >> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
> >  > >> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> >  > >
> >  > > Ah, we sent it about at the same time :-)
> >  > >
> >  > > Might be too late for 3.18 though because it's not a regression.
> >
> > Wait, so how come that trace didn't start showing up until recently ?
> 
> Looking at the code, it's because int_careful has the same bug, but
> syscall_trace_leave does:
> 
>     /*
>      * We may come here right after calling schedule_user()
>      * or do_notify_resume(), in which case we can be in RCU
>      * user mode.
>      */
>     user_exit();
> 
> which means that this issue was anticipated when that comment was written.

Indeed, in fact it was expected to work as long as the code that follows the
syscall is limited to schedule_user(), syscall_trace_leave() and do_notify_resume().
But if anything else is called and uses RCU, this doesn't work anymore.

So user_enter() and user_exit() have been designed to be re-entrant on purpose.

> 
> Prior to the 3.18 seccomp changes and the _TIF_WORK typo fix, it would
> have been difficult to hit sysret_audit when context tracking was on
> (you could do it once on the way out from a syscall that enabled
> context tracking).  So this is 3.18 regression.

I see now.

So the real problem is not on schedule_user(). It's rather that __audit_syscall_exit()
should we wrapped inside user_exit()/user_enter() or exception_foo(). The latter
is safer in a sensitive patch. That would be the real and simple regression fix.
Tweaking schedule_user() is more risky.

Then, if you like, we can rethink the whole later, define syscall_trace_leave()
as the only place that calls user_enter() and all the other syscall exit functions
(schedule_user(), do_notify_resume(), __audit_syscall_exit()) can just call
exception_enter() - exception_exit() if they can be called after syscall_trace_leave().
Then finally we can make user_enter and user_exit non-reentrant after careful audit
of how other archs use it (sounds scary though).

Or better yet: if you rework the syscall exit slow path, lets call user_enter() at the
very end of the syscall.

> 
> The sysret_audit code is still totally screwed up AFAICT.  At the very
> least, the whole mess rather strongly suggests that, if both context
> tracking and audit are on, then __audit_syscall_exit is called *twice*
> on each syscall.  __audit_syscall_exit seems to be idempotent, so
> maybe no one has noticed that little glitch.
> 
> I'll ask the x86 people to include my sysret_audit removal for 3.19,
> since I think that this schedule_user change is a better last-minute
> fix than removing a whole chunk of asm.
> 
> --Andy
> 
> >
> >         Dave
> >
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC

  reply	other threads:[~2014-12-04  1:13 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-03 18:19 audit: rcu_read_lock() used illegally while idle Dave Jones
2014-12-03 19:29 ` Paul E. McKenney
2014-12-03 20:06   ` Andy Lutomirski
2014-12-03 20:19     ` Dave Jones
2014-12-03 20:38       ` Andy Lutomirski
2014-12-03 22:08         ` Frederic Weisbecker
2014-12-03 22:12           ` Andy Lutomirski
2014-12-03 23:49             ` Frederic Weisbecker
2014-12-03 23:18           ` [PATCH] context_tracking: Restore previous state in schedule_user Andy Lutomirski
2014-12-03 23:26             ` Andy Lutomirski
2014-12-03 23:31             ` Dave Jones
2014-12-03 23:58             ` Frederic Weisbecker
2014-12-04  0:04               ` Andy Lutomirski
2014-12-04  0:30                 ` Dave Jones
2014-12-04  0:38                   ` Andy Lutomirski
2014-12-04  1:13                     ` Frederic Weisbecker [this message]
2014-12-03 23:37           ` [PATCH v2] " Andy Lutomirski
2014-12-03 23:50             ` Paul E. McKenney
2014-12-04  0:01               ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141204011316.GH31369@lerouge \
    --to=fweisbec@gmail.com \
    --cc=davej@redhat.com \
    --cc=eparis@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=rgb@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox