Re: [PATCH] x86/entry/64: Context-track syscalls before enabling interrupts

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Frederic Weisbecker <fweisbec@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
	Sasha Levin <sasha.levin@oracle.com>,
	Brian Gerst <brgerst@gmail.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH] x86/entry/64: Context-track syscalls before enabling interrupts
Date: Wed, 19 Aug 2015 19:10:09 +0200	[thread overview]
Message-ID: <20150819171007.GA21717@lerouge> (raw)
In-Reply-To: <CALCETrVqA0JVxZeD+7vMD-9xmF8udXw5RO0fwQA2X45DSVqM0Q@mail.gmail.com>

On Tue, Aug 18, 2015 at 04:07:51PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 18, 2015 at 4:02 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Aug 18, 2015 at 03:35:30PM -0700, Andy Lutomirski wrote:
> >> On Tue, Aug 18, 2015 at 3:16 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> > On Tue, Aug 18, 2015 at 12:11:59PM -0700, Andy Lutomirski wrote:
> >> >> This fixes a couple minor holes if we took an IRQ very early in syscall
> >> >> processing:
> >> >>
> >> >>  - We could enter the IRQ with CONTEXT_USER.  Everything worked (RCU
> >> >>    was fine), but we could warn if all the debugging options were
> >> >>    set.
> >> >
> >> > So this is fixing issues after your changes that call user_exit() from
> >> > IRQs, right?
> >>
> >> Yes.  Here's an example splat, courtesy of Sasha:
> >>
> >> https://gist.github.com/sashalevin/a006a44989312f6835e7
> >>
> >> >
> >> > But the IRQs aren't supposed to call user_exit(), they have their own hooks.
> >> > That's where the real issue is.
> >>
> >> In -tip, the assumption is that we *always* switch to CONTEXT_KERNEL
> >> when entering the kernel for a non-NMI reason.
> >
> > Why? IRQs don't need that! We already have irq_enter()/irq_exit().
> >
> 
> Those are certainly redundant.

So? What's the point in duplicating a hook in arch code that core code already
has?

> I want to have a real hook to call
> that says "switch to IRQ context from CONTEXT_USER" or "switch to IRQ
> context from CONTEXT_KERNEL" (aka noop), but that doesn't currently
> exist.

You're not answering _why_ you want that.

> 
> > And we don't want to call rcu_user_*() pairs on IRQs, you're
> > introducing a serious performance regression here! And I'm talking about
> > the code that's currently in -tip.
> 
> Is there an easy way to fix it?  For example, could we figure out what
> makes it take so long and make it faster?

Sure, just remove your arch IRQ hook.

> If we need to, we could
> back out the IRQ bit and change the assertions for 4.3, but I'd rather
> keep the exact context tracking if at all possible.

I have no idea what you mean by exact context tracking here.

But If we ever want to call irq_enter() using arch hooks, and I have no idea why
we would ever want to do that since that involve complexifying the code
by $NR_ARCHS and moving C code to ASM, we need serious reasons! And that's
certainly not something we are going to plan now for the next week's merge window.

> >> That means that we can
> >> avoid all of the (expensive!) checks for what context we're in.
> >
> > If you're referring to context tracking, the context check is a per-cpu
> > read. Not something that's usually considered expensive.
> 
> In -tip, there aren't even extra branches, except those imposed by the
> user_exit implementation.

No there is the "call enter_from_user_mode" in the IRQ fast path.

> 
> >
> >> It also means that (other than IRQs, which need further cleanup), we only
> >> switch once per user/kernel switch.
> >
> > ???
> 
> In 4.2 and before, we can switch multiple times on the way out of the
> kernel, via SCHEDULE_USER, do_notify_resume, etc.  In -tip, we do it
> exactly once no matter what.

That's what we want for syscalls but not for IRQs.

> 
> >
> >>
> >> The cost for doing should be essentially zero, modulo artifacts from
> >> poor inlining.
> >
> > And modulo rcu_user_*() that do multiple costly atomic_add_return() operations
> > implying full memory barriers. Plus the unnecessary vtime accounting that doubles
> > the existing one in irq_enter/exit() (those even imply a lock currently, which will
> > probably be turned to seqcount, but still, full memory barriers...).
> >
> > I'm sorry but I'm going to NACK any code that does that in IRQs (and again that
> > concerns current tip:x86/asm).
> 
> Why do we need these heavyweight barriers?

Actually it's not full barriers but atomic ones (smp_mb__after_atomic_stuff())
I suspect we can't do much better given RCU requirements.

Still we don't need to call it twice.

> 
> If there's actually a measurable performance hit in IRQs in -tip, then
> can we come up with a better fix?

I'm sure it's very easily measurable.

> For example, we could change all
> the new CT_WARN_ON calls to check "are we in CONTEXT_KERNEL or in IRQ
> context" and make the IRQ entry do a lighter weight context tracking
> operation.

I don't see what we need to check actually. Context tracking can be in any
state while in IRQ.

> 
> But I think I'm still missing something fundamental about the
> performance: why is irq_enter() any faster than user_exit()?

It's stlightly faster at least because it takes care of nesting IRQs which
is likely with softirqs that get interrupted.

Now of course we wouldn't call user_exit() in this case, but the hook is there
in generic code, no need for anything from the arch.

next prev parent reply	other threads:[~2015-08-19 17:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-18 19:11 [PATCH] x86/entry/64: Context-track syscalls before enabling interrupts Andy Lutomirski
2015-08-18 22:16 ` Frederic Weisbecker
2015-08-18 22:35   ` Andy Lutomirski
2015-08-18 23:02     ` Frederic Weisbecker
2015-08-18 23:07       ` Andy Lutomirski
2015-08-19 17:10         ` Frederic Weisbecker [this message]
2015-08-21  7:50           ` Ingo Molnar
2015-08-21 13:14             ` Frederic Weisbecker
2015-08-19  0:30 ` Andy Lutomirski
2015-08-19 15:54   ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150819171007.GA21717@lerouge \
    --to=fweisbec@gmail.com \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=dvlasenk@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=oleg@redhat.com \
    --cc=riel@redhat.com \
    --cc=sasha.levin@oracle.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.