From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Denys Vlasenko <vda.linux@googlemail.com>,
Borislav Petkov <bp@alien8.de>, X86 ML <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: For your amusement: slightly faster syscalls
Date: Thu, 18 Jun 2015 10:01:06 +0200
Message-ID: <20150618080106.GA11473@gmail.com>
In-Reply-To: <CALCETrXPiKkhPM7xhkCLtYc1VHgaedNTK8y5LymF135Zg-c8oQ@mail.gmail.com>
* Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Jun 15, 2015 at 2:42 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> > On 06/15/2015 02:30 PM, Linus Torvalds wrote:
> >>
> >> On Jun 12, 2015 2:09 PM, "Andy Lutomirski" <luto@amacapital.net
> >> <mailto:luto@amacapital.net>> wrote:
> >>>
> >>> Caveat emptor: it also disables SMP.
> >>
> >> OK, I don't think it's interesting in that form.
> >>
> >> For small cpu counts, I guess we could have per-cpu syscall entry points
> >> (unless the syscall entry msr is shared across hyperthreading? Some msr's are
> >> per thread, others per core, AFAIK), and it could actually work that way.
> >>
> >> But I'm not sure the three cycles is worth the worry and the complexity.
> >
> > We discussed the per-cpu syscall entry point, and the issue at hand is that it
> > is very hard to do that without with fairly high probability touch another
> > cache line and quite possibly another page (and hence a TLB entry.)
( So apparently I wasn't Cc:ed, or gmail ate the mail - so I can only guess from
the surrounding discussion what this patch does, as my lkml folder is still
doing a long refresh ... )
>
> I think this isn't actually true. If we were going to do a per-cpu syscall
> entry point, then we might as well duplicate all of the entry code per cpu
> instead of just a short trampoline. That would avoid extra TLB misses and (L1)
> cache misses, I think.
>
> I still think this is far too complicated for three cycles. I was hoping for
> more.
The other problem with duplicating the entry code is that per-CPU entry code
splits its cache footprint in the higher-level caches (such as the L2, but also
the L3 cache).
The interesting number to check would be cache-cold entry performance, not the
cache-hot one: the NUMA latency advantage of having per-node copies of the entry
code might be worth it.
... and that's why UP is the least interesting case ;-)
Thanks,
Ingo