* Re: [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO [not found] <20180814011732.075EA6E0251@virtux32.softrans.com.au> @ 2018-08-14 14:20 ` Andy Lutomirski 2018-08-14 16:30 ` David Woodhouse 0 siblings, 1 reply; 3+ messages in thread From: Andy Lutomirski @ 2018-08-14 14:20 UTC (permalink / raw) To: Matt Rickard, LKML, Linus Torvalds, Dave Hansen, David Woodhouse, X86 ML, Kees Cook [Added a whole bunch of ccs] On Mon, Aug 13, 2018 at 6:17 PM, Matt Rickard <matt@softrans.com.au> wrote: > Process clock_gettime(CLOCK_TAI) in VDSO. This makes the call about as fast as > CLOCK_REALTIME instead of taking about four times as long. > > Signed-off-by: Matt Rickard <matt@softrans.com.au> > --- > arch/x86/entry/vdso/vclock_gettime.c | 30 ++++++++++++++++++++++++++++++ > arch/x86/entry/vsyscall/vsyscall_gtod.c | 2 ++ > arch/x86/include/asm/vgtod.h | 1 + > 3 files changed, 33 insertions(+) > > diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c > index f19856d95c60..bc8d8f086721 100644 > --- a/arch/x86/entry/vdso/vclock_gettime.c > +++ b/arch/x86/entry/vdso/vclock_gettime.c ... > notrace static void do_realtime_coarse(struct timespec *ts) > { > unsigned long seq; > @@ -284,8 +305,17 @@ notrace int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) > do_monotonic_coarse(ts); > break; > default: > + /* Doubled switch statement to work around kernel Makefile error */ > + /* See: https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg567499.html */ NAK. The issue here (after reading that thread) is that, with our current compile options, gcc generates a jump table once the switch statement hits five entries. And it uses retpolines for it, and somehow it generates the relocations in such a way that the vDSO build fails. We need to address this so that the vDSO build is reliable, but there's an important question here: Should the vDSO be built with retpolines, or should it be built with indirect branches? Or should we go out of our way to make sure that the vDSO contains neither retpolines nor indirect branches? We could accomplish the latter (sort of) by manually converting the switch into the appropriate if statements, but that's rather ugly. (Hmm. We should add exports to directly read each clock source. They'll be noticeably faster, especially when cache-and-predictor-code.) ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO 2018-08-14 14:20 ` [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO Andy Lutomirski @ 2018-08-14 16:30 ` David Woodhouse 2018-08-14 21:20 ` Andy Lutomirski 0 siblings, 1 reply; 3+ messages in thread From: David Woodhouse @ 2018-08-14 16:30 UTC (permalink / raw) To: Andy Lutomirski, Matt Rickard, LKML, Linus Torvalds, Dave Hansen, X86 ML, Kees Cook [-- Attachment #1: Type: text/plain, Size: 1480 bytes --] On Tue, 2018-08-14 at 07:20 -0700, Andy Lutomirski wrote: > > + /* Doubled switch statement to work around kernel Makefile error */ > > + /* See: https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg567499.html */ > > NAK. > > The issue here (after reading that thread) is that, with our current > compile options, gcc generates a jump table once the switch statement > hits five entries. And it uses retpolines for it, and somehow it > generates the relocations in such a way that the vDSO build fails. > We > need to address this so that the vDSO build is reliable, but there's > an important question here: > > Should the vDSO be built with retpolines, or should it be built with > indirect branches? Or should we go out of our way to make sure that > the vDSO contains neither retpolines nor indirect branches? > > We could accomplish the latter (sort of) by manually converting the > switch into the appropriate if statements, but that's rather ugly. > > (Hmm. We should add exports to directly read each clock source. > They'll be noticeably faster, especially when > cache-and-predictor-code.) Surely it's kind of expected that the vDSO can't find an externally provided __x86_indirect_thunk_rax symbol, since we only provide one as part of the kernel image. Building the vDSO with -mindirect-branch=thunk(|-inline) should fix that, if we want retpolines in the vDSO. There's also -fno-jump-tables. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5213 bytes --] ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO 2018-08-14 16:30 ` David Woodhouse @ 2018-08-14 21:20 ` Andy Lutomirski 0 siblings, 0 replies; 3+ messages in thread From: Andy Lutomirski @ 2018-08-14 21:20 UTC (permalink / raw) To: David Woodhouse Cc: Andy Lutomirski, Matt Rickard, LKML, Linus Torvalds, Dave Hansen, X86 ML, Kees Cook On Tue, Aug 14, 2018 at 9:30 AM, David Woodhouse <dwmw2@infradead.org> wrote: > On Tue, 2018-08-14 at 07:20 -0700, Andy Lutomirski wrote: >> > + /* Doubled switch statement to work around kernel Makefile error */ >> > + /* See: https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg567499.html */ >> >> NAK. >> >> The issue here (after reading that thread) is that, with our current >> compile options, gcc generates a jump table once the switch statement >> hits five entries. And it uses retpolines for it, and somehow it >> generates the relocations in such a way that the vDSO build fails. >> We >> need to address this so that the vDSO build is reliable, but there's >> an important question here: >> >> Should the vDSO be built with retpolines, or should it be built with >> indirect branches? Or should we go out of our way to make sure that >> the vDSO contains neither retpolines nor indirect branches? >> >> We could accomplish the latter (sort of) by manually converting the >> switch into the appropriate if statements, but that's rather ugly. >> >> (Hmm. We should add exports to directly read each clock source. >> They'll be noticeably faster, especially when >> cache-and-predictor-code.) > > Surely it's kind of expected that the vDSO can't find an externally > provided __x86_indirect_thunk_rax symbol, since we only provide one as > part of the kernel image. > > Building the vDSO with -mindirect-branch=thunk(|-inline) should fix > that, if we want retpolines in the vDSO. I think that, if we want retpolines in the kernel, we probably want them in the vDSO as well. Although there's an argument to be made that IBPB gives enough protection, at least against most targets. > > There's also -fno-jump-tables. I'll probably do this, conditioned on CONFIG_RETPOLINE. Or we should do it kernel-wide. hjl filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952 based on my comment in the other bug report. gcc seems to be generating jump tables when it shouldn't be doing so. ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-08-14 21:21 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20180814011732.075EA6E0251@virtux32.softrans.com.au>
2018-08-14 14:20 ` [PATCH] Handle clock_gettime(CLOCK_TAI) in VDSO Andy Lutomirski
2018-08-14 16:30 ` David Woodhouse
2018-08-14 21:20 ` Andy Lutomirski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox