From: Andrei Vagin <avagin@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>,
Dmitry Safonov <dima@arista.com>,
LKML <linux-kernel@vger.kernel.org>,
Adrian Reber <adrian@lisas.de>, Andrei Vagin <avagin@openvz.org>,
Andy Lutomirski <luto@kernel.org>,
Andy Tucker <agtucker@google.com>, Arnd Bergmann <arnd@arndb.de>,
Christian Brauner <christian.brauner@ubuntu.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Dmitry Safonov <0x7f454c46@gmail.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jeff Dike <jdike@addtoit.com>, Oleg Nesterov <oleg@redhat.com>,
Pavel Emelyanov <xemul@virtuozzo.com>,
Shuah Khan <shuah@kernel.org>,
containers@lists.linux-foundation.org, criu@openvz.org,
linux-api@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH 16/32] x86/vdso: Generate vdso{,32}-timens.lds
Date: Wed, 27 Mar 2019 11:00:00 -0700
Message-ID: <20190327175957.GA9309@gmail.com>
In-Reply-To: <alpine.DEB.2.21.1902081051410.1645@nanos.tec.linutronix.de>

While the generic vdso patchset is in development, we decided to
consider other ways of generating the two vdso libraries. In this
patchset we use a linker script, but it looks too complicated, so we
looked at other options. Another obvious approach is code patching. The
main idea was to reduce the amount of arch-dependent code, and Dmitry
came up with the idea of three labels.

Let's look at this pseudo-code:

	int vdso_clock_gettime(clockid_t clk, struct timespec *ts)
	{
		...
	l_call:
		clk_to_ns(clk, ts);
	l_return:
		return 0;
		annotate_reachable();
	l_out:
		nop();
		return 0;
	}

Here we can see three labels. Without patching, the function applies
the vdso offsets. But if we copy the code between the last two labels
(l_return and l_out) to the first label (l_call), we get a version which
skips the vdso offsets. The patch which implements this idea will be
sent in reply to this email. It was tested on x86_64 with gcc, but we
suspect that there might be issues on other architectures or with other
compilers. So we would like to ask the community for help in
understanding what we have to do to be sure that this code always works
correctly.
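
To make the copy step concrete, here is a rough user-space model of it.
This is only a sketch and is not taken from the patch: struct
retcall_site and patch_retcall() are hypothetical names, and the labels
are modelled as byte offsets into a buffer that holds the function's
machine code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical description of one patch site. Each field is the byte
 * offset of the label with the same name in the pseudo-code.
 */
struct retcall_site {
	size_t l_call;   /* start of the call to clk_to_ns() */
	size_t l_return; /* start of the "return 0" sequence */
	size_t l_out;    /* end of the "return 0" sequence */
};

/*
 * Copy the bytes in [l_return, l_out) over the code at l_call, so that
 * the function returns before it reaches the call.
 *
 * Returns 0 on success, or -1 if the labels are out of order, out of
 * bounds, or the return sequence does not fit in front of l_return.
 */
static int patch_retcall(uint8_t *code, size_t len,
			 const struct retcall_site *s)
{
	size_t n;

	if (s->l_call > s->l_return || s->l_return > s->l_out ||
	    s->l_out > len)
		return -1;

	n = s->l_out - s->l_return;
	if (n > s->l_return - s->l_call)
		return -1; /* the copy would overwrite its own source */

	memcpy(code + s->l_call, code + s->l_return, n);
	return 0;
}
```

The sketch only works if the compiler keeps the three labels in source
order and emits a return sequence that is position-independent and no
longer than the call it replaces. Those are exactly the properties we
are not sure about on other architectures and compilers.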

The second patch implements static_branch for the vdso code. It needs
only a few lines of arch-dependent code:

+static __always_inline bool timens_static_branch(void)
+{
+	asm_volatile_goto("1:\n\t"
+			  ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
+			  ".pushsection __retcall_table, \"aw\"\n\t"
+			  "2: .word 1b - 2b, %l[l_yes] - 2b\n\t"
+			  ".popsection\n\t"
+			  : : : : l_yes);
+
+	return false;
+l_yes:
+	return true;
+}

This is a slightly modified version of the arch_static_branch()
function. The timens code in the vdso looks like this:

	if (timens_static_branch()) {
		clk_to_ns(clk, ts);
	}

The vdso as compiled from sources will never execute clk_to_ns(). We
can then patch the 'no-op' in the straight-line code path with a 'jump'
instruction to the out-of-line true branch and get the timens version
of the vdso library.
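
As an illustration of that patching step, here is a rough user-space
model. It is a sketch under assumptions, not code from the patch:
patch_static_branch() and rd16le() are hypothetical names, the image is
a flat byte buffer, each table entry is read as the two 16-bit
little-endian words emitted by ".word 1b - 2b, %l[l_yes] - 2b", and the
no-op is assumed to be 5 bytes long so that an x86 "jmp rel32" fits
over it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP32_OPCODE 0xe9 /* x86 "jmp rel32" */
#define JMP32_SIZE   5    /* opcode byte plus 32-bit displacement */

/* Read a signed 16-bit little-endian value. */
static int16_t rd16le(const uint8_t *p)
{
	return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Patch one branch site described by the table entry at entry_off.
 * Both words of the entry are relative to the entry's own position:
 * the first locates the no-op, the second locates l_yes.
 *
 * Returns 0 on success, -1 if anything falls outside the image.
 */
static int patch_static_branch(uint8_t *image, size_t len, size_t entry_off)
{
	long code, target;
	uint32_t rel;
	int i;

	if (entry_off > len || len - entry_off < 4)
		return -1;

	code = (long)entry_off + rd16le(image + entry_off);
	target = (long)entry_off + rd16le(image + entry_off + 2);
	if (code < 0 || target < 0 ||
	    (size_t)code + JMP32_SIZE > len || (size_t)target >= len)
		return -1;

	/* The displacement is relative to the end of the jump. */
	rel = (uint32_t)(target - (code + JMP32_SIZE));

	image[code] = JMP32_OPCODE;
	for (i = 0; i < 4; i++)
		image[code + 1 + i] = (uint8_t)(rel >> (8 * i));
	return 0;
}
```

Because the entries are relative to their own position, the table needs
no relocations and the same loop could run over any copy of the vdso
image, which is what makes this variant attractive for generating the
timens version from the host one.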

Now we can compare these three versions. In our opinion the version
with three labels looks cleaner, and if it works with all compilers on
all architectures, we should probably choose it. Otherwise, we would
prefer the version with static branches, because it is simpler than the
version with the linker script.

Thanks,
Andrei
On Fri, Feb 08, 2019 at 10:57:57AM +0100, Thomas Gleixner wrote:
> On Thu, 7 Feb 2019, Rasmus Villemoes wrote:
>
> Cc: + Vincenzo, Will
>
> > On 06/02/2019 01.10, Dmitry Safonov wrote:
> > > As it has been discussed on timens RFC, adding a new conditional branch
> > > `if (inside_time_ns)` on VDSO for all processes is undesirable.
> > > It will add a penalty for everybody as branch predictor may mispredict
> > > the jump. Also there are instruction cache lines wasted on cmp/jmp.
> > >
> > > Those effects of introducing time namespace are very much unwanted
> > > having in mind how much work has been spent on micro-optimising
> > > the vdso code.
> > >
> > > Addressing those problems, there are two versions of VDSO's .so:
> > > for host tasks (without any penalty) and for processes inside of time
> > > namespace with clk_to_ns() that subtracts offsets from host's time.
> > >
> > > Unfortunately, to allow changing VDSO VMA on a running process,
> > > the entry points to VDSO should have the same offsets (addresses).
> > > That's needed as, e.g., an application that calls setns() may have
> > > already resolved VDSO symbols in GOT/PLT.
> >
> > These (14-19, if I'm reading them right) seem to add quite a lot of
> > complexity and fragility to the build, and other architectures would
> > probably have to add something similar to their vdso builds.
>
> Yes and we really want to avoid that. The VDSO implementations are
> pointlessly different across the architectures and there is an effort
> under way to consolidate them:
>
> https://lkml.kernel.org/r/20190115135539.24762-1-vincenzo.frascino@arm.com
>
> I talked to Vincenzo earlier this week and he's working on a new version of
> that. The timens stuff wants to go on top of the consolidation otherwise we
> end up with another set of pointlessly different and differently broken
> VDSO variants.
>
> Thanks,
>
> tglx