* [PATCH v2 00/11] arm64: vdso: getcpu() support
@ 2020-07-01 20:28 Mark Brown
From: Mark Brown @ 2020-07-01 20:28 UTC
  To: Catalin Marinas, Will Deacon, Marc Zyngier
  Cc: Mark Brown, Andrei Vagin, Vincenzo Frascino, linux-arm-kernel

This series is a rebase of the previously posted getcpu() support, with
additional patches 5-10 which attempt some cleanups and clarifications
of the vDSO code and extend it to multi-page support.  Those patches
are currently drafts and haven't been fully tested or considered;
they're posted because there was some discussion of other applications
of the per-CPU data, so it seemed useful to share this in-progress
work.

Some applications, especially tracing ones, benefit from avoiding the
syscall overhead for getcpu(), so it is common for architectures to have
vDSO implementations. Add one for arm64, using TPIDRRO_EL0 to pass a
pointer to per-CPU data rather than just storing the immediate value, in
order to allow for future extensibility.
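
To illustrate the pointer-versus-immediate design choice described above,
here is a minimal portable model in plain C. It is not the code from this
series: the struct layout, the names (percpu_vdso_data, thread_reg,
model_getcpu and friends) and the CPU-to-node assignment are all
hypothetical, and an ordinary variable stands in for TPIDRRO_EL0.

```c
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* assumption: tiny fixed CPU count for the model */

/* Hypothetical per-CPU record; fields could be appended later without
 * changing how readers locate the record. */
struct percpu_vdso_data {
	unsigned int cpu;
	unsigned int node;
};

static struct percpu_vdso_data percpu_data[MODEL_NR_CPUS];

/* Stands in for TPIDRRO_EL0: holds a pointer, not the CPU number itself. */
static const struct percpu_vdso_data *thread_reg;

/* Fill in the records once; the node assignment (two CPUs per node) is
 * made up purely for illustration. */
static void model_init(void)
{
	for (unsigned int i = 0; i < MODEL_NR_CPUS; i++) {
		percpu_data[i].cpu = i;
		percpu_data[i].node = i / 2;
	}
}

/* Models the kernel pointing the register at the current CPU's record. */
static int model_switch_to_cpu(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -1;
	thread_reg = &percpu_data[cpu];
	return 0;
}

/* Models the reader side: dereference the pointer, no syscall involved.
 * Either output pointer may be NULL, as with getcpu(2). */
static int model_getcpu(unsigned int *cpu, unsigned int *node)
{
	const struct percpu_vdso_data *d = thread_reg;

	if (!d)
		return -1;
	if (cpu)
		*cpu = d->cpu;
	if (node)
		*node = d->node;
	return 0;
}
```

Storing the CPU number directly in the register would make the read one
step shorter, but the register could then never carry anything else; with
a pointer, the record can grow without touching the register contents.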

It is questionable whether something TPIDRRO_EL0 based is worthwhile at
all on current kernels. Since v4.18 we have had support for restartable
sequences, which can be used to provide a sched_getcpu() implementation
with generally better performance than the vDSO approach on
architectures which have that[1].  Work is ongoing to implement this for
glibc:

    https://lore.kernel.org/lkml/20200527185130.5604-3-mathieu.desnoyers@efficios.com/

but is not yet merged and will need similar work for other userspaces.
The main advantages for the vDSO implementation are the node parameter
(though this is a static mapping to CPU number so could be looked up
separately when processing data if it's needed; it shouldn't need to be
in the hot path) and ease of implementation for users.
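
As a rough userspace sketch of the point about the node parameter: if the
CPU-to-node mapping is static, a tool can learn it outside the hot path
and resolve nodes later from recorded CPU numbers. The helper names and
the table size below are hypothetical, the table is not thread-safe, and
the raw getcpu syscall is used here only because it is available
everywhere; none of this is taken from the series itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_TRACKED_CPUS 1024	/* assumption: arbitrary table size */

/* node_plus_one[cpu] == 0 means the mapping has not been learned yet. */
static unsigned int node_plus_one[MAX_TRACKED_CPUS];

/* Slow path: one getcpu syscall returning both CPU and NUMA node. */
static int getcpu_with_node(unsigned int *cpu, unsigned int *node)
{
	return (int)syscall(SYS_getcpu, cpu, node, NULL);
}

/* Setup or slow path: remember which node the current CPU belongs to.
 * Returns the CPU number on success, -1 on failure. */
static int learn_current_mapping(void)
{
	unsigned int cpu, node;

	if (getcpu_with_node(&cpu, &node) != 0 || cpu >= MAX_TRACKED_CPUS)
		return -1;
	node_plus_one[cpu] = node + 1;
	return (int)cpu;
}

/* Post-processing: resolve a recorded CPU number to its node, or -1 if
 * that CPU's mapping was never learned. */
static int lookup_node(int cpu)
{
	if (cpu < 0 || cpu >= MAX_TRACKED_CPUS || node_plus_one[cpu] == 0)
		return -1;
	return (int)node_plus_one[cpu] - 1;
}
```

With something like this, the hot path only needs to record the CPU
number, and lookup_node() can be applied when the data is processed.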

This is currently not compatible with KPTI due to the use of TPIDRRO_EL0
by the KPTI trampoline. This could be addressed by reinitializing that
system register in the return path, but I have found it hard to justify
adding that overhead for all users for something that is essentially a
profiling optimization which is likely to get superseded by a more
modern implementation. If there are other uses for the per-CPU data
then the balance might change here.

There is some overlap with an in-flight patch series from Andrei Vagin
supporting time namespaces in the vDSO, but there shouldn't be a
fundamental issue integrating the two series.

This builds on work done by Kristina Martsenko some time ago but is a
new implementation.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7822b1e24f2df5df98c76f0e94a5416349ff759

v2:
 - Rebase on v5.8-rc3.
 - Add further cleanup patches & a first draft of multi-page support.

Mark Brown (11):
  arm64: vdso: Provide a define when building the vDSO
  arm64: vdso: Add per-CPU data
  arm64: vdso: Initialise the per-CPU vDSO data
  arm64: vdso: Add getcpu() implementation
  arm64: vdso: Remove union in declaration of the data store
  arm64: vdso: Document and verify alignment of vDSO text
  arm64: vdso: Rename vdso_pages to vdso_text_pages
  arm64: vdso: Simplify pagelist allocation
  arm64: vdso: Parameterise vDSO data length assumptions in code
  arm64: vdso: Support multiple pages of vDSO data
  selftests: vdso: Support arm64 in getcpu() test

 arch/arm64/include/asm/processor.h            |  12 +-
 arch/arm64/include/asm/vdso.h                 |  11 ++
 arch/arm64/include/asm/vdso/datapage.h        |  54 +++++++++
 arch/arm64/kernel/process.c                   |  26 +++-
 arch/arm64/kernel/vdso.c                      | 112 ++++++++++++------
 arch/arm64/kernel/vdso/Makefile               |   4 +-
 arch/arm64/kernel/vdso/vdso.lds.S             |   3 +-
 arch/arm64/kernel/vdso/vgetcpu.c              |  48 ++++++++
 .../testing/selftests/vDSO/vdso_test_getcpu.c |  10 ++
 9 files changed, 229 insertions(+), 51 deletions(-)
 create mode 100644 arch/arm64/include/asm/vdso/datapage.h
 create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c


base-commit: 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68
-- 
2.20.1

