linux-arm-kernel.lists.infradead.org archive mirror
* [RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS
@ 2017-08-24  5:42 Hoeun Ryu
  2017-08-24  5:42 ` [RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines Hoeun Ryu
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Hoeun Ryu @ 2017-08-24  5:42 UTC (permalink / raw)
  To: linux-arm-kernel

 On some 32-bit architectures, 64-bit accesses are atomic when certain
conditions are satisfied.

 For example, on ARM with LPAE (Large Physical Address Extension) enabled,
the 'ldrd/strd' (load/store doubleword) instructions are 64-bit atomic as
long as the address is 64-bit aligned. This guarantee exists so that the
64-bit wide descriptors introduced in the translation tables can be
accessed atomically, and it means that 's64/u64' variables can also be
accessed atomically on LPAE-enabled ARM machines when they are 8-byte
aligned.
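
 As a rough illustration of the alignment condition (a userspace sketch,
not kernel code), the following forces a 64-bit member to an 8-byte
boundary and checks it. 'struct sample' and 'is_aligned8()' are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure for illustration; not taken from the kernel. */
struct sample {
	uint32_t flags;
	_Alignas(8) uint64_t value;	/* force the 8-byte alignment */
};

/* Compile-time check: 'value' sits on an 8-byte boundary in the struct. */
_Static_assert(offsetof(struct sample, value) % 8 == 0,
	       "value must be 8-byte aligned");

/* Runtime check: returns 1 if p is 8-byte aligned, 0 otherwise. */
static inline int is_aligned8(const void *p)
{
	return ((uintptr_t)p & 7u) == 0;
}
```

 Calling is_aligned8(&s.value) on any 'struct sample s' should return 1.
Without the alignment specifier, some 32-bit ABIs may place a 64-bit
member on a 4-byte boundary only.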

 This series introduces 64BIT_ATOMIC_ACCESS and
64BIT_ATOMIC_ALIGNED_ACCESS, which can be true for 32-bit architectures as
well as 64-bit architectures.

 With them, we can optimize kernel code that uses a seqlock (timekeeping)
or a mimic of one (as in sched/cfs) simply to read or write 64-bit
variables.
 The existing code depends on CONFIG_64BIT to determine whether 64-bit
variables can be accessed directly or need additional synchronization
primitives such as a seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used
instead of CONFIG_64BIT in those cases.
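
 To make the two code paths concrete, here is a minimal userspace sketch
of the pattern. It is not the kernel implementation: the macro
SKETCH_64BIT_ATOMIC_ACCESS stands in for CONFIG_64BIT_ATOMIC_ACCESS, the
names 'rq_sketch', 'rq_write()' and 'rq_read()' are hypothetical, and C11
fences are used as an assumed stand-in for smp_wmb()/smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct cfs_rq. */
struct rq_sketch {
	volatile uint64_t min_vruntime;
#ifndef SKETCH_64BIT_ATOMIC_ACCESS
	volatile uint64_t min_vruntime_copy;	/* seqlock-like shadow copy */
#endif
};

/* Writer: publish the value, then (if needed) its copy after a barrier. */
static inline void rq_write(struct rq_sketch *rq, uint64_t v)
{
	rq->min_vruntime = v;
#ifndef SKETCH_64BIT_ATOMIC_ACCESS
	atomic_thread_fence(memory_order_release);	/* assumed ~ smp_wmb() */
	rq->min_vruntime_copy = v;
#endif
}

/* Reader: retry until value and copy agree, or read directly. */
static inline uint64_t rq_read(const struct rq_sketch *rq)
{
#ifndef SKETCH_64BIT_ATOMIC_ACCESS
	uint64_t v, copy;

	do {
		copy = rq->min_vruntime_copy;
		atomic_thread_fence(memory_order_acquire);	/* assumed ~ smp_rmb() */
		v = rq->min_vruntime;
	} while (v != copy);
	return v;
#else
	/* A single aligned 64-bit load is assumed to be atomic here. */
	return rq->min_vruntime;
#endif
}
```

 Building with -DSKETCH_64BIT_ATOMIC_ACCESS drops the copy and the retry
loop, which is the saving this series is after. The sketch only shows the
control flow; it makes no claim about the C11 memory model guarantees of
plain volatile accesses under real concurrency.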

 64BIT_ATOMIC_ALIGNED_ACCESS can be used in variable declarations, via
#ifdef, to tell the compiler about the alignment requirement
(__attribute__((aligned(8)))).

 The last patch "sched: depend on 64BIT_ATOMIC_ACCESS to determine if to
use min_vruntime_copy" is an example of this approach.

 I'd like to know what the architecture maintainers and kernel maintainers
think about this. I can provide more examples (mostly removing seqlocks
that only protect accesses to 64-bit variables on such machines) if this
approach is accepted.

Hoeun Ryu (3):
  arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit
    machines
  arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
  sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use
    min_vruntime_copy

 arch/Kconfig         | 20 ++++++++++++++++++++
 arch/arm/mm/Kconfig  |  2 ++
 kernel/sched/fair.c  |  6 +++---
 kernel/sched/sched.h |  6 +++++-
 4 files changed, 30 insertions(+), 4 deletions(-)

-- 
2.7.4


* [RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines
  2017-08-24  5:42 [RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS Hoeun Ryu
@ 2017-08-24  5:42 ` Hoeun Ryu
  2017-08-24  5:42 ` [RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines Hoeun Ryu
  2017-08-24  5:42 ` [RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy Hoeun Ryu
  2 siblings, 0 replies; 5+ messages in thread
From: Hoeun Ryu @ 2017-08-24  5:42 UTC (permalink / raw)
  To: linux-arm-kernel

 On some 32-bit architectures, 64-bit accesses are atomic when certain
conditions are satisfied.
 For example, on ARM with LPAE (Large Physical Address Extension) enabled,
the 'ldrd/strd' (load/store doubleword) instructions are 64-bit atomic as
long as the address is 64-bit aligned. This guarantee exists so that the
64-bit wide descriptors introduced in the translation tables can be
accessed atomically, and it means that 's64/u64' variables can also be
accessed atomically on LPAE-enabled ARM machines when they are 8-byte
aligned.

 By introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which
can be true for 32-bit architectures as well as 64-bit architectures,
we can optimize kernel code that uses a seqlock (timekeeping) or a mimic
of one (as in sched/cfs) simply to read or write 64-bit variables.
 The existing code depends on CONFIG_64BIT to determine whether 64-bit
variables can be accessed directly or need additional synchronization
primitives such as a seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used
instead of CONFIG_64BIT in those cases.
 64BIT_ATOMIC_ALIGNED_ACCESS can be used in variable declarations, via
#ifdef, to tell the compiler about the alignment requirement
(__attribute__((aligned(8)))).

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 arch/Kconfig | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 21d0089..1def331 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -115,6 +115,26 @@ config UPROBES
 	    managed by the kernel and kept transparent to the probed
 	    application. )
 
+config 64BIT_ATOMIC_ACCESS
+	def_bool 64BIT
+	help
+	  On some 32bit architectures as well as 64bit architectures,
+	  64bit accesses are atomic when certain conditions are satisfied.
+
+	  This symbol should be selected by an architecture if 64 bit
+	  accesses can be atomic.
+
+config 64BIT_ATOMIC_ALIGNED_ACCESS
+	def_bool n
+	depends on 64BIT_ATOMIC_ACCESS
+	help
+	  On 64BIT_ATOMIC_ACCESS enabled system, the address should be
+	  aligned by 8 to guarantee the accesses are atomic.
+
+	  This symbol should be selected by an architecture if 64 bit
+	  accesses are required to be 64 bit aligned to guarantee that
+	  the 64bit accesses are atomic.
+
 config HAVE_64BIT_ALIGNED_ACCESS
 	def_bool 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS
 	help
-- 
2.7.4


* [RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
  2017-08-24  5:42 [RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS Hoeun Ryu
  2017-08-24  5:42 ` [RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines Hoeun Ryu
@ 2017-08-24  5:42 ` Hoeun Ryu
  2017-08-24  5:42 ` [RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy Hoeun Ryu
  2 siblings, 0 replies; 5+ messages in thread
From: Hoeun Ryu @ 2017-08-24  5:42 UTC (permalink / raw)
  To: linux-arm-kernel

 The 'ldrd/strd' (load/store doubleword) instructions are 64-bit atomic as
long as the address is 64-bit aligned on LPAE (Large Physical Address
Extension) enabled architectures. This guarantee exists so that the 64-bit
wide descriptors introduced in the translation tables can be accessed
atomically.

 With 64BIT_ATOMIC_ACCESS enabled, kernel code that accesses 64-bit
variables can be optimized by omitting the seqlock or its mimic.
 Also enable 64BIT_ATOMIC_ALIGNED_ACCESS, because the 64-bit atomic access
is guaranteed only when the address is 64-bit aligned.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 arch/arm/mm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 60cdfdc..3142572 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -660,6 +660,8 @@ config ARM_LPAE
 	bool "Support for the Large Physical Address Extension"
 	depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \
 		!CPU_32v4 && !CPU_32v3
+	select 64BIT_ATOMIC_ACCESS
+	select 64BIT_ATOMIC_ALIGNED_ACCESS
 	help
 	  Say Y if you have an ARMv7 processor supporting the LPAE page
 	  table format and you would like to access memory beyond the
-- 
2.7.4


* [RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy
  2017-08-24  5:42 [RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS Hoeun Ryu
  2017-08-24  5:42 ` [RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines Hoeun Ryu
  2017-08-24  5:42 ` [RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines Hoeun Ryu
@ 2017-08-24  5:42 ` Hoeun Ryu
  2017-08-24  8:49   ` Peter Zijlstra
  2 siblings, 1 reply; 5+ messages in thread
From: Hoeun Ryu @ 2017-08-24  5:42 UTC (permalink / raw)
  To: linux-arm-kernel

 'min_vruntime_copy' is updated whenever 'min_vruntime' is updated for a
cfs_rq, and the reader side uses it to check whether the update of
'min_vruntime' has completed.
 Because 'min_vruntime' is a 64-bit variable, 32-bit machines need a mimic
of seqlock to make sure the variable is not read in the middle of an
update.

 On 64BIT_ATOMIC_ACCESS enabled machines, 64-bit accesses are atomic even
though the machines are 32-bit, so 'min_vruntime' can be accessed directly
on those architectures.

 Depend on CONFIG_64BIT_ATOMIC_ACCESS instead of CONFIG_64BIT to determine
whether 'min_vruntime_copy' is used for synchronization.
 Also align 'min_vruntime' to 8 bytes if 64BIT_ATOMIC_ALIGNED_ACCESS is
true, because such systems can access the variable atomically only when it
is aligned.

Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com>
---
 kernel/sched/fair.c  | 6 +++---
 kernel/sched/sched.h | 6 +++++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c95880e..840658f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -536,7 +536,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
 
 	/* ensure we never gain time by being placed backwards. */
 	cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime);
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
 	smp_wmb();
 	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
@@ -5975,7 +5975,7 @@ static void migrate_task_rq_fair(struct task_struct *p)
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
 		u64 min_vruntime;
 
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
 		u64 min_vruntime_copy;
 
 		do {
@@ -9173,7 +9173,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	cfs_rq->tasks_timeline = RB_ROOT;
 	cfs_rq->min_vruntime = (u64)(-(1LL << 20));
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
 	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index eeef1a3..870010b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -421,8 +421,12 @@ struct cfs_rq {
 	unsigned int nr_running, h_nr_running;
 
 	u64 exec_clock;
+#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS
 	u64 min_vruntime;
-#ifndef CONFIG_64BIT
+#else
+	u64 min_vruntime __attribute__((aligned(sizeof(u64))));
+#endif
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
 	u64 min_vruntime_copy;
 #endif
 
-- 
2.7.4


* [RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy
  2017-08-24  5:42 ` [RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy Hoeun Ryu
@ 2017-08-24  8:49   ` Peter Zijlstra
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2017-08-24  8:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 24, 2017 at 02:42:57PM +0900, Hoeun Ryu wrote:
> +#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS
>  	u64 min_vruntime;
> -#ifndef CONFIG_64BIT
> +#else
> +	u64 min_vruntime __attribute__((aligned(sizeof(u64))));
> +#endif

That's stupid, just make sure your platform defines u64 as naturally
aligned when you have this 64BIT_ATOMIC foo.

Also, please try and dig out more 32bit archs that can use this and make
sure to include performance numbers to justify this extra cruft.
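
 One way to read the natural-alignment suggestion above, sketched in
userspace C under assumptions: carry the alignment in the type instead of
wrapping each member in #ifdef. 'u64_aligned' and 'cfs_rq_sketch' are
hypothetical names, and the attribute syntax is the GCC/Clang extension
already used in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit type that always carries 8-byte alignment. */
typedef uint64_t u64_aligned __attribute__((aligned(8)));

/* Hypothetical cut-down structure; not the real struct cfs_rq. */
struct cfs_rq_sketch {
	unsigned int nr_running;
	u64_aligned min_vruntime;	/* no per-member #ifdef needed */
};

/* Fail the build if the type or the member is not naturally aligned. */
_Static_assert(_Alignof(u64_aligned) == 8,
	       "u64_aligned must be 8-byte aligned");
_Static_assert(offsetof(struct cfs_rq_sketch, min_vruntime) % 8 == 0,
	       "min_vruntime must be 8-byte aligned");
```

 If the platform's own u64 is already naturally aligned, the typedef adds
nothing and the assertions simply document that assumption.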

