linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will.deacon@arm.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Paul McKenney" <paulmck@linux.vnet.ibm.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Maciej W. Rozycki" <macro@imgtec.com>,
	"David Daney" <ddaney@caviumnetworks.com>,
	"Måns Rullgård" <mans@mansr.com>,
	"Ralf Baechle" <ralf@linux-mips.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()
Date: Tue, 9 Feb 2016 11:42:42 +0000	[thread overview]
Message-ID: <20160209114242.GE22874@arm.com> (raw)
In-Reply-To: <20160209112358.GA500@gmail.com>

On Tue, Feb 09, 2016 at 12:23:58PM +0100, Ingo Molnar wrote:
> * Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Feb 03, 2016 at 01:32:10PM +0000, Will Deacon wrote:
> > > On Wed, Feb 03, 2016 at 09:33:39AM +0100, Ingo Molnar wrote:
> > > > In fact I'd suggest to test this via a quick runtime hack like this in rcupdate.h:
> > > > 
> > > > 	extern int panic_timeout;
> > > > 
> > > > 	...
> > > > 
> > > > 	if (panic_timeout)
> > > > 		smp_load_acquire(p);
> > > > 	else
> > > > 		typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p);
> > > > 
> > > > (or so)
> > > 
> > > So the problem with this is that a LOAD <ctrl> LOAD sequence isn't an
> > > ordering hazard on ARM, so you're potentially at the mercy of the branch
> > > predictor as to whether you get an acquire. That's not to say it won't
> > > be discarded as soon as the conditional is resolved, but it could
> > > screw up the benchmarking.
> > > 
> > > I'd be better off doing some runtime patching, but that's not something
> > > I can knock up in a couple of minutes (so I'll add it to my list).
> > 
> > ... so I actually got that up and running, believe it or not. Filthy stuff.
> 
> Wow!
> 
> I tried to implement the simpler solution by hacking rcupdate.h, but got drowned 
> in nasty circular header file dependencies and gave up...

I hacked linux/compiler.h, but got similar problems and ended up copying
what I need under a new namespace (posh way of saying I added _WILL to
everything until it built).

> If you are not overly embarrassed by posting hacky patches, mind posting your 
> solution?

Ok, since you asked! I'm thoroughly ashamed of the hacks, but maybe it
will keep those spammy Microsoft recruiters at bay ;)

Note that the "trigger" is changing the loglevel, but you need to do this
by echoing to /proc/sysrq-trigger, otherwise the whole thing gets invoked
in irq context which ends badly. It's also heavily arm64-specific.

> > The good news is that you're right, and I'm now seeing ~1% difference between 
> > the runs with ~0.3% noise for either of them. I still think that's significant, 
> > but it's a lot more reassuring than 4%.
> 
> hm, so for such marginal effects I think we could improve the testing method a 
> bit: we could improve 'perf bench sched messaging' to allow 'steady state 
> testing': to not exit+restart all the processes between test iterations, but to 
> continuously measure and print out current performance figures.
> 
> I.e. every 10 seconds it could print a decaying running average of current 
> throughput.
> 
> That way you could patch/unpatch the instructions without having to restart the 
> tasks. If you still see an effect (in the numbers reported every 10 seconds), then 
> that's a guaranteed result.
> 
> [ We have such functionality in 'perf bench numa' (the --show-convergence option), 
>   for similar reasons, to allow runtime monitoring and tweaking of kernel 
>   parameters. ]

That sounds handy.

Will

--->8

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 8f271b83f910..1a1e353983be 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -30,8 +30,8 @@
 #define ARM64_HAS_LSE_ATOMICS			5
 #define ARM64_WORKAROUND_CAVIUM_23154		6
 #define ARM64_WORKAROUND_834220			7
-
-#define ARM64_NCAPS				8
+#define ARM64_RCU_USES_ACQUIRE			8
+#define ARM64_NCAPS				9
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index d2ee1b21a10d..edbf188a8541 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -143,6 +143,66 @@ static int __apply_alternatives_multi_stop(void *unused)
 	return 0;
 }
 
+static void __apply_alternatives_rcu(void *alt_region)
+{
+	struct alt_instr *alt;
+	struct alt_region *region = alt_region;
+	u32 *origptr, *replptr;
+
+	for (alt = region->begin; alt < region->end; alt++) {
+		u32 insn, orig_insn;
+		int nr_inst;
+
+		if (alt->cpufeature != ARM64_RCU_USES_ACQUIRE)
+			continue;
+
+		BUG_ON(alt->alt_len != alt->orig_len);
+
+		origptr = ALT_ORIG_PTR(alt);
+		replptr = ALT_REPL_PTR(alt);
+		nr_inst = alt->alt_len / sizeof(insn);
+
+		BUG_ON(nr_inst != 1);
+
+		insn = le32_to_cpu(*replptr);
+		orig_insn = le32_to_cpu(*origptr);
+		*(origptr) = cpu_to_le32(insn);
+		*replptr = cpu_to_le32(orig_insn);
+
+		flush_icache_range((uintptr_t)origptr, (uintptr_t)(origptr + 1));
+		pr_info_ratelimited("%p: 0x%x => 0x%x\n", origptr, orig_insn, insn);
+	}
+}
+
+static int will_patched;
+
+static int __apply_alternatives_rcu_multi_stop(void *unused)
+{
+	struct alt_region region = {
+		.begin	= __alt_instructions,
+		.end	= __alt_instructions_end,
+	};
+
+	/* We always have a CPU 0 at this point (__init) */
+	if (smp_processor_id()) {
+		while (!READ_ONCE(will_patched))
+			cpu_relax();
+		isb();
+	} else {
+		__apply_alternatives_rcu(&region);
+		/* Barriers provided by the cache flushing */
+		WRITE_ONCE(will_patched, 1);
+	}
+
+	return 0;
+}
+
+void apply_alternatives_rcu(void)
+{
+	will_patched = 0;
+	stop_machine(__apply_alternatives_rcu_multi_stop, NULL, cpu_online_mask);
+}
+
 void __init apply_alternatives_all(void)
 {
 	/* better not try code patching on a live SMP system */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e3928f578891..98c132c3b018 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -141,7 +141,10 @@ SECTIONS
 
 	PERCPU_SECTION(L1_CACHE_BYTES)
 
-	. = ALIGN(4);
+	__init_end = .;
+	. = ALIGN(PAGE_SIZE);
+
+
 	.altinstructions : {
 		__alt_instructions = .;
 		*(.altinstructions)
@@ -151,8 +154,7 @@ SECTIONS
 		*(.altinstr_replacement)
 	}
 
-	. = ALIGN(PAGE_SIZE);
-	__init_end = .;
+	. = ALIGN(4);
 
 	_data = .;
 	_sdata = .;
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index e5139402e7f8..3eb7193fcf88 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -80,6 +80,7 @@ static int __init sysrq_always_enabled_setup(char *str)
 
 __setup("sysrq_always_enabled", sysrq_always_enabled_setup);
 
+extern void apply_alternatives_rcu(void);
 
 static void sysrq_handle_loglevel(int key)
 {
@@ -89,6 +90,7 @@ static void sysrq_handle_loglevel(int key)
 	console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
 	pr_info("Loglevel set to %d\n", i);
 	console_loglevel = i;
+	apply_alternatives_rcu();
 }
 static struct sysrq_key_op sysrq_loglevel_op = {
 	.handler	= sysrq_handle_loglevel,
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 00b042c49ccd..75e29d61fedd 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -218,6 +218,61 @@ void __read_once_size(const volatile void *p, void *res, int size)
 	__READ_ONCE_SIZE;
 }
 
+extern void panic(const char *fmt, ...);
+
+#define __stringify_1_will(x...)	#x
+#define __stringify_will(x...)	__stringify_1_will(x)
+
+#define ALTINSTR_ENTRY_WILL(feature)						      \
+	" .word 661b - .\n"				/* label           */ \
+	" .word 663f - .\n"				/* new instruction */ \
+	" .hword " __stringify_will(feature) "\n"		/* feature bit     */ \
+	" .byte 662b-661b\n"				/* source len      */ \
+	" .byte 664f-663f\n"				/* replacement len */
+
+#define __ALTERNATIVE_CFG_WILL(oldinstr, newinstr, feature, cfg_enabled)	\
+	".if "__stringify_will(cfg_enabled)" == 1\n"				\
+	"661:\n\t"							\
+	oldinstr "\n"							\
+	"662:\n"							\
+	".pushsection .altinstructions,\"a\"\n"				\
+	ALTINSTR_ENTRY_WILL(feature)						\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"a\"\n"			\
+	"663:\n\t"							\
+	newinstr "\n"							\
+	"664:\n\t"							\
+	".popsection\n\t"						\
+	".org	. - (664b-663b) + (662b-661b)\n\t"			\
+	".org	. - (662b-661b) + (664b-663b)\n"			\
+	".endif\n"
+
+#define _ALTERNATIVE_CFG_WILL(oldinstr, newinstr, feature, cfg, ...)	\
+	__ALTERNATIVE_CFG_WILL(oldinstr, newinstr, feature, 1)
+
+#define ALTERNATIVE_WILL(oldinstr, newinstr, ...)   \
+	_ALTERNATIVE_CFG_WILL(oldinstr, newinstr, __VA_ARGS__, 1)
+
+#define __READ_ONCE_SIZE_WILL						\
+({									\
+	__u64 tmp;							\
+									\
+	switch (size) {							\
+	case 8: asm volatile(						\
+		ALTERNATIVE_WILL("ldr %0, %1", "ldar %0, %1", 8)	\
+		: "=r" (tmp) : "Q" (*(volatile __u64 *)p));		\
+		*(__u64 *)res = tmp; break;				\
+	default:							\
+		panic("that's no pointer, yo");				\
+	}								\
+})
+
+static __always_inline
+void __read_once_size_will(const volatile void *p, void *res, int size)
+{
+	__READ_ONCE_SIZE_WILL;
+}
+
 #ifdef CONFIG_KASAN
 /*
  * This function is not 'inline' because __no_sanitize_address confilcts
@@ -537,10 +592,17 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
  * object's lifetime is managed by something other than RCU.  That
  * "something other" might be reference counting or simple immortality.
  */
+
+#define READ_ONCE_WILL(x)						\
+({									\
+	union { typeof(x) __val; char __c[1]; } __u;			\
+	__read_once_size_will(&(x), __u.__c, sizeof(x));		\
+	__u.__val;							\
+})
+
 #define lockless_dereference(p) \
 ({ \
-	typeof(p) _________p1 = READ_ONCE(p); \
-	smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
+	typeof(p) _________p1 = READ_ONCE_WILL(p); \
 	(_________p1); \
 })
 

  reply	other threads:[~2016-02-09 11:43 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-12 12:31 [RFC][PATCH] mips: Fix arch_spin_unlock() Peter Zijlstra
2015-11-12 12:35 ` Peter Zijlstra
2015-11-12 13:31 ` Måns Rullgård
2015-11-12 14:32 ` Paul E. McKenney
2015-11-12 14:50   ` Måns Rullgård
2015-11-12 14:59     ` Paul E. McKenney
2015-11-12 17:46 ` David Daney
2015-11-12 18:00   ` Peter Zijlstra
2015-11-12 18:13   ` Måns Rullgård
2015-11-12 18:17     ` David Daney
2016-01-27  9:57       ` Maciej W. Rozycki
2016-01-27 11:43         ` Will Deacon
2016-01-27 12:41           ` Maciej W. Rozycki
2016-01-28  1:11             ` Boqun Feng
2016-01-27 14:54           ` Peter Zijlstra
2016-01-27 15:21             ` Will Deacon
2016-01-27 23:38               ` Paul E. McKenney
2016-01-28  9:57                 ` Will Deacon
2016-01-28 22:31                   ` Paul E. McKenney
2016-01-29  9:59                     ` Will Deacon
2016-01-29 10:22                       ` Paul E. McKenney
2016-02-01 13:56                         ` Will Deacon
2016-02-02  3:54                           ` Paul E. McKenney
2016-02-02  5:19                             ` Boqun Feng
2016-02-02  6:44                               ` Paul E. McKenney
2016-02-02  8:07                                 ` Linus Torvalds
2016-02-02  8:19                                   ` Linus Torvalds
2016-02-02  9:34                                     ` Boqun Feng
2016-02-02 17:30                                       ` Linus Torvalds
2016-02-02 17:51                                         ` Will Deacon
2016-02-02 18:06                                           ` Linus Torvalds
2016-02-02 19:30                                             ` Will Deacon
2016-02-02 19:55                                               ` Linus Torvalds
2016-02-03 19:13                                                 ` Will Deacon
2016-02-03  8:33                                               ` Ingo Molnar
2016-02-03 13:32                                                 ` Will Deacon
2016-02-03 19:03                                                   ` Will Deacon
2016-02-09 11:23                                                     ` Ingo Molnar
2016-02-09 11:42                                                       ` Will Deacon [this message]
2016-02-02 12:02                                     ` Paul E. McKenney
2016-02-02 17:56                                       ` Linus Torvalds
2016-02-02 22:30                                         ` Paul E. McKenney
2016-02-02 14:49                                     ` Ralf Baechle
2016-02-02 14:54                                       ` Måns Rullgård
2016-02-02 14:58                                         ` Ralf Baechle
2016-02-02 15:51                                           ` Måns Rullgård
2016-02-02 17:23                                 ` Peter Zijlstra
2016-02-02 22:38                                   ` Paul E. McKenney
2016-02-02 11:45                               ` Will Deacon
2016-02-02 12:12                                 ` Boqun Feng
2016-02-02 12:20                                   ` Will Deacon
2016-02-02 13:18                                     ` Boqun Feng
2016-02-02 17:12                                     ` Paul E. McKenney
2016-02-02 17:37                                       ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160209114242.GE22874@arm.com \
    --to=will.deacon@arm.com \
    --cc=boqun.feng@gmail.com \
    --cc=ddaney@caviumnetworks.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=macro@imgtec.com \
    --cc=mans@mansr.com \
    --cc=mingo@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=ralf@linux-mips.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).