From: Will Deacon <will.deacon@arm.com>
To: Vikram Mulukutla <markivx@codeaurora.org>
Cc: qiaozhou <qiaozhou@asrmicro.com>,
Thomas Gleixner <tglx@linutronix.de>,
John Stultz <john.stultz@linaro.org>,
sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
Wang Wilbur <wilburwang@asrmicro.com>,
Marc Zyngier <marc.zyngier@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Fri, 28 Jul 2017 10:28:32 +0100 [thread overview]
Message-ID: <20170728092831.GA24839@arm.com> (raw)
In-Reply-To: <e1cc02c5e7dfd4d6bec937b6dc97bfc7@codeaurora.org>
On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:
> On 2017-07-26 18:29, qiaozhou wrote:
> >On 2017年07月26日 22:16, Thomas Gleixner wrote:
> >>On Wed, 26 Jul 2017, qiaozhou wrote:
> >>For that particular timer case we can clear base->running_timer w/o the
> >>lock held (see patch below), but this kind of
> >>
> >> lock -> test -> unlock -> retry
> >>
> >>loops are all over the place in the kernel, so this is going to hurt you
> >>sooner than later in some other place.
> >It's true. This is the way spinlock is used normally and widely in
> >kernel. I'll also ask ARM experts whether we can do something to avoid
> >or reduce the chance of such issue. ARMv8.1 has one single
> >instruction(ldadda) to replace the ldaxr/stxr loop. Hope it can
> >improve and reduce the chance.
>
> I think we should have this discussion now - I brought this up earlier [1]
> and I promised a test case that I completely forgot about - but here it
> is (attached). Essentially a Big CPU in an acquire-check-release loop
> will have an unfair advantage over a little CPU concurrently attempting
> to acquire the same lock, in spite of the ticket implementation. If the Big
> CPU needs the little CPU to make forward progress : livelock.
>
> We've run into the same loop construct in other spots in the kernel and
> the reason that a real symptom is so rare is that the retry-loop on the
> 'Big'
> CPU needs to be interrupted just once by say an IRQ/FIQ and the live-lock
> is broken. If the entire retry loop is within an interrupt-disabled critical
> section then the odds of live-locking are much higher.
>
> An example of the problem on a previous kernel is here [2]. Changes to the
> workqueue code since may have fixed this particular instance.
>
> One solution was to use udelay(1) in such loops instead of cpu_relax(), but
> that's not very 'relaxing'. I'm not sure if there's something we could do
> within the ticket spin-lock implementation to deal with this.
Does bodging cpu_relax to back-off to wfe after a while help? The event
stream will wake it up if nothing else does. Nasty patch below, but I'd be
interested to know whether or not it helps.
Will
--->8
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78f9882..1f5a29c8612e 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -149,9 +149,11 @@ extern void release_thread(struct task_struct *);
unsigned long get_wchan(struct task_struct *p);
+void __cpu_relax(unsigned long pc);
+
static inline void cpu_relax(void)
{
- asm volatile("yield" ::: "memory");
+ __cpu_relax(_THIS_IP_);
}
/* Thread switching */
diff --git a/arch/arm64/kernel/arm64ksyms.c b/arch/arm64/kernel/arm64ksyms.c
index 67368c7329c0..be8a698ea680 100644
--- a/arch/arm64/kernel/arm64ksyms.c
+++ b/arch/arm64/kernel/arm64ksyms.c
@@ -72,6 +72,8 @@ EXPORT_SYMBOL(_mcount);
NOKPROBE_SYMBOL(_mcount);
#endif
+EXPORT_SYMBOL(__cpu_relax);
+
/* arm-smccc */
EXPORT_SYMBOL(__arm_smccc_smc);
EXPORT_SYMBOL(__arm_smccc_hvc);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 659ae8094ed5..c394c3704b7f 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -403,6 +403,31 @@ unsigned long get_wchan(struct task_struct *p)
return ret;
}
+static DEFINE_PER_CPU(u64, __cpu_relax_data);
+
+#define CPU_RELAX_WFE_THRESHOLD 10000
+void __cpu_relax(unsigned long pc)
+{
+ u64 new, old = raw_cpu_read(__cpu_relax_data);
+ u32 old_pc, new_pc;
+ bool wfe = false;
+
+ old_pc = (u32)old;
+ new = new_pc = (u32)pc;
+
+ if (old_pc == new_pc) {
+ if ((old >> 32) > CPU_RELAX_WFE_THRESHOLD) {
+ asm volatile("sevl; wfe; wfe\n" ::: "memory");
+ wfe = true;
+ } else {
+ new = old + (1UL << 32);
+ }
+ }
+
+ if (this_cpu_cmpxchg(__cpu_relax_data, old, new) == old && !wfe)
+ asm volatile("yield" ::: "memory");
+}
+
unsigned long arch_align_stack(unsigned long sp)
{
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
next prev parent reply other threads:[~2017-07-28 9:28 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <3d2459c7-defd-a47e-6cea-007c10cecaac@asrmicro.com>
2017-07-26 14:16 ` [Question]: try to fix contention between expire_timers and try_to_del_timer_sync Thomas Gleixner
2017-07-27 1:29 ` qiaozhou
2017-07-27 15:14 ` Will Deacon
2017-07-27 15:19 ` Thomas Gleixner
2017-07-28 1:10 ` Vikram Mulukutla
2017-07-28 9:28 ` Peter Zijlstra
2017-07-28 19:11 ` Vikram Mulukutla
2017-07-28 9:28 ` Will Deacon [this message]
2017-07-28 19:09 ` Vikram Mulukutla
2017-07-31 11:20 ` qiaozhou
2017-08-01 7:37 ` qiaozhou
2017-08-03 23:32 ` Vikram Mulukutla
2017-08-04 3:15 ` qiaozhou
2017-07-31 13:13 ` Will Deacon
2017-08-03 23:25 ` Vikram Mulukutla
2017-08-15 18:40 ` Will Deacon
2017-08-25 19:48 ` Vikram Mulukutla
2017-08-25 20:25 ` Vikram Mulukutla
2017-08-28 23:12 ` Vikram Mulukutla
2017-09-06 11:19 ` qiaozhou
2017-09-25 11:02 ` qiaozhou
2017-10-02 14:14 ` Will Deacon
2017-10-11 8:33 ` qiaozhou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170728092831.GA24839@arm.com \
--to=will.deacon@arm.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel-owner@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=markivx@codeaurora.org \
--cc=peterz@infradead.org \
--cc=qiaozhou@asrmicro.com \
--cc=sboyd@codeaurora.org \
--cc=sudeep.holla@arm.com \
--cc=tglx@linutronix.de \
--cc=wilburwang@asrmicro.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.