From: Vikram Mulukutla <markivx@codeaurora.org>
To: Will Deacon <will.deacon@arm.com>
Cc: qiaozhou <qiaozhou@asrmicro.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	John Stultz <john.stultz@linaro.org>,
	sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
	Wang Wilbur <wilburwang@asrmicro.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Mon, 28 Aug 2017 16:12:01 -0700
Message-ID: <e3812c7a1202ee79101406e7003dff9a@codeaurora.org>
In-Reply-To: <9f86bd426bbaede9de6d38cb047bd6fa@codeaurora.org>

Hi Will,

On 2017-08-25 12:48, Vikram Mulukutla wrote:
> Hi Will,
> 
> On 2017-08-15 11:40, Will Deacon wrote:
>> Hi Vikram,
>> 
>> On Thu, Aug 03, 2017 at 04:25:12PM -0700, Vikram Mulukutla wrote:
>>> On 2017-07-31 06:13, Will Deacon wrote:
>>> >On Fri, Jul 28, 2017 at 12:09:38PM -0700, Vikram Mulukutla wrote:
>>> >>On 2017-07-28 02:28, Will Deacon wrote:
>>> >>>On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:
>>> 
>>> >>>
>>> >>This does seem to help. Here's some data after 5 runs with and without
>>> >>the patch.
>>> >
>>> >Blimey, that does seem to make a difference. Shame it's so ugly! Would you
>>> >be able to experiment with other values for CPU_RELAX_WFE_THRESHOLD? I had
>>> >it set to 10000 in the diff I posted, but that might be higher than
>>> >optimal. It would be interesting to see if it correlates with
>>> >num_possible_cpus() for the highly contended case.
>>> >
>>> >Will
>>> 
>>> Sorry for the late response - I should hopefully have some more data with
>>> different thresholds before the week is finished or on Monday.
>> 
>> Did you get anywhere with the threshold heuristic?
>> 
>> Will
> 
> Here's some data from experiments that I finally got to today. I decided
> to recompile for every value of the threshold. Was doing a binary search
> of sorts and then started reducing by orders of magnitude. There are pairs
> of rows here:
> 

Well, here's something interesting. I tried a different platform and found
that the workaround doesn't help much at all, similar to Qiao's observation
on his big.LITTLE (b.L) chipset. Something to do with the WFE implementation
or the event stream?

I modified your patch to use a __delay(1) in place of the WFEs, and this was
the result (still with the 10k threshold). The worst-case lock time for cpu0
improves drastically. Given that cpu0 re-enables interrupts between each lock
attempt in my test case, I think the lock count matters less here.
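
For readers without the earlier diff at hand, the change described above
can be sketched roughly as follows. This is a userspace illustration only,
not the patch posted in this thread: relax_state, relax_init, relax_once and
backoff_delay are hypothetical names, and backoff_delay() merely stands in
for the kernel's __delay(1) (or for the WFE in the original workaround). The
only value taken from the thread is the 10000 threshold; resetting the
counter after each backoff is an assumption.

```c
#include <assert.h>

/* Threshold from the thread ("10k"); everything else here is illustrative. */
#define RELAX_THRESHOLD 10000UL

/* Hypothetical per-CPU bookkeeping for a spin-wait loop. */
struct relax_state {
	unsigned long threshold; /* spins allowed before backing off */
	unsigned long spins;     /* spins since the last backoff */
	unsigned long backoffs;  /* number of backoffs taken so far */
};

static void relax_init(struct relax_state *s, unsigned long threshold)
{
	assert(threshold > 0);
	s->threshold = threshold;
	s->spins = 0;
	s->backoffs = 0;
}

/* Stand-in for __delay(1) (or WFE); a real kernel would stall the CPU here. */
static void backoff_delay(void)
{
	volatile unsigned int i;

	for (i = 0; i < 1; i++)
		;
}

/*
 * Called on every failed lock attempt. Returns 1 if this call backed off,
 * 0 if it was an ordinary spin.
 */
static int relax_once(struct relax_state *s)
{
	if (++s->spins < s->threshold)
		return 0;

	backoff_delay();
	s->backoffs++;
	s->spins = 0; /* assumption: start counting again after each backoff */
	return 1;
}
```

A spinning CPU would call relax_once() in place of a bare cpu_relax(); the
faster CPU then periodically steps back and gives the slower one a window
in which to take the lock.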

cpu_relax() patch with WFEs (original workaround):
(pairs of rows, first row is with c0 at 300MHz, second
with c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. max time
is in microseconds)
------------------------------------------------------|
c0 max time| c0 lock count| c4 max time| c4 lock count|
------------------------------------------------------|
     999843|            25|           2|      12988498| -> c0/cpu0 at 300MHz
          0|       8421132|           1|       9152979| -> c0/cpu0 at 1.9GHz
------------------------------------------------------|
     999860|           160|           2|      12963487|
          1|       8418492|           1|       9158001|
------------------------------------------------------|
     999381|           734|           2|      12988636|
          1|       8387562|           1|       9128056|
------------------------------------------------------|
     989800|           750|           3|      12996473|
          1|       8389091|           1|       9112444|
------------------------------------------------------|

cpu_relax() patch with __delay(1):
(pairs of rows, first row is with c0 at 300MHz, second
with c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. max time
is in microseconds)
------------------------------------------------------|
c0 max time| c0 lock count| c4 max time| c4 lock count|
------------------------------------------------------|
       7703|          1532|           2|      13035203| -> c0/cpu0 at 300MHz
          1|       8511686|           1|       8550411| -> c0/cpu0 at 1.9GHz
------------------------------------------------------|
       7801|          1561|           2|      13040188|
          1|       8553985|           1|       8609853|
------------------------------------------------------|
       3953|          1576|           2|      13049991|
          1|       8576370|           1|       8611533|
------------------------------------------------------|
       3953|          1557|           2|      13030553|
          1|       8509020|           1|       8543883|
------------------------------------------------------|
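
To put a rough number on the improvement, the sketch below compares the
worst cpu0 max lock time across the four 300MHz runs in each table. The
values are copied from the tables above; worst_of and improvement_factor
are hypothetical helper names used only for this illustration.

```c
#include <stddef.h>

/* c0 max lock time in microseconds with cpu0 at 300MHz, one entry per run. */
static const unsigned long wfe_max_us[]   = { 999843, 999860, 999381, 989800 };
static const unsigned long delay_max_us[] = { 7703, 7801, 3953, 3953 };

/* Return the largest value in v[0..n-1], or 0 for an empty array. */
static unsigned long worst_of(const unsigned long *v, size_t n)
{
	unsigned long worst = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (v[i] > worst)
			worst = v[i];
	return worst;
}

/* Integer ratio of the worst WFE run to the worst __delay(1) run. */
static unsigned long improvement_factor(void)
{
	unsigned long wfe = worst_of(wfe_max_us,
				     sizeof(wfe_max_us) / sizeof(wfe_max_us[0]));
	unsigned long dly = worst_of(delay_max_us,
				     sizeof(delay_max_us) / sizeof(delay_max_us[0]));

	return wfe / dly;
}
```

Even comparing worst case against worst case, the __delay(1) variant cuts
cpu0's maximum lock wait by roughly two orders of magnitude on this
platform.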

I should also note that my earlier kernel was 4.9-stable based
and the one above was on a 4.4-stable based kernel.

Thanks,
Vikram

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
