From: Vikram Mulukutla <markivx@codeaurora.org>
To: Will Deacon <will.deacon@arm.com>
Cc: qiaozhou <qiaozhou@asrmicro.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	John Stultz <john.stultz@linaro.org>,
	sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
	Wang Wilbur <wilburwang@asrmicro.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Mon, 28 Aug 2017 16:12:01 -0700
Message-ID: <e3812c7a1202ee79101406e7003dff9a@codeaurora.org>
In-Reply-To: <9f86bd426bbaede9de6d38cb047bd6fa@codeaurora.org>

Hi Will,

On 2017-08-25 12:48, Vikram Mulukutla wrote:
> Hi Will,
> 
> On 2017-08-15 11:40, Will Deacon wrote:
>> Hi Vikram,
>> 
>> On Thu, Aug 03, 2017 at 04:25:12PM -0700, Vikram Mulukutla wrote:
>>> On 2017-07-31 06:13, Will Deacon wrote:
>>> >On Fri, Jul 28, 2017 at 12:09:38PM -0700, Vikram Mulukutla wrote:
>>> >>On 2017-07-28 02:28, Will Deacon wrote:
>>> >>>On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:
>>> 
>>> >>>
>>> >>This does seem to help. Here's some data after 5 runs with and without
>>> >>the patch.
>>> >
>>> >Blimey, that does seem to make a difference. Shame it's so ugly! Would you
>>> >be able to experiment with other values for CPU_RELAX_WFE_THRESHOLD? I had
>>> >it set to 10000 in the diff I posted, but that might be higher than
>>> >optimal. It would be interesting to see if it correlates with
>>> >num_possible_cpus() for the highly contended case.
>>> >
>>> >Will
>>> 
>>> Sorry for the late response - I should hopefully have some more data
>>> with different thresholds before the week is finished or on Monday.
>> 
>> Did you get anywhere with the threshold heuristic?
>> 
>> Will
> 
> Here's some data from experiments that I finally got to today. I decided
> to recompile for every value of the threshold. Was doing a binary search
> of sorts and then started reducing by orders of magnitude. There are pairs
> of rows here:
> 

Well here's something interesting. I tried a different platform and found
that the workaround doesn't help much at all, similar to Qiao's observation
on his b.L chipset. Something to do with the WFE implementation or
event-stream?

I modified your patch to use a __delay(1) in place of the WFEs and this was
the result (still with the 10k threshold). The worst-case lock time for cpu0
drastically improves. Given that cpu0 re-enables interrupts between each
lock attempt in my test case, I think the lock count matters less here.
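
For readers following along, here is a rough user-space sketch of the kind of
threshold-based back-off being compared. It is not the patch from this thread:
it assumes the workaround counts failed lock attempts and backs off once the
count reaches the threshold, and every name in it (relax_state,
relax_with_threshold, short_delay, spin_lock_backoff) is hypothetical.
short_delay() is only a stand-in for the kernel's __delay(1).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is NOT the patch from the thread. */
#define RELAX_THRESHOLD 10000u  /* the 10k threshold mentioned above */

struct relax_state {
	unsigned int spins;      /* failed attempts since the last back-off */
	unsigned long backoffs;  /* back-offs taken so far, for inspection */
};

/* Stand-in for __delay(1): burn a few cycles away from the lock word. */
static void short_delay(void)
{
	for (volatile int i = 0; i < 8; i++)
		;
}

/*
 * Call once per failed lock attempt. Returns true when the threshold
 * was reached and a back-off was taken, false otherwise.
 */
static bool relax_with_threshold(struct relax_state *st, unsigned int threshold)
{
	assert(threshold > 0);
	if (++st->spins < threshold)
		return false;
	st->spins = 0;
	st->backoffs++;
	short_delay();
	return true;
}

/* Minimal test-and-set lock that backs off after repeated failures. */
static void spin_lock_backoff(atomic_flag *lock, struct relax_state *st)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		relax_with_threshold(st, RELAX_THRESHOLD);
}

static void spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}
```

The only point of the sketch is that the back-off step can be swapped (an
event wait versus a short delay) while the threshold logic stays the same.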

cpu_relax() patch with WFEs (original workaround):
(pairs of rows, first row is with c0 at 300MHz, second
with c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. max time
is in microseconds)
-------------------------------------------------------|
 c0 max time| c0 lock count| c4 max time| c4 lock count|
-------------------------------------------------------|
      999843|            25|           2|      12988498| -> c0/cpu0 at 300MHz
           0|       8421132|           1|       9152979| -> c0/cpu0 at 1.9GHz
-------------------------------------------------------|
      999860|           160|           2|      12963487|
           1|       8418492|           1|       9158001|
-------------------------------------------------------|
      999381|           734|           2|      12988636|
           1|       8387562|           1|       9128056|
-------------------------------------------------------|
      989800|           750|           3|      12996473|
           1|       8389091|           1|       9112444|
-------------------------------------------------------|

cpu_relax() patch with __delay(1):
(pairs of rows, first row is with c0 at 300MHz, second
with c0 at 1.9GHz. Both rows have cpu4 at 2.3GHz. max time
is in microseconds)
-------------------------------------------------------|
 c0 max time| c0 lock count| c4 max time| c4 lock count|
-------------------------------------------------------|
        7703|          1532|           2|      13035203| -> c0/cpu0 at 300MHz
           1|       8511686|           1|       8550411| -> c0/cpu0 at 1.9GHz
-------------------------------------------------------|
        7801|          1561|           2|      13040188|
           1|       8553985|           1|       8609853|
-------------------------------------------------------|
        3953|          1576|           2|      13049991|
           1|       8576370|           1|       8611533|
-------------------------------------------------------|
        3953|          1557|           2|      13030553|
           1|       8509020|           1|       8543883|
-------------------------------------------------------|

I should also note that my earlier results were from a 4.9-stable based
kernel, while the results above are from a 4.4-stable based kernel.

Thanks,
Vikram

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
