linux-kernel.vger.kernel.org archive mirror
From: Vikram Mulukutla <markivx@codeaurora.org>
To: Will Deacon <will.deacon@arm.com>
Cc: qiaozhou <qiaozhou@asrmicro.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	John Stultz <john.stultz@linaro.org>,
	sboyd@codeaurora.org, LKML <linux-kernel@vger.kernel.org>,
	Wang Wilbur <wilburwang@asrmicro.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel-owner@vger.kernel.org, sudeep.holla@arm.com
Subject: Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync
Date: Fri, 28 Jul 2017 12:09:38 -0700
Message-ID: <2aa9684cf9c889ee9fdc8550b4388af6@codeaurora.org>
In-Reply-To: <20170728092831.GA24839@arm.com>

On 2017-07-28 02:28, Will Deacon wrote:
> On Thu, Jul 27, 2017 at 06:10:34PM -0700, Vikram Mulukutla wrote:

<snip>

>>
>> I think we should have this discussion now - I brought this up earlier [1]
>> and I promised a test case that I completely forgot about - but here it
>> is (attached). Essentially a Big CPU in an acquire-check-release loop
>> will have an unfair advantage over a little CPU concurrently attempting
>> to acquire the same lock, in spite of the ticket implementation. If the
>> Big CPU needs the little CPU to make forward progress: livelock.
>>

<snip>

>> 
>> One solution was to use udelay(1) in such loops instead of cpu_relax(),
>> but that's not very 'relaxing'. I'm not sure if there's something we
>> could do within the ticket spin-lock implementation to deal with this.
>
> Does bodging cpu_relax to back-off to wfe after a while help? The event
> stream will wake it up if nothing else does. Nasty patch below, but I'd be
> interested to know whether or not it helps.
> 
> Will
> 
This does seem to help. Here's some data after 5 runs with and without 
the patch.

time = max time taken to acquire lock
counter = number of times lock acquired

cpu0: little cpu @ 300MHz, cpu4: Big cpu @ 2.0GHz
Without the cpu_relax() bodging patch:
======================================================
 cpu0 time | cpu0 counter | cpu4 time | cpu4 counter |
===========|==============|===========|==============|
   117893us|       2349144|        2us|       6748236|
   571260us|       2125651|        2us|       7643264|
    19780us|       2392770|        2us|       5987203|
    19948us|       2395413|        2us|       5977286|
    19822us|       2429619|        2us|       5768252|
    19888us|       2444940|        2us|       5675657|
======================================================

cpu0: little cpu @ 300MHz, cpu4: Big cpu @ 2.0GHz
With the cpu_relax() bodging patch:
======================================================
 cpu0 time | cpu0 counter | cpu4 time | cpu4 counter |
===========|==============|===========|==============|
        3us|       2737438|        2us|       6907147|
        2us|       2742478|        2us|       6902241|
      132us|       2745636|        2us|       6876485|
        3us|       2744554|        2us|       6898048|
        3us|       2741391|        2us|       6882901|
======================================================

The patch also seems to have helped with fairness in general,
allowing more work to be done when the CPU frequencies are more
closely matched (I don't know if this translates to real world
performance - probably not). The counter values are roughly
twice as high with the patch.
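
For readers without the patch at hand: the idea Will describes (spin
cheaply, then back off to a waiting state after a while) can be
sketched in userspace as below. This is not Will's patch. SPIN_LIMIT,
relax_state, relax_with_backoff() and wait_for_value() are hypothetical
names, and sched_yield() is only a stand-in for WFE, which userspace
cannot portably issue.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical threshold: polls allowed before backing off. */
#define SPIN_LIMIT 1000u

struct relax_state {
    unsigned int spins;    /* polls since the last back-off */
    unsigned int backoffs; /* number of times we backed off */
};

/*
 * Called once per failed poll. The first SPIN_LIMIT - 1 calls return
 * immediately (stand-in for a plain cpu_relax()); the next call resets
 * the count and yields the CPU (stand-in for backing off to WFE).
 */
static void relax_with_backoff(struct relax_state *rs)
{
    assert(rs);
    if (++rs->spins < SPIN_LIMIT)
        return;
    rs->spins = 0;
    rs->backoffs++;
    sched_yield();
}

/* Poll *word until it equals want, backing off periodically. */
static void wait_for_value(atomic_uint *word, unsigned int want,
                           struct relax_state *rs)
{
    while (atomic_load_explicit(word, memory_order_acquire) != want)
        relax_with_backoff(rs);
}
```

The point of the back-off is that a waiter which stops hammering the
lock word gives the slower CPU a window to get its own access through.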

time = max time taken to acquire lock
counter = number of times lock acquired

cpu0: little cpu @ 1.5GHz, cpu4: Big cpu @ 2.0GHz
Without the cpu_relax() bodging patch:
======================================================
 cpu0 time | cpu0 counter | cpu4 time | cpu4 counter |
===========|==============|===========|==============|
        2us|       5240654|        1us|       5339009|
        2us|       5287797|       97us|       5327073|
        2us|       5237634|        1us|       5334694|
        2us|       5236676|       88us|       5333582|
       84us|       5285880|       84us|       5329489|
======================================================

cpu0: little cpu @ 1.5GHz, cpu4: Big cpu @ 2.0GHz
With the cpu_relax() bodging patch:
======================================================
 cpu0 time | cpu0 counter | cpu4 time | cpu4 counter |
===========|==============|===========|==============|
      140us|      10449121|        1us|      11154596|
        1us|      10757081|        1us|      11479395|
       83us|      10237109|        1us|      10902557|
        2us|       9871101|        1us|      10514313|
        2us|       9758763|        1us|      10391849|
======================================================
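
As an illustration of what the "time" and "counter" columns measure, an
acquire-check-release loop on a ticket lock might look roughly like the
sketch below. It is a userspace approximation using C11 atomics, not
the attached test case and not the kernel's arch_spinlock_t; the names
ticket_lock, lock_stats and acquire_check_release() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace ticket lock. */
struct ticket_lock {
    atomic_uint next;  /* next ticket to hand out */
    atomic_uint owner; /* ticket currently being served */
};

struct lock_stats {
    uint64_t max_wait_ns;  /* "time": longest single acquisition */
    uint64_t acquisitions; /* "counter": number of acquisitions */
};

/* Wall-clock time in ns; a real test would prefer a monotonic clock. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket, then wait until it is being served. */
    unsigned int me =
        atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ; /* busy-wait; kernel code would call cpu_relax() here */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Serve the next ticket in line. */
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

/*
 * One acquire-check-release iteration: take the lock, read the value
 * it protects, drop the lock, and update the two reported metrics.
 * Returns the value of *flag observed while holding the lock.
 */
static int acquire_check_release(struct ticket_lock *l, const int *flag,
                                 struct lock_stats *s)
{
    assert(l && flag && s);
    uint64_t t0 = now_ns();
    ticket_lock_acquire(l);
    uint64_t t1 = now_ns();
    int seen = *flag;
    ticket_lock_release(l);

    /* Guard against the wall clock stepping backwards. */
    if (t1 >= t0 && t1 - t0 > s->max_wait_ns)
        s->max_wait_ns = t1 - t0;
    s->acquisitions++;
    return seen;
}
```

Running this loop back to back on one thread per CPU, with each thread
keeping its own lock_stats, would yield one "time" and one "counter"
value per CPU per run, in the shape of the tables above.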


Thanks,
Vikram

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
