linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <waiman.long@hpe.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	ling.ma.program@gmail.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, Ma Ling <ling.ml@alibaba-inc.com>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>
Subject: Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback
Date: Mon, 19 Oct 2015 13:24:27 -0400	[thread overview]
Message-ID: <5625274B.30904@hpe.com> (raw)
In-Reply-To: <20151019112417.GA752@gmail.com>

On 10/19/2015 07:24 AM, Ingo Molnar wrote:
> * Peter Zijlstra<peterz@infradead.org>  wrote:
>
>> On Mon, Oct 19, 2015 at 09:58:23AM +0200, Ingo Molnar wrote:
>>> * ling.ma.program@gmail.com<ling.ma.program@gmail.com>  wrote:
>>>
>>>> From: Ma Ling<ling.ml@alibaba-inc.com>
>>>>
>>>> All load instructions can run speculatively but they have to follow
>>>> memory order rule in multiple cores as below:
>>>> _x = _y = 0
>>>>
>>>> Processor 0				Processor 1
>>>>
>>>> mov r1, [ _y]  //M1			mov [ _x], 1  //M3
>>>> mov r2, [ _x]  //M2			mov [ _y], 1  //M4
>>>>
>>>> If r1 = 1, r2 must be 1
>>>>
>>>> In order to guarantee above rule, although Processor 0 execute
>>>> M1 and M2 instruction out of order, they are kept in ROB,
>>>> when load buffer for _x in Processor 0 received the update
>>>> message from Processor 1, Processor 0 need to roll back
>>>> from M2 instruction, which will flush the whole pipeline,
>>>> the latency is over the penalty from branch prediction miss.
>>>>
>>>> In this patch we use lock cmpxchg instruction to force load
>>>> instructions to be serialization, the destination operand
>>>> receives a write cycle without regard to the result of
>>>> the comparison, which can help us to reduce the penalty
>>>> from load instruction roll back.
>>>>
>>>> Our experiment indicates the performance can be improved by 10%~15%
>>>> for 2 and 3 threads cases, the conflicts from lock cache line
>>>> spend them most of the time.
>>> So it would be nice to create a new user-space spinlock testing facility, via a
>>> new 'perf bench spinlock' feature or so. That way others can test and validate
>>> your results on different hardware as well.
>> So its trivial to lift this code into userspace -- in fact, I have that
>> somewhere.
>>
>> The trouble is going to keep them in sync.
> So we can just try this optimistically, and if it keeps breaking, we can use the
> technique perf uses to sync up the rbtree implementation: we copy the kernel
> version into tooling, but run diff against the kernel version and warn at tool
> build time that there's divergence.
>
> I.e. a non-build-fatal force that keeps things in sync.
>
> Thanks,
>
> 	Ingo
>

It is on my to-do list. I just want to wrap up my latest PV qspinlock 
patch before embarking on this adventure.

Cheers,
Longman

  reply	other threads:[~2015-10-19 17:24 UTC|newest]

Thread overview: 228+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <yes>
2009-01-16 18:08 ` Quota fixes and improvements Jan Kara
2009-01-16 18:08   ` [PATCH 01/11] quota: Improve locking Jan Kara
2009-01-16 18:08     ` [PATCH 02/11] ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop() Jan Kara
2009-01-16 18:08       ` [PATCH 03/11] ocfs2: Push out dropping of dentry lock to ocfs2_wq Jan Kara
2009-01-16 18:08         ` [PATCH 04/11] ocfs2: Fix possible deadlock in ocfs2_write_dquot() Jan Kara
2009-01-16 18:08           ` [PATCH 05/11] quota: Add quota reservation support Jan Kara
2009-01-16 18:08             ` [PATCH 06/11] quota: Add quota reservation claim and released operations Jan Kara
2009-01-16 18:08               ` [PATCH 07/11] quota: Use inode->i_blkbits to get block bits Jan Kara
2009-01-16 18:08                 ` [PATCH 08/11] quota: Move EXPORT_SYMBOL immediately next to the functions/varibles Jan Kara
2009-01-16 18:08                   ` [PATCH 09/11] ext3: Remove unnecessary quota functions Jan Kara
2009-01-16 18:08                     ` [PATCH 10/11] ext4: " Jan Kara
2009-01-16 18:08                       ` [PATCH 11/11] reiserfs: " Jan Kara
2009-01-20 21:41                       ` [PATCH 10/11] ext4: " Mingming Cao
2009-01-20 21:41                     ` [PATCH 09/11] ext3: " Mingming Cao
2009-01-24  7:49     ` [PATCH 01/11] quota: Improve locking Andrew Morton
2009-01-26 10:04       ` Jan Kara
2009-05-31 14:49 ` [PATCH 0/8] kernel:lockdep:replace DFS with BFS tom.leiming
2009-05-31 14:49   ` [PATCH 1/8] kernel:lockdep:improve implementation of BFS tom.leiming
2009-05-31 14:49     ` [PATCH 2/8] kernel:lockdep: introduce match function to BFS tom.leiming
2009-05-31 14:49       ` [PATCH 3/8] kernel:lockdep:implement check_noncircular() by BFS tom.leiming
2009-05-31 14:49         ` [PATCH 4/8] kernel:lockdep:implement find_usage_*wards " tom.leiming
2009-05-31 14:49           ` [PATCH 5/8] kernel:lockdep:introduce print_shortest_lock_dependencies tom.leiming
2009-05-31 14:49             ` [PATCH 6/8] kernel:lockdep: implement lockdep_count_*ward_deps by BFS tom.leiming
2009-05-31 14:49               ` [PATCH 7/8] kernel:lockdep: update memory usage introduced " tom.leiming
2009-05-31 14:49                 ` [PATCH 8/8] kernel:lockdep:add statistics info for max bfs queue depth tom.leiming
2009-05-31 15:14           ` [PATCH 4/8] kernel:lockdep:implement find_usage_*wards by BFS Daniel Walker
2009-06-01  0:14             ` Ming Lei
2009-06-08 12:22   ` [PATCH 0/8] kernel:lockdep:replace DFS with BFS Peter Zijlstra
2009-06-08 13:38     ` Ming Lei
2009-06-08 13:58     ` Ming Lei
2009-06-08 14:04       ` Peter Zijlstra
2009-06-08 15:50     ` Ming Lei
2009-06-09 12:52       ` Ming Lei
2009-10-07 13:49 ` [PATCH 1/1] perf tools: Up the verbose level for some really verbose stuff Arnaldo Carvalho de Melo
2009-10-08 17:31   ` [tip:perf/core] " tip-bot for Arnaldo Carvalho de Melo
2010-06-22 15:20 ` [RFC][PATCH 00/10] cifs: local caching support using FS-Cache Suresh Jayaraman
2010-06-22 15:22 ` [RFC][PATCH 01/10] cifs: add kernel config option for CIFS Client caching support Suresh Jayaraman
2010-06-22 15:22 ` [RFC][PATCH 02/10] cifs: guard cifsglob.h against multiple inclusion Suresh Jayaraman
2010-06-22 21:37   ` Jeff Layton
2010-06-22 15:23 ` [RFC][PATCH 03/10] cifs: register CIFS for caching Suresh Jayaraman
2010-06-23 16:51   ` David Howells
2010-06-25 10:56     ` Suresh Jayaraman
2010-06-22 15:23 ` [RFC][PATCH 04/10] cifs: define server-level cache index objects and register them with FS-Cache Suresh Jayaraman
2010-06-22 21:52   ` Jeff Layton
2010-06-23  5:34     ` Suresh Jayaraman
2010-06-23 16:54   ` David Howells
2010-06-22 15:23 ` [RFC][PATCH 05/10] cifs: define superblock-level cache index objects and register them Suresh Jayaraman
2010-06-23 16:58   ` David Howells
2010-06-25 12:44     ` Suresh Jayaraman
2010-06-25 12:58       ` David Howells
2010-06-25 13:26         ` David Howells
2010-06-28 12:53           ` Suresh Jayaraman
2010-06-28 13:24             ` David Howells
2010-06-22 15:23 ` [RFC][PATCH 06/10] cifs: define inode-level cache object " Suresh Jayaraman
2010-06-23 17:02   ` David Howells
2010-06-25 12:50     ` Suresh Jayaraman
2010-06-25 12:55       ` David Howells
2010-06-25 16:53         ` Jeff Layton
2010-06-25 21:46           ` David Howells
2010-06-25 22:26             ` Jeff Layton
2010-06-25 23:04               ` David Howells
2010-06-25 23:05               ` Steve French
     [not found]                 ` <OFB55E8EC7.E8DD23D5-ON8725774E.0004921E-8825774E.0004CC31@us.ibm.com>
2010-06-27 18:17                   ` Aneesh Kumar K. V
2010-06-27 18:22                     ` Christoph Hellwig
2010-06-22 15:23 ` [RFC][PATCH 07/10] cifs: FS-Cache page management Suresh Jayaraman
2010-06-23 17:05   ` David Howells
2010-06-22 15:24 ` [RFC][PATCH 08/10] cifs: store pages into local cache Suresh Jayaraman
2010-06-23 17:06   ` David Howells
2010-06-22 15:24 ` [RFC][PATCH 09/10] cifs: read pages from FS-Cache Suresh Jayaraman
2010-06-23 17:07   ` David Howells
2010-06-22 15:25 ` [RFC][PATCH 10/10] cifs: add mount option to enable local caching Suresh Jayaraman
2010-06-23 17:08   ` David Howells
2010-06-23 18:32   ` Scott Lovenberg
2010-06-25 10:48     ` Suresh Jayaraman
2011-06-15  0:46 ` [PATCH] Add ok2440 development board support Wu DaoGuang
2011-10-03  0:32 ` [PATCH 1/1] ARM: Make debug UART optional for S3C devices Thiago A. Correa
2011-10-10 14:44   ` Thiago A. Corrêa
2011-12-11 13:10 ` [PATCH] block: Needn't read the size of device or partition again taco
2012-06-08 17:23 ` [PATCH 1/4] slub: change declare of get_slab() to inline at all times Joonsoo Kim
2012-06-08 17:23   ` [PATCH 2/4] slub: use __cmpxchg_double_slab() at interrupt disabled place Joonsoo Kim
2012-06-08 17:23   ` [PATCH 3/4] slub: refactoring unfreeze_partials() Joonsoo Kim
2012-06-20  7:19     ` Pekka Enberg
2012-06-08 17:23   ` [PATCH 4/4] slub: deactivate freelist of kmem_cache_cpu all at once in deactivate_slab() Joonsoo Kim
2012-06-08 19:04     ` Christoph Lameter
2012-06-10 10:27       ` JoonSoo Kim
2012-06-22 18:34         ` JoonSoo Kim
2012-06-08 19:02   ` [PATCH 1/4] slub: change declare of get_slab() to inline at all times Christoph Lameter
2012-06-09 15:57     ` JoonSoo Kim
2012-06-11 15:04       ` Christoph Lameter
2012-06-22 18:22 ` [PATCH 1/3] slub: prefetch next freelist pointer in __slab_alloc() Joonsoo Kim
2012-06-22 18:22   ` [PATCH 2/3] slub: reduce failure of this_cpu_cmpxchg in put_cpu_partial() after unfreezing Joonsoo Kim
2012-07-04 13:05     ` Pekka Enberg
2012-07-05 14:20       ` Christoph Lameter
2012-08-16  7:06     ` Pekka Enberg
2012-06-22 18:22   ` [PATCH 3/3] slub: release a lock if freeing object with a lock is failed in __slab_free() Joonsoo Kim
2012-07-04 13:10     ` Pekka Enberg
2012-07-04 14:48       ` JoonSoo Kim
2012-07-05 14:26     ` Christoph Lameter
2012-07-06 14:19       ` JoonSoo Kim
2012-07-06 14:34         ` Christoph Lameter
2012-07-06 14:59           ` JoonSoo Kim
2012-07-06 15:10             ` Christoph Lameter
2012-07-08 16:19               ` JoonSoo Kim
2012-06-22 18:45   ` [PATCH 1/3 v2] slub: prefetch next freelist pointer in __slab_alloc() Joonsoo Kim
2012-07-04 12:58     ` JoonSoo Kim
2012-07-04 13:00     ` Pekka Enberg
2012-07-04 14:30       ` JoonSoo Kim
2012-07-04 15:08         ` Pekka Enberg
2012-07-04 15:26           ` Eric Dumazet
2012-07-04 15:48             ` JoonSoo Kim
2012-07-04 16:15               ` Eric Dumazet
2012-07-04 16:24                 ` JoonSoo Kim
2012-07-04 15:45           ` JoonSoo Kim
2012-07-04 15:59             ` Pekka Enberg
2012-07-04 16:04               ` JoonSoo Kim
     [not found] ` <1360258447-27247-1-git-send-email-yes>
2013-02-07 17:34   ` [PATCH 04/10] USB: EHCI: make ehci-orion a separate driver manjunath.goudar
2013-02-07 19:41     ` Arnd Bergmann
2013-02-08 10:38     ` Florian Fainelli
2013-02-07 17:34   ` [PATCH 05/10] USB: EHCI: make ehci-atmel " manjunath.goudar
2013-02-08  2:58     ` Bo Shen
2013-06-12 11:53     ` Jean-Christophe PLAGNIOL-VILLARD
2013-02-07 17:34   ` [PATCH 07/10] USB: EHCI: make ehci-mv " manjunath.goudar
2013-02-07 17:34   ` [PATCH 08/10] USB: EHCI: make ehci-vt8500 " manjunath.goudar
2013-02-07 18:54     ` Tony Prisk
2013-02-07 17:34   ` [PATCH 09/10] USB: EHCI: make ehci-msm " manjunath.goudar
2013-02-07 18:48     ` Stephen Warren
2013-02-07 19:05     ` David Brown
2013-02-07 17:34   ` [PATCH 10/10] USB: EHCI: make ehci-w90X900 " manjunath.goudar
2013-06-10  9:17 ` [PATCH v2 00/11] ARM:STixxxx: Add STixxxx platform and board support Srinivas KANDAGATLA
2013-06-10  9:21   ` [PATCH v2 01/11] serial:st-asc: Add ST ASC driver Srinivas KANDAGATLA
2013-06-10  9:35     ` Russell King - ARM Linux
2013-06-10 11:53       ` Srinivas KANDAGATLA
2013-06-10  9:21   ` [PATCH v2 02/11] clocksource:global_timer: Add ARM global timer support Srinivas KANDAGATLA
     [not found]     ` <CACRpkdbQCRKBzRF4HzNsXHwXCLJJcFZ9T36GPmmYsnX1OfgGRg@mail.gmail.com>
     [not found]       ` <51B5D7A6.1020101@st.com>
     [not found]         ` <51B72E9A.6070006@st.com>
     [not found]           ` <CACRpkdbppRqnMYknbBy8JpAVtujMOEQvyczXTmpvkQuxgikFog@mail.gmail.com>
2013-06-12 10:45             ` Srinivas KANDAGATLA
2013-06-10  9:21   ` [PATCH v2 03/11] regmap: Add regmap_field APIs Srinivas KANDAGATLA
2013-06-11 10:48     ` Mark Brown
2013-06-11 11:36       ` Srinivas KANDAGATLA
2013-06-10  9:22   ` [PATCH v2 04/11] mfd:stixxxx-syscfg: Add ST System Configuration support Srinivas KANDAGATLA
     [not found]     ` <CACRpkdaW2ALTWCB7Rd8m=aAGQwh3T_dJVncxJn_eXer4X3J6_g@mail.gmail.com>
2013-06-10 13:52       ` Srinivas KANDAGATLA
2013-06-10 14:02         ` Arnd Bergmann
2013-06-10 15:51           ` Srinivas KANDAGATLA
2013-06-11  7:41           ` Srinivas KANDAGATLA
2013-06-10  9:22   ` [PATCH v2 05/11] pinctrl:stixxxx: Add pinctrl and pinconf support Srinivas KANDAGATLA
2013-06-10  9:26   ` =?yes?q?=5BPATCH=20v2=2006/11=5D=20ARM=3Astixxxx=3A=20Add=20STiH415=20SOC=20support?= Srinivas KANDAGATLA
2013-06-10  9:55     ` [PATCH v2 06/11] ARM:stixxxx: Add STiH415 SOC support Michal Simek
2013-06-10 11:08     ` Michal Simek
     [not found]     ` <CAHTX3d+dk3W_9b7SVUokWq4KYXnj=Z1=WPj5zJ-gUvJqqwE=+Q@mail.gmail.com>
2013-06-10 11:46       ` Srinivas KANDAGATLA
2013-06-10 23:19         ` Russell King - ARM Linux
2013-06-11  6:50           ` Srinivas KANDAGATLA
2013-06-13 11:56             ` Russell King - ARM Linux
2013-06-13 12:41               ` Srinivas KANDAGATLA
2013-06-13 12:47           ` Linus Walleij
2013-06-10  9:27   ` [PATCH v2 07/11] ARM:stixxxx: Add STiH416 " Srinivas KANDAGATLA
2013-06-10 13:52     ` Arnd Bergmann
2013-06-10 16:17       ` Srinivas KANDAGATLA
2013-06-14  7:12       ` Srinivas KANDAGATLA
2013-06-10  9:27   ` [PATCH v2 08/11] ARM:stixxxx: Add DEBUG_LL console support Srinivas KANDAGATLA
2013-06-10  9:27   ` [PATCH v2 09/11] ARM:stixxxx: Add stixxxx options to multi_v7_defconfig Srinivas KANDAGATLA
2013-06-10 10:40     ` Mark Rutland
2013-06-10 10:58       ` Srinivas KANDAGATLA
2013-06-10 13:15         ` Mark Rutland
2013-06-13  9:24           ` Srinivas KANDAGATLA
2013-06-17  9:32             ` Mark Rutland
2013-06-10  9:28   ` [PATCH v2 10/11] ARM:stih41x: Add B2000 board support Srinivas KANDAGATLA
2013-06-10  9:28   ` [PATCH v2 11/11] ARM:stih41x: Add B2020 " Srinivas KANDAGATLA
2014-08-12  6:40 ` [PATCH v3] uas: replace WARN_ON_ONCE() with lockdep_assert_held() Sanjeev Sharma
2014-08-12  6:28   ` Hans de Goede
2014-08-12  6:37     ` Sharma, Sanjeev
2014-08-19  6:33     ` Sharma, Sanjeev
2014-08-19  9:30       ` gregkh
2014-08-19  9:38         ` Sharma, Sanjeev
2014-09-04  7:06         ` Sharma, Sanjeev
2014-09-04 13:50 ` [PATCH] Staging: rtl8192u: fix brace style coding issue in r819xU_firmware.c linux.delve
2014-09-04 13:51   ` [PATCH] Staging: rtl8192u: fix brace style coding issue in r819xU_firmware.c This is a patch to the file r819xU_firmware.c that fixes a brace warning found by checkpatch.pl tool linux.delve
2014-09-04 14:09 ` [PATCH] Staging: rtl8192u: fix brace style coding issue in r819xU_firmware.c Chaitra Ramaiah
2014-09-04 14:09   ` [PATCH] Staging: rtl8192u: fix brace style coding issue in r819xU_firmware.c This is a patch to the file r819xU_firmware.c that fixes a brace warning found by checkpatch.pl tool Chaitra Ramaiah
2014-09-04 14:28     ` Greg KH
2014-09-04 14:33     ` Dan Carpenter
2014-09-04 14:27   ` [PATCH] Staging: rtl8192u: fix brace style coding issue in r819xU_firmware.c Greg KH
2014-10-29 20:28 ` [PATCH v2 0/4] Enable PCI controller for Keystone SoCs Murali Karicheri
2014-10-29 20:28   ` [PATCH v2 1/4] ARM: keystone: add pcie related options Murali Karicheri
2014-10-29 20:28   ` [PATCH v2 2/4] ARM: keystone: defconfig: add options to enable PCI controller Murali Karicheri
2014-10-29 20:28   ` [PATCH v2 3/4] ARM: dts: keystone: add DT bindings for PCI controller for port 0 Murali Karicheri
2014-10-29 20:28   ` [PATCH v2 4/4] ARM: dts: keystone-k2e: add DT bindings for PCI controller for port 1 Murali Karicheri
2014-10-29 21:10   ` [PATCH v2 0/4] Enable PCI controller for Keystone SoCs santosh shilimkar
2015-10-19  2:27 ` [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback ling.ma.program
2015-10-19  7:58   ` Ingo Molnar
2015-10-19  9:34     ` Peter Zijlstra
2015-10-19 11:24       ` Ingo Molnar
2015-10-19 17:24         ` Waiman Long [this message]
2015-10-20  2:57     ` Ling Ma
2015-10-20  8:48       ` Ingo Molnar
2015-10-21  5:28         ` Ling Ma
2015-10-21  7:54           ` Peter Zijlstra
2015-10-20  9:15       ` Peter Zijlstra
2015-10-19  9:33   ` Peter Zijlstra
2015-10-19 17:20     ` Waiman Long
2015-10-20  3:00     ` Ling Ma
2015-10-19  9:46   ` Peter Zijlstra
2015-10-20  3:03     ` Ling Ma
2015-10-20  3:24     ` Ling Ma
2015-10-20  9:16       ` Peter Zijlstra
2015-10-21  5:30         ` Ling Ma
2015-10-19 17:18   ` Waiman Long
2015-10-20  3:12     ` Ling Ma
2015-10-20 18:55       ` Waiman Long
2015-10-21  5:43         ` Ling Ma
2015-12-31  8:09 ` [RFC PATCH] alispinlock: acceleration from lock integration on multi-core platform ling.ma.program
2016-01-05 18:46   ` Waiman Long
2016-01-08 22:48     ` Ling Ma
2016-01-05 21:18   ` Peter Zijlstra
2016-01-05 21:42     ` One Thousand Gnomes
2016-01-06  8:16       ` Peter Zijlstra
2016-01-06  8:21         ` Peter Zijlstra
2016-01-06 11:24           ` One Thousand Gnomes
2016-01-08 22:44             ` Ling Ma
2016-01-12 13:50               ` One Thousand Gnomes
2016-01-14  8:10                 ` Ling Ma
2016-01-19  8:52                   ` Ling Ma
2016-01-19 15:36                     ` Waiman Long
2016-02-03  4:40                       ` Ling Ma
2016-02-03  6:00                         ` Ling Ma
2016-02-03 21:42                         ` Waiman Long
2016-02-04  7:07                           ` Ling Ma
2016-04-05  3:44                           ` Ling Ma
2016-04-11  8:00                             ` Ling Ma
2016-01-08 23:01       ` Ling Ma
2016-01-08 22:56     ` Ling Ma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5625274B.30904@hpe.com \
    --to=waiman.long@hpe.com \
    --cc=acme@infradead.org \
    --cc=jolsa@redhat.com \
    --cc=ling.ma.program@gmail.com \
    --cc=ling.ml@alibaba-inc.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).