From: Waiman Long <waiman.long@hpe.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
ling.ma.program@gmail.com, mingo@redhat.com,
linux-kernel@vger.kernel.org, Ma Ling <ling.ml@alibaba-inc.com>,
Arnaldo Carvalho de Melo <acme@infradead.org>,
Jiri Olsa <jolsa@redhat.com>
Subject: Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback
Date: Mon, 19 Oct 2015 13:24:27 -0400
Message-ID: <5625274B.30904@hpe.com>
In-Reply-To: <20151019112417.GA752@gmail.com>

On 10/19/2015 07:24 AM, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
>> On Mon, Oct 19, 2015 at 09:58:23AM +0200, Ingo Molnar wrote:
>>> * ling.ma.program@gmail.com <ling.ma.program@gmail.com> wrote:
>>>
>>>> From: Ma Ling <ling.ml@alibaba-inc.com>
>>>>
>>>> All load instructions can run speculatively, but they have to follow
>>>> the memory ordering rules across multiple cores, as below:
>>>> _x = _y = 0
>>>>
>>>> Processor 0                 Processor 1
>>>>
>>>> mov r1, [ _y] //M1          mov [ _x], 1 //M3
>>>> mov r2, [ _x] //M2          mov [ _y], 1 //M4
>>>>
>>>> If r1 = 1, r2 must be 1
>>>>
>>>> To guarantee the above rule, although Processor 0 may execute
>>>> the M1 and M2 instructions out of order, they are kept in the ROB.
>>>> When the load buffer entry for _x in Processor 0 receives the update
>>>> message from Processor 1, Processor 0 needs to roll back
>>>> from the M2 instruction, which flushes the whole pipeline;
>>>> the latency exceeds the penalty of a branch misprediction.
>>>>
>>>> In this patch we use the lock cmpxchg instruction to force load
>>>> instructions to be serialized; the destination operand
>>>> receives a write cycle regardless of the result of
>>>> the comparison, which helps reduce the penalty
>>>> from load instruction rollback.
>>>>
>>>> Our experiments indicate that performance can be improved by 10%~15%
>>>> for the 2- and 3-thread cases, where conflicts on the lock cache line
>>>> take up most of the time.
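
For readers following along, here is a minimal C11 sketch of the two
ways of reading the lock word that the description above contrasts.
It is not the code from the patch: read_plain() and read_cmpxchg() are
hypothetical names, and the sketch assumes the compiler lowers the
compare-exchange to lock cmpxchg on x86.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Ordinary load of the lock word; the CPU may execute it speculatively. */
static inline uint32_t read_plain(_Atomic uint32_t *lock)
{
    return atomic_load_explicit(lock, memory_order_acquire);
}

/*
 * Hypothetical "serialized" read: try to replace 0 with 0.
 * If the word is 0, the exchange succeeds and stores 0 again.
 * If the word is non-zero, the exchange fails and `expected` is
 * overwritten with the value that was observed.
 * Either way the lock word is unchanged and `expected` holds its value.
 */
static inline uint32_t read_cmpxchg(_Atomic uint32_t *lock)
{
    uint32_t expected = 0;

    atomic_compare_exchange_strong_explicit(lock, &expected, 0,
                                            memory_order_acquire,
                                            memory_order_acquire);
    return expected;
}
```

Both helpers return the same value; the difference is only in how the
read reaches the cache line, which is the effect the patch description
is arguing about.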
>>> So it would be nice to create a new user-space spinlock testing facility, via a
>>> new 'perf bench spinlock' feature or so. That way others can test and validate
>>> your results on different hardware as well.
>> So it's trivial to lift this code into userspace -- in fact, I have that
>> somewhere.
>>
>> The trouble is going to be keeping them in sync.
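
As a rough illustration of what the per-thread body of such a
user-space test could look like, here is a small C11 sketch. Everything
in it is an assumption made for illustration: struct tas_lock is a
plain test-and-set lock standing in for the lifted qspinlock code,
bench_loop() is a hypothetical name, and thread creation is left out.

```c
#include <stdatomic.h>
#include <time.h>

/* Hypothetical test-and-set lock; a stand-in for the lifted kernel lock. */
struct tas_lock {
    atomic_flag held;
};

static void tas_lock_acquire(struct tas_lock *l)
{
    /* Spin until the previous value was "clear", i.e. we took the lock. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void tas_lock_release(struct tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct bench_ctx {
    struct tas_lock lock;
    unsigned long counter;      /* protected by lock */
};

/*
 * Run `iters` lock/unlock pairs around a tiny critical section and
 * return the CPU time consumed, in seconds, as measured by clock().
 */
static double bench_loop(struct bench_ctx *ctx, unsigned long iters)
{
    clock_t start = clock();

    for (unsigned long i = 0; i < iters; i++) {
        tas_lock_acquire(&ctx->lock);
        ctx->counter++;
        tas_lock_release(&ctx->lock);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A real benchmark would run bench_loop() from 2 or 3 threads at once,
since those are the cases the patch description reports on.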
> So we can just try this optimistically, and if it keeps breaking, we can use the
> technique perf uses to sync up the rbtree implementation: we copy the kernel
> version into tooling, but run diff against the kernel version and warn at tool
> build time that there's divergence.
>
> I.e. a non-build-fatal force that keeps things in sync.
>
> Thanks,
>
> Ingo
>
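
The non-fatal check described above is done with diff at tool build
time; purely as an illustration of the same idea, here is a C sketch.
files_differ() and warn_if_diverged() are hypothetical helpers, not
anything that exists in perf or the kernel tree.

```c
#include <stdio.h>

/*
 * Compare two files byte by byte.
 * Returns 0 if identical, 1 if they differ, -1 if either cannot be opened.
 */
static int files_differ(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 0;

    if (!a || !b) {
        result = -1;
    } else {
        int ca, cb;

        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {
                result = 1;     /* also covers one file ending early */
                break;
            }
        } while (ca != EOF);
    }
    if (a)
        fclose(a);
    if (b)
        fclose(b);
    return result;
}

/*
 * Non-fatal sync check: print a warning when the copies diverge, but
 * always return 0 so that a build step calling it never fails.
 */
static int warn_if_diverged(const char *kernel_copy, const char *tools_copy)
{
    int rc = files_differ(kernel_copy, tools_copy);

    if (rc == 1)
        fprintf(stderr, "Warning: %s differs from %s\n",
                tools_copy, kernel_copy);
    else if (rc < 0)
        fprintf(stderr, "Warning: could not compare %s and %s\n",
                tools_copy, kernel_copy);
    return 0;
}
```

Returning 0 unconditionally is what makes the check a nudge rather
than a build breaker.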
It is on my to-do list. I just want to wrap up my latest PV qspinlock
patch before embarking on this adventure.

Cheers,
Longman