From: Waiman Long <waiman.long@hp.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Jeff Layton <jlayton@redhat.com>,
Miklos Szeredi <mszeredi@suse.cz>, Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Andi Kleen <andi@firstfloor.org>,
"Chandramouleeswaran, Aswin" <aswin@hp.com>,
"Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
Date: Fri, 30 Aug 2013 14:33:14 -0400 [thread overview]
Message-ID: <5220E56A.80603@hp.com> (raw)
In-Reply-To: <CA+55aFyhTR15GBS4VNoRkcFyubEJU7a+Pzd-6VZeKoU8qB0q5Q@mail.gmail.com>
On 08/29/2013 11:54 PM, Linus Torvalds wrote:
> On Thu, Aug 29, 2013 at 8:12 PM, Waiman Long<waiman.long@hp.com> wrote:
>> On 08/29/2013 07:42 PM, Linus Torvalds wrote:
>>> Waiman? Mind looking at this and testing? Linus
>> Sure, I will try out the patch tomorrow morning and see how it works out for
>> my test case.
> Ok, thanks, please use this slightly updated pCMPXCHG_LOOPatch attached here.
>
>
I tested your patch on a 2-socket (12 cores, 24 threads) DL380 with
2.9GHz Westmere-EX CPUs, the test results of your test program (max
threads increased to 24 to match the thread count) were:
with patch = 68M
w/o patch = 12M
So it was an almost 6X improvement. I think that is really good. A
dual-socket machine, these days, shouldn't be considered as a "BIG"
machine. They are pretty common in different organizations.
I have reviewed the patch, and it looks good to me with the exception
that I added a cpu_relax() call at the end of while loop in the
CMPXCHG_LOOP macro.
I also got the perf data of the test runs with and without the patch.
With patch:
29.24% a.out [kernel.kallsyms] [k] lockref_get_or_lock
19.65% a.out [kernel.kallsyms] [k] lockref_put_or_lock
14.11% a.out [kernel.kallsyms] [k] dput
5.37% a.out [kernel.kallsyms] [k] __d_lookup_rcu
5.29% a.out [kernel.kallsyms] [k] lg_local_lock
4.59% a.out [kernel.kallsyms] [k] d_rcu_to_refcount
:
0.13% a.out [kernel.kallsyms] [k] complete_walk
:
0.01% a.out [kernel.kallsyms] [k] _raw_spin_lock
Without patch:
93.50% a.out [kernel.kallsyms] [k] _raw_spin_lock
0.96% a.out [kernel.kallsyms] [k] dput
0.80% a.out [kernel.kallsyms] [k] kmem_cache_free
0.75% a.out [kernel.kallsyms] [k] lg_local_lock
0.48% a.out [kernel.kallsyms] [k] complete_walk
0.45% a.out [kernel.kallsyms] [k] __d_lookup_rcu
For the other test cases that I am interested in, like the AIM7
benchmark, your patch may not be as good as my original one. I got 1-3M
JPM (varied quite a lot in different runs) in the short workloads on a
80-core system. My original one got 6M JPM. However, the test was done
on 3.10 based kernel. So I need to do more test to see if that has an
effect on the JPM results.
Anyway, I think this patch is good performance-wise. I remembered that
awhile ago that an internal reported a lock contention problem in dentry
involving probably complete_walk(). This patch will certainly help for
that case.
I will do more investigation to see how to make this patch work better
for my test cases.
Thank for taking the effort in optimizing the complete_walk() and
unlazy_walk() function that are not in my original patch. That will make
the patch work even better under more circumstances. I really appreciate
that.
Best regards,
Longman
next prev parent reply other threads:[~2013-08-30 18:33 UTC|newest]
Thread overview: 154+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-06 3:12 [PATCH v7 0/4] Lockless update of reference count protected by spinlock Waiman Long
2013-08-06 3:12 ` [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount Waiman Long
2013-08-29 1:40 ` Linus Torvalds
2013-08-29 4:44 ` Benjamin Herrenschmidt
2013-08-29 7:00 ` Ingo Molnar
2013-08-29 16:43 ` Linus Torvalds
2013-08-29 19:25 ` Linus Torvalds
2013-08-29 23:42 ` Linus Torvalds
2013-08-30 0:26 ` Benjamin Herrenschmidt
2013-08-30 0:49 ` Linus Torvalds
2013-08-30 2:06 ` Michael Neuling
2013-08-30 2:30 ` Benjamin Herrenschmidt
2013-08-30 2:35 ` Linus Torvalds
2013-08-30 2:45 ` Benjamin Herrenschmidt
2013-08-30 2:31 ` Linus Torvalds
2013-08-30 2:43 ` Benjamin Herrenschmidt
2013-08-30 7:16 ` Ingo Molnar
2013-08-30 15:28 ` Linus Torvalds
2013-08-30 3:12 ` Waiman Long
2013-08-30 3:54 ` Linus Torvalds
2013-08-30 7:55 ` Sedat Dilek
2013-08-30 8:10 ` Sedat Dilek
2013-08-30 9:27 ` Sedat Dilek
2013-08-30 9:48 ` Ingo Molnar
2013-08-30 9:56 ` Sedat Dilek
2013-08-30 9:58 ` Sedat Dilek
2013-08-30 10:29 ` Sedat Dilek
2013-08-30 10:36 ` Peter Zijlstra
2013-08-30 10:44 ` Sedat Dilek
2013-08-30 10:46 ` Sedat Dilek
2013-08-30 10:52 ` Peter Zijlstra
2013-08-30 10:57 ` Sedat Dilek
2013-08-30 14:05 ` Sedat Dilek
2013-08-30 11:19 ` Sedat Dilek
2013-08-30 10:38 ` Sedat Dilek
2013-08-30 15:34 ` Linus Torvalds
2013-08-30 15:38 ` Sedat Dilek
2013-08-30 16:12 ` Steven Rostedt
2013-08-30 16:16 ` Sedat Dilek
2013-08-30 18:42 ` Linus Torvalds
2013-08-30 16:32 ` Linus Torvalds
2013-08-30 16:37 ` Sedat Dilek
2013-08-30 16:52 ` Linus Torvalds
2013-08-30 17:11 ` Sedat Dilek
2013-08-30 17:26 ` Linus Torvalds
2013-09-01 10:01 ` Sedat Dilek
2013-09-01 10:33 ` Sedat Dilek
2013-09-01 15:32 ` Linus Torvalds
2013-09-01 15:45 ` Sedat Dilek
2013-09-01 15:55 ` Linus Torvalds
2013-09-02 10:30 ` Sedat Dilek
2013-09-02 16:09 ` David Ahern
2013-09-02 16:09 ` David Ahern
2013-09-01 20:59 ` Linus Torvalds
2013-09-01 21:23 ` Al Viro
2013-09-01 22:16 ` Linus Torvalds
2013-09-01 22:35 ` Al Viro
2013-09-01 22:44 ` Al Viro
2013-09-01 22:58 ` Linus Torvalds
2013-09-01 22:48 ` Linus Torvalds
2013-09-01 23:30 ` Al Viro
2013-09-02 0:12 ` Linus Torvalds
2013-09-02 0:50 ` Linus Torvalds
2013-09-02 0:50 ` Linus Torvalds
2013-09-02 7:05 ` Ingo Molnar
2013-09-02 16:44 ` Linus Torvalds
2013-09-03 10:15 ` Ingo Molnar
2013-09-03 15:41 ` Linus Torvalds
2013-09-03 18:34 ` Linus Torvalds
2013-09-03 19:19 ` Ingo Molnar
2013-09-03 21:05 ` Linus Torvalds
2013-09-03 21:13 ` Linus Torvalds
2013-09-03 21:34 ` Linus Torvalds
2013-09-03 21:39 ` Linus Torvalds
2013-09-03 14:08 ` Pavel Machek
2013-09-03 22:37 ` Sedat Dilek
2013-09-03 22:55 ` Dave Jones
2013-09-03 23:05 ` Sedat Dilek
2013-09-03 23:15 ` Dave Jones
2013-09-03 23:20 ` Sedat Dilek
2013-09-03 23:45 ` Sedat Dilek
2013-08-30 18:33 ` Waiman Long [this message]
2013-08-30 18:53 ` Linus Torvalds
2013-08-30 19:20 ` Waiman Long
2013-08-30 19:33 ` Linus Torvalds
2013-08-30 20:15 ` Waiman Long
2013-08-30 20:43 ` Linus Torvalds
2013-08-30 20:54 ` Al Viro
2013-08-30 21:03 ` Linus Torvalds
2013-08-30 21:44 ` Al Viro
2013-08-30 22:30 ` Linus Torvalds
2013-08-31 21:23 ` Al Viro
2013-08-31 22:49 ` Linus Torvalds
2013-08-31 23:27 ` Al Viro
2013-09-01 0:13 ` Al Viro
2013-09-01 17:48 ` Al Viro
2013-09-09 8:30 ` Peter Zijlstra
2013-08-30 21:10 ` Waiman Long
2013-08-30 21:22 ` Linus Torvalds
2013-08-30 21:30 ` Al Viro
2013-08-30 21:42 ` Waiman Long
2013-08-30 19:40 ` Al Viro
2013-08-30 19:52 ` Waiman Long
2013-08-30 20:26 ` Al Viro
2013-08-30 20:35 ` Waiman Long
2013-08-30 20:48 ` Al Viro
2013-08-31 2:02 ` Waiman Long
2013-08-31 2:35 ` Al Viro
2013-08-31 2:42 ` Al Viro
2013-09-02 19:25 ` Waiman Long
2013-09-03 6:01 ` Ingo Molnar
2013-09-03 7:24 ` Sedat Dilek
2013-09-03 15:38 ` Linus Torvalds
2013-09-03 15:14 ` Waiman Long
2013-09-03 15:34 ` Linus Torvalds
2013-09-03 19:09 ` Linus Torvalds
2013-09-03 21:01 ` Waiman Long
2013-09-04 14:52 ` Waiman Long
2013-09-04 15:14 ` Linus Torvalds
2013-09-04 19:25 ` Waiman Long
2013-09-04 21:34 ` Linus Torvalds
2013-09-05 2:35 ` Waiman Long
2013-09-05 13:31 ` Ingo Molnar
2013-09-05 17:33 ` Waiman Long
2013-09-05 17:40 ` Ingo Molnar
2013-09-03 22:41 ` Sedat Dilek
2013-09-03 23:11 ` Sedat Dilek
2013-09-03 23:11 ` Sedat Dilek
2013-09-08 21:45 ` Linus Torvalds
2013-09-09 0:03 ` Al Viro
2013-09-09 0:25 ` Linus Torvalds
2013-09-09 0:35 ` Al Viro
2013-09-09 0:38 ` Linus Torvalds
2013-09-09 0:57 ` Al Viro
2013-09-09 2:09 ` Ramkumar Ramachandra
2013-09-09 0:30 ` Al Viro
2013-09-09 3:32 ` Linus Torvalds
2013-09-09 4:06 ` Ramkumar Ramachandra
2013-09-09 5:44 ` Al Viro
2013-08-30 17:17 ` Peter Zijlstra
2013-08-30 17:28 ` Linus Torvalds
2013-08-30 17:33 ` Linus Torvalds
2013-08-29 15:20 ` Waiman Long
2013-08-06 3:12 ` [PATCH v7 2/4] spinlock: Enable x86 architecture to do lockless refcount update Waiman Long
2013-08-06 3:12 ` [PATCH v7 3/4] dcache: replace d_lock/d_count by d_lockcnt Waiman Long
2013-08-06 3:12 ` [PATCH v7 4/4] dcache: Enable lockless update of dentry's refcount Waiman Long
2013-08-13 18:03 ` [PATCH v7 0/4] Lockless update of reference count protected by spinlock Waiman Long
-- strict thread matches above, loose matches on Subject: below --
2013-08-31 3:06 [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount George Spelvin
2013-08-31 17:16 ` Linus Torvalds
2013-09-01 8:50 ` George Spelvin
2013-09-01 11:10 ` Theodore Ts'o
2013-09-01 15:49 ` Linus Torvalds
2013-09-01 18:11 ` Steven Rostedt
2013-09-01 20:03 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5220E56A.80603@hp.com \
--to=waiman.long@hp.com \
--cc=andi@firstfloor.org \
--cc=aswin@hp.com \
--cc=benh@kernel.crashing.org \
--cc=jlayton@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mingo@redhat.com \
--cc=mszeredi@suse.cz \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=scott.norton@hp.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.