From: Ingo Molnar <mingo@kernel.org>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mgorman@suse.de>, "Shi, Alex" <alex.shi@intel.com>,
Andi Kleen <andi@firstfloor.org>,
Andrew Morton <akpm@linux-foundation.org>,
Michel Lespinasse <walken@google.com>,
Davidlohr Bueso <davidlohr.bueso@hp.com>,
"Wilcox, Matthew R" <matthew.r.wilcox@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>
Subject: Re: Performance regression from switching lock to rw-sem for anon-vma tree
Date: Tue, 2 Jul 2013 08:45:38 +0200 [thread overview]
Message-ID: <20130702064538.GB3143@gmail.com> (raw)
In-Reply-To: <1372710497.22432.224.camel@schen9-DESK>
* Tim Chen <tim.c.chen@linux.intel.com> wrote:
> On Sat, 2013-06-29 at 09:12 +0200, Ingo Molnar wrote:
> > * Tim Chen <tim.c.chen@linux.intel.com> wrote:
> >
> > > > If my analysis is correct so far then it might be useful to add two
> > > > more stats: did rwsem_spin_on_owner() fail because lock->owner == NULL
> > > > [owner released the rwsem], or because owner_running() failed [owner
> > > > went to sleep]?
> > >
> > > Ingo,
> > >
> > > I tabulated the cases where rwsem_spin_on_owner returns false and causes
> > > us to stop spinning.
> > >
> > > 97.12% was due to lock's owner switching to another writer
> > > 0.01% was due to the owner of the lock sleeping
> > > 2.87% was due to need_resched()
> > >
> > > I made a change to allow us to continue to spin even when lock's owner
> > > switch to another writer. I did get the lock to be acquired now mostly
> > > (98%) via optimistic spin and lock stealing, but my benchmark's
> > > throughput actually got reduced by 30% (too many cycles spent on useless
> > > spinning?).
> >
> > Hm, I'm running out of quick ideas :-/ The writer-ends-spinning sequence
> > is pretty similar in the rwsem and in the mutex case. I'd have a look at
> > one more detail: is the wakeup of another writer in the rwsem case
> > singular, is only a single writer woken? I suspect the answer is yes ...
>
> Ingo, we can only wake one writer, right? In __rwsem_do_wake, that is
> indeed the case. Or you are talking about something else?
Yeah, I was talking about that, and my understanding and reading of the
code says that too - I just wanted to make sure :-)
> >
> > A quick glance suggests that the ordering of wakeups of waiters is the
> > same for mutexes and rwsems: FIFO, single waiter woken on
> > slowpath-unlock. So that shouldn't make a big difference.
>
> > If all last-ditch efforts to analyze it via counters fail then the way
> > I'd approach it next is brute-force instrumentation:
> >
> > - First I'd create a workload 'steady state' that can be traced and
> > examined without worrying that that it ends or switches to some other
> > workload.
> >
> > - Then I'd create a relatively lightweight trace (maybe trace_printk() is
> > lightweight enough), and capture key mutex and rwsem events.
> >
> > - I'd capture a 1-10 seconds trace in steady state, both with rwsems and
> > mutexes. I'd have a good look at which tasks take locks and schedule
> > how and why. I'd try to eliminate any assymetries in behavior, i.e.
> > make rwsems behave like mutexes.
>
> You mean adding trace points to record the events? If you can be more
> specific on what data to capture, that will be helpful. It will be
> holidays here in US so I may get around to this the following week.
Yeah, adding the relevant tracepoints (or trace_printk(), which is much
simpler albeit a bit more expensive) - and capturing a 1-second
steady-state traces via ftrace.
Then I'd compare the two traces and look at the biggest difference, and
try to zoom in on to figure out why the difference occurs. More
trace_printk()s can be added as you suspect specific areas or want to
confirm various theories.
[ Assuming the phenomenon does not go away under tracing :-/ ]
[
Another brute-force approach is to add a dynamic debug knob to switch
between a spinlock and an rwsem implementation for that lock: I'd do it
by adding both types of locks to the data structure, initializing both
but using a /proc/sys/kernel/ knob to decide whether to use spin_*() or
rwsem facilities to utilize it. This way you could switch between the
two implementations without rebooting the system. In theory.
]
Thanks,
Ingo
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-07-02 6:45 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-13 23:26 Performance regression from switching lock to rw-sem for anon-vma tree Tim Chen
2013-06-19 13:16 ` Ingo Molnar
2013-06-19 16:53 ` Tim Chen
2013-06-26 0:19 ` Tim Chen
2013-06-26 9:51 ` Ingo Molnar
2013-06-26 21:36 ` Tim Chen
2013-06-27 0:25 ` Tim Chen
2013-06-27 8:36 ` Ingo Molnar
2013-06-27 20:53 ` Tim Chen
2013-06-27 23:31 ` Tim Chen
2013-06-28 9:38 ` Ingo Molnar
2013-06-28 21:04 ` Tim Chen
2013-06-29 7:12 ` Ingo Molnar
2013-07-01 20:28 ` Tim Chen
2013-07-02 6:45 ` Ingo Molnar [this message]
2013-07-16 17:53 ` Tim Chen
2013-07-23 9:45 ` Ingo Molnar
2013-07-23 9:51 ` Peter Zijlstra
2013-07-23 9:53 ` Ingo Molnar
2013-07-30 0:13 ` Tim Chen
2013-07-30 19:24 ` Ingo Molnar
2013-08-05 22:08 ` Tim Chen
2013-07-30 19:59 ` Davidlohr Bueso
2013-07-30 20:34 ` Tim Chen
2013-07-30 21:45 ` Davidlohr Bueso
2013-08-06 23:55 ` Davidlohr Bueso
2013-08-07 0:56 ` Tim Chen
2013-08-12 18:52 ` Ingo Molnar
2013-08-12 20:10 ` Tim Chen
2013-06-28 9:20 ` Ingo Molnar
[not found] <1371165333.27102.568.camel@schen9-DESK>
[not found] ` <1371167015.1754.14.camel@buesod1.americas.hpqcorp.net>
2013-06-14 16:09 ` Tim Chen
2013-06-14 22:31 ` Davidlohr Bueso
2013-06-14 22:44 ` Tim Chen
2013-06-14 22:47 ` Michel Lespinasse
2013-06-17 22:27 ` Tim Chen
2013-06-16 9:50 ` Alex Shi
2013-06-17 16:22 ` Davidlohr Bueso
2013-06-17 18:45 ` Tim Chen
2013-06-17 19:05 ` Davidlohr Bueso
2013-06-17 22:28 ` Tim Chen
2013-06-17 23:18 ` Alex Shi
2013-06-17 23:20 ` Alex Shi
2013-06-17 23:35 ` Davidlohr Bueso
2013-06-18 0:08 ` Tim Chen
2013-06-19 23:11 ` Davidlohr Bueso
2013-06-19 23:24 ` Tim Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130702064538.GB3143@gmail.com \
--to=mingo@kernel.org \
--cc=a.p.zijlstra@chello.nl \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@intel.com \
--cc=andi@firstfloor.org \
--cc=dave.hansen@intel.com \
--cc=davidlohr.bueso@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=matthew.r.wilcox@intel.com \
--cc=mgorman@suse.de \
--cc=mingo@elte.hu \
--cc=riel@redhat.com \
--cc=tim.c.chen@linux.intel.com \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).