linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"Peter W. Morreale" <pmorreale@novell.com>,
	Rik van Riel <riel@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Gregory Haskins <ghaskins@novell.com>,
	Sven-Thorsten Dietrich <sdietrich@novell.com>,
	Chris Mason <chris.mason@oracle.com>,
	John Cooper <john.cooper@third-harmonic.com>,
	Chris Wright <chrisw@sous-sol.org>, Avi Kivity <avi@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 4/6] futex: Add FUTEX_LOCK with optional adaptive spinning
Date: Wed, 07 Apr 2010 20:25:06 -0700	[thread overview]
Message-ID: <4BBD4C92.8010701@us.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.00.1004072124250.32352@localhost.localdomain>

Thomas Gleixner wrote:
> On Wed, 7 Apr 2010, Darren Hart wrote:
>> Thomas Gleixner wrote:
>>> On Mon, 5 Apr 2010, Darren Hart wrote:
>>> Hmm. The order is weird. Why don't you do that simpler ?
>>>
>>> Get the uval, the tid and the thread_info pointer outside of the
>>> loop. Also task_pid_vnr(current) just needs a one time lookup.
>> Eeek. Having the owner in the loop is a good way to negate the benefits
>> of adaptive spinning by spinning forever (unlikely, but it could
>> certainly spin across multiple owners). Nice catch.
>>
>> As for the uval.... I'm not sure what you mean. You get curval below
>> inside the loop, and there is no "uval" in the my version of the code.
> 
> Well, you need a first time lookup of owner and ownertid for which you
> need the user space value (uval),
> but thinking more about it it's not
> even necessary. Just initialize ownertid to 0 so it will drop into the
> lookup code when we did not acquire the futex in the cmpxchg.

No need for ownertid at all really. The cmpxchg always tries to go from 
0 to curtid. I've pushed the futex_owner() call outside the loop for a 
one time lookup.

> 
>> As for the order, I had put the initial spin prior to the cmpxchg to
>> avoid doing too many cmpxchg's in a row as they are rather expensive.
>> However, since this is (now) the first opportunity to do try and acquire
>> the lock atomically after entering the futex syscall, I think you're
>> right, it should be the first thing in the loop.
>>
>>> change the loop to do:
>>>
>>>        for (;;) {
>>>        	   curval = cmpxchg_futex_value_locked(uaddr, 0, curtid);
>>>   	   if (!curval)
>>> 	      return 1;
>> Single return point makes instrumentation so much easier. Unless folks
>> _really_ object, I'll leave it as is until we're closer to merging.
> 
> I don't care either way. That was just example code.
> 
>>> 	   if ((curval & FUTEX_TID_MASK) != ownertid) {
>>> 	      ownertid = curval & FUTEX_TID_MASK;
>>> 	      owner = update_owner(ownertid);
>>> 	   }
>>
>> Hrm... at this point the owner has changed... so we should break and go
>> to sleep, not update the owner and start spinning again. The
>> futex_spin_on_owner() will detect this and abort, so I'm not seeing the
>> purpose of the above if() block.
> 
> Why ? If the owner has changed and the new owner is running on another
> cpu then why not spin further ?

That's an interesting question, and I'm not sure what the right answer 
is. The current approach of the adaptive spinning in the kernel is to 
spin until the owner changes or deschedules, then stop and block. The 
idea is that if you didn't get the lock before the owner changed, you 
aren't going to get it in a very short period of time (you have at least 
an entire critical section to wait through plus whatever time you've 
already spent spinning). However, blocking just so another task can spin 
doesn't really make sense either, and makes the lock less fair than it 
could otherwise be.

My goal in starting this is to provide a more intelligent mechanism than 
sched_yield() for userspace to use to determine when to spin and when to 
sleep. The current implementation allows for spinning up until the owner 
changes, deschedules, or the timeslice expires. I believe these are much 
better than spinning for some fixed number of cycles and then yield for 
some unpredictable amount of time until CFS decides to schedule you back in.

Still, the criteria for breaking the spin are something that needs more 
eyes, and more numbers before I can be confident in any approach.

> 
>>>> +		hrtimer_init_sleeper(to, current);
>>>> +		hrtimer_set_expires(&to->timer, *time);
>>>> +	}
>>>   Why setup all this _before_ trying the adaptive spin ?
>>
>> I placed the retry: label above the adaptive spin loop. This way if we wake a
>> task and the lock is "stolen" it doesn't just go right back to sleep. This
>> should aid in fairness and also performance in less contended cases. I didn't
>> think it was worth a "if (first_time_through && time)" sort of block to be
>> able to setup the timer after the spin loop.
> 
> Hmm, ok.
> 
>>> Do we really need all this code ? A simple owner->on_cpu (owner needs
>>> to be the task_struct then) would be sufficient to figure that out,
>>> wouldn't it?
>> As Peter pointed out in IRC, p->oncpu isn't generic. I'll go trolling through
>> the mutex_spin_on_owner() discussions to see if I can determine why that's the
>> case.
> 
> AFAICT p->oncpu is the correct thing to use when CONFIG_SMP=y. All it
> needs is a simple accessor function and you can keep all the futex
> cruft in futex.c where it belongs.

Noted.

Thanks,
-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

  reply	other threads:[~2010-04-08  3:25 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-05 20:23 [PATCH V2 0/6][RFC] futex: FUTEX_LOCK with optional adaptive spinning Darren Hart
2010-04-05 20:23 ` [PATCH 1/6] futex: replace fshared and clockrt with combined flags Darren Hart
2010-04-05 20:23 ` [PATCH 2/6] futex: add futex_q static initializer Darren Hart
2010-04-05 20:23 ` [PATCH 3/6] futex: refactor futex_lock_pi_atomic Darren Hart
2010-04-05 20:23 ` [PATCH 4/6] futex: Add FUTEX_LOCK with optional adaptive spinning Darren Hart
2010-04-06 16:55   ` Thomas Gleixner
2010-04-07 17:26     ` Darren Hart
2010-04-07 19:59       ` Thomas Gleixner
2010-04-08  3:25         ` Darren Hart [this message]
2010-04-08 23:10           ` Peter W. Morreale
2010-04-09  5:41             ` Darren Hart
2010-04-09 13:13               ` Peter W. Morreale
2010-04-05 20:23 ` [PATCH 5/6] futex: handle timeout inside adaptive lock spin Darren Hart
2010-04-06  8:27   ` Thomas Gleixner
2010-04-07 17:31     ` Darren Hart
2010-04-07 18:44       ` Gregory Haskins
2010-04-07 23:15         ` Darren Hart
2010-04-05 20:23 ` [PATCH 6/6] futex: Add aggressive adaptive spinning argument to FUTEX_LOCK Darren Hart
2010-04-08  5:58   ` Darren Hart
2010-04-05 20:48 ` [PATCH V2^W V4 0/6][RFC] futex: FUTEX_LOCK with optional adaptive spinning Darren Hart
2010-04-05 21:15 ` [PATCH V2 " Avi Kivity
2010-04-05 21:54   ` Darren Hart
2010-04-05 22:21     ` Avi Kivity
2010-04-05 22:59       ` Darren Hart
2010-04-06 13:28         ` Avi Kivity
2010-04-06 13:35           ` Peter Zijlstra
2010-04-06 13:41             ` Avi Kivity
2010-04-06 14:09               ` Peter Zijlstra
2010-04-06 16:10                 ` Avi Kivity
2010-04-06 16:53                   ` Alan Cox
2010-04-06 13:51             ` Alan Cox
2010-04-06 15:28               ` Darren Hart
2010-04-06 16:06                 ` Avi Kivity
2010-04-06 16:14                   ` Thomas Gleixner
2010-04-06 16:20                     ` Avi Kivity
2010-04-07  6:18                       ` john cooper
2010-04-08  3:33                         ` Darren Hart
2010-04-09  5:52                           ` john cooper
2010-04-06 16:54                     ` Alan Cox
2010-04-06 18:15                       ` Thomas Gleixner
2010-04-06 16:44                 ` Alan Cox
2010-04-06 17:34                   ` Ulrich Drepper
2010-04-10 23:35                     ` Alan Cox
2010-04-10 23:53                       ` Ulrich Drepper
2010-04-06 19:31                   ` Thomas Gleixner
2010-04-06 20:02                     ` Ulrich Drepper
2010-04-06 23:16                       ` Thomas Gleixner
2010-04-06 23:36                         ` Darren Hart
2010-04-07  6:08                         ` drepper
2010-04-08  3:41                           ` Darren Hart
2010-04-08  4:29                             ` drepper
2010-04-07  5:33                     ` Avi Kivity
2010-04-06 21:22         ` Darren Hart
2010-04-05 23:15       ` Darren Hart
2010-04-05 23:29         ` Chris Wright
2010-04-06 13:30         ` Avi Kivity
2010-04-06  8:48   ` Peter Zijlstra
2010-04-06 14:47     ` Ulrich Drepper
2010-04-06 14:51       ` Peter Zijlstra
2010-04-06 15:33         ` Darren Hart
2010-04-06 15:37           ` Peter Zijlstra
2010-04-06 15:29 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BBD4C92.8010701@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=avi@redhat.com \
    --cc=chris.mason@oracle.com \
    --cc=chrisw@sous-sol.org \
    --cc=eric.dumazet@gmail.com \
    --cc=ghaskins@novell.com \
    --cc=john.cooper@third-harmonic.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=pmorreale@novell.com \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=sdietrich@novell.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).