From: Eric Dumazet <dada1@cosmosbay.com>
To: Pierre PEIFFER <pierre.peiffer@bull.net>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	jakub@redhat.com
Subject: Re: [PATCH] 2.6.16 - futex: small optimization (?)
Date: Tue, 28 Mar 2006 12:05:44 +0200
Message-ID: <44290A78.3050509@cosmosbay.com>
In-Reply-To: <4428E7B7.8040408@bull.net>

Pierre PEIFFER wrote:
> Hi,
> 
> 
> I found an (optimization?) problem in the futexes, during a futex_wake, 
> if the waiter has a higher priority than the waker.
> 
> In fact, in this case, the waiter is immediately scheduled and tries to 
> take a lock still held by the waker. This is especially expensive on UP 
> or if both threads are on the same CPU, due to the two task switches. 
> This produces an extra latency during a wakeup in pthread_cond_broadcast 
> or pthread_cond_signal, for example.
> 
> See below my detailed explanation.
> 
> I found a solution given by the patch, at the end of this mail. It works 
> for me on kernel 2.6.16, but the kernel hangs if I use it with the -rt patch 
> from Ingo Molnar. So, I have doubts about the correctness of the patch.
> 
> The idea is simple: in unqueue_me, I first check
>     "if (list_empty(&q->list))"
> 
> If yes => we were woken (the list is initialized in wake_futex).
> Then, it immediately returns and lets the waker drop the key_refs 
> (instead of the waiter).
> 
> 

It's true that the futex code implies a lot of context switches (kernel side, but 
also user side).

Even if you change the kernel behavior in futex_wake(), you won't change the fact 
that a typical pthread_cond_signal does:

1) lock cond var
       lll_lock(cv->lock);
2) wake one waiter if necessary
       FUTEX_WAKE(cv->wakeup_seq, 1);
3) unlock cond var
       lll_unlock(cv->lock);

If a waiter process B has a higher priority than the waking process A, then most 
probably B is scheduled before A has had a chance to unlock the cond var (step 3).

So B will re-enter the kernel (because of the contended cond var lock), and A will 
re-enter the kernel too, to futex_wake() process B again, but on the cond var lock 
this time, not on the condvar wakeup_seq futex.
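
To make the three steps concrete, here is a minimal user-space sketch of that 
signal path. It is not the glibc code: struct sketch_cond and the sketch_* 
helpers are hypothetical names, the lock is a simple 0/1/2 futex lock, and the 
"if necessary" bookkeeping of step 2 is left out, so the wake is always issued.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the condvar internals named above. */
struct sketch_cond {
    atomic_int lock;        /* 0 = free, 1 = held, 2 = held, maybe waiters */
    atomic_int wakeup_seq;  /* futex word the condvar waiters sleep on */
};

/* Thin wrapper around the futex system call (no timeout, no second address). */
static long sketch_futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void sketch_lock(atomic_int *l)
{
    int expected = 0;

    if (atomic_compare_exchange_strong(l, &expected, 1))
        return;                         /* uncontended: no kernel entry */

    /* Contended: mark the lock and sleep in the kernel until it is freed. */
    while (atomic_exchange(l, 2) != 0)
        sketch_futex(l, FUTEX_WAIT, 2);
}

static void sketch_unlock(atomic_int *l)
{
    /* If someone may be sleeping on the lock, enter the kernel once more. */
    if (atomic_exchange(l, 0) == 2)
        sketch_futex(l, FUTEX_WAKE, 1);
}

/* Returns the number of waiters woken on wakeup_seq, or -1 on error. */
long sketch_cond_signal(struct sketch_cond *cv)
{
    long woken;

    sketch_lock(&cv->lock);                                /* step 1 */
    atomic_fetch_add(&cv->wakeup_seq, 1);
    woken = sketch_futex(&cv->wakeup_seq, FUTEX_WAKE, 1);  /* step 2 */
    sketch_unlock(&cv->lock);                              /* step 3 */
    return woken;
}
```

If B preempts A right after step 2, B finds cv->lock held, sets it to 2 and 
sleeps in FUTEX_WAIT; A's sketch_unlock() then has to issue the second 
FUTEX_WAKE, this time on the lock word.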

Each time a thread enters the futex kernel code, an expensive find_extend_vma() 
lookup is done (expensive because of the read_lock, but also because of the 
possibly large number of vm_area_struct entries in the mm_struct).

I wish the futex code had a special implementation for PTHREAD_SCOPE_PROCESS 
futexes, where no vma lookups would be necessary at all. Most mutexes and 
condvars have a process-private scope (they are not shared by different processes).
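
As a small illustration, assuming "process private scope" here maps onto the 
POSIX process-shared attribute (PTHREAD_PROCESS_PRIVATE versus 
PTHREAD_PROCESS_SHARED), the sketch below checks what a default mutex 
attribute object reports. The function name is hypothetical.

```c
#include <pthread.h>

/*
 * Returns 1 if a default-initialized mutex attribute object is
 * process-private, 0 if it is process-shared, -1 on error.
 */
int sketch_default_mutex_is_private(void)
{
    pthread_mutexattr_t attr;
    int pshared = -1;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;

    rc = pthread_mutexattr_getpshared(&attr, &pshared);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0)
        return -1;

    return pshared == PTHREAD_PROCESS_PRIVATE;
}
```

A mutex is only process-shared when the application asks for it explicitly 
with pthread_mutexattr_setpshared(), so the library already knows at init time 
which objects could take a cheaper private path.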

Eric
