From: Eric Dumazet <dada1@cosmosbay.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ulrich Drepper <drepper@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Jones <davej@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Andi Kleen <ak@suse.de>,
	Ravikiran G Thirumalai <kiran@scalex86.org>,
	"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
	pravin b shelar <pravin.shelar@calsoftinc.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] FUTEX : new PRIVATE futexes
Date: Fri, 06 Apr 2007 07:53:08 +0200
Message-ID: <4615E044.6080205@cosmosbay.com>
In-Reply-To: <4615A009.808@yahoo.com.au>

Nick Piggin wrote:
> Hi Eric,
> 
> Thanks for doing this... It's looking good, I just have some minor
> comments:

Hi Nick, thanks for reviewing.

> 
> Eric Dumazet wrote:
>>   */
>> -int get_futex_key(void __user *uaddr, union futex_key *key)
>> +int get_futex_key(void __user *uaddr, union futex_key *key,
>> +    struct rw_semaphore *shared)
> 
> Can we pass in something other than the rw_semaphore here? Seeing as
> it only actually gets used as a flag, it might be nicer just to pass
> a 0 or 1? And all through the call stack...
> 
> Did the whole thing just turn out neater when you passed the rwsem?
> We always know to use current->mm->mmap_sem, so it doesn't seem like
> a boolean flag would hurt?

That's a good question.

Computing current->mm->mmap_sem once is a win in itself, because accessing
current is not cheap.
It also walks part of the pointer chain well before the result is used, which
acts as a free prefetch(): if current->mm is not in the CPU cache, the CPU
won't stall, because the instructions that follow don't depend on it.

That is difficult to benchmark, but you can trust me.

A flag means:

if (flag)
    up_read(&current->mm->mmap_sem);

This generates quite bad code, whereas

if (ptr)
    up_read(ptr);

generates *much* better code.

So this is a cleanup and a runtime optimization.

I did a similar optimization in commit 163da958ba5282cbf85e8b3dc08e4f51f8b01c5e.

I invite you to check it:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=163da958ba5282cbf85e8b3dc08e4f51f8b01c5e



> 
>>  {
>>      unsigned long address = (unsigned long)uaddr;
>>      struct mm_struct *mm = current->mm;
>> @@ -218,6 +224,22 @@ int get_futex_key(void __user *uaddr, un
>>      address -= key->both.offset;
>>  
>>      /*
>> +     * PROCESS_PRIVATE futexes are fast.
>> +     * As the mm cannot disappear under us and the 'key' only needs
>> +     * virtual address, we dont even have to find the underlying vma.
>> +     * Note : We do have to check 'address' is a valid user address,
>> +     *        but access_ok() should be faster than find_vma()
>> +     * Note : At this point, address points to the start of page,
>> +     *        not the real futex address, this is ok.
>> +     */
>> +    if (!shared) {
>> +        if (!access_ok(VERIFY_WRITE, address, sizeof(int)))
>> +            return -EFAULT;
> 
> Shouldn't that be sizeof(long) to handle 64 bit futexes? Or strictly, it
> should depend on the size of the operation. Maybe the access_ok check
> should go outside get_futex_key?

If you check again, you'll see that address points to the start of the PAGE, 
not the real u32/u64 futex address. This checks the PAGE. We can use char, 
short, int, long, or char[PAGE_SIZE] as long as we know a futex cannot span 
two pages.
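
A small user-space sketch of the arithmetic behind that argument. The 4 KiB
page size (DEMO_PAGE_SIZE) and the natural alignment of the futex word are
assumptions of the sketch, and the helper names are hypothetical; access_ok()
itself is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Round an address down to the start of its page, which is what remains
 * once the in-page offset has been subtracted from it. */
static uintptr_t page_start(uintptr_t address)
{
	return address - (address % DEMO_PAGE_SIZE);
}

/* A naturally aligned word whose size is a power of two no larger than the
 * page starts and ends in the same page, so checking the page covers it. */
static int same_page(uintptr_t address, unsigned long size)
{
	assert(size != 0 && (size & (size - 1)) == 0);	/* power of two */
	assert(size <= DEMO_PAGE_SIZE);
	assert(address % size == 0);			/* natural alignment */
	return page_start(address) == page_start(address + size - 1);
}
```

For example, same_page() holds for a 4-byte word at 0x12345ffc and for an
8-byte word at 0x12345ff8, the last aligned slots of their page.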


>>       */
>>      key->shared.inode = vma->vm_file->f_path.dentry->d_inode;
>> -    key->both.offset++; /* Bit 0 of offset indicates inode-based key. */
>> +    key->both.offset += FUT_OFF_INODE; /* inode-based key. */
>>      if (likely(!(vma->vm_flags & VM_NONLINEAR))) {
>>          key->shared.pgoff = (((address - vma->vm_start) >> PAGE_SHIFT)
>>                       + vma->vm_pgoff);
> 
> I like |= for adding flags, it seems less ambiguous. But I guess that's
> a matter of opinion. Hugh seems to like +=, and I can't argue with him
> about style issues ;)


The previous code did offset++, which means offset += 1;
I didn't want to hurt Hugh :)
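
For what it is worth, the two spellings agree whenever bit 0 of the offset is
still clear, and differ only if the tag were applied twice. A minimal sketch,
assuming the flag value is 1 (bit 0, matching the old offset++);
DEMO_FUT_OFF_INODE and the helper names are hypothetical:

```c
#include <assert.h>

#define DEMO_FUT_OFF_INODE 1UL	/* assumed value: bit 0, like the old offset++ */

static unsigned long tag_with_add(unsigned long offset)
{
	return offset + DEMO_FUT_OFF_INODE;
}

static unsigned long tag_with_or(unsigned long offset)
{
	return offset | DEMO_FUT_OFF_INODE;
}

static void demo(void)
{
	/* Aligned in-page offsets have bit 0 clear: both forms agree. */
	assert(tag_with_add(0UL) == tag_with_or(0UL));
	assert(tag_with_add(4092UL) == tag_with_or(4092UL));

	/* Applied to an already tagged offset they diverge: += carries into
	 * bit 1, |= is idempotent. */
	assert(tag_with_add(5UL) == 6UL);
	assert(tag_with_or(5UL) == 5UL);
}
```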

>>  EXPORT_SYMBOL_GPL(drop_futex_key_refs);
> 
> I wonder if it would be worthwhile inlining and likely()ing the
> private fastpath? Might make it pretty compact... I guess that's
> something to worry about after glibc gets support.

Yes, in a future patch, in about one year :)

>> +
>> +    if (!(vma = find_vma(mm, address)) ||
>> +        vma->vm_start > address || !(vma->vm_flags & VM_WRITE))
>> +        ret = -EFAULT;
>> +
>> +    else
>> +        switch (handle_mm_fault(mm, vma, address, 1)) {
>> +        case VM_FAULT_MINOR:
>> +            current->min_flt++;
>> +            break;
>> +        case VM_FAULT_MAJOR:
>> +            current->maj_flt++;
>> +            break;
>> +        default:
>> +            ret = -EFAULT;
>> +        }
>> +    if (!shared)
>> +        up_read(&mm->mmap_sem);
>> +    return ret;
>>  }
>>  
>>  /*
> 
> You've got an extra space after the if (maybe for clarity?). In this
> situation I prefer putting braces around both the if and the else, and
> if you get rid of that blank line, it doesn't cost you anything more ;)

Oh well...

> 
>> @@ -1598,6 +1656,8 @@ static int futex_wait(unsigned long __us
>>          restart->arg1 = val;
>>          restart->arg2 = (unsigned long)abs_time;
>>          restart->arg3 = (unsigned long)futex64;
>> +        if (shared)
>> +            restart->arg3 |= 2;
> 
> Could you make this into a proper flags argument and use #define 
> CONSTANTs for it?

Yes, but I'm not sure it will improve readability.
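
For comparison, a sketch of what the named-constant variant could look like.
The DEMO_ARG3_* names and the helpers are hypothetical, and the sketch assumes
futex64 is a plain 0/1 flag occupying bit 0, as the quoted hunk suggests:

```c
#include <assert.h>

#define DEMO_ARG3_FUTEX64 1UL	/* hypothetical name for the existing bit 0 */
#define DEMO_ARG3_SHARED  2UL	/* hypothetical name for the literal 2 */

/* Pack both booleans into the single restart argument. */
static unsigned long pack_restart_arg3(int futex64, int shared)
{
	unsigned long arg3 = 0;

	assert(futex64 == 0 || futex64 == 1);	/* assumption: 0/1 flag */
	if (futex64)
		arg3 |= DEMO_ARG3_FUTEX64;
	if (shared)
		arg3 |= DEMO_ARG3_SHARED;
	return arg3;
}

/* Unpack them again on the restart path. */
static int arg3_is_futex64(unsigned long arg3)
{
	return (arg3 & DEMO_ARG3_FUTEX64) != 0;
}

static int arg3_is_shared(unsigned long arg3)
{
	return (arg3 & DEMO_ARG3_SHARED) != 0;
}
```

The packed values are the same as with the literal 2; only the spelling at
the two call sites changes.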

> 
>> @@ -2377,23 +2455,24 @@ sys_futex64(u64 __user *uaddr, int op, u
>>      struct timespec ts;
>>      ktime_t t, *tp = NULL;
>>      u64 val2 = 0;
>> +    int opm = op & FUTEX_CMD_MASK;
> 
> What's opm stand for?

I guess 'm' stands for 'mask' or 'masked'?
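
A minimal sketch of what the masked value is for: separating the command
from the private flag carried in the same op word. The DEMO_ constants are
hypothetical names; their values mirror the mainline futex header as I know
it (private flag = 128, WAIT = 0, WAKE = 1) and are only assumed to match
this revision of the patch.

```c
#include <assert.h>

#define DEMO_FUTEX_WAIT		0
#define DEMO_FUTEX_WAKE		1
#define DEMO_FUTEX_PRIVATE_FLAG	128	/* assumed value */
#define DEMO_FUTEX_CMD_MASK	(~DEMO_FUTEX_PRIVATE_FLAG)

/* The "opm" of the patch: the command with the private flag stripped. */
static int futex_cmd(int op)
{
	return op & DEMO_FUTEX_CMD_MASK;
}

/* Whether the caller asked for a process-private futex. */
static int futex_is_private(int op)
{
	return (op & DEMO_FUTEX_PRIVATE_FLAG) != 0;
}

static void demo(void)
{
	int op = DEMO_FUTEX_WAKE | DEMO_FUTEX_PRIVATE_FLAG;

	assert(futex_cmd(op) == DEMO_FUTEX_WAKE);
	assert(futex_is_private(op));
	assert(!futex_is_private(DEMO_FUTEX_WAIT));
}
```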

Thank you
