From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Ulrich Drepper <drepper@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Jones <davej@redhat.com>, Ingo Molnar <mingo@elte.hu>,
Andi Kleen <ak@suse.de>,
Ravikiran G Thirumalai <kiran@scalex86.org>,
"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
pravin b shelar <pravin.shelar@calsoftinc.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] FUTEX : new PRIVATE futexes
Date: Fri, 06 Apr 2007 11:19:05 +1000
Message-ID: <4615A009.808@yahoo.com.au>
In-Reply-To: <20070405194942.1414c030.dada1@cosmosbay.com>
Hi Eric,
Thanks for doing this... It's looking good, I just have some minor
comments:
Eric Dumazet wrote:
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> --- linux-2.6.21-rc5-mm4/kernel/futex.c
> +++ linux-2.6.21-rc5-mm4-ed/kernel/futex.c
> @@ -16,6 +16,9 @@
> * Copyright (C) 2006 Red Hat, Inc., Ingo Molnar <mingo@redhat.com>
> * Copyright (C) 2006 Timesys Corp., Thomas Gleixner <tglx@timesys.com>
> *
> + * PRIVATE futexes by Eric Dumazet
> + * Copyright (C) 2007 Eric Dumazet <dada1@cosmosbay.com>
> + *
> * Thanks to Ben LaHaise for yelling "hashed waitqueues" loudly
> * enough at me, Linus for the original (flawed) idea, Matthew
> * Kirkwood for proof-of-concept implementation.
> @@ -199,9 +202,12 @@ static inline int match_futex(union fute
> * Returns: 0, or negative error code.
> * The key words are stored in *key on success.
> *
> - * Should be called with &current->mm->mmap_sem but NOT any spinlocks.
> + * shared is NULL for PROCESS_PRIVATE futexes
> + * For other futexes, it points to &current->mm->mmap_sem and
> + * caller must have taken the reader lock. but NOT any spinlocks.
> */
> -int get_futex_key(void __user *uaddr, union futex_key *key)
> +int get_futex_key(void __user *uaddr, union futex_key *key,
> + struct rw_semaphore *shared)
Can we pass in something other than the rw_semaphore here? Seeing as
it only actually gets used as a flag, it might be nicer just to pass
a 0 or 1? And all through the call stack...
Did the whole thing just turn out neater when you passed the rwsem?
We always know to use current->mm->mmap_sem, so it doesn't seem like
a boolean flag would hurt?
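To make the question concrete, here is a rough userspace sketch of what a flag-based signature could look like. The struct names and get_key_sketch() are stand-ins invented for this example, not the kernel's types; the real function would still take current->mm->mmap_sem itself whenever the flag is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
struct mm_sketch {
	int unused;
};

struct key_sketch {
	struct mm_sketch *mm;
	unsigned long address;
	int needs_vma_lookup;
};

/*
 * 'fshared' is a plain 0/1 flag. The callee already knows which
 * semaphore is involved (the current mm's), so passing a pointer to it
 * down the call stack carries no extra information.
 */
static int get_key_sketch(struct mm_sketch *mm, unsigned long uaddr,
			  struct key_sketch *key, int fshared)
{
	assert(fshared == 0 || fshared == 1);

	if (mm == NULL || key == NULL)
		return -1;

	key->mm = mm;
	key->address = uaddr;
	/* Private futexes key on (mm, address) and skip the vma lookup. */
	key->needs_vma_lookup = fshared;
	return 0;
}
```

A caller would then write get_key_sketch(mm, uaddr, &key, 0) for the private case and pass 1 for the shared case.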
> {
> unsigned long address = (unsigned long)uaddr;
> struct mm_struct *mm = current->mm;
> @@ -218,6 +224,22 @@ int get_futex_key(void __user *uaddr, un
> address -= key->both.offset;
>
> /*
> + * PROCESS_PRIVATE futexes are fast.
> + * As the mm cannot disappear under us and the 'key' only needs
> + * virtual address, we dont even have to find the underlying vma.
> + * Note : We do have to check 'address' is a valid user address,
> + * but access_ok() should be faster than find_vma()
> + * Note : At this point, address points to the start of page,
> + * not the real futex address, this is ok.
> + */
> + if (!shared) {
> + if (!access_ok(VERIFY_WRITE, address, sizeof(int)))
> + return -EFAULT;
Shouldn't that be sizeof(long) to handle 64 bit futexes? Or strictly, it
should depend on the size of the operation. Maybe the access_ok check
should go outside get_futex_key?
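As a userspace sketch of why the size matters: a range check that passes for a 4-byte access can fail for an 8-byte one at the top of the valid range. SKETCH_USER_LIMIT and both helpers below are made up for illustration and only mimic the idea of access_ok(); they are not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound of the valid user range, for illustration. */
#define SKETCH_USER_LIMIT 0x1000UL

/* Returns nonzero if [addr, addr + size) lies below the limit. */
static int range_ok_sketch(unsigned long addr, size_t size)
{
	assert(size > 0);
	return size <= SKETCH_USER_LIMIT && addr <= SKETCH_USER_LIMIT - size;
}

/*
 * The size checked follows the operation: 4 bytes for a 32-bit futex,
 * 8 bytes for a 64-bit one, rather than a hard-coded sizeof(int).
 */
static int check_futex_addr_sketch(unsigned long uaddr, int is_64bit)
{
	size_t size = is_64bit ? sizeof(uint64_t) : sizeof(uint32_t);

	return range_ok_sketch(uaddr, size) ? 0 : -EFAULT;
}
```

With the limit above, address 0xFFC is accepted for a 32-bit futex but rejected for a 64-bit one, which is the case a fixed sizeof(int) would miss.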
> + key->private.mm = mm;
> + key->private.address = address;
> + return 0;
> + }
> + /*
> * The futex is hashed differently depending on whether
> * it's in a shared or private mapping. So check vma first.
> */
> @@ -244,6 +266,7 @@ int get_futex_key(void __user *uaddr, un
> * mappings of _writable_ handles.
> */
> if (likely(!(vma->vm_flags & VM_MAYSHARE))) {
> + key->both.offset += FUT_OFF_MMSHARED; /* reference taken on mm */
> key->private.mm = mm;
> key->private.address = address;
> return 0;
> @@ -253,7 +276,7 @@ int get_futex_key(void __user *uaddr, un
> * Linear file mappings are also simple.
> */
> key->shared.inode = vma->vm_file->f_path.dentry->d_inode;
> - key->both.offset++; /* Bit 0 of offset indicates inode-based key. */
> + key->both.offset += FUT_OFF_INODE; /* inode-based key. */
> if (likely(!(vma->vm_flags & VM_NONLINEAR))) {
> key->shared.pgoff = (((address - vma->vm_start) >> PAGE_SHIFT)
> + vma->vm_pgoff);
I like |= for adding flags, it seems less ambiguous. But I guess that's
a matter of opinion. Hugh seems to like +=, and I can't argue with him
about style issues ;)
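A small userspace sketch of the ambiguity I mean. The value 1 for the inode flag follows from the old offset++ code; the value 2 for FUT_OFF_MMSHARED is an assumption here, and the SKETCH_ names are invented for the example.

```c
#include <assert.h>

/* Assumed flag values, for illustration only. */
#define SKETCH_FUT_OFF_INODE    1u
#define SKETCH_FUT_OFF_MMSHARED 2u

static_assert((SKETCH_FUT_OFF_INODE & SKETCH_FUT_OFF_MMSHARED) == 0,
	      "flag bits must not overlap");

/* Setting a flag by addition is only correct if the bit was clear. */
static unsigned int set_flag_add(unsigned int offset, unsigned int flag)
{
	return offset + flag;
}

/* Setting a flag with OR is idempotent. */
static unsigned int set_flag_or(unsigned int offset, unsigned int flag)
{
	return offset | flag;
}
```

Applying set_flag_or() twice with the inode flag leaves the offset unchanged after the first call, whereas applying set_flag_add() twice carries into the next bit and turns an inode key into something that looks like an mm-shared one. The += form is fine as long as the low bits are known to be zero at that point, which they are in the patch.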
> @@ -281,17 +304,19 @@ EXPORT_SYMBOL_GPL(get_futex_key);
> * Take a reference to the resource addressed by a key.
> * Can be called while holding spinlocks.
> *
> - * NOTE: mmap_sem MUST be held between get_futex_key() and calling this
> - * function, if it is called at all. mmap_sem keeps key->shared.inode valid.
> */
> inline void get_futex_key_refs(union futex_key *key)
> {
> - if (key->both.ptr != 0) {
> - if (key->both.offset & 1)
> + if (key->both.ptr == 0)
> + return;
> + switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
> + case FUT_OFF_INODE:
> atomic_inc(&key->shared.inode->i_count);
> - else
> + break;
> + case FUT_OFF_MMSHARED:
> atomic_inc(&key->private.mm->mm_count);
> - }
> + break;
> + }
> }
> EXPORT_SYMBOL_GPL(get_futex_key_refs);
>
> @@ -301,11 +326,15 @@ EXPORT_SYMBOL_GPL(get_futex_key_refs);
> */
> void drop_futex_key_refs(union futex_key *key)
> {
> - if (key->both.ptr != 0) {
> - if (key->both.offset & 1)
> + if (key->both.ptr == 0)
> + return;
> + switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
> + case FUT_OFF_INODE:
> iput(key->shared.inode);
> - else
> + break;
> + case FUT_OFF_MMSHARED:
> mmdrop(key->private.mm);
> + break;
> }
> }
> EXPORT_SYMBOL_GPL(drop_futex_key_refs);
I wonder if it would be worthwhile inlining and likely()ing the
private fastpath? Might make it pretty compact... I guess that's
something to worry about after glibc gets support.
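Roughly what I have in mind, as a userspace sketch: an inline wrapper that returns immediately when neither flag is set (the private case takes no reference) and an out-of-line slow path for the rest. It relies on the GCC/Clang __builtin_expect builtin; the flag values, the counters standing in for i_count and mm_count, and all _sketch names are assumptions for the example, and the key->both.ptr == 0 check is left out for brevity.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's likely(); GCC/Clang builtin. */
#define likely_sketch(x) __builtin_expect(!!(x), 1)

/* Assumed flag values, for illustration only. */
#define SKETCH_FUT_OFF_INODE    1u
#define SKETCH_FUT_OFF_MMSHARED 2u
#define SKETCH_FUT_OFF_MASK \
	(SKETCH_FUT_OFF_INODE | SKETCH_FUT_OFF_MMSHARED)

/* Plain counters standing in for the inode and mm reference counts. */
struct ref_counts_sketch {
	int inode_refs;
	int mm_refs;
};

/* Out-of-line slow path: shared keys that really need a reference. */
static void get_refs_slow_sketch(unsigned int offset,
				 struct ref_counts_sketch *rc)
{
	switch (offset & SKETCH_FUT_OFF_MASK) {
	case SKETCH_FUT_OFF_INODE:
		rc->inode_refs++;
		break;
	case SKETCH_FUT_OFF_MMSHARED:
		rc->mm_refs++;
		break;
	}
}

/* Inline fast path: private keys carry no flag and take no reference. */
static inline void get_refs_sketch(unsigned int offset,
				   struct ref_counts_sketch *rc)
{
	if (likely_sketch(!(offset & SKETCH_FUT_OFF_MASK)))
		return;
	get_refs_slow_sketch(offset, rc);
}
```

Whether this actually helps would need measuring once private futexes are the common case.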
> @@ -339,28 +368,40 @@ get_futex_value_locked(unsigned long *de
> }
>
> /*
> - * Fault handling. Called with current->mm->mmap_sem held.
> + * Fault handling.
> + * if shared is non NULL, current->mm->mmap_sem is already held
> */
> -static int futex_handle_fault(unsigned long address, int attempt)
> +static int futex_handle_fault(unsigned long address, int attempt,
> + struct rw_semaphore *shared)
> {
> struct vm_area_struct * vma;
> struct mm_struct *mm = current->mm;
> + int ret = 0;
>
> - if (attempt > 2 || !(vma = find_vma(mm, address)) ||
> - vma->vm_start > address || !(vma->vm_flags & VM_WRITE))
> + if (attempt > 2)
> return -EFAULT;
>
> - switch (handle_mm_fault(mm, vma, address, 1)) {
> - case VM_FAULT_MINOR:
> - current->min_flt++;
> - break;
> - case VM_FAULT_MAJOR:
> - current->maj_flt++;
> - break;
> - default:
> - return -EFAULT;
> - }
> - return 0;
> + if (!shared)
> + down_read(&mm->mmap_sem);
> +
> + if (!(vma = find_vma(mm, address)) ||
> + vma->vm_start > address || !(vma->vm_flags & VM_WRITE))
> + ret = -EFAULT;
> +
> + else
> + switch (handle_mm_fault(mm, vma, address, 1)) {
> + case VM_FAULT_MINOR:
> + current->min_flt++;
> + break;
> + case VM_FAULT_MAJOR:
> + current->maj_flt++;
> + break;
> + default:
> + ret = -EFAULT;
> + }
> + if (!shared)
> + up_read(&mm->mmap_sem);
> + return ret;
> }
>
> /*
You've got an extra space after the if (maybe for clarity?). In this
situation I prefer putting braces around both the if and the else, and
if you get rid of that blank line, it doesn't cost you anything more ;)
> @@ -1598,6 +1656,8 @@ static int futex_wait(unsigned long __us
> restart->arg1 = val;
> restart->arg2 = (unsigned long)abs_time;
> restart->arg3 = (unsigned long)futex64;
> + if (shared)
> + restart->arg3 |= 2;
Could you make this into a proper flags argument and use #define CONSTANTs for it?
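For example, something along these lines (a userspace sketch; the constant and helper names are hypothetical, and it assumes futex64 is a 0/1 value as in the quoted code):

```c
#include <assert.h>

/* Hypothetical named bits for restart->arg3, replacing the bare 1 and 2. */
#define SKETCH_ARG3_FUTEX64 0x1UL
#define SKETCH_ARG3_SHARED  0x2UL

static_assert((SKETCH_ARG3_FUTEX64 & SKETCH_ARG3_SHARED) == 0,
	      "flag bits must not overlap");

/* Build the flags word stored in the restart block. */
static unsigned long pack_restart_flags_sketch(int futex64, int shared)
{
	unsigned long flags = 0;

	if (futex64)
		flags |= SKETCH_ARG3_FUTEX64;
	if (shared)
		flags |= SKETCH_ARG3_SHARED;
	return flags;
}

/* Decode helpers for the restart handler. */
static int restart_is_futex64_sketch(unsigned long arg3)
{
	return (arg3 & SKETCH_ARG3_FUTEX64) != 0;
}

static int restart_is_shared_sketch(unsigned long arg3)
{
	return (arg3 & SKETCH_ARG3_SHARED) != 0;
}
```

The restart handler can then test named bits instead of relying on the reader knowing what the literal 2 means.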
> @@ -2377,23 +2455,24 @@ sys_futex64(u64 __user *uaddr, int op, u
> struct timespec ts;
> ktime_t t, *tp = NULL;
> u64 val2 = 0;
> + int opm = op & FUTEX_CMD_MASK;
What's opm stand for?
>
> - if (utime && (op == FUTEX_WAIT || op == FUTEX_LOCK_PI)) {
> + if (utime && (opm == FUTEX_WAIT || opm == FUTEX_LOCK_PI)) {
--
SUSE Labs, Novell Inc.