From: Darren Hart <dvhltc@us.ibm.com>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Eric Dumazet <dada1@cosmosbay.com>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: Question about PRIVATE_FUTEX
Date: Fri, 27 Mar 2009 08:43:17 -0700
Message-ID: <49CCF415.7080201@us.ibm.com>
In-Reply-To: <28c262360903270437l72cd31e1ja2daf00dbcf29675@mail.gmail.com>

Minchan Kim wrote:
> On Fri, Mar 27, 2009 at 8:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Fri, 2009-03-27 at 19:56 +0900, Minchan Kim wrote:
>>
>>>>> Then, get_futex_value_locked calls __copy_from_user_inatomic with
>>>>> pagefault_disable.
>>>>>
>>>>> Who make sure the user page is mapped at app's page table ?
>>>> Nobody, all uses of get_futex_value_locked() have to deal with it
>>>> returning -EFAULT.
>>> Does It mean that __copy_from_user_inatomic in get_futex_value_locked
>>> would be failed rather than sleep?
>> Correct.
>>
>>> In fact, I don't make sure __copy_from_user_inatomic function's meaning.
>>> As far as I understand, It never sleep. It just can be failed in case
>>> of user page isn't mapped. Is right ?
>> Correct.
>>
>>> Otherwise, it can be scheduled with pagefault_disable which increments
>>> preempt_count. It is a atomic bug.
>>> If my assume is right, it can be failed rather than sleep.
>>> At this case, other architecture implements __copy_from_user_inatomic
>>> with __copy_from_user which can be scheduled. It also can be bug.
>>>
>>> Hmm, Now I am confusing.
>> Confused I guess ;-)
>> The trick is in the in_atomic() check in the pagefault handler and the
>> fixup section of the copy routines.
>
> Whew~, There was good hidden trick.
> I will dive into this assembly.
> I always thanks for your kindness. :)
>
>> #define __copy_user(to, from, size) \
>> do { \
>> int __d0, __d1, __d2; \
>> __asm__ __volatile__( \
>> " cmp $7,%0\n" \
>> " jbe 1f\n" \
>> " movl %1,%0\n" \
>> " negl %0\n" \
>> " andl $7,%0\n" \
>> " subl %0,%3\n" \
>> "4: rep; movsb\n" \
>> " movl %3,%0\n" \
>> " shrl $2,%0\n" \
>> " andl $3,%3\n" \
>> " .align 2,0x90\n" \
>> "0: rep; movsl\n" \
>> " movl %3,%0\n" \
>> "1: rep; movsb\n" \
>> "2:\n" \
>> ".section .fixup,\"ax\"\n" \
>> "5: addl %3,%0\n" \
>> " jmp 2b\n" \
>> "3: lea 0(%3,%0,4),%0\n" \
>> " jmp 2b\n" \
>> ".previous\n" \
>> ".section __ex_table,\"a\"\n" \
>> " .align 4\n" \
>> " .long 4b,5b\n" \
>> " .long 0b,3b\n" \
>> " .long 1b,2b\n" \
>> ".previous" \
>> : "=&c"(size), "=&D" (__d0), "=&S" (__d1), "=r"(__d2) \
>> : "3"(size), "0"(size), "1"(to), "2"(from) \
>> : "memory"); \
>> } while (0)
>>
>> see that __ex_table section, it tells the fault handler where to
>> continue in case of an atomic fault.
>>
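To make the fixup idea above concrete, here is a minimal userspace sketch of the decision the fault handler makes. It is only a model, not kernel code: struct ex_entry, find_fixup(), on_user_fault() and the enum are hypothetical names made up for illustration, and it ignores the case where a non-atomic fault also fails and falls back to the fixup.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one __ex_table entry: the address of an
 * instruction that may fault, and the address to resume at instead.
 */
struct ex_entry {
    uintptr_t insn;
    uintptr_t fixup;
};

/* Linear search of the table; returns the fixup address, or 0 if none. */
static uintptr_t find_fixup(const struct ex_entry *tbl, size_t n,
                            uintptr_t fault_ip)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == fault_ip)
            return tbl[i].fixup;
    return 0;
}

enum fault_action {
    FAULT_HANDLE_NORMALLY, /* may sleep and page the memory in */
    FAULT_RESUME_AT_FIXUP, /* copy routine reports a short copy */
    FAULT_OOPS             /* atomic fault with no fixup entry */
};

/*
 * Simplified model of the handler: in atomic context it must not
 * sleep, so it looks for a fixup instead of servicing the fault.
 */
static enum fault_action on_user_fault(int in_atomic,
                                       const struct ex_entry *tbl, size_t n,
                                       uintptr_t ip, uintptr_t *resume)
{
    if (!in_atomic)
        return FAULT_HANDLE_NORMALLY;

    *resume = find_fixup(tbl, n, ip);
    return *resume ? FAULT_RESUME_AT_FIXUP : FAULT_OOPS;
}
```

In the macro quoted above, the three ".long" pairs play the role of the table: a fault at label 4 resumes at 5, a fault at 0 resumes at 3, and a fault at 1 resumes at 2, so the caller sees a non-zero "bytes not copied" count rather than a sleep.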
>>>> Most of this is legacy btw, from when futex ops were done under the
>>>> mmap_sem. Back then we couldn't fault because that would cause mmap_sem
>>>> recursion. However, now that we don't hold mmap_sem anymore we could use
>>>> a faulting user access like get_user().
>>>> Darren has been working on patches to clean that up, some of those are
>>>> already merged in the -tip tree.
I'm a little late to the party, I guess. Minchan, a lot of the fault
logic has been cleaned up in the tip tree, core/futexes branch. That
removes a lot of the legacy complication from the faulting paths.
However, the get_futex_key code remains the same, if I remember correctly.
>>> Thanks for good information.
>>> It will be very desirable way to enhance kernel performance.
>> I doubt it'll make a measurable difference, if you need to fault
>> performance sucks anyway. If you don't, the current code is just as
>> fast.
>>
Agreed. If you are suffering performance hits from excessive paging,
consider locking your memory.
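
For reference, a minimal userspace sketch of locking memory with mlockall(2). The helper name lock_all_memory() is made up for this example, and the call can fail if RLIMIT_MEMLOCK is too low or the process lacks the needed privilege, so check the return value.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Lock all current and future mappings of the calling process into
 * RAM so they cannot be paged out. Returns 0 on success or a
 * negative errno value on failure.
 */
static int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}
```

Call it once early in main(), before the latency-sensitive threads start.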
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team