* Question about PRIVATE_FUTEX
@ 2009-03-27 2:12 Minchan Kim
2009-03-27 4:32 ` Minchan Kim
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 2:12 UTC (permalink / raw)
To: Peter Zijlstra, Eric Dumazet; +Cc: lkml
Hi, Peter and Eric.
I am not expert about futex.
I am sorry if this is dumb question.
If we use private futex, get_futex_key don't call get_user_pages_fast
which pins page at page table.
Then, get_futex_value_locked calls __cpy_from_user_inatomic with
pagefault_disable.
Who make sure the user page is mapped at app's page table ?
--
Kinds regards,
Minchan Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 2:12 Question about PRIVATE_FUTEX Minchan Kim
@ 2009-03-27 4:32 ` Minchan Kim
2009-03-27 4:56 ` Eric Dumazet
2009-03-27 8:49 ` Peter Zijlstra
2 siblings, 0 replies; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 4:32 UTC (permalink / raw)
To: Peter Zijlstra, Eric Dumazet; +Cc: lkml
Hmm, It seems even shared futex, too.
After calling get_user_pages_fast, get_futex_key calls unlock, put_page, too.
Then futex_wait calls get_futex_value_locked.
Wouldn't kernel reclaim the page between get_fuex_key and
get_futex_value_locked ?
How do we make sure this race condition ?
On Fri, Mar 27, 2009 at 11:12 AM, Minchan Kim <minchan.kim@gmail.com> wrote:
> Hi, Peter and Eric.
>
> I am not expert about futex.
> I am sorry if this is dumb question.
>
> If we use private futex, get_futex_key don't call get_user_pages_fast
> which pins page at page table.
> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
> pagefault_disable.
>
> Who make sure the user page is mapped at app's page table ?
>
> --
> Kinds regards,
> Minchan Kim
>
--
Kinds regards,
Minchan Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 2:12 Question about PRIVATE_FUTEX Minchan Kim
2009-03-27 4:32 ` Minchan Kim
@ 2009-03-27 4:56 ` Eric Dumazet
2009-03-27 5:20 ` Minchan Kim
2009-03-27 8:49 ` Peter Zijlstra
2 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2009-03-27 4:56 UTC (permalink / raw)
To: Minchan Kim; +Cc: Peter Zijlstra, lkml
Minchan Kim a écrit :
> Hi, Peter and Eric.
>
> I am not expert about futex.
> I am sorry if this is dumb question.
>
> If we use private futex, get_futex_key don't call get_user_pages_fast
> which pins page at page table.
> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
> pagefault_disable.
>
> Who make sure the user page is mapped at app's page table ?
>
Nothing makes sure user page is mapped, as we dont have to (for private futexes
at least, since the 'key' is a combination of the futex virtual address (not
depending on the underlying physical page) and the task mm (sort of a static
offset per task)
If no page is mapped, a normal error should be returned to user, since
access to futex location will trigger a fault.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 4:56 ` Eric Dumazet
@ 2009-03-27 5:20 ` Minchan Kim
2009-03-27 5:50 ` Eric Dumazet
0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 5:20 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Peter Zijlstra, lkml
Thanks for kind explanation.
On Fri, Mar 27, 2009 at 1:56 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> Minchan Kim a écrit :
>> Hi, Peter and Eric.
>>
>> I am not expert about futex.
>> I am sorry if this is dumb question.
>>
>> If we use private futex, get_futex_key don't call get_user_pages_fast
>> which pins page at page table.
>> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
>> pagefault_disable.
>>
>> Who make sure the user page is mapped at app's page table ?
>>
>
> Nothing makes sure user page is mapped, as we dont have to (for private futexes
> at least, since the 'key' is a combination of the futex virtual address (not
> depending on the underlying physical page) and the task mm (sort of a static
> offset per task)
> If no page is mapped, a normal error should be returned to user, since
> access to futex location will trigger a fault.
>
I mean as follows.
It seems even shared futex case.
After calling get_user_pages_fast, get_futex_key calls unlock_page and
put_page, too. Then futex_wait calls get_futex_value_locked.
Generally, current page->count is one and nolocked.
I think kernel reclaimer can reclaim the page.
Wouldn't kernel reclaim the page between get_fuex_key and
get_futex_value_locked ?
If kernel reclaimed the page, __copy_from_user_inatomic can happens
page fault although pagefault_disable is on.
How do we make sure this race condition ?
Do I miss something ?
--
Kinds regards,
Minchan Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 5:20 ` Minchan Kim
@ 2009-03-27 5:50 ` Eric Dumazet
2009-03-27 6:20 ` Minchan Kim
0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2009-03-27 5:50 UTC (permalink / raw)
To: Minchan Kim; +Cc: Peter Zijlstra, lkml
Minchan Kim a écrit :
> Thanks for kind explanation.
>
> On Fri, Mar 27, 2009 at 1:56 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>> Minchan Kim a écrit :
>>> Hi, Peter and Eric.
>>>
>>> I am not expert about futex.
>>> I am sorry if this is dumb question.
>>>
>>> If we use private futex, get_futex_key don't call get_user_pages_fast
>>> which pins page at page table.
>>> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
>>> pagefault_disable.
>>>
>>> Who make sure the user page is mapped at app's page table ?
>>>
>> Nothing makes sure user page is mapped, as we dont have to (for private futexes
>> at least, since the 'key' is a combination of the futex virtual address (not
>> depending on the underlying physical page) and the task mm (sort of a static
>> offset per task)
>> If no page is mapped, a normal error should be returned to user, since
>> access to futex location will trigger a fault.
>>
>
> I mean as follows.
> It seems even shared futex case.
>
> After calling get_user_pages_fast, get_futex_key calls unlock_page and
> put_page, too. Then futex_wait calls get_futex_value_locked.
>
> Generally, current page->count is one and nolocked.
> I think kernel reclaimer can reclaim the page.
>
> Wouldn't kernel reclaim the page between get_fuex_key and
> get_futex_value_locked ?
> If kernel reclaimed the page, __copy_from_user_inatomic can happens
> page fault although pagefault_disable is on.
>
> How do we make sure this race condition ?
> Do I miss something ?
>
Hmmm, so your question is not about PRIVATE futexes, but shared ones.
I guess if page is no more present, its not a problem since
get_futex_value_locked() returns an error. We then take a slow
path, calling get_user() and retrying whole futex logic.
However, comment at line 1213 is misleading I guess, since
we dont hold mmap semaphore anymore ?
* for shared futexes, we hold the mmap semaphore, so the mapping
* cannot have changed since we looked it up in get_futex_key.
*/
ret = get_futex_value_locked(&uval, uaddr);
So if page was un-mapped by another thread, and re-mapped to another physical
page, then this thread might sleep on 'kernel futex' not anymore reachable...
User error, as it is not supposed to happen in a sane program, undefined
result...
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 5:50 ` Eric Dumazet
@ 2009-03-27 6:20 ` Minchan Kim
0 siblings, 0 replies; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 6:20 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Peter Zijlstra, lkml
On Fri, Mar 27, 2009 at 2:50 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
> Minchan Kim a écrit :
>> Thanks for kind explanation.
>>
>> On Fri, Mar 27, 2009 at 1:56 PM, Eric Dumazet <dada1@cosmosbay.com> wrote:
>>> Minchan Kim a écrit :
>>>> Hi, Peter and Eric.
>>>>
>>>> I am not expert about futex.
>>>> I am sorry if this is dumb question.
>>>>
>>>> If we use private futex, get_futex_key don't call get_user_pages_fast
>>>> which pins page at page table.
>>>> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
>>>> pagefault_disable.
>>>>
>>>> Who make sure the user page is mapped at app's page table ?
>>>>
>>> Nothing makes sure user page is mapped, as we dont have to (for private futexes
>>> at least, since the 'key' is a combination of the futex virtual address (not
>>> depending on the underlying physical page) and the task mm (sort of a static
>>> offset per task)
>>> If no page is mapped, a normal error should be returned to user, since
>>> access to futex location will trigger a fault.
>>>
>>
>> I mean as follows.
>> It seems even shared futex case.
>>
>> After calling get_user_pages_fast, get_futex_key calls unlock_page and
>> put_page, too. Then futex_wait calls get_futex_value_locked.
>>
>> Generally, current page->count is one and nolocked.
>> I think kernel reclaimer can reclaim the page.
>>
>> Wouldn't kernel reclaim the page between get_fuex_key and
>> get_futex_value_locked ?
>> If kernel reclaimed the page, __copy_from_user_inatomic can happens
>> page fault although pagefault_disable is on.
>>
>> How do we make sure this race condition ?
>> Do I miss something ?
>>
>
> Hmmm, so your question is not about PRIVATE futexes, but shared ones.
>
> I guess if page is no more present, its not a problem since
> get_futex_value_locked() returns an error. We then take a slow
> path, calling get_user() and retrying whole futex logic.
Indeed.
I misunderstood about __copy_from_user_inatomic.
It never sleep.
> However, comment at line 1213 is misleading I guess, since
> we dont hold mmap semaphore anymore ?
>
> * for shared futexes, we hold the mmap semaphore, so the mapping
> * cannot have changed since we looked it up in get_futex_key.
> */
> ret = get_futex_value_locked(&uval, uaddr);
>
> So if page was un-mapped by another thread, and re-mapped to another physical
> page, then this thread might sleep on 'kernel futex' not anymore reachable...
>
> User error, as it is not supposed to happen in a sane program, undefined
> result...
Yes. How about removing confusing comments ?
Thanks for great explanation :)
--
Kinds regards,
Minchan Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 2:12 Question about PRIVATE_FUTEX Minchan Kim
2009-03-27 4:32 ` Minchan Kim
2009-03-27 4:56 ` Eric Dumazet
@ 2009-03-27 8:49 ` Peter Zijlstra
2009-03-27 10:56 ` Minchan Kim
2 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2009-03-27 8:49 UTC (permalink / raw)
To: Minchan Kim; +Cc: Eric Dumazet, lkml, Darren Hart
On Fri, 2009-03-27 at 11:12 +0900, Minchan Kim wrote:
> Hi, Peter and Eric.
>
> I am not expert about futex.
> I am sorry if this is dumb question.
>
> If we use private futex, get_futex_key don't call get_user_pages_fast
> which pins page at page table.
But also drops that page ref at the end of get_futex_key(). The whole
and only purpose of using get_user_pages_fast() is to get at the mapping
data without having to obtain the mmap_sem.
> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
> pagefault_disable.
>
> Who make sure the user page is mapped at app's page table ?
Nobody, all uses of get_futex_value_locked() have to deal with it
returning -EFAULT.
Most of this is legacy btw, from when futex ops were done under the
mmap_sem. Back then we couldn't fault because that would cause mmap_sem
recursion. Howver, now that we don't hold mmap_sem anymore we could use
a faulting user access like get_user().
Darren has been working on patches to clean that up, some of those are
already merged in the -tip tree.
HTH
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 8:49 ` Peter Zijlstra
@ 2009-03-27 10:56 ` Minchan Kim
2009-03-27 11:14 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 10:56 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Eric Dumazet, lkml, Darren Hart
Hi, Perter.
Thanks for joining this thread.
My concern is page reclaimer can reclaim the user page which have
futex between get_fuex_key and get_futex_value_locked.
On Fri, Mar 27, 2009 at 5:49 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2009-03-27 at 11:12 +0900, Minchan Kim wrote:
>> Hi, Peter and Eric.
>>
>> I am not expert about futex.
>> I am sorry if this is dumb question.
>>
>> If we use private futex, get_futex_key don't call get_user_pages_fast
>> which pins page at page table.
>
> But also drops that page ref at the end of get_futex_key(). The whole
> and only purpose of using get_user_pages_fast() is to get at the mapping
> data without having to obtain the mmap_sem.
Thanks.
I understand. It's not only private futex.
>
>> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
>> pagefault_disable.
>>
>> Who make sure the user page is mapped at app's page table ?
>
> Nobody, all uses of get_futex_value_locked() have to deal with it
> returning -EFAULT.
Does It mean that __copy_from_user_inatomic in get_futex_value_locked
would be failed rather than sleep?
In fact, I don't make sure _copy_from_user_inatomic function's meaning.
As far as I understand, It never sleep. It just can be failed in case
of user page isn't mapped. Is right ?
Otherwise, it can be scheduled with pagefault_disable which increments
preempt_count. It is a atomic bug.
If my assume is right, it can be failed rather than sleep.
At this case, other architecture implements __copy_from_user_inatomic
with __copy_from_user which can be scheduled. It also can be bug.
Hmm, Now I am confusing.
> Most of this is legacy btw, from when futex ops were done under the
> mmap_sem. Back then we couldn't fault because that would cause mmap_sem
> recursion. Howver, now that we don't hold mmap_sem anymore we could use
> a faulting user access like get_user().
> Darren has been working on patches to clean that up, some of those are
> already merged in the -tip tree.
Thanks for good information.
It will be very desirable way to enhance kernel performance.
> HTH
>
--
Kinds regards,
Minchan Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 10:56 ` Minchan Kim
@ 2009-03-27 11:14 ` Peter Zijlstra
2009-03-27 11:37 ` Minchan Kim
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2009-03-27 11:14 UTC (permalink / raw)
To: Minchan Kim; +Cc: Eric Dumazet, lkml, Darren Hart
On Fri, 2009-03-27 at 19:56 +0900, Minchan Kim wrote:
> >> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
> >> pagefault_disable.
> >>
> >> Who make sure the user page is mapped at app's page table ?
> >
> > Nobody, all uses of get_futex_value_locked() have to deal with it
> > returning -EFAULT.
>
> Does It mean that __copy_from_user_inatomic in get_futex_value_locked
> would be failed rather than sleep?
Correct.
> In fact, I don't make sure _copy_from_user_inatomic function's meaning.
> As far as I understand, It never sleep. It just can be failed in case
> of user page isn't mapped. Is right ?
Correct.
> Otherwise, it can be scheduled with pagefault_disable which increments
> preempt_count. It is a atomic bug.
> If my assume is right, it can be failed rather than sleep.
> At this case, other architecture implements __copy_from_user_inatomic
> with __copy_from_user which can be scheduled. It also can be bug.
>
> Hmm, Now I am confusing.
Confused I guess ;-)
The trick is in the in_atomic() check in the pagefault handler and the
fixup section of the copy routines.
#define __copy_user(to, from, size) \
do { \
int __d0, __d1, __d2; \
__asm__ __volatile__( \
" cmp $7,%0\n" \
" jbe 1f\n" \
" movl %1,%0\n" \
" negl %0\n" \
" andl $7,%0\n" \
" subl %0,%3\n" \
"4: rep; movsb\n" \
" movl %3,%0\n" \
" shrl $2,%0\n" \
" andl $3,%3\n" \
" .align 2,0x90\n" \
"0: rep; movsl\n" \
" movl %3,%0\n" \
"1: rep; movsb\n" \
"2:\n" \
".section .fixup,\"ax\"\n" \
"5: addl %3,%0\n" \
" jmp 2b\n" \
"3: lea 0(%3,%0,4),%0\n" \
" jmp 2b\n" \
".previous\n" \
".section __ex_table,\"a\"\n" \
" .align 4\n" \
" .long 4b,5b\n" \
" .long 0b,3b\n" \
" .long 1b,2b\n" \
".previous" \
: "=&c"(size), "=&D" (__d0), "=&S" (__d1), "=r"(__d2) \
: "3"(size), "0"(size), "1"(to), "2"(from) \
: "memory"); \
} while (0)
see that __ex_table section, it tells the fault handler where to
continue in case of an atomic fault.
> > Most of this is legacy btw, from when futex ops were done under the
> > mmap_sem. Back then we couldn't fault because that would cause mmap_sem
> > recursion. Howver, now that we don't hold mmap_sem anymore we could use
> > a faulting user access like get_user().
> > Darren has been working on patches to clean that up, some of those are
> > already merged in the -tip tree.
>
> Thanks for good information.
> It will be very desirable way to enhance kernel performance.
I doubt it'll make a measurable difference, if you need to fault
performance sucks anyway. If you don't, the current code is just as
fast.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 11:14 ` Peter Zijlstra
@ 2009-03-27 11:37 ` Minchan Kim
2009-03-27 15:43 ` Darren Hart
0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2009-03-27 11:37 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Eric Dumazet, lkml, Darren Hart
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 5651 bytes --]
On Fri, Mar 27, 2009 at 8:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:> On Fri, 2009-03-27 at 19:56 +0900, Minchan Kim wrote:>>> >> Then, get_futex_value_locked calls __cpy_from_user_inatomic with>> >> pagefault_disable.>> >>>> >> Who make sure the user page is mapped at app's page table ?>> >>> > Nobody, all uses of get_futex_value_locked() have to deal with it>> > returning -EFAULT.>>>> Does It mean that __copy_from_user_inatomic in get_futex_value_locked>> would be failed rather than sleep?>> Correct.>>> In fact, I don't make sure _copy_from_user_inatomic function's meaning.>> As far as I understand, It never sleep. It just can be failed in case>> of user page isn't mapped. Is right ?>> Correct.>>> Otherwise, it can be scheduled with pagefault_disable which increments>> preempt_count. It is a atomic bug.>> If my assume is right, it can be failed rather than sleep.>> At this case, other architecture implements __copy_from_user_inatomic>> with __copy_from_user which can be scheduled. It also can be bug.>>>> Hmm, Now I am confusing.>> Confused I guess ;-)> The trick is in the in_atomic() check in the pagefault handler and the> fixup section of the copy routines.
Whew~, There was good hidden trick.I will dive into this assembly.I always thanks for your kindness. :)
> #define __copy_user(to, from, size)                   \> do {                                   \>     int __d0, __d1, __d2;                      \>     __asm__ __volatile__(                      \>         "    cmp  $7,%0\n"                  \>         "    jbe  1f\n"                    \>         "    movl %1,%0\n"                  \>         "    negl %0\n"                    \>         "    andl $7,%0\n"                  \>         "    subl %0,%3\n"                  \>         "4:   rep; movsb\n"                  \>         "    movl %3,%0\n"                  \>         "    shrl $2,%0\n"                  \>         "    andl $3,%3\n"                  \>         "    .align 2,0x90\n"                 \>         "0:   rep; movsl\n"                  \>         "    movl %3,%0\n"                  \>         "1:   rep; movsb\n"                  \>         "2:\n"                          \>         ".section .fixup,\"ax\"\n"                \>         "5:   addl %3,%0\n"                  \>         "    jmp 2b\n"                    \>         "3:   lea 0(%3,%0,4),%0\n"               \>         "    jmp 2b\n"                    \>         ".previous\n"                      \>         ".section __ex_table,\"a\"\n"              \>         "    .align 4\n"                   \>         "    .long 4b,5b\n"                  \>         "    .long 0b,3b\n"                  \>         "    .long 1b,2b\n"                  \>         ".previous"                       \>         : "=&c"(size), "=&D" (__d0), "=&S" (__d1), "=r"(__d2)  \>         : "3"(size), "0"(size), "1"(to), "2"(from)        \>         : "memory");                       \> } while (0)>> see that __ex_table section, it tells the fault handler where to> continue in case of an atomic fault.>>> > Most of this is legacy btw, from when futex ops were done under the>> > mmap_sem. Back then we couldn't fault because that would cause mmap_sem>> > recursion. Howver, now that we don't hold mmap_sem anymore we could use>> > a faulting user access like get_user().>> > Darren has been working on patches to clean that up, some of those are>> > already merged in the -tip tree.>>>> Thanks for good information.>> It will be very desirable way to enhance kernel performance.>> I doubt it'll make a measurable difference, if you need to fault> performance sucks anyway. If you don't, the current code is just as> fast.>
-- Kinds regards,Minchan Kimÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Question about PRIVATE_FUTEX
2009-03-27 11:37 ` Minchan Kim
@ 2009-03-27 15:43 ` Darren Hart
0 siblings, 0 replies; 11+ messages in thread
From: Darren Hart @ 2009-03-27 15:43 UTC (permalink / raw)
To: Minchan Kim; +Cc: Peter Zijlstra, Eric Dumazet, lkml
Minchan Kim wrote:
> On Fri, Mar 27, 2009 at 8:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Fri, 2009-03-27 at 19:56 +0900, Minchan Kim wrote:
>>
>>>>> Then, get_futex_value_locked calls __cpy_from_user_inatomic with
>>>>> pagefault_disable.
>>>>>
>>>>> Who make sure the user page is mapped at app's page table ?
>>>> Nobody, all uses of get_futex_value_locked() have to deal with it
>>>> returning -EFAULT.
>>> Does It mean that __copy_from_user_inatomic in get_futex_value_locked
>>> would be failed rather than sleep?
>> Correct.
>>
>>> In fact, I don't make sure _copy_from_user_inatomic function's meaning.
>>> As far as I understand, It never sleep. It just can be failed in case
>>> of user page isn't mapped. Is right ?
>> Correct.
>>
>>> Otherwise, it can be scheduled with pagefault_disable which increments
>>> preempt_count. It is a atomic bug.
>>> If my assume is right, it can be failed rather than sleep.
>>> At this case, other architecture implements __copy_from_user_inatomic
>>> with __copy_from_user which can be scheduled. It also can be bug.
>>>
>>> Hmm, Now I am confusing.
>> Confused I guess ;-)
>> The trick is in the in_atomic() check in the pagefault handler and the
>> fixup section of the copy routines.
>
> Whew~, There was good hidden trick.
> I will dive into this assembly.
> I always thanks for your kindness. :)
>
>> #define __copy_user(to, from, size) \
>> do { \
>> int __d0, __d1, __d2; \
>> __asm__ __volatile__( \
>> " cmp $7,%0\n" \
>> " jbe 1f\n" \
>> " movl %1,%0\n" \
>> " negl %0\n" \
>> " andl $7,%0\n" \
>> " subl %0,%3\n" \
>> "4: rep; movsb\n" \
>> " movl %3,%0\n" \
>> " shrl $2,%0\n" \
>> " andl $3,%3\n" \
>> " .align 2,0x90\n" \
>> "0: rep; movsl\n" \
>> " movl %3,%0\n" \
>> "1: rep; movsb\n" \
>> "2:\n" \
>> ".section .fixup,\"ax\"\n" \
>> "5: addl %3,%0\n" \
>> " jmp 2b\n" \
>> "3: lea 0(%3,%0,4),%0\n" \
>> " jmp 2b\n" \
>> ".previous\n" \
>> ".section __ex_table,\"a\"\n" \
>> " .align 4\n" \
>> " .long 4b,5b\n" \
>> " .long 0b,3b\n" \
>> " .long 1b,2b\n" \
>> ".previous" \
>> : "=&c"(size), "=&D" (__d0), "=&S" (__d1), "=r"(__d2) \
>> : "3"(size), "0"(size), "1"(to), "2"(from) \
>> : "memory"); \
>> } while (0)
>>
>> see that __ex_table section, it tells the fault handler where to
>> continue in case of an atomic fault.
>>
>>>> Most of this is legacy btw, from when futex ops were done under the
>>>> mmap_sem. Back then we couldn't fault because that would cause mmap_sem
>>>> recursion. Howver, now that we don't hold mmap_sem anymore we could use
>>>> a faulting user access like get_user().
>>>> Darren has been working on patches to clean that up, some of those are
>>>> already merged in the -tip tree.
I'm a little late to the party I guess. Minchan, a lot of the fault
logic has been cleaned up in the tip tree, core/futexes branch. The
removes a lot of the legacy complication from the faulting paths.
However, the get_futex_key code remains the same if I remember correctly.
>>> Thanks for good information.
>>> It will be very desirable way to enhance kernel performance.
>> I doubt it'll make a measurable difference, if you need to fault
>> performance sucks anyway. If you don't, the current code is just as
>> fast.
>>
Agreed. If you are suffering performance hits from excessive paging,
consider locking your memory.
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2009-03-27 15:43 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-27 2:12 Question about PRIVATE_FUTEX Minchan Kim
2009-03-27 4:32 ` Minchan Kim
2009-03-27 4:56 ` Eric Dumazet
2009-03-27 5:20 ` Minchan Kim
2009-03-27 5:50 ` Eric Dumazet
2009-03-27 6:20 ` Minchan Kim
2009-03-27 8:49 ` Peter Zijlstra
2009-03-27 10:56 ` Minchan Kim
2009-03-27 11:14 ` Peter Zijlstra
2009-03-27 11:37 ` Minchan Kim
2009-03-27 15:43 ` Darren Hart
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox