[Xenomai-core] [COW-BUG] __alloc_pages called from atomic context

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Xenomai-core] [COW-BUG] __alloc_pages called from atomic context
@ 2007-08-07 13:06 Jan Kiszka
  2007-08-07 13:19 ` [Xenomai-core] [Adeos-main] " Gilles Chanteperdrix
       [not found] ` <46B86FE4.3040209@domain.hid>
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 13:06 UTC (permalink / raw)
  To: Philippe Gerum, Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 1221 bytes --]

Hi all,

we are getting a lot of

BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
in_atomic():1, irqs_disabled():0
 [<c010305d>] show_trace_log_lvl+0x1a/0x2f
 [<c0103156>] show_trace+0x12/0x14
 [<c0103915>] dump_stack+0x16/0x18
 [<c010c4ab>] __might_sleep+0xcd/0xd3
 [<c0149488>] __alloc_pages+0x32/0x281
 [<c014fdd2>] copy_page_range+0x221/0x41e
 [<c010ec18>] copy_process+0x9e1/0xfe2
 [<c010f415>] do_fork+0x99/0x176
 [<c0100e75>] sys_clone+0x33/0x39
 [<c0102aaf>] syscall_call+0x7/0xb
 =======================

here due to a Xenomai program issuing system() calls.

After once again dissecting the "nice" mm code (sigh...), the reason
turned out to be plain simple:

copy_pte_range(...);
  spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
  copy_one_pte(...);
    if (is_cow_mapping(vm_flags))
      alloc_page_vma(GFP_HIGHUSER, ...);
        __alloc_pages(...)
	  might_sleep_if(gfp_mask & __GFP_WAIT);

And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...

So the bad news is that the COW code in likely all i-pipe versions is
broken. But the good new is that this might be easily fixable by
providing the right gfp_mask. GFP_ATOMIC?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 13:06 [Xenomai-core] [COW-BUG] __alloc_pages called from atomic context Jan Kiszka
@ 2007-08-07 13:19 ` Gilles Chanteperdrix
       [not found] ` <46B86FE4.3040209@domain.hid>
  1 sibling, 0 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-07 13:19 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gilles Chanteperdrix, adeos-main, xenomai-core

On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Hi all,
>
> we are getting a lot of
>
> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
> in_atomic():1, irqs_disabled():0
>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
>  [<c0103156>] show_trace+0x12/0x14
>  [<c0103915>] dump_stack+0x16/0x18
>  [<c010c4ab>] __might_sleep+0xcd/0xd3
>  [<c0149488>] __alloc_pages+0x32/0x281
>  [<c014fdd2>] copy_page_range+0x221/0x41e
>  [<c010ec18>] copy_process+0x9e1/0xfe2
>  [<c010f415>] do_fork+0x99/0x176
>  [<c0100e75>] sys_clone+0x33/0x39
>  [<c0102aaf>] syscall_call+0x7/0xb
>  =======================
>
> here due to a Xenomai program issuing system() calls.
>
> After once again dissecting the "nice" mm code (sigh...), the reason
> turned out to be plain simple:
>
> copy_pte_range(...);
>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>   copy_one_pte(...);
>     if (is_cow_mapping(vm_flags))
>       alloc_page_vma(GFP_HIGHUSER, ...);
>         __alloc_pages(...)
>           might_sleep_if(gfp_mask & __GFP_WAIT);
>
> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
>
> So the bad news is that the COW code in likely all i-pipe versions is
> broken. But the good new is that this might be easily fixable by
> providing the right gfp_mask. GFP_ATOMIC?

It does not look like a good solution, you are going to empty the
GFP_ATOMIC pools. The proper solution would rather be to look at the
real mm code (I mean not the one I wrote) and see how they cope with
this issue. Maybe we can release momentarily the spinlock and recheck
the flags after allocating and reacquiring it?

-- 
                                               Gilles Chanteperdrix


^ permalink raw reply	[flat|nested] 17+ messages in thread

[parent not found: <46B86FE4.3040209@domain.hid>]

* Re: [Xenomai-core] [COW-BUG] __alloc_pages called from atomic context
       [not found] ` <46B86FE4.3040209@domain.hid>
@ 2007-08-07 14:02   ` Jan Kiszka
  2007-08-07 14:11     ` [Xenomai-core] [Adeos-main] " Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 14:02 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2257 bytes --]

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi all,
>>
>> we are getting a lot of
>>
>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
>> in_atomic():1, irqs_disabled():0
>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
>>  [<c0103156>] show_trace+0x12/0x14
>>  [<c0103915>] dump_stack+0x16/0x18
>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
>>  [<c0149488>] __alloc_pages+0x32/0x281
>>  [<c014fdd2>] copy_page_range+0x221/0x41e
>>  [<c010ec18>] copy_process+0x9e1/0xfe2
>>  [<c010f415>] do_fork+0x99/0x176
>>  [<c0100e75>] sys_clone+0x33/0x39
>>  [<c0102aaf>] syscall_call+0x7/0xb
>>  =======================
>>
>> here due to a Xenomai program issuing system() calls.
>>
>> After once again dissecting the "nice" mm code (sigh...), the reason
>> turned out to be plain simple:
>>
>> copy_pte_range(...);
>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>>   copy_one_pte(...);
>>     if (is_cow_mapping(vm_flags))
>>       alloc_page_vma(GFP_HIGHUSER, ...);
>>         __alloc_pages(...)
>> 	  might_sleep_if(gfp_mask & __GFP_WAIT);
>>
>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
>>
>> So the bad news is that the COW code in likely all i-pipe versions is
>> broken. But the good new is that this might be easily fixable by
>> providing the right gfp_mask. GFP_ATOMIC?
> 
> It does not look like a good solution, you are going to empty the
> GFP_ATOMIC pools. The proper solution would rather be to look at the
> real mm code (I mean not the one I wrote) and see how they cope with
> this issue.

Mmpf. What are the chances for a quick fix within the next days? We have
to consider alternatives right now here because the whole system is
meant for production purpose next week (C-ELROB '07).

OK, I'm already finding myself inside the code :-/. What about this
approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
out, drop all locks (just like it happens in case of need_resched()),
try to fill up the pool, and restart then. What would reliably make
Linux refill its atomic pool?

Alternative approach: preallocate the required pages before entering the
loop in copy_pte_range. But that may require more code changes.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:02   ` [Xenomai-core] " Jan Kiszka
@ 2007-08-07 14:11     ` Gilles Chanteperdrix
  2007-08-07 14:20       ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-07 14:11 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gilles Chanteperdrix, adeos-main, xenomai-core

On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Gilles Chanteperdrix wrote:
> > Jan Kiszka wrote:
> >> Hi all,
> >>
> >> we are getting a lot of
> >>
> >> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
> >> in_atomic():1, irqs_disabled():0
> >>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
> >>  [<c0103156>] show_trace+0x12/0x14
> >>  [<c0103915>] dump_stack+0x16/0x18
> >>  [<c010c4ab>] __might_sleep+0xcd/0xd3
> >>  [<c0149488>] __alloc_pages+0x32/0x281
> >>  [<c014fdd2>] copy_page_range+0x221/0x41e
> >>  [<c010ec18>] copy_process+0x9e1/0xfe2
> >>  [<c010f415>] do_fork+0x99/0x176
> >>  [<c0100e75>] sys_clone+0x33/0x39
> >>  [<c0102aaf>] syscall_call+0x7/0xb
> >>  =======================
> >>
> >> here due to a Xenomai program issuing system() calls.
> >>
> >> After once again dissecting the "nice" mm code (sigh...), the reason
> >> turned out to be plain simple:
> >>
> >> copy_pte_range(...);
> >>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> >>   copy_one_pte(...);
> >>     if (is_cow_mapping(vm_flags))
> >>       alloc_page_vma(GFP_HIGHUSER, ...);
> >>         __alloc_pages(...)
> >>        might_sleep_if(gfp_mask & __GFP_WAIT);
> >>
> >> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
> >>
> >> So the bad news is that the COW code in likely all i-pipe versions is
> >> broken. But the good new is that this might be easily fixable by
> >> providing the right gfp_mask. GFP_ATOMIC?
> >
> > It does not look like a good solution, you are going to empty the
> > GFP_ATOMIC pools. The proper solution would rather be to look at the
> > real mm code (I mean not the one I wrote) and see how they cope with
> > this issue.
>
> Mmpf. What are the chances for a quick fix within the next days? We have
> to consider alternatives right now here because the whole system is
> meant for production purpose next week (C-ELROB '07).
>
> OK, I'm already finding myself inside the code :-/. What about this
> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
> out, drop all locks (just like it happens in case of need_resched()),
> try to fill up the pool, and restart then. What would reliably make
> Linux refill its atomic pool?
>
> Alternative approach: preallocate the required pages before entering the
> loop in copy_pte_range. But that may require more code changes.

I would say the real fix is to drop momentarily the spinlock(s?) for allocating.

-- 
                                               Gilles Chanteperdrix


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:11     ` [Xenomai-core] [Adeos-main] " Gilles Chanteperdrix
@ 2007-08-07 14:20       ` Jan Kiszka
  2007-08-07 14:24         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 14:20 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2726 bytes --]

Gilles Chanteperdrix wrote:
> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Hi all,
>>>>
>>>> we are getting a lot of
>>>>
>>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
>>>> in_atomic():1, irqs_disabled():0
>>>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
>>>>  [<c0103156>] show_trace+0x12/0x14
>>>>  [<c0103915>] dump_stack+0x16/0x18
>>>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
>>>>  [<c0149488>] __alloc_pages+0x32/0x281
>>>>  [<c014fdd2>] copy_page_range+0x221/0x41e
>>>>  [<c010ec18>] copy_process+0x9e1/0xfe2
>>>>  [<c010f415>] do_fork+0x99/0x176
>>>>  [<c0100e75>] sys_clone+0x33/0x39
>>>>  [<c0102aaf>] syscall_call+0x7/0xb
>>>>  =======================
>>>>
>>>> here due to a Xenomai program issuing system() calls.
>>>>
>>>> After once again dissecting the "nice" mm code (sigh...), the reason
>>>> turned out to be plain simple:
>>>>
>>>> copy_pte_range(...);
>>>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>>>>   copy_one_pte(...);
>>>>     if (is_cow_mapping(vm_flags))
>>>>       alloc_page_vma(GFP_HIGHUSER, ...);
>>>>         __alloc_pages(...)
>>>>        might_sleep_if(gfp_mask & __GFP_WAIT);
>>>>
>>>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
>>>>
>>>> So the bad news is that the COW code in likely all i-pipe versions is
>>>> broken. But the good new is that this might be easily fixable by
>>>> providing the right gfp_mask. GFP_ATOMIC?
>>> It does not look like a good solution, you are going to empty the
>>> GFP_ATOMIC pools. The proper solution would rather be to look at the
>>> real mm code (I mean not the one I wrote) and see how they cope with
>>> this issue.
>> Mmpf. What are the chances for a quick fix within the next days? We have
>> to consider alternatives right now here because the whole system is
>> meant for production purpose next week (C-ELROB '07).
>>
>> OK, I'm already finding myself inside the code :-/. What about this
>> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
>> out, drop all locks (just like it happens in case of need_resched()),
>> try to fill up the pool, and restart then. What would reliably make
>> Linux refill its atomic pool?
>>
>> Alternative approach: preallocate the required pages before entering the
>> loop in copy_pte_range. But that may require more code changes.
> 
> I would say the real fix is to drop momentarily the spinlock(s?) for allocating.
> 

Are you sure it's safe to drop locks in the (logical) middle of
copy_one_pte()? I can't tell yet from the few glances I took. It's just
my feeling that says "no" so far.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:20       ` Jan Kiszka
@ 2007-08-07 14:24         ` Gilles Chanteperdrix
  2007-08-07 14:46           ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-07 14:24 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Gilles Chanteperdrix wrote:
> > On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> >> Gilles Chanteperdrix wrote:
> >>> Jan Kiszka wrote:
> >>>> Hi all,
> >>>>
> >>>> we are getting a lot of
> >>>>
> >>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
> >>>> in_atomic():1, irqs_disabled():0
> >>>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
> >>>>  [<c0103156>] show_trace+0x12/0x14
> >>>>  [<c0103915>] dump_stack+0x16/0x18
> >>>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
> >>>>  [<c0149488>] __alloc_pages+0x32/0x281
> >>>>  [<c014fdd2>] copy_page_range+0x221/0x41e
> >>>>  [<c010ec18>] copy_process+0x9e1/0xfe2
> >>>>  [<c010f415>] do_fork+0x99/0x176
> >>>>  [<c0100e75>] sys_clone+0x33/0x39
> >>>>  [<c0102aaf>] syscall_call+0x7/0xb
> >>>>  =======================
> >>>>
> >>>> here due to a Xenomai program issuing system() calls.
> >>>>
> >>>> After once again dissecting the "nice" mm code (sigh...), the reason
> >>>> turned out to be plain simple:
> >>>>
> >>>> copy_pte_range(...);
> >>>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> >>>>   copy_one_pte(...);
> >>>>     if (is_cow_mapping(vm_flags))
> >>>>       alloc_page_vma(GFP_HIGHUSER, ...);
> >>>>         __alloc_pages(...)
> >>>>        might_sleep_if(gfp_mask & __GFP_WAIT);
> >>>>
> >>>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
> >>>>
> >>>> So the bad news is that the COW code in likely all i-pipe versions is
> >>>> broken. But the good new is that this might be easily fixable by
> >>>> providing the right gfp_mask. GFP_ATOMIC?
> >>> It does not look like a good solution, you are going to empty the
> >>> GFP_ATOMIC pools. The proper solution would rather be to look at the
> >>> real mm code (I mean not the one I wrote) and see how they cope with
> >>> this issue.
> >> Mmpf. What are the chances for a quick fix within the next days? We have
> >> to consider alternatives right now here because the whole system is
> >> meant for production purpose next week (C-ELROB '07).
> >>
> >> OK, I'm already finding myself inside the code :-/. What about this
> >> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
> >> out, drop all locks (just like it happens in case of need_resched()),
> >> try to fill up the pool, and restart then. What would reliably make
> >> Linux refill its atomic pool?
> >>
> >> Alternative approach: preallocate the required pages before entering the
> >> loop in copy_pte_range. But that may require more code changes.
> >
> > I would say the real fix is to drop momentarily the spinlock(s?) for allocating.
> >
>
> Are you sure it's safe to drop locks in the (logical) middle of
> copy_one_pte()? I can't tell yet from the few glances I took. It's just
> my feeling that says "no" so far.

There is certainly something possible, since the vanilla kernel
actually works without these warning.

-- 
                                               Gilles Chanteperdrix


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:24         ` Gilles Chanteperdrix
@ 2007-08-07 14:46           ` Jan Kiszka
  2007-08-07 14:56             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 14:46 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 4836 bytes --]

Gilles Chanteperdrix wrote:
> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>> Gilles Chanteperdrix wrote:
>>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> we are getting a lot of
>>>>>>
>>>>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
>>>>>> in_atomic():1, irqs_disabled():0
>>>>>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
>>>>>>  [<c0103156>] show_trace+0x12/0x14
>>>>>>  [<c0103915>] dump_stack+0x16/0x18
>>>>>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
>>>>>>  [<c0149488>] __alloc_pages+0x32/0x281
>>>>>>  [<c014fdd2>] copy_page_range+0x221/0x41e
>>>>>>  [<c010ec18>] copy_process+0x9e1/0xfe2
>>>>>>  [<c010f415>] do_fork+0x99/0x176
>>>>>>  [<c0100e75>] sys_clone+0x33/0x39
>>>>>>  [<c0102aaf>] syscall_call+0x7/0xb
>>>>>>  =======================
>>>>>>
>>>>>> here due to a Xenomai program issuing system() calls.
>>>>>>
>>>>>> After once again dissecting the "nice" mm code (sigh...), the reason
>>>>>> turned out to be plain simple:
>>>>>>
>>>>>> copy_pte_range(...);
>>>>>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>>>>>>   copy_one_pte(...);
>>>>>>     if (is_cow_mapping(vm_flags))
>>>>>>       alloc_page_vma(GFP_HIGHUSER, ...);
>>>>>>         __alloc_pages(...)
>>>>>>        might_sleep_if(gfp_mask & __GFP_WAIT);
>>>>>>
>>>>>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
>>>>>>
>>>>>> So the bad news is that the COW code in likely all i-pipe versions is
>>>>>> broken. But the good new is that this might be easily fixable by
>>>>>> providing the right gfp_mask. GFP_ATOMIC?
>>>>> It does not look like a good solution, you are going to empty the
>>>>> GFP_ATOMIC pools. The proper solution would rather be to look at the
>>>>> real mm code (I mean not the one I wrote) and see how they cope with
>>>>> this issue.
>>>> Mmpf. What are the chances for a quick fix within the next days? We have
>>>> to consider alternatives right now here because the whole system is
>>>> meant for production purpose next week (C-ELROB '07).
>>>>
>>>> OK, I'm already finding myself inside the code :-/. What about this
>>>> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
>>>> out, drop all locks (just like it happens in case of need_resched()),
>>>> try to fill up the pool, and restart then. What would reliably make
>>>> Linux refill its atomic pool?
>>>>
>>>> Alternative approach: preallocate the required pages before entering the
>>>> loop in copy_pte_range. But that may require more code changes.
>>> I would say the real fix is to drop momentarily the spinlock(s?) for allocating.
>>>
>> Are you sure it's safe to drop locks in the (logical) middle of
>> copy_one_pte()? I can't tell yet from the few glances I took. It's just
>> my feeling that says "no" so far.
> 
> There is certainly something possible, since the vanilla kernel
> actually works without these warning.

Vanilla doesn't allocate pages from within copy_one_pte.


So, here comes proposal #1 for a solution, also addressing the broken
ENOMEM-error path of the current patch. Only compile tested yet. Might
this work?

Jan


---
 mm/memory.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Index: linux-2.6.20.15/mm/memory.c
===================================================================
--- linux-2.6.20.15.orig/mm/memory.c
+++ linux-2.6.20.15/mm/memory.c
@@ -498,7 +498,7 @@ copy_one_pte(struct mm_struct *dst_mm, s
 #ifdef CONFIG_IPIPE
 		if (((vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED)) == (VM_LOCKED|VM_PINNED)) {
 			struct page *old_page = vm_normal_page(vma, addr, pte);
-			page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+			page = alloc_page_vma(GFP_ATOMIC, vma, addr);
 			if (!page)
 				return -ENOMEM;
 
@@ -546,6 +546,8 @@ static int copy_pte_range(struct mm_stru
 	spinlock_t *src_ptl, *dst_ptl;
 	int progress = 0;
 	int rss[2];
+	int nomem_retries = 10; /* Arbitrary limit */
+	int err = 0;
 
 again:
 	rss[1] = rss[0] = 0;
@@ -574,8 +576,12 @@ again:
 			continue;
 		}
 		if (copy_one_pte(dst_mm, src_mm, dst_pte,
-				 src_pte, vma, addr, rss))
-			return -ENOMEM;
+				 src_pte, vma, addr, rss)) {
+			if (--nomem_retries == 0)
+				err = -ENOMEM;
+			break;
+		}
+		nomem_retries = 10;
 		progress += 8;
 	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
 
@@ -585,9 +591,9 @@ again:
 	add_mm_rss(dst_mm, rss[0], rss[1]);
 	pte_unmap_unlock(dst_pte - 1, dst_ptl);
 	cond_resched();
-	if (addr != end)
+	if (addr != end && !err)
 		goto again;
-	return 0;
+	return err;
 }
 
 static inline int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:46           ` Jan Kiszka
@ 2007-08-07 14:56             ` Gilles Chanteperdrix
  2007-08-07 14:59               ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-07 14:56 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Gilles Chanteperdrix wrote:
> > On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> >> Gilles Chanteperdrix wrote:
> >>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> >>>> Gilles Chanteperdrix wrote:
> >>>>> Jan Kiszka wrote:
> >>>>>> Hi all,
> >>>>>>
> >>>>>> we are getting a lot of
> >>>>>>
> >>>>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
> >>>>>> in_atomic():1, irqs_disabled():0
> >>>>>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
> >>>>>>  [<c0103156>] show_trace+0x12/0x14
> >>>>>>  [<c0103915>] dump_stack+0x16/0x18
> >>>>>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
> >>>>>>  [<c0149488>] __alloc_pages+0x32/0x281
> >>>>>>  [<c014fdd2>] copy_page_range+0x221/0x41e
> >>>>>>  [<c010ec18>] copy_process+0x9e1/0xfe2
> >>>>>>  [<c010f415>] do_fork+0x99/0x176
> >>>>>>  [<c0100e75>] sys_clone+0x33/0x39
> >>>>>>  [<c0102aaf>] syscall_call+0x7/0xb
> >>>>>>  =======================
> >>>>>>
> >>>>>> here due to a Xenomai program issuing system() calls.
> >>>>>>
> >>>>>> After once again dissecting the "nice" mm code (sigh...), the reason
> >>>>>> turned out to be plain simple:
> >>>>>>
> >>>>>> copy_pte_range(...);
> >>>>>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> >>>>>>   copy_one_pte(...);
> >>>>>>     if (is_cow_mapping(vm_flags))
> >>>>>>       alloc_page_vma(GFP_HIGHUSER, ...);
> >>>>>>         __alloc_pages(...)
> >>>>>>        might_sleep_if(gfp_mask & __GFP_WAIT);
> >>>>>>
> >>>>>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
> >>>>>>
> >>>>>> So the bad news is that the COW code in likely all i-pipe versions is
> >>>>>> broken. But the good new is that this might be easily fixable by
> >>>>>> providing the right gfp_mask. GFP_ATOMIC?
> >>>>> It does not look like a good solution, you are going to empty the
> >>>>> GFP_ATOMIC pools. The proper solution would rather be to look at the
> >>>>> real mm code (I mean not the one I wrote) and see how they cope with
> >>>>> this issue.
> >>>> Mmpf. What are the chances for a quick fix within the next days? We have
> >>>> to consider alternatives right now here because the whole system is
> >>>> meant for production purpose next week (C-ELROB '07).
> >>>>
> >>>> OK, I'm already finding myself inside the code :-/. What about this
> >>>> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
> >>>> out, drop all locks (just like it happens in case of need_resched()),
> >>>> try to fill up the pool, and restart then. What would reliably make
> >>>> Linux refill its atomic pool?
> >>>>
> >>>> Alternative approach: preallocate the required pages before entering the
> >>>> loop in copy_pte_range. But that may require more code changes.
> >>> I would say the real fix is to drop momentarily the spinlock(s?) for allocating.
> >>>
> >> Are you sure it's safe to drop locks in the (logical) middle of
> >> copy_one_pte()? I can't tell yet from the few glances I took. It's just
> >> my feeling that says "no" so far.
> >
> > There is certainly something possible, since the vanilla kernel
> > actually works without these warning.
>
> Vanilla doesn't allocate pages from within copy_one_pte.

The fact that you are in a hurry should not be an excuse to propose a
fix which is much worse than the bug itself.

-- 
                                               Gilles Chanteperdrix


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:56             ` Gilles Chanteperdrix
@ 2007-08-07 14:59               ` Jan Kiszka
  2007-08-07 15:09                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 14:59 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 3452 bytes --]

Gilles Chanteperdrix wrote:
> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>> Gilles Chanteperdrix wrote:
>>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> we are getting a lot of
>>>>>>>>
>>>>>>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:1225
>>>>>>>> in_atomic():1, irqs_disabled():0
>>>>>>>>  [<c010305d>] show_trace_log_lvl+0x1a/0x2f
>>>>>>>>  [<c0103156>] show_trace+0x12/0x14
>>>>>>>>  [<c0103915>] dump_stack+0x16/0x18
>>>>>>>>  [<c010c4ab>] __might_sleep+0xcd/0xd3
>>>>>>>>  [<c0149488>] __alloc_pages+0x32/0x281
>>>>>>>>  [<c014fdd2>] copy_page_range+0x221/0x41e
>>>>>>>>  [<c010ec18>] copy_process+0x9e1/0xfe2
>>>>>>>>  [<c010f415>] do_fork+0x99/0x176
>>>>>>>>  [<c0100e75>] sys_clone+0x33/0x39
>>>>>>>>  [<c0102aaf>] syscall_call+0x7/0xb
>>>>>>>>  =======================
>>>>>>>>
>>>>>>>> here due to a Xenomai program issuing system() calls.
>>>>>>>>
>>>>>>>> After once again dissecting the "nice" mm code (sigh...), the reason
>>>>>>>> turned out to be plain simple:
>>>>>>>>
>>>>>>>> copy_pte_range(...);
>>>>>>>>   spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>>>>>>>>   copy_one_pte(...);
>>>>>>>>     if (is_cow_mapping(vm_flags))
>>>>>>>>       alloc_page_vma(GFP_HIGHUSER, ...);
>>>>>>>>         __alloc_pages(...)
>>>>>>>>        might_sleep_if(gfp_mask & __GFP_WAIT);
>>>>>>>>
>>>>>>>> And this is true due to #define GFP_HIGHUSER (__GFP_WAIT | ...
>>>>>>>>
>>>>>>>> So the bad news is that the COW code in likely all i-pipe versions is
>>>>>>>> broken. But the good new is that this might be easily fixable by
>>>>>>>> providing the right gfp_mask. GFP_ATOMIC?
>>>>>>> It does not look like a good solution, you are going to empty the
>>>>>>> GFP_ATOMIC pools. The proper solution would rather be to look at the
>>>>>>> real mm code (I mean not the one I wrote) and see how they cope with
>>>>>>> this issue.
>>>>>> Mmpf. What are the chances for a quick fix within the next days? We have
>>>>>> to consider alternatives right now here because the whole system is
>>>>>> meant for production purpose next week (C-ELROB '07).
>>>>>>
>>>>>> OK, I'm already finding myself inside the code :-/. What about this
>>>>>> approach: We try to alloc with GFP_ATOMIC. Once this fails, we break
>>>>>> out, drop all locks (just like it happens in case of need_resched()),
>>>>>> try to fill up the pool, and restart then. What would reliably make
>>>>>> Linux refill its atomic pool?
>>>>>>
>>>>>> Alternative approach: preallocate the required pages before entering the
>>>>>> loop in copy_pte_range. But that may require more code changes.
>>>>> I would say the real fix is to drop momentarily the spinlock(s?) for allocating.
>>>>>
>>>> Are you sure it's safe to drop locks in the (logical) middle of
>>>> copy_one_pte()? I can't tell yet from the few glances I took. It's just
>>>> my feeling that says "no" so far.
>>> There is certainly something possible, since the vanilla kernel
>>> actually works without these warning.
>> Vanilla doesn't allocate pages from within copy_one_pte.
> 
> The fact that you are in a hurry should not be an excuse to propose a
> fix which is much worse than the bug itself.

Please explain.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 14:59               ` Jan Kiszka
@ 2007-08-07 15:09                 ` Gilles Chanteperdrix
  2007-08-07 16:55                   ` Philippe Gerum
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-07 15:09 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Gilles Chanteperdrix wrote:
> > The fact that you are in a hurry should not be an excuse to propose a
> > fix which is much worse than the bug itself.
>
> Please explain.

Using GFP_ATOMIC is not a good idea, because the kernel needs the
GFP_ATOMIC pools for allocation that take place from really atomic
contexts such as interruption handlers for instance. fork should not
be an atomic context.

So the only reason I see why you propose this patch is because you are
in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
locally, but please do not make
this an official patch.

-- 
                                               Gilles Chanteperdrix

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 15:09                 ` Gilles Chanteperdrix
@ 2007-08-07 16:55                   ` Philippe Gerum
  2007-08-07 17:01                     ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Gerum @ 2007-08-07 16:55 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, Jan Kiszka, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]

On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> > Gilles Chanteperdrix wrote:
> > > The fact that you are in a hurry should not be an excuse to propose a
> > > fix which is much worse than the bug itself.
> >
> > Please explain.
> 
> Using GFP_ATOMIC is not a good idea, because the kernel needs the
> GFP_ATOMIC pools for allocation that take place from really atomic
> contexts such as interruption handlers for instance. fork should not
> be an atomic context.
> 
> So the only reason I see why you propose this patch is because you are
> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
> locally, but please do not make
> this an official patch.
> 

I'm afraid you are both right. Draining pages from the atomic pool might
starve truely atomic contexts, and releasing the lock guarding the
pagetable pages for the source mm seems contradictory with what the
current code tries to achieve in holding it.

Here is a patch preallocating the page for the cow-breaking code, which
relies on the lock breaking pattern already in place. Not exactly
pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
Needs SMP testing before any attempt is made to merge it.

-- 
Philippe.


[-- Attachment #2: breaking-cow-might-sleep.patch --]
[-- Type: text/x-patch, Size: 2677 bytes --]

--- 2.6.23-rc2/mm/memory.c	2007-08-05 17:51:09.000000000 +0200
+++ 2.6.23-rc2-x86/mm/memory.c	2007-08-07 18:44:19.000000000 +0200
@@ -453,8 +453,8 @@
 
 static inline int
 copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-		unsigned long addr, int *rss)
+	     pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
+	     unsigned long addr, int *rss, struct page *uncow_page)
 {
 	unsigned long vm_flags = vma->vm_flags;
 	pte_t pte = *src_pte;
@@ -494,21 +494,17 @@
 	 */
 	if (is_cow_mapping(vm_flags)) {
 #ifdef CONFIG_IPIPE
-		if (((vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED)) == (VM_LOCKED|VM_PINNED)) {
+		if (uncow_page) {
 			struct page *old_page = vm_normal_page(vma, addr, pte);
-			page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
-			if (!page)
-				return -ENOMEM;
-
-			cow_user_page(page, old_page, addr, vma);
-			pte = mk_pte(page, vma->vm_page_prot);
+			cow_user_page(uncow_page, old_page, addr, vma);
+			pte = mk_pte(uncow_page, vma->vm_page_prot);
 			
 			if (vm_flags & VM_SHARED)
 				pte = pte_mkclean(pte);
 			pte = pte_mkold(pte);
 
-			page_dup_rmap(page, vma, addr);
-			rss[!!PageAnon(page)]++;
+			page_dup_rmap(uncow_page, vma, addr);
+			rss[!!PageAnon(uncow_page)]++;
 			goto out_set_pte;
 		}
 #endif /* CONFIG_IPIPE */
@@ -542,14 +538,26 @@
 {
 	pte_t *src_pte, *dst_pte;
 	spinlock_t *src_ptl, *dst_ptl;
-	int progress = 0;
+	int progress = 0, do_cow_break = 0;
+	struct page *uncow_page = NULL;
 	int rss[2];
 
 again:
+#ifdef CONFIG_IPIPE
+	if (do_cow_break) {
+		uncow_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+		if (!uncow_page)
+			return -ENOMEM;
+		do_cow_break = 0;
+	}
+#endif
 	rss[1] = rss[0] = 0;
 	dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl);
-	if (!dst_pte)
+	if (!dst_pte) {
+		if (uncow_page)
+			page_cache_release(uncow_page);
 		return -ENOMEM;
+	}
 	src_pte = pte_offset_map_nested(src_pmd, addr);
 	src_ptl = pte_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -571,9 +579,21 @@
 			progress++;
 			continue;
 		}
+#ifdef CONFIG_IPIPE
+		if (likely(uncow_page == NULL) && likely(pte_present(*src_pte))) {
+			if (is_cow_mapping(vma->vm_flags)) {
+				if (((vma->vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED))
+				    == (VM_LOCKED|VM_PINNED)) {
+					do_cow_break = 1;
+					break;
+				}
+			}
+		}
+#endif
 		if (copy_one_pte(dst_mm, src_mm, dst_pte,
-				 src_pte, vma, addr, rss))
+				 src_pte, vma, addr, rss, uncow_page))
 			return -ENOMEM;
+		uncow_page = NULL;
 		progress += 8;
 	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 16:55                   ` Philippe Gerum
@ 2007-08-07 17:01                     ` Jan Kiszka
  2007-08-07 17:32                       ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 17:01 UTC (permalink / raw)
  To: rpm; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 4279 bytes --]

Philippe Gerum wrote:
> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>> Gilles Chanteperdrix wrote:
>>>> The fact that you are in a hurry should not be an excuse to propose a
>>>> fix which is much worse than the bug itself.
>>> Please explain.
>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
>> GFP_ATOMIC pools for allocation that take place from really atomic
>> contexts such as interruption handlers for instance. fork should not
>> be an atomic context.
>>
>> So the only reason I see why you propose this patch is because you are
>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
>> locally, but please do not make
>> this an official patch.
>>
> 
> I'm afraid you are both right. Draining pages from the atomic pool might
> starve truely atomic contexts, and releasing the lock guarding the
> pagetable pages for the source mm seems contradictory with what the
> current code tries to achieve in holding it.
> 
> Here is a patch preallocating the page for the cow-breaking code, which
> relies on the lock breaking pattern already in place. Not exactly
> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
> Needs SMP testing before any attempt is made to merge it.

Funny. I just finished my own patch of this kind and made it pass a
basic COW stress test here. Find it below, it is very similar. Will
now analyse yours to see which one looks nicer and/or is more correct.

Jan


---
 mm/memory.c |   38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

Index: linux-2.6.20.15/mm/memory.c
===================================================================
--- linux-2.6.20.15.orig/mm/memory.c
+++ linux-2.6.20.15/mm/memory.c
@@ -456,7 +456,7 @@ static inline void cow_user_page(struct 
 static inline int
 copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-		unsigned long addr, int *rss)
+		unsigned long addr, int *rss, struct page **new_page)
 {
 	unsigned long vm_flags = vma->vm_flags;
 	pte_t pte = *src_pte;
@@ -498,13 +498,15 @@ copy_one_pte(struct mm_struct *dst_mm, s
 #ifdef CONFIG_IPIPE
 		if (((vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED)) == (VM_LOCKED|VM_PINNED)) {
 			struct page *old_page = vm_normal_page(vma, addr, pte);
-			page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+
+			page = *new_page;
 			if (!page)
 				return -ENOMEM;
+			*new_page = NULL;
 
 			cow_user_page(page, old_page, addr, vma);
 			pte = mk_pte(page, vma->vm_page_prot);
-			
+
 			if (vm_flags & VM_SHARED)
 				pte = pte_mkclean(pte);
 			pte = pte_mkold(pte);
@@ -546,12 +548,23 @@ static int copy_pte_range(struct mm_stru
 	spinlock_t *src_ptl, *dst_ptl;
 	int progress = 0;
 	int rss[2];
+	int err = 0;
+	int need_new_pages = 0;
+	struct page *new_page = NULL;
 
 again:
+	if (need_new_pages && !new_page) {
+		new_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+		if (!new_page)
+			return -ENOMEM;
+	}
+
 	rss[1] = rss[0] = 0;
 	dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl);
-	if (!dst_pte)
-		return -ENOMEM;
+	if (!dst_pte) {
+		err = -ENOMEM;
+		goto cleanup_exit;
+	}
 	src_pte = pte_offset_map_nested(src_pmd, addr);
 	src_ptl = pte_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -574,10 +587,13 @@ again:
 			continue;
 		}
 		if (copy_one_pte(dst_mm, src_mm, dst_pte,
-				 src_pte, vma, addr, rss))
-			return -ENOMEM;
+				 src_pte, vma, addr, rss, &new_page) < 0) {
+			need_new_pages = 1;
+			break;
+		}
 		progress += 8;
-	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
+	} while (dst_pte++, src_pte++, addr += PAGE_SIZE,
+		 addr != end && (!need_new_pages || new_page));
 
 	arch_leave_lazy_mmu_mode();
 	spin_unlock(src_ptl);
@@ -587,7 +603,11 @@ again:
 	cond_resched();
 	if (addr != end)
 		goto again;
-	return 0;
+
+cleanup_exit:
+	if (new_page)
+		page_cache_release(new_page);
+	return err;
 }
 
 static inline int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 17:01                     ` Jan Kiszka
@ 2007-08-07 17:32                       ` Jan Kiszka
  2007-08-08  7:28                         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2007-08-07 17:32 UTC (permalink / raw)
  To: rpm; +Cc: adeos-main, xenomai-core


[-- Attachment #1.1: Type: text/plain, Size: 1775 bytes --]

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
>>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> The fact that you are in a hurry should not be an excuse to propose a
>>>>> fix which is much worse than the bug itself.
>>>> Please explain.
>>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
>>> GFP_ATOMIC pools for allocation that take place from really atomic
>>> contexts such as interruption handlers for instance. fork should not
>>> be an atomic context.
>>>
>>> So the only reason I see why you propose this patch is because you are
>>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
>>> locally, but please do not make
>>> this an official patch.
>>>
>> I'm afraid you are both right. Draining pages from the atomic pool might
>> starve truely atomic contexts, and releasing the lock guarding the
>> pagetable pages for the source mm seems contradictory with what the
>> current code tries to achieve in holding it.
>>
>> Here is a patch preallocating the page for the cow-breaking code, which
>> relies on the lock breaking pattern already in place. Not exactly
>> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
>> Needs SMP testing before any attempt is made to merge it.
> 
> Funny. I just finished my own patch of this kind and made it pass a
> basic COW stress test here. Find it below, it is very similar. Will
> now analyse yours to see which one looks nicer and/or is more correct.

Yours is clearly nicer, but it required/allowed a few more cleanups.
Find my variant attached (against our local 2.6.20 kernel).

From my POV: problem solved. 8)

Jan

[-- Attachment #1.2: breaking-cow-might-sleep.patch --]
[-- Type: text/plain, Size: 3364 bytes --]

---
 mm/memory.c |   55 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 38 insertions(+), 17 deletions(-)

Index: linux-2.6.20.15/mm/memory.c
===================================================================
--- linux-2.6.20.15.orig/mm/memory.c
+++ linux-2.6.20.15/mm/memory.c
@@ -453,10 +453,10 @@ static inline void cow_user_page(struct 
  * covered by this vma.
  */
 
-static inline int
+static inline void
 copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-		unsigned long addr, int *rss)
+	     pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
+	     unsigned long addr, int *rss, struct page *uncow_page)
 {
 	unsigned long vm_flags = vma->vm_flags;
 	pte_t pte = *src_pte;
@@ -496,21 +496,17 @@ copy_one_pte(struct mm_struct *dst_mm, s
 	 */
 	if (is_cow_mapping(vm_flags)) {
 #ifdef CONFIG_IPIPE
-		if (((vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED)) == (VM_LOCKED|VM_PINNED)) {
+		if (uncow_page) {
 			struct page *old_page = vm_normal_page(vma, addr, pte);
-			page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
-			if (!page)
-				return -ENOMEM;
+			cow_user_page(uncow_page, old_page, addr, vma);
+			pte = mk_pte(uncow_page, vma->vm_page_prot);
 
-			cow_user_page(page, old_page, addr, vma);
-			pte = mk_pte(page, vma->vm_page_prot);
-			
 			if (vm_flags & VM_SHARED)
 				pte = pte_mkclean(pte);
 			pte = pte_mkold(pte);
 
-			page_dup_rmap(page);
-			rss[!!PageAnon(page)]++;
+			page_dup_rmap(uncow_page);
+			rss[!!PageAnon(uncow_page)]++;
 			goto out_set_pte;
 		}
 #endif /* CONFIG_IPIPE */
@@ -535,7 +531,6 @@ copy_one_pte(struct mm_struct *dst_mm, s
 
 out_set_pte:
 	set_pte_at(dst_mm, addr, dst_pte, pte);
-	return 0;
 }
 
 static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
@@ -546,12 +541,27 @@ static int copy_pte_range(struct mm_stru
 	spinlock_t *src_ptl, *dst_ptl;
 	int progress = 0;
 	int rss[2];
+#ifdef CONFIG_IPIPE
+	int do_cow_break = 0;
+#endif
+	struct page *uncow_page = NULL;
 
 again:
+#ifdef CONFIG_IPIPE
+	if (do_cow_break) {
+		uncow_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+		if (!uncow_page)
+			return -ENOMEM;
+		do_cow_break = 0;
+	}
+#endif
 	rss[1] = rss[0] = 0;
 	dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl);
-	if (!dst_pte)
+	if (!dst_pte) {
+		if (uncow_page)
+			page_cache_release(uncow_page);
 		return -ENOMEM;
+	}
 	src_pte = pte_offset_map_nested(src_pmd, addr);
 	src_ptl = pte_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -573,9 +583,20 @@ again:
 			progress++;
 			continue;
 		}
-		if (copy_one_pte(dst_mm, src_mm, dst_pte,
-				 src_pte, vma, addr, rss))
-			return -ENOMEM;
+#ifdef CONFIG_IPIPE
+		if (likely(uncow_page == NULL) && likely(pte_present(*src_pte))) {
+			if (is_cow_mapping(vma->vm_flags)) {
+				if (((vma->vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED))
+				    == (VM_LOCKED|VM_PINNED)) {
+					do_cow_break = 1;
+					break;
+				}
+			}
+		}
+#endif
+		copy_one_pte(dst_mm, src_mm, dst_pte, src_pte,
+			     vma, addr, rss, uncow_page);
+		uncow_page = NULL;
 		progress += 8;
 	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
 

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-07 17:32                       ` Jan Kiszka
@ 2007-08-08  7:28                         ` Gilles Chanteperdrix
  2007-08-08  8:40                           ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-08  7:28 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

Jan Kiszka wrote:
 > Jan Kiszka wrote:
 > > Philippe Gerum wrote:
 > >> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
 > >>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
 > >>>> Gilles Chanteperdrix wrote:
 > >>>>> The fact that you are in a hurry should not be an excuse to propose a
 > >>>>> fix which is much worse than the bug itself.
 > >>>> Please explain.
 > >>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
 > >>> GFP_ATOMIC pools for allocation that take place from really atomic
 > >>> contexts such as interruption handlers for instance. fork should not
 > >>> be an atomic context.
 > >>>
 > >>> So the only reason I see why you propose this patch is because you are
 > >>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
 > >>> locally, but please do not make
 > >>> this an official patch.
 > >>>
 > >> I'm afraid you are both right. Draining pages from the atomic pool might
 > >> starve truely atomic contexts, and releasing the lock guarding the
 > >> pagetable pages for the source mm seems contradictory with what the
 > >> current code tries to achieve in holding it.
 > >>
 > >> Here is a patch preallocating the page for the cow-breaking code, which
 > >> relies on the lock breaking pattern already in place. Not exactly
 > >> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
 > >> Needs SMP testing before any attempt is made to merge it.
 > > 
 > > Funny. I just finished my own patch of this kind and made it pass a
 > > basic COW stress test here. Find it below, it is very similar. Will
 > > now analyse yours to see which one looks nicer and/or is more correct.
 > 
 > Yours is clearly nicer, but it required/allowed a few more cleanups.
 > Find my variant attached (against our local 2.6.20 kernel).
 > 
 > From my POV: problem solved. 8)

It seems you solved this problem before I even had a chance to have
a look at the code.

-- 


					    Gilles Chanteperdrix.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-08  7:28                         ` Gilles Chanteperdrix
@ 2007-08-08  8:40                           ` Gilles Chanteperdrix
  2007-08-08  9:51                             ` Jan Kiszka
  2007-08-08 12:33                             ` Philippe Gerum
  0 siblings, 2 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-08-08  8:40 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On 8/8/07, Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote:
> Jan Kiszka wrote:
>  > Jan Kiszka wrote:
>  > > Philippe Gerum wrote:
>  > >> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
>  > >>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>  > >>>> Gilles Chanteperdrix wrote:
>  > >>>>> The fact that you are in a hurry should not be an excuse to propose a
>  > >>>>> fix which is much worse than the bug itself.
>  > >>>> Please explain.
>  > >>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
>  > >>> GFP_ATOMIC pools for allocation that take place from really atomic
>  > >>> contexts such as interruption handlers for instance. fork should not
>  > >>> be an atomic context.
>  > >>>
>  > >>> So the only reason I see why you propose this patch is because you are
>  > >>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
>  > >>> locally, but please do not make
>  > >>> this an official patch.
>  > >>>
>  > >> I'm afraid you are both right. Draining pages from the atomic pool might
>  > >> starve truely atomic contexts, and releasing the lock guarding the
>  > >> pagetable pages for the source mm seems contradictory with what the
>  > >> current code tries to achieve in holding it.
>  > >>
>  > >> Here is a patch preallocating the page for the cow-breaking code, which
>  > >> relies on the lock breaking pattern already in place. Not exactly
>  > >> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
>  > >> Needs SMP testing before any attempt is made to merge it.
>  > >
>  > > Funny. I just finished my own patch of this kind and made it pass a
>  > > basic COW stress test here. Find it below, it is very similar. Will
>  > > now analyse yours to see which one looks nicer and/or is more correct.
>  >
>  > Yours is clearly nicer, but it required/allowed a few more cleanups.
>  > Find my variant attached (against our local 2.6.20 kernel).
>  >
>  > From my POV: problem solved. 8)
>
> It seems you solved this problem before I even had a chance to have
> a look at the code.

Finally looking at the I-pipe patch, I see something suspicious:
the patch replaces a "return err" with a "return 0" in zeromap_pud_range.

-- 
                                               Gilles Chanteperdrix


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-08  8:40                           ` Gilles Chanteperdrix
@ 2007-08-08  9:51                             ` Jan Kiszka
  2007-08-08 12:33                             ` Philippe Gerum
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Kiszka @ 2007-08-08  9:51 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2678 bytes --]

Gilles Chanteperdrix wrote:
> On 8/8/07, Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote:
>> Jan Kiszka wrote:
>>  > Jan Kiszka wrote:
>>  > > Philippe Gerum wrote:
>>  > >> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
>>  > >>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>>  > >>>> Gilles Chanteperdrix wrote:
>>  > >>>>> The fact that you are in a hurry should not be an excuse to propose a
>>  > >>>>> fix which is much worse than the bug itself.
>>  > >>>> Please explain.
>>  > >>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
>>  > >>> GFP_ATOMIC pools for allocation that take place from really atomic
>>  > >>> contexts such as interruption handlers for instance. fork should not
>>  > >>> be an atomic context.
>>  > >>>
>>  > >>> So the only reason I see why you propose this patch is because you are
>>  > >>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
>>  > >>> locally, but please do not make
>>  > >>> this an official patch.
>>  > >>>
>>  > >> I'm afraid you are both right. Draining pages from the atomic pool might
>>  > >> starve truely atomic contexts, and releasing the lock guarding the
>>  > >> pagetable pages for the source mm seems contradictory with what the
>>  > >> current code tries to achieve in holding it.
>>  > >>
>>  > >> Here is a patch preallocating the page for the cow-breaking code, which
>>  > >> relies on the lock breaking pattern already in place. Not exactly
>>  > >> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
>>  > >> Needs SMP testing before any attempt is made to merge it.
>>  > >
>>  > > Funny. I just finished my own patch of this kind and made it pass a
>>  > > basic COW stress test here. Find it below, it is very similar. Will
>>  > > now analyse yours to see which one looks nicer and/or is more correct.
>>  >
>>  > Yours is clearly nicer, but it required/allowed a few more cleanups.
>>  > Find my variant attached (against our local 2.6.20 kernel).
>>  >
>>  > From my POV: problem solved. 8)
>>
>> It seems you solved this problem before I even had a chance to have
>> a look at the code.
> 
> Finally looking at the I-pipe patch, I see something suspicious:
> the patch replaces a "return err" with a "return 0" in zeromap_pud_range.

Indeed, that looks fishy. Appears to be a broken combination of these
two commits:

http://www.denx.de/cgi-bin/gitweb.cgi?p=ipipe-2.6.git;a=commitdiff;h=48ac83704927dca23a28c50163eacb71758b646e
http://www.denx.de/cgi-bin/gitweb.cgi?p=ipipe-2.6.git;a=commitdiff;h=edebab28d3bced559c6be3b2de8045b83cc05d53

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 250 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xenomai-core] [Adeos-main] [COW-BUG] __alloc_pages called from atomic context
  2007-08-08  8:40                           ` Gilles Chanteperdrix
  2007-08-08  9:51                             ` Jan Kiszka
@ 2007-08-08 12:33                             ` Philippe Gerum
  1 sibling, 0 replies; 17+ messages in thread
From: Philippe Gerum @ 2007-08-08 12:33 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: adeos-main, Jan Kiszka, xenomai-core

On Wed, 2007-08-08 at 10:40 +0200, Gilles Chanteperdrix wrote:
> On 8/8/07, Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote:
> > Jan Kiszka wrote:
> >  > Jan Kiszka wrote:
> >  > > Philippe Gerum wrote:
> >  > >> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
> >  > >>> On 8/7/07, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> >  > >>>> Gilles Chanteperdrix wrote:
> >  > >>>>> The fact that you are in a hurry should not be an excuse to propose a
> >  > >>>>> fix which is much worse than the bug itself.
> >  > >>>> Please explain.
> >  > >>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
> >  > >>> GFP_ATOMIC pools for allocation that take place from really atomic
> >  > >>> contexts such as interruption handlers for instance. fork should not
> >  > >>> be an atomic context.
> >  > >>>
> >  > >>> So the only reason I see why you propose this patch is because you are
> >  > >>> in a hurry. Ok, before we fix this properly, you can use GFP_ATOMIC
> >  > >>> locally, but please do not make
> >  > >>> this an official patch.
> >  > >>>
> >  > >> I'm afraid you are both right. Draining pages from the atomic pool might
> >  > >> starve truely atomic contexts, and releasing the lock guarding the
> >  > >> pagetable pages for the source mm seems contradictory with what the
> >  > >> current code tries to achieve in holding it.
> >  > >>
> >  > >> Here is a patch preallocating the page for the cow-breaking code, which
> >  > >> relies on the lock breaking pattern already in place. Not exactly
> >  > >> pretty, slightly tested here (UP, spinlock debug), but looks ok so far.
> >  > >> Needs SMP testing before any attempt is made to merge it.
> >  > >
> >  > > Funny. I just finished my own patch of this kind and made it pass a
> >  > > basic COW stress test here. Find it below, it is very similar. Will
> >  > > now analyse yours to see which one looks nicer and/or is more correct.
> >  >
> >  > Yours is clearly nicer, but it required/allowed a few more cleanups.
> >  > Find my variant attached (against our local 2.6.20 kernel).
> >  >
> >  > From my POV: problem solved. 8)
> >
> > It seems you solved this problem before I even had a chance to have
> > a look at the code.
> 
> Finally looking at the I-pipe patch, I see something suspicious:
> the patch replaces a "return err" with a "return 0" in zeromap_pud_range.
> 

Looks like a collision between mm/memory.c updates pertaining to
different kernel versions. I've been incorporating generic changes from
your ARM tree into my generic tree when they both were at different
kernel release levels. Will fix, thanks.

-- 
Philippe.




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-08-08 12:33 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-07 13:06 [Xenomai-core] [COW-BUG] __alloc_pages called from atomic context Jan Kiszka
2007-08-07 13:19 ` [Xenomai-core] [Adeos-main] " Gilles Chanteperdrix
     [not found] ` <46B86FE4.3040209@domain.hid>
2007-08-07 14:02   ` [Xenomai-core] " Jan Kiszka
2007-08-07 14:11     ` [Xenomai-core] [Adeos-main] " Gilles Chanteperdrix
2007-08-07 14:20       ` Jan Kiszka
2007-08-07 14:24         ` Gilles Chanteperdrix
2007-08-07 14:46           ` Jan Kiszka
2007-08-07 14:56             ` Gilles Chanteperdrix
2007-08-07 14:59               ` Jan Kiszka
2007-08-07 15:09                 ` Gilles Chanteperdrix
2007-08-07 16:55                   ` Philippe Gerum
2007-08-07 17:01                     ` Jan Kiszka
2007-08-07 17:32                       ` Jan Kiszka
2007-08-08  7:28                         ` Gilles Chanteperdrix
2007-08-08  8:40                           ` Gilles Chanteperdrix
2007-08-08  9:51                             ` Jan Kiszka
2007-08-08 12:33                             ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.