Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0

All of lore.kernel.org
 help / color / mirror / Atom feed

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
       [not found] <bug-16148-27@https.bugzilla.kernel.org/>
@ 2010-06-10 22:38 ` Andrew Morton
  2010-06-10 23:15   ` Dave Airlie
       [not found] ` <201006131301.o5DD1vqb002755@demeter.kernel.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2010-06-10 22:38 UTC (permalink / raw)
  To: Dave Airlie; +Cc: bugzilla-daemon, devnull, dri-devel


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Jun 2010 17:32:04 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16148
> 
>            Summary: page allocation failure. order:1, mode:0x50d0
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 2.6.35-rc1
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>         AssignedTo: akpm@linux-foundation.org
>         ReportedBy: devnull@plzk.org
>         Regression: No
> 
> 
> Created an attachment (id=26687)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=26687)
> dmesg
> 
> Never seen this before.
> 
> 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux
> 
> [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0
> [48126.787691] Pid: 1895, comm: Xorg Tainted: G        W   2.6.35-rc1 #1
> [48126.787694] Call Trace:
> [48126.787709]  [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0
> [48126.787716]  [<ffffffff81148695>] alloc_pages_current+0x95/0x100
> [48126.787720]  [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0
> [48126.787724]  [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0
> [48126.787730]  [<ffffffff81403f91>] ? security_vm_enough_memory_kern+0x21/0x30
> [48126.787736]  [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70
> [48126.787740]  [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0
> [48126.787744]  [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70
> [48126.787747]  [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140
> [48126.787750]  [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140
> [48126.787755]  [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10
> [48126.787758]  [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100
> [48126.787765]  [<ffffffff81627fe4>] i915_gem_object_bind_to_gtt+0x144/0x340
> [48126.787768]  [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0
> [48126.787772]  [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0
> [48126.787777]  [<ffffffff81091ba0>] ? __is_ram+0x0/0x10
> [48126.787783]  [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0
> [48126.787787]  [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390
> [48126.787790]  [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390
> [48126.787794]  [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0
> [48126.787799]  [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0
> [48126.787802]  [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390
> [48126.787807]  [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0
> [48126.787810]  [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580
> [48126.787814]  [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0
> [48126.787820]  [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b
> 

David, I have a vague feeling that we've been round this loop before.. 

Why does agp_alloc_page_array() use __GFP_NORETRY?  It's pretty unusual
and it's what caused this spew.

There's nothing in the changelog and the only relevant commentary
appears to be "This speeds things up and also saves memory for small
AGP regions", which is inscrutable.  Can you please add a usable
comment there?

Presumably this was added in response to some observed behaviour, but
what was it??


If the __GFP_NORETRY is indeed useful and legitimate and given that we
have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as
well to keep the bug reports away.

btw, agp_memory.vmalloc_flag can be done away with - it's conventional
to use is_vmalloc_addr() for this.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-10 22:38 ` [Bug 16148] New: page allocation failure. order:1, mode:0x50d0 Andrew Morton
@ 2010-06-10 23:15   ` Dave Airlie
  2010-06-11  8:46     ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2010-06-10 23:15 UTC (permalink / raw)
  To: Andrew Morton, thellstrom; +Cc: bugzilla-daemon, devnull, dri-devel

On Thu, 2010-06-10 at 15:38 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Mon, 7 Jun 2010 17:32:04 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=16148
> > 
> >            Summary: page allocation failure. order:1, mode:0x50d0
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 2.6.35-rc1
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Page Allocator
> >         AssignedTo: akpm@linux-foundation.org
> >         ReportedBy: devnull@plzk.org
> >         Regression: No
> > 
> > 
> > Created an attachment (id=26687)
> >  --> (https://bugzilla.kernel.org/attachment.cgi?id=26687)
> > dmesg
> > 
> > Never seen this before.
> > 
> > 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux
> > 
> > [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0
> > [48126.787691] Pid: 1895, comm: Xorg Tainted: G        W   2.6.35-rc1 #1
> > [48126.787694] Call Trace:
> > [48126.787709]  [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0
> > [48126.787716]  [<ffffffff81148695>] alloc_pages_current+0x95/0x100
> > [48126.787720]  [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0
> > [48126.787724]  [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0
> > [48126.787730]  [<ffffffff81403f91>] ? security_vm_enough_memory_kern+0x21/0x30
> > [48126.787736]  [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70
> > [48126.787740]  [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0
> > [48126.787744]  [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70
> > [48126.787747]  [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140
> > [48126.787750]  [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140
> > [48126.787755]  [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10
> > [48126.787758]  [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100
> > [48126.787765]  [<ffffffff81627fe4>] i915_gem_object_bind_to_gtt+0x144/0x340
> > [48126.787768]  [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0
> > [48126.787772]  [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0
> > [48126.787777]  [<ffffffff81091ba0>] ? __is_ram+0x0/0x10
> > [48126.787783]  [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0
> > [48126.787787]  [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390
> > [48126.787790]  [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390
> > [48126.787794]  [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0
> > [48126.787799]  [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0
> > [48126.787802]  [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390
> > [48126.787807]  [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0
> > [48126.787810]  [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580
> > [48126.787814]  [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0
> > [48126.787820]  [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b
> > 
> 
> David, I have a vague feeling that we've been round this loop before.. 
> 
> Why does agp_alloc_page_array() use __GFP_NORETRY?  It's pretty unusual
> and it's what caused this spew.
> 
> There's nothing in the changelog and the only relevant commentary
> appears to be "This speeds things up and also saves memory for small
> AGP regions", which is inscrutable.  Can you please add a usable
> comment there?

cc'ing Thomas, who added this, I expect we could drop the NORETRY or
just add NOWARN. Though an order 1 page alloc failure isn't a pretty
sight, not sure how a vmalloc fallback could save us.

> 
> Presumably this was added in response to some observed behaviour, but
> what was it??
> 
> 
> If the __GFP_NORETRY is indeed useful and legitimate and given that we
> have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as
> well to keep the bug reports away.
> 
> btw, agp_memory.vmalloc_flag can be done away with - it's conventional
> to use is_vmalloc_addr() for this.

Lols, conventional my ass, we wanted to add that thing years ago for
this purpose and got told that would be an insane interface, then the
same person added the interface a year later and never fixed AGP to use
it.

I'll try and write a patch.

Dave.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-10 23:15   ` Dave Airlie
@ 2010-06-11  8:46     ` Thomas Hellstrom
  2010-06-11 17:24       ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2010-06-11  8:46 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Andrew Morton, bugzilla-daemon@bugzilla.kernel.org,
	devnull@plzk.org, dri-devel@lists.freedesktop.org

On 06/11/2010 01:15 AM, Dave Airlie wrote:
> On Thu, 2010-06-10 at 15:38 -0700, Andrew Morton wrote:
>    
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Mon, 7 Jun 2010 17:32:04 GMT
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>>      
>>> https://bugzilla.kernel.org/show_bug.cgi?id=16148
>>>
>>>             Summary: page allocation failure. order:1, mode:0x50d0
>>>             Product: Memory Management
>>>             Version: 2.5
>>>      Kernel Version: 2.6.35-rc1
>>>            Platform: All
>>>          OS/Version: Linux
>>>                Tree: Mainline
>>>              Status: NEW
>>>            Severity: normal
>>>            Priority: P1
>>>           Component: Page Allocator
>>>          AssignedTo: akpm@linux-foundation.org
>>>          ReportedBy: devnull@plzk.org
>>>          Regression: No
>>>
>>>
>>> Created an attachment (id=26687)
>>>   -->  (https://bugzilla.kernel.org/attachment.cgi?id=26687)
>>> dmesg
>>>
>>> Never seen this before.
>>>
>>> 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux
>>>
>>> [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0
>>> [48126.787691] Pid: 1895, comm: Xorg Tainted: G        W   2.6.35-rc1 #1
>>> [48126.787694] Call Trace:
>>> [48126.787709]  [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0
>>> [48126.787716]  [<ffffffff81148695>] alloc_pages_current+0x95/0x100
>>> [48126.787720]  [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0
>>> [48126.787724]  [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0
>>> [48126.787730]  [<ffffffff81403f91>] ? security_vm_enough_memory_kern+0x21/0x30
>>> [48126.787736]  [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70
>>> [48126.787740]  [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0
>>> [48126.787744]  [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70
>>> [48126.787747]  [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140
>>> [48126.787750]  [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140
>>> [48126.787755]  [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10
>>> [48126.787758]  [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100
>>> [48126.787765]  [<ffffffff81627fe4>] i915_gem_object_bind_to_gtt+0x144/0x340
>>> [48126.787768]  [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0
>>> [48126.787772]  [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0
>>> [48126.787777]  [<ffffffff81091ba0>] ? __is_ram+0x0/0x10
>>> [48126.787783]  [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0
>>> [48126.787787]  [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390
>>> [48126.787790]  [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390
>>> [48126.787794]  [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0
>>> [48126.787799]  [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0
>>> [48126.787802]  [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390
>>> [48126.787807]  [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0
>>> [48126.787810]  [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580
>>> [48126.787814]  [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0
>>> [48126.787820]  [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b
>>>
>>>        
>> David, I have a vague feeling that we've been round this loop before..
>>
>> Why does agp_alloc_page_array() use __GFP_NORETRY?  It's pretty unusual
>> and it's what caused this spew.
>>
>> There's nothing in the changelog and the only relevant commentary
>> appears to be "This speeds things up and also saves memory for small
>> AGP regions", which is inscrutable.  Can you please add a usable
>> comment there?
>>      
> cc'ing Thomas, who added this, I expect we could drop the NORETRY or
> just add NOWARN. Though an order 1 page alloc failure isn't a pretty
> sight, not sure how a vmalloc fallback could save us.
>
>    

Hmm. IIRC that was an untested speed optimization back from the time 
when I was
reading ldd3. I think the idea was to avoid slow allocations of (order > 
0) if they weren't
immediately available and fall back to vmalloc single page allocations.
It might be that that functionality is no longer preserved and only the 
__GFP_NORETRY remains.
I think it should be safe to remove the NORETRY if it's annoying, but it 
should probably be equally safe to add a NOWARN and keep the vmalloc 
fallback.

Now if we still get a "definitive" page allocation failure in this 
codepath, that's not good, but hardly the AGP driver's fault.  Has Intel 
added some kind of accounting for pinned pages yet?


>> Presumably this was added in response to some observed behaviour, but
>> what was it??
>>
>>
>> If the __GFP_NORETRY is indeed useful and legitimate and given that we
>> have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as
>> well to keep the bug reports away.
>>
>> btw, agp_memory.vmalloc_flag can be done away with - it's conventional
>> to use is_vmalloc_addr() for this.
>>      
> Lols, conventional my ass, we wanted to add that thing years ago for
> this purpose and got told that would be an insane interface, then the
> same person added the interface a year later and never fixed AGP to use
> it.
>
>    


Indeed. I even recall the phrase "Too ugly to live" :).

> I'll try and write a patch.
>
> Dave.
>
>    
/Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-11  8:46     ` Thomas Hellstrom
@ 2010-06-11 17:24       ` Andrew Morton
  2010-06-11 20:22         ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2010-06-11 17:24 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: Dave Airlie, bugzilla-daemon@bugzilla.kernel.org,
	devnull@plzk.org, dri-devel@lists.freedesktop.org

On Fri, 11 Jun 2010 10:46:07 +0200 Thomas Hellstrom <thellstrom@vmware.com> wrote:

> >>>
> >>>        
> >> David, I have a vague feeling that we've been round this loop before..
> >>
> >> Why does agp_alloc_page_array() use __GFP_NORETRY?  It's pretty unusual
> >> and it's what caused this spew.
> >>
> >> There's nothing in the changelog and the only relevant commentary
> >> appears to be "This speeds things up and also saves memory for small
> >> AGP regions", which is inscrutable.  Can you please add a usable
> >> comment there?
> >>      
> > cc'ing Thomas, who added this, I expect we could drop the NORETRY or
> > just add NOWARN. Though an order 1 page alloc failure isn't a pretty
> > sight, not sure how a vmalloc fallback could save us.
> >
> >    
> 
> Hmm. IIRC that was an untested speed optimization back from the time 
> when I was
> reading ldd3. I think the idea was to avoid slow allocations of (order > 
> 0) if they weren't
> immediately available and fall back to vmalloc single page allocations.
> It might be that that functionality is no longer preserved and only the 
> __GFP_NORETRY remains.
> I think it should be safe to remove the NORETRY if it's annoying, but it 
> should probably be equally safe to add a NOWARN and keep the vmalloc 
> fallback.

An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we
use them for kernel stacks all the time.  I'd say just remove the
__GFP_NORETRY and be happy.

In fact if the allocations are always this small I'd say we can remove
the vmalloc fallback too.  However if under some circumstances the
allocations can be "large", say order-4 or higher then allocation
failures are still a risk.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-11 17:24       ` Andrew Morton
@ 2010-06-11 20:22         ` Thomas Hellstrom
  2010-06-11 20:37           ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2010-06-11 20:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Airlie, bugzilla-daemon@bugzilla.kernel.org,
	devnull@plzk.org, dri-devel@lists.freedesktop.org

On 06/11/2010 07:24 PM, Andrew Morton wrote:
> On Fri, 11 Jun 2010 10:46:07 +0200 Thomas Hellstrom<thellstrom@vmware.com>  wrote:
>
>    
>>>>>
>>>>>            
>>>> David, I have a vague feeling that we've been round this loop before..
>>>>
>>>> Why does agp_alloc_page_array() use __GFP_NORETRY?  It's pretty unusual
>>>> and it's what caused this spew.
>>>>
>>>> There's nothing in the changelog and the only relevant commentary
>>>> appears to be "This speeds things up and also saves memory for small
>>>> AGP regions", which is inscrutable.  Can you please add a usable
>>>> comment there?
>>>>
>>>>          
>>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or
>>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty
>>> sight, not sure how a vmalloc fallback could save us.
>>>
>>>
>>>        
>> Hmm. IIRC that was an untested speed optimization back from the time
>> when I was
>> reading ldd3. I think the idea was to avoid slow allocations of (order>
>> 0) if they weren't
>> immediately available and fall back to vmalloc single page allocations.
>> It might be that that functionality is no longer preserved and only the
>> __GFP_NORETRY remains.
>> I think it should be safe to remove the NORETRY if it's annoying, but it
>> should probably be equally safe to add a NOWARN and keep the vmalloc
>> fallback.
>>      
> An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we
> use them for kernel stacks all the time.  I'd say just remove the
> __GFP_NORETRY and be happy.
>
> In fact if the allocations are always this small I'd say we can remove
> the vmalloc fallback too.  However if under some circumstances the
> allocations can be "large", say order-4 or higher then allocation
> failures are still a risk.
>
>    

Actually, At that time I was working with a SiS GPU (128MiB system), and 
was getting persistent failures for order 1 GFP_KERNEL page allocations 
(albeit not in this codepath). So while they are highly unlikely for 
modern systems, it might be worthwhile keeping the fallback.

/Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-11 20:22         ` Thomas Hellstrom
@ 2010-06-11 20:37           ` Andrew Morton
  2010-06-11 21:39             ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2010-06-11 20:37 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: Dave Airlie, bugzilla-daemon@bugzilla.kernel.org,
	devnull@plzk.org, dri-devel@lists.freedesktop.org

On Fri, 11 Jun 2010 22:22:28 +0200
Thomas Hellstrom <thellstrom@vmware.com> wrote:

> >>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or
> >>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty
> >>> sight, not sure how a vmalloc fallback could save us.
> >>>
> >>>
> >>>        
> >> Hmm. IIRC that was an untested speed optimization back from the time
> >> when I was
> >> reading ldd3. I think the idea was to avoid slow allocations of (order>
> >> 0) if they weren't
> >> immediately available and fall back to vmalloc single page allocations.
> >> It might be that that functionality is no longer preserved and only the
> >> __GFP_NORETRY remains.
> >> I think it should be safe to remove the NORETRY if it's annoying, but it
> >> should probably be equally safe to add a NOWARN and keep the vmalloc
> >> fallback.
> >>      
> > An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we
> > use them for kernel stacks all the time.  I'd say just remove the
> > __GFP_NORETRY and be happy.
> >
> > In fact if the allocations are always this small I'd say we can remove
> > the vmalloc fallback too.  However if under some circumstances the
> > allocations can be "large", say order-4 or higher then allocation
> > failures are still a risk.
> >
> >    
> 
> Actually, At that time I was working with a SiS GPU (128MiB system), and 
> was getting persistent failures for order 1 GFP_KERNEL page allocations 
> (albeit not in this codepath). So while they are highly unlikely for 
> modern systems, it might be worthwhile keeping the fallback.

128MB total RAM?  Those were the days.

Various page reclaim changes have been made in the past year or so
which _should_ improve that (eg, lumpy reclaim) but yeah, it's by no
means a certainty.

The vmalloc fallback hardly hurts anyone.  But it does mean that hardly
anyone ever executes that codepath, so it won't get tested much.

There was a patch recently which added an API formalising the
alloc_pages-then-vmalloc fallback approach.  It didn't get merged,
although there weren't strong feelings either way really.  One benefit
of that approach is that the alloc/free code itself would get more
testing coverage, but callers can still screw things up by failing to
handle vmalloc memory correctly for DMA mapping purposes.

Oh well, where were we?  Remove that __GFP_NORETRY?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] New: page allocation failure. order:1, mode:0x50d0
  2010-06-11 20:37           ` Andrew Morton
@ 2010-06-11 21:39             ` Thomas Hellstrom
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Hellstrom @ 2010-06-11 21:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Airlie, bugzilla-daemon@bugzilla.kernel.org,
	devnull@plzk.org, dri-devel@lists.freedesktop.org

On 06/11/2010 10:37 PM, Andrew Morton wrote:
> On Fri, 11 Jun 2010 22:22:28 +0200
> Thomas Hellstrom<thellstrom@vmware.com>  wrote:
>
>    
>>>>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or
>>>>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty
>>>>> sight, not sure how a vmalloc fallback could save us.
>>>>>
>>>>>
>>>>>
>>>>>            
>>>> Hmm. IIRC that was an untested speed optimization back from the time
>>>> when I was
>>>> reading ldd3. I think the idea was to avoid slow allocations of (order>
>>>> 0) if they weren't
>>>> immediately available and fall back to vmalloc single page allocations.
>>>> It might be that that functionality is no longer preserved and only the
>>>> __GFP_NORETRY remains.
>>>> I think it should be safe to remove the NORETRY if it's annoying, but it
>>>> should probably be equally safe to add a NOWARN and keep the vmalloc
>>>> fallback.
>>>>
>>>>          
>>> An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we
>>> use them for kernel stacks all the time.  I'd say just remove the
>>> __GFP_NORETRY and be happy.
>>>
>>> In fact if the allocations are always this small I'd say we can remove
>>> the vmalloc fallback too.  However if under some circumstances the
>>> allocations can be "large", say order-4 or higher then allocation
>>> failures are still a risk.
>>>
>>>
>>>        
>> Actually, At that time I was working with a SiS GPU (128MiB system), and
>> was getting persistent failures for order 1 GFP_KERNEL page allocations
>> (albeit not in this codepath). So while they are highly unlikely for
>> modern systems, it might be worthwhile keeping the fallback.
>>      
> 128MB total RAM?  Those were the days.
>
> Various page reclaim changes have been made in the past year or so
> which _should_ improve that (eg, lumpy reclaim) but yeah, it's by no
> means a certainty.
>
> The vmalloc fallback hardly hurts anyone.  But it does mean that hardly
> anyone ever executes that codepath, so it won't get tested much.
>
> There was a patch recently which added an API formalising the
> alloc_pages-then-vmalloc fallback approach.  It didn't get merged,
> although there weren't strong feelings either way really.  One benefit
> of that approach is that the alloc/free code itself would get more
> testing coverage, but callers can still screw things up by failing to
> handle vmalloc memory correctly for DMA mapping purposes.
>
> Oh well, where were we?  Remove that __GFP_NORETRY?
>    

Yeah, I think that's the sanest thing to do!

/Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] page allocation failure. order:1, mode:0x50d0
       [not found] ` <201006131301.o5DD1vqb002755@demeter.kernel.org>
@ 2010-06-15 22:41   ` Andrew Morton
  2010-06-15 22:57     ` Dave Airlie
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2010-06-15 22:41 UTC (permalink / raw)
  To: mikko.cal; +Cc: thellstrom, devnull, bugzilla-daemon, dri-devel, Dave Airlie


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

(switching back to email, actually)

On Sun, 13 Jun 2010 13:01:57 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16148
>
>
> Mikko C. <mikko.cal@gmail.com> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |mikko.cal@gmail.com
>
>
>
>
> --- Comment #8 from Mikko C. <mikko.cal@gmail.com>  2010-06-13 13:01:53 ---
> I have been getting this with 2.6.35-rc2 and rc3.
> Could it be the same problem?
>
>
> X: page allocation failure. order:0, mode:0x4
> Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1
> Call Trace:
>  [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680
>  [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680
>  [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm]
>  [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm]
>  [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm]
>  [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm]
>  [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm]
>  [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm]
>  [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm]
>  [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20
>  [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm]
>  [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm]
>  [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon]
>  [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon]
>  [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon]
>  [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon]
>  [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm]
>  [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon]
>  [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100
>  [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0
>  [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530
>  [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0
>  [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80
>  [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90
>  [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b

That's different.  ttm_get_pages() looks pretty busted to me.  It's not
using __GFP_WAIT and it's not using __GFP_FS.  It's using a plain
GFP_DMA32 so it's using atomic allocations even though it doesn't need
to.  IOW, it's shooting itself in the head.

Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS
and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT.
If so, it should very much do so.  If not then the function is
misdesigned and should be altered to take a gfp_t argument so the
caller can tell ttm_get_pages() which is the strongest allocation mode
which it may use.

> [TTM] Unable to allocate page.
> radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002)
> [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456,
> 2, 4096, -12)

This bug actually broke stuff for you.

Something like this:

--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a
+++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page
 	/* No pool for cached pages */
 	if (pool == NULL) {
 		if (flags & TTM_PAGE_FLAG_DMA32)
-			gfp_flags |= GFP_DMA32;
+			gfp_flags |= GFP_KERNEL|GFP_DMA32;
 		else
 			gfp_flags |= GFP_HIGHUSER;
 
_

although I wonder whether it should be using pool->gfp_flags.


It's a shame that this code was developed and merged in secret :( Had
we known, we could have looked at enhancing mempools to cover the
requirement, or at implementing this in some generic fashion rather
than hiding it down in drivers/gpu/drm.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] page allocation failure. order:1, mode:0x50d0
  2010-06-15 22:41   ` [Bug 16148] " Andrew Morton
@ 2010-06-15 22:57     ` Dave Airlie
  2010-06-16 13:54       ` Mel Gorman
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2010-06-15 22:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: mikko.cal, thellstrom, devnull, mel, bugzilla-daemon, dri-devel

On Tue, 2010-06-15 at 15:41 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> (switching back to email, actually)
> 
> On Sun, 13 Jun 2010 13:01:57 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=16148
> >
> >
> > Mikko C. <mikko.cal@gmail.com> changed:
> >
> >            What    |Removed                     |Added
> > ----------------------------------------------------------------------------
> >                  CC|                            |mikko.cal@gmail.com
> >
> >
> >
> >
> > --- Comment #8 from Mikko C. <mikko.cal@gmail.com>  2010-06-13 13:01:53 ---
> > I have been getting this with 2.6.35-rc2 and rc3.
> > Could it be the same problem?
> >
> >
> > X: page allocation failure. order:0, mode:0x4
> > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1
> > Call Trace:
> >  [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680
> >  [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680
> >  [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm]
> >  [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm]
> >  [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm]
> >  [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm]
> >  [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm]
> >  [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm]
> >  [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm]
> >  [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20
> >  [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm]
> >  [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm]
> >  [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon]
> >  [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon]
> >  [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon]
> >  [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon]
> >  [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm]
> >  [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon]
> >  [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100
> >  [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0
> >  [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530
> >  [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0
> >  [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80
> >  [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90
> >  [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b
> 
> That's different.  ttm_get_pages() looks pretty busted to me.  It's not
> using __GFP_WAIT and it's not using __GFP_FS.  It's using a plain
> GFP_DMA32 so it's using atomic allocations even though it doesn't need
> to.  IOW, it's shooting itself in the head.
> 
> Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS
> and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT.
> If so, it should very much do so.  If not then the function is
> misdesigned and should be altered to take a gfp_t argument so the
> caller can tell ttm_get_pages() which is the strongest allocation mode
> which it may use.
> 
> > [TTM] Unable to allocate page.
> > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002)
> > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456,
> > 2, 4096, -12)
> 
> This bug actually broke stuff for you.
> 
> Something like this:
> 
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a
> +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page
>  	/* No pool for cached pages */
>  	if (pool == NULL) {
>  		if (flags & TTM_PAGE_FLAG_DMA32)
> -			gfp_flags |= GFP_DMA32;
> +			gfp_flags |= GFP_KERNEL|GFP_DMA32;
>  		else
>  			gfp_flags |= GFP_HIGHUSER;
>  
> _
> 
> although I wonder whether it should be using pool->gfp_flags.
> 
> 
> It's a shame that this code was developed and merged in secret :( Had
> we known, we could have looked at enhancing mempools to cover the
> requirement, or at implementing this in some generic fashion rather
> than hiding it down in drivers/gpu/drm.

Its been post to lkml at least once or twice over the past few years
though not as much as it was posted to dri-devel, but that was because
we had never seen anyone show any interest in it outside of kernel
hackers. Originally I was going to use the generic allocator stuff ia64
uses for uncached allocations but it allocates memory ranges not pages
so it wasn't useful. I also suggested getting a page flag for uncached
allocator stuff, I was told to go write the code in my own corner and
prove it was required. So I did, and it was cleaned up and worked on by
others and I merged it. So can we lay off with the "in secret", the
original code is nearly 2 years old at this point and just because -mm
hackers choose to ignore it isn't our fault. Patches welcome.

So now back to the bug:

So the page pools are setup with gfp flags, in the normal case, 4 pools,
one WC GFP_HIGHUSER pages, one UC HIGHUSER pages, one WC GFP_USER|
GFP_DMA32, one UC GFP_USER|GFP_DMA32, so the pools are all fine, the
problem here is the same as before we added the pools, which is the
normal page allocation path, which needs the GFP_USER added instead of
GFP_KERNEL.

That said I've noticed a lot more page allocation failure reports in
2.6.35-rcX than we've gotten for a long time, in code that hasn't
changed (the AGP ones the other day for example) has something in the
core MM regressed (again... didn't this happen back in 2.6.31 or
something).

(cc'ing Mel who tracked these down before).

Dave.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Bug 16148] page allocation failure. order:1, mode:0x50d0
  2010-06-15 22:57     ` Dave Airlie
@ 2010-06-16 13:54       ` Mel Gorman
  0 siblings, 0 replies; 10+ messages in thread
From: Mel Gorman @ 2010-06-16 13:54 UTC (permalink / raw)
  To: Dave Airlie
  Cc: mikko.cal, thellstrom, devnull, bugzilla-daemon, dri-devel,
	Andrew Morton

Hi Dave,

On Wed, Jun 16, 2010 at 08:57:09AM +1000, Dave Airlie wrote:
> > (switching back to email, actually)
> > 
> > On Sun, 13 Jun 2010 13:01:57 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=16148
> > >
> > >
> > > Mikko C. <mikko.cal@gmail.com> changed:
> > >
> > >            What    |Removed                     |Added
> > > ----------------------------------------------------------------------------
> > >                  CC|                            |mikko.cal@gmail.com
> > >
> > >
> > >
> > >
> > > --- Comment #8 from Mikko C. <mikko.cal@gmail.com>  2010-06-13 13:01:53 ---
> > > I have been getting this with 2.6.35-rc2 and rc3.
> > > Could it be the same problem?
> > >
> > >
> > > X: page allocation failure. order:0, mode:0x4
> > > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1
> > > Call Trace:
> > >  [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680
> > >  [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680
> > >  [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm]
> > >  [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm]
> > >  [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm]
> > >  [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm]
> > >  [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm]
> > >  [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm]
> > >  [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm]
> > >  [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20
> > >  [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm]
> > >  [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm]
> > >  [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon]
> > >  [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon]
> > >  [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon]
> > >  [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon]
> > >  [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm]
> > >  [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon]
> > >  [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100
> > >  [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0
> > >  [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530
> > >  [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0
> > >  [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80
> > >  [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90
> > >  [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b
> > 
> > That's different.  ttm_get_pages() looks pretty busted to me.  It's not
> > using __GFP_WAIT and it's not using __GFP_FS.  It's using a plain
> > GFP_DMA32 so it's using atomic allocations even though it doesn't need
> > to.  IOW, it's shooting itself in the head.
> > 

This is an accurate assessment.

> > Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS
> > and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT.
> > If so, it should very much do so.  If not then the function is
> > misdesigned and should be altered to take a gfp_t argument so the
> > caller can tell ttm_get_pages() which is the strongest allocation mode
> > which it may use.
> > 
> > > [TTM] Unable to allocate page.
> > > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002)
> > > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456,
> > > 2, 4096, -12)
> > 
> > This bug actually broke stuff for you.
> > 
> > Something like this:
> > 
> > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a
> > +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> > @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page
> >  	/* No pool for cached pages */
> >  	if (pool == NULL) {
> >  		if (flags & TTM_PAGE_FLAG_DMA32)
> > -			gfp_flags |= GFP_DMA32;
> > +			gfp_flags |= GFP_KERNEL|GFP_DMA32;
> >  		else
> >  			gfp_flags |= GFP_HIGHUSER;
> >  
> > _
> > 
> > although I wonder whether it should be using pool->gfp_flags.
> > 
> > 
> > It's a shame that this code was developed and merged in secret :( Had
> > we known, we could have looked at enhancing mempools to cover the
> > requirement, or at implementing this in some generic fashion rather
> > than hiding it down in drivers/gpu/drm.
> 
> Its been post to lkml at least once or twice over the past few years
> though not as much as it was posted to dri-devel, but that was because
> we had never seen anyone show any interest in it outside of kernel
> hackers. Originally I was going to use the generic allocator stuff ia64
> uses for uncached allocations but it allocates memory ranges not pages
> so it wasn't useful. I also suggested getting a page flag for uncached
> allocator stuff, I was told to go write the code in my own corner and
> prove it was required. So I did, and it was cleaned up and worked on by
> others and I merged it. So can we lay off with the "in secret", the
> original code is nearly 2 years old at this point and just because -mm
> hackers choose to ignore it isn't our fault. Patches welcome.
> 

No comment on this aspect of things.

> So now back to the bug:
> 
> So the page pools are setup with gfp flags, in the normal case, 4 pools,
> one WC GFP_HIGHUSER pages, one UC HIGHUSER pages, one WC GFP_USER|
> GFP_DMA32, one UC GFP_USER|GFP_DMA32, so the pools are all fine, the
> problem here is the same as before we added the pools, which is the
> normal page allocation path, which needs the GFP_USER added instead of
> GFP_KERNEL.
> 
> That said I've noticed a lot more page allocation failure reports in
> 2.6.35-rcX than we've gotten for a long time, in code that hasn't
> changed (the AGP ones the other day for example) has something in the
> core MM regressed (again... didn't this happen back in 2.6.31 or
> something).
> 
> (cc'ing Mel who tracked these down before).
> 

Ok, so there has been very little changed in the page allocator that
might have caused this. The only possibility I can think of is
[6dda9d: page allocator: reduce fragmentation in buddy allocator by
adding buddies that are merging to the tail of the free lists] but I
would have expected the patch to help, not hinder, this situation.

In the last few kernel releases when there has been a spike in high-order
allocation, it has been largely due to an increased source of high-order
callers from somewhere. I am 99% sure the last time I encountered this, I used
the following script to debug it even though it's a bit of a hack. It uses
ftrace to track high-order allocators and reports the number of high-order
allocations (either atomic or normal) and the count of unique backtraces
(to point the finger at bad callers)

Unfortunately, I don't have access to a suitable machine to test this with
right now (my laptop isn't making high-order allocations to confirm yet)
but I'm pretty sure it's the right one!

If you do some fixed set of tests with 2.6.34 and 2.6.35-rc2 with this script
running and compare the reports, it might point the finger at a new source
of frequent high-order allocations that might be leading to more failures.
Do something like

./watch-highorder.pl -o highorder-2.6.34-report.txt
PID=$!
Do your tests
# Two sigints in quick succession to make the monitor exit
kill $PID
kill $PID 

==== CUT HERE ====
#!/usr/bin/perl
# This is a tool that analyses the trace output related to page allocation,
# sums up the number of high-order allocations taking place and who the
# callers are
#
# Example usage: trace-pagealloc-highorder.pl -o output-report.txt
# options
# -o, --output	Where to store the report
#
# Copyright (c) IBM Corporation 2009
# Author: Mel Gorman <mel@csn.ul.ie>
use strict;
use Getopt::Long;

# Tracepoint events
use constant MM_PAGE_ALLOC		=> 1;
use constant EVENT_UNKNOWN		=> 7;

use constant HIGH_NORMAL_HIGHORDER_ALLOC	=> 10;
use constant HIGH_ATOMIC_HIGHORDER_ALLOC	=> 11;

my $opt_output;
my %stats;
my %unique_events;
my $last_updated = 0;

$stats{HIGH_NORMAL_HIGHORDER_ALLOC} = 0;
$stats{HIGH_ATOMIC_HIGHORDER_ALLOC} = 0;

# Catch sigint and exit on request
my $sigint_report = 0;
my $sigint_exit = 0;
my $sigint_pending = 0;
my $sigint_received = 0;
sub sigint_handler {
	my $current_time = time;
	if ($current_time - 2 > $sigint_received) {
		print "SIGINT received, report pending. Hit ctrl-c again to exit\n";
		$sigint_report = 1;
	} else {
		if (!$sigint_exit) {
			print "Second SIGINT received quickly, exiting\n";
		}
		$sigint_exit++;
	}

	if ($sigint_exit > 3) {
		print "Many SIGINTs received, exiting now without report\n";
		exit;
	}

	$sigint_received = $current_time;
	$sigint_pending = 1;
}
$SIG{INT} = "sigint_handler";

# Parse command line options
GetOptions(
	'output=s'    => \$opt_output,
);

# Defaults for dynamically discovered regex's
my $regex_pagealloc_default = 'page=([0-9a-f]*) pfn=([0-9]*) order=([-0-9]*) migratetype=([-0-9]*) gfp_flags=([A-Z_|]*)';

# Dyanically discovered regex
my $regex_pagealloc;

# Static regex used. Specified like this for readability and for use with /o
#                      (process_pid)     (cpus      )   ( time  )   (tpoint    ) (details)
my $regex_traceevent = '\s*([a-zA-Z0-9-]*)\s*(\[[0-9]*\])\s*([0-9.]*):\s*([a-zA-Z_]*):\s*(.*)';
my $regex_statname = '[-0-9]*\s\((.*)\).*';
my $regex_statppid = '[-0-9]*\s\(.*\)\s[A-Za-z]\s([0-9]*).*';

sub generate_traceevent_regex {
	my $event = shift;
	my $default = shift;
	my $regex;

	# Read the event format or use the default
	if (!open (FORMAT, "/sys/kernel/debug/tracing/events/$event/format")) {
		$regex = $default;
	} else {
		my $line;
		while (!eof(FORMAT)) {
			$line = <FORMAT>;
			if ($line =~ /^print fmt:\s"(.*)",.*/) {
				$regex = $1;
				$regex =~ s/%p/\([0-9a-f]*\)/g;
				$regex =~ s/%d/\([-0-9]*\)/g;
				$regex =~ s/%lu/\([0-9]*\)/g;
				$regex =~ s/%s/\([A-Z_|]*\)/g;
				$regex =~ s/\(REC->gfp_flags\).*/REC->gfp_flags/;
				$regex =~ s/\",.*//;
			}
		}
	}

	# Verify fields are in the right order
	my $tuple;
	foreach $tuple (split /\s/, $regex) {
		my ($key, $value) = split(/=/, $tuple);
		my $expected = shift;
		if ($key ne $expected) {
			print("WARNING: Format not as expected '$key' != '$expected'");
			$regex =~ s/$key=\((.*)\)/$key=$1/;
		}
	}

	if (defined shift) {
		die("Fewer fields than expected in format");
	}

	return $regex;
}
$regex_pagealloc = generate_traceevent_regex("kmem/mm_page_alloc",
			$regex_pagealloc_default,
			"page", "pfn",
			"order", "migratetype", "gfp_flags");

sub process_events {
	my $traceevent;
	my $process_pid = 0;
	my $cpus;
	my $timestamp;
	my $tracepoint;
	my $details;
	my $statline;
	my $nextline = 1;

	# Read each line of the event log
EVENT_PROCESS:
	while (($traceevent = <TRACING>) && !$sigint_exit) {
SKIP_LINEREAD:

		if ($traceevent eq "") {
			last EVENT_PROCSS;
		}

		if ($traceevent =~ /$regex_traceevent/o) {
			$process_pid = $1;
			$tracepoint = $4;
		} else {
			next;
		}

		# Perl Switch() sucks majorly
		if ($tracepoint eq "mm_page_alloc") {
			my ($page, $order, $gfp_flags, $type);
			my ($atomic);
			my $details = $5;

			if ($details !~ /$regex_pagealloc/o) {
				print "WARNING: Failed to parse mm_page_alloc as expected\n";
				print "$details\n";
				print "$regex_pagealloc\n";
				print "\n";
				next;
			}
			$page = $1;
			$order = $3;
			$gfp_flags = $5;

			# Only concerned with high-order allocs
			if ($order == 0) {
				next;
			}

			$stats{MM_PAGE_ALLOC}++;

			if ($gfp_flags =~ /ATOMIC/) {
				$stats{HIGH_ATOMIC_HIGHORDER_ALLOC}++;
				$type = "atomic";
			} else {
				$stats{HIGH_NORMAL_HIGHORDER_ALLOC}++;
				$type = "normal";
			}

			# Record the stack trace
			$traceevent = <TRACING>;
			if ($traceevent !~ /stack trace/) {
				goto SKIP_LINEREAD;
			}
			my $event = "order=$order $type gfp_flags=$gfp_flags\n";;
			while ($traceevent = <TRACING>) {
				if ($traceevent !~ /^ =>/) {
					$unique_events{$event}++;
					my $current = time;

					if ($current - $last_updated > 60) {
						$last_updated = $current;
						update_report();
					}
					goto SKIP_LINEREAD;
				}
				$event .= $traceevent;
			}
		} else {
			$stats{EVENT_UNKNOWN}++;
		}

		if ($sigint_pending) {
			last EVENT_PROCESS;
		}
	}
}

sub update_report() {
	my $event;
	open (REPORT, ">$opt_output") || die ("Failed to open $opt_output for writing");

	foreach $event (keys %unique_events) {
		print REPORT $unique_events{$event} . " instances $event\n";
	}
	print REPORT "High-order normal allocations: " . $stats{HIGH_NORMAL_HIGHORDER_ALLOC} . "\n";
	print REPORT "High-order atomic allocations: " . $stats{HIGH_ATOMIC_HIGHORDER_ALLOC} . "\n";

	close REPORT;
}

sub print_report() {
	print "\nReport\n";
	open (REPORT, $opt_output) || die ("Failed to open $opt_output for reading");
	while (<REPORT>) {
		print $_;
	}
	close REPORT;
}

# Process events or signals until neither is available
sub signal_loop() {
	my $sigint_processed;
	do {
		$sigint_processed = 0;
		process_events();

		# Handle pending signals if any
		if ($sigint_pending) {
			my $current_time = time;

			if ($sigint_exit) {
				print "Received exit signal\n";
				$sigint_pending = 0;
			}
			if ($sigint_report) {
				if ($current_time >= $sigint_received + 2) {
					update_report();
					print_report();
					$sigint_report = 0;
					$sigint_pending = 0;
					$sigint_processed = 1;
					$sigint_exit = 0;
				}
			}
		}
	} while ($sigint_pending || $sigint_processed);
}

sub set_traceoption($) {
	my $option = shift;

	open(TRACEOPT, ">/sys/kernel/debug/tracing/trace_options") || die("Failed to open trace_options");
	print TRACEOPT $option;
	close TRACEOPT;
}

sub enable_tracing($) {
	my $tracing = shift;

	open(TRACING, ">/sys/kernel/debug/tracing/events/$tracing/enable") || die("Failed to open tracing event $tracing");
	print TRACING "1";
	close TRACING;
}

sub disable_tracing($) {
	my $tracing = shift;

	open(TRACING, ">/sys/kernel/debug/tracing/events/$tracing/enable") || die("Failed to open tracing event $tracing");
	print TRACING "0";
	close TRACING;

}

set_traceoption("stacktrace");
set_traceoption("sym-offset");
set_traceoption("sym-addr");
enable_tracing("kmem/mm_page_alloc");
open(TRACING, "/sys/kernel/debug/tracing/trace_pipe") || die("Failed to open trace_pipe");
signal_loop();
close TRACING;
disable_tracing("kmem/mm_page_alloc");
update_report();
print_report();

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-06-16 14:03 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bug-16148-27@https.bugzilla.kernel.org/>
2010-06-10 22:38 ` [Bug 16148] New: page allocation failure. order:1, mode:0x50d0 Andrew Morton
2010-06-10 23:15   ` Dave Airlie
2010-06-11  8:46     ` Thomas Hellstrom
2010-06-11 17:24       ` Andrew Morton
2010-06-11 20:22         ` Thomas Hellstrom
2010-06-11 20:37           ` Andrew Morton
2010-06-11 21:39             ` Thomas Hellstrom
     [not found] ` <201006131301.o5DD1vqb002755@demeter.kernel.org>
2010-06-15 22:41   ` [Bug 16148] " Andrew Morton
2010-06-15 22:57     ` Dave Airlie
2010-06-16 13:54       ` Mel Gorman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.