From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: [Bug 16148] page allocation failure. order:1, mode:0x50d0
Date: Tue, 15 Jun 2010 15:41:38 -0700
Message-ID: <20100615154138.11622d81.akpm@linux-foundation.org>
References: <201006131301.o5DD1vqb002755@demeter.kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from smtp1.linux-foundation.org (smtp1.linux-foundation.org [140.211.169.13])
	by gabe.freedesktop.org (Postfix) with ESMTP id 9FA649E878
	for ; Tue, 15 Jun 2010 15:42:20 -0700 (PDT)
In-Reply-To: <201006131301.o5DD1vqb002755@demeter.kernel.org>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
To: mikko.cal@gmail.com
Cc: thellstrom@vmware.com, devnull@plzk.org,
	bugzilla-daemon@bugzilla.kernel.org,
	dri-devel@lists.freedesktop.org, Dave Airlie
List-Id: dri-devel@lists.freedesktop.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

(switching back to email, actually)

On Sun, 13 Jun 2010 13:01:57 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16148
>
>
> Mikko C. changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |mikko.cal@gmail.com
>
>
>
>
> --- Comment #8 from Mikko C. 2010-06-13 13:01:53 ---
> I have been getting this with 2.6.35-rc2 and rc3.
> Could it be the same problem?
>
>
> X: page allocation failure. order:0, mode:0x4
> Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1
> Call Trace:
>  [] ? __alloc_pages_nodemask+0x629/0x680
>  [] ? __alloc_pages_nodemask+0x100/0x680
>  [] ? ttm_get_pages+0x2c3/0x448 [ttm]
>  [] ? __ttm_tt_get_page+0x98/0xc0 [ttm]
>  [] ? ttm_tt_populate+0x48/0x90 [ttm]
>  [] ? ttm_tt_bind+0x56/0xa0 [ttm]
>  [] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm]
>  [] ? ttm_bo_move_buffer+0x166/0x180 [ttm]
>  [] ? drm_mm_kmalloc+0x26/0xc0 [drm]
>  [] ? get_parent_ip+0x9/0x20
>  [] ? ttm_bo_validate+0x96/0x130 [ttm]
>  [] ? ttm_bo_init+0x315/0x390 [ttm]
>  [] ? radeon_bo_create+0x118/0x210 [radeon]
>  [] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon]
>  [] ? radeon_gem_object_create+0x8c/0x110 [radeon]
>  [] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon]
>  [] ? drm_ioctl+0x3d6/0x470 [drm]
>  [] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon]
>  [] ? do_sync_read+0xbf/0x100
>  [] ? vfs_ioctl+0x35/0xd0
>  [] ? do_vfs_ioctl+0x88/0x530
>  [] ? sub_preempt_count+0x87/0xb0
>  [] ? sys_ioctl+0x49/0x80
>  [] ? sys_read+0x4e/0x90
>  [] ? system_call_fastpath+0x16/0x1b

That's different.  ttm_get_pages() looks pretty busted to me.  It's not
using __GFP_WAIT and it's not using __GFP_FS.  It's using a plain
GFP_DMA32 so it's using atomic allocations even though it doesn't need
to.  IOW, it's shooting itself in the head.

Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS
and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT.
If so, it should very much do so.  If not then the function is
misdesigned and should be altered to take a gfp_t argument so the
caller can tell ttm_get_pages() which is the strongest allocation mode
which it may use.

> [TTM] Unable to allocate page.
> radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002)
> [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456,
> 2, 4096, -12)

This bug actually broke stuff for you.

Something like this:

--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a
+++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page
 	/* No pool for cached pages */
 	if (pool == NULL) {
 		if (flags & TTM_PAGE_FLAG_DMA32)
-			gfp_flags |= GFP_DMA32;
+			gfp_flags |= GFP_KERNEL|GFP_DMA32;
 		else
 			gfp_flags |= GFP_HIGHUSER;
 
_

although I wonder whether it should be using pool->gfp_flags.

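The flag arithmetic behind that patch can be checked in userspace. What
follows is only a sketch, not kernel code: the SK_* constants are
assumed bit values modelled on include/linux/gfp.h from around 2.6.35,
and every sk_* helper is a hypothetical name made up for illustration.
It shows that the reported mode:0x4 is a bare DMA32 zone bit with no
__GFP_WAIT or __GFP_FS, and that OR-ing in GFP_KERNEL restores both.

```c
#include <assert.h>

/* Assumed bit values, modelled on 2.6.35-era include/linux/gfp.h. */
#define SK_GFP_DMA32  0x04u  /* zone selector only, no behaviour bits */
#define SK_GFP_WAIT   0x10u  /* allocator may sleep and reclaim */
#define SK_GFP_IO     0x40u  /* allocator may start physical I/O */
#define SK_GFP_FS     0x80u  /* allocator may call into the filesystem */
#define SK_GFP_KERNEL (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)

/* Hypothetical helpers: which behaviours does a mask permit? */
static int sk_can_sleep(unsigned int mask)
{
	return (mask & SK_GFP_WAIT) != 0;
}

static int sk_can_use_fs(unsigned int mask)
{
	return (mask & SK_GFP_FS) != 0;
}

/* Mask built by the DMA32 branch before the proposed change. */
static unsigned int sk_mask_before(void)
{
	return SK_GFP_DMA32;
}

/* Mask built by the DMA32 branch after the proposed change. */
static unsigned int sk_mask_after(void)
{
	return SK_GFP_KERNEL | SK_GFP_DMA32;
}

/* Self-check tying the masks back to the numbers in the report. */
static void sk_self_check(void)
{
	/* mode:0x4 from the failure message: cannot sleep, cannot reclaim. */
	assert(sk_mask_before() == 0x4u);
	assert(!sk_can_sleep(sk_mask_before()));
	assert(!sk_can_use_fs(sk_mask_before()));

	/* With GFP_KERNEL OR-ed in, both behaviours are allowed. */
	assert(sk_can_sleep(sk_mask_after()));
	assert(sk_can_use_fs(sk_mask_after()));
}
```

Calling sk_self_check() from a small main() exercises the assertions.
Under the same assumed values, the mode:0x50d0 in the subject line
already contains all three GFP_KERNEL bits, which fits the remark that
the order:0, mode:0x4 failure is a different problem.
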
It's a shame that this code was developed and merged in secret :( Had we known, we could have looked at enhancing mempools to cover the requirement, or at implementing this in some generic fashion rather than hiding it down in drivers/gpu/drm.