From mboxrd@z Thu Jan  1 00:00:00 1970
From: lauraa@codeaurora.org (Laura Abbott)
Date: Fri, 11 May 2012 14:51:24 -0700
Subject: Bad use of highmem with buffer_migrate_page?
In-Reply-To: <02fc01cd2f50$5d77e4c0$1867ae40$%szyprowski@samsung.com>
References: <4FAC200D.2080306@codeaurora.org> <02fc01cd2f50$5d77e4c0$1867ae40$%szyprowski@samsung.com>
Message-ID: <4FAD89DC.2090307@codeaurora.org>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 5/11/2012 1:30 AM, Marek Szyprowski wrote:
> Hello,
>
> On Thursday, May 10, 2012 10:08 PM Laura Abbott wrote:
>
>> I did a backport of the Contiguous Memory Allocator to a 3.0.8 tree. I
>> wrote a fairly simple test case that, in 1MB chunks, allocs up to 40MB
>> from a reserved area, maps, writes, unmaps and then frees in an
>> infinite loop. When running this with another program in parallel to
>> put some stress on the filesystem, I hit data aborts in the
>> filesystem/journal layer, although not always with the same backtrace.
>> As an example:
>>
>> [] (__ext4_check_dir_entry+0x20/0x184) from [] (add_dirent_to_buf+0x70/0x2ac)
>> [] (add_dirent_to_buf+0x70/0x2ac) from [] (ext4_add_entry+0xd8/0x4bc)
>> [] (ext4_add_entry+0xd8/0x4bc) from [] (ext4_add_nondir+0x14/0x64)
>> [] (ext4_add_nondir+0x14/0x64) from [] (ext4_create+0xd8/0x120)
>> [] (ext4_create+0xd8/0x120) from [] (vfs_create+0x74/0xa4)
>> [] (vfs_create+0x74/0xa4) from [] (do_last+0x588/0x8d4)
>> [] (do_last+0x588/0x8d4) from [] (path_openat+0xc4/0x394)
>> [] (path_openat+0xc4/0x394) from [] (do_filp_open+0x30/0x7c)
>> [] (do_filp_open+0x30/0x7c) from [] (do_sys_open+0xd8/0x174)
>> [] (do_sys_open+0xd8/0x174) from [] (ret_fast_syscall+0x0/0x30)
>>
>> Every panic had the same issue: a struct buffer_head [1] had a b_data
>> that was unexpectedly NULL.
>>
>> During the course of CMA, buffer_migrate_page could be called to
>> migrate from a CMA page to a new page.
>> buffer_migrate_page calls set_bh_page [2] to set the new page for the
>> buffer_head. If the new page is a highmem page, though, bh->b_data
>> ends up as NULL, which could produce the panics seen above.
>>
>> This seems to indicate that highmem pages are not appropriate for use
>> as pages to migrate to. The following made the problem go away for me:
>>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5753,7 +5753,7 @@ static struct page *
>>  __alloc_contig_migrate_alloc(struct page *page, unsigned long private,
>>                               int **resultp)
>>  {
>> -        return alloc_page(GFP_HIGHUSER_MOVABLE);
>> +        return alloc_page(GFP_USER | __GFP_MOVABLE);
>>  }
>>
>> Does this seem like an actual issue, or is this an artifact of my
>> backport to 3.0? I'm not familiar enough with the filesystem layer to
>> be able to tell where highmem can actually be used.
>
> I will need to investigate this further as this issue doesn't appear on
> v3.3+ kernels, but I remember I saw something similar when I tried CMA
> backported to v3.0.
>

The 3.0 kernel was the most stable I had around and the easiest to work
with. I'll be trying 3.4 sometime in the near future.

> You have pointed to an important issue which we need to solve somehow.
> CMA wasn't fully tested with highmem and it looks like there are some
> issues here and there. Your patch will prevent using highmem for any
> migration triggered by CMA. IMHO this is a bit limited, but right now I
> have no better idea. For a quick backport it should be ok.
>

All the systems I'll be testing with will have highmem, so I'll continue
to report any problems I find. I'll be curious whether this shows up
again when I try 3.4. If it does, I'll see if I can gather evidence of
why this is the proper fix, or produce a more nuanced patch.

> Best regards

Thanks,
Laura

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.