linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: prasadjoshi124@gmail.com
Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org,
	"Ricardo M. Correia" <ricardo.correia@oracle.com>
Subject: Re: [Bug 30702] New: vmalloc(GFP_NOFS) can callback file system evict_inode, inducing deadlock.
Date: Wed, 9 Mar 2011 14:23:11 -0800	[thread overview]
Message-ID: <20110309142311.1d8073fe.akpm@linux-foundation.org> (raw)
In-Reply-To: <bug-30702-27@https.bugzilla.kernel.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Mar 2011 19:12:23 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=30702
> 
>            Summary: vmalloc(GFP_NOFS) can callback file system
>                     evict_inode, inducing deadlock.

Yeah.

Ricardo has been working on this.  See the thread at
http://marc.info/?l=linux-mm&m=128942194520631&w=4

It's tough, and we've been bad, and progress is slow :(

>            Product: Memory Management
>            Version: 2.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>         AssignedTo: akpm@linux-foundation.org
>         ReportedBy: prasadjoshi124@gmail.com
>         Regression: No
> 
> 
> I am working on a propitiatory file system development. The problem I am facing
> is with calling __vmalloc in a lock. Though I am working on changing the code
> that I have, I thought it would be good to atleast report the VMALLOC problem.
> 
> The code looks something like this
> 
> const struct file_operations lzfs_file_operations = {
>     .write              = lzfs_vnop_write,
> };
> 
> ssize_t
> lzfs_vnop_write()
> {
>       mutex_lock(some global mutex);
>       ptr = __vmalloc(size, GFP_NOFS | __GFP_HIGHMEM, PAGE_KERNEL);
>       mutex_unlock(some global mutex);
> }
> 
> static const struct super_operations lzfs_super_ops = {
>     .evict_inode    = lzfs_evict_vnode,
> };
> 
> static void
> lzfs_evict_vnode(struct inode *inode)
> {
>       mutex_lock(some global mutex);
> 
>       some code for eviction;
> 
>       mutex_unlock(some global mutex);
> }
> 
> As the __vmalloc is called with GFP_NOFS, I was expecting the evict_inode (or
> clear_inode) would not be called when page cache is purned. But I noticed
> following oops message during the testing.
> 
> [ 5058.193312]  [<ffffffffa092a534>] lzfs_clear_vnode+0x104/0x160 [lzfs]
> [ 5058.193318]  [<ffffffff8116abc5>] clear_inode+0x75/0xf0
> [ 5058.193323]  [<ffffffff8116ac80>] dispose_list+0x40/0x150
> [ 5058.193328]  [<ffffffff8116af23>] prune_icache+0x193/0x2a0
> [ 5058.193332]  [<ffffffff811665e3>] ? prune_dcache+0x183/0x1d0
> [ 5058.193338]  [<ffffffff8116b081>] shrink_icache_memory+0x51/0x60
> [ 5058.193345]  [<ffffffff8110e6d4>] shrink_slab+0x124/0x180
> [ 5058.193349]  [<ffffffff8110ff0f>] do_try_to_free_pages+0x1cf/0x360
> [ 5058.193354]  [<ffffffff8111024b>] try_to_free_pages+0x6b/0x70
> [ 5058.193359]  [<ffffffff8110740a>] __alloc_pages_slowpath+0x27a/0x590
> [ 5058.193365]  [<ffffffff81107884>] __alloc_pages_nodemask+0x164/0x1d0
> [ 5058.193370]  [<ffffffff811397ba>] alloc_pages_current+0x9a/0x100
> [ 5058.193375]  [<ffffffff811066ce>] __get_free_pages+0xe/0x50
> [ 5058.193380]  [<ffffffff81042435>] pte_alloc_one_kernel+0x15/0x20
> [ 5058.193385]  [<ffffffff8111c86b>] __pte_alloc_kernel+0x1b/0xc0
> [ 5058.193391]  [<ffffffff8112ad63>] vmap_pte_range+0x183/0x1a0
> [ 5058.193395]  [<ffffffff8112aec6>] vmap_pud_range+0x146/0x1c0
> [ 5058.193400]  [<ffffffff8112afda>] vmap_page_range_noflush+0x9a/0xc0
> [ 5058.193405]  [<ffffffff8112b032>] map_vm_area+0x32/0x50
> [ 5058.193410]  [<ffffffff8112c4a8>] __vmalloc_area_node+0x108/0x190
> [ 5058.193426]  [<ffffffffa06591a0>] ? kv_alloc+0x90/0x130 [spl]
> [ 5058.193431]  [<ffffffff8112c392>] __vmalloc_node+0xa2/0xb0
> [ 5058.193443]  [<ffffffffa06591a0>] ? kv_alloc+0x90/0x130 [spl]
> [ 5058.193453]  [<ffffffff8112c712>] __vmalloc+0x22/0x30
> [ 5058.193464]  [<ffffffffa06591a0>] kv_alloc+0x90/0x130 [spl]
> [ 5058.194007]  [<ffffffffa0858136>] zfs_grow_blocksize+0x46/0xe0 [zfs]
> [ 5058.194063]  [<ffffffffa08547e8>] zfs_write+0xbb8/0x1100 [zfs]
> [ 5058.194075]  [<ffffffff8114e740>] ? mem_cgroup_charge_common+0x70/0x90
> [ 5058.194082]  [<ffffffffa092ced7>] lzfs_vnop_write+0xc7/0x3b0 [lzfs]
> [ 5058.194087]  [<ffffffff8111bacc>] ? do_anonymous_page+0x11c/0x350
> [ 5058.194096]  [<ffffffff81152ec8>] vfs_write+0xb8/0x1a0
> [ 5058.194100]  [<ffffffff81153711>] sys_write+0x51/0x80
> [ 5058.194105]  [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
> 
> The problem is with __vmalloc (map_vm_area) which discards the allocation flag
> while mapping the scattered physical pages contiguously into the virtual
> vmalloc area. 
> 
> 1482 static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> 1483                  pgprot_t prot, int node, void *caller)
> 1484 {
> 1525     if (map_vm_area(area, prot, &pages, gfp_mask))
> 1526         goto fail;
> 1527     return area->addr;
> 1532 }
> 
> The function map_vm_area() can result in calls to 
> pud_alloc
> pmd_alloc
> pte_alloc_kernel
> 
> Which allocate memory using flag GFP_KERNEL
> for example
> pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
> {
>     pte_t *pte;
> 
>     pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT|__GFP_ZERO);
>     return pte;
> }
> 
> The page allocation might trigger the clear_inode (or evict_inode) if the
> system is running short of the memory. Thus causing the oops.
> 
> Though non of the file system in Linux Kernel seems to calling vmalloc in a
> lock, it would be good to fix the problem anyway.
> 
> As far as I can understand the solution is to pass the gfp_mask down the call
> hierarchy. I wanted to send the patch with these changes, but soon I realized
> changes are needed at various places and are too much. I thought to reporting
> the problem first.
> 
> Thanks and Regards.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

       reply	other threads:[~2011-03-09 22:24 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-30702-27@https.bugzilla.kernel.org/>
2011-03-09 22:23 ` Andrew Morton [this message]
2011-03-10 12:11   ` [Bug 30702] New: vmalloc(GFP_NOFS) can callback file system evict_inode, inducing deadlock Prasad Joshi
2011-03-10 12:17   ` Ricardo M. Correia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110309142311.1d8073fe.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=bugzilla-daemon@bugzilla.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=prasadjoshi124@gmail.com \
    --cc=ricardo.correia@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).