From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Chinner
Subject: Re: Propagating GFP_NOFS inside __vmalloc()
Date: Thu, 11 Nov 2010 09:10:38 +1100
Message-ID: <20101110221038.GX2715@dastard>
References: <1289421759.11149.59.camel@oralap> <1289424955.11149.73.camel@oralap>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Brian Behlendorf, Andreas Dilger
To: "Ricardo M. Correia"
Return-path:
Received: from bld-mail13.adl6.internode.on.net ([150.101.137.98]:41016 "EHLO mail.internode.on.net" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1756993Ab0KJWKv (ORCPT ); Wed, 10 Nov 2010 17:10:51 -0500
Content-Disposition: inline
In-Reply-To: <1289424955.11149.73.camel@oralap>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, Nov 10, 2010 at 10:35:55PM +0100, Ricardo M. Correia wrote:
> On Wed, 2010-11-10 at 21:42 +0100, Ricardo M. Correia wrote:
> > Hi,
> >
> > As part of Lustre filesystem development, we are running into a
> > situation where we (sporadically) need to call into __vmalloc() from a
> > thread that processes I/Os to disk (it's a long story).
> >
> > In general, this would be fine as long as we pass GFP_NOFS to
> > __vmalloc(), but the problem is that even if we pass this flag, vmalloc
> > itself sometimes allocates memory with GFP_KERNEL.
>
> By the way, it seems that existing users in Linus' tree may be
> vulnerable to the same bug that we experienced:
>
> In GFS:
> 8 1253 fs/gfs2/dir.c <>
>     ptr = __vmalloc(size, GFP_NOFS, PAGE_KERNEL);
>
> The Ceph filesystem:
> 20 22 net/ceph/buffer.c <>
>     b->vec.iov_base = __vmalloc(len, gfp, PAGE_KERNEL);
> .. which can be called from:
> 3 560 fs/ceph/inode.c <>
>     xattr_blob = ceph_buffer_new(iinfo->xattr_len, GFP_NOFS);
>
> In the MM code:
> 18 5184 mm/page_alloc.c <>
>     table = __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL);
>
> All of these seem to be vulnerable to GFP_KERNEL allocations from within
> __vmalloc(), at least on x86-64 (as I've detailed below).

Hmmm. I'd say there's a definite possibility that vm_map_ram(), as
called from fs/xfs/linux-2.6/xfs_buf.c, needs to use GFP_NOFS
allocation, too. Currently vm_map_ram() just uses GFP_KERNEL
internally, but it is certainly being called in contexts where we
don't allow recursion (e.g. in a transaction), so it probably should
allow a gfp mask to be passed in....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
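
The concern raised in this thread can be illustrated with a small
userspace model. This is only a sketch, not kernel code: the MODEL_*
flag values and every model_* function below are hypothetical names
invented for illustration, and they do not correspond to the real gfp
bit layout or to the actual internals of __vmalloc() or vm_map_ram().
The model compares a mapping helper that hard-codes GFP_KERNEL for its
internal allocation with one that propagates the caller's mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only flag bits: NOT the kernel's real gfp values. */
enum {
    MODEL_IO         = 1 << 0,              /* allocator may start disk I/O */
    MODEL_FS         = 1 << 1,              /* allocator may re-enter the filesystem */
    MODEL_GFP_KERNEL = MODEL_IO | MODEL_FS,
    MODEL_GFP_NOFS   = MODEL_IO
};

/* Mask used by the most recent internal allocation in this model. */
static unsigned int last_internal_mask;

/* Hypothetical stand-in for an allocation made inside the mapping code. */
static void model_internal_alloc(unsigned int gfp)
{
    last_internal_mask = gfp;
}

/* Pattern described in the thread: the caller's mask is ignored and
 * the internal allocation always uses GFP_KERNEL. */
static void model_map_hardcoded(unsigned int caller_gfp)
{
    (void)caller_gfp;
    model_internal_alloc(MODEL_GFP_KERNEL);
}

/* Suggested direction: pass the caller's mask down unchanged. */
static void model_map_propagating(unsigned int caller_gfp)
{
    model_internal_alloc(caller_gfp);
}

/* True if the last internal allocation was allowed to recurse into
 * the filesystem. */
static bool model_may_reenter_fs(void)
{
    return (last_internal_mask & MODEL_FS) != 0;
}

/* Exercise both variants from a caller that asked for NOFS. */
static void model_self_check(void)
{
    model_map_hardcoded(MODEL_GFP_NOFS);
    assert(model_may_reenter_fs());     /* NOFS request was lost */

    model_map_propagating(MODEL_GFP_NOFS);
    assert(!model_may_reenter_fs());    /* NOFS request was honoured */
}
```

Calling model_self_check() runs both variants with a NOFS caller. In
the hard-coded variant the internal allocation may still recurse into
the filesystem, which is the situation a caller inside a transaction
needs to avoid; in the propagating variant it may not.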