All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 09/19] XFS: ensure xfs_file_*_read cannot deadlock in memory allocation.
Date: Wed, 16 Apr 2014 16:27:45 +1000	[thread overview]
Message-ID: <20140416162745.67442b07@notabene.brown> (raw)
In-Reply-To: <20140416060459.GE15995@dastard>

[-- Attachment #1: Type: text/plain, Size: 2819 bytes --]

On Wed, 16 Apr 2014 16:04:59 +1000 Dave Chinner <david@fromorbit.com> wrote:

> On Wed, Apr 16, 2014 at 02:03:36PM +1000, NeilBrown wrote:
> > xfs_file_*_read holds an inode lock while calling a generic 'read'
> > function.  These functions perform read-ahead and are quite likely to
> > allocate memory.
> 
> Yes, that's what reading data from disk requires.
> 
> > So set PF_FSTRANS to ensure they avoid __GFP_FS and so don't recurse
> > into a filesystem to free memory.
> 
> We already have that protection via the
> > 
> > This can be a problem with loop-back NFS mounts, if free_pages ends up
> > wating in nfs_release_page(), and nfsd is blocked waiting for the lock
> > that this code holds.
> > 
> > This was found both by lockdep and as a real deadlock during testing.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/xfs/xfs_file.c |   12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 64b48eade91d..88b33ef64668 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -243,6 +243,7 @@ xfs_file_aio_read(
> >  	ssize_t			ret = 0;
> >  	int			ioflags = 0;
> >  	xfs_fsize_t		n;
> > +	unsigned int		pflags;
> >  
> >  	XFS_STATS_INC(xs_read_calls);
> >  
> > @@ -290,6 +291,10 @@ xfs_file_aio_read(
> >  	 * proceeed concurrently without serialisation.
> >  	 */
> >  	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> > +	/* As we hold a lock, we must ensure that any allocation
> > +	 * in generic_file_aio_read avoid __GFP_FS
> > +	 */
> > +	current_set_flags_nested(&pflags, PF_FSTRANS);
> 
> Ugh. No. This is Simply Wrong.
> 
> We handle the memory allocations in the IO path with
> GFP_NOFS/KM_NOFS where necessary.
> 
> We also do this when setting up regular file inodes in
> xfs_setup_inode():
> 
>         /*
>          * Ensure all page cache allocations are done from GFP_NOFS context to
>          * prevent direct reclaim recursion back into the filesystem and blowing
>          * stacks or deadlocking.
>          */
>         gfp_mask = mapping_gfp_mask(inode->i_mapping);
>         mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
> 
> Which handles all of the mapping allocations that occur within the
> page cache read/write paths.
> 
> Remember, you removed the KM_NOFS code from the XFS allocator that
> caused it to clear __GFP_FS in an earlier patch - the read Io path
> is one of the things you broke by doing that....
> 
> If there are places where we don't use GFP_NOFS context allocations
> that we should, then we need to fix them individually....
> 
> Cheers,
> 
> Dave.

Thanks Dave.  Having fixed the KM_NOFS error I'll start retesting with all
the other xfs patches removed.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: NeilBrown <neilb@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 09/19] XFS: ensure xfs_file_*_read cannot deadlock in memory allocation.
Date: Wed, 16 Apr 2014 16:27:45 +1000	[thread overview]
Message-ID: <20140416162745.67442b07@notabene.brown> (raw)
In-Reply-To: <20140416060459.GE15995@dastard>


[-- Attachment #1.1: Type: text/plain, Size: 2819 bytes --]

On Wed, 16 Apr 2014 16:04:59 +1000 Dave Chinner <david@fromorbit.com> wrote:

> On Wed, Apr 16, 2014 at 02:03:36PM +1000, NeilBrown wrote:
> > xfs_file_*_read holds an inode lock while calling a generic 'read'
> > function.  These functions perform read-ahead and are quite likely to
> > allocate memory.
> 
> Yes, that's what reading data from disk requires.
> 
> > So set PF_FSTRANS to ensure they avoid __GFP_FS and so don't recurse
> > into a filesystem to free memory.
> 
> We already have that protection via the
> > 
> > This can be a problem with loop-back NFS mounts, if free_pages ends up
> > wating in nfs_release_page(), and nfsd is blocked waiting for the lock
> > that this code holds.
> > 
> > This was found both by lockdep and as a real deadlock during testing.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/xfs/xfs_file.c |   12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 64b48eade91d..88b33ef64668 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -243,6 +243,7 @@ xfs_file_aio_read(
> >  	ssize_t			ret = 0;
> >  	int			ioflags = 0;
> >  	xfs_fsize_t		n;
> > +	unsigned int		pflags;
> >  
> >  	XFS_STATS_INC(xs_read_calls);
> >  
> > @@ -290,6 +291,10 @@ xfs_file_aio_read(
> >  	 * proceeed concurrently without serialisation.
> >  	 */
> >  	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> > +	/* As we hold a lock, we must ensure that any allocation
> > +	 * in generic_file_aio_read avoid __GFP_FS
> > +	 */
> > +	current_set_flags_nested(&pflags, PF_FSTRANS);
> 
> Ugh. No. This is Simply Wrong.
> 
> We handle the memory allocations in the IO path with
> GFP_NOFS/KM_NOFS where necessary.
> 
> We also do this when setting up regular file inodes in
> xfs_setup_inode():
> 
>         /*
>          * Ensure all page cache allocations are done from GFP_NOFS context to
>          * prevent direct reclaim recursion back into the filesystem and blowing
>          * stacks or deadlocking.
>          */
>         gfp_mask = mapping_gfp_mask(inode->i_mapping);
>         mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
> 
> Which handles all of the mapping allocations that occur within the
> page cache read/write paths.
> 
> Remember, you removed the KM_NOFS code from the XFS allocator that
> caused it to clear __GFP_FS in an earlier patch - the read Io path
> is one of the things you broke by doing that....
> 
> If there are places where we don't use GFP_NOFS context allocations
> that we should, then we need to fix them individually....
> 
> Cheers,
> 
> Dave.

Thanks Dave.  Having fixed the KM_NOFS error I'll start retesting with all
the other xfs patches removed.

NeilBrown

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2014-04-16  6:27 UTC|newest]

Thread overview: 151+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-16  4:03 [PATCH/RFC 00/19] Support loop-back NFS mounts NeilBrown
2014-04-16  4:03 ` NeilBrown
2014-04-16  4:03 ` NeilBrown
2014-04-16  4:03 ` NeilBrown
2014-04-16  4:03 ` [PATCH 04/19] Make effect of PF_FSTRANS to disable __GFP_FS universal NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  5:37   ` Dave Chinner
2014-04-16  5:37     ` Dave Chinner
2014-04-16  5:37     ` Dave Chinner
2014-04-16  6:17     ` NeilBrown
2014-04-16  6:17       ` NeilBrown
2014-04-17  1:03       ` NeilBrown
2014-04-17  1:03         ` NeilBrown
2014-04-17  4:41         ` Dave Chinner
2014-04-17  4:41           ` Dave Chinner
2014-04-17  4:41           ` Dave Chinner
2014-04-16  4:03 ` [PATCH 13/19] MM: set PF_FSTRANS while allocating per-cpu memory to avoid deadlock NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  5:49   ` Dave Chinner
2014-04-16  5:49     ` Dave Chinner
2014-04-16  5:49     ` Dave Chinner
2014-04-16  6:22     ` NeilBrown
2014-04-16  6:22       ` NeilBrown
2014-04-16  6:30       ` Dave Chinner
2014-04-16  6:30         ` Dave Chinner
2014-04-16  6:30         ` Dave Chinner
2014-04-16  4:03 ` [PATCH 06/19] nfsd: set PF_FSTRANS for nfsd threads NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  7:28   ` Peter Zijlstra
2014-04-16  7:28     ` Peter Zijlstra
2014-04-16  7:28     ` Peter Zijlstra
2014-04-16  4:03 ` [PATCH 01/19] Promote current_{set, restore}_flags_nested from xfs to global NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 08/19] Set PF_FSTRANS while write_cache_pages calls ->writepage NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 10/19] NET: set PF_FSTRANS while holding sk_lock NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  5:13   ` Eric Dumazet
2014-04-16  5:13     ` Eric Dumazet
2014-04-16  5:13     ` Eric Dumazet
2014-04-16  5:13     ` Eric Dumazet
2014-04-16  5:47     ` NeilBrown
2014-04-16  5:47       ` NeilBrown
2014-04-16  5:47       ` NeilBrown
2014-04-16 13:00     ` David Miller
2014-04-16 13:00       ` David Miller
2014-04-16 13:00       ` David Miller
2014-04-17  2:38       ` NeilBrown
2014-04-17  2:38         ` NeilBrown
2014-04-16  4:03 ` [PATCH 12/19] NET: set PF_FSTRANS while holding rtnl_lock NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 09/19] XFS: ensure xfs_file_*_read cannot deadlock in memory allocation NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  6:04   ` Dave Chinner
2014-04-16  6:04     ` Dave Chinner
2014-04-16  6:04     ` Dave Chinner
2014-04-16  6:27     ` NeilBrown [this message]
2014-04-16  6:27       ` NeilBrown
2014-04-16  6:31     ` Dave Chinner
2014-04-16  6:31       ` Dave Chinner
2014-04-16  6:31       ` Dave Chinner
2014-04-16  4:03 ` [PATCH 07/19] nfsd and VM: use PF_LESS_THROTTLE to avoid throttle in shrink_inactive_list NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 11/19] FS: set PF_FSTRANS while holding mmap_sem in exec.c NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 03/19] lockdep: improve scenario messages for RECLAIM_FS errors NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  7:22   ` Peter Zijlstra
2014-04-16  7:22     ` Peter Zijlstra
2014-04-16  7:22     ` Peter Zijlstra
2014-04-16  4:03 ` [PATCH 02/19] lockdep: lockdep_set_current_reclaim_state should save old value NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 14/19] driver core: set PF_FSTRANS while holding gdp_mutex NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 05/19] SUNRPC: track whether a request is coming from a loop-back interface NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16 14:47   ` Jeff Layton
2014-04-16 14:47     ` Jeff Layton
2014-04-16 14:47     ` Jeff Layton
2014-04-16 23:25     ` NeilBrown
2014-04-16 23:25       ` NeilBrown
2014-04-16  4:03 ` [PATCH 19/19] XFS: set PF_FSTRANS while ilock is held in xfs_free_eofblocks NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  6:18   ` Dave Chinner
2014-04-16  6:18     ` Dave Chinner
2014-04-16  6:18     ` Dave Chinner
2014-04-16  4:03 ` [PATCH 18/19] nfsd: set PF_FSTRANS during nfsd4_do_callback_rpc NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 15/19] nfsd: set PF_FSTRANS when client_mutex is held NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03 ` [PATCH 17/19] VFS: set PF_FSTRANS while namespace_sem " NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:46   ` Al Viro
2014-04-16  4:46     ` Al Viro
2014-04-16  5:52     ` NeilBrown
2014-04-16  5:52       ` NeilBrown
2014-04-16 16:37       ` Al Viro
2014-04-16 16:37         ` Al Viro
2014-04-16 16:37         ` Al Viro
2014-04-16  4:03 ` [PATCH 16/19] VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  4:03   ` NeilBrown
2014-04-16  6:25   ` Dave Chinner
2014-04-16  6:25     ` Dave Chinner
2014-04-16  6:25     ` Dave Chinner
2014-04-16  6:49     ` NeilBrown
2014-04-16  6:49       ` NeilBrown
2014-04-16  9:00       ` Dave Chinner
2014-04-16  9:00         ` Dave Chinner
2014-04-16  9:00         ` Dave Chinner
2014-04-17  0:51         ` NeilBrown
2014-04-17  0:51           ` NeilBrown
2014-04-17  5:58           ` Dave Chinner
2014-04-17  5:58             ` Dave Chinner
2014-04-17  5:58             ` Dave Chinner
2014-04-16 14:42 ` [PATCH/RFC 00/19] Support loop-back NFS mounts Jeff Layton
2014-04-16 14:42   ` Jeff Layton
2014-04-16 14:42   ` Jeff Layton
2014-04-17  0:20   ` NeilBrown
2014-04-17  0:20     ` NeilBrown
2014-04-17  0:20     ` NeilBrown
2014-04-17  1:27     ` Dave Chinner
2014-04-17  1:27       ` Dave Chinner
2014-04-17  1:27       ` Dave Chinner
2014-04-17  1:50       ` NeilBrown
2014-04-17  1:50         ` NeilBrown
2014-04-17  1:50         ` NeilBrown
2014-04-17  4:23         ` Dave Chinner
2014-04-17  4:23           ` Dave Chinner
2014-04-17  4:23           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140416162745.67442b07@notabene.brown \
    --to=neilb@suse.de \
    --cc=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.