linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
	netdev@vger.kernel.org
Subject: Re: [PATCH 10/19] NET: set PF_FSTRANS while holding sk_lock
Date: Wed, 16 Apr 2014 15:47:46 +1000	[thread overview]
Message-ID: <20140416154746.7dbb4485@notabene.brown> (raw)
In-Reply-To: <1397625226.4222.113.camel@edumazet-glaptop2.roam.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 1892 bytes --]

On Tue, 15 Apr 2014 22:13:46 -0700 Eric Dumazet <eric.dumazet@gmail.com>
wrote:

> On Wed, 2014-04-16 at 14:03 +1000, NeilBrown wrote:
> > sk_lock can be taken while reclaiming memory (in nfsd for loop-back
> > NFS mounts, and presumably in nfs), and memory can be allocated while
> > holding sk_lock, at least via:
> > 
> >  inet_listen -> inet_csk_listen_start ->reqsk_queue_alloc
> > 
> > So to avoid deadlocks, always set PF_FSTRANS while holding sk_lock.
> > 
> > This deadlock was found by lockdep.
> 
> Wow, this is adding expensive stuff in fast path, only for nfsd :(

Yes, this was probably one part that I was least comfortable about.

> 
> BTW, why should the current->flags should be saved on a socket field,
> and not a current->save_flags. This really looks a thread property, not
> a socket one.
> 
> Why nfsd could not have PF_FSTRANS in its current->flags ?

nfsd does have PF_FSTRANS set in current->flags.  But some other processes
might not.

If any process takes sk_lock, allocates memory, and then blocks in reclaim it
could be waiting for nfsd.  If nfsd waits for that sk_lock, it would cause a
deadlock.

Thinking a bit more carefully .... I suspect that any socket that nfsd
created would only ever be locked by nfsd.  If that is the case then the
problem can be resolved entirely within nfsd.  We would need to tell lockdep
that there are two sorts of sk_locks, those which nfsd uses and all the
rest.  That might get a little messy, but wouldn't impact performance.

Is it justified to assume that sockets created by nfsd threads would only
ever be locked by nfsd threads (and interrupts, which won't be allocating
memory so don't matter), or might there be locked by other threads - e.g for
'netstat -a' etc??


> 
> For applications handling millions of sockets, this makes a difference.
> 

Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

  reply	other threads:[~2014-04-16  5:47 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-16  4:03 [PATCH/RFC 00/19] Support loop-back NFS mounts NeilBrown
2014-04-16  4:03 ` [PATCH 01/19] Promote current_{set, restore}_flags_nested from xfs to global NeilBrown
2014-04-16  4:03 ` [PATCH 03/19] lockdep: improve scenario messages for RECLAIM_FS errors NeilBrown
2014-04-16  7:22   ` Peter Zijlstra
2014-04-16  4:03 ` [PATCH 13/19] MM: set PF_FSTRANS while allocating per-cpu memory to avoid deadlock NeilBrown
2014-04-16  5:49   ` Dave Chinner
2014-04-16  6:22     ` NeilBrown
2014-04-16  6:30       ` Dave Chinner
2014-04-16  4:03 ` [PATCH 05/19] SUNRPC: track whether a request is coming from a loop-back interface NeilBrown
2014-04-16 14:47   ` Jeff Layton
2014-04-16 23:25     ` NeilBrown
2014-04-16  4:03 ` [PATCH 06/19] nfsd: set PF_FSTRANS for nfsd threads NeilBrown
2014-04-16  7:28   ` Peter Zijlstra
2014-04-16  4:03 ` [PATCH 11/19] FS: set PF_FSTRANS while holding mmap_sem in exec.c NeilBrown
2014-04-16  4:03 ` [PATCH 02/19] lockdep: lockdep_set_current_reclaim_state should save old value NeilBrown
2014-04-16  4:03 ` [PATCH 12/19] NET: set PF_FSTRANS while holding rtnl_lock NeilBrown
2014-04-16  4:03 ` [PATCH 07/19] nfsd and VM: use PF_LESS_THROTTLE to avoid throttle in shrink_inactive_list NeilBrown
2014-04-16  4:03 ` [PATCH 04/19] Make effect of PF_FSTRANS to disable __GFP_FS universal NeilBrown
2014-04-16  5:37   ` Dave Chinner
2014-04-16  6:17     ` NeilBrown
2014-04-17  1:03       ` NeilBrown
2014-04-17  4:41         ` Dave Chinner
2014-04-16  4:03 ` [PATCH 14/19] driver core: set PF_FSTRANS while holding gdp_mutex NeilBrown
2014-04-16  4:03 ` [PATCH 08/19] Set PF_FSTRANS while write_cache_pages calls ->writepage NeilBrown
2014-04-16  4:03 ` [PATCH 10/19] NET: set PF_FSTRANS while holding sk_lock NeilBrown
2014-04-16  5:13   ` Eric Dumazet
2014-04-16  5:47     ` NeilBrown [this message]
2014-04-16 13:00     ` David Miller
2014-04-17  2:38       ` NeilBrown
2014-04-16  4:03 ` [PATCH 09/19] XFS: ensure xfs_file_*_read cannot deadlock in memory allocation NeilBrown
2014-04-16  6:04   ` Dave Chinner
2014-04-16  6:27     ` NeilBrown
2014-04-16  6:31     ` Dave Chinner
2014-04-16  4:03 ` [PATCH 17/19] VFS: set PF_FSTRANS while namespace_sem is held NeilBrown
2014-04-16  4:46   ` Al Viro
2014-04-16  5:52     ` NeilBrown
     [not found]       ` <20140416155230.4d02e4b9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2014-04-16 16:37         ` Al Viro
2014-04-16  4:03 ` [PATCH 18/19] nfsd: set PF_FSTRANS during nfsd4_do_callback_rpc NeilBrown
2014-04-16  4:03 ` [PATCH 16/19] VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc NeilBrown
2014-04-16  6:25   ` Dave Chinner
2014-04-16  6:49     ` NeilBrown
2014-04-16  9:00       ` Dave Chinner
2014-04-17  0:51         ` NeilBrown
2014-04-17  5:58           ` Dave Chinner
2014-04-16  4:03 ` [PATCH 15/19] nfsd: set PF_FSTRANS when client_mutex is held NeilBrown
2014-04-16  4:03 ` [PATCH 19/19] XFS: set PF_FSTRANS while ilock is held in xfs_free_eofblocks NeilBrown
2014-04-16  6:18   ` Dave Chinner
2014-04-16 14:42 ` [PATCH/RFC 00/19] Support loop-back NFS mounts Jeff Layton
2014-04-17  0:20   ` NeilBrown
2014-04-17  1:27     ` Dave Chinner
2014-04-17  1:50       ` NeilBrown
2014-04-17  4:23         ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140416154746.7dbb4485@notabene.brown \
    --to=neilb@suse.de \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).