linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
To: Eric Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: [RFC] alloc_pid() breakage
Date: Tue, 3 Dec 2013 02:11:03 +0000	[thread overview]
Message-ID: <20131203021103.GH10323@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20131201181329.GC10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>

On Sun, Dec 01, 2013 at 06:13:29PM +0000, Al Viro wrote:
 
> AFAICS, pid_ns gets internal procfs instance and it pins the sucker down.
> Which would cause exact same problems, obviously.  The trick done there
> is more or less to introduce a "being shut down" state of pid_ns - from
> the moment when we don't have any pids in it to actual destruction.
> Entering that state schedules (yes, it is async and yes, it is ugly)
> dropping the internal procfs vfsmount.
> 
> Additional headache, AFAICS, comes from /proc/self/ns/pid - it can be
> opened, passed to somebody in ancestor pidns and then fed by it to
> setns(2).  After that fork() by that somebody will trigger alloc_pid() in
> that pid_ns.  What happens if it comes just before the (already scheduled)
> pid_ns_release_proc()?  AFAICS, nothing good - there's no protection
> against leaks, access to freed vfsmount, double-mntput, etc.  Eric, am
> I missing something subtle and relevant in that code?

Egads...  I think I see what's going on, but it's convoluted as hell -
you rely on 1 not getting returned more than once by alloc_pidmap(), even
after having been freed, so this
        if (unlikely(is_child_reaper(pid))) {
                if (pid_ns_prepare_proc(ns))
                        goto out_free;
        }
is essentially "on the first call of alloc_pid() for given pidns".  And
upper bit in ->nr_hashed acts as "it's not in rundown state".

OK, so... what happens if I do unshare(CLONE_NEWPID) and the first fork()
attempt fails (e.g. due to failure to allocate a map page when allocating
a number in parent pidns, or OOM-induced failure to mount procfs, whatever).
Sure, that fork() has failed.  No pid had been allocated, thus no free_pid()
calls made.  After a while the memory becomes less tight and the same process
tries to fork() again.  What happens then?  pidns with processes in it,
but no reaper and NULL ->proc_mnt?  sysctl(2) called in it won't be happy;
neither will exit(2), actually, since it'll hit proc_flush_task_mnt() and
oops on trying to evaluate ->proc_mnt->mnt_root...

Another question: can free_pid() end up scheduling ->proc_work for anything
other than the last level?  After all, reaper in parent pidns couldn't have
gotten through the zap_pid_ns_process() yet, let alone getting to its
free_pid(), right?
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2013-12-03  2:11 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-01 13:14 [PATCH 00/11] [RFC] repair net namespace damage to rpc_pipefs Christoph Hellwig
2013-12-01 13:14 ` [PATCH 01/11] sunrpc: allocate pipefs inodes using kmalloc Christoph Hellwig
     [not found]   ` <20131201132009.815675002-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org>
2013-12-01 14:26     ` Jeff Layton
2013-12-01 13:14 ` [PATCH 02/11] rpc_pipefs: always mount on net namespace initialization Christoph Hellwig
2013-12-01 15:24   ` Jeff Layton
2013-12-01 13:14 ` [PATCH 03/11] sunrpc: remove the rpc_clients_block notifier Christoph Hellwig
2013-12-01 15:25   ` Jeff Layton
2013-12-01 13:14 ` [PATCH 04/11] sunrpc: no need to have a lock or superblock for rpc_unlink Christoph Hellwig
2013-12-01 13:14 ` [PATCH 05/11] sunprc: add sensible pipe creation and removal helpers Christoph Hellwig
2013-12-01 13:14 ` [PATCH 06/11] sunrpc: clean up rpc_pipefs client dir creation Christoph Hellwig
2013-12-01 13:14 ` [PATCH 07/11] sunrpc: make rpc_mkdir_populate net-namespace aware Christoph Hellwig
2013-12-01 13:14 ` [PATCH 08/11] sunrpc: rpc_get_sb_net, die, die, die Christoph Hellwig
2013-12-01 13:14 ` [PATCH 09/11] nfs: convert idmapper to rpc_mkpipe_clnt Christoph Hellwig
2013-12-01 13:14 ` [PATCH 10/11] auth_gss: convert " Christoph Hellwig
2013-12-01 13:14 ` [PATCH 11/11] sunrpc: rpc_pipefs cleanup Christoph Hellwig
     [not found] ` <20131201131441.790963326-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org>
2013-12-01 14:36   ` [PATCH 00/11] [RFC] repair net namespace damage to rpc_pipefs Jeff Layton
2013-12-01 15:44   ` Jeff Layton
2013-12-01 15:57     ` Jeff Layton
2013-12-01 18:13   ` Al Viro
     [not found]     ` <20131201181329.GC10323-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2013-12-02  8:12       ` Christoph Hellwig
2013-12-02 13:44         ` Trond Myklebust
2013-12-02 14:24           ` Stanislav Kinsbursky
     [not found]             ` <529C982B.2030906-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-12-02 15:36               ` Christoph Hellwig
2013-12-02 15:58               ` Trond Myklebust
     [not found]                 ` <5872AE74-DF03-42E2-A21B-F8E35634CD1D-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2013-12-03  7:24                   ` Stanislav Kinsbursky
     [not found]           ` <3C65EB4C-6592-44F8-B08D-E5A9EFD6C8C6-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2013-12-02 15:34             ` Christoph Hellwig
     [not found]               ` <20131202153435.GA2804-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2013-12-02 16:00                 ` Trond Myklebust
     [not found]                   ` <BCED1826-0743-4420-AD45-B38D67EF3D99-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2013-12-02 16:27                     ` Christoph Hellwig
     [not found]                       ` <20131202162716.GA17417-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2013-12-02 16:46                         ` Trond Myklebust
2013-12-02 16:33                   ` Jeff Layton
2013-12-02 16:37                     ` J. Bruce Fields
2013-12-02 16:45                       ` Jeff Layton
2013-12-02 15:57             ` Al Viro
2013-12-02 16:04               ` Trond Myklebust
2013-12-03  2:11       ` Al Viro [this message]
2013-12-02  7:23   ` Stanislav Kinsbursky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131203021103.GH10323@ZenIV.linux.org.uk \
    --to=viro-3bdd1+5odreifsdqtta3olvcufugdwfn@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    --cc=hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).