From: Al Viro <viro@ZenIV.linux.org.uk>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Rainer Weikusat <rweikusat@mobileactivedefense.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	netdev <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	syzkaller <syzkaller@googlegroups.com>
Subject: Re: fs, net: deadlock between bind/splice on af_unix
Date: Fri, 9 Dec 2016 01:32:08 +0000
Message-ID: <20161209013208.GW1555@ZenIV.linux.org.uk>
In-Reply-To: <CAM_iQpXu+fyjmvrYRB9+VJCdSLS=7Jiet762hqWDANfsOM0XWw@mail.gmail.com>

On Thu, Dec 08, 2016 at 04:08:27PM -0800, Cong Wang wrote:
> On Thu, Dec 8, 2016 at 8:30 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Chain exists of:
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(sb_writers#5);
> >                                lock(&u->bindlock);
> >                                lock(sb_writers#5);
> >   lock(&pipe->mutex/1);
> 
> This looks false positive, probably just needs lockdep_set_class()
> to set keys for pipe->mutex and unix->bindlock.

I'm afraid that it's not a false positive at all.

Preparations:
	* create an AF_UNIX socket.
	* set SOCK_PASSCRED on it (from userland: setsockopt() with SO_PASSCRED).
	* create a pipe.

Child 1: splice from pipe to socket; locks pipe and proceeds down towards
unix_dgram_sendmsg().

Child 2: splice from pipe to /mnt/foo/bar; requests write access to /mnt
and blocks on attempt to lock the pipe already locked by (1).

Child 3: freeze /mnt; blocks until (2) is done.

Child 4: bind() the socket to /mnt/barf; grabs ->bindlock on the socket and
proceeds to create /mnt/barf, which blocks due to fairness of freezer (no
extra write accesses to something that is in process of being frozen).

_Now_ (1) gets around to unix_dgram_sendmsg().  We still have NULL u->addr,
since bind() has not gotten through yet.  We also have SOCK_PASSCRED set,
so we attempt autobind; it blocks on the ->bindlock, which won't be
released until bind() is done (at which point we'll see non-NULL u->addr
and bugger off from autobind), but bind() won't succeed until /mnt
goes through the freeze-thaw cycle, which won't happen until (2) finishes,
which won't happen until (1) unlocks the pipe.  Deadlock.

Granted, ->bindlock is taken interruptibly, so it's not that much of
a problem (you can kill the damn thing), but you would need to intervene
and kill it.
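
The chain of waits can be written down as a wait-for graph and checked for a
cycle.  What follows is a toy userland model, not kernel code; the CHILDn
labels and in_cycle() are made up for illustration, and each edge is simply
one "blocks on" statement from the description above.

```c
#include <stdbool.h>

/* Hypothetical labels for the four children; not kernel identifiers. */
enum { CHILD1, CHILD2, CHILD3, CHILD4, NTASKS };
#define NOBODY (-1)

/* waits_for[t]: the task that t is stuck behind, or NOBODY if t can run. */
static const int waits_for[NTASKS] = {
	[CHILD1] = CHILD4,	/* autobind wants ->bindlock, held by bind() */
	[CHILD2] = CHILD1,	/* wants the pipe lock, held by (1) */
	[CHILD3] = CHILD2,	/* freeze waits for (2) to finish */
	[CHILD4] = CHILD3,	/* new write access waits for freeze-thaw */
};

/*
 * Each task waits on at most one other task, so walking the edges from
 * 'start' either hits NOBODY, or (if 'start' sits on a cycle) comes back
 * to 'start' within ntasks steps.
 */
static bool in_cycle(const int *graph, int ntasks, int start)
{
	int t = start;

	for (int steps = 0; steps < ntasks; steps++) {
		t = graph[t];
		if (t == NOBODY)
			return false;
		if (t == start)
			return true;
	}
	return false;
}
```

With the table as given, in_cycle() is true for every child.  Copying the
table and setting the CHILD1 entry to NOBODY models killing (1) while it
sleeps on ->bindlock; after that none of them is on a cycle anymore.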

Why do we do autobind there, anyway, and why is it conditional on
SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
to sending stuff without autobind ever being done - just use socketpair()
to create that sucker and we won't be going through the connect()
at all.
