public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: "gregkh\@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: Jan Glauber <Jan.Glauber@cavium.com>,
	Will Deacon <will.deacon@arm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jslaby\@suse.com" <jslaby@suse.com>
Subject: Re: dcache_readdir NULL inode oops
Date: Fri, 30 Nov 2018 09:16:49 -0600	[thread overview]
Message-ID: <875zwe389q.fsf@xmission.com> (raw)
In-Reply-To: <20181130104154.GA11991@kroah.com> (gregkh@linuxfoundation.org's message of "Fri, 30 Nov 2018 11:41:54 +0100")

"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org> writes:

> Adding Eric as he touched this code last :)
>
> On Thu, Nov 29, 2018 at 07:25:48PM +0000, Jan Glauber wrote:
>> On Wed, Nov 28, 2018 at 08:08:06PM +0000, Will Deacon wrote:
>> > I spent some more time looking at this today...
>> > 
>> > On Fri, Nov 23, 2018 at 06:05:25PM +0000, Will Deacon wrote:
>> > > Doing some more debugging, it looks like the usual failure case is where
>> > > one CPU clears the inode field in the dentry via:
>> > >
>> > >       devpts_pty_kill()
>> > >               -> d_delete()   // dentry->d_lockref.count == 1
>> > >                       -> dentry_unlink_inode()
>> > >
>> > > whilst another CPU gets a pointer to the dentry via:
>> > >
>> > >       sys_getdents64()
>> > >               -> iterate_dir()
>> > >                       -> dcache_readdir()
>> > >                               -> next_positive()
>> > >
>> > > and explodes on the subsequent inode dereference when trying to pass the
>> > > inode number to dir_emit():
>> > >
>> > >       if (!dir_emit(..., d_inode(next)->i_ino, ...))
>> > >
>> > > Indeed, the hack below triggers a warning, indicating that the inode
>> > > is being cleared concurrently.
>> > >
>> > > I can't work out whether the getdents64() path should hold a refcount
>> > > to stop d_delete() in its tracks, or whether devpts_pty_kill() shouldn't
>> > > be calling d_delete() like this at all.
>> > 
>> > So the issue is that opening /dev/pts/ptmx creates a new pty in /dev/pts,
>> > which disappears when you close /dev/pts/ptmx. Consequently, when we tear
>> > down the dentry for the magic new file, we have to take the i_node rwsem of
>> > the *parent* so that concurrent path walkers don't trip over it whilst its
>> > being freed. I wrote a simple concurrent program to getdents(/dev/pts/) in
>> > one thread, whilst another opens and closes /dev/pts/ptmx: it crashes the
>> > kernel in seconds.
>> 
>> I also made a testcase and verified that your fix is fine. I also tried
>> replacing open-close on /dev/ptmx with mkdir-rmdir but that does not
>> trigger the error.
>> 
>> > Patch below, but I'd still like somebody else to look at this, please.
>> 
>> I wonder why no inode_lock on parent is needed for devpts_pty_new(), but
>> I'm obviously not a VFS expert... So your patch looks good to me and
>> clearly solves the issue.
>> 
>> thanks,
>> Jan
>> 
>> > Will
>> > 
>> > --->8
>> > 
>> > diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
>> > index c53814539070..50ddb95ff84c 100644
>> > --- a/fs/devpts/inode.c
>> > +++ b/fs/devpts/inode.c
>> > @@ -619,11 +619,17 @@ void *devpts_get_priv(struct dentry *dentry)
>> >   */
>> >  void devpts_pty_kill(struct dentry *dentry)
>> >  {
>> > -       WARN_ON_ONCE(dentry->d_sb->s_magic != DEVPTS_SUPER_MAGIC);
>> > +       struct super_block *sb = dentry->d_sb;
>> > +       struct dentry *parent = sb->s_root;
>> > 
>> > +       WARN_ON_ONCE(sb->s_magic != DEVPTS_SUPER_MAGIC);
>
> Side note, I wonder if this is even needed anymore...
>
>> > +
>> > +       inode_lock(parent->d_inode);
>> >         dentry->d_fsdata = NULL;
>> >         drop_nlink(dentry->d_inode);
>> >         d_delete(dentry);
>> > +       inode_unlock(parent->d_inode);
>> > +
>> >         dput(dentry);   /* d_alloc_name() in devpts_pty_new() */
>> >  }
>
> This feels right but getting some feedback from others would be good.

This is going to be special at least because we are not coming through
the normal unlink path and we are manipulating the dcache.

This looks plausible.  If this is whats going on then we have had this
bug for a very long time.  I will see if I can make some time.

It looks like in the general case everything is serialized by the
devpts_mutex.  I wonder if just changing the order of operations
here would be enough.

AKA: drop_nlink d_delete then dentry->d_fsdata.  Ugh d_fsdata is not
implicated so that won't help here.

I will look more shortly.  I am in the middle in the move so I don't
have time to complete the analysis today.

Eric


  reply	other threads:[~2018-11-30 15:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-09 14:37 dcache_readdir NULL inode oops Jan Glauber
2018-11-09 15:58 ` Will Deacon
2018-11-10 11:17   ` Jan Glauber
2018-11-20 18:28     ` Will Deacon
2018-11-20 19:03       ` Will Deacon
2018-11-21 13:19         ` Jan Glauber
2018-11-23 18:05           ` Will Deacon
2018-11-28 20:08             ` Will Deacon
2018-11-29 19:25               ` Jan Glauber
2018-11-30 10:41                 ` gregkh
2018-11-30 15:16                   ` Eric W. Biederman [this message]
2018-11-30 16:08                     ` Al Viro
2018-11-30 16:32                       ` Will Deacon
2019-04-30  9:32                         ` Jan Glauber

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875zwe389q.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=Jan.Glauber@cavium.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jslaby@suse.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox