From: Al Viro <viro@ZenIV.linux.org.uk>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ian Kent" <raven@themaw.net>, "Mickaël Salaün" <mic@digikod.net>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
syzkaller <syzkaller@googlegroups.com>,
"Kostya Serebryany" <kcc@google.com>,
"Alexander Potapenko" <glider@google.com>,
"Sasha Levin" <sasha.levin@oracle.com>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
"David Howells" <dhowells@redhat.com>
Subject: Re: fs: NULL deref in atime_needs_update
Date: Mon, 29 Feb 2016 16:11:44 +0000 [thread overview]
Message-ID: <20160229161144.GW17997@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CACT4Y+Z=SHjUNXpm7dahCgbVX=tgC6oitY_9mmXK7FJEYK275Q@mail.gmail.com>
On Mon, Feb 29, 2016 at 01:34:13PM +0100, Dmitry Vyukov wrote:
> It's not that I really understand what happens here, but looking at
> the diff: is it the case that negative and inode can change under our
> feet? If so, we still probably can get an inconsistent picture (i.e.
> negative dentry but not NULL inode), can it be an issue? Is
> non-negative->negative->non-negative->negative transition possible? If
> so, we still probably can get the same crash regardless of order of
> negative/inode loads.
Yes, we can - relying on the ordering is brittle and wrong. See other
posting for possible solution; at least that one has much more understandable
rules:
* ->d_seq is bumped before and after modifications of ->d_inode and
->d_flags, which both provides the barriers and (which is what matters for
x86) guarantees that ->d_seq match on recheck (which we do anyway) means
that ->d_inode and ->d_flags match each other.
* RCU users of that part of ->d_flags should be verified by ->d_seq
check (we already are doing that - should_follow_link() didn't and that was
one of the bugs that got fixed).
* non-RCU users either have the parent locked (which stabilizes
everything) or have dentry pinned and positive (ditto). Checking that
dentry is negative (either by looking at inode or flags) does not guarantee
that it will stay such unless the parent is locked anyway. IOW, the
games with barriers and order of assignments between ->d_inode and ->d_flags
do not actually buy us anything useful.
* in case of __dentry_kill() we do *NOT* surround the stores to
->d_inode/->d_flags with ->d_seq bumps, but that's safe since by that point
we had already done __d_drop(), so RCU reader either doesn't find the
dentry in the first place, or gets ->d_seq bumped (by 2) between the
moment it's been fetched by __d_lookup_rcu() finding the sucker and
the moment when ->d_inode/->d_flags get changed. If RCU reader gets in
before that, it sees consistent ->d_inode/->d_flags, as they used to be.
It will eventually fail to grab a reference to that dentry, but that's not
our problem. If it gets in late enough to see ->d_inode and/or ->d_flags
changed, it will fail ->d_seq check and ignore the values it has fetched.
Again, no need for a barrier between ->d_inode and ->d_flags stores in
that case.
As the matter of fact, I'm somewhat tempted to make
static void dentry_unlink_inode(struct dentry * dentry)
__releases(dentry->d_lock)
__releases(dentry->d_inode->i_lock)
{
struct inode *inode = dentry->d_inode;
bool hashed = !d_unhashed(dentry);
if (hashed)
raw_write_seqcount_begin(&dentry->d_seq);
__d_clear_type_and_inode(dentry);
hlist_del_init(&dentry->d_u.d_alias);
if (hashed)
raw_write_seqcount_end(&dentry->d_seq);
spin_unlock(&dentry->d_lock);
spin_unlock(&inode->i_lock);
if (!inode->i_nlink)
fsnotify_inoderemove(inode);
if (dentry->d_op && dentry->d_op->d_iput)
dentry->d_op->d_iput(dentry, inode);
else
iput(inode);
}
and replace dentry_iput() in its only caller with
if (dentry->d_inode)
dentry_unlink_inode(dentry); /* will drop ->d_lock */
else
spin_unlock(&dentry->d_lock);
That would get rid of annoying code duplication, but I would like to see
profiles - the cost of branches might very well get unpleasant. Not sure,
and that part definitely isn't a -stable fodder.
next prev parent reply other threads:[~2016-02-29 16:11 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-05 21:11 fs: NULL deref in atime_needs_update Dmitry Vyukov
2016-02-16 23:40 ` Mickaël Salaün
2016-02-19 19:32 ` Dmitry Vyukov
2016-02-20 3:21 ` Al Viro
2016-02-20 3:54 ` Al Viro
2016-02-20 3:54 ` Al Viro
2016-02-20 13:25 ` Mickaël Salaün
2016-02-20 17:10 ` Al Viro
2016-02-20 17:10 ` Al Viro
2016-02-20 20:26 ` Mickaël Salaün
2016-02-20 20:50 ` Al Viro
2016-02-20 20:50 ` Al Viro
2016-02-22 11:20 ` Dmitry Vyukov
2016-02-22 17:23 ` Al Viro
2016-02-23 15:34 ` Dmitry Vyukov
2016-02-23 18:17 ` Al Viro
2016-02-20 10:36 ` Dmitry Vyukov
2016-02-24 3:12 ` Ian Kent
2016-02-24 4:46 ` Al Viro
2016-02-24 4:46 ` Al Viro
2016-02-24 10:03 ` Dmitry Vyukov
2016-02-24 10:15 ` Dmitry Vyukov
2016-02-24 13:35 ` Dmitry Vyukov
2016-02-24 15:15 ` Al Viro
2016-02-25 8:29 ` Dmitry Vyukov
2016-02-25 16:39 ` Al Viro
2016-02-26 21:21 ` Al Viro
2016-02-26 21:25 ` Dmitry Vyukov
2016-02-26 22:07 ` Al Viro
2016-02-26 22:07 ` Al Viro
2016-02-27 22:27 ` Al Viro
2016-02-27 22:27 ` Al Viro
2016-02-28 15:43 ` Dmitry Vyukov
2016-02-28 16:04 ` Dmitry Vyukov
2016-02-28 17:01 ` Al Viro
2016-02-28 20:01 ` Al Viro
2016-02-29 9:38 ` Dmitry Vyukov
2016-02-29 12:34 ` Dmitry Vyukov
2016-02-29 16:11 ` Al Viro [this message]
2016-02-29 13:09 ` Al Viro
2016-02-29 13:43 ` David Howells
2016-02-29 15:54 ` Dmitry Vyukov
2016-02-29 16:19 ` Al Viro
2016-02-29 18:19 ` Dmitry Vyukov
2016-03-01 8:59 ` Dmitry Vyukov
2016-02-29 16:45 ` Linus Torvalds
2016-02-29 16:50 ` Al Viro
2016-02-29 17:20 ` Al Viro
2016-02-29 17:24 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160229161144.GW17997@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=dhowells@redhat.com \
--cc=dvyukov@google.com \
--cc=glider@google.com \
--cc=kcc@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mic@digikod.net \
--cc=raven@themaw.net \
--cc=sasha.levin@oracle.com \
--cc=syzkaller@googlegroups.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.