From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: yangerkun <yangerkun@huawei.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
yi.zhang@huawei.com, houtao1@huawei.com, miaoxie@huawei.com
Subject: Re: system panic while dentry reference count overflow
Date: Tue, 7 May 2019 05:15:52 +0100 [thread overview]
Message-ID: <20190507041552.GH23075@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CAHk-=wjjK16yyug_5-xjPjXniE_T9tzQwxW45JJOHb=ho9kqrA@mail.gmail.com>
On Mon, May 06, 2019 at 06:50:27PM -0700, Linus Torvalds wrote:
> On Mon, May 6, 2019 at 5:40 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Linus, lockref is your code, IIRC; which variant would you consider
> > more feasible?
>
> I think we should strive for the same kind of count overflow handling
> that the pageref patches did: keep the count at 32 bits, but just add
> a new "try_dget()" thing that returns "no" when the count grows too
> large.
>
> And then use the "try_dget()" model when possible, and particularly
> for the easy cases for user mode to trigger. You don't have to catch
> them all, and in most places it isn't worth even worrying about it
> because users can't force billions of those places to be active at
> once.
Umm... Where would you put the cutoff for try_dget()? 1G? Because
2G-<something relatively small> is risky - have it reached, then
get the rest of the way to 2G by normal dget() and you've got trouble.
The part that worries me is that we'd get new failure exits that will
be impossible to test without fuckloads of RAM in the testbox; certainly
more than I have ;-/
> I don't see the original email (I'm not on fsdevel, and google doesn't
> find it), so I don't see if there was some particular case that was
> pointed out as being an easy attack vector.
Attack is predicated upon having 2Tb RAM; basically, a forkbomb with
plenty of access(2) on different inexistent names in the same directory
from each process. And yes, that can be mitigated by limiting the
number of negative dentries, but I'm pretty sure that it can be
done with positive ones. E.g. take tmpfs, create a file there,
then link(2) a lot and you've got each child contribute to parent
directory's dentry refcount. Or you can open the root directory
a lot (again, on tmpfs), then seek in it (each open and each cursor
will contribute). That way you need 1G opened files, and /proc/sys/fs/file-nr
set that high is not impossible if you have enough RAM. Will any
boxen be set up that way? No idea...
Most of attack vectors going through ->d_parent contributions are
relatively easy to deal with - d_alloc() can fail already, and having
that kind of overflow mapped at -ENOMEM is not a big deal. Ditto
for d_alloc_cursor(). Callers of __d_move() in d_splice_alias() are
not hard to deal with. However, d_move() itself is nasty. We call
it in situations where it's too late to fail. We _might_ get away
with it if we do a variant that consumes a reference to new parent
and have that code go
try to reserve an extra reference to new parent
sod off if failed
do the stuff that might fail
if failed - drop that extra reference
otherwise do the new variant of d_move()
That's an extra lock/unlock of parent's ->d_lock, but I'd expect
it to drown in the noise on those paths...
However, that obviously doesn't do anything for references
held by opened files. Having complete_walk() check for
refcount being too high and buggering off in that case
probably would help that one...
Ugh...
next prev parent reply other threads:[~2019-05-07 4:15 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-06 3:36 system panic while dentry reference count overflow yangerkun
2019-05-07 0:40 ` Al Viro
2019-05-07 1:50 ` Linus Torvalds
2019-05-07 4:15 ` Al Viro [this message]
2019-05-07 15:26 ` Linus Torvalds
2019-05-07 19:16 ` Al Viro
2019-05-07 19:23 ` Linus Torvalds
2019-05-07 19:55 ` Al Viro
2019-05-07 20:47 ` Linus Torvalds
2019-05-07 21:14 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190507041552.GH23075@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=houtao1@huawei.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miaoxie@huawei.com \
--cc=torvalds@linux-foundation.org \
--cc=yangerkun@huawei.com \
--cc=yi.zhang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).