linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@zeniv.linux.org.uk>
To: yangerkun <yangerkun@huawei.com>
Cc: linux-fsdevel@vger.kernel.org, yi.zhang@huawei.com,
	houtao1@huawei.com, miaoxie@huawei.com,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: system panic while dentry reference count overflow
Date: Tue, 7 May 2019 01:40:46 +0100
Message-ID: <20190507004046.GE23075@ZenIV.linux.org.uk>
In-Reply-To: <af9a8dec-98a2-896f-448b-04ded0af95f0@huawei.com>

On Mon, May 06, 2019 at 11:36:10AM +0800, yangerkun wrote:
> Hi,
> 
> Running many processes in parallel, each executing the code shown below
> (on a machine with 2T of memory), overflows the reference count of the
> root dentry, since every negative dentry allocated does count++ on the
> root dentry. After that, another dput() of the root dentry will free it,
> which crashes the system. Has anyone else run into this problem?

The problem is, in principle, known - it's just that you need an obscene
amount of RAM to trigger it (you need 4G objects of some sort to hold those
references).
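
A rough back-of-the-envelope sketch of that RAM requirement, in C. The
192-byte figure is an assumption (a typical size for struct dentry on
64-bit builds; the real value depends on config and architecture), and
bytes_to_wrap() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Assumption: roughly 192 bytes per dentry on a 64-bit build.
 * The real sizeof(struct dentry) varies with config and arch. */
#define ASSUMED_DENTRY_BYTES 192ULL

/* Hypothetical helper: bytes of reference-holding objects needed
 * before a 32-bit reference count wraps all the way around. */
static uint64_t bytes_to_wrap(uint64_t bytes_per_ref_holder)
{
	const uint64_t refs_to_wrap = 1ULL << 32;	/* 4G references */

	return refs_to_wrap * bytes_per_ref_holder;
}
```

Under that assumption, bytes_to_wrap(ASSUMED_DENTRY_BYTES) >> 30 comes
to 768, i.e. about 768 GiB of child dentries alone, which fits the 2T
machine in the report.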

_If_ you have that much RAM, there's any number of ways to hit that thing -
it doesn't have to be cached results of lookups in a directory as in your
testcase.  E.g. raise /proc/sys/fs/file-max past 4G (you will need a lot
of RAM for that, or the thing won't let you go that high) and just keep
opening the same file (using enough processes to get around the per-process
limit, or playing with SCM_RIGHTS sendmsg to yourself, etc.)

I don't think that making dget() able to fail is a feasible approach;
there are too many callers and hundreds of brand-new failure exits
that will almost never be exercised is _the_ recipe for bitrot from
hell.

An obvious approach would be to use atomic_long_t; the problem is that
the counter is not an atomic_t - it's part of a lockref, whose count is
limited to 32 bits.  Doing a wider variant... hell knows - wider cmpxchg
variants might be usable, or we could put the upper bits into a separate
word, with cmpxchg loops in lockref_get() et al. treating "lower bits all
zero" as "fall back to grabbing spinlock".
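
A minimal userspace sketch of the "upper bits in a separate word" idea,
using C11 atomics. This is not the kernel's lockref (which packs the
spinlock and the count into one 64-bit word for cmpxchg); struct wide_ref
and every wide_* name are hypothetical, and an atomic_flag stands in for
the spinlock. The lockless path is refused whenever the update would
cross the "low word all zero" boundary; only then is the lock taken and
the upper word adjusted.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: low word updated locklessly, high word under lock. */
struct wide_ref {
	atomic_flag lock;	/* stand-in for the spinlock */
	_Atomic uint32_t lo;	/* low 32 bits of the count */
	uint32_t hi;		/* upper 32 bits, only touched under lock */
};

static void wide_ref_init(struct wide_ref *r, uint64_t count)
{
	atomic_flag_clear(&r->lock);
	atomic_init(&r->lo, (uint32_t)count);
	r->hi = (uint32_t)(count >> 32);
}

static void wide_lock(struct wide_ref *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		;	/* spin */
}

static void wide_unlock(struct wide_ref *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

static void wide_ref_get(struct wide_ref *r)
{
	uint32_t old = atomic_load(&r->lo);

	/* Fast path: cmpxchg loop, valid while the increment needs no carry. */
	while (old != UINT32_MAX) {
		if (atomic_compare_exchange_weak(&r->lo, &old, old + 1))
			return;
	}

	/* Slow path: low word is about to become all zero; carry under lock. */
	wide_lock(r);
	if (atomic_fetch_add(&r->lo, 1) == UINT32_MAX)
		r->hi++;
	wide_unlock(r);
}

static void wide_ref_put(struct wide_ref *r)
{
	uint32_t old = atomic_load(&r->lo);

	/* Fast path: cmpxchg loop, valid while the decrement needs no borrow. */
	while (old != 0) {
		if (atomic_compare_exchange_weak(&r->lo, &old, old - 1))
			return;
	}

	/* Slow path: low word is all zero; borrow from the upper word. */
	wide_lock(r);
	if (atomic_fetch_sub(&r->lo, 1) == 0)
		r->hi--;
	wide_unlock(r);
}

/* Carries and borrows happen only under the lock, so hi and lo are
 * mutually consistent while it is held. */
static uint64_t wide_ref_read(struct wide_ref *r)
{
	uint64_t v;

	wide_lock(r);
	v = ((uint64_t)r->hi << 32) | atomic_load(&r->lo);
	wide_unlock(r);
	return v;
}
```

The sketch leaves out everything that makes the real thing hard, such as
detecting the final put and the "dead" state; it only shows where the
spinlock fallback would sit relative to the cmpxchg loop.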

Linus, lockref is your code, IIRC; which variant would you consider
more feasible?

We don't have that many places looking at the refcount, fortunately.
And most of them are using d_count(dentry) (comparisons or printk).
The rest is almost all in fs/dcache.c...  So it's not as if we'd
been tied to refcount representation by arseloads of code all over
the tree.
