From: Chris Down <chris@chrisdown.name>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
Jeff Layton <jlayton@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] fs: inode: Reduce volatile inode wraparound risk when ino_t is 64 bit
Date: Sat, 21 Dec 2019 10:16:52 +0000
Message-ID: <20191221101652.GA494948@chrisdown.name>
In-Reply-To: <20191220213052.GB7476@magnolia>
Darrick J. Wong writes:
>On Fri, Dec 20, 2019 at 02:49:36AM +0000, Chris Down wrote:
>> In Facebook production we are seeing heavy inode number wraparounds on
>> tmpfs. On affected tiers, in excess of 10% of hosts show multiple files
>> with different content and the same inode number, with some servers even
>> having as many as 150 duplicated inode numbers with differing file
>> content.
>>
>> This causes actual, tangible problems in production. For example, we
>> have complaints from those working on remote caches that their
>> application is reporting cache corruptions because it uses (device,
>> inodenum) to establish the identity of a particular cache object, but
>
>...but you cannot delete the (dev, inum) tuple from the cache index when
>you remove a cache object??
Some cache objects may be long-lived. In those cases, the cache objects aren't
removed until they're conclusively no longer needed.
Since tmpfs shares the i_ino counter with every other user of get_next_ino,
it's entirely possible to churn through 2^32 inode numbers within the
lifetime of a single cache file.
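To illustrate the wraparound described above, here is a toy model in Python.
It is only a sketch: the class and function names are made up for
illustration, it ignores details of the real get_next_ino such as per-CPU
batching, and the allocation rate in the note below is an assumed figure, not
a measurement.

```python
class WrappingInoCounter:
    """Toy stand-in for a shared inode number counter of a fixed bit width."""

    def __init__(self, bits=32):
        self.mask = (1 << bits) - 1
        self.last = 0

    def next_ino(self):
        # Increment and truncate to the counter width, so the value wraps.
        self.last = (self.last + 1) & self.mask
        return self.last


def find_collision(bits):
    """Create one long-lived file, then churn until its number is reused.

    Returns (inode number of the long-lived file, allocations until reuse).
    """
    counter = WrappingInoCounter(bits)
    long_lived = counter.next_ino()
    churn = 0
    while True:
        churn += 1
        if counter.next_ino() == long_lived:
            return long_lived, churn


def seconds_until_wrap(rate_per_sec, bits=32):
    """Time for a counter of the given width to wrap at a steady rate."""
    return (1 << bits) / rate_per_sec
```

With a small width the reuse is easy to see: find_collision(8) hands the
long-lived file's number out again after 256 further allocations. For the
32-bit case, an assumed 10,000 allocations per second across all users of the
counter gives seconds_until_wrap(10_000) of roughly 429,497 seconds, which is
just under five days.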
>> because it's not unique any more, the application refuses to continue
>> and reports cache corruption. Even worse, sometimes applications may not
>> even detect the corruption but may continue anyway, causing phantom and
>> hard to debug behaviour.
>>
>> In general, userspace applications expect that (device, inodenum) should
>> be enough to uniquely point to one inode, which seems fair enough.
>
>Except that it's not. (dev, inum, generation) uniquely points to an
>instance of an inode from creation to the last unlink.
I didn't mention generation because, even though it's set on tmpfs (to
prandom_u32()), it can't be read from userspace, since the `ioctl` returns
ENOTTY. We can't ask userspace applications to introspect an inode attribute
that they can't even access :-)
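For reference, this is roughly what a userspace application can and cannot
see. The sketch below assumes the ioctl in question is FS_IOC_GETVERSION from
<linux/fs.h>, and computes its request number using the generic Linux _IOR
encoding (some architectures encode ioctl numbers differently);
file_identity is a hypothetical helper name. Where the ioctl fails, ENOTTY
being the tmpfs case mentioned above, the generation comes back as None.

```python
import fcntl
import os
import struct

# Assumption: FS_IOC_GETVERSION is _IOR('v', 1, long), encoded with the
# generic Linux ioctl layout (direction in bits 30-31, size in bits 16-29).
_IOC_READ = 2
FS_IOC_GETVERSION = (
    (_IOC_READ << 30) | (struct.calcsize("l") << 16) | (ord("v") << 8) | 1
)


def file_identity(path):
    """Return (st_dev, st_ino, generation); generation is None if unreadable."""
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)
        generation = None
        try:
            buf = bytearray(struct.calcsize("l"))
            fcntl.ioctl(fd, FS_IOC_GETVERSION, buf)
            # Read the result as a 32-bit value from the start of the buffer.
            generation = struct.unpack_from("I", buf)[0]
        except OSError:
            # The filesystem does not expose the generation through this ioctl.
            pass
        return st.st_dev, st.st_ino, generation
    finally:
        os.close(fd)
```

An application keying a cache on this tuple would therefore have nothing
better than (st_dev, st_ino) to go on whenever the third element is None.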