From: "Theodore Ts'o" <tytso@mit.edu>
To: Christoph Hellwig <hch@infradead.org>,
Jörn Engel <joern@wohnheim.fh-wedel.de>,
Alfred Brons <alfredbrons@yahoo.com>,
pocm@sat.inesc-id.pt, linux-kernel@vger.kernel.org
Subject: Re: what is our answer to ZFS?
Date: Tue, 22 Nov 2005 11:28:36 -0500
Message-ID: <20051122162836.GA31444@thunk.org>
In-Reply-To: <20051122152531.GU12760@delft.aura.cs.cmu.edu>
On Tue, Nov 22, 2005 at 10:25:31AM -0500, Jan Harkes wrote:
> On Tue, Nov 22, 2005 at 09:50:47AM -0500, Theodore Ts'o wrote:
> > I will note though that there are people who are asking for 64-bit
> > inode numbers on 32-bit platforms, since 2**32 inodes are not enough
> > for certain distributed/clustered filesystems. And this is something
> > we don't yet support today, and probably will need to think about much
> > sooner than 128-bit filesystems....
>
> As far as the kernel is concerned this hasn't been a problem in a while
> (2.4.early). The iget4 operation that was introduced by reiserfs (now
> iget5) pretty much makes it possible for a filesystem to use anything to
> identify its inodes. The 32-bit inode numbers are simply used as a hash
> index.
iget4 wasn't even strictly necessary, unless you want to use the inode
cache (which has always been strictly optional for filesystems, even
inode-based ones) --- Linux's VFS is dentry-based, not inode-based, so
we don't use inode numbers to index much of anything inside the
kernel, other than the aforementioned optional inode cache.
The main issue is the lack of a 64-bit interface to extract inode
numbers, which is needed as you point out for userspace archiving
tools like tar. There are also other programs or protocols that in the
past have broken as a result of inode number collisions.
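To make the collision concrete, here is a small Python sketch. It assumes
the narrow interface simply drops the upper 32 bits (a real filesystem
might fold or hash the number instead), and the inode values and the
helper name truncate_ino32 are made up for illustration.

```python
# Illustration only: model an interface that can report just the low
# 32 bits of a 64-bit inode number. All values here are hypothetical.
MASK32 = 0xFFFFFFFF


def truncate_ino32(ino64):
    """Return what a 32-bit st_ino field would hold for ino64."""
    return ino64 & MASK32


ino_a = 0x0000000100000042  # hypothetical inode on a clustered filesystem
ino_b = 0x0000000200000042  # a different inode with the same low 32 bits

# The two inodes are distinct inside the filesystem ...
distinct_in_fs = ino_a != ino_b
# ... but indistinguishable through the 32-bit interface.
same_in_userspace = truncate_ino32(ino_a) == truncate_ino32(ino_b)

print(distinct_in_fs, same_in_userspace)
```

Any program that keys on the truncated value alone will treat these two
files as the same object.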
As another example, a quick google search indicates that some mail
programs can use inode numbers as a part of a technique to create
unique filenames in maildir directories. One could easily also
imagine using inode numbers as part of creating unique ids returned by
an IMAP server --- not something I would recommend, but it's an
example of what some people might have done, since everybody _knows_
they can count on inode numbers on Unix systems, right? POSIX
promises that they won't break!
> The only thing that tends to break are userspace archiving tools like
> tar, which assume that 2 objects with the same 32-bit st_ino value are
> identical. I think that by now several actually double check that the
> inode linkcount is larger than 1.
Um, that's not good enough to avoid failure modes; consider what might
happen if you have two inodes that have hardlinks, so that st_nlink >
1, but whose inode numbers are the same if you only look at the low 32
bits? Oops.
It's not a bad heuristic, if you don't have that many hard-linked
files on your system, but if you have a huge number of hard-linked
trees (such as you might find on a kernel developer's machine, with
tons of hard-linked source trees), I wouldn't want to count on this
always working.
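The failure mode can be sketched in a few lines of Python. This is not
taken from tar or any real archiver; find_hardlinks, the file names, and
the inode values are all hypothetical, and the sketch assumes the tool
only ever sees the low 32 bits of each inode number.

```python
# Sketch of the "same st_ino and st_nlink > 1" heuristic. Hypothetical
# code, not the logic of any real archiver.
MASK32 = 0xFFFFFFFF


def find_hardlinks(files):
    """files: iterable of (name, st_dev, ino64, st_nlink) tuples.

    Returns (name, earlier_name) pairs that the heuristic treats as
    hard links, comparing only the low 32 bits of the inode number.
    """
    seen = {}   # (st_dev, low 32 bits of inode) -> first name seen
    links = []
    for name, dev, ino64, nlink in files:
        if nlink <= 1:
            continue  # the double check: a single link is never a hard link
        key = (dev, ino64 & MASK32)
        if key in seen:
            links.append((name, seen[key]))
        else:
            seen[key] = name
    return links


files = [
    ("tree1/Makefile", 1, 0x100000042, 2),
    ("tree2/Makefile", 1, 0x100000042, 2),  # genuine hard link to tree1
    ("tree3/README", 1, 0x200000042, 2),    # different inode, same low bits
    ("tree4/README", 1, 0x200000042, 2),    # genuine hard link to tree3
]

links = find_hardlinks(files)
print(links)
```

Both README entries end up recorded as links to tree1/Makefile, so an
archive built this way would restore them with the wrong contents. The
st_nlink check does nothing here because every file involved really does
have more than one link.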
- Ted
>
> Jan