Re: BIG files & file systems

From: Hans Reiser <reiser@namesys.com>
To: Steve Lord <lord@sgi.com>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>,
	Alexander Viro <viro@math.psu.edu>,
	"Peter J. Braam" <braam@clusterfs.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: BIG files & file systems
Date: Fri, 02 Aug 2002 19:10:30 +0400
Message-ID: <3D4AA0E6.9000904@namesys.com>
In-Reply-To: 1028297194.30192.25.camel@jen.americas.sgi.com

There are a number of interfaces that need expansion in 2.5.  telldir
and seekdir would be much better if they took as their argument some
filesystem-specific opaque cookie (e.g. a filename). Using a byte offset
to reference a directory entry that was found by filename is an
implementation-specific artifact that obviously only works for a
ufs/s5fs/ext2 type of filesystem, and is just wrong.
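
To make the cookie argument concrete, here is a minimal sketch. It assumes
a toy directory whose entries are kept sorted by name and shift position
when something is removed, roughly as in a tree-based filesystem.
ToyDirectory, read_by_offset, read_by_cookie and list_all are hypothetical
names for illustration only, not a kernel or libc interface.

```python
import bisect


class ToyDirectory:
    """Hypothetical in-memory directory, sorted by name (illustration only)."""

    def __init__(self, names):
        self._names = sorted(names)

    def unlink(self, name):
        self._names.remove(name)

    def read_by_offset(self, offset, count):
        """Offset-style cursor: a position in the current on-disk layout."""
        batch = self._names[offset:offset + count]
        return batch, offset + len(batch)

    def read_by_cookie(self, cookie, count):
        """Name-style cursor: resume strictly after the last name returned."""
        start = 0 if cookie is None else bisect.bisect_right(self._names, cookie)
        batch = self._names[start:start + count]
        return batch, (batch[-1] if batch else cookie)


def list_all(directory, reader, start, remove_after_first=None):
    """Read the directory two entries at a time, optionally unlinking one
    name right after the first batch to simulate a concurrent delete."""
    seen, cursor, first = [], start, True
    while True:
        batch, cursor = reader(cursor, 2)
        if not batch:
            return seen
        seen.extend(batch)
        if first and remove_after_first:
            directory.unlink(remove_after_first)
            first = False
```

With entries a..e and "a" unlinked after the first batch, the offset cursor
silently skips "c", while the name cursor still returns every surviving
entry. Real filesystems differ in how (or whether) positions move, so this
only illustrates why a position tied to layout is fragile.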

4 billion files is not enough to store the government's XML databases in.

Hans

Steve Lord wrote:

>On Fri, 2002-08-02 at 08:56, Jan Harkes wrote:
>
>>I was simply assuming that any filesystem that is using iget5 and
>>doesn't use the simpler iget helper has some reason why it cannot find
>>an inode given just the 32-bit ino_t.
>
>In XFS's case (remember, the iget5 code is based on XFS changes) it is
>more a matter of the code to read the inode sometimes needing to pass
>other info down to the read_inode part of the filesystem, so we want to
>do that internally. XFS can have 64 bit inode numbers, but you need more
>than 1 Tbyte in an fs to get that big (inode numbers are a disk
>address). We also have code which keeps them in the bottom 1 Tbyte
>which is turned on by default on Linux.
>
>>This is definitely true for Coda, we have 96-bit file identifiers.
>>Actually my development tree currently uses 128-bit, it is aware of
>>multiple administrative realms and distinguishes between objects with
>>FID 0x7f000001.0x1.0x1 in different administrative domains. There is a
>>hash-function that tries to map these large FIDs into the 32-bit ino_t
>>space with as few collisions as possible.
>>
>>NFS has a >32-bit filehandle. ReiserFS might have unique inodes, but
>>seems to need access to the directory to find them. So I don't quickly
>>see how it would guarantee uniqueness. NTFS actually doesn't seem to use
>>iget5 yet, but it has multiple streams per object which would probably
>>end up using the same ino_t.
>>
>>Userspace applications should have an option to ignore hardlinks.
>>Very large filesystems either don't care because there is plenty of
>>space, don't support them across boundaries that are not visible to the
>>application, or could be dealing with them automatically (COW
>>links). Besides, if I really have a trillion files, I don't want 'tar
>>and friends' to try to keep track of all those inode numbers (and device
>>numbers) in memory.
>>
>>The other solution is that applications can actually use more of the
>>information from the inode to avoid confusion, like st_nlink and
>>st_mtime, which are useful when the filesystem is still mounted rw as
>>well. And to make it even better, st_uid, st_gid, st_size, st_blocks and
>>st_ctime, and an MD5/SHA checksum. Although this obviously would become
>>even worse for the trillion file backup case.
>
>If apps would have to change then I would vote for allowing larger
>inodes out of the kernel in an extended version of stat and getdents.
>I was going to say 64 bit versions, but if even 64 is not enough for
>you, it is getting a little hard to handle.
>
>Steve
>
>>Jan
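
Jan's point about hashing large FIDs into a 32-bit ino_t, and the reason a
filesystem then needs more than the bare inode number to find an inode, can
be sketched as follows. This is an illustration under assumptions: the
four-word FID layout (realm, volume, vnode, unique), the use of SHA-1, and
the InodeCache class are all made up for the example and are not Coda's
actual hash or the kernel's inode cache.

```python
import hashlib
import struct


def fid_to_ino32(realm, volume, vnode, unique):
    """Fold a 128-bit identifier (four 32-bit words) into a 32-bit number.

    Assumption: SHA-1 stands in for whatever hash the filesystem really uses.
    """
    packed = struct.pack(">4I", realm, volume, vnode, unique)
    digest = hashlib.sha1(packed).digest()
    return struct.unpack(">I", digest[:4])[0]


class InodeCache:
    """Hypothetical cache: bucket by the 32-bit hash, match on the full FID."""

    def __init__(self):
        self._buckets = {}

    def get(self, fid):
        ino = fid_to_ino32(*fid)
        bucket = self._buckets.setdefault(ino, [])
        for known_fid, inode in bucket:
            if known_fid == fid:  # the 32-bit number alone is not enough
                return inode
        inode = {"fid": fid, "ino": ino}
        bucket.append((fid, inode))
        return inode
```

Two objects with FID 0x7f000001.0x1.0x1 in different realms get separate
cache entries even if their 32-bit numbers happened to collide, because the
lookup compares the full identifier. Userspace, which only sees the 32-bit
value, has no such second check.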

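For the 'tar and friends' case Jan describes, a rough sketch of hardlink
detection in userspace follows. It assumes (st_dev, st_ino) identifies a
file, which is exactly the assumption this thread questions, and it uses
st_nlink to avoid remembering files that have only one name. find_hardlinks
is a hypothetical helper, not how tar itself is implemented.

```python
import os


def find_hardlinks(root):
    """Group paths under root that appear to name the same file."""
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_nlink < 2:
                continue  # a single name: nothing to keep in memory
            groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    # Only report identities that were actually seen under more than one path.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

The st_nlink check keeps the table small when few files are hardlinked, but
the table still grows with the number of multiply linked files, and a
colliding inode number would merge unrelated files into one group.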

-- 
Hans
