Re: BIG files & file systems

From: Hans Reiser <reiser@namesys.com>
To: Steve Lord <lord@sgi.com>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>,
	Alexander Viro <viro@math.psu.edu>,
	"Peter J. Braam" <braam@clusterfs.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: BIG files & file systems
Date: Fri, 02 Aug 2002 19:10:30 +0400
Message-ID: <3D4AA0E6.9000904@namesys.com>
In-Reply-To: 1028297194.30192.25.camel@jen.americas.sgi.com

There are a number of interfaces that need expansion in 2.5.  telldir
and seekdir would be much better if they took as an argument some
filesystem-specific opaque cookie (e.g. a filename). Using a byte offset
to reference a directory entry that was found by filename is an
implementation-specific artifact that obviously only works for
ufs/s5fs/ext2-type filesystems, and is just wrong.
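
A minimal userspace sketch of the idea, not a proposed kernel interface:
the position in a directory is remembered as the last name returned
rather than as a byte offset. The function name read_dir_after and the
use of sorted order are assumptions made for illustration; a real
filesystem would resume in its own index order.

```python
import os
import tempfile


def read_dir_after(path, cookie=None, limit=2):
    """Return up to `limit` names that sort after `cookie`, plus a new cookie.

    The cookie is simply the last name returned, so it stays meaningful
    even if entries are added or removed between calls.
    """
    names = sorted(os.listdir(path))
    if cookie is not None:
        names = [n for n in names if n > cookie]
    batch = names[:limit]
    next_cookie = batch[-1] if batch else cookie
    return batch, next_cookie


with tempfile.TemporaryDirectory() as d:
    for name in ("a", "b", "c", "d"):
        open(os.path.join(d, name), "w").close()

    first, cookie = read_dir_after(d)
    # Remove an already-returned entry; a byte offset could now point
    # somewhere else, but the name cookie is unaffected.
    os.remove(os.path.join(d, "a"))
    second, cookie = read_dir_after(d, cookie)
```

Re-sorting the whole listing on every call is only for brevity here.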

4 billion files is not enough to store the government's XML databases in.

Hans

Steve Lord wrote:

>On Fri, 2002-08-02 at 08:56, Jan Harkes wrote:
>
>>I was simply assuming that any filesystem that is using iget5 and
>>doesn't use the simpler iget helper has some reason why it cannot find
>>an inode given just the 32-bit ino_t.
>
>In XFS's case (remember, the iget5 code is based on XFS changes) it is
>more a matter of the code to read the inode sometimes needing to pass
>other info down to the read_inode part of the filesystem, so we want to
>do that internally. XFS can have 64 bit inode numbers, but you need more
>than 1 Tbyte in an fs to get that big (inode numbers are a disk
>address). We also have code which keeps them in the bottom 1 Tbyte
>which is turned on by default on Linux.
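
A rough check of the 1 Tbyte figure. It assumes two things the message
does not state: that an inode number is approximately the inode's byte
address divided by the on-disk inode size, and that inodes are 256
bytes. It ignores the real on-disk encoding of XFS inode numbers.

```python
INODE_SIZE = 256   # bytes; assumed on-disk inode size, not from the message
INO_BITS = 32      # width of a 32-bit ino_t

# Largest span of disk that 32-bit inode numbers can address
# if each number refers to one INODE_SIZE-byte slot.
addressable_bytes = (2 ** INO_BITS) * INODE_SIZE
tbytes = addressable_bytes // 2 ** 40
```

Under those assumptions, addressable_bytes is exactly 2**40, i.e. 1 Tbyte.
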
>
>>This is definitely true for Coda, we have 96-bit file identifiers.
>>Actually my development tree currently uses 128-bit, it is aware of
>>multiple administrative realms and distinguishes between objects with
>>FID 0x7f000001.0x1.0x1 in different administrative domains. There is a
>>hash-function that tries to map these large FIDs into the 32-bit ino_t
>>space with as few collisions as possible.
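
A sketch of folding a 128-bit identifier into a 32-bit inode number.
This is not Coda's actual hash function; CRC-32 is swapped in purely for
illustration, and the field layout (realm, volume, vnode, unique) and
the realm values 1 and 2 are hypothetical.

```python
import struct
import zlib


def fid_to_ino(realm, volume, vnode, unique):
    """Fold four 32-bit FID fields into one 32-bit inode number."""
    packed = struct.pack("<4I", realm, volume, vnode, unique)
    return zlib.crc32(packed) & 0xFFFFFFFF


# The same FID 0x7f000001.0x1.0x1 in two different (hypothetical) realms.
ino_a = fid_to_ino(1, 0x7F000001, 0x1, 0x1)
ino_b = fid_to_ino(2, 0x7F000001, 0x1, 0x1)
```

Any 128-to-32-bit mapping must collide for some pairs of FIDs, so a
caller still has to compare the full FID when two objects share an
inode number.
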
>>
>>NFS has a >32-bit filehandle. ReiserFS might have unique inodes, but
>>seems to need access to the directory to find them. So I don't quickly
>>see how it would guarantee uniqueness. NTFS actually doesn't seem to use
>>iget5 yet, but it has multiple streams per object which would probably
>>end up using the same ino_t.
>>
>>Userspace applications should have an option to ignore hardlinks.
>>Very large filesystems either don't care because there is plenty of
>>space, don't support them across boundaries that are not visible to the
>>application, or could be dealing with them automatically (COW
>>links). Besides, if I really have a trillion files, I don't want 'tar
>>and friends' to try to keep track of all those inode numbers (and device
>>numbers) in memory.
>>
>>The other solution is that applications can actually use more of the
>>information from the inode to avoid confusion, like st_nlink and
>>st_mtime, which are useful when the filesystem is still mounted rw as
>>well. And to make it even better, st_uid, st_gid, st_size, st_blocks and
>>st_ctime, and an MD5/SHA checksum. Although this obviously would become
>>even worse for the trillion file backup case.
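
A sketch of how an application might combine these fields, under the
assumption that it only needs to remember files whose st_nlink is at
least 2. The helper name find_hardlink_groups and the choice of key
fields are illustrative, not taken from any existing tool.

```python
import os
import tempfile


def find_hardlink_groups(paths):
    """Group paths that appear to name the same underlying file."""
    seen = {}
    for p in paths:
        st = os.lstat(p)
        if st.st_nlink < 2:
            continue  # no other name exists, so nothing to remember
        # Extra fields reduce false matches when st_ino is not unique.
        key = (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)
        seen.setdefault(key, []).append(p)
    return [group for group in seen.values() if len(group) > 1]


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    c = os.path.join(d, "c")
    with open(a, "w") as f:
        f.write("data")
    os.link(a, b)              # second name for the same file
    with open(c, "w") as f:
        f.write("data")        # same contents, but a separate file
    groups = find_hardlink_groups([a, b, c])
```

Skipping files with a single link keeps the table small, but it does
not remove the cost when most of a very large tree is hardlinked.
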
>
>If apps would have to change then I would vote for allowing larger
>inodes out of the kernel in an extended version of stat and getdents.
>I was going to say 64 bit versions, but if even 64 is not enough for
>you, it is getting a little hard to handle.
>
>Steve
>
>>Jan


-- 
Hans