public inbox for linux-kernel@vger.kernel.org
From: John Richard Moser <nigelenki@comcast.net>
To: Phil Lougher <phil.lougher@gmail.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Designing Another File System
Date: Tue, 30 Nov 2004 20:42:38 -0500
Message-ID: <41AD218E.7090305@comcast.net>
In-Reply-To: <cce9e37e041130112243beb62d@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Phil Lougher wrote:
| On Mon, 29 Nov 2004 23:32:05 -0500, John Richard Moser
| <nigelenki@comcast.net> wrote:
|
|
|>- - localization of Inodes and related meta-data to prevent disk thrashing
|
|
| All filesystems place their filesystem metadata inside the inodes.  If
| you mean file metadata then please be more precise.  This isn't
| terribly new, recent posts have discussed how moving eas/acls inside
| the inode for ext3 has sped up performance.
|

I got the idea from FFS and its derivatives (UFS, UFS2, EXT2).  I don't
want to store xattrs inside inodes though; I want them in the same block
as the inode.  A few ms for the seek, but eh, the data's right there,
not on the other side of the disk.

|
|>- - a scheme which allows Inodes to be dynamically allocated and
|>deallocated out of order
|>
|
|
| Um,  all filesystems do that, I think you're missing words to the
| effect "without any performance loss or block fragmentation" !

All filesystems allow you to create the FS with 1 inode total?


Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/hda1            7823616  272670 7550946    4% /

No, it looks like this one allocates many inodes and uses them as it
goes.  Reiser has 0 inodes . . .
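
For reference, the counts in a df -i listing like the one above are the
kind of numbers statvfs(2) exposes.  A minimal sketch in Python, assuming
a Unix-like system; the helper name inode_usage is made up for
illustration:

```python
import os

def inode_usage(path="/"):
    """Return (total, used, free, percent_used) inode counts for the
    filesystem that contains path, much as df -i reports them."""
    st = os.statvfs(path)
    total = st.f_files   # inodes the filesystem reports in total
    free = st.f_ffree    # inodes currently free
    used = total - free
    # A filesystem that allocates inodes on demand may report 0 total
    # inodes, so guard the division.
    percent = (100.0 * used / total) if total else 0.0
    return total, used, free, percent

if __name__ == "__main__":
    print(inode_usage("/"))
```

A filesystem with a fixed inode table shows a large total fixed at mkfs
time; one that allocates inodes dynamically may report 0 here.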

|
|
|>- - 64 bit indices indicating the exact physical location on disk of
|>Inodes, giving a O(1) seek to the Inode itself
|
|
|>1)  Can Unix utilities in general deal with 64 bit Inodes?  (Most
|>programs I assume won't care; ls -i and df -i might have trouble)
|>
|
|
| There seems to be some confusion here.  The filesystem can use 64 bit
| inode numbers internally but hide these 64 bits and instead present
| munged 32 bit numbers to Linux.
|
| The 64 bit resolution is only necessary within the filesystem dentry
| lookup function to go from a directory name entry to the physical
| inode location on disk.  The inode number can then be reduced to 32
| bits for 'presentation' to the VFS.  AFAIK as all file access is
| through the dentry cache this is sufficient.  The only problems are
| that VFS iget() needs to be replaced with a filesystem specific iget.
| A number of filesystems do this.  Squashfs internally uses 37 bit
| inode numbers and presents them as 32 bit inode numbers in this way.
|

Ugly, but ok.  What happens when I actually have >4G inodes though?
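
To make the question concrete, here is a rough sketch of folding a 64 bit
inode number down to 32 bits.  XOR-folding the two halves is an
assumption picked for illustration, not necessarily what Squashfs or any
other filesystem does, and the names fold_ino and same_inode are
hypothetical.  With more than 2^32 inodes, two distinct inodes must share
a 32 bit number, so a filesystem-specific lookup would still have to
compare the full 64 bit value:

```python
MASK32 = 0xFFFFFFFF

def fold_ino(ino64):
    """Fold a 64 bit inode number into 32 bits by XORing its two halves."""
    return (ino64 ^ (ino64 >> 32)) & MASK32

def same_inode(cached_ino64, wanted_ino64):
    """The folded value can only narrow the search; identity has to be
    decided on the full 64 bit number."""
    return cached_ino64 == wanted_ino64

# Two different 64 bit inode numbers that fold to the same 32 bit value.
a = 0x0000000100000002
b = 0x0000000200000001

if __name__ == "__main__":
    print(hex(fold_ino(a)), hex(fold_ino(b)), same_inode(a, b))
```

Whatever folding is used, the presented 32 bit numbers stop being unique
once the filesystem holds more inodes than 32 bits can count.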

|
|>3)  What basic information do I absolutely *need* in my super block?
|>4)  What basic information do I absolutely *need* in my Inodes? (I'm
|>thinking {type,atime,dtime,ctime,mtime,posix_dac,meta_data_offset,size,\
|>links}
|
|
| Very much depends on your filesystem.  Cramfs is a good example of the
| minimum you need to store to satisfy the Linux VFS.  If you don't care
| what they are almost anything can be invented (uid, gid, mode, atime,
| dtime etc) and set to a useful default.  The *absolute* minimum is
| probably type, file/dir size, and file/dir data location on disk.

I meant basic, not for me.  Basic things a real Unix filesystem needs.
What *I* need comes from my head.  :)
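
As a strawman, the field list from question 4 could be laid out as a
fixed-size on-disk record.  This is only a sketch: the field widths, the
byte order, and the split of posix_dac into mode/uid/gid are all
assumptions made for illustration, not a proposed format.

```python
import struct
from collections import namedtuple

# Little-endian, no padding: type, mode, uid, gid, links,
# size, meta_data_offset, atime, dtime, ctime, mtime.
INODE_FMT = struct.Struct("<HHIIIQQQQQQ")

Inode = namedtuple(
    "Inode",
    "type mode uid gid links size meta_data_offset atime dtime ctime mtime",
)

def pack_inode(inode):
    """Serialize an Inode into its fixed-size on-disk form."""
    return INODE_FMT.pack(*inode)

def unpack_inode(raw):
    """Parse the fixed-size on-disk form back into an Inode."""
    return Inode(*INODE_FMT.unpack(raw))

# Hypothetical example record: type 1 stands for "regular file" here.
example = Inode(type=1, mode=0o644, uid=1000, gid=1000, links=1,
                size=4096, meta_data_offset=0, atime=0, dtime=0,
                ctime=0, mtime=0)

if __name__ == "__main__":
    raw = pack_inode(example)
    print(len(raw), unpack_inode(raw) == example)
```

With these widths a record is 64 bytes, so a 4096 byte block would hold
64 of them if nothing else shared the block.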

|
|
|>I guess the second would be better?  I can't locate any directories on
|>my drive with >2000 entries *shrug*.  The end key is just the entry
|>{name,inode} pair.
|
|
| I've had people trying to store 500,000 + files in a Squashfs
| directory.  Needless to say with the original directory implementation
| this didn't work terribly well...
|

Ouch.  Someone told me the directory had to be O(1) lookup . . . .
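
To put the 500,000 entry case in perspective, a rough sketch comparing a
linear scan over {name,inode} pairs with a hashed index.  The Python dict
only stands in for whatever on-disk hash or tree a real filesystem would
use, the function names are hypothetical, and none of this describes the
actual Squashfs directory format.

```python
def linear_lookup(entries, name):
    """Scan a list of (name, inode) pairs; cost grows with the number
    of entries in the directory."""
    for entry_name, ino in entries:
        if entry_name == name:
            return ino
    return None

def build_index(entries):
    """Build a name -> inode hash index over the same pairs."""
    return {entry_name: ino for entry_name, ino in entries}

def indexed_lookup(index, name):
    """Look a name up in the hash index; expected constant time."""
    return index.get(name)

def make_directory(count):
    """Generate count fake (name, inode) pairs for experimenting."""
    return [("file%06d" % i, 1000 + i) for i in range(count)]

if __name__ == "__main__":
    entries = make_directory(500_000)
    index = build_index(entries)
    print(linear_lookup(entries, "file499999"),
          indexed_lookup(index, "file499999"))
```

Looking up the last name walks all 500,000 pairs in the linear case and
touches one bucket, on average, in the hashed case.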

| Phillip Lougher
|

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBrSGNhDd4aOud5P8RAlo0AJ4pxB/LMhgTvNW4GdMmaNA2/uM0wACfWR8+
kOxwHU3/mTUUNAAhda2rv+g=
=fsJV
-----END PGP SIGNATURE-----

