From: Andreas Dilger <adilger@sun.com>
To: Alex Tomas <bzzz@sun.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [RFC] dynamic inodes
Date: Thu, 25 Sep 2008 16:09:36 -0600
Message-ID: <20080925220936.GL10950@webber.adilger.int>
In-Reply-To: <48DA28B0.2020207@sun.com>
Sadly this was sitting in my outbox overnight, and might be obsolete
already (explanation in a follow-up email), but I'm sending it as food
for thought...
On Sep 24, 2008 15:46 +0400, Alex Tomas wrote:
> another idea how to achieve more (dynamic) inodes:
> * new dir_entry format with 64bit inum
Yes, that is a requirement in all cases.
I've always thought that we should also implement inode-in-dirent when
we need to change the dirent format and make dynamic inodes, but that
may be too much to chew on at one time.
> * ino space is 64bit:
> * 2^48 phys. 4K blocks
> * 2^5 inodes in 4K block
The 2^5 inodes/4kB block would actually depend on the blocksize/inodesize,
let's just call this inodes-per-block-bits (IPBB). It will be an exponent
between 0 and 8 (i.e. between 1 and 256 inodes per block), which is fine.
For common ext4 filesystems this would be IPBB = 4 (2^4 = 16 inodes/block),
because the default is 256-byte inodes today.
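
As a quick illustration of IPBB, here is a minimal sketch that derives it
from the block size and inode size. The function name is made up for this
example and is not an existing ext4 helper.

```python
def inodes_per_block_bits(block_size: int, inode_size: int) -> int:
    """Return IPBB = log2(block_size / inode_size).

    Hypothetical helper: assumes both sizes are powers of two and that
    at least one inode fits in a block.
    """
    if inode_size <= 0 or block_size < inode_size or block_size % inode_size:
        raise ValueError("block size must be a positive multiple of inode size")
    per_block = block_size // inode_size
    if per_block & (per_block - 1):
        raise ValueError("inodes per block must be a power of two")
    return per_block.bit_length() - 1


print(inodes_per_block_bits(4096, 256))   # 256-byte inodes, 4kB blocks
print(inodes_per_block_bits(4096, 128))   # the 2^5 case quoted above
```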
> * highest bit is used to choose addressing schema: static or dynamic
Alternately, any inode >= 2^32 would be dynamic? One clear benefit of
putting the dynamic inodes at the end of the number space is that they
will only be used if the static inodes are full, which reduces risk due
to corruption and overhead due to dynamic allocations.
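
A minimal sketch of that alternative, assuming the split is exactly at 2^32
and that the allocator simply prefers static inodes whenever one is free.
The names and the allocation policy are illustrative only.

```python
DYNAMIC_INO_BASE = 1 << 32  # assumption: inode numbers >= 2^32 are dynamic


def is_dynamic_inode(ino: int) -> bool:
    """True if the 64-bit inode number falls in the dynamic range."""
    if not 0 < ino < (1 << 64):
        raise ValueError("inode number out of range")
    return ino >= DYNAMIC_INO_BASE


def choose_inode(free_static: set, next_dynamic: int) -> int:
    """Prefer a free static inode; use the dynamic range only when
    the static inodes are exhausted (hypothetical policy)."""
    if free_static:
        return min(free_static)
    return DYNAMIC_INO_BASE + next_dynamic
```

With this ordering the dynamic machinery is never touched on a filesystem
that still has static inodes available.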
> * each block is covered by two bits: in inode (I) and block (B) bitmaps:
> I: 0, B: 0 - block is just free
> I: 0, B: 1 - block is used, but not contains inodes
> I: 1, B: 0 - block is full of inodes
> I: 1, B: 1 - block contains few inodes, has free space
Storing B:0 for an in-use block seems very dangerous to me. This also
doesn't really address the need to be able to quickly locate free inodes,
because it means "I:1" _might_ mean the inode is free or it might not,
so EVERY "in-use" inode would need to be checked to see if it is free.
We need to start with a "dynamic inode bitmap" (DIB) that is mapped from
an "inode table file" (possibly only for the dynamic inode table blocks).
Free inodes can be scanned using the normal ext4_find_next_zero_bit()
in each of the bitmaps.
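
For illustration, a user-space sketch of that scan. It assumes the
little-endian bit order used by ext4 bitmaps (bit n lives in byte n / 8,
position n % 8) and follows the kernel convention of returning the bitmap
size when no zero bit is found. This is a Python stand-in, not the kernel
routine itself.

```python
def find_next_zero_bit(bitmap: bytes, size: int, offset: int = 0) -> int:
    """Return the index of the first clear bit in [offset, size),
    or size if every bit in that range is set."""
    for bit in range(offset, size):
        if not (bitmap[bit >> 3] >> (bit & 7)) & 1:
            return bit
    return size


# Bits 0..10 in use, bit 11 is the first free inode slot.
dib = bytes([0xFF, 0b00000111])
free_slot = find_next_zero_bit(dib, 16)
```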
Each dynamic inode bitmap block (DIBB) holds an array of bits indicating
dynamic inode use, as well as an array of block numbers, each of which maps
a group of 2^IPBB inode bits to one dynamic inode table block. The DIBB
should also have a header which contains space for a magic, a checksum, and
the count of free and total inodes, like a GDT has, as well as a count of
in-use itable blocks.
The dynamic inode table blocks (DITB) should also hold a header with a
magic, a checksum, and a back-pointer to the DIBB. The back-pointer
allows efficient clearing of the in-use bit, location of the DIBB if the
dynamic inode itself is corrupted, and possibly freeing the DITB when
the last in-use inode is freed.
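
To make the two headers concrete, here is a rough sketch of how they might
be packed and verified. Everything specific in it is an assumption made for
the example: the field order and widths, the magic values, padding the DIBB
header to 64 bytes, and the use of CRC32 as the checksum.

```python
import struct
import zlib

# Hypothetical values and layouts, not an on-disk format definition.
DIBB_MAGIC = 0x44494242  # ASCII "DIBB"
DITB_MAGIC = 0x44495442  # ASCII "DITB"

# magic, checksum, free inodes, total inodes, in-use itable blocks, padding
DIBB_HEADER = struct.Struct("<IIIII44x")
# magic, checksum, 64-bit back-pointer to the owning DIBB block
DITB_HEADER = struct.Struct("<IIQ")


def pack_header(fmt, magic, *fields):
    """Pack a header, filling in a CRC32 computed with checksum = 0."""
    raw = fmt.pack(magic, 0, *fields)
    return fmt.pack(magic, zlib.crc32(raw), *fields)


def unpack_header(fmt, magic, buf):
    """Verify magic and checksum, then return the remaining fields."""
    got_magic, csum, *fields = fmt.unpack(buf[:fmt.size])
    if got_magic != magic:
        raise ValueError("bad magic")
    if csum != zlib.crc32(fmt.pack(magic, 0, *fields)):
        raise ValueError("bad checksum")
    return tuple(fields)


dibb = pack_header(DIBB_HEADER, DIBB_MAGIC, 10, 6400, 400)
ditb = pack_header(DITB_HEADER, DITB_MAGIC, 123456789)
```

A checker that finds a DITB with a valid magic and checksum can follow the
back-pointer to the DIBB even when the inodes in the block are damaged.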
For common 256-byte inodes and 4kB blocks we need 8 bytes/block for the
block addresses, and 1 bit/inode, so

  4096 bytes/block / 256 bytes/inode = 16 inodes(bits)/block = 2 byte bitmap
  (4096 bytes - 64-byte header) / (8 byte address + 2 byte bitmap) =
    403 (say 400) itable blocks per DIBB = 400 * 16 = 6400 inodes/DIBB

With 64kB blocks:

  65536 bytes/block / 256 bytes/inode = 256 inodes(bits)/block = 32 byte bitmap
  (65536 bytes - 64-byte header) / (8 byte address + 32 byte bitmap) =
    1636 itable blocks per DIBB = 1636 * 256 = 418816 inodes/DIBB
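
The capacity arithmetic can be checked with a short sketch. It assumes a
64-byte header, an 8-byte block address per itable block, and exactly one
bitmap bit per inode (so 256 inodes per block need a 32-byte bitmap). The
function name is made up for the example.

```python
def dibb_capacity(block_size, inode_size, header_size=64, addr_size=8):
    """Return (itable blocks per DIBB, inodes per DIBB).

    Each itable block costs one block address plus one bitmap bit
    per inode it holds, rounded up to whole bytes.
    """
    inodes_per_block = block_size // inode_size
    bitmap_bytes = (inodes_per_block + 7) // 8
    itable_blocks = (block_size - header_size) // (addr_size + bitmap_bytes)
    return itable_blocks, itable_blocks * inodes_per_block


small = dibb_capacity(4096, 256)    # 4kB blocks, 256-byte inodes
large = dibb_capacity(65536, 256)   # 64kB blocks, 256-byte inodes
```

With these assumptions a 4kB DIBB covers 403 itable blocks (6448 inodes)
and a 64kB DIBB covers 1636 itable blocks (418816 inodes).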
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.