From: Andreas Dilger <adilger@clusterfs.com>
To: "Albert D. Cahalan" <acahalan@cs.uml.edu>
Cc: "Randy.Dunlap" <rddunlap@osdl.org>,
Matti Aarnio <matti.aarnio@zmailer.org>,
Christoph Hellwig <hch@infradead.org>,
"Peter J. Braam" <braam@CLUSTERFS.COM>,
linux-kernel@vger.kernel.org
Subject: Re: BIG files & file systems
Date: Mon, 5 Aug 2002 23:19:50 -0600
Message-ID: <20020806051950.GD22933@clusterfs.com>
In-Reply-To: <200208030326.g733Q7O474061@saturn.cs.uml.edu>
On Aug 02, 2002 23:26 -0400, Albert D. Cahalan wrote:
> Randy.Dunlap writes:
> > On Fri, 2 Aug 2002, Albert D. Cahalan wrote:
> >> Matti Aarnio writes:
>
> >>> - Filesystem format dependent limits
> >>> - EXT2/EXT3: u32_t FILESYSTEM block index, presuming the EXT2/EXT3
> >>> is supported only up to 4 kB block sizes, that gives
> >>> you a very hard limit.. of 16 terabytes (16 * "10^12")
> >>
> >> You first hit the triple-indirection limit at 4 TB.
> >> http://www.cs.uml.edu/~acahalan/linux/ext2.gif
> >>
> >>> - ReiserFS: u32_t block indexes presently, u64_t in future;
> >>> block size ranges ? Max size is limited by the
> >>> maximum supported file size, likely 2^63, which is
> >>> roughly 8 * "10^18", or circa 500 000 times larger
> >>> than EXT2/EXT3 format maximum.
> >>
> >> The top 4 st_size bits get stolen, so it's 60-bit sizes.
> >> You also get the 32-bit block limit at 16 TB.
> >
> > For a LinuxWorld presentation in August, I have asked each of the
> > 4 journaling filesystems (ext3, reiserfs, JFS, and XFS) what their
> > filesystem/filesize limits are. Here's what they have told me.
> >
> >                      ext3fs   reiserfs   JFS     XFS
> > max filesize:        16 TB#   1 EB       4 PB$   8 TB%
> > max filesystem size: 2 TB     17.6 TB*   4 PB$   2 TB!

I think you need a "!" after the 2TB limit for ext3 max filesystem
size, since that is a block device limit. The actual filesystem limit
for 4kB block size is 16TB* (2^32 blocks). More on this below.

> > Notes:
> > #: think sparse files
> > *: 4 KB blocks
> > $: 16 TB on 32-bit architectures
> > %: 4 KB pages
> > !: block device limit
>
> Please fix that before you give your presentation.
> Sparse files won't save you from the triple-indirection limit.
> This has me suspicious of the other numbers as well.
>
> Ext2 gives you 0xc blocks addressed right off the inode.
> Then with one 4 kB block of block pointers, you can get
> to another 0x400 (1024) blocks. With a block of pointers to
> blocks of pointers, you may address another 0x100000 blocks.
> Finally, triple indirection gives you a block of pointers
> to blocks of pointers to blocks of pointers, for another
> 0x40000000 data blocks. That's a total of:
>
> 0x4010040c blocks
> 0x4010040c000 bytes
> 4.4e12 bytes and change
> 4402 GB (decimal gigabytes)
> 4.4 TB (decimal terabytes)
>
> Of course you can't really use 4.4 TB on 32-bit Linux,
> so there is a sort of dishonesty in making this claim.
> I can get to 2.2 TB, which disturbingly would wrap any
> code using signed 32-bit math on units of 512 bytes.
> The exact limits are:
>
> 0x000001ffffffefff max offset
> 0x000001fffffff000 max size
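
The block counts quoted above can be re-added mechanically. The following is a minimal sketch, assuming 4 kB blocks and 4-byte block pointers as in the quoted text; the variable names are illustrative only and do not come from the ext2 sources.

```python
# Block counts quoted above for ext2 with 4 kB blocks and 4-byte block pointers.
BLOCK_SIZE = 0x1000      # 4 kB
direct = 0xC             # blocks addressed straight from the inode
single = 0x400           # one block of pointers: 4096 / 4
double = single ** 2     # block of pointers to blocks of pointers: 0x100000
triple = single ** 3     # one more level of indirection: 0x40000000

total_blocks = direct + single + double + triple
total_bytes = total_blocks * BLOCK_SIZE

print(hex(total_blocks))                      # 0x4010040c
print(hex(total_bytes))                       # 0x4010040c000
print(total_bytes // 10**9, "GB")             # 4402 GB
print(round(total_bytes / 10**12, 1), "TB")   # 4.4 TB
```

This only reproduces the triple-indirection arithmetic; it says nothing about the separate 512-byte-unit limit mentioned in the quote.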

I would also have to add another footnote to this, if people start
talking about limits on 64-bit and >4kB page size systems. ext2/3 can
support multiple block sizes (limited by the hardware page size), and
support for larger block sizes has only been restricted for
cross-platform compatibility reasons.

Now that larger page sizes are becoming more common, support for up
to 16kB block sizes has already been added to e2fsprogs, and will only
need a 1-line change in the kernel to be supported. The choice of 16kB
pages as the limit is also somewhat arbitrary, and could be increased
again in the future, as needed.

Having 16kB block size would allow a maximum of 64TB for a single
filesystem. The per-file limit would be over 256TB.
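
As a rough check of these figures, here is a minimal sketch assuming 32-bit block numbers, 4-byte block pointers, and 12 direct blocks in the inode. The function names are hypothetical. The per-file value covers only what the indirect block map can reach; other kernel limits may cap real files lower.

```python
POINTER_SIZE = 4        # bytes per block pointer (assumed 32-bit block numbers)
MAX_BLOCKS = 2 ** 32    # number of blocks a 32-bit block index can address
TIB = 2 ** 40           # "TB" in the text above is used in this binary sense

def max_filesystem_bytes(block_size):
    """Filesystem size limit implied by 32-bit block numbers."""
    return MAX_BLOCKS * block_size

def max_mapped_file_bytes(block_size, direct=12):
    """Bytes reachable via direct, single, double and triple indirection."""
    per_block = block_size // POINTER_SIZE
    blocks = direct + per_block + per_block ** 2 + per_block ** 3
    return blocks * block_size

for bs in (4096, 16384):
    fs_tib = max_filesystem_bytes(bs) // TIB
    file_tib = max_mapped_file_bytes(bs) / TIB
    print(f"{bs} byte blocks: filesystem {fs_tib} TiB, file map {file_tib:.1f} TiB")
```

With 4096-byte blocks this gives the 16TB filesystem figure, and with 16384-byte blocks the 64TB figure and a block-map reach comfortably above 256TB.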

In reality, we will probably implement extent-based allocation for
ext3 when we start getting into filesystems that large, which has been
discussed among the ext2/ext3 developers already. We could also go to
a clustered filesystem like Lustre, which can span a large number of
separate filesystems (and hosts).

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/