linux-fsdevel.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@sun.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: device-mapper development <dm-devel@redhat.com>,
	Neil Brown <neilb@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: How to handle >16TB devices on 32 bit hosts ??
Date: Sat, 18 Jul 2009 02:52:13 -0400
Message-ID: <20090718065213.GK4231@webber.adilger.int>
In-Reply-To: <871voewm6y.fsf@basil.nowhere.org>

On Jul 18, 2009  08:16 +0200, Andi Kleen wrote:
> Andreas Dilger <adilger@sun.com> writes:
> > I think the point is that for those people who want to use > 16TB
> > devices on 32-bit platforms (e.g. embedded/appliance systems) the
> > choice is between "completely non-functional" and "uses a bit more
> > memory per page", and the answer is pretty obvious.
> 
> It's not just more memory per page, but also worse code all over the
> VM. long long 32bit code is generally rather bad, especially on
> register constrained x86.

If you aren't running a 32-bit system with this config, you shouldn't
really care.  The people who need to run in this mode would rather
have it work a few percent slower than not work at all.
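
For context, the 16TB figure in the subject line follows from simple
arithmetic, sketched below. It assumes a page cache index that is 32 bits
wide on 32-bit hosts and a 4 KiB page size; the function name and the
PAGE_SHIFT constant are illustrative only, not kernel code.

```python
# Assumption: 4 KiB pages, i.e. a page shift of 12 bits.
PAGE_SHIFT = 12


def max_addressable_bytes(index_bits, page_shift=PAGE_SHIFT):
    """Largest byte range reachable with a page index of index_bits bits."""
    return (1 << index_bits) << page_shift


TIB = 1 << 40
limit_32 = max_addressable_bytes(32)  # 2**44 bytes, i.e. 16 TiB
limit_64 = max_addressable_bytes(64)  # far beyond any current device
```

Widening the index to 64 bits removes the limit, at the cost of the extra
memory per page and the 64-bit arithmetic on 32-bit CPUs discussed above.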

> But I think the fsck problem is a show stopper here anyways.
> Enabling a setup that cannot handle IO errors wouldn't 
> be really a good idea.
> 
> In fact this problem already hits before 16TB on 32bit.

The e2fsck code is currently just starting to get > 16TB support,
and while the initial implementation is naive, we are definitely
planning on reducing the memory needed to check very large devices.

The last test numbers I saw were 5GB of RAM for a 20TB filesystem,
but since the bitmaps used are fully-allocated arrays that isn't
surprising.  We are planning to replace this with a tree, since the
majority of bitmaps used by e2fsck have large contiguous ranges of
set or unset bits and can be represented much more efficiently.
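
To illustrate why runs of identical bits compress so well, here is a
minimal sketch. It is not the e2fsprogs implementation: it uses a sorted
list of runs instead of a tree, and the 4 KiB block size, the reading of
"20TB" as 20 TiB, and all names are assumptions made for the example.

```python
import bisect

BLOCK_SIZE = 4096  # assumed filesystem block size


def flat_bitmap_bytes(device_bytes, block_size=BLOCK_SIZE):
    """Memory for one fully-allocated bitmap with one bit per block."""
    blocks = device_bytes // block_size
    return (blocks + 7) // 8


class RangeBitmap:
    """Set bits stored as sorted, non-overlapping [start, end) runs."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def set_range(self, start, end):
        """Mark blocks start..end-1 as set, merging touching runs."""
        if start >= end:
            return
        # First run that ends at or after start (overlaps or touches).
        lo = bisect.bisect_left(self.ends, start)
        # One past the last run that begins at or before end.
        hi = bisect.bisect_right(self.starts, end)
        if lo < hi:
            start = min(start, self.starts[lo])
            end = max(end, self.ends[hi - 1])
        # Replace the merged runs (or insert, when lo == hi).
        self.starts[lo:hi] = [start]
        self.ends[lo:hi] = [end]

    def test(self, block):
        """Return True if the given block number is set."""
        i = bisect.bisect_right(self.starts, block) - 1
        return i >= 0 and block < self.ends[i]

    def run_count(self):
        return len(self.starts)
```

Under these assumptions flat_bitmap_bytes(20 * 2**40) comes to 640 MiB for
a single bitmap, while a RangeBitmap holding one contiguous run over the
same device stores just one (start, end) pair.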

> Unless people rewrite fsck to use /dev/shm >4GB swapping
> (or perhaps use JFS which iirc had a way to use the file system
> itself as fsck scratch space)

I'm guessing that such systems won't have a 20TB boot device, but
rather a small flash boot/swap device (a few GB is cheap) and then
they could swap, if strictly necessary.

Also, for filesystems like btrfs or ZFS the checking can be done
online and incrementally without storing a full representation of
the state in memory.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

