From: Andreas Dilger <adilger@sun.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Solofo.Ramangalahy@bull.net, linux-ext4@vger.kernel.org
Subject: Re: [RFC 0/2] ext4: zero uninitialized inode tables
Date: Tue, 25 Nov 2008 01:35:33 -0700
Message-ID: <20081125083533.GS3186@webber.adilger.int>
In-Reply-To: <20081125053226.GE20928@mit.edu>

On Nov 25, 2008  00:32 -0500, Theodore Ts'o wrote:
> I would recommend doing the first 32k of the inode table
> first, and once it completes, you can update inode_bg_unavaile so that
> an additional (32k / EXT4_INODE_SIZE(sb)) inodes are available.

I agree with everything Ted says, though I would zero the itable in
chunks of 64kB or even 128kB, for two reasons.  First, 64kB is the
maximum blocksize for the filesystem, and it doesn't make sense to zero
less than a whole block at once.  Second, 64kB is more likely to match
the internal track size of spinning disks, and 128kB is more likely to
match the erase block size of SSDs.
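
To make the arithmetic concrete, here is a rough user-space sketch (not
kernel code) of how a chunk size relates to the block size and to the
number of inodes that become usable per chunk, following Ted's
(chunk / EXT4_INODE_SIZE(sb)) formula.  The helper names are made up
for illustration and do not exist in ext4.

```c
#include <stddef.h>

/*
 * Hypothetical helper: round the requested zeroing chunk up to a whole
 * number of filesystem blocks, so that less than one block is never
 * zeroed at a time.
 */
size_t itable_chunk_bytes(size_t want_bytes, size_t block_size)
{
	size_t blocks = (want_bytes + block_size - 1) / block_size;

	return blocks * block_size;
}

/*
 * Hypothetical helper: number of inodes that can be made available
 * once one chunk of the inode table has been zeroed.
 */
size_t itable_chunk_inodes(size_t chunk_bytes, size_t inode_size)
{
	return chunk_bytes / inode_size;
}
```

With a 64kB blocksize a 32kB request rounds up to a full 64kB block,
and a 64kB chunk of 256-byte inodes frees up 256 inodes at a time.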

> In terms of how quickly the itable initializer should work, in between
> each block group, as we discussed on the call, the simplest thing for
> it to do is to wait for some time period to go by (say, 5 seconds) before
> working on the next block group.  The next, slightly more complicated
> scheme would be to set a "last ext4 operation time" field in
> EXT4_SB(sb) which is set any time the ext4 code paths are entered

That would be "s_wtime", which already exists in the on-disk superblock.
It wouldn't kill us to update it occasionally in ext4, though without
writing it to disk every time.

> (basically, any function in ext4's inode operations, super operations
> or file operations).  The itable initializer would sample that time,
> and before starting to initialize the next block group where
> BG_ITABLE_ZERO is not set, it would check the last ext4 operation time
> field, and if there had been an ext4 operation in the last 5 seconds,
> it would sleep 5 seconds and check again.

Well, I'd say if it has slept 5s then it should submit a block regardless
of whether the filesystem was in use or not.  Otherwise the itable may
never be zeroed out if the filesystem is always in use.  Adding a rare
64kB write to disk is unlikely to hurt anything, and if people REALLY care
about it they can avoid formatting with "lazy_itable_init".
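
The policy I have in mind can be sketched roughly as below.  This is a
user-space model, not kernel code: the function name, the constant and
the "already_slept" flag are assumptions made up for illustration, and
times are plain seconds.

```c
#include <stdbool.h>

/* The 5 second figure from the discussion above. */
#define ITABLE_IDLE_SECS 5

/*
 * Hypothetical helper: how long the initializer should sleep before
 * submitting the next chunk.
 *
 * - If the filesystem has been idle for the whole interval, go now.
 * - If it is busy, sleep one interval.
 * - If we have already slept one interval, go now regardless of
 *   activity, so a busy filesystem cannot starve the zeroing forever.
 */
unsigned int itable_delay_secs(long now, long last_fs_op, bool already_slept)
{
	if (already_slept)
		return 0;
	if (now - last_fs_op >= ITABLE_IDLE_SECS)
		return 0;
	return ITABLE_IDLE_SECS;
}
```

The worst case is then one 64kB write per 5 seconds on a filesystem
that is never idle, and zeroing still completes in bounded time.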

> This would prevent the itable initializer from running if the filesystem
> is in use, although it will not detect the case where there is a lot
> of mmap'ed I/O going on, but no other ext4 operations.

Wouldn't even mmap operations cause some ext4 methods to be called?

> In the long run, we would really want some kind of I/O activity
> indication from the block device elevator, but that would require
> changes to the core kernel, and the last ext4 operation time is almost
> just as good.

Alternatively, we could check the journal tid?
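
One possible reading of that idea, as a user-space sketch only: sample
the transaction ID, and treat the filesystem as active if the ID has
advanced by the next check.  The function name is made up for
illustration; the wraparound-safe comparison is modelled on how jbd2
compares tids, and it assumes 32-bit transaction IDs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the journal transaction ID has moved
 * forward since it was sampled.  The subtraction is done in unsigned
 * arithmetic and then interpreted as signed, so the result stays
 * correct when the 32-bit counter wraps around.
 */
bool journal_tid_advanced(uint32_t sampled_tid, uint32_t current_tid)
{
	return (int32_t)(current_tid - sampled_tid) > 0;
}
```

This only sees activity that goes through the journal, so it would
share the blind spot of any scheme that watches metadata updates.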

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Thread overview: 15+ messages
2008-11-21 10:23 [RFC 0/2] ext4: zero uninitialized inode tables Solofo.Ramangalahy
2008-11-21 10:23 ` [RFC 1/2] ext4 resize: Mark the added group with EXT4_BG_INODE_ZEROED flag Solofo.Ramangalahy
2008-11-24 23:25   ` Andreas Dilger
2008-11-25 11:27     ` Solofo.Ramangalahy
2008-11-25 21:18       ` Andreas Dilger
2008-11-27  4:50   ` Theodore Tso
2008-11-27  9:30     ` Solofo.Ramangalahy
2008-11-27 22:35       ` Theodore Tso
2008-11-27 23:09         ` Andreas Dilger
2008-11-21 10:23 ` [RFC 2/2] ext4: module to initialize the inode table when using mkfs option lazy_itable_init Solofo.Ramangalahy
2008-11-25  5:32 ` [RFC 0/2] ext4: zero uninitialized inode tables Theodore Tso
2008-11-25  8:35   ` Andreas Dilger [this message]
2008-11-25 12:28   ` Solofo.Ramangalahy
2008-11-25 18:52     ` Theodore Tso
2008-11-25 21:10     ` Andreas Dilger