From: Eric Sandeen <sandeen@redhat.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: Re: [PATCH 1/4] e2fsck: Discard only unused parts of inode table
Date: Mon, 05 Mar 2012 11:45:34 -0600 [thread overview]
Message-ID: <4F54FBBE.8060303@redhat.com> (raw)
In-Reply-To: <1330933776-2696-1-git-send-email-lczerner@redhat.com>
On 03/05/2012 01:49 AM, Lukas Czerner wrote:
> When calling e2fsck with '-E discard' option it might happen that
> valid inodes are discarded accidentally. This is because we just
> discard the part of inode table which lies past the free inode count.
> This is terribly wrong (sorry!).
>
> This patch fixes it so only the free parts of an inode table
> is discarded, leaving used inodes intact. This was tested with highly
> fragmented inode tables with block size 4k and 1k.
Every time I look at this I have new comments. The surrounding code
is confusing to read, IMHO, so don't take it too hard. ;)
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Reported-by: Phillip Susi <psusi@ubuntu.com>
I'll go ahead and:
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
so hopefully we can get this in & fix the bug.
but I have one more suggestion, which could be done as a cleanup later
on (I think the check_*_bitmaps could use a fair bit of cleanup for
clarity):
/*
* If the last inode is free, we can discard it as well.
*/
if (inodes >= first_free)
e2fsck_discard_inodes(ctx, group, first_free,
inodes - first_free + 1);
could/should probably be something like:
if (inodes == first_free)
e2fsck_discard_inodes(ctx, group, inodes, 1);
I _think_ this case should only, ever, handle the last inode in the
group, right?
-Eric
> ---
> e2fsck/pass5.c | 90 +++++++++++++++++++++++++++++++++++++++++--------------
> 1 files changed, 67 insertions(+), 23 deletions(-)
>
> diff --git a/e2fsck/pass5.c b/e2fsck/pass5.c
> index 1e836e3..ee73dd5 100644
> --- a/e2fsck/pass5.c
> +++ b/e2fsck/pass5.c
> @@ -94,6 +94,52 @@ static void e2fsck_discard_blocks(e2fsck_t ctx, io_manager manager,
> ctx->options &= ~E2F_OPT_DISCARD;
> }
>
> +/*
> + * This will try to discard number 'count' inodes starting at
> + * inode number 'start' within the 'group'. Note that 'start'
> + * is 1-based, it means that we need to adjust it by -1 in this
> + * function to compute right offset in the particular inode table.
> + */
> +static void e2fsck_discard_inodes(e2fsck_t ctx, int group,
> + int start, int count)
> +{
> + ext2_filsys fs = ctx->fs;
> + blk64_t blk, num;
> + int orig = count;
> +
> + /*
> + * Sanity check for 'start'
> + */
> + if ((start < 1) || (start > EXT2_INODES_PER_GROUP(fs->super))) {
> + printf("PROGRAMMING ERROR: Got start %d outside of group %d!"
> + " Disabling discard\n",
> + start, group);
> + ctx->options &= ~E2F_OPT_DISCARD;
> + }
> +
> + if ((ctx->options & E2F_OPT_NO) || !(ctx->options & E2F_OPT_DISCARD))
> + return;
> +
> + /*
> + * Start is inode number within the group which starts
> + * counting from 1, so we need to adjust it.
> + */
> + start -= 1;
> +
> + /*
> + * We can discard only blocks containing only unused
> + * inodes in the table.
> + */
> + blk = DIV_ROUND_UP(start,
> + EXT2_INODES_PER_BLOCK(fs->super));
> + count -= (blk * EXT2_INODES_PER_BLOCK(fs->super) - start);
> + blk += ext2fs_inode_table_loc(fs, group);
> + num = count / EXT2_INODES_PER_BLOCK(fs->super);
> +
> + if (num > 0)
> + e2fsck_discard_blocks(ctx, fs->io->manager, blk, num);
> +}
> +
> #define NO_BLK ((blk64_t) -1)
>
> static void print_bitmap_problem(e2fsck_t ctx, int problem,
> @@ -435,6 +481,7 @@ static void check_inode_bitmaps(e2fsck_t ctx)
> int skip_group = 0;
> int redo_flag = 0;
> io_manager manager = ctx->fs->io->manager;
> + unsigned long long first_free = fs->super->s_inodes_per_group + 1;
>
> clear_problem_context(&pctx);
> free_array = (int *) e2fsck_allocate_memory(ctx,
> @@ -497,6 +544,7 @@ redo_counts:
> * are 0, count the free inode,
> * skip the current block group.
> */
> + first_free = 1;
> inodes = fs->super->s_inodes_per_group - 1;
> group_free = inodes;
> free_inodes += inodes;
> @@ -561,50 +609,46 @@ redo_counts:
> ctx->options &= ~E2F_OPT_DISCARD;
>
> do_counts:
> + inodes++;
> if (bitmap) {
> if (ext2fs_test_inode_bitmap2(ctx->inode_dir_map, i))
> dirs_count++;
> + if (inodes > first_free) {
> + e2fsck_discard_inodes(ctx, group, first_free,
> + inodes - first_free);
> + first_free = fs->super->s_inodes_per_group + 1;
> + }
> } else if (!skip_group || csum_flag) {
> group_free++;
> free_inodes++;
> + if (first_free > inodes)
> + first_free = inodes;
> }
>
> - inodes++;
> if ((inodes == fs->super->s_inodes_per_group) ||
> (i == fs->super->s_inodes_count)) {
> -
> - free_array[group] = group_free;
> - dir_array[group] = dirs_count;
> -
> - /* Discard inode table */
> - if (ctx->options & E2F_OPT_DISCARD) {
> - blk64_t used_blks, blk, num;
> -
> - used_blks = DIV_ROUND_UP(
> - (EXT2_INODES_PER_GROUP(fs->super) -
> - group_free),
> - EXT2_INODES_PER_BLOCK(fs->super));
> -
> - blk = ext2fs_inode_table_loc(fs, group) +
> - used_blks;
> - num = fs->inode_blocks_per_group -
> - used_blks;
> - e2fsck_discard_blocks(ctx, manager, blk, num);
> - }
> -
> + /*
> + * If the last inode is free, we can discard it as well.
> + */
> + if (inodes >= first_free)
> + e2fsck_discard_inodes(ctx, group, first_free,
> + inodes - first_free + 1);
> /*
> * If discard zeroes data and the group inode table
> * was not zeroed yet, set itable as zeroed
> */
> if ((ctx->options & E2F_OPT_DISCARD) &&
> - (io_channel_discard_zeroes_data(fs->io)) &&
> + io_channel_discard_zeroes_data(fs->io) &&
> !(ext2fs_bg_flags_test(fs, group,
> - EXT2_BG_INODE_ZEROED))) {
> + EXT2_BG_INODE_ZEROED))) {
> ext2fs_bg_flags_set(fs, group,
> EXT2_BG_INODE_ZEROED);
> ext2fs_group_desc_csum_set(fs, group);
> }
>
> + first_free = fs->super->s_inodes_per_group + 1;
> + free_array[group] = group_free;
> + dir_array[group] = dirs_count;
> group ++;
> inodes = 0;
> skip_group = 0;
next prev parent reply other threads:[~2012-03-05 17:45 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-05 7:49 [PATCH 1/4] e2fsck: Discard only unused parts of inode table Lukas Czerner
2012-03-05 7:49 ` [PATCH 2/4] e2fsck: Do not forget to discard last block group Lukas Czerner
2012-03-05 19:29 ` Eric Sandeen
2012-03-11 19:18 ` Ted Ts'o
2012-03-11 19:33 ` [PATCH 1/3] e2fsck: remove last argument from e2fsck_discard_blocks() Theodore Ts'o
2012-03-11 19:33 ` [PATCH 2/3] e2fsck: do not forget to discard last block group Theodore Ts'o
2012-03-12 7:31 ` Lukas Czerner
2012-03-11 19:33 ` [PATCH 3/3] e2fsck: optimize CPU usage in check_{block,inode}_bitmaps() Theodore Ts'o
2012-03-12 7:44 ` Lukas Czerner
2012-03-12 7:30 ` [PATCH 1/3] e2fsck: remove last argument from e2fsck_discard_blocks() Lukas Czerner
2012-03-05 7:49 ` [PATCH 3/4] e2fsck: Do not discard when in read only mode Lukas Czerner
2012-03-05 19:33 ` Eric Sandeen
2012-03-11 19:35 ` Ted Ts'o
2012-03-05 7:49 ` [PATCH 4/4] e2fsck: Do not discard itable if discard doen't zero data Lukas Czerner
2012-03-05 19:46 ` Eric Sandeen
2012-03-11 19:39 ` Ted Ts'o
2012-03-05 17:45 ` Eric Sandeen [this message]
2012-03-05 18:43 ` [PATCH 1/4] e2fsck: Discard only unused parts of inode table Lukas Czerner
2012-03-05 19:01 ` Eric Sandeen
2012-03-11 19:27 ` Ted Ts'o
2012-03-12 7:26 ` Lukas Czerner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F54FBBE.8060303@redhat.com \
--to=sandeen@redhat.com \
--cc=lczerner@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).