From: Li Zhong <lizhongfs@gmail.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, adilger.kernel@dilger.ca,
Lukas Czerner <lczerner@redhat.com>
Subject: Re: [RFC PATCH] ext4: Fix group number calculation
Date: Sun, 07 Jul 2013 08:14:01 +0800
Message-ID: <1373156041.14101.8.camel@ThinkPad-T5421>
In-Reply-To: <20130706031147.GB32062@thunk.org>
On Fri, 2013-07-05 at 23:11 -0400, Theodore Ts'o wrote:
> On Fri, Jul 05, 2013 at 05:29:48PM +0800, Li Zhong wrote:
> >
> > It seems that it is caused by different group numbers calculated from
> > ext4_get_group_number() and ext4_get_group_no_and_offset().
> >
> > The fix below tries to make ext4_get_group_number() consistent with
> > ext4_get_group_no_and_offset():
> > 1) minus first data block number instead of adding it
> > 2) remove the cluster bits, as seems it is used for offset calculation in
> > ext4_get_group_no_and_offset().
>
> Thanks for reporting the problem, but only (1) is a problem. The
> cluster bits part is right, although it wasn't actually ever going to
> be used because we were only setting STD_GROUP_SIZE for non-bigalloc
> file systems. The following is I believe the correct fix.
Ah, yes, it seems correct to fix the place where STD_GROUP_SIZE is set.
So the standard (default) number of blocks per group is always
clustersize * 8:
for non-bigalloc file systems, it equals blocksize * 8 (a cluster has
only 1 block, i.e. cluster_bits is 0)
for bigalloc ones, it equals blocksize * 2 ^ cluster_bits * 8
(a cluster has more than 1 block, namely 2 ^ cluster_bits)
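To double check the arithmetic, here is a small user-space sketch of the
"standard group size" rule described above. It is not kernel code: the
helper names std_blocks_per_group() and is_std_group_size() are made up
for illustration, and the parameters simply stand in for
EXT4_BLOCK_SIZE_BITS(sb) and EXT4_CLUSTER_BITS(sb).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): blocks per group for a
 * "standard" layout. One bitmap block holds blocksize * 8 bits, each bit
 * covers one cluster, and a cluster is 2^cluster_bits blocks, so the
 * result is clustersize * 8 with clustersize = blocksize << cluster_bits.
 */
static uint64_t std_blocks_per_group(unsigned block_size_bits,
                                     unsigned cluster_bits)
{
    uint64_t blocksize = 1ULL << block_size_bits;
    uint64_t clustersize = blocksize << cluster_bits;

    return clustersize << 3;
}

/*
 * Hypothetical helper: mirrors the idea of the check that decides whether
 * STD_GROUP_SIZE may be set, using the cluster size rather than the
 * block size.
 */
static bool is_std_group_size(uint64_t blocks_per_group,
                              unsigned block_size_bits,
                              unsigned cluster_bits)
{
    return blocks_per_group ==
           std_blocks_per_group(block_size_bits, cluster_bits);
}
```

With a 1k block size and no bigalloc this gives 8192 blocks per group;
with 4k blocks and cluster_bits = 4 it gives 4096 * 16 * 8 = 524288.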
Thanks, Zhong
>
> - Ted
>
> From 96eaeed0d7fc516481ed07739826dac840755eb3 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 5 Jul 2013 23:11:16 -0400
> Subject: [PATCH] ext4: fix ext4_get_group_number()
>
> The function ext4_get_group_number() was introduced as an optimization
> in commit bd86298e60b8. Unfortunately, this commit incorrectly
> calculates the group number for file systems with a 1k block size (when
> s_first_data_block is 1 instead of zero). This could cause the
> following kernel BUG:
>
> [ 568.877799] ------------[ cut here ]------------
> [ 568.877833] kernel BUG at fs/ext4/mballoc.c:3728!
> [ 568.877840] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 568.877845] SMP NR_CPUS=32 NUMA pSeries
> [ 568.877852] Modules linked in: binfmt_misc
> [ 568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809f-dirty #1
> [ 568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000
> [ 568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8
> [ 568.877879] REGS: c0000001fa956ed0 TRAP: 0700 Not tainted (3.10.0-03216-g7c6809f-dirty)
> [ 568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 24000428 XER: 00000000
> [ 568.877902] SOFTE: 1
> [ 568.877905] CFAR: c0000000002b5464
> [ 568.877908]
> GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000
> GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f
> GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000
> GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0
> GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672
> GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000
> GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288
> GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10
> [ 568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0
> [ 568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0
> [ 568.878004] Call Trace:
> [ 568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable)
> [ 568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444
> [ 568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734
> [ 568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4
> [ 568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178
> [ 568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490
> [ 568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60
> [ 568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80
> [ 568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0
> [ 568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc
> [ 568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4
> [ 568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80
> [ 568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c
> [ 568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90
> [ 568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c
> [ 568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c
> [ 568.878125] Instruction dump:
> [ 568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70
> [ 568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78
> [ 568.878160] ---[ end trace 594d911d9654770b ]---
>
> In addition fix the STD_GROUP optimization so that it works for
> bigalloc file systems as well.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Reported-by: Li Zhong <lizhongfs@gmail.com>
> Cc: Lukas Czerner <lczerner@redhat.com>
> Cc: stable@vger.kernel.org # 3.10
> ---
> fs/ext4/balloc.c | 4 ++--
> fs/ext4/super.c | 8 ++++----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 5833939..ddd715e 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -38,8 +38,8 @@ ext4_group_t ext4_get_group_number(struct super_block *sb,
> ext4_group_t group;
>
> if (test_opt2(sb, STD_GROUP_SIZE))
> - group = (le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) +
> - block) >>
> + group = (block -
> + le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block)) >>
> (EXT4_BLOCK_SIZE_BITS(sb) + EXT4_CLUSTER_BITS(sb) + 3);
> else
> ext4_get_group_no_and_offset(sb, block, &group, NULL);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 85b3dd6..8862d4d 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3624,10 +3624,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
> sbi->s_addr_per_block_bits = ilog2(EXT4_ADDR_PER_BLOCK(sb));
> sbi->s_desc_per_block_bits = ilog2(EXT4_DESC_PER_BLOCK(sb));
>
> - /* Do we have standard group size of blocksize * 8 blocks ? */
> - if (sbi->s_blocks_per_group == blocksize << 3)
> - set_opt2(sb, STD_GROUP_SIZE);
> -
> for (i = 0; i < 4; i++)
> sbi->s_hash_seed[i] = le32_to_cpu(es->s_hash_seed[i]);
> sbi->s_def_hash_version = es->s_def_hash_version;
> @@ -3697,6 +3693,10 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
> goto failed_mount;
> }
>
> + /* Do we have standard group size of clustersize * 8 blocks ? */
> + if (sbi->s_blocks_per_group == clustersize << 3)
> + set_opt2(sb, STD_GROUP_SIZE);
> +
> /*
> * Test whether we have more sectors than will fit in sector_t,
> * and whether the max offset is addressable by the page cache.
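
The effect of adding versus subtracting s_first_data_block in the quoted
balloc.c hunk can be reproduced in user space. The sketch below is only
an illustration under stated assumptions: struct fs_geom and the three
group_by_*() helpers are hypothetical names, the division variant is a
simplified stand-in for what ext4_get_group_no_and_offset() computes, and
the group size is assumed to be the standard clustersize * 8.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few superblock fields used here. */
struct fs_geom {
    uint32_t first_data_block; /* 1 for 1k block size, otherwise 0 */
    unsigned block_size_bits;  /* log2 of the block size in bytes */
    unsigned cluster_bits;     /* log2 of blocks per cluster */
};

/* Shift that corresponds to a standard group of clustersize * 8 blocks. */
static unsigned group_shift(const struct fs_geom *g)
{
    return g->block_size_bits + g->cluster_bits + 3;
}

/* Reference result: subtract the first data block, then divide.
 * Assumes block >= first_data_block. */
static uint32_t group_by_division(const struct fs_geom *g, uint64_t block)
{
    uint64_t blocks_per_group = 1ULL << group_shift(g);

    return (uint32_t)((block - g->first_data_block) / blocks_per_group);
}

/* The calculation before the fix: the first data block is added. */
static uint32_t group_by_shift_old(const struct fs_geom *g, uint64_t block)
{
    return (uint32_t)((g->first_data_block + block) >> group_shift(g));
}

/* The calculation after the fix: the first data block is subtracted. */
static uint32_t group_by_shift_fixed(const struct fs_geom *g, uint64_t block)
{
    return (uint32_t)((block - g->first_data_block) >> group_shift(g));
}
```

For a 1k block size file system (first_data_block = 1, 8192 blocks per
group), block 8192 is the last block of group 0. The division and the
fixed shift both return 0, while the old shift returns 1, which matches
the kind of group mismatch reported in this thread. With
first_data_block = 0 all three agree.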
Thread overview: 4+ messages
2013-07-05 9:29 [RFC PATCH] ext4: Fix group number calculation Li Zhong
2013-07-06 3:11 ` Theodore Ts'o
2013-07-07 0:14 ` Li Zhong [this message]
2013-07-08 6:25 ` Lukáš Czerner