* [RFC PATCH] ext4: validate number of meta clusters in group
@ 2016-07-01 13:06 Vegard Nossum
2016-07-02 7:49 ` Darrick J. Wong
0 siblings, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2016-07-01 13:06 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Ext4 Developers List, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 1149 bytes --]
Hi,
I've found that sbi->s_es->s_reserved_gdt_blocks is not validated before
being used, so e.g. a value of 25600 will overflow the buffer head and
corrupt random kernel memory (I've observed 20+ different stack traces
due to this bug! many of them long after the code has finished).
The following patch fixes it for me:
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..1ea5054 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -208,6 +208,9 @@ static int ext4_init_block_bitmap(struct super_block
*sb,
memset(bh->b_data, 0, sb->s_blocksize);
bit_max = ext4_num_base_meta_clusters(sb, block_group);
+ if ((bit_max >> 3) >= bh->b_size)
+ return -EFSCORRUPTED;
+
for (bit = 0; bit < bit_max; bit++)
ext4_set_bit(bit, bh->b_data);
However, I think there are potentially more bugs later in this function
where offsets are not validated so it needs to be reviewed carefully.
Another question is whether we should do the validation earlier, e.g. in
ext4_fill_super(). I'm not too familiar with the code, but maybe
something like the attached patch would be better? It does seem to fix
the issue as well.
Vegard
[-- Attachment #2: 0001-ext4-validate-number-of-base-meta-clusters-in-group.patch --]
[-- Type: text/x-patch, Size: 2567 bytes --]
>From efcee80eb78816a4d495224ffc624adf04217044 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@oracle.com>
Date: Fri, 1 Jul 2016 15:03:39 +0200
Subject: [PATCH] ext4: validate number of base meta clusters in group
---
fs/ext4/balloc.c | 4 +---
fs/ext4/ext4.h | 2 ++
fs/ext4/super.c | 10 ++++++++++
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..ec03f01 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -22,8 +22,6 @@
#include <trace/events/ext4.h>
-static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
- ext4_group_t block_group);
/*
* balloc.c contains the blocks allocation and deallocation routines
*/
@@ -817,7 +815,7 @@ unsigned long ext4_bg_num_gdb(struct super_block *sb, ext4_group_t group)
* This function returns the number of file system metadata clusters at
* the beginning of a block group, including the reserved gdt blocks.
*/
-static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
+unsigned ext4_num_base_meta_clusters(struct super_block *sb,
ext4_group_t block_group)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b84aa1ca..e492b0b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2270,6 +2270,8 @@ extern ext4_grpblk_t ext4_block_group_offset(struct super_block *sb,
extern int ext4_bg_has_super(struct super_block *sb, ext4_group_t group);
extern unsigned long ext4_bg_num_gdb(struct super_block *sb,
ext4_group_t group);
+extern unsigned ext4_num_base_meta_clusters(struct super_block *sb,
+ ext4_group_t block_group);
extern ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
ext4_fsblk_t goal,
unsigned int flags,
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 563555e..b5a9d28 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3641,6 +3641,16 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
}
sbi->s_groups_count = blocks_count;
+
+ for (i = 0; i < sbi->s_groups_count; ++i) {
+ int bit_max = ext4_num_base_meta_clusters(sb, i);
+ if ((bit_max >> 3) >= sb->s_blocksize) {
+ ext4_msg(sb, KERN_WARNING, "meta cluster base for "
+ "group %u exceeds block size", i);
+ goto failed_mount;
+ }
+ }
+
sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
(EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
db_count = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
--
1.9.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] ext4: validate number of meta clusters in group
2016-07-01 13:06 [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
@ 2016-07-02 7:49 ` Darrick J. Wong
2016-07-06 0:03 ` [PATCH] ext4: validate s_reserved_gdt_blocks on mount Theodore Ts'o
2016-07-07 20:10 ` [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
0 siblings, 2 replies; 7+ messages in thread
From: Darrick J. Wong @ 2016-07-02 7:49 UTC (permalink / raw)
To: Vegard Nossum; +Cc: Theodore Ts'o, Ext4 Developers List, linux-fsdevel
On Fri, Jul 01, 2016 at 03:06:41PM +0200, Vegard Nossum wrote:
> Hi,
>
> I've found that sbi->s_es->s_reserved_gdt_blocks is not validated before
> being used, so e.g. a value of 25600 will overflow the buffer head and
> corrupt random kernel memory (I've observed 20+ different stack traces
> due to this bug! many of them long after the code has finished).
This function merely initializes a bitmap block once ext4 decides to start
using the block group, which could be a long time after mount...
> The following patch fixes it for me:
>
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 3020fd7..1ea5054 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -208,6 +208,9 @@ static int ext4_init_block_bitmap(struct super_block
> *sb,
> memset(bh->b_data, 0, sb->s_blocksize);
>
> bit_max = ext4_num_base_meta_clusters(sb, block_group);
> + if ((bit_max >> 3) >= bh->b_size)
> + return -EFSCORRUPTED;
> +
> for (bit = 0; bit < bit_max; bit++)
> ext4_set_bit(bit, bh->b_data);
>
> However, I think there are potentially more bugs later in this function
> where offsets are not validated so it needs to be reviewed carefully.
>
> Another question is whether we should do the validation earlier, e.g. in
> ext4_fill_super(). I'm not too familiar with the code, but maybe
> something like the attached patch would be better? It does seem to fix
> the issue as well.
...whereas superblock fields such as s_reserved_gdt_blocks ought to be
checked (and -EFSCORRUPTED'ed) at mount time. Since we know that BG 0
has everything (superblock, old school gdt tables, inodes), maybe we
should make mount check that ext4_num_base_meta_clusters() >
NBBY * fs->block_size?
(Kinda surprised ext4 doesn't already do this...)
--D
>
>
> Vegard
> From efcee80eb78816a4d495224ffc624adf04217044 Mon Sep 17 00:00:00 2001
> From: Vegard Nossum <vegard.nossum@oracle.com>
> Date: Fri, 1 Jul 2016 15:03:39 +0200
> Subject: [PATCH] ext4: validate number of base meta clusters in group
>
> ---
> fs/ext4/balloc.c | 4 +---
> fs/ext4/ext4.h | 2 ++
> fs/ext4/super.c | 10 ++++++++++
> 3 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 3020fd7..ec03f01 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -22,8 +22,6 @@
>
> #include <trace/events/ext4.h>
>
> -static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
> - ext4_group_t block_group);
> /*
> * balloc.c contains the blocks allocation and deallocation routines
> */
> @@ -817,7 +815,7 @@ unsigned long ext4_bg_num_gdb(struct super_block *sb, ext4_group_t group)
> * This function returns the number of file system metadata clusters at
> * the beginning of a block group, including the reserved gdt blocks.
> */
> -static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
> +unsigned ext4_num_base_meta_clusters(struct super_block *sb,
> ext4_group_t block_group)
> {
> struct ext4_sb_info *sbi = EXT4_SB(sb);
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index b84aa1ca..e492b0b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2270,6 +2270,8 @@ extern ext4_grpblk_t ext4_block_group_offset(struct super_block *sb,
> extern int ext4_bg_has_super(struct super_block *sb, ext4_group_t group);
> extern unsigned long ext4_bg_num_gdb(struct super_block *sb,
> ext4_group_t group);
> +extern unsigned ext4_num_base_meta_clusters(struct super_block *sb,
> + ext4_group_t block_group);
> extern ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
> ext4_fsblk_t goal,
> unsigned int flags,
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 563555e..b5a9d28 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3641,6 +3641,16 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
> goto failed_mount;
> }
> sbi->s_groups_count = blocks_count;
> +
> + for (i = 0; i < sbi->s_groups_count; ++i) {
> + int bit_max = ext4_num_base_meta_clusters(sb, i);
> + if ((bit_max >> 3) >= sb->s_blocksize) {
> + ext4_msg(sb, KERN_WARNING, "meta cluster base for "
> + "group %u exceeds block size", i);
> + goto failed_mount;
> + }
> + }
> +
> sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
> (EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
> db_count = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
> --
> 1.9.1
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] ext4: validate s_reserved_gdt_blocks on mount
2016-07-02 7:49 ` Darrick J. Wong
@ 2016-07-06 0:03 ` Theodore Ts'o
2016-07-07 20:10 ` [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
1 sibling, 0 replies; 7+ messages in thread
From: Theodore Ts'o @ 2016-07-06 0:03 UTC (permalink / raw)
To: Ext4 Developers List; +Cc: vegard.nossum, Theodore Ts'o, stable
If s_reserved_gdt_blocks is extremely large, it's possible for
ext4_init_block_bitmap(), which is called when ext4 sets up an
uninitialized block bitmap, to corrupt random kernel memory. Add the
same checks which e2fsck has --- it must never be larger than
blocksize / sizeof(__u32) --- and then add a backup check in
ext4_init_block_bitmap() in case the superblock gets modified after
the file system is mounted.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
fs/ext4/balloc.c | 3 +++
fs/ext4/super.c | 7 +++++++
2 files changed, 10 insertions(+)
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 0b8105b..799a92b 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -208,6 +208,9 @@ static int ext4_init_block_bitmap(struct super_block *sb,
memset(bh->b_data, 0, sb->s_blocksize);
bit_max = ext4_num_base_meta_clusters(sb, block_group);
+ if ((bit_max >> 3) >= bh->b_size)
+ return -EFSCORRUPTED;
+
for (bit = 0; bit < bit_max; bit++)
ext4_set_bit(bit, bh->b_data);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5664ee6..13c49af7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3416,6 +3416,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
}
+ if (le16_to_cpu(sbi->s_es->s_reserved_gdt_blocks) > (blocksize / 4)) {
+ ext4_msg(sb, KERN_ERR,
+ "Number of reserved GDT blocks insanely large: %d",
+ le16_to_cpu(sbi->s_es->s_reserved_gdt_blocks));
+ goto failed_mount;
+ }
+
if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
err = bdev_dax_supported(sb, blocksize);
if (err)
--
2.5.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] ext4: validate number of meta clusters in group
2016-07-02 7:49 ` Darrick J. Wong
2016-07-06 0:03 ` [PATCH] ext4: validate s_reserved_gdt_blocks on mount Theodore Ts'o
@ 2016-07-07 20:10 ` Vegard Nossum
2016-07-11 2:51 ` Theodore Ts'o
1 sibling, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2016-07-07 20:10 UTC (permalink / raw)
To: Darrick J. Wong, Theodore Ts'o; +Cc: Ext4 Developers List, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 2218 bytes --]
On 07/02/2016 09:49 AM, Darrick J. Wong wrote:
> On Fri, Jul 01, 2016 at 03:06:41PM +0200, Vegard Nossum wrote:
>> Hi,
>>
>> I've found that sbi->s_es->s_reserved_gdt_blocks is not validated before
>> being used, so e.g. a value of 25600 will overflow the buffer head and
>> corrupt random kernel memory (I've observed 20+ different stack traces
>> due to this bug! many of them long after the code has finished).
>
> This function merely initializes a bitmap block once ext4 decides to start
> using the block group, which could be a long time after mount...
>
>> The following patch fixes it for me:
>>
>> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
>> index 3020fd7..1ea5054 100644
>> --- a/fs/ext4/balloc.c
>> +++ b/fs/ext4/balloc.c
>> @@ -208,6 +208,9 @@ static int ext4_init_block_bitmap(struct super_block
>> *sb,
>> memset(bh->b_data, 0, sb->s_blocksize);
>>
>> bit_max = ext4_num_base_meta_clusters(sb, block_group);
>> + if ((bit_max >> 3) >= bh->b_size)
>> + return -EFSCORRUPTED;
>> +
>> for (bit = 0; bit < bit_max; bit++)
>> ext4_set_bit(bit, bh->b_data);
>>
>> However, I think there are potentially more bugs later in this function
>> where offsets are not validated so it needs to be reviewed carefully.
>>
>> Another question is whether we should do the validation earlier, e.g. in
>> ext4_fill_super(). I'm not too familiar with the code, but maybe
>> something like the attached patch would be better? It does seem to fix
>> the issue as well.
>
> ...whereas superblock fields such as s_reserved_gdt_blocks ought to be
> checked (and -EFSCORRUPTED'ed) at mount time. Since we know that BG 0
> has everything (superblock, old school gdt tables, inodes), maybe we
> should make mount check that ext4_num_base_meta_clusters() >
> NBBY * fs->block_size?
>
> (Kinda surprised ext4 doesn't already do this...)
I ran into a second problem (this time it was num_clusters_in_group()
returning a bogus value) with the same symptoms (random memory
corruptions), the new attached patch fixes both problems by checking the
values at mount time.
I'm using (bit_max >> 3) >= sb->s_blocksize which should be equivalent
to what you suggested (num_clusters() > NBBY * fs->block_size).
Vegard
[-- Attachment #2: 0001-ext4-validate-number-of-clusters-in-group.patch --]
[-- Type: text/x-patch, Size: 4425 bytes --]
>From 2e2fc0c40d604e88748bba8d440763249c55b391 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@oracle.com>
Date: Fri, 1 Jul 2016 15:03:39 +0200
Subject: [PATCH] ext4: validate number of clusters in group
I've found that sbi->s_es->s_reserved_gdt_blocks is not validated before
being used, so e.g. a value of 25600 will overflow the buffer head and
corrupt random kernel memory (I've observed 20+ different stack traces
due to these bugs, many of them long after the code has finished).
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
fs/ext4/balloc.c | 12 +++++-------
fs/ext4/ext4.h | 4 ++++
fs/ext4/super.c | 19 +++++++++++++++++++
3 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..5a5c679 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -22,8 +22,6 @@
#include <trace/events/ext4.h>
-static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
- ext4_group_t block_group);
/*
* balloc.c contains the blocks allocation and deallocation routines
*/
@@ -155,8 +153,8 @@ static unsigned ext4_num_overhead_clusters(struct super_block *sb,
return num_clusters;
}
-static unsigned int num_clusters_in_group(struct super_block *sb,
- ext4_group_t block_group)
+unsigned int ext4_num_clusters_in_group(struct super_block *sb,
+ ext4_group_t block_group)
{
unsigned int blocks;
@@ -237,7 +235,7 @@ static int ext4_init_block_bitmap(struct super_block *sb,
* the blocksize * 8 ( which is the size of bitmap ), set rest
* of the block bitmap to 1
*/
- ext4_mark_bitmap_end(num_clusters_in_group(sb, block_group),
+ ext4_mark_bitmap_end(ext4_num_clusters_in_group(sb, block_group),
sb->s_blocksize * 8, bh->b_data);
ext4_block_bitmap_csum_set(sb, block_group, gdp, bh);
ext4_group_desc_csum_set(sb, block_group, gdp);
@@ -251,7 +249,7 @@ unsigned ext4_free_clusters_after_init(struct super_block *sb,
ext4_group_t block_group,
struct ext4_group_desc *gdp)
{
- return num_clusters_in_group(sb, block_group) -
+ return ext4_num_clusters_in_group(sb, block_group) -
ext4_num_overhead_clusters(sb, block_group, gdp);
}
@@ -817,7 +815,7 @@ unsigned long ext4_bg_num_gdb(struct super_block *sb, ext4_group_t group)
* This function returns the number of file system metadata clusters at
* the beginning of a block group, including the reserved gdt blocks.
*/
-static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
+unsigned ext4_num_base_meta_clusters(struct super_block *sb,
ext4_group_t block_group)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b84aa1ca..c325c95 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2270,6 +2270,10 @@ extern ext4_grpblk_t ext4_block_group_offset(struct super_block *sb,
extern int ext4_bg_has_super(struct super_block *sb, ext4_group_t group);
extern unsigned long ext4_bg_num_gdb(struct super_block *sb,
ext4_group_t group);
+extern unsigned ext4_num_base_meta_clusters(struct super_block *sb,
+ ext4_group_t block_group);
+extern unsigned int ext4_num_clusters_in_group(struct super_block *sb,
+ ext4_group_t block_group);
extern ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
ext4_fsblk_t goal,
unsigned int flags,
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 563555e..05ec0a7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3641,6 +3641,25 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
}
sbi->s_groups_count = blocks_count;
+
+ for (i = 0; i < sbi->s_groups_count; ++i) {
+ int bit_max;
+
+ bit_max = ext4_num_base_meta_clusters(sb, i);
+ if ((bit_max >> 3) >= sb->s_blocksize) {
+ ext4_msg(sb, KERN_WARNING, "meta cluster base for "
+ "group %u exceeds block size", i);
+ goto failed_mount;
+ }
+
+ bit_max = ext4_num_clusters_in_group(sb, i);
+ if ((bit_max >> 3) >= sb->s_blocksize) {
+ ext4_msg(sb, KERN_WARNING, "clusters in "
+ "group %u exceeds block size", i);
+ goto failed_mount;
+ }
+ }
+
sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
(EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
db_count = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
--
1.9.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] ext4: validate number of meta clusters in group
2016-07-07 20:10 ` [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
@ 2016-07-11 2:51 ` Theodore Ts'o
2016-07-11 18:50 ` Vegard Nossum
0 siblings, 1 reply; 7+ messages in thread
From: Theodore Ts'o @ 2016-07-11 2:51 UTC (permalink / raw)
To: Vegard Nossum; +Cc: Darrick J. Wong, Ext4 Developers List, linux-fsdevel
On Thu, Jul 07, 2016 at 10:10:40PM +0200, Vegard Nossum wrote:
>
> I ran into a second problem (this time it was num_clusters_in_group()
> returning a bogus value) with the same symptoms (random memory
> corruptions), the new attached patch fixes both problems by checking the
> values at mount time.
Can you give me a dumpe2fs -h of a file system that is causing
num_clusters_in_group() to be bogus?
I want to make sure I'm checking that correct base values, insteda of
doing a brute force loop over all of the block groups and calling
ext4_num_clusters_in_group() and ext4_num_base_meta_clusters() for all
block groups.
Thanks!!
- Ted
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] ext4: validate number of meta clusters in group
2016-07-11 2:51 ` Theodore Ts'o
@ 2016-07-11 18:50 ` Vegard Nossum
2016-07-11 20:30 ` Vegard Nossum
0 siblings, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2016-07-11 18:50 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Darrick J. Wong, Ext4 Developers List, linux-fsdevel
On 07/11/2016 04:51 AM, Theodore Ts'o wrote:
> On Thu, Jul 07, 2016 at 10:10:40PM +0200, Vegard Nossum wrote:
>>
>> I ran into a second problem (this time it was num_clusters_in_group()
>> returning a bogus value) with the same symptoms (random memory
>> corruptions), the new attached patch fixes both problems by checking the
>> values at mount time.
>
> Can you give me a dumpe2fs -h of a file system that is causing
> num_clusters_in_group() to be bogus?
>
> I want to make sure I'm checking that correct base values, insteda of
> doing a brute force loop over all of the block groups and calling
> ext4_num_clusters_in_group() and ext4_num_base_meta_clusters() for all
> block groups.
>
> Thanks!!
It's sbi->s_es->s_reserved_gdt_blocks:
$ dumpe2fs -h input/da33000e751e26880ba5c2ee31e871b99f3d12e4.full
dumpe2fs 1.42.9 (4-Feb-2014)
Filesystem volume name: <none>
Last mounted on: /home/vegard/kernel-fuzzing-v1.0-pre1/mnt
Filesystem UUID: c54d8f19-a95c-41b0-8e9a-2e612005ef76
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: dir_prealloc filetype extent flex_bg
sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: journal_data_writeback debug bsdgroups
user_xattr acl uid16 nobarrier block_validity discard nodelalloc
MNTOPT_12 MNTOPT_13 MNTOPT_14 MNTOPT_16 MNTOPT_17 MNTOPT_18 MNTOPT_19
MNTOPT_20 MNTOPT_21 MNTOPT_22 MNTOPT_24 MNTOPT_25 MNTOPT_26 MNTOPT_27
MNTOPT_28 MNTOPT_29 MNTOPT_30
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 4096
Block count: 16384
Reserved block count: 819
Free blocks: 0
Free inodes: 0
First block: 1
Block size: 1024
Fragment size: 1024
Reserved GDT blocks: 65535
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 2048
Inode blocks per group: 256
RAID stripe width: 3
First meta block group: 2139062143
Flex block group size: 16
Filesystem created: Tue Oct 13 17:55:43 2037
Last mount time: Mon Jul 11 20:27:01 2016
Last write time: Mon Jul 11 20:27:01 2016
Mount count: 2
Maximum mount count: -1
Last checked: Wed Jul 6 11:18:12 2016
Check interval: 0 (<none>)
Lifetime writes: 1213 kB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: HASHALG_127
Directory Hash Seed: 7f7f7f7f-7f7f-7f7f-7f7f-7f7f7f7f7f7f
Journal backup: type 127
With this patch:
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 3020fd7..87655c6 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -208,6 +208,8 @@ static int ext4_init_block_bitmap(struct super_block
*sb,
memset(bh->b_data, 0, sb->s_blocksize);
bit_max = ext4_num_base_meta_clusters(sb, block_group);
+ printk(KERN_ERR "kernel BUG: %llu > %llu\n", bit_max,
sb->s_blocksize * 8);
+ BUG_ON(bit_max > sb->s_blocksize * 8);
for (bit = 0; bit < bit_max; bit++)
ext4_set_bit(bit, bh->b_data);
I'm getting:
[EXT4 FS bs=1024, gc=2, bpg=8192, ipg=2048, mo=e000e02c, mo2=0002]
System zones: 1-2, 67-67, 98-609, 4179-4179, 5714-5714, 8193-8194
EXT4-fs (loop0): mounting with "discard" option, but the device does not
support discard
EXT4-fs (loop0): mounted filesystem without journal. Opts: errors=remount-ro
kernel BUG: 65537 > 8192
------------[ cut here ]------------
kernel BUG at fs/ext4/balloc.c:212!
invalid opcode: 0000 [#1]
CPU: 0 PID: 53 Comm: ext4.exe Not tainted 4.7.0-rc5+ #638
task: ffff8800003354c0 ti: ffff880000338000 task.ti: ffff880000338000
RIP: 0010:[<ffffffff810fbd20>] [<ffffffff810fbd20>]
ext4_read_block_bitmap_nowait+0x590/0x5a0
RSP: 0018:ffff88000033b8b8 EFLAGS: 00010202
RAX: 0000000000002000 RBX: ffff880000301800 RCX: ffffffff81629200
RDX: 0000000000010001 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffff88000033b8f8 R08: 0000000000000400 R09: 0000000000000006
R10: ffffffff8170f144 R11: 000000000000008d R12: ffff880000353800
R13: 0000000000000000 R14: 0000000000010001 R15: ffff8800070185b0
FS: 00007f8056697780(0000) GS:ffffffff81621000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000060b000 CR3: 000000000031d000 CR4: 00000000000006b0
Stack:
0000000002408840 0000000000000004 ffff880000301000 0000000000000000
ffff88000027c260 ffffea000000af70 0000000000000000 ffff880000301800
ffff88000033b980 ffffffff8112993d ffffea000000af70 ffffffff8107faed
Call Trace:
[<ffffffff8112993d>] ext4_mb_init_cache+0x14d/0x5e0
[<ffffffff8107faed>] ? add_to_page_cache_lru+0x7d/0xf0
[<ffffffff81129f15>] ext4_mb_init_group+0x145/0x270
[<ffffffff8112a5d8>] ext4_mb_load_buddy_gfp+0x408/0x480
[<ffffffff8112e595>] ext4_free_blocks+0x315/0x9c0
[<ffffffff8113236c>] ext4_clear_blocks+0x18c/0x260
[<ffffffff81132555>] ext4_free_data+0x115/0x160
[<ffffffff8113367a>] ext4_ind_truncate+0x2aa/0x330
[<ffffffff8112dc2d>] ? ext4_discard_preallocations+0x11d/0x380
[<ffffffff81049fd3>] ? __might_sleep+0x43/0x80
[<ffffffff81107795>] ext4_truncate+0x2a5/0x2f0
[<ffffffff8110844f>] ext4_direct_IO+0x4ff/0x5c0
[<ffffffff8108242f>] generic_file_direct_write+0x9f/0x150
[<ffffffff81082784>] __generic_file_write_iter+0xb4/0x1e0
[<ffffffff81049fd3>] ? __might_sleep+0x43/0x80
[<ffffffff810fdc8b>] ext4_file_write_iter+0x11b/0x320
[<ffffffff810be81b>] ? do_filp_open+0x8b/0xd0
[<ffffffff810b1faf>] __vfs_write+0xbf/0x120
[<ffffffff810b2115>] vfs_write+0xa5/0x110
[<ffffffff810b2274>] SyS_write+0x44/0xb0
[<ffffffff8129242f>] entry_SYSCALL_64_fastpath+0x1a/0xa4
Code: 28 48 d3 ea 48 63 d2 48 0f ab 10 e9 ba fe ff ff 4c 89 e6 48 89 df
4c 89 45 c8 45 31 f6 e8 f9 85 01 00 4c 8b 45 c8 48 89 c2 eb 95 <0f> 0b
66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41
RIP [<ffffffff810fbd20>] ext4_read_block_bitmap_nowait+0x590/0x5a0
RSP <ffff88000033b8b8>
---[ end trace 12713795a17c50f4 ]---
Hope this helps,
Vegard
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] ext4: validate number of meta clusters in group
2016-07-11 18:50 ` Vegard Nossum
@ 2016-07-11 20:30 ` Vegard Nossum
0 siblings, 0 replies; 7+ messages in thread
From: Vegard Nossum @ 2016-07-11 20:30 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Darrick J. Wong, Ext4 Developers List, linux-fsdevel
On 07/11/2016 08:50 PM, Vegard Nossum wrote:
> On 07/11/2016 04:51 AM, Theodore Ts'o wrote:
>> On Thu, Jul 07, 2016 at 10:10:40PM +0200, Vegard Nossum wrote:
>>>
>>> I ran into a second problem (this time it was num_clusters_in_group()
>>> returning a bogus value) with the same symptoms (random memory
>>> corruptions), the new attached patch fixes both problems by checking the
>>> values at mount time.
>>
>> Can you give me a dumpe2fs -h of a file system that is causing
>> num_clusters_in_group() to be bogus?
>>
>> I want to make sure I'm checking that correct base values, insteda of
>> doing a brute force loop over all of the block groups and calling
>> ext4_num_clusters_in_group() and ext4_num_base_meta_clusters() for all
>> block groups.
>>
>> Thanks!!
>
> It's sbi->s_es->s_reserved_gdt_blocks:
Durrr, no, it's not, I just realised you asked about
num_clusters_in_group() and not num_base_meta_clusters().
So I did the same thing for that and I tracked it down to
s_blocks_count_{lo,hi} both being 0, causing num_clusters_in_group() to
effectively return 0 - ext4_group_first_block_no(sb, block_group).
But dumpe2fs shows block count to be 16384, so I was a bit puzzled. I
set a breakpoint on s_blocks_count_lo and indeed it's being corrupted:
Hardware watchpoint 2: ((struct ext4_super_block *)
0x61e2c400)->s_blocks_count_lo
Old value = 16384
New value = 0
0x00000000602d9d59 in memset ()
(gdb) bt
#0 0x00000000602d9d59 in memset ()
#1 0x000000006010e944 in ext4_init_block_bitmap (...) at
fs/ext4/balloc.c:215
#2 ext4_read_block_bitmap_nowait (...) at fs/ext4/balloc.c:455
Curiously enough, that's this memset() in the same function:
memset(bh->b_data, 0, sb->s_blocksize);
Checking with some debug printks, it indeed seems like bh->b_data points
to the struct ext4_super_block (!):
&EXT4_SB(sb)->s_es->s_blocks_count_lo = 0000000063a3c404
bh->b_data = 0000000063a3c400
bh->b_size = 400
Well, you can disregard my patch for sure. I'm not sure how the bitmap
we're supposed to initialise ends up pointing to the ext4_super_block
though.
Vegard
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-07-11 20:30 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-07-01 13:06 [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
2016-07-02 7:49 ` Darrick J. Wong
2016-07-06 0:03 ` [PATCH] ext4: validate s_reserved_gdt_blocks on mount Theodore Ts'o
2016-07-07 20:10 ` [RFC PATCH] ext4: validate number of meta clusters in group Vegard Nossum
2016-07-11 2:51 ` Theodore Ts'o
2016-07-11 18:50 ` Vegard Nossum
2016-07-11 20:30 ` Vegard Nossum
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).