* [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack
@ 2019-03-13 8:55 Qu Wenruo
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
` (6 more replies)
0 siblings, 7 replies; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs
Thanks for the report from Yoon Jungyeon <jungyeon@gatech.edu>, we have
more fuzzed image to torture btrfs.
Those images exposed the following problems:
- Chunk check is not comprehensive nor early enough
Chunk item check lacks profile bits check (e.g RAID|DUP profile is
invalid).
And for certain fuzzed image, the other copy can be valid, current
check timming is after tree block read, so no way to retry the other
copy.
Address the check timing in the 1st patch, while for the profile bits,
check it in the 4th patch.
- Lack of device item check
Address it in the 2nd patch.
- First key and level check be exploited by cached extent buffer
Cached bad extent buffer can avoid first key and level check.
This is addressed in the 3rd patch.
- Inode type mismatch can lead to NULL dereference in endio function
If an inode claims itself as symlink but still has regular file
extent, then endio function will cause NULL pointer dereference.
Fix it by do extra inode mode and dir item type cross check, at
get_extent() time and inode lookup time.
Addressed in the 5th and 6th patch.
Qu Wenruo (6):
btrfs: tree-checker: Verify chunk items
btrfs: tree-checker: Verify dev item
btrfs: Check the first key and level for cached extent buffer
btrfs: tree-checker: Enhance chunk checker to validate chunk profiler
btrfs: tree-checker: Verify inode item
btrfs: inode: Verify inode mode to avoid NULL pointer dereference
fs/btrfs/ctree.c | 10 +
fs/btrfs/ctree.h | 2 +
fs/btrfs/disk-io.c | 10 +-
fs/btrfs/disk-io.h | 3 +
fs/btrfs/inode.c | 38 +++-
fs/btrfs/tests/inode-tests.c | 1 +
fs/btrfs/tree-checker.c | 342 +++++++++++++++++++++++++++++++++++
fs/btrfs/tree-checker.h | 3 +
fs/btrfs/volumes.c | 103 +----------
fs/btrfs/volumes.h | 9 +
10 files changed, 406 insertions(+), 115 deletions(-)
--
2.21.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 1/6] btrfs: tree-checker: Verify chunk items
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:19 ` Nikolay Borisov
2019-03-19 14:50 ` David Sterba
2019-03-13 8:55 ` [PATCH 2/6] btrfs: tree-checker: Verify dev item Qu Wenruo
` (5 subsequent siblings)
6 siblings, 2 replies; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs; +Cc: Yoon Jungyeon
We already have btrfs_check_chunk_valid() to verify each chunk before
tree-checker.
Merge that function into tree-checker, and update its error message to
be more readable.
Old error message would be something like:
BTRFS error (device dm-3): invalid chunk num_stipres: 0
New error message would be:
Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
Or
Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0
Btrfs_check_chunk_valid() is exported for super block syschunk array
verification.
Also make tree-checker to verify chunk items, this makes chunk item
checker covers all chunks and avoid fuzzed image.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202751
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/tree-checker.c | 152 ++++++++++++++++++++++++++++++++++++++++
fs/btrfs/tree-checker.h | 3 +
fs/btrfs/volumes.c | 94 +------------------------
3 files changed, 156 insertions(+), 93 deletions(-)
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index b8cdaf472031..fe3f37c23c29 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -448,6 +448,152 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
return 0;
}
+__printf(5, 6)
+__cold
+static void chunk_err(const struct btrfs_fs_info *fs_info,
+ const struct extent_buffer *leaf,
+ const struct btrfs_chunk *chunk, u64 logical,
+ const char *fmt, ...)
+{
+ /* Only superblock eb is able to have such small offset */
+ bool is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET);
+ struct va_format vaf;
+ va_list args;
+ int i;
+ int slot = -1;
+
+ if (!is_sb) {
+ /*
+ * Get the slot number by iterating through all slots, this
+ * would provide better readability.
+ */
+ for (i = 0; i < btrfs_header_nritems(leaf); i++) {
+ if (btrfs_item_ptr_offset(leaf, i) ==
+ (unsigned long) chunk) {
+ slot = i;
+ break;
+ }
+ }
+ }
+ va_start(args, fmt);
+ vaf.fmt = fmt;
+ vaf.va = &args;
+
+ if (is_sb)
+ btrfs_crit(fs_info,
+ "corrupt superblock syschunk array: chunk_start=%llu, %pV",
+ logical, &vaf);
+ else
+ btrfs_crit(fs_info,
+ "corrupt leaf: root=%llu block=%llu slot=%d chunk_start=%llu, %pV",
+ BTRFS_CHUNK_TREE_OBJECTID, leaf->start, slot,
+ logical, &vaf);
+ va_end(args);
+}
+
+/*
+ * The common chunk check which could also work on super block sys chunk array.
+ *
+ * Return -EUCLEAN if anything is corrupted.
+ * Return 0 if everything is OK.
+ */
+int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *leaf,
+ struct btrfs_chunk *chunk, u64 logical)
+{
+ u64 length;
+ u64 stripe_len;
+ u16 num_stripes;
+ u16 sub_stripes;
+ u64 type;
+ u64 features;
+ bool mixed = false;
+
+ length = btrfs_chunk_length(leaf, chunk);
+ stripe_len = btrfs_chunk_stripe_len(leaf, chunk);
+ num_stripes = btrfs_chunk_num_stripes(leaf, chunk);
+ sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk);
+ type = btrfs_chunk_type(leaf, chunk);
+
+ if (!num_stripes) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk num_stripes: %u", num_stripes);
+ return -EUCLEAN;
+ }
+ if (!IS_ALIGNED(logical, fs_info->sectorsize)) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk logical %llu", logical);
+ return -EUCLEAN;
+ }
+ if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk sectorsize %u",
+ btrfs_chunk_sector_size(leaf, chunk));
+ return -EUCLEAN;
+ }
+ if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk length %llu", length);
+ return -EUCLEAN;
+ }
+ if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk stripe length: %llu", stripe_len);
+ return -EUCLEAN;
+ }
+ if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) &
+ type) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "unrecognized chunk type: %llu",
+ ~(BTRFS_BLOCK_GROUP_TYPE_MASK |
+ BTRFS_BLOCK_GROUP_PROFILE_MASK) &
+ btrfs_chunk_type(leaf, chunk));
+ return -EUCLEAN;
+ }
+
+ if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "missing chunk type flag: 0x%llx", type);
+ return -EUCLEAN;
+ }
+
+ if ((type & BTRFS_BLOCK_GROUP_SYSTEM) &&
+ (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "system chunk with data or metadata type: 0x%llx", type);
+ return -EUCLEAN;
+ }
+
+ features = btrfs_super_incompat_flags(fs_info->super_copy);
+ if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
+ mixed = true;
+
+ if (!mixed) {
+ if ((type & BTRFS_BLOCK_GROUP_METADATA) &&
+ (type & BTRFS_BLOCK_GROUP_DATA)) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "mixed chunk type in non-mixed mode: 0x%llx", type);
+ return -EUCLEAN;
+ }
+ }
+
+ if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
+ (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
+ (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
+ (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
+ (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
+ ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
+ num_stripes != 1)) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid num_stripes:sub_stripes %u:%u for profile %llu",
+ num_stripes, sub_stripes,
+ type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
+ return -EUCLEAN;
+ }
+
+ return 0;
+}
+
/*
* Common point to switch the item-specific validation.
*/
@@ -455,6 +601,7 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
struct extent_buffer *leaf,
struct btrfs_key *key, int slot)
{
+ struct btrfs_chunk *chunk;
int ret = 0;
switch (key->type) {
@@ -472,6 +619,11 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
case BTRFS_BLOCK_GROUP_ITEM_KEY:
ret = check_block_group_item(fs_info, leaf, key, slot);
break;
+ case BTRFS_CHUNK_ITEM_KEY:
+ chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
+ ret = btrfs_check_chunk_valid(fs_info, leaf, chunk,
+ key->offset);
+ break;
}
return ret;
}
diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
index 6f8d1b627c53..6cb96ba5e711 100644
--- a/fs/btrfs/tree-checker.h
+++ b/fs/btrfs/tree-checker.h
@@ -33,4 +33,7 @@ int btrfs_check_leaf_write(struct btrfs_fs_info *fs_info,
struct extent_buffer *leaf);
int btrfs_check_node(struct btrfs_fs_info *fs_info, struct extent_buffer *node);
+int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *leaf,
+ struct btrfs_chunk *chunk, u64 logical);
#endif
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 15561926ab32..0e3822870f1e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -27,6 +27,7 @@
#include "math.h"
#include "dev-replace.h"
#include "sysfs.h"
+#include "tree-checker.h"
const struct btrfs_raid_attr btrfs_raid_array[BTRFS_NR_RAID_TYPES] = {
[BTRFS_RAID_RAID10] = {
@@ -6705,99 +6706,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
return dev;
}
-/* Return -EIO if any error, otherwise return 0. */
-static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
- struct extent_buffer *leaf,
- struct btrfs_chunk *chunk, u64 logical)
-{
- u64 length;
- u64 stripe_len;
- u16 num_stripes;
- u16 sub_stripes;
- u64 type;
- u64 features;
- bool mixed = false;
-
- length = btrfs_chunk_length(leaf, chunk);
- stripe_len = btrfs_chunk_stripe_len(leaf, chunk);
- num_stripes = btrfs_chunk_num_stripes(leaf, chunk);
- sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk);
- type = btrfs_chunk_type(leaf, chunk);
-
- if (!num_stripes) {
- btrfs_err(fs_info, "invalid chunk num_stripes: %u",
- num_stripes);
- return -EIO;
- }
- if (!IS_ALIGNED(logical, fs_info->sectorsize)) {
- btrfs_err(fs_info, "invalid chunk logical %llu", logical);
- return -EIO;
- }
- if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) {
- btrfs_err(fs_info, "invalid chunk sectorsize %u",
- btrfs_chunk_sector_size(leaf, chunk));
- return -EIO;
- }
- if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) {
- btrfs_err(fs_info, "invalid chunk length %llu", length);
- return -EIO;
- }
- if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) {
- btrfs_err(fs_info, "invalid chunk stripe length: %llu",
- stripe_len);
- return -EIO;
- }
- if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) &
- type) {
- btrfs_err(fs_info, "unrecognized chunk type: %llu",
- ~(BTRFS_BLOCK_GROUP_TYPE_MASK |
- BTRFS_BLOCK_GROUP_PROFILE_MASK) &
- btrfs_chunk_type(leaf, chunk));
- return -EIO;
- }
-
- if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
- btrfs_err(fs_info, "missing chunk type flag: 0x%llx", type);
- return -EIO;
- }
-
- if ((type & BTRFS_BLOCK_GROUP_SYSTEM) &&
- (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) {
- btrfs_err(fs_info,
- "system chunk with data or metadata type: 0x%llx", type);
- return -EIO;
- }
-
- features = btrfs_super_incompat_flags(fs_info->super_copy);
- if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
- mixed = true;
-
- if (!mixed) {
- if ((type & BTRFS_BLOCK_GROUP_METADATA) &&
- (type & BTRFS_BLOCK_GROUP_DATA)) {
- btrfs_err(fs_info,
- "mixed chunk type in non-mixed mode: 0x%llx", type);
- return -EIO;
- }
- }
-
- if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
- (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
- (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
- (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
- (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
- ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
- num_stripes != 1)) {
- btrfs_err(fs_info,
- "invalid num_stripes:sub_stripes %u:%u for profile %llu",
- num_stripes, sub_stripes,
- type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
- return -EIO;
- }
-
- return 0;
-}
-
static void btrfs_report_missing_device(struct btrfs_fs_info *fs_info,
u64 devid, u8 *uuid, bool error)
{
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 2/6] btrfs: tree-checker: Verify dev item
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:19 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer Qu Wenruo
` (4 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs; +Cc: Yoon Jungyeon
[BUG]
For fuzzed image whose DEV_ITEM has invalid total_bytes as 0, then
kernel will just panic:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
#PF error: [normal kernel read fault]
PGD 800000022b2bd067 P4D 800000022b2bd067 PUD 22b2bc067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 1106 Comm: mount Not tainted 5.0.0-rc8+ #9
RIP: 0010:btrfs_verify_dev_extents+0x2a5/0x5a0
Call Trace:
open_ctree+0x160d/0x2149
btrfs_mount_root+0x5b2/0x680
[CAUSE]
If device extent verification finds a deivce with 0 total_bytes, then it
assumes it's a seed dummy, then search for seed devices.
But in this case, there is no seed device at all, causing NULL pointer.
[FIX]
Since this is caused by fuzzed image, let's go the tree-check way, just
add a new verification for device item.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202691
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/tree-checker.c | 84 +++++++++++++++++++++++++++++++++++++++++
fs/btrfs/volumes.c | 9 -----
fs/btrfs/volumes.h | 9 +++++
3 files changed, 93 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index fe3f37c23c29..5ccb4be583ea 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -594,6 +594,87 @@ int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
return 0;
}
+
+__printf(4, 5)
+__cold
+static void dev_item_err(const struct btrfs_fs_info *fs_info,
+ const struct extent_buffer *eb, int slot,
+ const char *fmt, ...)
+{
+ struct btrfs_key key;
+ struct va_format vaf;
+ va_list args;
+
+ btrfs_item_key_to_cpu(eb, &key, slot);
+ va_start(args, fmt);
+
+ vaf.fmt = fmt;
+ vaf.va = &args;
+
+ btrfs_crit(fs_info,
+ "corrupt %s: root=%llu block=%llu slot=%d devid=%llu %pV",
+ btrfs_header_level(eb) == 0 ? "leaf" : "node",
+ btrfs_header_owner(eb), btrfs_header_bytenr(eb), slot,
+ key.objectid, &vaf);
+ va_end(args);
+}
+
+static int check_dev_item(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *leaf,
+ struct btrfs_key *key, int slot)
+{
+ struct btrfs_dev_item *ditem;
+ u64 max_devid = max(BTRFS_MAX_DEVS(fs_info), BTRFS_MAX_DEVS_SYS_CHUNK);
+
+ if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) {
+ dev_item_err(fs_info, leaf, slot,
+ "invalid objectid: has=%llu expect=%llu",
+ key->objectid, BTRFS_DEV_ITEMS_OBJECTID);
+ goto error;
+ }
+ if (key->offset > max_devid) {
+ dev_item_err(fs_info, leaf, slot,
+ "invalid devid: has=%llu expect=[0, %llu]",
+ key->offset, max_devid);
+ goto error;
+ }
+ ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item);
+ if (btrfs_device_id(leaf, ditem) != key->offset) {
+ dev_item_err(fs_info, leaf, slot,
+ "devid mismatch: key has=%llu item has=%llu",
+ key->offset, btrfs_device_id(leaf, ditem));
+ goto error;
+ }
+
+ /*
+ * Since btrfs device add doesn't check device size at all, we could
+ * have device item whose size is smaller than 1M which is useless, but
+ * still valid.
+ * So here we can only check the obviously wrong case.
+ */
+ if (btrfs_device_total_bytes(leaf, ditem) == 0) {
+ dev_item_err(fs_info, leaf, slot,
+ "invalid total bytes: have 0");
+ goto error;
+ }
+ if (btrfs_device_bytes_used(leaf, ditem) >
+ btrfs_device_total_bytes(leaf, ditem)) {
+ dev_item_err(fs_info, leaf, slot,
+ "invalid bytes used: have %llu expect [0, %llu]",
+ btrfs_device_bytes_used(leaf, ditem),
+ btrfs_device_total_bytes(leaf, ditem));
+ goto error;
+ }
+ /*
+ * Remaining members like io_align/type/gen/dev_group aren't really
+ * utilized.
+ * Skip them to make later usage of them easier.
+ */
+ return 0;
+error:
+ return -EUCLEAN;
+}
+
/*
* Common point to switch the item-specific validation.
*/
@@ -624,6 +705,9 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
ret = btrfs_check_chunk_valid(fs_info, leaf, chunk,
key->offset);
break;
+ case BTRFS_DEV_ITEM_KEY:
+ ret = check_dev_item(fs_info, leaf, key, slot);
+ break;
}
return ret;
}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0e3822870f1e..0b839ecbe73f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4964,15 +4964,6 @@ static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
btrfs_set_fs_incompat(info, RAID56);
}
-#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \
- - sizeof(struct btrfs_chunk)) \
- / sizeof(struct btrfs_stripe) + 1)
-
-#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \
- - 2 * sizeof(struct btrfs_disk_key) \
- - 2 * sizeof(struct btrfs_chunk)) \
- / sizeof(struct btrfs_stripe) + 1)
-
static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
u64 start, u64 type)
{
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index ed806649a473..481c012eae79 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -258,6 +258,15 @@ struct btrfs_fs_devices {
#define BTRFS_BIO_INLINE_CSUM_SIZE 64
+#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \
+ - sizeof(struct btrfs_chunk)) \
+ / sizeof(struct btrfs_stripe) + 1)
+
+#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \
+ - 2 * sizeof(struct btrfs_disk_key) \
+ - 2 * sizeof(struct btrfs_chunk)) \
+ / sizeof(struct btrfs_stripe) + 1)
+
/*
* we need the mirror number and stripe index to be passed around
* the call chain while we are processing end_io (especially errors).
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
2019-03-13 8:55 ` [PATCH 2/6] btrfs: tree-checker: Verify dev item Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:24 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler Qu Wenruo
` (3 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs; +Cc: Yoon Jungyeon
[BUG]
When reading a file from a fuzzed image, kernel can panic like:
BTRFS warning (device loop0): csum failed root 5 ino 270 off 0 csum 0x98f94189 expected csum 0x00000000 mirror 1
assertion failed: !memcmp_extent_buffer(b, &disk_key, offsetof(struct btrfs_leaf, items[0].key), sizeof(disk_key)), file: fs/btrfs/ctree.c, line: 2544
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:3500!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:btrfs_search_slot.cold.24+0x61/0x63 [btrfs]
Call Trace:
btrfs_lookup_csum+0x52/0x150 [btrfs]
__btrfs_lookup_bio_sums+0x209/0x640 [btrfs]
btrfs_submit_bio_hook+0x103/0x170 [btrfs]
submit_one_bio+0x59/0x80 [btrfs]
extent_read_full_page+0x58/0x80 [btrfs]
generic_file_read_iter+0x2f6/0x9d0
__vfs_read+0x14d/0x1a0
vfs_read+0x8d/0x140
ksys_read+0x52/0xc0
do_syscall_64+0x60/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
[CAUSE]
The fuzzed image has a corrupted leaf whose first key doesn't match with its parent:
checksum tree key (CSUM_TREE ROOT_ITEM 0)
node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE
fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
...
key (EXTENT_CSUM EXTENT_CSUM 79691776) block 29761536 gen 19
leaf 29761536 items 1 free space 1726 generation 19 owner CSUM_TREE
leaf 29761536 flags 0x1(WRITTEN) backref revision 1
fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
item 0 key (EXTENT_CSUM EXTENT_CSUM 8798638964736) itemoff 1751 itemsize 2244
range start 8798638964736 end 8798641262592 length 2297856
When reading above tree block, we have extent_buffer->refs = 2 in the
context:
- initial one from __alloc_extent_buffer()
alloc_extent_buffer()
|- __alloc_extent_buffer()
|- atomic_set(&eb->refs, 1)
- one being added to fs_info->buffer_radix
alloc_extent_buffer()
|- check_buffer_tree_ref()
|- atomic_inc(&eb->refs)
So even we call free_extent_buffer() in read_tree_block or other similar
situation, we only decrease the refs by 1, it doesn't reach 0 and won't
be freed right now.
The staled eb and its corrupted content will still be kept cached.
Further more, we have several extra cases where we either don't do
first key check or the check is not proper for all callers:
- scrub
We just don't have first key in this context.
- shared tree block
One tree block can be shared by several snapshot/subvolume trees.
In that case, the first key check for one subvolume doesn't apply to
another.
So for above reasons, a corrupted extent buffer can sneak into the
buffer cache.
[FIX]
Export verify_level_key() as btrfs_verify_level_key() and call it in
read_block_for_search() to fill the hole.
Due to above described reasons, even we can free corrupted extent buffer
from cache, we still need the check in read_block_for_search(), for
scrub and shared tree blocks.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202755
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202757
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202759
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202761
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202767
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202769
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/ctree.c | 10 ++++++++++
fs/btrfs/disk-io.c | 10 +++++-----
fs/btrfs/disk-io.h | 3 +++
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5a6c39b44c84..7672932aa5b4 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2401,6 +2401,16 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
if (tmp) {
/* first we do an atomic uptodate check */
if (btrfs_buffer_uptodate(tmp, gen, 1) > 0) {
+ /*
+ * Do extra check for first_key, eb can be stale due to
+ * being cached, read from scrub, or have multiple
+ * parents (shared tree blocks).
+ */
+ if (btrfs_verify_level_key(fs_info, tmp,
+ parent_level - 1, &first_key, gen)) {
+ free_extent_buffer(tmp);
+ return -EUCLEAN;
+ }
*eb_ret = tmp;
return 0;
}
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 298b34721bc0..e2a0cb362d28 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -423,9 +423,9 @@ static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info,
return ret;
}
-static int verify_level_key(struct btrfs_fs_info *fs_info,
- struct extent_buffer *eb, int level,
- struct btrfs_key *first_key, u64 parent_transid)
+int btrfs_verify_level_key(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *eb, int level,
+ struct btrfs_key *first_key, u64 parent_transid)
{
int found_level;
struct btrfs_key found_key;
@@ -500,8 +500,8 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
if (verify_parent_transid(io_tree, eb,
parent_transid, 0))
ret = -EIO;
- else if (verify_level_key(fs_info, eb, level,
- first_key, parent_transid))
+ else if (btrfs_verify_level_key(fs_info, eb, level,
+ first_key, parent_transid))
ret = -EUCLEAN;
else
break;
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 987a64bc0c66..67a9fe2d29c7 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -39,6 +39,9 @@ static inline u64 btrfs_sb_offset(int mirror)
struct btrfs_device;
struct btrfs_fs_devices;
+int btrfs_verify_level_key(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *eb, int level,
+ struct btrfs_key *first_key, u64 parent_transid);
struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
u64 parent_transid, int level,
struct btrfs_key *first_key);
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
` (2 preceding siblings ...)
2019-03-13 8:55 ` [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:18 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 5/6] btrfs: tree-checker: Verify inode item Qu Wenruo
` (2 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs; +Cc: Yoon Jungyeon
Btrfs-progs already has comprehensive type checker, to ensure there is
only 0 (SINGLE profile) or 1 (DUP/RAID0/1/5/6/10) bit set for chunk
profile bits.
Do the same work for kernel.
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202765
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/tree-checker.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 5ccb4be583ea..c08609627720 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -551,6 +551,13 @@ int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
return -EUCLEAN;
}
+ if (!is_power_of_2(type & BTRFS_BLOCK_GROUP_PROFILE_MASK) &&
+ (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) != 0) {
+ chunk_err(fs_info, leaf, chunk, logical,
+ "invalid chunk profile flag: 0x%llx, expect 0 or 1 bit set",
+ type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
+ return -EUCLEAN;
+ }
if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
chunk_err(fs_info, leaf, chunk, logical,
"missing chunk type flag: 0x%llx", type);
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 5/6] btrfs: tree-checker: Verify inode item
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
` (3 preceding siblings ...)
2019-03-13 8:55 ` [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:28 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference Qu Wenruo
2019-03-13 9:01 ` [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs
There is a report in kernel bugzilla about mismatch file type in dir
item and inode item.
This inspires us to check inode mode in inode item.
This patch will check the following members:
- inode key objectid
Should be ROOT_DIR_DIR or [256, (u64)-256] or FREE_INO.
- inode key offset
Should be 0
- inode item generation
- inode item transid
No newer than sb generation + 1.
The +1 is for log tree.
- inode item mode
No unknown bits.
No invalid S_IF* bit.
NOTE: S_IFMT check is not enough, need extra check for it.
- inode item nlink
Dir should have no more link than 1.
- inode item flags
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/ctree.h | 2 +
fs/btrfs/tree-checker.c | 99 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7a2a2621f0d9..f002c63a34e3 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1504,6 +1504,8 @@ do { \
#define BTRFS_INODE_COMPRESS (1 << 11)
#define BTRFS_INODE_ROOT_ITEM_INIT (1 << 31)
+#define BTRFS_INODE_FLAG_MASK (((1 << 12) - 1) |\
+ BTRFS_INODE_ROOT_ITEM_INIT)
struct btrfs_map_token {
const struct extent_buffer *eb;
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index c08609627720..4b178ef1a9e7 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -682,6 +682,102 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
return -EUCLEAN;
}
+/* Inode item error output has the same format as dir_item_err() */
+#define inode_item_err(fs_info, eb, slot, fmt, ...) \
+ dir_item_err(fs_info, eb, slot, fmt, __VA_ARGS__)
+
+static int check_inode_item(struct btrfs_fs_info *fs_info,
+ struct extent_buffer *leaf,
+ struct btrfs_key *key, int slot)
+{
+ struct btrfs_inode_item *iitem;
+ u64 super_gen = btrfs_super_generation(fs_info->super_copy);
+ u32 valid_mask = (S_IFMT | S_ISUID | S_ISGID | S_ISVTX | 0777);
+ u32 mode;
+
+ if ((key->objectid < BTRFS_FIRST_FREE_OBJECTID ||
+ key->objectid > BTRFS_LAST_FREE_OBJECTID) &&
+ key->objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
+ key->objectid != BTRFS_FREE_INO_OBJECTID) {
+ generic_err(fs_info, leaf, slot,
+ "invalid key objectid: has %llu expect %llu or [%llu, %llu] or %llu",
+ key->objectid, BTRFS_ROOT_TREE_DIR_OBJECTID,
+ BTRFS_FIRST_FREE_OBJECTID,
+ BTRFS_LAST_FREE_OBJECTID,
+ BTRFS_FREE_INO_OBJECTID);
+ goto error;
+ }
+ if (key->offset != 0) {
+ inode_item_err(fs_info, leaf, slot,
+ "invalid key offset: has %llu expect 0",
+ key->offset);
+ goto error;
+ }
+ iitem = btrfs_item_ptr(leaf, slot, struct btrfs_inode_item);
+
+ /* Here we use super block generation + 1 to handle log tree */
+ if (btrfs_inode_generation(leaf, iitem) > super_gen + 1) {
+ inode_item_err(fs_info, leaf, slot,
+ "invalid inode generation: has %llu expect (0, %llu]",
+ btrfs_inode_generation(leaf, iitem),
+ super_gen + 1);
+ goto error;
+ }
+ /* Note for ROOT_TREE_DIR_ITEM, mkfs could make its transid as 0 */
+ if (btrfs_inode_transid(leaf, iitem) > super_gen + 1) {
+ inode_item_err(fs_info, leaf, slot,
+ "invalid inode generation: has %llu expect [0, %llu]",
+ btrfs_inode_transid(leaf, iitem),
+ super_gen + 1);
+ goto error;
+ }
+
+ /*
+ * For size and nbytes it's better not to be too strict, as for dir
+ * item its size/nbytes can easily get wrong, but doesn't affect
+ * any thing of the fs. So here we skip the check.
+ */
+
+ mode = btrfs_inode_mode(leaf, iitem);
+ if (mode & ~valid_mask) {
+ inode_item_err(fs_info, leaf, slot,
+ "unknown mode bit detected: 0x%x",
+ mode & ~valid_mask);
+ goto error;
+ }
+
+ /*
+ * S_IFMT is not bit mapped so we can't completely rely is_power_of_2(),
+ * but is_power_of_2() can save us from checking FIFO/CHR/DIR/REG.
+ * Only needs to check BLK, LNK and SOCKS
+ */
+ if (!is_power_of_2(mode & S_IFMT)) {
+ if (!S_ISLNK(mode) && ! S_ISBLK(mode) && !S_ISSOCK(mode)) {
+ inode_item_err(fs_info, leaf, slot,
+ "invalid mode: has 0%o expect valid S_IF* bit(s)",
+ mode & S_IFMT);
+ goto error;
+ }
+ }
+ if (S_ISDIR(mode) && btrfs_inode_nlink(leaf, iitem) > 1) {
+ inode_item_err(fs_info, leaf, slot,
+ "invalid nlink: has %u expect no more than 1 for dir",
+ btrfs_inode_nlink(leaf, iitem));
+ goto error;
+ }
+ if (btrfs_inode_flags(leaf, iitem) & ~BTRFS_INODE_FLAG_MASK) {
+ inode_item_err(fs_info, leaf, slot,
+ "unknown flags detected: 0x%llx",
+ btrfs_inode_flags(leaf, iitem) &
+ ~BTRFS_INODE_FLAG_MASK);
+ goto error;
+ }
+ return 0;
+
+error:
+ return -EUCLEAN;
+}
+
/*
* Common point to switch the item-specific validation.
*/
@@ -715,6 +811,9 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
case BTRFS_DEV_ITEM_KEY:
ret = check_dev_item(fs_info, leaf, key, slot);
break;
+ case BTRFS_INODE_ITEM_KEY:
+ ret = check_inode_item(fs_info, leaf, key, slot);
+ break;
}
return ret;
}
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
` (4 preceding siblings ...)
2019-03-13 8:55 ` [PATCH 5/6] btrfs: tree-checker: Verify inode item Qu Wenruo
@ 2019-03-13 8:55 ` Qu Wenruo
2019-03-13 9:41 ` Nikolay Borisov
2019-03-13 9:01 ` [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 8:55 UTC (permalink / raw)
To: linux-btrfs; +Cc: Yoon Jungyeon
[BUG]
When access a file on a crafted image, btrfs can crash in block layer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 136501067 P4D 136501067 PUD 124519067 PMD 0
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc8-default #252
RIP: 0010:end_bio_extent_readpage+0x144/0x700
Call Trace:
<IRQ>
blk_update_request+0x8f/0x350
blk_mq_end_request+0x1a/0x120
blk_done_softirq+0x99/0xc0
__do_softirq+0xc7/0x467
irq_exit+0xd1/0xe0
call_function_single_interrupt+0xf/0x20
</IRQ>
RIP: 0010:default_idle+0x1e/0x170
[CAUSE]
The crafted image has a pretty tricky corruption, the INODE_ITEM has a
different type against its parent dir:
item 20 key (268 INODE_ITEM 0) itemoff 2808 itemsize 160
generation 13 transid 13 size 1048576 nbytes 1048576
block group 0 mode 121644 links 1 uid 0 gid 0 rdev 0
sequence 9 flags 0x0(none)
This mode number 0120000 means it's a soft link.
But the dir item think it's still a regular file:
item 8 key (264 DIR_INDEX 5) itemoff 3707 itemsize 32
location key (268 INODE_ITEM 0) type FILE
transid 13 data_len 0 name_len 2
name: f4
item 40 key (264 DIR_ITEM 51821248) itemoff 1573 itemsize 32
location key (268 INODE_ITEM 0) type FILE
transid 13 data_len 0 name_len 2
name: f4
For btrfs symlink, we don't set BTRFS_I(inode)->io_tree.ops and leave
it empty, as symlink is only designed to have inlined extent, all
handled by tree block read.
Thus no need to trigger btrfs_submit_bio_hook() for inline file extent.
However for end_bio_extent_readpage() it expects tree->ops populated, as
it's reading regular data extent.
This causes NULL pointer dereference.
[FIX]
This patch fixes the problem by 2 directions:
- Verify inode mode against its dir item when looking up inode
So in btrfs_lookup_dentry() if we find inode mode mismatch with dir
item, we error out so that corrupted inode will not be access.
- Verify inode mode when getting extent mapping
Only regular file should have regular or preallocated extent.
If we found regular/preallocated file extent for soft link or
whatever, we error out before we submit read bio.
With this fix that crafted image can be rejected gracefully:
BTRFS critical (device loop0): inode mode mismatch with dir: inode mode=0121644 btrfs type=7 dir type=1
Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202763
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/inode.c | 38 ++++++++++++++++++++++++++++--------
fs/btrfs/tests/inode-tests.c | 1 +
2 files changed, 31 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5c349667c761..4ced641aaa8e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5396,12 +5396,13 @@ void btrfs_evict_inode(struct inode *inode)
}
/*
- * this returns the key found in the dir entry in the location pointer.
+ * this returns the key found in the dir entry in the location pointer,
+ * fill @type with BTRFS_FT_*, and return 0.
* If no dir entries were found, returns -ENOENT.
* If found a corrupted location in dir entry, returns -EUCLEAN.
*/
static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
- struct btrfs_key *location)
+ struct btrfs_key *location, u8 *type)
{
const char *name = dentry->d_name.name;
int namelen = dentry->d_name.len;
@@ -5430,6 +5431,8 @@ static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
__func__, name, btrfs_ino(BTRFS_I(dir)),
location->objectid, location->type, location->offset);
}
+ if (!ret)
+ *type = btrfs_dir_type(path->nodes[0], di);
out:
btrfs_free_path(path);
return ret;
@@ -5667,6 +5670,11 @@ static struct inode *new_simple_dir(struct super_block *s,
return inode;
}
+static inline u8 btrfs_inode_type(struct inode *inode)
+{
+ return btrfs_type_by_mode[(inode->i_mode & S_IFMT) >> S_SHIFT];
+}
+
struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
{
struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
@@ -5674,18 +5682,29 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
struct btrfs_root *root = BTRFS_I(dir)->root;
struct btrfs_root *sub_root = root;
struct btrfs_key location;
+ u8 di_type = 0;
int index;
int ret = 0;
if (dentry->d_name.len > BTRFS_NAME_LEN)
return ERR_PTR(-ENAMETOOLONG);
- ret = btrfs_inode_by_name(dir, dentry, &location);
+ ret = btrfs_inode_by_name(dir, dentry, &location, &di_type);
if (ret < 0)
return ERR_PTR(ret);
if (location.type == BTRFS_INODE_ITEM_KEY) {
inode = btrfs_iget(dir->i_sb, &location, root, NULL);
+
+ /* Do extra check against inode mode with di_type */
+ if (btrfs_inode_type(inode) != di_type) {
+ btrfs_crit(fs_info,
+"inode mode mismatch with dir: inode mode=0%o btrfs type=%u dir type=%u",
+ inode->i_mode, btrfs_inode_type(inode),
+ di_type);
+ iput(inode);
+ return ERR_PTR(-EUCLEAN);
+ }
return inode;
}
@@ -6290,11 +6309,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
return ERR_PTR(ret);
}
-static inline u8 btrfs_inode_type(struct inode *inode)
-{
- return btrfs_type_by_mode[(inode->i_mode & S_IFMT) >> S_SHIFT];
-}
-
/*
* utility function to add 'inode' into 'parent_inode' with
* a give name and a given sequence number.
@@ -6816,6 +6830,14 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
extent_start = found_key.offset;
if (found_type == BTRFS_FILE_EXTENT_REG ||
found_type == BTRFS_FILE_EXTENT_PREALLOC) {
+ /* Only regular file could have regular/prealloc extent */
+ if (!S_ISREG(inode->vfs_inode.i_mode)) {
+ ret = -EUCLEAN;
+ btrfs_crit(fs_info,
+ "regular/prealloc extent found for non-regular inode %llu",
+ btrfs_ino(inode));
+ goto out;
+ }
extent_end = extent_start +
btrfs_file_extent_num_bytes(leaf, item);
diff --git a/fs/btrfs/tests/inode-tests.c b/fs/btrfs/tests/inode-tests.c
index af0c8e30d9e2..b01dbcab9eed 100644
--- a/fs/btrfs/tests/inode-tests.c
+++ b/fs/btrfs/tests/inode-tests.c
@@ -232,6 +232,7 @@ static noinline int test_btrfs_get_extent(u32 sectorsize, u32 nodesize)
return ret;
}
+ inode->i_mode = S_IFREG;
BTRFS_I(inode)->location.type = BTRFS_INODE_ITEM_KEY;
BTRFS_I(inode)->location.objectid = BTRFS_FIRST_FREE_OBJECTID;
BTRFS_I(inode)->location.offset = 0;
--
2.21.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
` (5 preceding siblings ...)
2019-03-13 8:55 ` [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference Qu Wenruo
@ 2019-03-13 9:01 ` Qu Wenruo
2019-03-19 15:34 ` David Sterba
6 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-13 9:01 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
[-- Attachment #1.1: Type: text/plain, Size: 2440 bytes --]
Just forgot the repo:
It can be fetched from github:
https://github.com/adam900710/linux/tree/tree_checker_enhancement
Which is based on my previous write time tree checker patchset.
Although the patchset itself can also be applied to v5.0-rc7 tag without
manual modification.
Thanks,
Qu
On 2019/3/13 下午4:55, Qu Wenruo wrote:
> Thanks for the report from Yoon Jungyeon <jungyeon@gatech.edu>, we have
> more fuzzed image to torture btrfs.
>
> Those images exposed the following problems:
>
> - Chunk check is not comprehensive nor early enough
> Chunk item check lacks profile bits check (e.g RAID|DUP profile is
> invalid).
> And for certain fuzzed image, the other copy can be valid, current
> check timming is after tree block read, so no way to retry the other
> copy.
>
> Address the check timing in the 1st patch, while for the profile bits,
> check it in the 4th patch.
>
> - Lack of device item check
> Address it in the 2nd patch.
>
> - First key and level check be exploited by cached extent buffer
> Cached bad extent buffer can avoid first key and level check.
> This is addressed in the 3rd patch.
>
> - Inode type mismatch can lead to NULL dereference in endio function
> If an inode claims itself as symlink but still has regular file
> extent, then endio function will cause NULL pointer dereference.
> Fix it by do extra inode mode and dir item type cross check, at
> get_extent() time and inode lookup time.
> Addressed in the 5th and 6th patch.
>
> Qu Wenruo (6):
> btrfs: tree-checker: Verify chunk items
> btrfs: tree-checker: Verify dev item
> btrfs: Check the first key and level for cached extent buffer
> btrfs: tree-checker: Enhance chunk checker to validate chunk profiler
> btrfs: tree-checker: Verify inode item
> btrfs: inode: Verify inode mode to avoid NULL pointer dereference
>
> fs/btrfs/ctree.c | 10 +
> fs/btrfs/ctree.h | 2 +
> fs/btrfs/disk-io.c | 10 +-
> fs/btrfs/disk-io.h | 3 +
> fs/btrfs/inode.c | 38 +++-
> fs/btrfs/tests/inode-tests.c | 1 +
> fs/btrfs/tree-checker.c | 342 +++++++++++++++++++++++++++++++++++
> fs/btrfs/tree-checker.h | 3 +
> fs/btrfs/volumes.c | 103 +----------
> fs/btrfs/volumes.h | 9 +
> 10 files changed, 406 insertions(+), 115 deletions(-)
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler
2019-03-13 8:55 ` [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler Qu Wenruo
@ 2019-03-13 9:18 ` Nikolay Borisov
0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:18 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Yoon Jungyeon
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> Btrfs-progs already has comprehensive type checker, to ensure there is
> only 0 (SINGLE profile) or 1 (DUP/RAID0/1/5/6/10) bit set for chunk
> profile bits.
>
> Do the same work for kernel.
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202765
> Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/tree-checker.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 5ccb4be583ea..c08609627720 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -551,6 +551,13 @@ int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> return -EUCLEAN;
> }
>
> + if (!is_power_of_2(type & BTRFS_BLOCK_GROUP_PROFILE_MASK) &&
> + (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) != 0) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk profile flag: 0x%llx, expect 0 or 1 bit set",
> + type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
> + return -EUCLEAN;
> + }
> if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
> chunk_err(fs_info, leaf, chunk, logical,
> "missing chunk type flag: 0x%llx", type);
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/6] btrfs: tree-checker: Verify chunk items
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
@ 2019-03-13 9:19 ` Nikolay Borisov
2019-03-19 14:50 ` David Sterba
1 sibling, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:19 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Yoon Jungyeon
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> We already have btrfs_check_chunk_valid() to verify each chunk before
> tree-checker.
>
> Merge that function into tree-checker, and update its error message to
> be more readable.
>
> Old error message would be something like:
> BTRFS error (device dm-3): invalid chunk num_stipres: 0
>
> New error message would be:
> Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
> Or
> Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0
>
> Btrfs_check_chunk_valid() is exported for super block syschunk array
> verification.
>
> Also make tree-checker to verify chunk items, this makes chunk item
> checker covers all chunks and avoid fuzzed image.
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202751
> Signed-off-by: Qu Wenruo <wqu@suse.com>
I had already reviewed the earlier posting of this patch so:
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/tree-checker.c | 152 ++++++++++++++++++++++++++++++++++++++++
> fs/btrfs/tree-checker.h | 3 +
> fs/btrfs/volumes.c | 94 +------------------------
> 3 files changed, 156 insertions(+), 93 deletions(-)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index b8cdaf472031..fe3f37c23c29 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -448,6 +448,152 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
> return 0;
> }
>
> +__printf(5, 6)
> +__cold
> +static void chunk_err(const struct btrfs_fs_info *fs_info,
> + const struct extent_buffer *leaf,
> + const struct btrfs_chunk *chunk, u64 logical,
> + const char *fmt, ...)
> +{
> + /* Only superblock eb is able to have such small offset */
> + bool is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET);
> + struct va_format vaf;
> + va_list args;
> + int i;
> + int slot = -1;
> +
> + if (!is_sb) {
> + /*
> + * Get the slot number by iterating through all slots, this
> + * would provide better readability.
> + */
> + for (i = 0; i < btrfs_header_nritems(leaf); i++) {
> + if (btrfs_item_ptr_offset(leaf, i) ==
> + (unsigned long) chunk) {
> + slot = i;
> + break;
> + }
> + }
> + }
> + va_start(args, fmt);
> + vaf.fmt = fmt;
> + vaf.va = &args;
> +
> + if (is_sb)
> + btrfs_crit(fs_info,
> + "corrupt superblock syschunk array: chunk_start=%llu, %pV",
> + logical, &vaf);
> + else
> + btrfs_crit(fs_info,
> + "corrupt leaf: root=%llu block=%llu slot=%d chunk_start=%llu, %pV",
> + BTRFS_CHUNK_TREE_OBJECTID, leaf->start, slot,
> + logical, &vaf);
> + va_end(args);
> +}
> +
> +/*
> + * The common chunk check which could also work on super block sys chunk array.
> + *
> + * Return -EUCLEAN if anything is corrupted.
> + * Return 0 if everything is OK.
> + */
> +int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *leaf,
> + struct btrfs_chunk *chunk, u64 logical)
> +{
> + u64 length;
> + u64 stripe_len;
> + u16 num_stripes;
> + u16 sub_stripes;
> + u64 type;
> + u64 features;
> + bool mixed = false;
> +
> + length = btrfs_chunk_length(leaf, chunk);
> + stripe_len = btrfs_chunk_stripe_len(leaf, chunk);
> + num_stripes = btrfs_chunk_num_stripes(leaf, chunk);
> + sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk);
> + type = btrfs_chunk_type(leaf, chunk);
> +
> + if (!num_stripes) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk num_stripes: %u", num_stripes);
> + return -EUCLEAN;
> + }
> + if (!IS_ALIGNED(logical, fs_info->sectorsize)) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk logical %llu", logical);
> + return -EUCLEAN;
> + }
> + if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk sectorsize %u",
> + btrfs_chunk_sector_size(leaf, chunk));
> + return -EUCLEAN;
> + }
> + if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk length %llu", length);
> + return -EUCLEAN;
> + }
> + if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid chunk stripe length: %llu", stripe_len);
> + return -EUCLEAN;
> + }
> + if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) &
> + type) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "unrecognized chunk type: %llu",
> + ~(BTRFS_BLOCK_GROUP_TYPE_MASK |
> + BTRFS_BLOCK_GROUP_PROFILE_MASK) &
> + btrfs_chunk_type(leaf, chunk));
> + return -EUCLEAN;
> + }
> +
> + if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "missing chunk type flag: 0x%llx", type);
> + return -EUCLEAN;
> + }
> +
> + if ((type & BTRFS_BLOCK_GROUP_SYSTEM) &&
> + (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "system chunk with data or metadata type: 0x%llx", type);
> + return -EUCLEAN;
> + }
> +
> + features = btrfs_super_incompat_flags(fs_info->super_copy);
> + if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
> + mixed = true;
> +
> + if (!mixed) {
> + if ((type & BTRFS_BLOCK_GROUP_METADATA) &&
> + (type & BTRFS_BLOCK_GROUP_DATA)) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "mixed chunk type in non-mixed mode: 0x%llx", type);
> + return -EUCLEAN;
> + }
> + }
> +
> + if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
> + (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
> + (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
> + (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> + (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> + ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
> + num_stripes != 1)) {
> + chunk_err(fs_info, leaf, chunk, logical,
> + "invalid num_stripes:sub_stripes %u:%u for profile %llu",
> + num_stripes, sub_stripes,
> + type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
> + return -EUCLEAN;
> + }
> +
> + return 0;
> +}
> +
> /*
> * Common point to switch the item-specific validation.
> */
> @@ -455,6 +601,7 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
> struct extent_buffer *leaf,
> struct btrfs_key *key, int slot)
> {
> + struct btrfs_chunk *chunk;
> int ret = 0;
>
> switch (key->type) {
> @@ -472,6 +619,11 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
> case BTRFS_BLOCK_GROUP_ITEM_KEY:
> ret = check_block_group_item(fs_info, leaf, key, slot);
> break;
> + case BTRFS_CHUNK_ITEM_KEY:
> + chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
> + ret = btrfs_check_chunk_valid(fs_info, leaf, chunk,
> + key->offset);
> + break;
> }
> return ret;
> }
> diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
> index 6f8d1b627c53..6cb96ba5e711 100644
> --- a/fs/btrfs/tree-checker.h
> +++ b/fs/btrfs/tree-checker.h
> @@ -33,4 +33,7 @@ int btrfs_check_leaf_write(struct btrfs_fs_info *fs_info,
> struct extent_buffer *leaf);
> int btrfs_check_node(struct btrfs_fs_info *fs_info, struct extent_buffer *node);
>
> +int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *leaf,
> + struct btrfs_chunk *chunk, u64 logical);
> #endif
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 15561926ab32..0e3822870f1e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -27,6 +27,7 @@
> #include "math.h"
> #include "dev-replace.h"
> #include "sysfs.h"
> +#include "tree-checker.h"
>
> const struct btrfs_raid_attr btrfs_raid_array[BTRFS_NR_RAID_TYPES] = {
> [BTRFS_RAID_RAID10] = {
> @@ -6705,99 +6706,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
> return dev;
> }
>
> -/* Return -EIO if any error, otherwise return 0. */
> -static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> - struct extent_buffer *leaf,
> - struct btrfs_chunk *chunk, u64 logical)
> -{
> - u64 length;
> - u64 stripe_len;
> - u16 num_stripes;
> - u16 sub_stripes;
> - u64 type;
> - u64 features;
> - bool mixed = false;
> -
> - length = btrfs_chunk_length(leaf, chunk);
> - stripe_len = btrfs_chunk_stripe_len(leaf, chunk);
> - num_stripes = btrfs_chunk_num_stripes(leaf, chunk);
> - sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk);
> - type = btrfs_chunk_type(leaf, chunk);
> -
> - if (!num_stripes) {
> - btrfs_err(fs_info, "invalid chunk num_stripes: %u",
> - num_stripes);
> - return -EIO;
> - }
> - if (!IS_ALIGNED(logical, fs_info->sectorsize)) {
> - btrfs_err(fs_info, "invalid chunk logical %llu", logical);
> - return -EIO;
> - }
> - if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) {
> - btrfs_err(fs_info, "invalid chunk sectorsize %u",
> - btrfs_chunk_sector_size(leaf, chunk));
> - return -EIO;
> - }
> - if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) {
> - btrfs_err(fs_info, "invalid chunk length %llu", length);
> - return -EIO;
> - }
> - if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) {
> - btrfs_err(fs_info, "invalid chunk stripe length: %llu",
> - stripe_len);
> - return -EIO;
> - }
> - if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) &
> - type) {
> - btrfs_err(fs_info, "unrecognized chunk type: %llu",
> - ~(BTRFS_BLOCK_GROUP_TYPE_MASK |
> - BTRFS_BLOCK_GROUP_PROFILE_MASK) &
> - btrfs_chunk_type(leaf, chunk));
> - return -EIO;
> - }
> -
> - if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
> - btrfs_err(fs_info, "missing chunk type flag: 0x%llx", type);
> - return -EIO;
> - }
> -
> - if ((type & BTRFS_BLOCK_GROUP_SYSTEM) &&
> - (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) {
> - btrfs_err(fs_info,
> - "system chunk with data or metadata type: 0x%llx", type);
> - return -EIO;
> - }
> -
> - features = btrfs_super_incompat_flags(fs_info->super_copy);
> - if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
> - mixed = true;
> -
> - if (!mixed) {
> - if ((type & BTRFS_BLOCK_GROUP_METADATA) &&
> - (type & BTRFS_BLOCK_GROUP_DATA)) {
> - btrfs_err(fs_info,
> - "mixed chunk type in non-mixed mode: 0x%llx", type);
> - return -EIO;
> - }
> - }
> -
> - if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
> - (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
> - (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
> - (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> - (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> - ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
> - num_stripes != 1)) {
> - btrfs_err(fs_info,
> - "invalid num_stripes:sub_stripes %u:%u for profile %llu",
> - num_stripes, sub_stripes,
> - type & BTRFS_BLOCK_GROUP_PROFILE_MASK);
> - return -EIO;
> - }
> -
> - return 0;
> -}
> -
> static void btrfs_report_missing_device(struct btrfs_fs_info *fs_info,
> u64 devid, u8 *uuid, bool error)
> {
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/6] btrfs: tree-checker: Verify dev item
2019-03-13 8:55 ` [PATCH 2/6] btrfs: tree-checker: Verify dev item Qu Wenruo
@ 2019-03-13 9:19 ` Nikolay Borisov
0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:19 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Yoon Jungyeon
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> [BUG]
> For fuzzed image whose DEV_ITEM has invalid total_bytes as 0, then
> kernel will just panic:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
> #PF error: [normal kernel read fault]
> PGD 800000022b2bd067 P4D 800000022b2bd067 PUD 22b2bc067 PMD 0
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 1106 Comm: mount Not tainted 5.0.0-rc8+ #9
> RIP: 0010:btrfs_verify_dev_extents+0x2a5/0x5a0
> Call Trace:
> open_ctree+0x160d/0x2149
> btrfs_mount_root+0x5b2/0x680
>
> [CAUSE]
> If device extent verification finds a deivce with 0 total_bytes, then it
> assumes it's a seed dummy, then search for seed devices.
>
> But in this case, there is no seed device at all, causing NULL pointer.
>
> [FIX]
> Since this is caused by fuzzed image, let's go the tree-check way, just
> add a new verification for device item.
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202691
> Signed-off-by: Qu Wenruo <wqu@suse.com>
I had already reviewed the earlier posting of this patch so:
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/tree-checker.c | 84 +++++++++++++++++++++++++++++++++++++++++
> fs/btrfs/volumes.c | 9 -----
> fs/btrfs/volumes.h | 9 +++++
> 3 files changed, 93 insertions(+), 9 deletions(-)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index fe3f37c23c29..5ccb4be583ea 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -594,6 +594,87 @@ int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> return 0;
> }
>
> +
> +__printf(4, 5)
> +__cold
> +static void dev_item_err(const struct btrfs_fs_info *fs_info,
> + const struct extent_buffer *eb, int slot,
> + const char *fmt, ...)
> +{
> + struct btrfs_key key;
> + struct va_format vaf;
> + va_list args;
> +
> + btrfs_item_key_to_cpu(eb, &key, slot);
> + va_start(args, fmt);
> +
> + vaf.fmt = fmt;
> + vaf.va = &args;
> +
> + btrfs_crit(fs_info,
> + "corrupt %s: root=%llu block=%llu slot=%d devid=%llu %pV",
> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
> + btrfs_header_owner(eb), btrfs_header_bytenr(eb), slot,
> + key.objectid, &vaf);
> + va_end(args);
> +}
> +
> +static int check_dev_item(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *leaf,
> + struct btrfs_key *key, int slot)
> +{
> + struct btrfs_dev_item *ditem;
> + u64 max_devid = max(BTRFS_MAX_DEVS(fs_info), BTRFS_MAX_DEVS_SYS_CHUNK);
> +
> + if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) {
> + dev_item_err(fs_info, leaf, slot,
> + "invalid objectid: has=%llu expect=%llu",
> + key->objectid, BTRFS_DEV_ITEMS_OBJECTID);
> + goto error;
> + }
> + if (key->offset > max_devid) {
> + dev_item_err(fs_info, leaf, slot,
> + "invalid devid: has=%llu expect=[0, %llu]",
> + key->offset, max_devid);
> + goto error;
> + }
> + ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item);
> + if (btrfs_device_id(leaf, ditem) != key->offset) {
> + dev_item_err(fs_info, leaf, slot,
> + "devid mismatch: key has=%llu item has=%llu",
> + key->offset, btrfs_device_id(leaf, ditem));
> + goto error;
> + }
> +
> + /*
> + * Since btrfs device add doesn't check device size at all, we could
> + * have device item whose size is smaller than 1M which is useless, but
> + * still valid.
> + * So here we can only check the obviously wrong case.
> + */
> + if (btrfs_device_total_bytes(leaf, ditem) == 0) {
> + dev_item_err(fs_info, leaf, slot,
> + "invalid total bytes: have 0");
> + goto error;
> + }
> + if (btrfs_device_bytes_used(leaf, ditem) >
> + btrfs_device_total_bytes(leaf, ditem)) {
> + dev_item_err(fs_info, leaf, slot,
> + "invalid bytes used: have %llu expect [0, %llu]",
> + btrfs_device_bytes_used(leaf, ditem),
> + btrfs_device_total_bytes(leaf, ditem));
> + goto error;
> + }
> + /*
> + * Remaining members like io_align/type/gen/dev_group aren't really
> + * utilized.
> + * Skip them to make later usage of them easier.
> + */
> + return 0;
> +error:
> + return -EUCLEAN;
> +}
> +
> /*
> * Common point to switch the item-specific validation.
> */
> @@ -624,6 +705,9 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
> ret = btrfs_check_chunk_valid(fs_info, leaf, chunk,
> key->offset);
> break;
> + case BTRFS_DEV_ITEM_KEY:
> + ret = check_dev_item(fs_info, leaf, key, slot);
> + break;
> }
> return ret;
> }
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 0e3822870f1e..0b839ecbe73f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4964,15 +4964,6 @@ static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
> btrfs_set_fs_incompat(info, RAID56);
> }
>
> -#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \
> - - sizeof(struct btrfs_chunk)) \
> - / sizeof(struct btrfs_stripe) + 1)
> -
> -#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \
> - - 2 * sizeof(struct btrfs_disk_key) \
> - - 2 * sizeof(struct btrfs_chunk)) \
> - / sizeof(struct btrfs_stripe) + 1)
> -
> static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
> u64 start, u64 type)
> {
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index ed806649a473..481c012eae79 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -258,6 +258,15 @@ struct btrfs_fs_devices {
>
> #define BTRFS_BIO_INLINE_CSUM_SIZE 64
>
> +#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \
> + - sizeof(struct btrfs_chunk)) \
> + / sizeof(struct btrfs_stripe) + 1)
> +
> +#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \
> + - 2 * sizeof(struct btrfs_disk_key) \
> + - 2 * sizeof(struct btrfs_chunk)) \
> + / sizeof(struct btrfs_stripe) + 1)
> +
> /*
> * we need the mirror number and stripe index to be passed around
> * the call chain while we are processing end_io (especially errors).
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer
2019-03-13 8:55 ` [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer Qu Wenruo
@ 2019-03-13 9:24 ` Nikolay Borisov
0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:24 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Yoon Jungyeon
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> [BUG]
> When reading a file from a fuzzed image, kernel can panic like:
> BTRFS warning (device loop0): csum failed root 5 ino 270 off 0 csum 0x98f94189 expected csum 0x00000000 mirror 1
> assertion failed: !memcmp_extent_buffer(b, &disk_key, offsetof(struct btrfs_leaf, items[0].key), sizeof(disk_key)), file: fs/btrfs/ctree.c, line: 2544
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/ctree.h:3500!
> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> RIP: 0010:btrfs_search_slot.cold.24+0x61/0x63 [btrfs]
> Call Trace:
> btrfs_lookup_csum+0x52/0x150 [btrfs]
> __btrfs_lookup_bio_sums+0x209/0x640 [btrfs]
> btrfs_submit_bio_hook+0x103/0x170 [btrfs]
> submit_one_bio+0x59/0x80 [btrfs]
> extent_read_full_page+0x58/0x80 [btrfs]
> generic_file_read_iter+0x2f6/0x9d0
> __vfs_read+0x14d/0x1a0
> vfs_read+0x8d/0x140
> ksys_read+0x52/0xc0
> do_syscall_64+0x60/0x210
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> [CAUSE]
> The fuzzed image has a corrupted leaf whose first key doesn't match with its parent:
> checksum tree key (CSUM_TREE ROOT_ITEM 0)
> node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE
> fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
> chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
> ...
> key (EXTENT_CSUM EXTENT_CSUM 79691776) block 29761536 gen 19
>
> leaf 29761536 items 1 free space 1726 generation 19 owner CSUM_TREE
> leaf 29761536 flags 0x1(WRITTEN) backref revision 1
> fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
> chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
> item 0 key (EXTENT_CSUM EXTENT_CSUM 8798638964736) itemoff 1751 itemsize 2244
> range start 8798638964736 end 8798641262592 length 2297856
>
> When reading above tree block, we have extent_buffer->refs = 2 in the
> context:
> - initial one from __alloc_extent_buffer()
> alloc_extent_buffer()
> |- __alloc_extent_buffer()
> |- atomic_set(&eb->refs, 1)
>
> - one being added to fs_info->buffer_radix
> alloc_extent_buffer()
> |- check_buffer_tree_ref()
> |- atomic_inc(&eb->refs)
>
> So even we call free_extent_buffer() in read_tree_block or other similar
> situation, we only decrease the refs by 1, it doesn't reach 0 and won't
> be freed right now.
>
> The staled eb and its corrupted content will still be kept cached.
>
> Further more, we have several extra cases where we either don't do
> first key check or the check is not proper for all callers:
> - scrub
> We just don't have first key in this context.
>
> - shared tree block
> One tree block can be shared by several snapshot/subvolume trees.
> In that case, the first key check for one subvolume doesn't apply to
> another.
>
> So for above reasons, a corrupted extent buffer can sneak into the
> buffer cache.
>
> [FIX]
> Export verify_level_key() as btrfs_verify_level_key() and call it in
> read_block_for_search() to fill the hole.
>
> Due to above described reasons, even we can free corrupted extent buffer
> from cache, we still need the check in read_block_for_search(), for
> scrub and shared tree blocks.
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202755
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202757
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202759
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202761
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202767
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202769
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
I had already reviewed the earlier posting of this patch so:
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> fs/btrfs/ctree.c | 10 ++++++++++
> fs/btrfs/disk-io.c | 10 +++++-----
> fs/btrfs/disk-io.h | 3 +++
> 3 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 5a6c39b44c84..7672932aa5b4 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2401,6 +2401,16 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
> if (tmp) {
> /* first we do an atomic uptodate check */
> if (btrfs_buffer_uptodate(tmp, gen, 1) > 0) {
> + /*
> + * Do extra check for first_key, eb can be stale due to
> + * being cached, read from scrub, or have multiple
> + * parents (shared tree blocks).
> + */
> + if (btrfs_verify_level_key(fs_info, tmp,
> + parent_level - 1, &first_key, gen)) {
> + free_extent_buffer(tmp);
> + return -EUCLEAN;
> + }
> *eb_ret = tmp;
> return 0;
> }
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 298b34721bc0..e2a0cb362d28 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -423,9 +423,9 @@ static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info,
> return ret;
> }
>
> -static int verify_level_key(struct btrfs_fs_info *fs_info,
> - struct extent_buffer *eb, int level,
> - struct btrfs_key *first_key, u64 parent_transid)
> +int btrfs_verify_level_key(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *eb, int level,
> + struct btrfs_key *first_key, u64 parent_transid)
> {
> int found_level;
> struct btrfs_key found_key;
> @@ -500,8 +500,8 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
> if (verify_parent_transid(io_tree, eb,
> parent_transid, 0))
> ret = -EIO;
> - else if (verify_level_key(fs_info, eb, level,
> - first_key, parent_transid))
> + else if (btrfs_verify_level_key(fs_info, eb, level,
> + first_key, parent_transid))
> ret = -EUCLEAN;
> else
> break;
> diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
> index 987a64bc0c66..67a9fe2d29c7 100644
> --- a/fs/btrfs/disk-io.h
> +++ b/fs/btrfs/disk-io.h
> @@ -39,6 +39,9 @@ static inline u64 btrfs_sb_offset(int mirror)
> struct btrfs_device;
> struct btrfs_fs_devices;
>
> +int btrfs_verify_level_key(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *eb, int level,
> + struct btrfs_key *first_key, u64 parent_transid);
> struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
> u64 parent_transid, int level,
> struct btrfs_key *first_key);
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 5/6] btrfs: tree-checker: Verify inode item
2019-03-13 8:55 ` [PATCH 5/6] btrfs: tree-checker: Verify inode item Qu Wenruo
@ 2019-03-13 9:28 ` Nikolay Borisov
0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:28 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> There is a report in kernel bugzilla about mismatch file type in dir
> item and inode item.
>
> This inspires us to check inode mode in inode item.
>
> This patch will check the following members:
> - inode key objectid
> Should be ROOT_DIR_DIR or [256, (u64)-256] or FREE_INO.
>
> - inode key offset
> Should be 0
>
> - inode item generation
> - inode item transid
> No newer than sb generation + 1.
> The +1 is for log tree.
>
> - inode item mode
> No unknown bits.
> No invalid S_IF* bit.
> NOTE: S_IFMT check is not enough, need extra check for it.
>
> - inode item nlink
> Dir should have no more link than 1.
>
> - inode item flags
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/ctree.h | 2 +
> fs/btrfs/tree-checker.c | 99 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 101 insertions(+)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 7a2a2621f0d9..f002c63a34e3 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1504,6 +1504,8 @@ do { \
> #define BTRFS_INODE_COMPRESS (1 << 11)
>
> #define BTRFS_INODE_ROOT_ITEM_INIT (1 << 31)
> +#define BTRFS_INODE_FLAG_MASK (((1 << 12) - 1) |\
> + BTRFS_INODE_ROOT_ITEM_INIT)
>
> struct btrfs_map_token {
> const struct extent_buffer *eb;
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index c08609627720..4b178ef1a9e7 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -682,6 +682,102 @@ static int check_dev_item(struct btrfs_fs_info *fs_info,
> return -EUCLEAN;
> }
>
> +/* Inode item error output has the same format as dir_item_err() */
> +#define inode_item_err(fs_info, eb, slot, fmt, ...) \
> + dir_item_err(fs_info, eb, slot, fmt, __VA_ARGS__)
> +
> +static int check_inode_item(struct btrfs_fs_info *fs_info,
> + struct extent_buffer *leaf,
> + struct btrfs_key *key, int slot)
> +{
> + struct btrfs_inode_item *iitem;
> + u64 super_gen = btrfs_super_generation(fs_info->super_copy);
> + u32 valid_mask = (S_IFMT | S_ISUID | S_ISGID | S_ISVTX | 0777);
> + u32 mode;
> +
> + if ((key->objectid < BTRFS_FIRST_FREE_OBJECTID ||
> + key->objectid > BTRFS_LAST_FREE_OBJECTID) &&
> + key->objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
> + key->objectid != BTRFS_FREE_INO_OBJECTID) {
> + generic_err(fs_info, leaf, slot,
> + "invalid key objectid: has %llu expect %llu or [%llu, %llu] or %llu",
> + key->objectid, BTRFS_ROOT_TREE_DIR_OBJECTID,
> + BTRFS_FIRST_FREE_OBJECTID,
> + BTRFS_LAST_FREE_OBJECTID,
> + BTRFS_FREE_INO_OBJECTID);
> + goto error;
> + }
> + if (key->offset != 0) {
> + inode_item_err(fs_info, leaf, slot,
> + "invalid key offset: has %llu expect 0",
> + key->offset);
> + goto error;
> + }
> + iitem = btrfs_item_ptr(leaf, slot, struct btrfs_inode_item);
> +
> + /* Here we use super block generation + 1 to handle log tree */
> + if (btrfs_inode_generation(leaf, iitem) > super_gen + 1) {
> + inode_item_err(fs_info, leaf, slot,
> + "invalid inode generation: has %llu expect (0, %llu]",
> + btrfs_inode_generation(leaf, iitem),
> + super_gen + 1);
> + goto error;
> + }
> + /* Note for ROOT_TREE_DIR_ITEM, mkfs could make its transid as 0 */
> + if (btrfs_inode_transid(leaf, iitem) > super_gen + 1) {
> + inode_item_err(fs_info, leaf, slot,
> + "invalid inode generation: has %llu expect [0, %llu]",
> + btrfs_inode_transid(leaf, iitem),
> + super_gen + 1);
> + goto error;
> + }
> +
> + /*
> + * For size and nbytes it's better not to be too strict, as for dir
> + * item its size/nbytes can easily get wrong, but doesn't affect
> + * any thing of the fs. So here we skip the check.
> + */
> +
> + mode = btrfs_inode_mode(leaf, iitem);
> + if (mode & ~valid_mask) {
> + inode_item_err(fs_info, leaf, slot,
> + "unknown mode bit detected: 0x%x",
> + mode & ~valid_mask);
> + goto error;
> + }
> +
> + /*
> + * S_IFMT is not bit mapped so we can't completely rely is_power_of_2(),
> + * but is_power_of_2() can save us from checking FIFO/CHR/DIR/REG.
> + * Only needs to check BLK, LNK and SOCKS
> + */
> + if (!is_power_of_2(mode & S_IFMT)) {
> + if (!S_ISLNK(mode) && ! S_ISBLK(mode) && !S_ISSOCK(mode)) {
> + inode_item_err(fs_info, leaf, slot,
> + "invalid mode: has 0%o expect valid S_IF* bit(s)",
> + mode & S_IFMT);
> + goto error;
> + }
> + }
> + if (S_ISDIR(mode) && btrfs_inode_nlink(leaf, iitem) > 1) {
> + inode_item_err(fs_info, leaf, slot,
> + "invalid nlink: has %u expect no more than 1 for dir",
> + btrfs_inode_nlink(leaf, iitem));
> + goto error;
> + }
> + if (btrfs_inode_flags(leaf, iitem) & ~BTRFS_INODE_FLAG_MASK) {
> + inode_item_err(fs_info, leaf, slot,
> + "unknown flags detected: 0x%llx",
> + btrfs_inode_flags(leaf, iitem) &
> + ~BTRFS_INODE_FLAG_MASK);
> + goto error;
> + }
> + return 0;
> +
> +error:
> + return -EUCLEAN;
> +}
> +
> /*
> * Common point to switch the item-specific validation.
> */
> @@ -715,6 +811,9 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info,
> case BTRFS_DEV_ITEM_KEY:
> ret = check_dev_item(fs_info, leaf, key, slot);
> break;
> + case BTRFS_INODE_ITEM_KEY:
> + ret = check_inode_item(fs_info, leaf, key, slot);
> + break;
> }
> return ret;
> }
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference
2019-03-13 8:55 ` [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference Qu Wenruo
@ 2019-03-13 9:41 ` Nikolay Borisov
0 siblings, 0 replies; 18+ messages in thread
From: Nikolay Borisov @ 2019-03-13 9:41 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Yoon Jungyeon
On 13.03.19 г. 10:55 ч., Qu Wenruo wrote:
> [BUG]
> When access a file on a crafted image, btrfs can crash in block layer:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> PGD 136501067 P4D 136501067 PUD 124519067 PMD 0
> CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc8-default #252
> RIP: 0010:end_bio_extent_readpage+0x144/0x700
> Call Trace:
> <IRQ>
> blk_update_request+0x8f/0x350
> blk_mq_end_request+0x1a/0x120
> blk_done_softirq+0x99/0xc0
> __do_softirq+0xc7/0x467
> irq_exit+0xd1/0xe0
> call_function_single_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:default_idle+0x1e/0x170
>
> [CAUSE]
> The crafted image has a pretty tricky corruption, the INODE_ITEM has a
> different type against its parent dir:
> item 20 key (268 INODE_ITEM 0) itemoff 2808 itemsize 160
> generation 13 transid 13 size 1048576 nbytes 1048576
> block group 0 mode 121644 links 1 uid 0 gid 0 rdev 0
> sequence 9 flags 0x0(none)
>
> This mode number 0120000 means it's a soft link.
>
> But the dir item think it's still a regular file:
> item 8 key (264 DIR_INDEX 5) itemoff 3707 itemsize 32
> location key (268 INODE_ITEM 0) type FILE
> transid 13 data_len 0 name_len 2
> name: f4
> item 40 key (264 DIR_ITEM 51821248) itemoff 1573 itemsize 32
> location key (268 INODE_ITEM 0) type FILE
> transid 13 data_len 0 name_len 2
> name: f4
>
> For btrfs symlink, we don't set BTRFS_I(inode)->io_tree.ops and leave
> it empty, as symlink is only designed to have inlined extent, all
> handled by tree block read.
> Thus no need to trigger btrfs_submit_bio_hook() for inline file extent.
>
> However for end_bio_extent_readpage() it expects tree->ops populated, as
> it's reading regular data extent.
> This causes NULL pointer dereference.
>
> [FIX]
> This patch fixes the problem by 2 directions:
> - Verify inode mode against its dir item when looking up inode
> So in btrfs_lookup_dentry() if we find inode mode mismatch with dir
> item, we error out so that corrupted inode will not be access.
>
> - Verify inode mode when getting extent mapping
> Only regular file should have regular or preallocated extent.
> If we found regular/preallocated file extent for soft link or
> whatever, we error out before we submit read bio.
>
> With this fix that crafted image can be rejected gracefully:
> BTRFS critical (device loop0): inode mode mismatch with dir: inode mode=0121644 btrfs type=7 dir type=1
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202763
> Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/inode.c | 38 ++++++++++++++++++++++++++++--------
> fs/btrfs/tests/inode-tests.c | 1 +
> 2 files changed, 31 insertions(+), 8 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 5c349667c761..4ced641aaa8e 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -5396,12 +5396,13 @@ void btrfs_evict_inode(struct inode *inode)
> }
>
> /*
> - * this returns the key found in the dir entry in the location pointer.
> + * this returns the key found in the dir entry in the location pointer,
> + * fill @type with BTRFS_FT_*, and return 0.
> * If no dir entries were found, returns -ENOENT.
> * If found a corrupted location in dir entry, returns -EUCLEAN.
> */
> static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
> - struct btrfs_key *location)
> + struct btrfs_key *location, u8 *type)
> {
> const char *name = dentry->d_name.name;
> int namelen = dentry->d_name.len;
> @@ -5430,6 +5431,8 @@ static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
> __func__, name, btrfs_ino(BTRFS_I(dir)),
> location->objectid, location->type, location->offset);
> }
> + if (!ret)
> + *type = btrfs_dir_type(path->nodes[0], di);
> out:
> btrfs_free_path(path);
> return ret;
> @@ -5667,6 +5670,11 @@ static struct inode *new_simple_dir(struct super_block *s,
> return inode;
> }
>
> +static inline u8 btrfs_inode_type(struct inode *inode)
> +{
> + return btrfs_type_by_mode[(inode->i_mode & S_IFMT) >> S_SHIFT];
> +}
> +
> struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
> {
> struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> @@ -5674,18 +5682,29 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
> struct btrfs_root *root = BTRFS_I(dir)->root;
> struct btrfs_root *sub_root = root;
> struct btrfs_key location;
> + u8 di_type = 0;
> int index;
> int ret = 0;
>
> if (dentry->d_name.len > BTRFS_NAME_LEN)
> return ERR_PTR(-ENAMETOOLONG);
>
> - ret = btrfs_inode_by_name(dir, dentry, &location);
> + ret = btrfs_inode_by_name(dir, dentry, &location, &di_type);
> if (ret < 0)
> return ERR_PTR(ret);
>
> if (location.type == BTRFS_INODE_ITEM_KEY) {
> inode = btrfs_iget(dir->i_sb, &location, root, NULL);
> +
> + /* Do extra check against inode mode with di_type */
> + if (btrfs_inode_type(inode) != di_type) {
> + btrfs_crit(fs_info,
> +"inode mode mismatch with dir: inode mode=0%o btrfs type=%u dir type=%u",
> + inode->i_mode, btrfs_inode_type(inode),
> + di_type);
> + iput(inode);
> + return ERR_PTR(-EUCLEAN);
> + }
> return inode;
> }
>
> @@ -6290,11 +6309,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
> return ERR_PTR(ret);
> }
>
> -static inline u8 btrfs_inode_type(struct inode *inode)
> -{
> - return btrfs_type_by_mode[(inode->i_mode & S_IFMT) >> S_SHIFT];
> -}
> -
> /*
> * utility function to add 'inode' into 'parent_inode' with
> * a give name and a given sequence number.
> @@ -6816,6 +6830,14 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
> extent_start = found_key.offset;
> if (found_type == BTRFS_FILE_EXTENT_REG ||
> found_type == BTRFS_FILE_EXTENT_PREALLOC) {
> + /* Only regular file could have regular/prealloc extent */
> + if (!S_ISREG(inode->vfs_inode.i_mode)) {
> + ret = -EUCLEAN;
> + btrfs_crit(fs_info,
> + "regular/prealloc extent found for non-regular inode %llu",
> + btrfs_ino(inode));
> + goto out;
> + }
> extent_end = extent_start +
> btrfs_file_extent_num_bytes(leaf, item);
>
> diff --git a/fs/btrfs/tests/inode-tests.c b/fs/btrfs/tests/inode-tests.c
> index af0c8e30d9e2..b01dbcab9eed 100644
> --- a/fs/btrfs/tests/inode-tests.c
> +++ b/fs/btrfs/tests/inode-tests.c
> @@ -232,6 +232,7 @@ static noinline int test_btrfs_get_extent(u32 sectorsize, u32 nodesize)
> return ret;
> }
>
> + inode->i_mode = S_IFREG;
> BTRFS_I(inode)->location.type = BTRFS_INODE_ITEM_KEY;
> BTRFS_I(inode)->location.objectid = BTRFS_FIRST_FREE_OBJECTID;
> BTRFS_I(inode)->location.offset = 0;
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/6] btrfs: tree-checker: Verify chunk items
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
2019-03-13 9:19 ` Nikolay Borisov
@ 2019-03-19 14:50 ` David Sterba
2019-03-20 0:46 ` Qu Wenruo
1 sibling, 1 reply; 18+ messages in thread
From: David Sterba @ 2019-03-19 14:50 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs, Yoon Jungyeon
On Wed, Mar 13, 2019 at 04:55:06PM +0800, Qu Wenruo wrote:
> We already have btrfs_check_chunk_valid() to verify each chunk before
> tree-checker.
>
> Merge that function into tree-checker, and update its error message to
> be more readable.
>
> Old error message would be something like:
> BTRFS error (device dm-3): invalid chunk num_stipres: 0
>
> New error message would be:
> Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
> Or
> Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0
>
> Btrfs_check_chunk_valid() is exported for super block syschunk array
> verification.
>
> Also make tree-checker to verify chunk items, this makes chunk item
> checker covers all chunks and avoid fuzzed image.
>
> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202751
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/tree-checker.c | 152 ++++++++++++++++++++++++++++++++++++++++
> fs/btrfs/tree-checker.h | 3 +
> fs/btrfs/volumes.c | 94 +------------------------
> 3 files changed, 156 insertions(+), 93 deletions(-)
Please split the patch to part where you just move the code and where
the logic is changed. btrfs_check_chunk_valid is not the same, old has
-EIO and new -EUCLEAN. Moving a function is ok in the same patch if
there's no change.
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index b8cdaf472031..fe3f37c23c29 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -448,6 +448,152 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
> return 0;
> }
>
> +__printf(5, 6)
> +__cold
> +static void chunk_err(const struct btrfs_fs_info *fs_info,
> + const struct extent_buffer *leaf,
> + const struct btrfs_chunk *chunk, u64 logical,
> + const char *fmt, ...)
> +{
> + /* Only superblock eb is able to have such small offset */
> + bool is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET);
Please move the initialization and comment out of the declaration block
> + struct va_format vaf;
> + va_list args;
> + int i;
> + int slot = -1;
> +
> + if (!is_sb) {
> + /*
> + * Get the slot number by iterating through all slots, this
> + * would provide better readability.
> + */
> + for (i = 0; i < btrfs_header_nritems(leaf); i++) {
> + if (btrfs_item_ptr_offset(leaf, i) ==
> + (unsigned long) chunk) {
> + slot = i;
> + break;
> + }
> + }
> + }
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack
2019-03-13 9:01 ` [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
@ 2019-03-19 15:34 ` David Sterba
0 siblings, 0 replies; 18+ messages in thread
From: David Sterba @ 2019-03-19 15:34 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs
On Wed, Mar 13, 2019 at 05:01:12PM +0800, Qu Wenruo wrote:
> Just forgot the repo:
>
> It can be fetched from github:
> https://github.com/adam900710/linux/tree/tree_checker_enhancement
> Which is based on my previous write time tree checker patchset.
>
> Although the patchset itself can also be applied to v5.0-rc7 tag without
> manual modification.
Please base the next iteration on 5.1-rc1, thanks. There are about 100
patches since 5.0-rc7 and merge conflicts start to show up.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/6] btrfs: tree-checker: Verify chunk items
2019-03-19 14:50 ` David Sterba
@ 2019-03-20 0:46 ` Qu Wenruo
2019-03-20 5:03 ` Qu Wenruo
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2019-03-20 0:46 UTC (permalink / raw)
To: dsterba, Qu Wenruo, linux-btrfs, Yoon Jungyeon
[-- Attachment #1.1: Type: text/plain, Size: 3150 bytes --]
On 2019/3/19 下午10:50, David Sterba wrote:
> On Wed, Mar 13, 2019 at 04:55:06PM +0800, Qu Wenruo wrote:
>> We already have btrfs_check_chunk_valid() to verify each chunk before
>> tree-checker.
>>
>> Merge that function into tree-checker, and update its error message to
>> be more readable.
>>
>> Old error message would be something like:
>> BTRFS error (device dm-3): invalid chunk num_stipres: 0
>>
>> New error message would be:
>> Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
>> Or
>> Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0
>>
>> Btrfs_check_chunk_valid() is exported for super block syschunk array
>> verification.
>>
>> Also make tree-checker to verify chunk items, this makes chunk item
>> checker covers all chunks and avoid fuzzed image.
>>
>> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202751
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/tree-checker.c | 152 ++++++++++++++++++++++++++++++++++++++++
>> fs/btrfs/tree-checker.h | 3 +
>> fs/btrfs/volumes.c | 94 +------------------------
>> 3 files changed, 156 insertions(+), 93 deletions(-)
>
> Please split the patch to part where you just move the code and where
> the logic is changed. btrfs_check_chunk_valid is not the same, old has
> -EIO and new -EUCLEAN. Moving a function is ok in the same patch if
> there's no change.
There is change in the verification timing by just moving the code to
tree checker.
The new timing of chunk verification will make btrfs more robust by
trying the other copy when content doesn't make sense.
In fact the code move itself would solve the bug in the kernel bugzilla.
So It doesn't make that much sense to split the patch.
Thanks,
Qu
>
>>
>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>> index b8cdaf472031..fe3f37c23c29 100644
>> --- a/fs/btrfs/tree-checker.c
>> +++ b/fs/btrfs/tree-checker.c
>> @@ -448,6 +448,152 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
>> return 0;
>> }
>>
>> +__printf(5, 6)
>> +__cold
>> +static void chunk_err(const struct btrfs_fs_info *fs_info,
>> + const struct extent_buffer *leaf,
>> + const struct btrfs_chunk *chunk, u64 logical,
>> + const char *fmt, ...)
>> +{
>> + /* Only superblock eb is able to have such small offset */
>> + bool is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET);
>
> Please move the initialization and comment out of the declaration block
>
>> + struct va_format vaf;
>> + va_list args;
>> + int i;
>> + int slot = -1;
>> +
>> + if (!is_sb) {
>> + /*
>> + * Get the slot number by iterating through all slots, this
>> + * would provide better readability.
>> + */
>> + for (i = 0; i < btrfs_header_nritems(leaf); i++) {
>> + if (btrfs_item_ptr_offset(leaf, i) ==
>> + (unsigned long) chunk) {
>> + slot = i;
>> + break;
>> + }
>> + }
>> + }
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/6] btrfs: tree-checker: Verify chunk items
2019-03-20 0:46 ` Qu Wenruo
@ 2019-03-20 5:03 ` Qu Wenruo
0 siblings, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2019-03-20 5:03 UTC (permalink / raw)
To: dsterba, Qu Wenruo, linux-btrfs, Yoon Jungyeon
[-- Attachment #1.1: Type: text/plain, Size: 3483 bytes --]
On 2019/3/20 上午8:46, Qu Wenruo wrote:
>
>
> On 2019/3/19 下午10:50, David Sterba wrote:
>> On Wed, Mar 13, 2019 at 04:55:06PM +0800, Qu Wenruo wrote:
>>> We already have btrfs_check_chunk_valid() to verify each chunk before
>>> tree-checker.
>>>
>>> Merge that function into tree-checker, and update its error message to
>>> be more readable.
>>>
>>> Old error message would be something like:
>>> BTRFS error (device dm-3): invalid chunk num_stipres: 0
>>>
>>> New error message would be:
>>> Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0
>>> Or
>>> Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0
>>>
>>> Btrfs_check_chunk_valid() is exported for super block syschunk array
>>> verification.
>>>
>>> Also make tree-checker to verify chunk items, this makes chunk item
>>> checker covers all chunks and avoid fuzzed image.
>>>
>>> Reported-by: Yoon Jungyeon <jungyeon@gatech.edu>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202751
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>> fs/btrfs/tree-checker.c | 152 ++++++++++++++++++++++++++++++++++++++++
>>> fs/btrfs/tree-checker.h | 3 +
>>> fs/btrfs/volumes.c | 94 +------------------------
>>> 3 files changed, 156 insertions(+), 93 deletions(-)
>>
>> Please split the patch to part where you just move the code and where
>> the logic is changed. btrfs_check_chunk_valid is not the same, old has
>> -EIO and new -EUCLEAN. Moving a function is ok in the same patch if
>> there's no change.
>
> There is change in the verification timing by just moving the code to
> tree checker.
My bad, I though the code move and verification timing must be done in
one patch, and that's definitely wrong.
I'll split the code move and logic change.
Thanks,
Qu
>
> The new timing of chunk verification will make btrfs more robust by
> trying the other copy when content doesn't make sense.
>
> In fact the code move itself would solve the bug in the kernel bugzilla.
>
> So It doesn't make that much sense to split the patch.
>
> Thanks,
> Qu
>
>>
>>>
>>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>>> index b8cdaf472031..fe3f37c23c29 100644
>>> --- a/fs/btrfs/tree-checker.c
>>> +++ b/fs/btrfs/tree-checker.c
>>> @@ -448,6 +448,152 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
>>> return 0;
>>> }
>>>
>>> +__printf(5, 6)
>>> +__cold
>>> +static void chunk_err(const struct btrfs_fs_info *fs_info,
>>> + const struct extent_buffer *leaf,
>>> + const struct btrfs_chunk *chunk, u64 logical,
>>> + const char *fmt, ...)
>>> +{
>>> + /* Only superblock eb is able to have such small offset */
>>> + bool is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET);
>>
>> Please move the initialization and comment out of the declaration block
>>
>>> + struct va_format vaf;
>>> + va_list args;
>>> + int i;
>>> + int slot = -1;
>>> +
>>> + if (!is_sb) {
>>> + /*
>>> + * Get the slot number by iterating through all slots, this
>>> + * would provide better readability.
>>> + */
>>> + for (i = 0; i < btrfs_header_nritems(leaf); i++) {
>>> + if (btrfs_item_ptr_offset(leaf, i) ==
>>> + (unsigned long) chunk) {
>>> + slot = i;
>>> + break;
>>> + }
>>> + }
>>> + }
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2019-03-20 5:04 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-13 8:55 [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
2019-03-13 8:55 ` [PATCH 1/6] btrfs: tree-checker: Verify chunk items Qu Wenruo
2019-03-13 9:19 ` Nikolay Borisov
2019-03-19 14:50 ` David Sterba
2019-03-20 0:46 ` Qu Wenruo
2019-03-20 5:03 ` Qu Wenruo
2019-03-13 8:55 ` [PATCH 2/6] btrfs: tree-checker: Verify dev item Qu Wenruo
2019-03-13 9:19 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 3/6] btrfs: Check the first key and level for cached extent buffer Qu Wenruo
2019-03-13 9:24 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 4/6] btrfs: tree-checker: Enhance chunk checker to validate chunk profiler Qu Wenruo
2019-03-13 9:18 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 5/6] btrfs: tree-checker: Verify inode item Qu Wenruo
2019-03-13 9:28 ` Nikolay Borisov
2019-03-13 8:55 ` [PATCH 6/6] btrfs: inode: Verify inode mode to avoid NULL pointer dereference Qu Wenruo
2019-03-13 9:41 ` Nikolay Borisov
2019-03-13 9:01 ` [PATCH 0/6] btrfs: Enhance tree checker and runtime checker to handle the new wave of fuzzed image attack Qu Wenruo
2019-03-19 15:34 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).