From: Qu Wenruo <wqu@suse.com>
To: Nikolay Borisov <nborisov@suse.com>, linux-btrfs@vger.kernel.org
Cc: Filipe Manana <fdmanana@suse.com>
Subject: Re: [PATCH v4] btrfs: trim: fix underflow in trim length to prevent access beyond device boundary
Date: Tue, 11 Aug 2020 16:46:57 +0800
Message-ID: <269982fa-0174-5816-3a23-37912737abc9@suse.com>
In-Reply-To: <5cda2c95-e407-8b11-e206-20c4aac5d48b@suse.com>

On 2020/8/11 4:41 PM, Nikolay Borisov wrote:
>
>
> On 31.07.20 14:29, Qu Wenruo wrote:
>> [BUG]
>> The following script can lead to many accesses beyond the device boundary:
>>
>> mkfs.btrfs -f $dev -b 10G
>> mount $dev $mnt
>> trimfs $mnt
>> btrfs filesystem resize 1:-1G $mnt
>> trimfs $mnt
>>
>> [CAUSE]
>> Since commit 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to
>> find_first_clear_extent_bit"), we try to avoid trimming ranges that are
>> already trimmed.
>>
>> So we check device->alloc_state by finding the first range which has
>> neither CHUNK_TRIMMED nor CHUNK_ALLOCATED set.
>>
>> But if we shrink the device, those bits are not cleared, thus we could
>> easily get a range that starts beyond the shrunk device size.
>>
>> As a result, the returned @start and @end are both beyond the device
>> size. Then we call "end = min(end, device->total_bytes - 1);", making
>> @end smaller than the device size.
>>
>> Finally we compute "len = end - start + 1", which underflows and leads
>> to the beyond-device-boundary access.
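
For illustration only (not part of the patch): a minimal userspace sketch
of the arithmetic described above. The helper names min_u64() and
trim_len() are hypothetical; only the two quoted expressions come from the
commit message. It assumes the values are unsigned 64-bit, as u64 is in
the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's min() on two u64 values. */
static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Clamp @end to the last byte of the device, then compute the length the
 * same way the commit message quotes it. The arithmetic is unsigned, so
 * when the clamped @end is below @start the result wraps around instead
 * of going negative.
 */
static uint64_t trim_len(uint64_t start, uint64_t end, uint64_t total_bytes)
{
    end = min_u64(end, total_bytes - 1);
    return end - start + 1;
}
```

With a device shrunk to 9G and a stale range starting at 9.5G, the clamped
@end (9G - 1) is below @start, so the length wraps to just under 2^64
instead of being rejected.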
>>
>> [FIX]
>> This patch will fix the problem in two ways:
>> - Clear CHUNK_TRIMMED | CHUNK_ALLOCATED bits when shrinking device
>>   This is the root fix
>>
>> - Add extra safety net when trimming free device extents
>>   We check and warn if the returned range is already beyond current
>>   device.
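
For illustration only (not part of the patch): a toy userspace model of
the two parts of the fix. Everything here is hypothetical: the TOY_* flags
merely stand in for CHUNK_TRIMMED and CHUNK_ALLOCATED, and a flat array of
fixed-size slots stands in for device->alloc_state, which in the kernel is
an extent io tree rather than an array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TRIMMED   (1u << 0) /* stands in for CHUNK_TRIMMED */
#define TOY_ALLOCATED (1u << 1) /* stands in for CHUNK_ALLOCATED */

/* Root fix: on shrink, drop both bits on every slot at or past the new end. */
static void toy_shrink(unsigned int *slots, size_t nslots, size_t new_nslots)
{
    for (size_t i = new_nslots; i < nslots; i++)
        slots[i] &= ~(TOY_TRIMMED | TOY_ALLOCATED);
}

/* Safety net: a candidate range that starts past the device end is skipped. */
static bool toy_range_beyond_device(uint64_t start, uint64_t total_bytes)
{
    return start > total_bytes;
}
```

The first helper mirrors clearing the bits from new_size to the end of the
address space; the second mirrors the "start > device->total_bytes" check
added to the trim loop.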
>>
>> Link: https://github.com/kdave/btrfs-progs/issues/282
>> Fixes: 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Add proper fixes tag
>> - Add extra warning for beyond device end case
>> - Add graceful exit for already trimmed case
>> v3:
>> - Don't return EUCLEAN for beyond boundary access
>> - Rephrase the warning message for beyond boundary access
>> v4:
>> - Remove one duplicated check on exiting the trim loop
>> ---
>> fs/btrfs/extent-tree.c | 14 ++++++++++++++
>> fs/btrfs/volumes.c | 12 ++++++++++++
>> 2 files changed, 26 insertions(+)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index fa7d83051587..6b1b5dfba4b3 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -33,6 +33,7 @@
>> #include "delalloc-space.h"
>> #include "block-group.h"
>> #include "discard.h"
>> +#include "rcu-string.h"
>>
>> #undef SCRAMBLE_DELAYED_REFS
>>
>> @@ -5669,6 +5670,19 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
>> &start, &end,
>> CHUNK_TRIMMED | CHUNK_ALLOCATED);
>>
>> + /* CHUNK_* bits not cleared properly */
>> + if (start > device->total_bytes) {
>> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> + btrfs_warn_in_rcu(fs_info,
>> +"ignoring attempt to trim beyond device size: offset %llu length %llu device %s device size %llu",
>> + start, end - start + 1,
>> + rcu_str_deref(device->name),
>> + device->total_bytes);
>> + mutex_unlock(&fs_info->chunk_mutex);
>> + ret = 0;
>> + break;
>> + }
>
> Isn't this a NOOP, since the latter hunk ensures we can never cross
> device->total_bytes? Since this is a purely defensive mechanism and,
> following this patch, we *should* never have CHUNK_* bits set beyond
> device->total_bytes, I'd say make this an ASSERT(). Otherwise you force
> people to pay the cost of the check for every trim ...

I'm fine with the ASSERT() idea.

But on the other hand, we really don't know how things can go wrong, and
such a graceful exit makes it much easier for us to expose and fix bugs
when this happens on a production system.

So currently I'm 50-50 on changing it to ASSERT().

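
For illustration only: a userspace sketch of the two options being
weighed. The toy_* names are hypothetical, and the standard assert() with
NDEBUG is used here only as an analogy for a check that is compiled out of
production builds; it is not the btrfs ASSERT() macro itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum toy_action { TOY_TRIM, TOY_STOP };

/* Graceful variant: always compiled in; warns and asks the caller to stop. */
static enum toy_action toy_check_graceful(uint64_t start, uint64_t total_bytes)
{
    if (start > total_bytes) {
        fprintf(stderr,
                "ignoring attempt to trim beyond device size: offset %llu device size %llu\n",
                (unsigned long long)start,
                (unsigned long long)total_bytes);
        return TOY_STOP;
    }
    return TOY_TRIM;
}

/* Assert variant: aborts in debug builds; with NDEBUG the check vanishes. */
static enum toy_action toy_check_assert(uint64_t start, uint64_t total_bytes)
{
    assert(start <= total_bytes);
    (void)start;
    (void)total_bytes;
    return TOY_TRIM;
}
```

The graceful variant costs one comparison per call but still reports the
problem on a production build; the assert variant is free when compiled
out, but then a stale range goes unnoticed.
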
Thanks,
Qu
>
>
>> +
>> /* Ensure we skip the reserved area in the first 1M */
>> start = max_t(u64, start, SZ_1M);
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index d7670e2a9f39..4e51ef68ea72 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -4720,6 +4720,18 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
>> }
>>
>> mutex_lock(&fs_info->chunk_mutex);
>> + /*
>> + * Also clear any CHUNK_TRIMMED and CHUNK_ALLOCATED bits beyond the
>> + * current device boundary.
>> + * This shouldn't fail, as alloc_state should only utilize those two
>> + * bits, thus we shouldn't alloc new memory for clearing the status.
>> + *
>> + * So here we just do an ASSERT() to catch future behavior change.
>> + */
>> + ret = clear_extent_bits(&device->alloc_state, new_size, (u64)-1,
>> + CHUNK_TRIMMED | CHUNK_ALLOCATED);
>> + ASSERT(!ret);
>
> I agree with this part.
>
>> +
>> btrfs_device_set_disk_total_bytes(device, new_size);
>> if (list_empty(&device->post_commit_list))
>> list_add_tail(&device->post_commit_list,
>>
>