Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, dsterba@suse.com
Subject: Re: [PATCH v2] btrfs: fix u32 overflows when left shifting @stripe_nr
Date: Tue, 20 Jun 2023 12:27:43 +0200	[thread overview]
Message-ID: <20230620102743.GI16168@twin.jikos.cz> (raw)
In-Reply-To: <1974782b207e7011a859a45115cf4875475204dc.1687254779.git.wqu@suse.com>

On Tue, Jun 20, 2023 at 05:57:31PM +0800, Qu Wenruo wrote:
> [BUG]
> David reported an ASSERT() get triggered during certain fio load.
> 
> The ASSERT() is from rbio_add_bio() of raid56.c:
> 
> 	ASSERT(orig_logical >= full_stripe_start &&
> 	       orig_logical + orig_len <= full_stripe_start +
> 	       rbio->nr_data * BTRFS_STRIPE_LEN);
> 
> Which is checking if the target rbio is crossing the full stripe
> boundary.
> 
> [CAUSE]
> Commit a97699d1d610 ("btrfs: replace map_lookup->stripe_len by
> BTRFS_STRIPE_LEN") changes how we calculate the map length, to reduce
> u64 division.
> 
> Function btrfs_max_io_len() is to get the length to the stripe boundary.
> 
> It calculates the full stripe start offset (inside the chunk) by the
> following command:
> 
> 		*full_stripe_start =
> 			rounddown(*stripe_nr, nr_data_stripes(map)) <<
> 			BTRFS_STRIPE_LEN_SHIFT;
> 
> The calculation itself is fine, but the value returned by rounddown() is
> dependent on both @stripe_nr (which is u32) and nr_data_stripes() (which
> returned int).
> 
> Thus the result is also u32, then we do the left shift, which can
> overflow u32.
> 
> If such overflow happens, @full_stripe_start will be a value way smaller
> than @offset, causing later "full_stripe_len - (offset -
> *full_stripe_start)" to underflow, thus make later length calculation to
> have no stripe boundary limit, resulting a write bio to exceed stripe
> boundary.
> 
> There are some other locations like this, with a u32 @stripe_nr got left
> shift, which can lead to a similar overflow.
> 
> [FIX]
> Fix all @stripe_nr with left shift with a type cast to u64 before the
> left shift.
> 
> Those involved @stripe_nr or similar variables are recording the stripe
> number inside the chunk, which is small enough to be contained by u32,
> but their offset inside the chunk can not fit into u32.
> 
> Thus for those specific left shifts, a type cast to u64 is necessary.
> 
> Reported-by: David Sterba <dsterba@suse.com>
> Fixes: a97699d1d610 ("btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Fix all @stripe_nr with left shift
> - Apply the ASSERT() on full stripe checks for all RAID56 IOs.
> ---
>  fs/btrfs/volumes.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index b8540af6e136..ed3765d21cb0 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -5985,12 +5985,12 @@ struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
>  	stripe_nr = offset >> BTRFS_STRIPE_LEN_SHIFT;
>  
>  	/* stripe_offset is the offset of this block in its stripe */
> -	stripe_offset = offset - (stripe_nr << BTRFS_STRIPE_LEN_SHIFT);
> +	stripe_offset = offset - ((u64)stripe_nr << BTRFS_STRIPE_LEN_SHIFT);

This needs a helper, mandating a type cast for correctness in so many
places is a bad pattern.

>  
>  	stripe_nr_end = round_up(offset + length, BTRFS_STRIPE_LEN) >>
>  			BTRFS_STRIPE_LEN_SHIFT;
>  	stripe_cnt = stripe_nr_end - stripe_nr;
> -	stripe_end_offset = (stripe_nr_end << BTRFS_STRIPE_LEN_SHIFT) -
> +	stripe_end_offset = ((u64)stripe_nr_end << BTRFS_STRIPE_LEN_SHIFT) -
>  			    (offset + length);
>  	/*
>  	 * after this, stripe_nr is the number of stripes on this
> @@ -6033,7 +6033,7 @@ struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
>  	for (i = 0; i < *num_stripes; i++) {
>  		stripes[i].physical =
>  			map->stripes[stripe_index].physical +
> -			stripe_offset + (stripe_nr << BTRFS_STRIPE_LEN_SHIFT);
> +			stripe_offset + ((u64)stripe_nr << BTRFS_STRIPE_LEN_SHIFT);
>  		stripes[i].dev = map->stripes[stripe_index].dev;
>  
>  		if (map->type & (BTRFS_BLOCK_GROUP_RAID0 |
> @@ -6199,15 +6199,18 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op,
>  		 * not ensured to be power of 2.
>  		 */
>  		*full_stripe_start =
> -			rounddown(*stripe_nr, nr_data_stripes(map)) <<
> +			(u64)rounddown(*stripe_nr, nr_data_stripes(map)) <<
>  			BTRFS_STRIPE_LEN_SHIFT;
>  
> +		ASSERT(*full_stripe_start + full_stripe_len > offset);
> +		ASSERT(*full_stripe_start <= offset);
>  		/*
>  		 * For writes to RAID56, allow to write a full stripe set, but
>  		 * no straddling of stripe sets.
>  		 */
> -		if (op == BTRFS_MAP_WRITE)
> +		if (op == BTRFS_MAP_WRITE) {
>  			return full_stripe_len - (offset - *full_stripe_start);
> +		}

No { }

>  	}

  reply	other threads:[~2023-06-20 10:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-20  9:57 [PATCH v2] btrfs: fix u32 overflows when left shifting @stripe_nr Qu Wenruo
2023-06-20 10:27 ` David Sterba [this message]
2023-06-20 11:24   ` Qu Wenruo
2023-06-20 11:56     ` David Sterba
2023-06-20 12:05       ` Qu Wenruo
2023-06-20 18:27         ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230620102743.GI16168@twin.jikos.cz \
    --to=dsterba@suse.cz \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox