Linux Btrfs filesystem development
From: David Sterba <dsterba@suse.cz>
To: Sasha Levin <sashal@kernel.org>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH AUTOSEL 5.11 03/12] btrfs: subpage: fix the false data csum mismatch error
Date: Mon, 8 Mar 2021 16:43:26 +0100
Message-ID: <20210308154326.GB7604@twin.jikos.cz>
In-Reply-To: <20210307135746.967418-3-sashal@kernel.org>

On Sun, Mar 07, 2021 at 08:57:37AM -0500, Sasha Levin wrote:
> From: Qu Wenruo <wqu@suse.com>
> 
> [ Upstream commit c28ea613fafad910d08f67efe76ae552b1434e44 ]
> 
> [BUG]
> When running fsstress, we can hit a strange data csum mismatch where the
> on-disk data is in fact correct (it passes scrub).
> 
> With some extra debug info added, we have the following traces:
> 
>   0482us: btrfs_do_readpage: root=5 ino=284 offset=393216, submit force=0 pgoff=0 iosize=8192
>   0494us: btrfs_do_readpage: root=5 ino=284 offset=401408, submit force=0 pgoff=8192 iosize=4096
>   0498us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=393216 len=8192
>   0591us: btrfs_do_readpage: root=5 ino=284 offset=405504, submit force=0 pgoff=12288 iosize=36864
>   0594us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=401408 len=4096
>   0863us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=405504 len=36864
>   0933us: btrfs_verify_data_csum: root=5 ino=284 offset=393216 len=8192
>   0967us: btrfs_do_readpage: root=5 ino=284 offset=442368, skip beyond isize pgoff=49152 iosize=16384
>   1047us: btrfs_verify_data_csum: root=5 ino=284 offset=401408 len=4096
>   1163us: btrfs_verify_data_csum: root=5 ino=284 offset=405504 len=36864
>   1290us: check_data_csum: !!! root=5 ino=284 offset=438272 pg_off=45056 !!!
>   7387us: end_bio_extent_readpage: root=5 ino=284 before pending_read_bios=0
> 
> [CAUSE]
> Normally we expect all submitted bio reads to only touch the range we
> specified, and under subpage context, it means we should only touch the
> range specified in each bvec.
> 
> But in the data read path, inside end_bio_extent_readpage(), the page
> zeroing only takes the regular page size into consideration.
> 
> This means for subpage if we have an inode whose content looks like below:
> 
>   0       16K     32K     48K     64K
>   |///////|       |///////|       |
> 
>   |//| = data needs to be read from disk
>   |  | = hole
> 
> And i_size is 64K initially.
> 
> Then the following race can happen:
> 
> 		T1		|		T2
> --------------------------------+--------------------------------
> btrfs_do_readpage()		|
> |- isize = 64K;			|
> |  At this time, the isize is 	|
> |  64K				|
> |				|
> |- submit_extent_page()		|
> |  submit previous assembled bio|
> |  assemble bio for [0, 16K)	|
> |				|
> |- submit_extent_page()		|
>    submit read bio for [0, 16K) |
>    assemble read bio for	|
>    [32K, 48K)			|
>  				|
> 				| btrfs_setsize()
> 				| |- i_size_write(, 16K);
> 				|    Now i_size is only 16K
> end_io() for [0K, 16K)		|
> |- end_bio_extent_readpage()	|
>    |- btrfs_verify_data_csum()  |
>    |  No csum error		|
>    |- i_size = 16K;		|
>    |- zero_user_segment(16K,	|
>       PAGE_SIZE);		|
>       !!! We zeroed range	|
>       !!! [32K, 48K)		|
> 				| end_io for [32K, 48K)
> 				| |- end_bio_extent_readpage()
> 				|    |- btrfs_verify_data_csum()
> 				|       ! CSUM MISMATCH !
> 				|       ! As the range is zeroed now !
> 
> [FIX]
> To fix the problem, make end_bio_extent_readpage() only zero the
> range of the bvec.
> 
> The bug only affects subpage read-write support; on a fully read-only
> mount i_size can't change, so the race condition can't be hit.

Please drop this patch from autosel because of the above: the bug is in a
feature that's still in progress and it does not affect regular filesystems.
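
For readers following the quoted race, here is a minimal userspace sketch of
the difference between zeroing to the end of the page and zeroing only within
the bvec. It is not the kernel code: SIM_PAGE_SIZE, zero_whole_page_tail(),
zero_bvec_tail() and second_extent_survives() are hypothetical names made up
for this illustration, and the 64K page with data at [0, 16K) and [32K, 48K)
is taken from the diagram in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE (64u * 1024u)   /* assumed 64K page, as in the diagram */
#define SIM_DATA      0xAB            /* stand-in for data read from disk */

/* Behaviour described under [CAUSE]: zero from i_size to the end of the
 * page, without looking at which range the finished bvec covers. */
static void zero_whole_page_tail(unsigned char *page, size_t i_size)
{
    if (i_size < SIM_PAGE_SIZE)
        memset(page + i_size, 0, SIM_PAGE_SIZE - i_size);
}

/* Behaviour described under [FIX]: zero only the part of the finished
 * bvec [bvec_off, bvec_off + bvec_len) that lies beyond i_size. */
static void zero_bvec_tail(unsigned char *page, size_t bvec_off,
                           size_t bvec_len, size_t i_size)
{
    size_t bvec_end = bvec_off + bvec_len;
    size_t start = i_size > bvec_off ? i_size : bvec_off;

    assert(bvec_end <= SIM_PAGE_SIZE);
    if (start < bvec_end)
        memset(page + start, 0, bvec_end - start);
}

/* Replays the end_io of the [0, 16K) read after i_size shrank to 16K and
 * reports whether the data at [32K, 48K) is still intact afterwards. */
int second_extent_survives(int clamp_to_bvec)
{
    static unsigned char page[SIM_PAGE_SIZE];
    const size_t k16 = 16u * 1024u;
    size_t i;

    memset(page, 0, sizeof(page));
    memset(page, SIM_DATA, k16);            /* data at [0, 16K)   */
    memset(page + 2 * k16, SIM_DATA, k16);  /* data at [32K, 48K) */

    if (clamp_to_bvec)
        zero_bvec_tail(page, 0, k16, k16);
    else
        zero_whole_page_tail(page, k16);

    for (i = 2 * k16; i < 3 * k16; i++)
        if (page[i] != SIM_DATA)
            return 0;   /* a later csum check on this range would fail */
    return 1;
}
```

Calling second_extent_survives(0) models the unclamped zeroing and returns 0,
because [32K, 48K) was wiped before its own checksum could be verified.
second_extent_survives(1) returns 1, since nothing outside [0, 16K) is touched.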

