[RFC PATCH] btrfs: scrub: don't call calc_sector_number on failed bios

From: Johannes Thumshirn <jth@kernel.org>
To: linux-btrfs@vger.kernel.org
Cc: David Sterba <dsterba@suse.com>,
	Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
	Johannes Thumshirn <jth@kernel.org>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>
Subject: [RFC PATCH] btrfs: scrub: don't call calc_sector_number on failed bios
Date: Mon, 10 Jun 2024 16:42:29 +0200
Message-ID: <20240610144229.26373-1-jth@kernel.org>

When running fstests' test case btrfs/060 with MKFS_OPTIONS="-O rst"
to force the creation of a RAID stripe tree even on regular devices, I
hit the following ASSERT() in calc_sector_number():

    ASSERT(i < stripe->nr_sectors);

This ASSERT() is triggered because we cannot find the page from the
passed-in bio_vec in the scrub_stripe's pages[] array.
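
To illustrate why a missing page trips the ASSERT(), here is a minimal
userspace sketch, not the kernel code. It assumes calc_sector_number()
does a linear search of the stripe's pages[] array for the bio_vec's
page; model_page, model_stripe, MODEL_NR_SECTORS and model_find_page()
are hypothetical stand-ins introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_SECTORS 16	/* hypothetical; matches the 16 pages in the dump */

/* Hypothetical stand-in for struct page; only its address matters here. */
struct model_page {
	int unused;
};

/* Hypothetical stand-in for struct scrub_stripe with just the used fields. */
struct model_stripe {
	struct model_page *pages[MODEL_NR_SECTORS];
	int nr_sectors;
};

/*
 * Linear search for @page in stripe->pages[].
 * Returns the matching index, or stripe->nr_sectors when nothing matches.
 */
static int model_find_page(const struct model_stripe *stripe,
			   const struct model_page *page)
{
	int i;

	/* The search must stay inside the pages[] array. */
	assert(stripe->nr_sectors <= MODEL_NR_SECTORS);

	for (i = 0; i < stripe->nr_sectors; i++) {
		if (stripe->pages[i] == page)
			break;
	}
	return i;
}
```

In this model, searching for a NULL page in a stripe whose searched
slots are all populated returns stripe->nr_sectors, which is exactly the
value that a check of the form "i < stripe->nr_sectors" rejects.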

Taking a closer look at the stripe using drgn on the vmcore dump and
comparing the stripe's pages to the bio_vec's pages, it is evident that
the lookup fails because first_bvec's bv_page is NULL:

    >>> stripe = t.stack_trace()[13]['stripe']
    >>> print(stripe.pages)
    (struct page *[16]){ 0xffffea000418b280, 0xffffea00043051c0,
    0xffffea000430ed00, 0xffffea00040fcc00, 0xffffea000441fc80,
    0xffffea00040fc980, 0xffffea00040fc9c0, 0xffffea00040fc940,
    0xffffea0004223040, 0xffffea00043a1940, 0xffffea00040fea80,
    0xffffea00040a5740, 0xffffea0004490f40, 0xffffea00040f7dc0,
    0xffffea00044985c0, 0xffffea00040f7d80 }
    >>> bio = t.stack_trace()[12]['bbio'].bio
    >>> print(bio.bi_io_vec)
    *(struct bio_vec *)0xffff88810632bc00 = {
            .bv_page = (struct page *)0x0,
            .bv_len = (unsigned int)0,
            .bv_offset = (unsigned int)0,
    }

Upon closer inspection of the bio itself we see that bio->bi_status is
10, which corresponds to BLK_STS_IOERR:

    >>> print(bio)
    (struct bio){
            .bi_next = (struct bio *)0x0,
            .bi_bdev = (struct block_device *)0x0,
            .bi_opf = (blk_opf_t)0,
            .bi_flags = (unsigned short)0,
            .bi_ioprio = (unsigned short)0,
            .bi_write_hint = (enum rw_hint)WRITE_LIFE_NOT_SET,
            .bi_status = (blk_status_t)10,
            .__bi_remaining = (atomic_t){
                    .counter = (int)1,
            },
            .bi_iter = (struct bvec_iter){
                    .bi_sector = (sector_t)2173056,
                    .bi_size = (unsigned int)0,
                    .bi_idx = (unsigned int)0,
                    .bi_bvec_done = (unsigned int)0,
            },
            .bi_cookie = (blk_qc_t)4294967295,
            .__bi_nr_segments = (unsigned int)4294967295,
            .bi_end_io = (bio_end_io_t *)0x0,
            .bi_private = (void *)0x0,
            .bi_integrity = (struct bio_integrity_payload *)0x0,
            .bi_vcnt = (unsigned short)0,
            .bi_max_vecs = (unsigned short)16,
            .__bi_cnt = (atomic_t){
                    .counter = (int)1,
            },
            .bi_io_vec = (struct bio_vec *)0xffff88810632bc00,
            .bi_pool = (struct bio_set *)btrfs_bioset+0x0 = 0xffffffff82c85620,
            .bi_inline_vecs = (struct bio_vec []){},
    }

So only call calc_sector_number() when we know the bio completed without
an error.
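
The guard can be modelled in userspace as follows. This is a hedged
sketch, not the patched kernel function: sketch_stripe, sketch_bio,
the SKETCH_STS_* constants and both helpers are hypothetical stand-ins,
and the lookup is assumed to be a plain linear search over pages[].

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_SECTORS 16	/* hypothetical array size */
#define SKETCH_STS_OK 0		/* stands in for BLK_STS_OK */
#define SKETCH_STS_IOERR 10	/* stands in for BLK_STS_IOERR */

/* Hypothetical stand-in for struct scrub_stripe. */
struct sketch_stripe {
	const void *pages[SKETCH_NR_SECTORS];
	int nr_sectors;
};

/* Hypothetical stand-in for the bio: status plus its first page. */
struct sketch_bio {
	int status;
	const void *first_page;	/* NULL when the bio carries no segments */
};

/* Assumed lookup: index of @page in pages[], or nr_sectors if absent. */
static int sketch_calc_sector_number(const struct sketch_stripe *stripe,
				     const void *page)
{
	int i;

	for (i = 0; i < stripe->nr_sectors; i++) {
		if (stripe->pages[i] == page)
			break;
	}
	return i;
}

/*
 * Only look up the sector number for a bio that completed successfully.
 * A failed bio keeps the default of 0, as in the patch.
 */
static int sketch_first_sector(const struct sketch_stripe *stripe,
			       const struct sketch_bio *bio)
{
	int sector_nr = 0;

	if (bio->status == SKETCH_STS_OK) {
		sector_nr = sketch_calc_sector_number(stripe, bio->first_page);
		assert(sector_nr < stripe->nr_sectors);
	}
	return sector_nr;
}
```

A failed bio with a NULL first page then yields 0 without ever reaching
the assertion, while a successful bio is still checked as before.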

Cc: Qu Wenruo <wqu@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/scrub.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 188a9c42c9eb..91590dc509de 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1099,12 +1099,17 @@ static void scrub_read_endio(struct btrfs_bio *bbio)
 {
 	struct scrub_stripe *stripe = bbio->private;
 	struct bio_vec *bvec;
-	int sector_nr = calc_sector_number(stripe, bio_first_bvec_all(&bbio->bio));
+	int sector_nr = 0;
 	int num_sectors;
 	u32 bio_size = 0;
 	int i;
 
-	ASSERT(sector_nr < stripe->nr_sectors);
+	if (bbio->bio.bi_status == BLK_STS_OK) {
+		sector_nr = calc_sector_number(stripe,
+					       bio_first_bvec_all(&bbio->bio));
+		ASSERT(sector_nr < stripe->nr_sectors);
+	}
+
 	bio_for_each_bvec_all(bvec, &bbio->bio, i)
 		bio_size += bvec->bv_len;
 	num_sectors = bio_size >> stripe->bg->fs_info->sectorsize_bits;
-- 
2.43.0
