From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qu Wenruo <wqu@suse.com>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.9 31/40] btrfs: scrub: handle RST lookup error correctly
Date: Tue, 9 Jul 2024 12:19:11 -0400
Message-ID: <20240709162007.30160-31-sashal@kernel.org>
In-Reply-To: <20240709162007.30160-1-sashal@kernel.org>
From: Qu Wenruo <wqu@suse.com>
[ Upstream commit 2c49908634a2b97b1c3abe0589be2739ac5e7fd5 ]
[BUG]
When running btrfs/060 with the RST feature forced on, the following
ASSERT() inside scrub_read_endio() is triggered:
ASSERT(sector_nr < stripe->nr_sectors);
Before that, there is a tree dump from
btrfs_get_raid_extent_offset(), because we failed to find the RST entry
for the range.
[CAUSE]
Inside scrub_submit_extent_sector_read() every time we allocated a new
bbio we immediately called btrfs_map_block() to make sure there was some
RST range covering the scrub target.
But if btrfs_map_block() fails, we immediately call endio for the bbio,
even though the newly allocated bbio is still completely empty.
Then inside scrub_read_endio(), we go through the bvecs to find
the sector number (as bi_sector is no longer reliable if the bio is
submitted to lower layers).
And since the bio is empty, the bvec iteration cannot find any sector
matching the bio, and ends with sector_nr == stripe->nr_sectors,
triggering the ASSERT().
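The lookup failure described above can be reproduced in a small
userspace model. This is only a sketch and not the kernel code: the
model_stripe and model_bio structs and model_find_sector() are
hypothetical stand-ins, and the real lookup also compares page offsets.
It shows why a bio with no bvecs makes the search run off the end and
yield nr_sectors.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SECTORS 16

/* Hypothetical stand-in for the per-stripe sector pages. */
struct model_stripe {
	int nr_sectors;
	const void *pages[MODEL_MAX_SECTORS];
};

/* Hypothetical stand-in for a bio: only its first bvec page matters here. */
struct model_bio {
	int nr_vecs;            /* 0 for a freshly allocated, empty bio */
	const void *first_page; /* valid only when nr_vecs > 0 */
};

/*
 * Search the stripe for the sector backing the bio's first bvec.
 * When nothing matches (including the empty-bio case), the loop runs
 * to completion and the result equals stripe->nr_sectors.
 */
static int model_find_sector(const struct model_stripe *stripe,
			     const struct model_bio *bio)
{
	int sector_nr;

	for (sector_nr = 0; sector_nr < stripe->nr_sectors; sector_nr++) {
		if (bio->nr_vecs > 0 &&
		    stripe->pages[sector_nr] == bio->first_page)
			break;
	}
	return sector_nr;
}
```

With an empty model_bio the function returns stripe->nr_sectors, which
is exactly the value that ASSERT(sector_nr < stripe->nr_sectors)
rejects.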
[FIX]
Instead of calling btrfs_map_block() after allocating a new bbio, call
btrfs_map_block() first.
Since our only objective in calling btrfs_map_block() is to update
stripe_len, there is really no need to do that after btrfs_alloc_bio().
This new timing would avoid the problem of handling empty bbio
completely, and in fact fixes a possible race window for the old code,
where if the submission thread is the only owner of the pending_io, the
scrub would never finish (since we didn't decrease the pending_io
counter).
The root cause of the RST lookup failure still needs to be addressed,
though.
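The effect of the reordering can be illustrated with a simplified
userspace model. This is a sketch under stated assumptions, not the
scrub code: all model_* names are hypothetical, a gap in the extent
bitmap is the only thing that ends a bio here, and a single
error_bitmap stands in for both io_error_bitmap and error_bitmap.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_SECTORS 8

struct model_result {
	unsigned long error_bitmap; /* sectors marked as failed */
	int bios_allocated;
	int empty_bios_completed;   /* endio called on a bio with no pages */
	int sectors_queued;
};

/* Hypothetical mapping hook: 0 on success, negative on lookup failure. */
typedef int (*model_map_fn)(int sector);

static int model_map_fail_at_2(int sector)
{
	return sector == 2 ? -2 : 0;
}

/* Old ordering: allocate first, then map; a failure ends an empty bio. */
static struct model_result model_submit_alloc_first(unsigned long extent_bitmap,
						    model_map_fn map)
{
	struct model_result res = {0};
	bool have_bio = false;

	for (int i = 0; i < MODEL_NR_SECTORS; i++) {
		if (!(extent_bitmap & (1UL << i))) {
			have_bio = false; /* a gap ends the current bio */
			continue;
		}
		if (!have_bio) {
			res.bios_allocated++;
			if (map(i) < 0) {
				res.empty_bios_completed++;
				return res; /* remaining sectors are never read */
			}
			have_bio = true;
		}
		res.sectors_queued++;
	}
	return res;
}

/* New ordering: map first; on failure mark the sector and move on. */
static struct model_result model_submit_map_first(unsigned long extent_bitmap,
						  model_map_fn map)
{
	struct model_result res = {0};
	bool have_bio = false;

	for (int i = 0; i < MODEL_NR_SECTORS; i++) {
		if (!(extent_bitmap & (1UL << i))) {
			have_bio = false;
			continue;
		}
		if (!have_bio) {
			if (map(i) < 0) {
				res.error_bitmap |= 1UL << i;
				continue; /* no bio was allocated for this sector */
			}
			res.bios_allocated++;
			have_bio = true;
		}
		res.sectors_queued++;
	}
	return res;
}
```

In this model, with sectors 0, 2 and 3 in the extent bitmap and a
mapping failure at sector 2, the old ordering completes one empty bio
and stops, while the new ordering marks only sector 2 as failed and
still queues sector 3.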
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/scrub.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 4b22cfe9a98cb..e3e0b8a4c187c 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1688,20 +1688,24 @@ static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx,
(i << fs_info->sectorsize_bits);
int err;
- bbio = btrfs_bio_alloc(stripe->nr_sectors, REQ_OP_READ,
- fs_info, scrub_read_endio, stripe);
- bbio->bio.bi_iter.bi_sector = logical >> SECTOR_SHIFT;
-
io_stripe.is_scrub = true;
+ stripe_len = (nr_sectors - i) << fs_info->sectorsize_bits;
+ /*
+ * For RST cases, we need to manually split the bbio to
+ * follow the RST boundary.
+ */
err = btrfs_map_block(fs_info, BTRFS_MAP_READ, logical,
- &stripe_len, &bioc, &io_stripe,
- &mirror);
+ &stripe_len, &bioc, &io_stripe, &mirror);
btrfs_put_bioc(bioc);
- if (err) {
- btrfs_bio_end_io(bbio,
- errno_to_blk_status(err));
- return;
+ if (err < 0) {
+ set_bit(i, &stripe->io_error_bitmap);
+ set_bit(i, &stripe->error_bitmap);
+ continue;
}
+
+ bbio = btrfs_bio_alloc(stripe->nr_sectors, REQ_OP_READ,
+ fs_info, scrub_read_endio, stripe);
+ bbio->bio.bi_iter.bi_sector = logical >> SECTOR_SHIFT;
}
__bio_add_page(&bbio->bio, page, fs_info->sectorsize, pgoff);
--
2.43.0