* RFC: untangle and fix __blkdev_issue_discard
@ 2024-03-07 15:11 Christoph Hellwig
2024-03-07 15:11 ` [PATCH 01/10] block: remove the discard_granularity check in __blkdev_issue_discard Christoph Hellwig
` (10 more replies)
0 siblings, 11 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
Hi all,
this tries to address the block for-next oops Chandan reported on XFS.
I can't actually reproduce it unfortunately, but this series should
sort it out by movign the fatal_signal_pending check out of all but
the ioctl path. The write_zeroes and secure_erase path will need
similar treatment eventually.
Test with blktests and the xfstests discard group for xfs only. Note that
the latter has a pre-existing regression in generic/500 that I'll look
into in a bit.
Diffstat:
block/blk-lib.c | 78 +++++++++++++-------------------------
block/ioctl.c | 13 ++++--
drivers/md/dm-thin.c | 5 +-
drivers/md/md.c | 6 +-
drivers/nvme/target/io-cmd-bdev.c | 16 ++-----
fs/ext4/mballoc.c | 16 ++++---
fs/f2fs/segment.c | 10 ++--
fs/xfs/xfs_discard.c | 47 +++++++---------------
fs/xfs/xfs_discard.h | 2
include/linux/blkdev.h | 4 -
10 files changed, 84 insertions(+), 113 deletions(-)
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 01/10] block: remove the discard_granularity check in __blkdev_issue_discard
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 02/10] block: move discard checks into the ioctl handler Christoph Hellwig
` (9 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
We now set a default granularity in the queue limits API, so don't
bother with this extra check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-lib.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/block/blk-lib.c b/block/blk-lib.c
index dc8e35d0a51d6d..f873eb9a886f63 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -66,13 +66,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
if (!bdev_max_discard_sectors(bdev))
return -EOPNOTSUPP;
- /* In case the discard granularity isn't set by buggy device driver */
- if (WARN_ON_ONCE(!bdev_discard_granularity(bdev))) {
- pr_err_ratelimited("%pg: Error: discard_granularity is 0.\n",
- bdev);
- return -EOPNOTSUPP;
- }
-
bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
if ((sector | nr_sects) & bs_mask)
return -EINVAL;
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 02/10] block: move discard checks into the ioctl handler
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
2024-03-07 15:11 ` [PATCH 01/10] block: remove the discard_granularity check in __blkdev_issue_discard Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 21:33 ` Dave Chinner
2024-03-07 15:11 ` [PATCH 03/10] block: add a blk_next_discard_bio helper Christoph Hellwig
` (8 subsequent siblings)
10 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
Most bio operations get basic sanity checking in submit_bio and anything
more complicated than that is done in the callers. Discards are a bit
different from that in that a lot of checking is done in
__blkdev_issue_discard, and the specific errnos for that are returned
to userspace. Move the checks that require specific errnos to the ioctl
handler instead, and just leave the basic sanity checking in submit_bio
for the other handlers. This introduces two changes in behavior:
1) the logical block size alignment check of the start and len is lost
for non-ioctl callers.
This matches what is done for other operations including reads and
writes. We should probably verify this for all bios, but for now
make discards match the normal flow.
2) for non-ioctl callers all errors are reported on I/O completion now
instead of synchronously. Callers in general mostly ignore or log
errors so this will actually simplify the code once cleaned up
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-lib.c | 13 -------------
block/ioctl.c | 13 +++++++++----
2 files changed, 9 insertions(+), 17 deletions(-)
diff --git a/block/blk-lib.c b/block/blk-lib.c
index f873eb9a886f63..50923508a32466 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -59,19 +59,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
{
struct bio *bio = *biop;
- sector_t bs_mask;
-
- if (bdev_read_only(bdev))
- return -EPERM;
- if (!bdev_max_discard_sectors(bdev))
- return -EOPNOTSUPP;
-
- bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
- if ((sector | nr_sects) & bs_mask)
- return -EINVAL;
-
- if (!nr_sects)
- return -EINVAL;
while (nr_sects) {
sector_t req_sects =
diff --git a/block/ioctl.c b/block/ioctl.c
index de0cc0d215c633..1d5de0a890c5e8 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -95,6 +95,8 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
unsigned long arg)
{
+ sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
+ sector_t sector, nr_sects;
uint64_t range[2];
uint64_t start, len;
struct inode *inode = bdev->bd_inode;
@@ -105,18 +107,21 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
if (!bdev_max_discard_sectors(bdev))
return -EOPNOTSUPP;
+ if (bdev_read_only(bdev))
+ return -EPERM;
if (copy_from_user(range, (void __user *)arg, sizeof(range)))
return -EFAULT;
start = range[0];
len = range[1];
+ sector = start >> SECTOR_SHIFT;
+ nr_sects = len >> SECTOR_SHIFT;
- if (start & 511)
+ if (!nr_sects)
return -EINVAL;
- if (len & 511)
+ if ((sector | nr_sects) & bs_mask)
return -EINVAL;
-
if (start + len > bdev_nr_bytes(bdev))
return -EINVAL;
@@ -124,7 +129,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
err = truncate_bdev_range(bdev, mode, start, start + len - 1);
if (err)
goto fail;
- err = blkdev_issue_discard(bdev, start >> 9, len >> 9, GFP_KERNEL);
+ err = blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL);
fail:
filemap_invalidate_unlock(inode->i_mapping);
return err;
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 03/10] block: add a blk_next_discard_bio helper
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
2024-03-07 15:11 ` [PATCH 01/10] block: remove the discard_granularity check in __blkdev_issue_discard Christoph Hellwig
2024-03-07 15:11 ` [PATCH 02/10] block: move discard checks into the ioctl handler Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 04/10] xfs: switch to using blk_next_discard_bio directly Christoph Hellwig
` (7 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This factors the guts of __blkdev_issue_discard into a helper that can
be used by callers avoid the fatal_signal_pending check and to generally
simplify the bio handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-lib.c | 41 ++++++++++++++++++++++++-----------------
include/linux/blkdev.h | 2 ++
2 files changed, 26 insertions(+), 17 deletions(-)
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 50923508a32466..4f2d52210b129c 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -55,28 +55,35 @@ static void await_bio_chain(struct bio *bio)
blk_wait_io(&done);
}
+bool blk_next_discard_bio(struct block_device *bdev, struct bio **biop,
+ sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask)
+{
+ sector_t bio_sects = min(*nr_sects, bio_discard_limit(bdev, *sector));
+
+ if (!bio_sects)
+ return false;
+
+ *biop = blk_next_bio(*biop, bdev, 0, REQ_OP_DISCARD, gfp_mask);
+ (*biop)->bi_iter.bi_sector = *sector;
+ (*biop)->bi_iter.bi_size = bio_sects << SECTOR_SHIFT;
+ *sector += bio_sects;
+ *nr_sects -= bio_sects;
+ /*
+ * We can loop for a long time in here if someone does full device
+ * discards (like mkfs). Be nice and allow us to schedule out to avoid
+ * softlocking if preempt is disabled.
+ */
+ cond_resched();
+ return true;
+}
+EXPORT_SYMBOL_GPL(blk_next_discard_bio);
+
int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
{
struct bio *bio = *biop;
- while (nr_sects) {
- sector_t req_sects =
- min(nr_sects, bio_discard_limit(bdev, sector));
-
- bio = blk_next_bio(bio, bdev, 0, REQ_OP_DISCARD, gfp_mask);
- bio->bi_iter.bi_sector = sector;
- bio->bi_iter.bi_size = req_sects << 9;
- sector += req_sects;
- nr_sects -= req_sects;
-
- /*
- * We can loop for a long time in here, if someone does
- * full device discards (like mkfs). Be nice and allow
- * us to schedule out to avoid softlocking if preempt
- * is disabled.
- */
- cond_resched();
+ while (blk_next_discard_bio(bdev, &bio, §or, &nr_sects, gfp_mask)) {
if (fatal_signal_pending(current)) {
await_bio_chain(bio);
return -EINTR;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a9a0bd6cac4aff..b87cd889008291 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1059,6 +1059,8 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask);
int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, struct bio **biop);
+bool blk_next_discard_bio(struct block_device *bdev, struct bio **biop,
+ sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp);
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 04/10] xfs: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (2 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 03/10] block: add a blk_next_discard_bio helper Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 05/10] f2fs: " Christoph Hellwig
` (6 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_discard.c | 47 ++++++++++++++------------------------------
fs/xfs/xfs_discard.h | 2 +-
2 files changed, 16 insertions(+), 33 deletions(-)
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index d5787991bb5b46..6396a4e14809a2 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -102,33 +102,26 @@ xfs_discard_endio(
* list. We plug and chain the bios so that we only need a single completion
* call to clear all the busy extents once the discards are complete.
*/
-int
+void
xfs_discard_extents(
struct xfs_mount *mp,
struct xfs_busy_extents *extents)
{
+ struct block_device *bdev = mp->m_ddev_targp->bt_bdev;
struct xfs_extent_busy *busyp;
struct bio *bio = NULL;
struct blk_plug plug;
- int error = 0;
blk_start_plug(&plug);
list_for_each_entry(busyp, &extents->extent_list, list) {
+ sector_t sector = XFS_AGB_TO_DADDR(mp, busyp->agno, busyp->bno);
+ sector_t nr_sects = XFS_FSB_TO_BB(mp, busyp->length);
+
trace_xfs_discard_extent(mp, busyp->agno, busyp->bno,
busyp->length);
-
- error = __blkdev_issue_discard(mp->m_ddev_targp->bt_bdev,
- XFS_AGB_TO_DADDR(mp, busyp->agno, busyp->bno),
- XFS_FSB_TO_BB(mp, busyp->length),
- GFP_NOFS, &bio);
- if (error && error != -EOPNOTSUPP) {
- xfs_info(mp,
- "discard failed for extent [0x%llx,%u], error %d",
- (unsigned long long)busyp->bno,
- busyp->length,
- error);
- break;
- }
+ while (blk_next_discard_bio(bdev, &bio, §or, &nr_sects,
+ GFP_NOFS))
+ ;
}
if (bio) {
@@ -139,11 +132,8 @@ xfs_discard_extents(
xfs_discard_endio_work(&extents->endio_work);
}
blk_finish_plug(&plug);
-
- return error;
}
-
static int
xfs_trim_gather_extents(
struct xfs_perag *pag,
@@ -306,16 +296,14 @@ xfs_trim_extents(
.ar_blockcount = pag->pagf_longest,
.ar_startblock = NULLAGBLOCK,
};
- int error = 0;
do {
struct xfs_busy_extents *extents;
+ int error;
extents = kzalloc(sizeof(*extents), GFP_KERNEL);
- if (!extents) {
- error = -ENOMEM;
- break;
- }
+ if (!extents)
+ return -ENOMEM;
extents->mount = pag->pag_mount;
extents->owner = extents;
@@ -325,7 +313,7 @@ xfs_trim_extents(
&tcur, extents, blocks_trimmed);
if (error) {
kfree(extents);
- break;
+ return error;
}
/*
@@ -338,17 +326,12 @@ xfs_trim_extents(
* list after this function call, as it may have been freed by
* the time control returns to us.
*/
- error = xfs_discard_extents(pag->pag_mount, extents);
- if (error)
- break;
-
+ xfs_discard_extents(pag->pag_mount, extents);
if (xfs_trim_should_stop())
- break;
-
+ return 0;
} while (tcur.ar_blockcount != 0);
- return error;
-
+ return 0;
}
/*
diff --git a/fs/xfs/xfs_discard.h b/fs/xfs/xfs_discard.h
index 2b1a85223a56c6..8c5cc4af6a0787 100644
--- a/fs/xfs/xfs_discard.h
+++ b/fs/xfs/xfs_discard.h
@@ -6,7 +6,7 @@ struct fstrim_range;
struct xfs_mount;
struct xfs_busy_extents;
-int xfs_discard_extents(struct xfs_mount *mp, struct xfs_busy_extents *busy);
+void xfs_discard_extents(struct xfs_mount *mp, struct xfs_busy_extents *busy);
int xfs_ioc_trim(struct xfs_mount *mp, struct fstrim_range __user *fstrim);
#endif /* XFS_DISCARD_H */
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 05/10] f2fs: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (3 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 04/10] xfs: switch to using blk_next_discard_bio directly Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 06/10] ext4: " Christoph Hellwig
` (5 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/f2fs/segment.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e1065ba7020761..c131e138d74f94 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1305,10 +1305,12 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
if (time_to_inject(sbi, FAULT_DISCARD)) {
err = -EIO;
} else {
- err = __blkdev_issue_discard(bdev,
- SECTOR_FROM_BLOCK(start),
- SECTOR_FROM_BLOCK(len),
- GFP_NOFS, &bio);
+ sector_t sector = SECTOR_FROM_BLOCK(start);
+ sector_t nr_sects = SECTOR_FROM_BLOCK(len);
+
+ while (blk_next_discard_bio(bdev, &bio, §or,
+ &nr_sects, GFP_NOFS))
+ ;
}
if (err) {
spin_lock_irqsave(&dc->lock, flags);
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 06/10] ext4: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (4 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 05/10] f2fs: " Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 16:13 ` Keith Busch
2024-03-07 15:11 ` [PATCH 07/10] nvmet: " Christoph Hellwig
` (4 subsequent siblings)
10 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/ext4/mballoc.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index e4f7cf9d89c45a..73437510bde26c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3840,12 +3840,16 @@ static inline int ext4_issue_discard(struct super_block *sb,
trace_ext4_discard_blocks(sb,
(unsigned long long) discard_block, count);
if (biop) {
- return __blkdev_issue_discard(sb->s_bdev,
- (sector_t)discard_block << (sb->s_blocksize_bits - 9),
- (sector_t)count << (sb->s_blocksize_bits - 9),
- GFP_NOFS, biop);
- } else
- return sb_issue_discard(sb, discard_block, count, GFP_NOFS, 0);
+ unsigned int sshift = (sb->s_blocksize_bits - SECTOR_SHIFT);
+ sector_t sector = (sector_t)discard_block << sshift;
+ sector_t nr_sects = (sector_t)count << sshift;
+
+ while (blk_next_discard_bio(sb->s_bdev, biop, §or,
+ &nr_sects, GFP_NOFS))
+ ;
+ return 0;
+ }
+ return sb_issue_discard(sb, discard_block, count, GFP_NOFS, 0);
}
static void ext4_free_data_in_buddy(struct super_block *sb,
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 07/10] nvmet: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (5 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 06/10] ext4: " Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 08/10] md: " Christoph Hellwig
` (3 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/nvme/target/io-cmd-bdev.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index f11400a908f269..c1345aaf837d93 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -363,17 +363,13 @@ u16 nvmet_bdev_flush(struct nvmet_req *req)
static u16 nvmet_bdev_discard_range(struct nvmet_req *req,
struct nvme_dsm_range *range, struct bio **bio)
{
- struct nvmet_ns *ns = req->ns;
- int ret;
+ sector_t sector = nvmet_lba_to_sect(req->ns, range->slba);
+ sector_t nr_sects = le32_to_cpu(range->nlb) <<
+ (req->ns->blksize_shift - SECTOR_SHIFT);
- ret = __blkdev_issue_discard(ns->bdev,
- nvmet_lba_to_sect(ns, range->slba),
- le32_to_cpu(range->nlb) << (ns->blksize_shift - 9),
- GFP_KERNEL, bio);
- if (ret && ret != -EOPNOTSUPP) {
- req->error_slba = le64_to_cpu(range->slba);
- return errno_to_nvme_status(req, ret);
- }
+ while (blk_next_discard_bio(req->ns->bdev, bio, §or, &nr_sects,
+ GFP_KERNEL))
+ ;
return NVME_SC_SUCCESS;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 08/10] md: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (6 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 07/10] nvmet: " Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 09/10] dm-thin: " Christoph Hellwig
` (2 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/md.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7d7b982e369c11..5803a298dd40f9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8722,8 +8722,10 @@ void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev,
{
struct bio *discard_bio = NULL;
- if (__blkdev_issue_discard(rdev->bdev, start, size, GFP_NOIO,
- &discard_bio) || !discard_bio)
+ while (blk_next_discard_bio(rdev->bdev, &discard_bio, &start, &size,
+ GFP_NOIO))
+ ;
+ if (!discard_bio)
return;
bio_chain(discard_bio, bio);
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 09/10] dm-thin: switch to using blk_next_discard_bio directly
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (7 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 08/10] md: " Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 10/10] block: remove __blkdev_issue_discard Christoph Hellwig
2024-03-07 21:05 ` RFC: untangle and fix __blkdev_issue_discard Keith Busch
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
This fixes fatal signals getting into the way and corrupting the bio
chain and removes the need to handle synchronous errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/dm-thin.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 07c7f9795b107b..becf1b66262d34 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -398,10 +398,13 @@ static void begin_discard(struct discard_op *op, struct thin_c *tc, struct bio *
static int issue_discard(struct discard_op *op, dm_block_t data_b, dm_block_t data_e)
{
struct thin_c *tc = op->tc;
+ struct block_device *bdev = tc->pool_dev->bdev;
sector_t s = block_to_sectors(tc->pool, data_b);
sector_t len = block_to_sectors(tc->pool, data_e - data_b);
- return __blkdev_issue_discard(tc->pool_dev->bdev, s, len, GFP_NOIO, &op->bio);
+ while (blk_next_discard_bio(bdev, &op->bio, &s, &len, GFP_NOIO))
+ ;
+ return 0;
}
static void end_discard(struct discard_op *op, int r)
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 10/10] block: remove __blkdev_issue_discard
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (8 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 09/10] dm-thin: " Christoph Hellwig
@ 2024-03-07 15:11 ` Christoph Hellwig
2024-03-07 21:05 ` RFC: untangle and fix __blkdev_issue_discard Keith Busch
10 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-07 15:11 UTC (permalink / raw)
To: Jens Axboe, Chandan Babu R, Keith Busch
Cc: linux-block, linux-nvme, linux-xfs
Fold what is left of __blkdev_issue_discard into blkdev_issue_discard and
simplify the error handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/blk-lib.c | 29 +++++++++--------------------
include/linux/blkdev.h | 2 --
2 files changed, 9 insertions(+), 22 deletions(-)
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 4f2d52210b129c..39e6e21eb2982d 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -78,23 +78,6 @@ bool blk_next_discard_bio(struct block_device *bdev, struct bio **biop,
}
EXPORT_SYMBOL_GPL(blk_next_discard_bio);
-int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
- sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
-{
- struct bio *bio = *biop;
-
- while (blk_next_discard_bio(bdev, &bio, §or, &nr_sects, gfp_mask)) {
- if (fatal_signal_pending(current)) {
- await_bio_chain(bio);
- return -EINTR;
- }
- }
-
- *biop = bio;
- return 0;
-}
-EXPORT_SYMBOL(__blkdev_issue_discard);
-
/**
* blkdev_issue_discard - queue a discard
* @bdev: blockdev to issue discard for
@@ -113,16 +96,22 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
int ret;
blk_start_plug(&plug);
- ret = __blkdev_issue_discard(bdev, sector, nr_sects, gfp_mask, &bio);
- if (!ret && bio) {
+ while (blk_next_discard_bio(bdev, &bio, §or, &nr_sects, gfp_mask))
+ if (fatal_signal_pending(current))
+ goto fatal_signal;
+ if (bio) {
ret = submit_bio_wait(bio);
if (ret == -EOPNOTSUPP)
ret = 0;
bio_put(bio);
}
blk_finish_plug(&plug);
-
return ret;
+
+fatal_signal:
+ blk_finish_plug(&plug);
+ await_bio_chain(bio);
+ return -EINTR;
}
EXPORT_SYMBOL(blkdev_issue_discard);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b87cd889008291..a1e638fb90fa77 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1057,8 +1057,6 @@ extern void blk_io_schedule(void);
int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask);
-int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
- sector_t nr_sects, gfp_t gfp_mask, struct bio **biop);
bool blk_next_discard_bio(struct block_device *bdev, struct bio **biop,
sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 06/10] ext4: switch to using blk_next_discard_bio directly
2024-03-07 15:11 ` [PATCH 06/10] ext4: " Christoph Hellwig
@ 2024-03-07 16:13 ` Keith Busch
2024-03-08 15:21 ` Christoph Hellwig
0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2024-03-07 16:13 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs
On Thu, Mar 07, 2024 at 08:11:53AM -0700, Christoph Hellwig wrote:
> @@ -3840,12 +3840,16 @@ static inline int ext4_issue_discard(struct super_block *sb,
> trace_ext4_discard_blocks(sb,
> (unsigned long long) discard_block, count);
> if (biop) {
Does this 'if' case even need to exist? It looks unreachable since there
are only two callers of ext4_issue_discard(), and they both set 'biop'
to NULL. It looks like the last remaining caller using 'biop' was
removed with 55cdd0af2bc5ffc ("ext4: get discard out of jbd2 commit
kthread contex")
> - return __blkdev_issue_discard(sb->s_bdev,
> - (sector_t)discard_block << (sb->s_blocksize_bits - 9),
> - (sector_t)count << (sb->s_blocksize_bits - 9),
> - GFP_NOFS, biop);
> - } else
> - return sb_issue_discard(sb, discard_block, count, GFP_NOFS, 0);
> + unsigned int sshift = (sb->s_blocksize_bits - SECTOR_SHIFT);
> + sector_t sector = (sector_t)discard_block << sshift;
> + sector_t nr_sects = (sector_t)count << sshift;
> +
> + while (blk_next_discard_bio(sb->s_bdev, biop, §or,
> + &nr_sects, GFP_NOFS))
> + ;
This pattern is repeated often in this series, so perhaps a helper
function for this common use case.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RFC: untangle and fix __blkdev_issue_discard
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
` (9 preceding siblings ...)
2024-03-07 15:11 ` [PATCH 10/10] block: remove __blkdev_issue_discard Christoph Hellwig
@ 2024-03-07 21:05 ` Keith Busch
2024-03-08 15:21 ` Christoph Hellwig
10 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2024-03-07 21:05 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs
On Thu, Mar 07, 2024 at 08:11:47AM -0700, Christoph Hellwig wrote:
> this tries to address the block for-next oops Chandan reported on XFS.
> I can't actually reproduce it unfortunately, but this series should
> sort it out by movign the fatal_signal_pending check out of all but
> the ioctl path.
The last patch moves fatal_signal_pending check to blkdev_issue_discard
path, which has a more users than just the ioctl path.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/10] block: move discard checks into the ioctl handler
2024-03-07 15:11 ` [PATCH 02/10] block: move discard checks into the ioctl handler Christoph Hellwig
@ 2024-03-07 21:33 ` Dave Chinner
2024-03-08 15:22 ` Christoph Hellwig
0 siblings, 1 reply; 18+ messages in thread
From: Dave Chinner @ 2024-03-07 21:33 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Chandan Babu R, Keith Busch, linux-block, linux-nvme,
linux-xfs
On Thu, Mar 07, 2024 at 08:11:49AM -0700, Christoph Hellwig wrote:
> Most bio operations get basic sanity checking in submit_bio and anything
> more complicated than that is done in the callers. Discards are a bit
> different from that in that a lot of checking is done in
> __blkdev_issue_discard, and the specific errnos for that are returned
> to userspace. Move the checks that require specific errnos to the ioctl
> handler instead, and just leave the basic sanity checking in submit_bio
> for the other handlers. This introduces two changes in behavior:
>
> 1) the logical block size alignment check of the start and len is lost
> for non-ioctl callers.
> This matches what is done for other operations including reads and
> writes. We should probably verify this for all bios, but for now
> make discards match the normal flow.
> 2) for non-ioctl callers all errors are reported on I/O completion now
> instead of synchronously. Callers in general mostly ignore or log
> errors so this will actually simplify the code once cleaned up
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
OK.
> ---
> block/blk-lib.c | 13 -------------
> block/ioctl.c | 13 +++++++++----
> 2 files changed, 9 insertions(+), 17 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index f873eb9a886f63..50923508a32466 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -59,19 +59,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
> {
> struct bio *bio = *biop;
> - sector_t bs_mask;
> -
> - if (bdev_read_only(bdev))
> - return -EPERM;
> - if (!bdev_max_discard_sectors(bdev))
> - return -EOPNOTSUPP;
> -
> - bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
> - if ((sector | nr_sects) & bs_mask)
> - return -EINVAL;
> -
> - if (!nr_sects)
> - return -EINVAL;
>
> while (nr_sects) {
> sector_t req_sects =
> diff --git a/block/ioctl.c b/block/ioctl.c
> index de0cc0d215c633..1d5de0a890c5e8 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -95,6 +95,8 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
> static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
> unsigned long arg)
> {
> + sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
> + sector_t sector, nr_sects;
This changes the alignment checks from a hard coded 512 byte sector
to the logical block size of the device. I don't see a problem with
this (it fixes a bug) but it should at least be mentioned in the
commit message.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 06/10] ext4: switch to using blk_next_discard_bio directly
2024-03-07 16:13 ` Keith Busch
@ 2024-03-08 15:21 ` Christoph Hellwig
0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-08 15:21 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, linux-block,
linux-nvme, linux-xfs
On Thu, Mar 07, 2024 at 09:13:01AM -0700, Keith Busch wrote:
> On Thu, Mar 07, 2024 at 08:11:53AM -0700, Christoph Hellwig wrote:
> > @@ -3840,12 +3840,16 @@ static inline int ext4_issue_discard(struct super_block *sb,
> > trace_ext4_discard_blocks(sb,
> > (unsigned long long) discard_block, count);
> > if (biop) {
>
> Does this 'if' case even need to exist? It looks unreachable since there
> are only two callers of ext4_issue_discard(), and they both set 'biop'
> to NULL. It looks like the last remaining caller using 'biop' was
> removed with 55cdd0af2bc5ffc ("ext4: get discard out of jbd2 commit
> kthread contex")
Yeah. I didn't really want to dig so far into code I don't know well
for this series, though :)
> > + while (blk_next_discard_bio(sb->s_bdev, biop, §or,
> > + &nr_sects, GFP_NOFS))
> > + ;
>
> This pattern is repeated often in this series, so perhaps a helper
> function for this common use case.
Well, it's 2-3 lines vs 1 line for a much better interface.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RFC: untangle and fix __blkdev_issue_discard
2024-03-07 21:05 ` RFC: untangle and fix __blkdev_issue_discard Keith Busch
@ 2024-03-08 15:21 ` Christoph Hellwig
0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-08 15:21 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, linux-block,
linux-nvme, linux-xfs
On Thu, Mar 07, 2024 at 02:05:03PM -0700, Keith Busch wrote:
> On Thu, Mar 07, 2024 at 08:11:47AM -0700, Christoph Hellwig wrote:
> > this tries to address the block for-next oops Chandan reported on XFS.
> > I can't actually reproduce it unfortunately, but this series should
> > sort it out by movign the fatal_signal_pending check out of all but
> > the ioctl path.
>
> The last patch moves fatal_signal_pending check to blkdev_issue_discard
> path, which has a more users than just the ioctl path.
That's true. I guess I actually need to open code blkdev_issue_discard
in the ioctl path or pass a flag (which would be a bit annoying)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/10] block: move discard checks into the ioctl handler
2024-03-07 21:33 ` Dave Chinner
@ 2024-03-08 15:22 ` Christoph Hellwig
2024-03-08 21:16 ` Dave Chinner
0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2024-03-08 15:22 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, Keith Busch,
linux-block, linux-nvme, linux-xfs
On Fri, Mar 08, 2024 at 08:33:08AM +1100, Dave Chinner wrote:
> > static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
> > unsigned long arg)
> > {
> > + sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
> > + sector_t sector, nr_sects;
>
> This changes the alignment checks from a hard coded 512 byte sector
> to the logical block size of the device. I don't see a problem with
> this (it fixes a bug) but it should at least be mentioned in the
> commit message.
Before the exact block size alignment check as done down in
__blkdev_issue_discard, it just moves up here now. I guess I need to
make that more clear in the commit message.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/10] block: move discard checks into the ioctl handler
2024-03-08 15:22 ` Christoph Hellwig
@ 2024-03-08 21:16 ` Dave Chinner
0 siblings, 0 replies; 18+ messages in thread
From: Dave Chinner @ 2024-03-08 21:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Chandan Babu R, Keith Busch, linux-block, linux-nvme,
linux-xfs
On Fri, Mar 08, 2024 at 04:22:44PM +0100, Christoph Hellwig wrote:
> On Fri, Mar 08, 2024 at 08:33:08AM +1100, Dave Chinner wrote:
> > > static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
> > > unsigned long arg)
> > > {
> > > + sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
> > > + sector_t sector, nr_sects;
> >
> > This changes the alignment checks from a hard coded 512 byte sector
> > to the logical block size of the device. I don't see a problem with
> > this (it fixes a bug) but it should at least be mentioned in the
> > commit message.
>
> Before the exact block size alignment check as done down in
> __blkdev_issue_discard, it just moves up here now. I guess I need to
> make that more clear in the commit message.
Ah, eyeball pattern matching fail on my part - you changed it from a
hard coded '9' to SECTOR_SHIFT as it moved (which is fine!), I just
missed that. All good!
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2024-03-08 21:16 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-07 15:11 RFC: untangle and fix __blkdev_issue_discard Christoph Hellwig
2024-03-07 15:11 ` [PATCH 01/10] block: remove the discard_granularity check in __blkdev_issue_discard Christoph Hellwig
2024-03-07 15:11 ` [PATCH 02/10] block: move discard checks into the ioctl handler Christoph Hellwig
2024-03-07 21:33 ` Dave Chinner
2024-03-08 15:22 ` Christoph Hellwig
2024-03-08 21:16 ` Dave Chinner
2024-03-07 15:11 ` [PATCH 03/10] block: add a blk_next_discard_bio helper Christoph Hellwig
2024-03-07 15:11 ` [PATCH 04/10] xfs: switch to using blk_next_discard_bio directly Christoph Hellwig
2024-03-07 15:11 ` [PATCH 05/10] f2fs: " Christoph Hellwig
2024-03-07 15:11 ` [PATCH 06/10] ext4: " Christoph Hellwig
2024-03-07 16:13 ` Keith Busch
2024-03-08 15:21 ` Christoph Hellwig
2024-03-07 15:11 ` [PATCH 07/10] nvmet: " Christoph Hellwig
2024-03-07 15:11 ` [PATCH 08/10] md: " Christoph Hellwig
2024-03-07 15:11 ` [PATCH 09/10] dm-thin: " Christoph Hellwig
2024-03-07 15:11 ` [PATCH 10/10] block: remove __blkdev_issue_discard Christoph Hellwig
2024-03-07 21:05 ` RFC: untangle and fix __blkdev_issue_discard Keith Busch
2024-03-08 15:21 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).