* [PATCH 0/2] block: Fix __blkdev_issue_write_zeroes() limit handling @ 2024-08-15 8:27 John Garry 2024-08-15 8:27 ` [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() John Garry 2024-08-15 8:27 ` [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() John Garry 0 siblings, 2 replies; 9+ messages in thread From: John Garry @ 2024-08-15 8:27 UTC (permalink / raw) To: axboe, hch; +Cc: martin.petersen, linux-block, kbusch, John Garry As reported in [0], we may get an infinite loop in __blkdev_issue_write_zeroes() for making an XFS FS on a raid0 config. Fix __blkdev_issue_write_zeroes() limit handling by reading the write zeroes limit outside the loop. Also include a change to drop the unnecessary NULL queue check in bdev_write_zeroes_sectors(). [0] https://lore.kernel.org/linux-block/20240815062112.GA14067@lst.de/T/#m14ed5d882f9390a46cfe2fcfa2b51218aafbd32e John Garry (2): block: Read max write zeroes once for __blkdev_issue_write_zeroes() block: Drop NULL check in bdev_write_zeroes_sectors() block/blk-lib.c | 22 +++++++++++++++------- include/linux/blkdev.h | 7 +------ 2 files changed, 16 insertions(+), 13 deletions(-) -- 2.31.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 8:27 [PATCH 0/2] block: Fix __blkdev_issue_write_zeroes() limit handling John Garry @ 2024-08-15 8:27 ` John Garry 2024-08-15 12:40 ` Christoph Hellwig 2024-08-15 8:27 ` [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() John Garry 1 sibling, 1 reply; 9+ messages in thread From: John Garry @ 2024-08-15 8:27 UTC (permalink / raw) To: axboe, hch; +Cc: martin.petersen, linux-block, kbusch, John Garry As reported in [0], we may get a hang when formatting a XFS FS on a RAID0 drive. Commit 73a768d5f955 ("block: factor out a blk_write_zeroes_limit helper") changed __blkdev_issue_write_zeroes() to read the max write zeroes value in the loop. This is not safe as max write zeroes may change in value. Specifically for the case of [0], the value goes to 0, and we get an infinite loop. Lift the limit reading out of the loop. [0] https://lore.kernel.org/linux-xfs/4d31268f-310b-4220-88a2-e191c3932a82@oracle.com/T/#t Fixes: 73a768d5f955 ("block: factor out a blk_write_zeroes_limit helper") Signed-off-by: John Garry <john.g.garry@oracle.com> --- block/blk-lib.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/block/blk-lib.c b/block/blk-lib.c index 9f735efa6c94..00cee94e5d42 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -111,13 +111,17 @@ static sector_t bio_write_zeroes_limit(struct block_device *bdev) (UINT_MAX >> SECTOR_SHIFT) & ~bs_mask); } +/* + * Pass bio_write_zeroes_limit() return value in @limit, as the return + * value may change after a REQ_OP_WRITE_ZEROES is issued. + */ static void __blkdev_issue_write_zeroes(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, - struct bio **biop, unsigned flags) + struct bio **biop, unsigned flags, sector_t limit) { + while (nr_sects) { - unsigned int len = min_t(sector_t, nr_sects, - bio_write_zeroes_limit(bdev)); + unsigned int len = min(nr_sects, limit); struct bio *bio; if ((flags & BLKDEV_ZERO_KILLABLE) && @@ -141,12 +145,14 @@ static void __blkdev_issue_write_zeroes(struct block_device *bdev, static int blkdev_issue_write_zeroes(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp, unsigned flags) { + sector_t limit = bio_write_zeroes_limit(bdev); struct bio *bio = NULL; struct blk_plug plug; int ret = 0; blk_start_plug(&plug); - __blkdev_issue_write_zeroes(bdev, sector, nr_sects, gfp, &bio, flags); + __blkdev_issue_write_zeroes(bdev, sector, nr_sects, gfp, &bio, + flags, limit); if (bio) { if ((flags & BLKDEV_ZERO_KILLABLE) && fatal_signal_pending(current)) { @@ -165,7 +171,7 @@ static int blkdev_issue_write_zeroes(struct block_device *bdev, sector_t sector, * on an I/O error, in which case we'll turn any error into * "not supported" here. */ - if (ret && !bdev_write_zeroes_sectors(bdev)) + if (ret && !limit) return -EOPNOTSUPP; return ret; } @@ -265,12 +271,14 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, struct bio **biop, unsigned flags) { + sector_t limit = bio_write_zeroes_limit(bdev); + if (bdev_read_only(bdev)) return -EPERM; - if (bdev_write_zeroes_sectors(bdev)) { + if (limit) { __blkdev_issue_write_zeroes(bdev, sector, nr_sects, - gfp_mask, biop, flags); + gfp_mask, biop, flags, limit); } else { if (flags & BLKDEV_ZERO_NOFALLBACK) return -EOPNOTSUPP; -- 2.31.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 8:27 ` [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() John Garry @ 2024-08-15 12:40 ` Christoph Hellwig 2024-08-15 13:29 ` John Garry 0 siblings, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2024-08-15 12:40 UTC (permalink / raw) To: John Garry; +Cc: axboe, hch, martin.petersen, linux-block, kbusch On Thu, Aug 15, 2024 at 08:27:54AM +0000, John Garry wrote: > +/* > + * Pass bio_write_zeroes_limit() return value in @limit, as the return > + * value may change after a REQ_OP_WRITE_ZEROES is issued. > + */ I don't think that really helps all that much to explain the issue, which is about SCSI not having an ahead of time flag that reliably works for write same support, which makes it clear the limit to 0 on the first I/O completion. Maybe you can actually spell this out? Otherwise looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 12:40 ` Christoph Hellwig @ 2024-08-15 13:29 ` John Garry 2024-08-15 15:26 ` Christoph Hellwig 0 siblings, 1 reply; 9+ messages in thread From: John Garry @ 2024-08-15 13:29 UTC (permalink / raw) To: Christoph Hellwig; +Cc: axboe, martin.petersen, linux-block, kbusch On 15/08/2024 13:40, Christoph Hellwig wrote: > On Thu, Aug 15, 2024 at 08:27:54AM +0000, John Garry wrote: >> +/* >> + * Pass bio_write_zeroes_limit() return value in @limit, as the return >> + * value may change after a REQ_OP_WRITE_ZEROES is issued. >> + */ > I don't think that really helps all that much to explain the issue, > which is about SCSI not having an ahead of time flag that reliably > works for write same support, which makes it clear the limit to 0 on > the first I/O completion. Maybe you can actually spell this out? Please just tell me what you would like to see, and I will copy verbatim. cheers ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 13:29 ` John Garry @ 2024-08-15 15:26 ` Christoph Hellwig 2024-08-15 15:38 ` John Garry 2024-08-15 15:41 ` Martin K. Petersen 0 siblings, 2 replies; 9+ messages in thread From: Christoph Hellwig @ 2024-08-15 15:26 UTC (permalink / raw) To: John Garry; +Cc: Christoph Hellwig, axboe, martin.petersen, linux-block, kbusch On Thu, Aug 15, 2024 at 02:29:46PM +0100, John Garry wrote: > On 15/08/2024 13:40, Christoph Hellwig wrote: >> On Thu, Aug 15, 2024 at 08:27:54AM +0000, John Garry wrote: >>> +/* >>> + * Pass bio_write_zeroes_limit() return value in @limit, as the return >>> + * value may change after a REQ_OP_WRITE_ZEROES is issued. >>> + */ >> I don't think that really helps all that much to explain the issue, >> which is about SCSI not having an ahead of time flag that reliably >> works for write same support, which makes it clear the limit to 0 on >> the first I/O completion. Maybe you can actually spell this out? > > Please just tell me what you would like to see, and I will copy verbatim. Probably just what I just wrote. Unless Martin can come up with better language. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 15:26 ` Christoph Hellwig @ 2024-08-15 15:38 ` John Garry 2024-08-15 15:41 ` Martin K. Petersen 1 sibling, 0 replies; 9+ messages in thread From: John Garry @ 2024-08-15 15:38 UTC (permalink / raw) To: Christoph Hellwig; +Cc: axboe, martin.petersen, linux-block, kbusch On 15/08/2024 16:26, Christoph Hellwig wrote: >>>> +/* >>>> + * Pass bio_write_zeroes_limit() return value in @limit, as the return >>>> + * value may change after a REQ_OP_WRITE_ZEROES is issued. >>>> + */ >>> I don't think that really helps all that much to explain the issue, >>> which is about SCSI not having an ahead of time flag that reliably >>> works for write same support, which makes it clear the limit to 0 on >>> the first I/O completion. Maybe you can actually spell this out? >> Please just tell me what you would like to see, and I will copy verbatim. > Probably just what I just wrote. Unless Martin can come up with > better language. /* * SCSI does not have a reliable ahead of time flag that reliably * works for write same support; instead, the limit is set to zero for * the first failed I/O completion. As such, only read the limit prior * to submission, as it may later change. */ ok? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() 2024-08-15 15:26 ` Christoph Hellwig 2024-08-15 15:38 ` John Garry @ 2024-08-15 15:41 ` Martin K. Petersen 1 sibling, 0 replies; 9+ messages in thread From: Martin K. Petersen @ 2024-08-15 15:41 UTC (permalink / raw) To: Christoph Hellwig; +Cc: John Garry, axboe, martin.petersen, linux-block, kbusch Christoph, >>>> +/* >>>> + * Pass bio_write_zeroes_limit() return value in @limit, as the return >>>> + * value may change after a REQ_OP_WRITE_ZEROES is issued. >>>> + */ >>> I don't think that really helps all that much to explain the issue, >>> which is about SCSI not having an ahead of time flag that reliably >>> works for write same support, which makes it clear the limit to 0 on >>> the first I/O completion. Maybe you can actually spell this out? >> >> Please just tell me what you would like to see, and I will copy verbatim. > > Probably just what I just wrote. Unless Martin can come up with > better language. There is no reliable way for the SCSI subsystem to determine whether a device supports a WRITE SAME operation without actually performing a write to media. As a result, write_zeroes is enabled by default and will be disabled if a zeroing operation subsequently fails. This means that this queue limit is likely to change at runtime. Something like that, perhaps? -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() 2024-08-15 8:27 [PATCH 0/2] block: Fix __blkdev_issue_write_zeroes() limit handling John Garry 2024-08-15 8:27 ` [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() John Garry @ 2024-08-15 8:27 ` John Garry 2024-08-15 12:40 ` Christoph Hellwig 1 sibling, 1 reply; 9+ messages in thread From: John Garry @ 2024-08-15 8:27 UTC (permalink / raw) To: axboe, hch; +Cc: martin.petersen, linux-block, kbusch, John Garry Function bdev_get_queue() must not return NULL, so drop the check in bdev_write_zeroes_sectors(). Signed-off-by: John Garry <john.g.garry@oracle.com> --- include/linux/blkdev.h | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index e85ec73a07d5..b7664d593486 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1296,12 +1296,7 @@ bdev_max_secure_erase_sectors(struct block_device *bdev) static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev) { - struct request_queue *q = bdev_get_queue(bdev); - - if (q) - return q->limits.max_write_zeroes_sectors; - - return 0; + return bdev_get_queue(bdev)->limits.max_write_zeroes_sectors; } static inline bool bdev_nonrot(struct block_device *bdev) -- 2.31.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() 2024-08-15 8:27 ` [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() John Garry @ 2024-08-15 12:40 ` Christoph Hellwig 0 siblings, 0 replies; 9+ messages in thread From: Christoph Hellwig @ 2024-08-15 12:40 UTC (permalink / raw) To: John Garry; +Cc: axboe, hch, martin.petersen, linux-block, kbusch Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-08-15 15:41 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-15 8:27 [PATCH 0/2] block: Fix __blkdev_issue_write_zeroes() limit handling John Garry 2024-08-15 8:27 ` [PATCH 1/2] block: Read max write zeroes once for __blkdev_issue_write_zeroes() John Garry 2024-08-15 12:40 ` Christoph Hellwig 2024-08-15 13:29 ` John Garry 2024-08-15 15:26 ` Christoph Hellwig 2024-08-15 15:38 ` John Garry 2024-08-15 15:41 ` Martin K. Petersen 2024-08-15 8:27 ` [PATCH 2/2] block: Drop NULL check in bdev_write_zeroes_sectors() John Garry 2024-08-15 12:40 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox