From: Theodore Ts'o <tytso@mit.edu>
To: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH RFC] block: use discard if possible in blkdev_issue_discard()
Date: Thu, 13 Feb 2014 23:32:56 -0500 [thread overview]
Message-ID: <20140214043256.GA5145@thunk.org> (raw)
How do people think I should implement this functionality? I see
three possible choices:
1) The patch attached below
2) Add this functionality to blk-lib.c, but with a new function name,
such as blkdev_zero_blocks(), and keep blkdev_issue_zeroout()
unchanged
3) This functionality shouldn't be in the block device layer; if you
want something like this, add it to fs/ext4 instead, and other file
systems can optimize sb_issue_zeroout() themselves if they want.
And if the answer is (1) or (2), do people mind if I carry this patch
in the ext4 tree, so I can use and test this right away, without
having to worry about merging with the block tree?
Thanks,
- Ted
commit 1af2359e6ffca707c41ed40618773abe944d0c54
Author: Theodore Ts'o <tytso@mit.edu>
Date: Thu Feb 13 23:27:27 2014 -0500
block: use discard if possible in blkdev_issue_zeroout()
If the block device supports discards and guarantees that subsequent
reads will return zeros (sometimes known as DZAT, for Deterministic
read Zeros After Trim), use this to implement blkdev_issue_zeroout()
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 2da76c9..62cbf28 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -269,6 +269,32 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
return ret;
}
+static int issue_zeroout_or_write_same(struct block_device *bdev,
+ sector_t sector,
+ sector_t nr_sects, gfp_t gfp_mask)
+{
+ if (bdev_write_same(bdev)) {
+ unsigned char bdn[BDEVNAME_SIZE];
+
+ if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
+ ZERO_PAGE(0)))
+ return 0;
+
+ bdevname(bdev, bdn);
+ pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+ }
+
+ return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+}
+
+/*
+ * Like sector_div except don't modify s.
+ */
+static unsigned int sector_mod(sector_t s, unsigned int m)
+{
+ return sector_div(s, m);
+}
+
/**
* blkdev_issue_zeroout - zero-fill a block range
* @bdev: blockdev to write
@@ -277,23 +303,49 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
* @gfp_mask: memory allocation flags (for bio_alloc)
*
* Description:
- * Generate and issue number of bios with zerofiled pages.
+ * Issues bios which zeros the requested block range.
*/
-
int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask)
{
- if (bdev_write_same(bdev)) {
- unsigned char bdn[BDEVNAME_SIZE];
+ struct request_queue *q = bdev_get_queue(bdev);
+ unsigned int alignment, granularity;
+ unsigned int c;
+ int ret;
- if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
- ZERO_PAGE(0)))
- return 0;
+ if (!q)
+ return -ENXIO;
- bdevname(bdev, bdn);
- pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+ if (!blk_queue_discard(q) || !queue_discard_zeroes_data(q) ||
+ q->limits.discard_misaligned)
+ return issue_zeroout_or_write_same(bdev, sector,
+
+ nr_sects, gfp_mask);
+
+ alignment = q->limits.discard_alignment >> 9;
+ granularity = q->limits.discard_granularity >> 9;
+
+ c = sector_mod(granularity + alignment - sector, granularity);
+ if (c > nr_sects)
+ c = nr_sects;
+
+ if (c) {
+ int ret = issue_zeroout_or_write_same(bdev, sector,
+ c, gfp_mask);
+ if (ret)
+ return ret;
+ nr_sects -= c;
}
+ if (nr_sects == 0)
+ return 0;
- return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+ c = sector_mod(nr_sects, granularity);
+
+ ret = blkdev_issue_discard(bdev, sector, nr_sects - c, gfp_mask, 0);
+ if (ret || c == 0)
+ return ret;
+
+ return issue_zeroout_or_write_same(bdev, sector + nr_sects - c, c,
+ gfp_mask);
}
EXPORT_SYMBOL(blkdev_issue_zeroout);
next reply other threads:[~2014-02-14 4:32 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-14 4:32 Theodore Ts'o [this message]
2014-02-14 13:05 ` [PATCH RFC] block: use discard if possible in blkdev_issue_discard() Christoph Hellwig
2014-02-14 14:57 ` Theodore Ts'o
2014-02-14 17:14 ` Martin K. Petersen
2014-02-15 1:29 ` Theodore Ts'o
2014-02-17 16:44 ` Martin K. Petersen
2014-02-17 19:19 ` Theodore Ts'o
2014-02-18 1:31 ` Martin K. Petersen
2014-02-18 2:17 ` Theodore Ts'o
2014-02-18 3:44 ` Martin K. Petersen
2014-02-18 5:47 ` Theodore Ts'o
2014-02-19 2:20 ` Martin K. Petersen
2014-02-17 16:41 ` Lukáš Czerner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140214043256.GA5145@thunk.org \
--to=tytso@mit.edu \
--cc=axboe@kernel.dk \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.