linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Theodore Ts'o <tytso@mit.edu>
To: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH RFC] block: use discard if possible in blkdev_issue_discard()
Date: Thu, 13 Feb 2014 23:32:56 -0500	[thread overview]
Message-ID: <20140214043256.GA5145@thunk.org> (raw)

How do people think I should implement this functionality?  I see
three possible choices:

1) The patch attached below

2) Add this functionality to blk-lib.c, but with a new function name,
   such as blkdev_zero_blocks(), and keep blkdev_issue_zeroout()
   unchanged

3) This functionality shouldn't be in the block device layer; if you
   want something like this, add it to fs/ext4 instead, and other file
   systems can optimize sb_issue_zeroout() themselves if they want.

And if the answer is (1) or (2), do people mind if I carry this patch
in the ext4 tree, so I can use and test this right away, without
having to worry about merging with the block tree?

Thanks,

					- Ted


commit 1af2359e6ffca707c41ed40618773abe944d0c54
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Thu Feb 13 23:27:27 2014 -0500

    block: use discard if possible in blkdev_issue_zeroout()
    
    If the block device supports discards and guarantees that subsequent
    reads will return zeros (sometimes known as DZAT, for Deterministic
    read Zeros After Trim), use this to implement blkdev_issue_zeroout()
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 2da76c9..62cbf28 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -269,6 +269,32 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	return ret;
 }
 
+static int issue_zeroout_or_write_same(struct block_device *bdev,
+				       sector_t sector,
+				       sector_t nr_sects, gfp_t gfp_mask)
+{
+	if (bdev_write_same(bdev)) {
+		unsigned char bdn[BDEVNAME_SIZE];
+
+		if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
+					     ZERO_PAGE(0)))
+			return 0;
+
+		bdevname(bdev, bdn);
+		pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+	}
+
+	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+}
+
+/*
+ * Like sector_div except don't modify s.
+ */
+static unsigned int sector_mod(sector_t s, unsigned int m)
+{
+	return sector_div(s, m);
+}
+
 /**
  * blkdev_issue_zeroout - zero-fill a block range
  * @bdev:	blockdev to write
@@ -277,23 +303,49 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
  * @gfp_mask:	memory allocation flags (for bio_alloc)
  *
  * Description:
- *  Generate and issue number of bios with zerofiled pages.
+ *  Issues bios which zeros the requested block range.
  */
-
 int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			 sector_t nr_sects, gfp_t gfp_mask)
 {
-	if (bdev_write_same(bdev)) {
-		unsigned char bdn[BDEVNAME_SIZE];
+	struct request_queue *q = bdev_get_queue(bdev);
+	unsigned int alignment, granularity;
+	unsigned int c;
+	int ret;
 
-		if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
-					     ZERO_PAGE(0)))
-			return 0;
+	if (!q)
+		return -ENXIO;
 
-		bdevname(bdev, bdn);
-		pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+	if (!blk_queue_discard(q) || !queue_discard_zeroes_data(q) ||
+	    q->limits.discard_misaligned)
+		return issue_zeroout_or_write_same(bdev, sector,
+
+						   nr_sects, gfp_mask);
+
+	alignment = q->limits.discard_alignment >> 9;
+	granularity = q->limits.discard_granularity >> 9;
+
+	c = sector_mod(granularity + alignment - sector, granularity);
+	if (c > nr_sects)
+		c = nr_sects;
+
+	if (c) {
+		int ret = issue_zeroout_or_write_same(bdev, sector,
+						      c, gfp_mask);
+		if (ret)
+			return ret;
+		nr_sects -= c;
 	}
+	if (nr_sects == 0)
+		return 0;
 
-	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+	c = sector_mod(nr_sects, granularity);
+
+	ret = blkdev_issue_discard(bdev, sector, nr_sects - c, gfp_mask, 0);
+	if (ret || c == 0)
+		return ret;
+
+	return issue_zeroout_or_write_same(bdev, sector + nr_sects - c, c,
+					   gfp_mask);
 }
 EXPORT_SYMBOL(blkdev_issue_zeroout);



             reply	other threads:[~2014-02-14  4:32 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-14  4:32 Theodore Ts'o [this message]
2014-02-14 13:05 ` [PATCH RFC] block: use discard if possible in blkdev_issue_discard() Christoph Hellwig
2014-02-14 14:57   ` Theodore Ts'o
2014-02-14 17:14 ` Martin K. Petersen
2014-02-15  1:29   ` Theodore Ts'o
2014-02-17 16:44     ` Martin K. Petersen
2014-02-17 19:19       ` Theodore Ts'o
2014-02-18  1:31         ` Martin K. Petersen
2014-02-18  2:17           ` Theodore Ts'o
2014-02-18  3:44             ` Martin K. Petersen
2014-02-18  5:47               ` Theodore Ts'o
2014-02-19  2:20                 ` Martin K. Petersen
2014-02-17 16:41   ` Lukáš Czerner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140214043256.GA5145@thunk.org \
    --to=tytso@mit.edu \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).