All of lore.kernel.org
 help / color / mirror / Atom feed
From: Theodore Ts'o <tytso@mit.edu>
To: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH RFC] block: use discard if possible in blkdev_issue_discard()
Date: Thu, 13 Feb 2014 23:32:56 -0500	[thread overview]
Message-ID: <20140214043256.GA5145@thunk.org> (raw)

How do people think I should implement this functionality?  I see
three possible choices:

1) The patch attached below

2) Add this functionality to blk-lib.c, but with a new function name,
   such as blkdev_zero_blocks(), and keep blkdev_issue_zeroout()
   unchanged

3) This functionality shouldn't be in the block device layer; if you
   want something like this, add it to fs/ext4 instead, and other file
   systems can optimize sb_issue_zeroout() themselves if they want.

And if the answer is (1) or (2), do people mind if I carry this patch
in the ext4 tree, so I can use and test this right away, without
having to worry about merging with the block tree?

Thanks,

					- Ted


commit 1af2359e6ffca707c41ed40618773abe944d0c54
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Thu Feb 13 23:27:27 2014 -0500

    block: use discard if possible in blkdev_issue_zeroout()
    
    If the block device supports discards and guarantees that subsequent
    reads will return zeros (sometimes known as DZAT, for Deterministic
    read Zeros After Trim), use this to implement blkdev_issue_zeroout()
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 2da76c9..62cbf28 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -269,6 +269,32 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	return ret;
 }
 
+static int issue_zeroout_or_write_same(struct block_device *bdev,
+				       sector_t sector,
+				       sector_t nr_sects, gfp_t gfp_mask)
+{
+	if (bdev_write_same(bdev)) {
+		unsigned char bdn[BDEVNAME_SIZE];
+
+		if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
+					     ZERO_PAGE(0)))
+			return 0;
+
+		bdevname(bdev, bdn);
+		pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+	}
+
+	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+}
+
+/*
+ * Like sector_div except don't modify s.
+ */
+static unsigned int sector_mod(sector_t s, unsigned int m)
+{
+	return sector_div(s, m);
+}
+
 /**
  * blkdev_issue_zeroout - zero-fill a block range
  * @bdev:	blockdev to write
@@ -277,23 +303,49 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
  * @gfp_mask:	memory allocation flags (for bio_alloc)
  *
  * Description:
- *  Generate and issue number of bios with zerofiled pages.
+ *  Issues bios which zeros the requested block range.
  */
-
 int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			 sector_t nr_sects, gfp_t gfp_mask)
 {
-	if (bdev_write_same(bdev)) {
-		unsigned char bdn[BDEVNAME_SIZE];
+	struct request_queue *q = bdev_get_queue(bdev);
+	unsigned int alignment, granularity;
+	unsigned int c;
+	int ret;
 
-		if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
-					     ZERO_PAGE(0)))
-			return 0;
+	if (!q)
+		return -ENXIO;
 
-		bdevname(bdev, bdn);
-		pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
+	if (!blk_queue_discard(q) || !queue_discard_zeroes_data(q) ||
+	    q->limits.discard_misaligned)
+		return issue_zeroout_or_write_same(bdev, sector,
+
+						   nr_sects, gfp_mask);
+
+	alignment = q->limits.discard_alignment >> 9;
+	granularity = q->limits.discard_granularity >> 9;
+
+	c = sector_mod(granularity + alignment - sector, granularity);
+	if (c > nr_sects)
+		c = nr_sects;
+
+	if (c) {
+		int ret = issue_zeroout_or_write_same(bdev, sector,
+						      c, gfp_mask);
+		if (ret)
+			return ret;
+		nr_sects -= c;
 	}
+	if (nr_sects == 0)
+		return 0;
 
-	return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
+	c = sector_mod(nr_sects, granularity);
+
+	ret = blkdev_issue_discard(bdev, sector, nr_sects - c, gfp_mask, 0);
+	if (ret || c == 0)
+		return ret;
+
+	return issue_zeroout_or_write_same(bdev, sector + nr_sects - c, c,
+					   gfp_mask);
 }
 EXPORT_SYMBOL(blkdev_issue_zeroout);



             reply	other threads:[~2014-02-14  4:32 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-14  4:32 Theodore Ts'o [this message]
2014-02-14 13:05 ` [PATCH RFC] block: use discard if possible in blkdev_issue_discard() Christoph Hellwig
2014-02-14 14:57   ` Theodore Ts'o
2014-02-14 17:14 ` Martin K. Petersen
2014-02-15  1:29   ` Theodore Ts'o
2014-02-17 16:44     ` Martin K. Petersen
2014-02-17 19:19       ` Theodore Ts'o
2014-02-18  1:31         ` Martin K. Petersen
2014-02-18  2:17           ` Theodore Ts'o
2014-02-18  3:44             ` Martin K. Petersen
2014-02-18  5:47               ` Theodore Ts'o
2014-02-19  2:20                 ` Martin K. Petersen
2014-02-17 16:41   ` Lukáš Czerner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140214043256.GA5145@thunk.org \
    --to=tytso@mit.edu \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.