public inbox for linux-fsdevel@vger.kernel.org
From: Lukas Czerner <lczerner@redhat.com>
To: linux-fsdevel@vger.kernel.org
Cc: axboe@kernel.dk, martin.petersen@oracle.com, hch@lst.de,
	Lukas Czerner <lczerner@redhat.com>
Subject: [PATCH] block: reintroduce discard_zeroes_data sysfs file and BLKDISCARDZEROES
Date: Wed, 16 Aug 2017 15:19:41 +0200
Message-ID: <1502889581-19483-1-git-send-email-lczerner@redhat.com>

The discard and zeroout code has been significantly rewritten recently
and as a part of the rewrite we got rid of the discard_zeroes_data flag.

With commit 48920ff2a5a9 ("block: remove the discard_zeroes_data flag")
the discard_zeroes_data sysfs file and the BLKDISCARDZEROES ioctl now
always return zero, regardless of what the device actually supports.
This has broken userspace utilities: they no longer take advantage of
this functionality even if the device actually supports it.

Now the only way for a user to figure out whether the device supports
deterministic read zeroes after discard, without actually running
fallocate, is to check for discard support (discard_max_bytes) and
zeroout hw offload (write_zeroes_max_bytes).
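
For illustration, the user space check described above might look
roughly like the following. This is only a sketch: the helper names
(read_ull, limits_imply_zeroing, queue_discard_zeroes) are made up for
this example, and treating a non-zero discard_max_bytes as "discard is
supported" is an assumption, not something the kernel guarantees.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed.
 */
int read_ull(const char *path, unsigned long long *out)
{
	FILE *f = fopen(path, "r");
	char buf[64];
	char *end;

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*out = strtoull(buf, &end, 10);
	if (errno || end == buf)
		return -1;
	return 0;
}

/* Assumption: both limits being non-zero implies zeroing discards. */
int limits_imply_zeroing(unsigned long long discard_max_bytes,
			 unsigned long long write_zeroes_max_bytes)
{
	return discard_max_bytes != 0 && write_zeroes_max_bytes != 0;
}

/*
 * queue_dir is the queue directory of a disk chosen by the caller
 * (hypothetical example: "/sys/block/sda/queue").
 * Returns 1 or 0, or -1 if either attribute cannot be read.
 */
int queue_discard_zeroes(const char *queue_dir)
{
	char path[4096];
	unsigned long long discard_max, wz_max;

	snprintf(path, sizeof(path), "%s/discard_max_bytes", queue_dir);
	if (read_ull(path, &discard_max))
		return -1;
	snprintf(path, sizeof(path), "%s/write_zeroes_max_bytes", queue_dir);
	if (read_ull(path, &wz_max))
		return -1;
	return limits_imply_zeroing(discard_max, wz_max);
}
```

Every utility that wants this information would have to carry logic
along these lines, which is the duplication the patch tries to avoid.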

However, we still have the discard_zeroes_data sysfs file and the
BLKDISCARDZEROES ioctl, so I do not see any reason not to do this check
in the kernel and provide a convenient and compatible way to continue
to export this information to user space.

With this patch both the BLKDISCARDZEROES ioctl and discard_zeroes_data
will return 1 if both discard and hw offload for write zeroes are
supported. Otherwise they will return 0.
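
As a rough sketch of how a utility could consume the ioctl, assuming
the device path is supplied by the caller (query_discard_zeroes is a
hypothetical helper name, not an existing API):

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKDISCARDZEROES */

/*
 * Ask the kernel whether discarded blocks read back as zeroes.
 * dev_path is a block device node chosen by the caller (hypothetical
 * example: "/dev/sda"). On success the kernel stores 0 or 1 in
 * *zeroes and this returns 0. Returns -1 if the device cannot be
 * opened or the ioctl fails.
 */
int query_discard_zeroes(const char *dev_path, unsigned int *zeroes)
{
	int fd = open(dev_path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, BLKDISCARDZEROES, zeroes);
	close(fd);
	return ret < 0 ? -1 : 0;
}
```

A caller would treat a failure the same way as a reported 0 and not
rely on discarded ranges reading back as zeroes.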

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 Documentation/ABI/testing/sysfs-block | 11 +++++++++--
 Documentation/block/queue-sysfs.txt   |  5 +++++
 block/blk-sysfs.c                     |  5 ++++-
 block/ioctl.c                         |  6 +++++-
 4 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index dea212d..6ea0d03 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -213,8 +213,15 @@ What:		/sys/block/<disk>/queue/discard_zeroes_data
 Date:		May 2011
 Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 Description:
-		Will always return 0.  Don't rely on any specific behavior
-		for discards, and don't read this file.
+		Devices that support discard functionality may return
+		stale or random data when a previously discarded block
+		is read back. This can cause problems if the filesystem
+		expects discarded blocks to be explicitly cleared. If a
+		device reports that it deterministically returns zeroes
+		when a discarded area is read the discard_zeroes_data
+		parameter will be set to one. Otherwise it will be 0 and
+		the result of reading a discarded area is undefined.
+
 
 What:		/sys/block/<disk>/queue/write_same_max_bytes
 Date:		January 2012
diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt
index 2c1e670..b7f6bdc 100644
--- a/Documentation/block/queue-sysfs.txt
+++ b/Documentation/block/queue-sysfs.txt
@@ -43,6 +43,11 @@ large discards are issued, setting this value lower will make Linux issue
 smaller discards and potentially help reduce latencies induced by large
 discard operations.
 
+discard_zeroes_data (RO)
+------------------------
+When read, this file shows whether discarded blocks are zeroed by the
+device. If its value is '1' the blocks are zeroed, otherwise they are not.
+
 hw_sector_size (RO)
 -------------------
 This is the hardware sector size of the device, in bytes.
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 27aceab..5b41ad0 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -209,7 +209,10 @@ static ssize_t queue_discard_max_store(struct request_queue *q,
 
 static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char *page)
 {
-	return queue_var_show(0, page);
+	if (blk_queue_discard(q) && q->limits.max_write_zeroes_sectors)
+		return queue_var_show(1, page);
+	else
+		return queue_var_show(0, page);
 }
 
 static ssize_t queue_write_same_max_show(struct request_queue *q, char *page)
diff --git a/block/ioctl.c b/block/ioctl.c
index 0de02ee..faecd44 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -508,6 +508,7 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 	void __user *argp = (void __user *)arg;
 	loff_t size;
 	unsigned int max_sectors;
+	struct request_queue *q = bdev_get_queue(bdev);
 
 	switch (cmd) {
 	case BLKFLSBUF:
@@ -547,7 +548,10 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 	case BLKALIGNOFF:
 		return put_int(arg, bdev_alignment_offset(bdev));
 	case BLKDISCARDZEROES:
-		return put_uint(arg, 0);
+		if (blk_queue_discard(q) && q->limits.max_write_zeroes_sectors)
+			return put_uint(arg, 1);
+		else
+			return put_uint(arg, 0);
 	case BLKSECTGET:
 		max_sectors = min_t(unsigned int, USHRT_MAX,
 				    queue_max_sectors(bdev_get_queue(bdev)));
-- 
2.7.5

Thread overview: 12+ messages
2017-08-16 13:19 Lukas Czerner [this message]
2017-08-16 15:18 ` [PATCH] block: reintroduce discard_zeroes_data sysfs file and BLKDISCARDZEROES Christoph Hellwig
2017-08-16 15:48   ` Lukas Czerner
2017-08-17  1:49     ` Martin K. Petersen
2017-08-17  7:47       ` Lukas Czerner
2017-08-17  8:17         ` Christoph Hellwig
2017-08-17  8:41           ` Lukas Czerner
2017-08-17  9:52             ` Christoph Hellwig
2017-08-17 13:35               ` Lukas Czerner
2017-08-17 17:47                 ` Martin K. Petersen
2017-08-17 19:35                   ` Lukas Czerner
2017-08-17 20:39                   ` Theodore Ts'o
