public inbox for linux-fsdevel@vger.kernel.org
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, axboe@kernel.dk,
	martin.petersen@oracle.com, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH] block: reintroduce discard_zeroes_data sysfs file and BLKDISCARDZEROES
Date: Wed, 16 Aug 2017 21:49:22 -0400
Message-ID: <yq1inhnneu5.fsf@oracle.com>
In-Reply-To: <20170816154845.y5kcq3ssbp7efduy@rh_laptop> (Lukas Czerner's message of "Wed, 16 Aug 2017 17:48:45 +0200")


Lukas,

> I'd like to be able to recognize where we have a device that does
> support write zero with unmap, TRIM with RZAT and whatever else that
> provides this.

The problem was that the original REQ_DISCARD was used both for
de-provisioning block ranges and for zeroing them. On top of that, the
storage standards tweaked the definitions a bit so the semantics became
even more confusing and harder to honor in the drivers.

As a result, we changed things so that discards are only used to
de-provision blocks. And the zeroout call/ioctl is used to zero block
ranges.

Which ATA/SCSI/NVMe command is issued on the back-end depends on what's
supported by the device and is hidden from the caller.

However, zeroout is guaranteed to return a zeroed block range on
subsequent reads. The blocks may be unmapped, anchored, written
explicitly, written with write same, or a combination thereof. But you
are guaranteed predictable results.
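As a rough userspace illustration of the split described above, the sketch below wraps the two block-device ioctls from <linux/fs.h>: BLKZEROOUT for zeroing and BLKDISCARD for de-provisioning. The helper names zero_range() and discard_range() are made up for this example, and the byte offsets are assumed to be aligned to the device's logical block size. It is a minimal sketch, not a statement of how any particular tool does it.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKZEROOUT, BLKDISCARD */

/*
 * Hypothetical helper: ask the kernel to zero [start, start + len) on the
 * block device behind fd. On success, later reads of the range return
 * zeroes, whichever back-end command was used.
 * Returns 0 on success or a negative errno value on failure.
 */
static int zero_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };   /* byte offset, byte length */

	if (ioctl(fd, BLKZEROOUT, range) < 0)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: hint that [start, start + len) is no longer in use.
 * Success says nothing about what a later read of the range will return.
 * Returns 0 on success or a negative errno value on failure.
 */
static int discard_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };   /* byte offset, byte length */

	if (ioctl(fd, BLKDISCARD, range) < 0)
		return -errno;
	return 0;
}
```

A caller that needs zeroes on read-back would open the device for writing and call zero_range(); discard_range() is only appropriate when the contents of the range no longer matter.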

Whereas a discarded region may be sliced and diced and rounded off
before it hits the device. Which is then free to ignore all or parts of
the request.
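To make the "rounded off" part concrete, here is a toy calculation of what can happen to a discard that is not aligned to the device's discard granularity. trim_to_granularity() and struct byte_range are hypothetical names for this illustration, and the inward rounding is an assumed simplification; it is not the block layer's actual splitting code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for this illustration: a byte range on the device. */
struct byte_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Shrink [start, start + len) inward to multiples of granularity, the way
 * an unaligned head and tail may be dropped before a discard reaches the
 * device. A returned len of 0 means nothing aligned is left to discard.
 */
static struct byte_range trim_to_granularity(uint64_t start, uint64_t len,
					     uint64_t granularity)
{
	assert(granularity > 0);

	uint64_t end = start + len;
	/* Round the start up and the end down to the granularity. */
	uint64_t first = (start + granularity - 1) / granularity * granularity;
	uint64_t last = end / granularity * granularity;
	struct byte_range out = { first, 0 };

	if (last > first)
		out.len = last - first;
	return out;
}
```

With a 4096-byte granularity, a 10000-byte discard starting at byte 1000 shrinks to the 4096 bytes starting at offset 4096; the remaining 5904 bytes are left untouched, so reading them back will not return zeroes unless they already held zeroes.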

Consequently, discard_zeroes_data is meaningless. Because there is no
guarantee that all of the discarded blocks will be acted upon. It
kinda-sorta sometimes worked (if the device was whitelisted, had a
reported alignment of 0, a granularity of 512 bytes, stacking didn't get
in the way, and you were lucky on the device end). But there were always
conditions.

So taking a step back: What information specifically were you trying to
obtain from querying that flag? And why do you need it?

-- 
Martin K. Petersen	Oracle Linux Engineering
