From: Douglas Gilbert <dgilbert@interlog.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>,
James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
Jens Axboe <axboe@fb.com>,
"lsf@lists.linux-foundation.org" <lsf@lists.linux-foundation.org>
Subject: Re: [Lsf] LSF/MM Schedule and improving discard support
Date: Wed, 13 Apr 2016 18:04:51 -0400
Message-ID: <570EC283.3070605@interlog.com>
In-Reply-To: <20160413173037.GB14037@birch.djwong.org>

On 16-04-13 01:30 PM, Darrick J. Wong wrote:
> On Wed, Apr 13, 2016 at 09:51:04AM -0700, James Bottomley wrote:
>> On Wed, 2016-04-13 at 09:29 -0700, Bart Van Assche wrote:
>>> On 04/13/2016 09:21 AM, Martin K. Petersen wrote:
>>>> From a filesystem/ioctl perspective, BLKDISCARD is a hint. We
>>>> should not be
>>>> rounding off or aligning anything.
>>>
>>> Hello Martin,
>>>
>>> Today if a BLKDISCARD ioctl passes a non-aligned start and/or end
>>> sector to the kernel then the block layer will submit invalid
>>> (non-aligned) REQ_DISCARD requests to the block driver the ioctl applies
>>> to. This is not acceptable. Does the above mean that you are
>>> proposing to fail such BLKDISCARD ioctls with an error code?
>>
>> The answer would be of course not. discard is a hint so malformed
>> discard gets ignored by the device and success is returned because you
>> can't oblige devices to obey hints (that's why they're called hints).
>
> Agree. For blockdev FALLOC_FL_PUNCH_HOLE I think we can simply check for
> logical block size ("lbs") alignment and then pass the request to the
> device with the understanding that it can do as it pleases. We asked the
> device to try to deallocate blocks, and perhaps it cannot.
>
> Just to be clear, this only applies to zeroing discard; the "discard and who
> knows what you can now read back" thing that nobody likes has been temporarily
> wired up to FALLOC_FL_PUNCH_HOLE | FALLOC_FL_NO_HIDE_STALE. :)
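
To make the "check for lbs alignment, then pass the request down" idea
concrete, here is a minimal userspace sketch. The function names
lbs_aligned() and check_punch_range() are hypothetical and are not kernel
functions. Returning -EINVAL for a misaligned range is an assumption; the
thread has not settled whether such requests should be rejected.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if 'value' is a multiple of 'lbs'.
 * Assumes 'lbs' is a non-zero power of two (e.g. 512 or 4096).
 */
static bool lbs_aligned(uint64_t value, uint32_t lbs)
{
    return (value & ((uint64_t)lbs - 1)) == 0;
}

/*
 * Hypothetical check for a blockdev punch-hole request given in bytes.
 * Returns 0 if the range could be handed to the device as a hint,
 * or -EINVAL (an assumed policy) if it is malformed or misaligned.
 */
static int check_punch_range(uint64_t offset, uint64_t len, uint32_t lbs)
{
    /* Reject a logical block size that is zero or not a power of two. */
    if (lbs == 0 || (lbs & (lbs - 1)) != 0)
        return -EINVAL;

    /* Reject empty ranges and ranges that wrap around 2^64. */
    if (len == 0 || offset + len < offset)
        return -EINVAL;

    /* Both the start and the length must be whole logical blocks. */
    if (!lbs_aligned(offset, lbs) || !lbs_aligned(len, lbs))
        return -EINVAL;

    /* Aligned: the device may still decline to deallocate anything. */
    return 0;
}
```

For example, check_punch_range(4096, 8192, 512) returns 0, while
check_punch_range(100, 512, 512) returns -EINVAL under the assumed policy.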

In May last year, T10 added another wrinkle when they expanded the LBPRZ
field from 1 to 3 bits (in the LBP VPD page but _not_ in the READ
CAPACITY(16) response). The expansion is to allow a new response when
an unmapped logical block is read: return a "provisioning initialization
pattern". That new piece of jargon is defined as a "non-zero pattern that
is the length of one logical block".

It seems that the "provisioning initialization pattern" is the same for
every unmapped logical block and is chosen by the manufacturer. It can
be read with the new REPORT PROVISIONING INITIALIZATION PATTERN command.

If LBPRZ=2 and FORMAT UNIT is called with an "initialization pattern"
equal to the disk's "provisioning initialization pattern" then all
logical blocks are unmapped. Clear?
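
A rough sketch of what the two field widths mean for code that parses
these responses. The byte and bit positions are my reading of the drafts
(LBPRZ in bits 4..2 of byte 5 of the LBP VPD page, and a single bit, bit 6
of byte 14, in the READ CAPACITY(16) response) and should be checked
against the current SBC-4 text. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * LBPRZ as a 3-bit field in the Logical Block Provisioning VPD page.
 * Assumed layout: byte 5, bits 4..2. 'page' must hold at least 6 bytes.
 */
static unsigned int lbp_vpd_lbprz(const uint8_t *page)
{
    return (page[5] >> 2) & 0x7;
}

/*
 * LBPRZ as a single bit in the READ CAPACITY(16) response.
 * Assumed layout: byte 14, bit 6. 'resp' must hold at least 15 bytes.
 * A 1-bit field cannot express the value 2.
 */
static unsigned int readcap16_lbprz(const uint8_t *resp)
{
    return (resp[14] >> 6) & 0x1;
}

/*
 * Describe what a read of an unmapped logical block returns.
 * Only the values discussed in this thread are interpreted;
 * anything else is reported as "other".
 */
static const char *lbprz_meaning(unsigned int lbprz)
{
    switch (lbprz) {
    case 0:
        return "not specified";
    case 1:
        return "zeros";
    case 2:
        return "provisioning initialization pattern";
    default:
        return "other";
    }
}
```

So a device reporting LBPRZ=2 is only visible as such through the VPD
page; code that looks at READ CAPACITY(16) alone sees a single bit.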

Doug Gilbert

>> However, the problem of needing a mandatory discard for scrubbing
>> blocks is part of the fallocate discussion, I think.
>
> The third fallocate mode (FALLOC_FL_ZERO_RANGE) doesn't fit with the phrase
> "mandatory discard for scrubbing blocks", though if one removed "discard" from
> that phrase then it would. The only thing that ZERO_RANGE guarantees is that
> subsequent reads return zeroes. XFS punches the entire range and reallocates
> it with unwritten extents; ext4 fills the holes in the range with unwritten
> extents and converts real extents to unwritten. Both also write zeroes to any
> part of the range that doesn't align to an FS block.
>
> Yes, I think there are several questions to resolve here for mandatory zeroing
> with FALLOC_FL_ZERO_RANGE (summarizing the issues I've come up with so far):
>
> a) Should blockdev fallocate accept byte-granular offset/length arguments, even
> if it has to use the page cache to write zeroes to the device? This is what
> file fallocate does today.
>
> b) If blockdev fallocate does impose alignment requirements, should it return
> EINVAL to a request that isn't aligned to the logical block size?
>
> c) If a device really really prefers that its requests are aligned to
> min_io_size (which can be much larger than the logical block size), should it
> reject requests that aren't aligned to min_io? Or perhaps it should take care
> of the alignment problems on its own somehow?
>
> For allocate mode (the thing Mike Snitzer brought up in another thread
> yesterday), the alignment problems are much easier because we're allowed to
> round the start down and the end up to fit whatever alignment we require.
>
> Should we promote this to a storage track session at LSF next week?
>
> --D
>
>>
>> James
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>