* RFC: 512e ZBC host-managed disks
From: Damien Le Moal @ 2017-01-12 8:13 UTC
To: Martin K. Petersen, James Bottomley
Cc: linux-scsi@vger.kernel.org, Hannes Reinecke, Christoph Hellwig,
Shaun Tancheff, linux-block@vger.kernel.org
Regular block devices are always accessible in units of the logical block
size, regardless of the actual physical block size of the device.
For hard disks, the common cases are:
512n: 512B logical and physical blocks
512e: 512B logical blocks and 4096B physical blocks
4Kn: 4096B logical and physical blocks
and sd.c in the kernel checks the position and size of requests, expressed
in 512B "sectors", for alignment against the logical block size declared by
the disk. All is fine with this, nothing new.
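To make the check described above concrete, here is a minimal Python sketch. It is not the sd.c code: the helper name and its parameters are hypothetical, and it only assumes that request position and size are given in 512B sectors and must be multiples of the logical block size.

```python
SECTOR_SIZE = 512  # block layer requests are expressed in 512B "sectors"


def request_is_aligned(start_sector, nr_sectors, logical_block_size):
    """Return True if a request starts and ends on logical block boundaries.

    Hypothetical helper for illustration only; it mirrors the idea of the
    check, not the kernel implementation.
    """
    if logical_block_size % SECTOR_SIZE != 0:
        raise ValueError("logical block size must be a multiple of 512B")
    sectors_per_block = logical_block_size // SECTOR_SIZE
    return (start_sector % sectors_per_block == 0
            and nr_sectors % sectors_per_block == 0)


# 512n and 512e disks (512B logical blocks): any sector position is fine.
print(request_is_aligned(3, 1, 512))
# 4Kn disk (4096B logical blocks): position and size must be multiples of 8.
print(request_is_aligned(8, 8, 4096))
print(request_is_aligned(3, 1, 4096))
```

With 512B logical blocks every request passes, which is exactly why a 512e host-managed disk gives no hint of the stricter write rule discussed next.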
However, for host-managed zoned block devices (ZBC), the 512e case
breaks this model: the standard allows for 512B logical block reads,
*but* writes MUST be aligned on 4KB boundaries within sequential zones
(still using the 512B logical block size addressing). This is a problem
for users of the disk, e.g. an FS, which may wrongly believe that writing
512B units is possible (and so that it can use a 512B FS block size).
Host-aware devices do not have this restriction. Nor does the
restriction apply to writes in conventional zones of host-managed devices.
Summary: for HM 512e block devices, reads are 512e compliant, but writes
in sequential zones are 4Kn compliant.
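The asymmetry summarized above can be sketched as follows. This is an illustrative Python model only: the function names and the string values for the operation, zone type and zone model are assumptions made for this example, and the sketch uses the 4096B physical block size as the write alignment of a 512e device. It checks alignment only, not the sequential write constraint.

```python
SECTOR_SIZE = 512  # request positions and sizes are in 512B sectors


def required_alignment(op, zone_type, zone_model,
                       logical_block_size=512, physical_block_size=4096):
    """Return the alignment in bytes that a request must satisfy.

    Hypothetical model of a 512e zoned disk: only writes to sequential
    zones of a host-managed device need physical block alignment.
    """
    if (op == "write" and zone_model == "host-managed"
            and zone_type == "sequential"):
        return physical_block_size
    return logical_block_size


def request_allowed(op, start_sector, nr_sectors, zone_type, zone_model):
    """Check request position and size against the required alignment."""
    align_sectors = required_alignment(op, zone_type, zone_model) // SECTOR_SIZE
    return (start_sector % align_sectors == 0
            and nr_sectors % align_sectors == 0)


# A single 512B read is fine, the same write in a sequential zone is not.
print(request_allowed("read", 1, 1, "sequential", "host-managed"))
print(request_allowed("write", 1, 1, "sequential", "host-managed"))
# Conventional zones and host-aware devices keep 512B write granularity.
print(request_allowed("write", 1, 1, "conventional", "host-managed"))
print(request_allowed("write", 1, 1, "sequential", "host-aware"))
```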
I would like opinions on whether we should do something about this. I see
the following possible options:
(1) Do nothing and let the disk user deal with the write alignment
problem. It already has to do so anyway as writes must be sequential.
But this would force in-kernel users to go and look at the device
physical block size, which is not something usually done by layers above
the block layer (FS, device mappers etc).
(2) For 512e host-managed devices, always report to the block layer
(device queue) a larger logical block size of 4096B to allow for disk
users to seamlessly adjust to the disk type without having to deal with
the physical sector size. I do not think that this would actually
require changing the scsi_disk->sector_size field to that incorrect
value, so command addressing would not break. But I wonder whether this
might break a lot of things because of the difference introduced.
(3) Any other idea ?
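As a rough illustration of option (2), the Python sketch below separates the logical block size advertised to the block layer from the block size used for command addressing. Both helpers are hypothetical and are not taken from sd.c; `device_sector_size` merely stands in for the role of scsi_disk->sector_size.

```python
SECTOR_SIZE = 512  # unit of block layer request positions


def reported_logical_block_size(zone_model, logical_block_size,
                                physical_block_size):
    """Logical block size advertised to the block layer under option (2).

    Only a host-managed device with logical blocks smaller than its
    physical blocks (512e) gets the larger value.
    """
    if (zone_model == "host-managed"
            and logical_block_size < physical_block_size):
        return physical_block_size
    return logical_block_size


def sector_to_lba(sector, device_sector_size):
    """Convert a 512B sector position into a device LBA.

    device_sector_size is the block size the device really uses for
    addressing, left untouched by option (2).
    """
    offset = sector * SECTOR_SIZE
    if offset % device_sector_size != 0:
        raise ValueError("position not aligned to the device block size")
    return offset // device_sector_size


# 512e host-managed disk: users see 4096B logical blocks...
print(reported_logical_block_size("host-managed", 512, 4096))
# ...while commands are still addressed in the real 512B blocks.
print(sector_to_lba(8, 512))
```

In this model disk users only ever see the larger block size, so every request they issue is already 4KB aligned, while LBA computation keeps using the true 512B unit.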
Best regards.
--
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
Damien.LeMoal@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com
* Re: RFC: 512e ZBC host-managed disks
From: Christoph Hellwig @ 2017-01-12 8:20 UTC
To: Damien Le Moal
Cc: Martin K. Petersen, James Bottomley, linux-scsi@vger.kernel.org,
Hannes Reinecke, Christoph Hellwig, Shaun Tancheff,
linux-block@vger.kernel.org
On Thu, Jan 12, 2017 at 05:13:52PM +0900, Damien Le Moal wrote:
> (3) Any other idea ?
Do nothing and ignore the problem. This whole idea is so braindead that
the person coming up with the T10 language should be shot. Either a device
has 512B logical sectors or 4k but not this crazy mix.
And make sure no one ships such a piece of crap because we are sure as
hell not going to support it.
* Re: RFC: 512e ZBC host-managed disks
From: Jeff Moyer @ 2017-01-12 15:02 UTC
To: Christoph Hellwig
Cc: Damien Le Moal, Martin K. Petersen, James Bottomley,
linux-scsi@vger.kernel.org, Hannes Reinecke, Shaun Tancheff,
linux-block@vger.kernel.org
Christoph Hellwig <hch@lst.de> writes:
> On Thu, Jan 12, 2017 at 05:13:52PM +0900, Damien Le Moal wrote:
>> (3) Any other idea ?
>
> Do nothing and ignore the problem. This whole idea is so braindead that
> the person coming up with the T10 language should be shot. Either a device
> has 512B logical sectors or 4k but not this crazy mix.
>
> And make sure no one ships such a piece of crap because we are sure as
> hell not going to support it.
Agreed. This is insane.
-Jeff
* Re: RFC: 512e ZBC host-managed disks
From: Damien Le Moal @ 2017-01-13 0:14 UTC
To: Jeff Moyer, Christoph Hellwig
Cc: Martin K. Petersen, James Bottomley, linux-scsi@vger.kernel.org,
Hannes Reinecke, Shaun Tancheff, linux-block@vger.kernel.org
On 1/13/17 00:02, Jeff Moyer wrote:
> Christoph Hellwig <hch@lst.de> writes:
>
>> On Thu, Jan 12, 2017 at 05:13:52PM +0900, Damien Le Moal wrote:
>>> (3) Any other idea ?
>>
>> Do nothing and ignore the problem. This whole idea is so braindead that
>> the person coming up with the T10 language should be shot. Either a device
>> has 512B logical sectors or 4k but not this crazy mix.
>>
>> And make sure no one ships such a piece of crap because we are sure as
>> hell not going to support it.
>
> Agreed. This is insane.
Christoph, Jeff,
Thank you for the feedback.