public inbox for linux-ide@vger.kernel.org
 help / color / mirror / Atom feed
* ZBC_IN command translation
@ 2017-06-07  9:03 Damien Le Moal
  2017-06-07  9:07 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2017-06-07  9:03 UTC (permalink / raw)
  To: linux-ide, Hannes Reinecke, Tejun Heo

Tejun, Hannes,

Currently, in libata-scsi.c, the function ata_scsi_zbc_in_xlat
translating ZBC REPORT ZONES into the ZAC version returns an error if
the scsi command buffer length is not aligned on 512. This is possible
since the ZBC version allows report zones buffer as a multiple of 64B,
while the ZAC version of the same command requires 512B size alignment
of the command buffer.

However, SAT-4, in section 9.13.3 says:

"The SATL shall send the ATA REPORT ZONES EXT command with the ATA
RETURN PAGE COUNT field set to INT((ALLOCATION LENGTH + 511)/512)."

So clearly, instead of returning an error, we need to bounce the scsi
command buffer to a bigger 512B size aligned temporary buffer for the
ZAC report zones. I do not see any code ready to use to do so easily,
but I may be missing it. Is there something that can be used to do so or
do I need to cook something ?

Best regards.

-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-07  9:03 ZBC_IN command translation Damien Le Moal
@ 2017-06-07  9:07 ` Christoph Hellwig
  2017-06-07  9:11   ` Damien Le Moal
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2017-06-07  9:07 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide, Hannes Reinecke, Tejun Heo

On Wed, Jun 07, 2017 at 06:03:52PM +0900, Damien Le Moal wrote:
> Tejun, Hannes,
> 
> Currently, in libata-scsi.c, the function ata_scsi_zbc_in_xlat
> translating ZBC REPORT ZONES into the ZAC version returns an error if
> the scsi command buffer length is not aligned on 512. This is possible
> since the ZBC version allows report zones buffer as a multiple of 64B,
> while the ZAC version of the same command requires 512B size alignment
> of the command buffer.
> 
> However, SAT-4, in section 9.13.3 says:
> 
> "The SATL shall send the ATA REPORT ZONES EXT command with the ATA
> RETURN PAGE COUNT field set to INT((ALLOCATION LENGTH + 511)/512)."
> 
> So clearly, instead of returning an error, we need to bounce the scsi
> command buffer to a bigger 512B size aligned temporary buffer for the
> ZAC report zones. I do not see any code ready to use to do so easily,
> but I may be missing it. Is there something that can be used to do so or
> do I need to cook something ?

Just like for TRIM we should reject any ZBC_IN/OUT command that doesn't
come from the block layer, and then we can tightly control what input
we get.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-07  9:07 ` Christoph Hellwig
@ 2017-06-07  9:11   ` Damien Le Moal
  2017-06-07 13:24     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2017-06-07  9:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-ide, Hannes Reinecke, Tejun Heo

Christoph,

On 6/7/17 18:07, Christoph Hellwig wrote:
> On Wed, Jun 07, 2017 at 06:03:52PM +0900, Damien Le Moal wrote:
>> Tejun, Hannes,
>>
>> Currently, in libata-scsi.c, the function ata_scsi_zbc_in_xlat
>> translating ZBC REPORT ZONES into the ZAC version returns an error if
>> the scsi command buffer length is not aligned on 512. This is possible
>> since the ZBC version allows report zones buffer as a multiple of 64B,
>> while the ZAC version of the same command requires 512B size alignment
>> of the command buffer.
>>
>> However, SAT-4, in section 9.13.3 says:
>>
>> "The SATL shall send the ATA REPORT ZONES EXT command with the ATA
>> RETURN PAGE COUNT field set to INT((ALLOCATION LENGTH + 511)/512)."
>>
>> So clearly, instead of returning an error, we need to bounce the scsi
>> command buffer to a bigger 512B size aligned temporary buffer for the
>> ZAC report zones. I do not see any code ready to use to do so easily,
>> but I may be missing it. Is there something that can be used to do so or
>> do I need to cook something ?
> 
> Just like for TRIM we should reject any ZBC_IN/OUT command that doesn't
> come from the block layer, and then we can tightly control what input
> we get.

Sure, that is one solution. However, not a ideal one as that would
prevent ZBC/ZAC drives working with target/iscsi.
There are a few problems there that I am fixing right now, but that
translation error is stopping all effort.

-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-07  9:11   ` Damien Le Moal
@ 2017-06-07 13:24     ` Christoph Hellwig
  2017-06-08  0:18       ` Damien Le Moal
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2017-06-07 13:24 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Christoph Hellwig, linux-ide, Hannes Reinecke, Tejun Heo

On Wed, Jun 07, 2017 at 06:11:33PM +0900, Damien Le Moal wrote:
> Sure, that is one solution. However, not a ideal one as that would
> prevent ZBC/ZAC drives working with target/iscsi.
> There are a few problems there that I am fixing right now, but that
> translation error is stopping all effort.

The normal way to use the target code is through the block layer.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-07 13:24     ` Christoph Hellwig
@ 2017-06-08  0:18       ` Damien Le Moal
  2017-06-08  6:55         ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2017-06-08  0:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ide, Hannes Reinecke, Tejun Heo, Jens Axboe,
	Martin K. Petersen

Christoph,

(+CC Jens and Martin)

On 6/7/17 22:24, Christoph Hellwig wrote:
> On Wed, Jun 07, 2017 at 06:11:33PM +0900, Damien Le Moal wrote:
>> Sure, that is one solution. However, not a ideal one as that would
>> prevent ZBC/ZAC drives working with target/iscsi.
>> There are a few problems there that I am fixing right now, but that
>> translation error is stopping all effort.
> 
> The normal way to use the target code is through the block layer.

A block backstore in target indeed is great as any block device, not
just physical ones, can be exported. The problem is that for a zoned
block device the block layer API lacks functions and information to
fully emulate the corresponding ZBC SCSI device in target: no
open/close/finish zone functions (no REQ_OP_xx for that), and the
information from the zoned block device characteristics VPD page 0xB6 is
not available.

So I am limiting support in target to the passthrough SCSI (pscsi)
backstore type (and will also add a user emulation backstore). Fixes
needed in pscsi are rather trivial, but the block layer is not involved
at the highest level as scsi commands are simply passed along through
requests. Hence the ZBC/ZAC translation problem showing up.

I could work on completing support for ZBC at the block layer API:
1) Add REQ_OP_ZONE_OPEN/CLOSE/FINISH, with corresponding
blkdev_issue_xxx functions
2) Add more queue attributes and corresponding files in sysfs for the
zoned block device characteristics
3) Fix sd.c to support the new REQ_OP_ZONE_xxx
4) Add corresponding user ioctls for OPEN/CLOSE/FINISH

That's a lot of work for mostly just being used in the target code as
none of the other areas in the kernel supporting zoned block devices
don't really need these features (dm, f2fs, and on-going btrfs). But if
you think that is OK to add all this, I will. And fix target ZBC support
using that.

Best regards.

-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-08  0:18       ` Damien Le Moal
@ 2017-06-08  6:55         ` Christoph Hellwig
  2017-06-08  8:27           ` Damien Le Moal
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2017-06-08  6:55 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Christoph Hellwig, linux-ide, Hannes Reinecke, Tejun Heo,
	Jens Axboe, Martin K. Petersen

On Thu, Jun 08, 2017 at 09:18:39AM +0900, Damien Le Moal wrote:
> So I am limiting support in target to the passthrough SCSI (pscsi)
> backstore type (and will also add a user emulation backstore). Fixes
> needed in pscsi are rather trivial, but the block layer is not involved
> at the highest level as scsi commands are simply passed along through
> requests. Hence the ZBC/ZAC translation problem showing up.

Bah, don't do that.  Passthrough backends are a nightmare in so many
ways and I really prefer to not see them spread.

> 
> I could work on completing support for ZBC at the block layer API:
> 1) Add REQ_OP_ZONE_OPEN/CLOSE/FINISH, with corresponding
> blkdev_issue_xxx functions

Or just emulate them.

> 2) Add more queue attributes and corresponding files in sysfs for the
> zoned block device characteristics
> 3) Fix sd.c to support the new REQ_OP_ZONE_xxx
> 4) Add corresponding user ioctls for OPEN/CLOSE/FINISH

What is the practical use cae for that?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ZBC_IN command translation
  2017-06-08  6:55         ` Christoph Hellwig
@ 2017-06-08  8:27           ` Damien Le Moal
  0 siblings, 0 replies; 7+ messages in thread
From: Damien Le Moal @ 2017-06-08  8:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-ide, Hannes Reinecke, Tejun Heo, Jens Axboe,
	Martin K. Petersen



On 6/8/17 15:55, Christoph Hellwig wrote:
> On Thu, Jun 08, 2017 at 09:18:39AM +0900, Damien Le Moal wrote:
>> So I am limiting support in target to the passthrough SCSI (pscsi)
>> backstore type (and will also add a user emulation backstore). Fixes
>> needed in pscsi are rather trivial, but the block layer is not involved
>> at the highest level as scsi commands are simply passed along through
>> requests. Hence the ZBC/ZAC translation problem showing up.
> 
> Bah, don't do that.  Passthrough backends are a nightmare in so many
> ways and I really prefer to not see them spread.

I found out this morning how hard it is... And that was just with report
zones. Sending a large report zone for instance triggers warnings all
over the place, including block and scsi layer. Handling of incoming
requests sgl seem to be very deficient.

>> I could work on completing support for ZBC at the block layer API:
>> 1) Add REQ_OP_ZONE_OPEN/CLOSE/FINISH, with corresponding
>> blkdev_issue_xxx functions
> 
> Or just emulate them.

That would mean maintaining the condition of all zones in memory to
override the condition reported by report zone. And doing so without
knowing the actual disk limit on the maximum number of open zones.

Doable I guess. Need to think more about it. Also wondering if the
emulated zone condition can be volatile...

>> 2) Add more queue attributes and corresponding files in sysfs for the
>> zoned block device characteristics
>> 3) Fix sd.c to support the new REQ_OP_ZONE_xxx
>> 4) Add corresponding user ioctls for OPEN/CLOSE/FINISH
> 
> What is the practical use cae for that?

The only one that I can see would be to simplify the above mentioned
emulation. With this support in place, no zone condition emulation is
necessary as zones can be fully managed. The emulation is reduced to
conversion from SCSI command to block layer calls.

Apart from that, I do not see any compelling use case for
open/close/finish. f2fs does not use it, on-going btrfs work does not
need it either, same for the device mapper support.

So that would be a lot of work for just the target block backstore.

-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-06-08  8:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-07  9:03 ZBC_IN command translation Damien Le Moal
2017-06-07  9:07 ` Christoph Hellwig
2017-06-07  9:11   ` Damien Le Moal
2017-06-07 13:24     ` Christoph Hellwig
2017-06-08  0:18       ` Damien Le Moal
2017-06-08  6:55         ` Christoph Hellwig
2017-06-08  8:27           ` Damien Le Moal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox