Linux ATA/IDE development
 help / color / mirror / Atom feed
* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Shaun Tancheff @ 2016-08-22 22:00 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSEnMS+wYWiwLxoYsTDOoVb9qJ=gZbcQmhsZumiO+wbQi4A@mail.gmail.com>

On Mon, Aug 22, 2016 at 4:20 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 04:23, Shaun Tancheff <shaun@tancheff.com> wrote:
>> Safely overwriting the attached page to ATA format from the SCSI formatted
>> variant.
>>
>> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
>> ---
>> v6:
>>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
>> v5:
>>  - Added prep patch to work with non-page aligned scatterlist pages
>>    and use kmap_atomic() to lock page during modification.
>>
>>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>>  include/linux/ata.h       | 26 ----------------------
>>  2 files changed, 51 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index e207b33..7990cb2 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -3282,6 +3282,54 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>>         return 1;
>>  }
>>
>> +/**
>> + * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
>> + * @cmd: SCSI command being translated
>> + * @num: Maximum number of entries (nominally 64).
>> + * @sector: Starting sector
>> + * @count: Total Range of request
>> + *
>> + * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
>> + * descriptor.
>> + *
>> + * Upto 64 entries of the format:
>> + *   63:48 Range Length
>> + *   47:0  LBA
>> + *
>> + *  Range Length of 0 is ignored.
>> + *  LBA's should be sorted order and not overlap.
>> + *
>> + * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
>> + */
>> +static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
>> +                                             u64 sector, u32 count)
>> +{
>> +       __le64 *buffer;
>> +       u32 i = 0, used_bytes;
>> +       unsigned long flags;
>> +
>> +       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
>
> Why do we want to check "512" (a literal number) against
> ATA_SCSI_RBUF_SIZE here?

Because this is re-using the response buffer in a new way.
Such reuse could be a surprise to someone refactoring that
code, although it's pretty unlikely. The build bug on
gives some level of confidence that it won't go unnoticed.
It also codifies the assumption that we can write 512 bytes
to the global ata_scsi_rbuf buffer.

As to using a literal 512, I was just following what was here
before.

It looks like there is a ATA_SECT_SIZE that can replace most
of the literal 512's in here though. That might be a nice cleanup
overall. Not sure it belongs here though.

>> +
>> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
>> +       buffer = ((void *)ata_scsi_rbuf);
>> +       while (i < num) {
>> +               u64 entry = sector |
>> +                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> +               buffer[i++] = __cpu_to_le64(entry);
>> +               if (count <= 0xffff)
>> +                       break;
>> +               count -= 0xffff;
>> +               sector += 0xffff;
>> +       }
>> +
>> +       used_bytes = ALIGN(i * 8, 512);
>> +       memset(buffer + i, 0, used_bytes - i * 8);
>> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
>> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>> +
>> +       return used_bytes;
>> +}
>> +
>>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>  {
>>         struct ata_taskfile *tf = &qc->tf;
>> @@ -3290,8 +3338,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         const u8 *cdb = scmd->cmnd;
>>         u64 block;
>>         u32 n_block;
>> +       const u32 trmax = ATA_MAX_TRIM_RNUM;
>>         u32 size;
>> -       void *buf;
>>         u16 fp;
>>         u8 bp = 0xff;
>>
>> @@ -3319,10 +3367,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         if (!scsi_sg_count(scmd))
>>                 goto invalid_param_len;
>>
>> -       buf = page_address(sg_page(scsi_sglist(scmd)));
>> -
>> -       if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
>> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
>> +       if (n_block <= 0xffff * trmax) {
>> +               size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>>         } else {
>>                 fp = 2;
>>                 goto invalid_fld;
>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>> index adbc812..45a1d71 100644
>> --- a/include/linux/ata.h
>> +++ b/include/linux/ata.h
>> @@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>>  #endif
>>  }
>>
>> -/*
>> - * Write LBA Range Entries to the buffer that will cover the extent from
>> - * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
>> - * TO NV CACHE PINNED SET.
>> - */
>> -static inline unsigned ata_set_lba_range_entries(void *_buffer,
>> -               unsigned num, u64 sector, unsigned long count)
>> -{
>> -       __le64 *buffer = _buffer;
>> -       unsigned i = 0, used_bytes;
>> -
>> -       while (i < num) {
>> -               u64 entry = sector |
>> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> -               buffer[i++] = __cpu_to_le64(entry);
>> -               if (count <= 0xffff)
>> -                       break;
>> -               count -= 0xffff;
>> -               sector += 0xffff;
>> -       }
>> -
>> -       used_bytes = ALIGN(i * 8, 512);
>> -       memset(buffer + i, 0, used_bytes - i * 8);
>> -       return used_bytes;
>> -}
>> -
>>  static inline bool ata_ok(u8 status)
>>  {
>>         return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
>> --
>> 2.9.3
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Tom Yan @ 2016-08-22 21:20 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-2-shaun@tancheff.com>

On 22 August 2016 at 04:23, Shaun Tancheff <shaun@tancheff.com> wrote:
> Safely overwriting the attached page to ATA format from the SCSI formatted
> variant.
>
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6:
>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
> v5:
>  - Added prep patch to work with non-page aligned scatterlist pages
>    and use kmap_atomic() to lock page during modification.
>
>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/ata.h       | 26 ----------------------
>  2 files changed, 51 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index e207b33..7990cb2 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3282,6 +3282,54 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>         return 1;
>  }
>
> +/**
> + * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
> + * @cmd: SCSI command being translated
> + * @num: Maximum number of entries (nominally 64).
> + * @sector: Starting sector
> + * @count: Total Range of request
> + *
> + * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
> + * descriptor.
> + *
> + * Upto 64 entries of the format:
> + *   63:48 Range Length
> + *   47:0  LBA
> + *
> + *  Range Length of 0 is ignored.
> + *  LBA's should be sorted order and not overlap.
> + *
> + * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
> + */
> +static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
> +                                             u64 sector, u32 count)
> +{
> +       __le64 *buffer;
> +       u32 i = 0, used_bytes;
> +       unsigned long flags;
> +
> +       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);

Why do we want to check "512" (a literal number) against
ATA_SCSI_RBUF_SIZE here?

> +
> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> +       buffer = ((void *)ata_scsi_rbuf);
> +       while (i < num) {
> +               u64 entry = sector |
> +                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> +               buffer[i++] = __cpu_to_le64(entry);
> +               if (count <= 0xffff)
> +                       break;
> +               count -= 0xffff;
> +               sector += 0xffff;
> +       }
> +
> +       used_bytes = ALIGN(i * 8, 512);
> +       memset(buffer + i, 0, used_bytes - i * 8);
> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
> +
> +       return used_bytes;
> +}
> +
>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  {
>         struct ata_taskfile *tf = &qc->tf;
> @@ -3290,8 +3338,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         const u8 *cdb = scmd->cmnd;
>         u64 block;
>         u32 n_block;
> +       const u32 trmax = ATA_MAX_TRIM_RNUM;
>         u32 size;
> -       void *buf;
>         u16 fp;
>         u8 bp = 0xff;
>
> @@ -3319,10 +3367,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         if (!scsi_sg_count(scmd))
>                 goto invalid_param_len;
>
> -       buf = page_address(sg_page(scsi_sglist(scmd)));
> -
> -       if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
> +       if (n_block <= 0xffff * trmax) {
> +               size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>         } else {
>                 fp = 2;
>                 goto invalid_fld;
> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index adbc812..45a1d71 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>  #endif
>  }
>
> -/*
> - * Write LBA Range Entries to the buffer that will cover the extent from
> - * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
> - * TO NV CACHE PINNED SET.
> - */
> -static inline unsigned ata_set_lba_range_entries(void *_buffer,
> -               unsigned num, u64 sector, unsigned long count)
> -{
> -       __le64 *buffer = _buffer;
> -       unsigned i = 0, used_bytes;
> -
> -       while (i < num) {
> -               u64 entry = sector |
> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> -               buffer[i++] = __cpu_to_le64(entry);
> -               if (count <= 0xffff)
> -                       break;
> -               count -= 0xffff;
> -               sector += 0xffff;
> -       }
> -
> -       used_bytes = ALIGN(i * 8, 512);
> -       memset(buffer + i, 0, used_bytes - i * 8);
> -       return used_bytes;
> -}
> -
>  static inline bool ata_ok(u8 status)
>  {
>         return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
> --
> 2.9.3
>

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Tom Yan @ 2016-08-22 20:57 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSEkOm3nmmeHMQ5Jok2v2Up3ygNE5AE0sjkFTY26_4PTKvg@mail.gmail.com>

On 22 August 2016 at 18:52, Tom Yan <tom.ty89@gmail.com> wrote:
>
> For the "payload block size" that is "always" 512-byte as per the same
> spec, I don't think we need to concern about it. I think it only
> matters if we want to enable multi-block TRIM payload according to the
> reported limit in IDENTIFY DEVICE data. Probably it merely means that
> the "each of the blocks" in the reported limit will always mean 64
> TRIM entries, even on 4Kn drives, instead of 512.
>

Actually that's what we assumed in our current code that forms the DSM
command as well. See how (hob_)feature for queued TRIM and how
(hob_)nsect for non-queued TRIM are set in ata_scsi_write_same_xlat in
libata-scsi ('size / 512').

So if you are going to increase the payload block size per logical
sector size (in other word, you assume that the payload block size is
_expected_ to be the logical block size, in spite of what ACS told,
you'll need to adjust how the aforementioned fields in the ATA
taskfile are set as well.

^ permalink raw reply

* Re: [PATCH] ata: do not hard code limit in ata_set_lba_range_entries()
From: Tom Yan @ 2016-08-22 20:53 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, Tejun Heo, Martin K . Petersen, linux-ide,
	linux-scsi
In-Reply-To: <CAJVOszC2sp0=GA02ZB00s46F84XfU7DvzNSravyhbjxY651ozQ@mail.gmail.com>

On 22 August 2016 at 20:32, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>> I don't see how that's possible. count / n_block will always be
>> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
>> that isn't even a "buffer limit" anyway. By SG_IO do you mean like
>> SCSI Write Same commands that issued with sg_write_same or so? If
>> that's the case, that's what exactly commit 5c79097a28c2
>> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
>> limit") is for.
>
> Ah, I see. You are guarding the only user of ata_set_lba_range_entries().

Yup. It is the only right thing to do anyway, that we leave the
function "open" and guard per context when we use it. Say if
ata_set_lba_range_entries() is gonna be a function that is shared by
others, it would only make this commit more important. As I said, we
did not guard it with a certain buffer limit, but merely redundantly
guard it with a ("humanized") limit that applies to TRIM only.

> Still if you are going to do that you have to alert any new user that they
> must have an appropriately sized buffer to be overwriting.
>
> Better to move it out of ata.h then the limit the scope of accidental
> misuse?

I am not sure if it is really necessary but that's fine to me. I see
that you have been doing it in your SCT Write Same patch series.
Probably I can/should just leave the move to you?

>
> Regards,
> Shaun

^ permalink raw reply

* Re: [PATCH] ata: do not hard code limit in ata_set_lba_range_entries()
From: Shaun Tancheff @ 2016-08-22 20:32 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, Tejun Heo, Martin K . Petersen, linux-ide,
	linux-scsi
In-Reply-To: <CAGnHSE=Laj5kpFNJ-EqdN-NTUQJyOmwYtjWM7iLqGxDRqMSoxA@mail.gmail.com>

On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> I don't see how that's possible. count / n_block will always be
> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
> that isn't even a "buffer limit" anyway. By SG_IO do you mean like
> SCSI Write Same commands that issued with sg_write_same or so? If
> that's the case, that's what exactly commit 5c79097a28c2
> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
> limit") is for.

Ah, I see. You are guarding the only user of ata_set_lba_range_entries().
Still if you are going to do that you have to alert any new user that they
must have an appropriately sized buffer to be overwriting.

Better to move it out of ata.h then the limit the scope of accidental
misuse?

Regards,
Shaun

^ permalink raw reply

* Re: [PATCH v6 2/4] Add support for SCT Write Same
From: Tom Yan @ 2016-08-22 20:14 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAJVOszBPL4LM8MYMyrmnfFBqqdV6E3OCpShn12KKFv8E+tSL8Q@mail.gmail.com>

On 23 August 2016 at 03:43, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>>> +       if (unmap) {
>>> +               /* If trim is not enabled the cmd is invalid. */
>>> +               if ((dev->horkage & ATA_HORKAGE_NOTRIM) ||
>>> +                   !ata_id_has_trim(dev->id)) {
>>> +                       fp = 1;
>>> +                       bp = 3;
>>> +                       goto invalid_fld;
>>> +               }
>>> +               /* If the request is too large the cmd is invalid */
>>> +               if (n_block > 0xffff * trmax) {
>>> +                       fp = 2;
>>> +                       goto invalid_fld;
>>> +               }
>>
>> This response should be generally applied to the Write Same (16)
>> translation, since it is required by SBC,
>>
>>> +       } else {
>>> +               /* If write same is not available the cmd is invalid */
>>> +               if (!ata_id_sct_write_same(dev->id)) {
>>> +                       fp = 1;
>>> +                       bp = 3;
>>> +                       goto invalid_fld;
>>> +               }
>>
>> therefore, you should add an n_block check here as well, if you are
>> going to advertise an Maximum Write Same Length even when the device
>> supports only SCT Write Same but not TRIM. Most likely you would want
>> to simply move the existing check one-level up (if the same limit is
>> advertised no matter TRIM is supported not or not).
>
> Why would we enforce upper level limits on something that doesn't
> have any?

If we advertise a limit in our SATL, it makes sense that we should
make sure the behaviour is consistent when we issue a write same
through the block layer / ioctl and when we issue a SCSI Write Same
command directly (e.g. with sg_write_same). IMHO that's pretty much
why SBC would mandate such behaviour as well.

>
> If the upper level, or SG_IO, chooses to set a timeout of 10 hours and
> wipe a whole disk it should be free to do so.
>

That's why I said, "if you are going to advertise an Maximum Write Same Length".

^ permalink raw reply

* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Tom Yan @ 2016-08-22 20:12 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAJVOszCXdZVGwCcOv+E-NT8bANsvPEUHH+UUe=eFza0WhfEpNw@mail.gmail.com>

Oh, 0xffff is fine for me, as long as you change it in
ata_scsiop_inq_b0 as well. Actually I think replacing it with a macro
(as suggested by Martin in another thread) is a good idea as well. A
line split would be better than introducing an unnecessary variable
that costs readability in the logic sense.

On 23 August 2016 at 03:51, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Mon, Aug 22, 2016 at 2:27 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 22 August 2016 at 12:23, Shaun Tancheff <shaun@tancheff.com> wrote:
>>> Safely overwriting the attached page to ATA format from the SCSI formatted
>>> variant.
>>>
>>> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
>>> ---
>>> v6:
>>>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>>>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
>>> v5:
>>>  - Added prep patch to work with non-page aligned scatterlist pages
>>>    and use kmap_atomic() to lock page during modification.
>>>
>>>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>>>  include/linux/ata.h       | 26 ----------------------
>>>  2 files changed, 51 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>> index e207b33..7990cb2 100644
>>> --- a/drivers/ata/libata-scsi.c
>>> +++ b/drivers/ata/libata-scsi.c
>>> @@ -3282,6 +3282,54 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>>>         return 1;
>>>  }
>>>
>>> +/**
>>> + * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
>>> + * @cmd: SCSI command being translated
>>> + * @num: Maximum number of entries (nominally 64).
>>> + * @sector: Starting sector
>>> + * @count: Total Range of request
>>> + *
>>> + * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
>>> + * descriptor.
>>> + *
>>> + * Upto 64 entries of the format:
>>> + *   63:48 Range Length
>>> + *   47:0  LBA
>>> + *
>>> + *  Range Length of 0 is ignored.
>>> + *  LBA's should be sorted order and not overlap.
>>> + *
>>> + * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
>>> + */
>>> +static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
>>> +                                             u64 sector, u32 count)
>>> +{
>>> +       __le64 *buffer;
>>> +       u32 i = 0, used_bytes;
>>> +       unsigned long flags;
>>> +
>>> +       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
>>> +
>>> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
>>> +       buffer = ((void *)ata_scsi_rbuf);
>>> +       while (i < num) {
>>> +               u64 entry = sector |
>>> +                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>>> +               buffer[i++] = __cpu_to_le64(entry);
>>> +               if (count <= 0xffff)
>>> +                       break;
>>> +               count -= 0xffff;
>>> +               sector += 0xffff;
>>> +       }
>>> +
>>> +       used_bytes = ALIGN(i * 8, 512);
>>> +       memset(buffer + i, 0, used_bytes - i * 8);
>>> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
>>> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>>> +
>>> +       return used_bytes;
>>> +}
>>> +
>>>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>>  {
>>>         struct ata_taskfile *tf = &qc->tf;
>>> @@ -3290,8 +3338,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>>         const u8 *cdb = scmd->cmnd;
>>>         u64 block;
>>>         u32 n_block;
>>> +       const u32 trmax = ATA_MAX_TRIM_RNUM;
>>
>> This does not seem like a good idea.
>>
>>>         u32 size;
>>> -       void *buf;
>>>         u16 fp;
>>>         u8 bp = 0xff;
>>>
>>> @@ -3319,10 +3367,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>>         if (!scsi_sg_count(scmd))
>>>                 goto invalid_param_len;
>>>
>>> -       buf = page_address(sg_page(scsi_sglist(scmd)));
>>> -
>>> -       if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
>>> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
>>> +       if (n_block <= 0xffff * trmax) {
>>
>> Note that the limit here would always be the advertised Maximum Write
>> Same Length (ata_scsiop_inq_b0). It would be best for them to look the
>> same. Besides, it doesn't seem necessary to create this trmax anyway.
>
> It is entirely a style thing. I tend to prefer hex when describing an interface
> that specifies number of bits. The trmax is just to keep the following line
> under 80 chars.
> I am not tied to either. I can change it if people really find it confusing.
>
>>> +               size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>>>         } else {
>>>                 fp = 2;
>>>                 goto invalid_fld;
>>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>>> index adbc812..45a1d71 100644
>>> --- a/include/linux/ata.h
>>> +++ b/include/linux/ata.h
>>> @@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>>>  #endif
>>>  }
>>>
>>> -/*
>>> - * Write LBA Range Entries to the buffer that will cover the extent from
>>> - * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
>>> - * TO NV CACHE PINNED SET.
>>> - */
>>> -static inline unsigned ata_set_lba_range_entries(void *_buffer,
>>> -               unsigned num, u64 sector, unsigned long count)
>>> -{
>>> -       __le64 *buffer = _buffer;
>>> -       unsigned i = 0, used_bytes;
>>> -
>>> -       while (i < num) {
>>> -               u64 entry = sector |
>>> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>>> -               buffer[i++] = __cpu_to_le64(entry);
>>> -               if (count <= 0xffff)
>>> -                       break;
>>> -               count -= 0xffff;
>>> -               sector += 0xffff;
>>> -       }
>>> -
>>> -       used_bytes = ALIGN(i * 8, 512);
>>> -       memset(buffer + i, 0, used_bytes - i * 8);
>>> -       return used_bytes;
>>> -}
>>> -
>>>  static inline bool ata_ok(u8 status)
>>>  {
>>>         return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
>>> --
>>> 2.9.3
>>>
>
>
>
> --
> Shaun Tancheff

^ permalink raw reply

* Re: [PATCH] ata: do not hard code limit in ata_set_lba_range_entries()
From: Tom Yan @ 2016-08-22 20:07 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Tejun Heo, Martin K . Petersen, Shaun Tancheff, linux-ide,
	linux-scsi
In-Reply-To: <CAJ48U8UsFFD=fb6WQEVAam9-9RajVjLNrgiKmXCXvkCFOg2JnQ@mail.gmail.com>

I don't see how that's possible. count / n_block will always be
smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
that isn't even a "buffer limit" anyway. By SG_IO do you mean like
SCSI Write Same commands that issued with sg_write_same or so? If
that's the case, that's what exactly commit 5c79097a28c2
("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
limit") is for.

On 23 August 2016 at 03:58, Shaun Tancheff <shaun@tancheff.com> wrote:
> On Mon, Aug 22, 2016 at 1:55 PM,  <tom.ty89@gmail.com> wrote:
>> From: Tom Yan <tom.ty89@gmail.com>
>>
>> In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
>> n_block that exceeds limit"), it is made sure that
>> ata_set_lba_range_entries() will never be called with a request
>> size (n_block) that is larger than the number of blocks that a
>> 512-byte block TRIM payload can describe (65535 * 64 = 4194240),
>> in addition to acknowlegding the SCSI/block layer with the same
>> limit by advertising it as the Maximum Write Same Length.
>>
>> Therefore, it is unnecessary to hard code the same limit in
>> ata_set_lba_range_entries() itself, which would only cost extra
>> maintenance effort. Such effort can be noticed in, for example,
>> commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
>> number of TRIM ranges").
>>
>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index be9c76c..9b74ecb 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         buf = page_address(sg_page(scsi_sglist(scmd)));
>>
>>         if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
>> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
>> +               size = ata_set_lba_range_entries(buf, block, n_block);
>>         } else {
>>                 fp = 2;
>>                 goto invalid_fld;
>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>> index adbc812..5e2e9ad 100644
>> --- a/include/linux/ata.h
>> +++ b/include/linux/ata.h
>> @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>>   * TO NV CACHE PINNED SET.
>>   */
>>  static inline unsigned ata_set_lba_range_entries(void *_buffer,
>> -               unsigned num, u64 sector, unsigned long count)
>> +               u64 sector, unsigned long count)
>>  {
>>         __le64 *buffer = _buffer;
>>         unsigned i = 0, used_bytes;
>>
>> -       while (i < num) {
>> -               u64 entry = sector |
>> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> +       while (count > 0) {
>> +               u64 range, entry;
>> +
>> +               range = count > 0xffff ? 0xffff : count;
>> +               entry = sector | (range << 48);
>>                 buffer[i++] = __cpu_to_le64(entry);
>> -               if (count <= 0xffff)
>> -                       break;
>> -               count -= 0xffff;
>> -               sector += 0xffff;
>> +               count -= range;
>> +               sector += range;
>>         }
>
> I think the problem here is that I can now inject a buffer overflow
> via SG_IO.
>
>>         used_bytes = ALIGN(i * 8, 512);
>> --
>> 2.9.3
>>

^ permalink raw reply

* Re: [PATCH] ata: do not hard code limit in ata_set_lba_range_entries()
From: Shaun Tancheff @ 2016-08-22 19:58 UTC (permalink / raw)
  To: Tom Yan
  Cc: Tejun Heo, Martin K . Petersen, Shaun Tancheff, linux-ide,
	linux-scsi
In-Reply-To: <e35afc461c9a79f568f2bcd4ef2cd1e07fb50082.1471890326.git.tom.ty89@gmail.com>

On Mon, Aug 22, 2016 at 1:55 PM,  <tom.ty89@gmail.com> wrote:
> From: Tom Yan <tom.ty89@gmail.com>
>
> In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
> n_block that exceeds limit"), it is made sure that
> ata_set_lba_range_entries() will never be called with a request
> size (n_block) that is larger than the number of blocks that a
> 512-byte block TRIM payload can describe (65535 * 64 = 4194240),
> in addition to acknowlegding the SCSI/block layer with the same
> limit by advertising it as the Maximum Write Same Length.
>
> Therefore, it is unnecessary to hard code the same limit in
> ata_set_lba_range_entries() itself, which would only cost extra
> maintenance effort. Such effort can be noticed in, for example,
> commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
> number of TRIM ranges").
>
> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index be9c76c..9b74ecb 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         buf = page_address(sg_page(scsi_sglist(scmd)));
>
>         if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
> +               size = ata_set_lba_range_entries(buf, block, n_block);
>         } else {
>                 fp = 2;
>                 goto invalid_fld;
> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index adbc812..5e2e9ad 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>   * TO NV CACHE PINNED SET.
>   */
>  static inline unsigned ata_set_lba_range_entries(void *_buffer,
> -               unsigned num, u64 sector, unsigned long count)
> +               u64 sector, unsigned long count)
>  {
>         __le64 *buffer = _buffer;
>         unsigned i = 0, used_bytes;
>
> -       while (i < num) {
> -               u64 entry = sector |
> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> +       while (count > 0) {
> +               u64 range, entry;
> +
> +               range = count > 0xffff ? 0xffff : count;
> +               entry = sector | (range << 48);
>                 buffer[i++] = __cpu_to_le64(entry);
> -               if (count <= 0xffff)
> -                       break;
> -               count -= 0xffff;
> -               sector += 0xffff;
> +               count -= range;
> +               sector += range;
>         }

I think the problem here is that I can now inject a buffer overflow
via SG_IO.

>         used_bytes = ALIGN(i * 8, 512);
> --
> 2.9.3
>

^ permalink raw reply

* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Shaun Tancheff @ 2016-08-22 19:51 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSEkKcp-mr0LDfqPyUbd=f8RygsGWD756CT7u5Vycs-cPkg@mail.gmail.com>

On Mon, Aug 22, 2016 at 2:27 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 12:23, Shaun Tancheff <shaun@tancheff.com> wrote:
>> Safely overwriting the attached page to ATA format from the SCSI formatted
>> variant.
>>
>> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
>> ---
>> v6:
>>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
>> v5:
>>  - Added prep patch to work with non-page aligned scatterlist pages
>>    and use kmap_atomic() to lock page during modification.
>>
>>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>>  include/linux/ata.h       | 26 ----------------------
>>  2 files changed, 51 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index e207b33..7990cb2 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -3282,6 +3282,54 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>>         return 1;
>>  }
>>
>> +/**
>> + * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
>> + * @cmd: SCSI command being translated
>> + * @num: Maximum number of entries (nominally 64).
>> + * @sector: Starting sector
>> + * @count: Total Range of request
>> + *
>> + * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
>> + * descriptor.
>> + *
>> + * Upto 64 entries of the format:
>> + *   63:48 Range Length
>> + *   47:0  LBA
>> + *
>> + *  Range Length of 0 is ignored.
>> + *  LBA's should be sorted order and not overlap.
>> + *
>> + * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
>> + */
>> +static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
>> +                                             u64 sector, u32 count)
>> +{
>> +       __le64 *buffer;
>> +       u32 i = 0, used_bytes;
>> +       unsigned long flags;
>> +
>> +       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
>> +
>> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
>> +       buffer = ((void *)ata_scsi_rbuf);
>> +       while (i < num) {
>> +               u64 entry = sector |
>> +                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> +               buffer[i++] = __cpu_to_le64(entry);
>> +               if (count <= 0xffff)
>> +                       break;
>> +               count -= 0xffff;
>> +               sector += 0xffff;
>> +       }
>> +
>> +       used_bytes = ALIGN(i * 8, 512);
>> +       memset(buffer + i, 0, used_bytes - i * 8);
>> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
>> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>> +
>> +       return used_bytes;
>> +}
>> +
>>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>  {
>>         struct ata_taskfile *tf = &qc->tf;
>> @@ -3290,8 +3338,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         const u8 *cdb = scmd->cmnd;
>>         u64 block;
>>         u32 n_block;
>> +       const u32 trmax = ATA_MAX_TRIM_RNUM;
>
> This does not seem like a good idea.
>
>>         u32 size;
>> -       void *buf;
>>         u16 fp;
>>         u8 bp = 0xff;
>>
>> @@ -3319,10 +3367,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         if (!scsi_sg_count(scmd))
>>                 goto invalid_param_len;
>>
>> -       buf = page_address(sg_page(scsi_sglist(scmd)));
>> -
>> -       if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
>> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
>> +       if (n_block <= 0xffff * trmax) {
>
> Note that the limit here would always be the advertised Maximum Write
> Same Length (ata_scsiop_inq_b0). It would be best for them to look the
> same. Besides, it doesn't seem necessary to create this trmax anyway.

It is entirely a style thing. I tend to prefer hex when describing an interface
that specifies number of bits. The trmax is just to keep the following line
under 80 chars.
I am not tied to either. I can change it if people really find it confusing.

>> +               size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>>         } else {
>>                 fp = 2;
>>                 goto invalid_fld;
>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>> index adbc812..45a1d71 100644
>> --- a/include/linux/ata.h
>> +++ b/include/linux/ata.h
>> @@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>>  #endif
>>  }
>>
>> -/*
>> - * Write LBA Range Entries to the buffer that will cover the extent from
>> - * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
>> - * TO NV CACHE PINNED SET.
>> - */
>> -static inline unsigned ata_set_lba_range_entries(void *_buffer,
>> -               unsigned num, u64 sector, unsigned long count)
>> -{
>> -       __le64 *buffer = _buffer;
>> -       unsigned i = 0, used_bytes;
>> -
>> -       while (i < num) {
>> -               u64 entry = sector |
>> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> -               buffer[i++] = __cpu_to_le64(entry);
>> -               if (count <= 0xffff)
>> -                       break;
>> -               count -= 0xffff;
>> -               sector += 0xffff;
>> -       }
>> -
>> -       used_bytes = ALIGN(i * 8, 512);
>> -       memset(buffer + i, 0, used_bytes - i * 8);
>> -       return used_bytes;
>> -}
>> -
>>  static inline bool ata_ok(u8 status)
>>  {
>>         return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
>> --
>> 2.9.3
>>



-- 
Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 2/4] Add support for SCT Write Same
From: Shaun Tancheff @ 2016-08-22 19:43 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSE=KijjA8ZBhNBQZH-5CeC1w4vu8wWob4Mm9tTKVNSL_GQ@mail.gmail.com>

On Mon, Aug 22, 2016 at 2:20 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 12:23, Shaun Tancheff <shaun@tancheff.com> wrote:
>> SATA drives may support write same via SCT. This is useful
>> for setting the drive contents to a specific pattern (0's).
>>
>> Translate a SCSI WRITE SAME 16 command to be either a DSM TRIM
>> command or an SCT Write Same command.
>>
>> Based on the UNMAP flag:
>>   - When set translate to DSM TRIM
>>   - When not set translate to SCT Write Same
>>
>> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
>> ---
>> v6:
>>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
>> v5:
>>  - Addressed review comments
>>  - Report support for ZBC only for zoned devices.
>>  - kmap page during rewrite
>>  - Fix unmap set to require trim or error, if not unmap then sct write
>>    same or error.
>> v4:
>>  - Added partial MAINTENANCE_IN opcode simulation
>>  - Dropped all changes in drivers/scsi/*
>>  - Changed to honor the UNMAP flag -> TRIM, no UNMAP -> SCT.
>> v3:
>>  - Demux UNMAP/TRIM from WRITE SAME
>> v2:
>>  - Remove fugly ata hacking from sd.c
>>
>>  drivers/ata/libata-scsi.c | 199 +++++++++++++++++++++++++++++++++++++++-------
>>  include/linux/ata.h       |  43 ++++++++++
>>  2 files changed, 213 insertions(+), 29 deletions(-)
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index 7990cb2..ebf1a04 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -1159,8 +1159,6 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
>>  {
>>         sdev->use_10_for_rw = 1;
>>         sdev->use_10_for_ms = 1;
>> -       sdev->no_report_opcodes = 1;
>> -       sdev->no_write_same = 1;
>>
>>         /* Schedule policy is determined by ->qc_defer() callback and
>>          * it needs to see every deferred qc.  Set dev_blocked to 1 to
>> @@ -3287,7 +3285,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>>   * @cmd: SCSI command being translated
>>   * @num: Maximum number of entries (nominally 64).
>>   * @sector: Starting sector
>> - * @count: Total Range of request
>> + * @count: Total Range of request in logical sectors
>>   *
>>   * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
>>   * descriptor.
>> @@ -3330,6 +3328,45 @@ static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
>>         return used_bytes;
>>  }
>>
>> +/**
>> + * ata_format_dsm_trim_descr() - SATL Write Same to ATA SCT Write Same
>> + * @cmd: SCSI command being translated
>> + * @lba: Starting sector
>> + * @num: Number of logical sectors to be zero'd.
>> + *
>> + * Rewrite the WRITE SAME descriptor to be an SCT Write Same formatted
>> + * descriptor.
>> + * NOTE: Writes a pattern (0's) in the foreground.
>> + *       Large write-same requents can timeout.
>> + */
>> +static void ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
>> +{
>> +       u16 *sctpg;
>> +       unsigned long flags;
>> +
>> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
>> +       sctpg = ((void *)ata_scsi_rbuf);
>> +
>> +       put_unaligned_le16(0x0002,  &sctpg[0]); /* SCT_ACT_WRITE_SAME */
>> +       put_unaligned_le16(0x0101,  &sctpg[1]); /* WRITE PTRN FG */
>> +       put_unaligned_le64(lba,     &sctpg[2]);
>> +       put_unaligned_le64(num,     &sctpg[6]);
>> +       put_unaligned_le32(0u,      &sctpg[10]);
>> +
>> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), sctpg, 512);
>> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>> +}
>> +
>> +/**
>> + * ata_scsi_write_same_xlat() - SATL Write Same to ATA SCT Write Same
>> + * @qc: Command to be translated
>> + *
>> + * Translate a SCSI WRITE SAME command to be either a DSM TRIM command or
>> + * an SCT Write Same command.
>> + * Based on WRITE SAME has the UNMAP flag
>> + *   When set translate to DSM TRIM
>> + *   When clear translate to SCT Write Same
>> + */
>>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>  {
>>         struct ata_taskfile *tf = &qc->tf;
>> @@ -3342,6 +3379,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         u32 size;
>>         u16 fp;
>>         u8 bp = 0xff;
>> +       u8 unmap = cdb[1] & 0x8;
>>
>>         /* we may not issue DMA commands if no DMA mode is set */
>>         if (unlikely(!dev->dma_mode))
>> @@ -3353,11 +3391,26 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         }
>>         scsi_16_lba_len(cdb, &block, &n_block);
>>
>> -       /* for now we only support WRITE SAME with the unmap bit set */
>> -       if (unlikely(!(cdb[1] & 0x8))) {
>> -               fp = 1;
>> -               bp = 3;
>> -               goto invalid_fld;
>> +       if (unmap) {
>> +               /* If trim is not enabled the cmd is invalid. */
>> +               if ((dev->horkage & ATA_HORKAGE_NOTRIM) ||
>> +                   !ata_id_has_trim(dev->id)) {
>> +                       fp = 1;
>> +                       bp = 3;
>> +                       goto invalid_fld;
>> +               }
>> +               /* If the request is too large the cmd is invalid */
>> +               if (n_block > 0xffff * trmax) {
>> +                       fp = 2;
>> +                       goto invalid_fld;
>> +               }
>
> This response should be generally applied to the Write Same (16)
> translation, since it is required by SBC,
>
>> +       } else {
>> +               /* If write same is not available the cmd is invalid */
>> +               if (!ata_id_sct_write_same(dev->id)) {
>> +                       fp = 1;
>> +                       bp = 3;
>> +                       goto invalid_fld;
>> +               }
>
> therefore, you should add an n_block check here as well, if you are
> going to advertise an Maximum Write Same Length even when the device
> supports only SCT Write Same but not TRIM. Most likely you would want
> to simply move the existing check one-level up (if the same limit is
> advertised no matter TRIM is supported not or not).

Why would we enforce upper level limits on something that doesn't
have any?

If the upper level, or SG_IO, chooses to set a timeout of 10 hours and
wipe a whole disk it should be free to do so.

>>         }
>>
>>         /*
>> @@ -3367,30 +3420,42 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         if (!scsi_sg_count(scmd))
>>                 goto invalid_param_len;
>>
>> -       if (n_block <= 0xffff * trmax) {
>> +       if (unmap) {
>>                 size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>> +               if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
>> +                       /* Newer devices support queued TRIM commands */
>> +                       tf->protocol = ATA_PROT_NCQ;
>> +                       tf->command = ATA_CMD_FPDMA_SEND;
>> +                       tf->hob_nsect = ATA_SUBCMD_FPDMA_SEND_DSM & 0x1f;
>> +                       tf->nsect = qc->tag << 3;
>> +                       tf->hob_feature = (size / 512) >> 8;
>> +                       tf->feature = size / 512;
>> +
>> +                       tf->auxiliary = 1;
>> +               } else {
>> +                       tf->protocol = ATA_PROT_DMA;
>> +                       tf->hob_feature = 0;
>> +                       tf->feature = ATA_DSM_TRIM;
>> +                       tf->hob_nsect = (size / 512) >> 8;
>> +                       tf->nsect = size / 512;
>> +                       tf->command = ATA_CMD_DSM;
>> +               }
>>         } else {
>> -               fp = 2;
>> -               goto invalid_fld;
>> -       }
>> -
>> -       if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
>> -               /* Newer devices support queued TRIM commands */
>> -               tf->protocol = ATA_PROT_NCQ;
>> -               tf->command = ATA_CMD_FPDMA_SEND;
>> -               tf->hob_nsect = ATA_SUBCMD_FPDMA_SEND_DSM & 0x1f;
>> -               tf->nsect = qc->tag << 3;
>> -               tf->hob_feature = (size / 512) >> 8;
>> -               tf->feature = size / 512;
>> +               ata_format_sct_write_same(scmd, block, n_block);
>>
>> -               tf->auxiliary = 1;
>> -       } else {
>> -               tf->protocol = ATA_PROT_DMA;
>>                 tf->hob_feature = 0;
>> -               tf->feature = ATA_DSM_TRIM;
>> -               tf->hob_nsect = (size / 512) >> 8;
>> -               tf->nsect = size / 512;
>> -               tf->command = ATA_CMD_DSM;
>> +               tf->feature = 0;
>> +               tf->hob_nsect = 0;
>> +               tf->nsect = 1;
>> +               tf->lbah = 0;
>> +               tf->lbam = 0;
>> +               tf->lbal = ATA_CMD_STANDBYNOW1;
>> +               tf->hob_lbah = 0;
>> +               tf->hob_lbam = 0;
>> +               tf->hob_lbal = 0;
>> +               tf->device = ATA_CMD_STANDBYNOW1;
>> +               tf->protocol = ATA_PROT_DMA;
>> +               tf->command = ATA_CMD_WRITE_LOG_DMA_EXT;
>>         }
>>
>>         tf->flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE | ATA_TFLAG_LBA48 |
>> @@ -3414,6 +3479,76 @@ invalid_opcode:
>>  }
>>
>>  /**
>> + *     ata_scsiop_maint_in - Simulate a subset of MAINTENANCE_IN
>> + *     @args: device MAINTENANCE_IN data / SCSI command of interest.
>> + *     @rbuf: Response buffer, to which simulated SCSI cmd output is sent.
>> + *
>> + *     Yields a subset to satisfy scsi_report_opcode()
>> + *
>> + *     LOCKING:
>> + *     spin_lock_irqsave(host lock)
>> + */
>> +static unsigned int ata_scsiop_maint_in(struct ata_scsi_args *args, u8 *rbuf)
>> +{
>> +       struct ata_device *dev = args->dev;
>> +       u8 *cdb = args->cmd->cmnd;
>> +       u8 supported = 0;
>> +       unsigned int err = 0;
>> +
>> +       if (cdb[2] != 1) {
>> +               ata_dev_warn(dev, "invalid command format %d\n", cdb[2]);
>> +               err = 2;
>> +               goto out;
>> +       }
>> +       switch (cdb[3]) {
>> +       case INQUIRY:
>> +       case MODE_SENSE:
>> +       case MODE_SENSE_10:
>> +       case READ_CAPACITY:
>> +       case SERVICE_ACTION_IN_16:
>> +       case REPORT_LUNS:
>> +       case REQUEST_SENSE:
>> +       case SYNCHRONIZE_CACHE:
>> +       case REZERO_UNIT:
>> +       case SEEK_6:
>> +       case SEEK_10:
>> +       case TEST_UNIT_READY:
>> +       case SEND_DIAGNOSTIC:
>> +       case MAINTENANCE_IN:
>> +       case READ_6:
>> +       case READ_10:
>> +       case READ_16:
>> +       case WRITE_6:
>> +       case WRITE_10:
>> +       case WRITE_16:
>> +       case ATA_12:
>> +       case ATA_16:
>> +       case VERIFY:
>> +       case VERIFY_16:
>> +       case MODE_SELECT:
>> +       case MODE_SELECT_10:
>> +       case START_STOP:
>> +               supported = 3;
>> +               break;
>> +       case WRITE_SAME_16:
>> +               if (ata_id_sct_write_same(dev->id))
>> +                       supported = 3;
>> +               break;
>> +       case ZBC_IN:
>> +       case ZBC_OUT:
>> +               if (ata_id_zoned_cap(dev->id) ||
>> +                   dev->class == ATA_DEV_ZAC)
>> +                       supported = 3;
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +out:
>> +       rbuf[1] = supported; /* supported */
>> +       return err;
>> +}
>> +
>> +/**
>>   *     ata_scsi_report_zones_complete - convert ATA output
>>   *     @qc: command structure returning the data
>>   *
>> @@ -4193,6 +4328,13 @@ void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd)
>>                         ata_scsi_invalid_field(dev, cmd, 1);
>>                 break;
>>
>> +       case MAINTENANCE_IN:
>> +               if (scsicmd[1] == MI_REPORT_SUPPORTED_OPERATION_CODES)
>> +                       ata_scsi_rbuf_fill(&args, ata_scsiop_maint_in);
>> +               else
>> +                       ata_scsi_invalid_field(dev, cmd, 1);
>> +               break;
>> +
>>         /* all other commands */
>>         default:
>>                 ata_scsi_set_sense(dev, cmd, ILLEGAL_REQUEST, 0x20, 0x0);
>> @@ -4225,7 +4367,6 @@ int ata_scsi_add_hosts(struct ata_host *host, struct scsi_host_template *sht)
>>                 shost->max_lun = 1;
>>                 shost->max_channel = 1;
>>                 shost->max_cmd_len = 16;
>> -               shost->no_write_same = 1;
>>
>>                 /* Schedule policy is determined by ->qc_defer()
>>                  * callback and it needs to see every deferred qc.
>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>> index 45a1d71..fdb1803 100644
>> --- a/include/linux/ata.h
>> +++ b/include/linux/ata.h
>> @@ -105,6 +105,7 @@ enum {
>>         ATA_ID_CFA_KEY_MGMT     = 162,
>>         ATA_ID_CFA_MODES        = 163,
>>         ATA_ID_DATA_SET_MGMT    = 169,
>> +       ATA_ID_SCT_CMD_XPORT    = 206,
>>         ATA_ID_ROT_SPEED        = 217,
>>         ATA_ID_PIO4             = (1 << 1),
>>
>> @@ -789,6 +790,48 @@ static inline bool ata_id_sense_reporting_enabled(const u16 *id)
>>  }
>>
>>  /**
>> + *
>> + * Word: 206 - SCT Command Transport
>> + *    15:12 - Vendor Specific
>> + *     11:6 - Reserved
>> + *        5 - SCT Command Transport Data Tables supported
>> + *        4 - SCT Command Transport Features Control supported
>> + *        3 - SCT Command Transport Error Recovery Control supported
>> + *        2 - SCT Command Transport Write Same supported
>> + *        1 - SCT Command Transport Long Sector Access supported
>> + *        0 - SCT Command Transport supported
>> + */
>> +static inline bool ata_id_sct_data_tables(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 5) ? true : false;
>> +}
>> +
>> +static inline bool ata_id_sct_features_ctrl(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 4) ? true : false;
>> +}
>> +
>> +static inline bool ata_id_sct_error_recovery_ctrl(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 3) ? true : false;
>> +}
>> +
>> +static inline bool ata_id_sct_write_same(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 2) ? true : false;
>> +}
>> +
>> +static inline bool ata_id_sct_long_sector_access(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 1) ? true : false;
>> +}
>> +
>> +static inline bool ata_id_sct_supported(const u16 *id)
>> +{
>> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 0) ? true : false;
>> +}
>> +
>> +/**
>>   *     ata_id_major_version    -       get ATA level of drive
>>   *     @id: Identify data
>>   *
>> --
>> 2.9.3
>>



-- 
Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Tom Yan @ 2016-08-22 19:27 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-2-shaun@tancheff.com>

On 22 August 2016 at 12:23, Shaun Tancheff <shaun@tancheff.com> wrote:
> Safely overwriting the attached page to ATA format from the SCSI formatted
> variant.
>
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6:
>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
> v5:
>  - Added prep patch to work with non-page aligned scatterlist pages
>    and use kmap_atomic() to lock page during modification.
>
>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/ata.h       | 26 ----------------------
>  2 files changed, 51 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index e207b33..7990cb2 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3282,6 +3282,54 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>         return 1;
>  }
>
> +/**
> + * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
> + * @cmd: SCSI command being translated
> + * @num: Maximum number of entries (nominally 64).
> + * @sector: Starting sector
> + * @count: Total Range of request
> + *
> + * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
> + * descriptor.
> + *
> + * Upto 64 entries of the format:
> + *   63:48 Range Length
> + *   47:0  LBA
> + *
> + *  Range Length of 0 is ignored.
> + *  LBA's should be sorted order and not overlap.
> + *
> + * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
> + */
> +static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
> +                                             u64 sector, u32 count)
> +{
> +       __le64 *buffer;
> +       u32 i = 0, used_bytes;
> +       unsigned long flags;
> +
> +       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
> +
> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> +       buffer = ((void *)ata_scsi_rbuf);
> +       while (i < num) {
> +               u64 entry = sector |
> +                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> +               buffer[i++] = __cpu_to_le64(entry);
> +               if (count <= 0xffff)
> +                       break;
> +               count -= 0xffff;
> +               sector += 0xffff;
> +       }
> +
> +       used_bytes = ALIGN(i * 8, 512);
> +       memset(buffer + i, 0, used_bytes - i * 8);
> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
> +
> +       return used_bytes;
> +}
> +
>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  {
>         struct ata_taskfile *tf = &qc->tf;
> @@ -3290,8 +3338,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         const u8 *cdb = scmd->cmnd;
>         u64 block;
>         u32 n_block;
> +       const u32 trmax = ATA_MAX_TRIM_RNUM;

This does not seem like a good idea.

>         u32 size;
> -       void *buf;
>         u16 fp;
>         u8 bp = 0xff;
>
> @@ -3319,10 +3367,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         if (!scsi_sg_count(scmd))
>                 goto invalid_param_len;
>
> -       buf = page_address(sg_page(scsi_sglist(scmd)));
> -
> -       if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
> +       if (n_block <= 0xffff * trmax) {

Note that the limit here would always be the advertised Maximum Write
Same Length (ata_scsiop_inq_b0). It would be best for them to look the
same. Besides, it doesn't seem necessary to create this trmax anyway.

> +               size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
>         } else {
>                 fp = 2;
>                 goto invalid_fld;
> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index adbc812..45a1d71 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -1071,32 +1071,6 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>  #endif
>  }
>
> -/*
> - * Write LBA Range Entries to the buffer that will cover the extent from
> - * sector to sector + count.  This is used for TRIM and for ADD LBA(S)
> - * TO NV CACHE PINNED SET.
> - */
> -static inline unsigned ata_set_lba_range_entries(void *_buffer,
> -               unsigned num, u64 sector, unsigned long count)
> -{
> -       __le64 *buffer = _buffer;
> -       unsigned i = 0, used_bytes;
> -
> -       while (i < num) {
> -               u64 entry = sector |
> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> -               buffer[i++] = __cpu_to_le64(entry);
> -               if (count <= 0xffff)
> -                       break;
> -               count -= 0xffff;
> -               sector += 0xffff;
> -       }
> -
> -       used_bytes = ALIGN(i * 8, 512);
> -       memset(buffer + i, 0, used_bytes - i * 8);
> -       return used_bytes;
> -}
> -
>  static inline bool ata_ok(u8 status)
>  {
>         return ((status & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ | ATA_ERR))
> --
> 2.9.3
>

^ permalink raw reply

* Re: [PATCH v6 2/4] Add support for SCT Write Same
From: Tom Yan @ 2016-08-22 19:20 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-3-shaun@tancheff.com>

On 22 August 2016 at 12:23, Shaun Tancheff <shaun@tancheff.com> wrote:
> SATA drives may support write same via SCT. This is useful
> for setting the drive contents to a specific pattern (0's).
>
> Translate a SCSI WRITE SAME 16 command to be either a DSM TRIM
> command or an SCT Write Same command.
>
> Based on the UNMAP flag:
>   - When set translate to DSM TRIM
>   - When not set translate to SCT Write Same
>
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6:
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
> v5:
>  - Addressed review comments
>  - Report support for ZBC only for zoned devices.
>  - kmap page during rewrite
>  - Fix unmap set to require trim or error, if not unmap then sct write
>    same or error.
> v4:
>  - Added partial MAINTENANCE_IN opcode simulation
>  - Dropped all changes in drivers/scsi/*
>  - Changed to honor the UNMAP flag -> TRIM, no UNMAP -> SCT.
> v3:
>  - Demux UNMAP/TRIM from WRITE SAME
> v2:
>  - Remove fugly ata hacking from sd.c
>
>  drivers/ata/libata-scsi.c | 199 +++++++++++++++++++++++++++++++++++++++-------
>  include/linux/ata.h       |  43 ++++++++++
>  2 files changed, 213 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 7990cb2..ebf1a04 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1159,8 +1159,6 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
>  {
>         sdev->use_10_for_rw = 1;
>         sdev->use_10_for_ms = 1;
> -       sdev->no_report_opcodes = 1;
> -       sdev->no_write_same = 1;
>
>         /* Schedule policy is determined by ->qc_defer() callback and
>          * it needs to see every deferred qc.  Set dev_blocked to 1 to
> @@ -3287,7 +3285,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>   * @cmd: SCSI command being translated
>   * @num: Maximum number of entries (nominally 64).
>   * @sector: Starting sector
> - * @count: Total Range of request
> + * @count: Total Range of request in logical sectors
>   *
>   * Rewrite the WRITE SAME descriptor to be a DSM TRIM little-endian formatted
>   * descriptor.
> @@ -3330,6 +3328,45 @@ static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
>         return used_bytes;
>  }
>
> +/**
> + * ata_format_dsm_trim_descr() - SATL Write Same to ATA SCT Write Same
> + * @cmd: SCSI command being translated
> + * @lba: Starting sector
> + * @num: Number of logical sectors to be zero'd.
> + *
> + * Rewrite the WRITE SAME descriptor to be an SCT Write Same formatted
> + * descriptor.
> + * NOTE: Writes a pattern (0's) in the foreground.
> + *       Large write-same requents can timeout.
> + */
> +static void ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
> +{
> +       u16 *sctpg;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> +       sctpg = ((void *)ata_scsi_rbuf);
> +
> +       put_unaligned_le16(0x0002,  &sctpg[0]); /* SCT_ACT_WRITE_SAME */
> +       put_unaligned_le16(0x0101,  &sctpg[1]); /* WRITE PTRN FG */
> +       put_unaligned_le64(lba,     &sctpg[2]);
> +       put_unaligned_le64(num,     &sctpg[6]);
> +       put_unaligned_le32(0u,      &sctpg[10]);
> +
> +       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), sctpg, 512);
> +       spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
> +}
> +
> +/**
> + * ata_scsi_write_same_xlat() - SATL Write Same to ATA SCT Write Same
> + * @qc: Command to be translated
> + *
> + * Translate a SCSI WRITE SAME command to be either a DSM TRIM command or
> + * an SCT Write Same command.
> + * Based on WRITE SAME has the UNMAP flag
> + *   When set translate to DSM TRIM
> + *   When clear translate to SCT Write Same
> + */
>  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  {
>         struct ata_taskfile *tf = &qc->tf;
> @@ -3342,6 +3379,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         u32 size;
>         u16 fp;
>         u8 bp = 0xff;
> +       u8 unmap = cdb[1] & 0x8;
>
>         /* we may not issue DMA commands if no DMA mode is set */
>         if (unlikely(!dev->dma_mode))
> @@ -3353,11 +3391,26 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         }
>         scsi_16_lba_len(cdb, &block, &n_block);
>
> -       /* for now we only support WRITE SAME with the unmap bit set */
> -       if (unlikely(!(cdb[1] & 0x8))) {
> -               fp = 1;
> -               bp = 3;
> -               goto invalid_fld;
> +       if (unmap) {
> +               /* If trim is not enabled the cmd is invalid. */
> +               if ((dev->horkage & ATA_HORKAGE_NOTRIM) ||
> +                   !ata_id_has_trim(dev->id)) {
> +                       fp = 1;
> +                       bp = 3;
> +                       goto invalid_fld;
> +               }
> +               /* If the request is too large the cmd is invalid */
> +               if (n_block > 0xffff * trmax) {
> +                       fp = 2;
> +                       goto invalid_fld;
> +               }

This response should be generally applied to the Write Same (16)
translation, since it is required by SBC,

> +       } else {
> +               /* If write same is not available the cmd is invalid */
> +               if (!ata_id_sct_write_same(dev->id)) {
> +                       fp = 1;
> +                       bp = 3;
> +                       goto invalid_fld;
> +               }

therefore, you should add an n_block check here as well, if you are
going to advertise an Maximum Write Same Length even when the device
supports only SCT Write Same but not TRIM. Most likely you would want
to simply move the existing check one-level up (if the same limit is
advertised no matter TRIM is supported not or not).

>         }
>
>         /*
> @@ -3367,30 +3420,42 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         if (!scsi_sg_count(scmd))
>                 goto invalid_param_len;
>
> -       if (n_block <= 0xffff * trmax) {
> +       if (unmap) {
>                 size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
> +               if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
> +                       /* Newer devices support queued TRIM commands */
> +                       tf->protocol = ATA_PROT_NCQ;
> +                       tf->command = ATA_CMD_FPDMA_SEND;
> +                       tf->hob_nsect = ATA_SUBCMD_FPDMA_SEND_DSM & 0x1f;
> +                       tf->nsect = qc->tag << 3;
> +                       tf->hob_feature = (size / 512) >> 8;
> +                       tf->feature = size / 512;
> +
> +                       tf->auxiliary = 1;
> +               } else {
> +                       tf->protocol = ATA_PROT_DMA;
> +                       tf->hob_feature = 0;
> +                       tf->feature = ATA_DSM_TRIM;
> +                       tf->hob_nsect = (size / 512) >> 8;
> +                       tf->nsect = size / 512;
> +                       tf->command = ATA_CMD_DSM;
> +               }
>         } else {
> -               fp = 2;
> -               goto invalid_fld;
> -       }
> -
> -       if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
> -               /* Newer devices support queued TRIM commands */
> -               tf->protocol = ATA_PROT_NCQ;
> -               tf->command = ATA_CMD_FPDMA_SEND;
> -               tf->hob_nsect = ATA_SUBCMD_FPDMA_SEND_DSM & 0x1f;
> -               tf->nsect = qc->tag << 3;
> -               tf->hob_feature = (size / 512) >> 8;
> -               tf->feature = size / 512;
> +               ata_format_sct_write_same(scmd, block, n_block);
>
> -               tf->auxiliary = 1;
> -       } else {
> -               tf->protocol = ATA_PROT_DMA;
>                 tf->hob_feature = 0;
> -               tf->feature = ATA_DSM_TRIM;
> -               tf->hob_nsect = (size / 512) >> 8;
> -               tf->nsect = size / 512;
> -               tf->command = ATA_CMD_DSM;
> +               tf->feature = 0;
> +               tf->hob_nsect = 0;
> +               tf->nsect = 1;
> +               tf->lbah = 0;
> +               tf->lbam = 0;
> +               tf->lbal = ATA_CMD_STANDBYNOW1;
> +               tf->hob_lbah = 0;
> +               tf->hob_lbam = 0;
> +               tf->hob_lbal = 0;
> +               tf->device = ATA_CMD_STANDBYNOW1;
> +               tf->protocol = ATA_PROT_DMA;
> +               tf->command = ATA_CMD_WRITE_LOG_DMA_EXT;
>         }
>
>         tf->flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE | ATA_TFLAG_LBA48 |
> @@ -3414,6 +3479,76 @@ invalid_opcode:
>  }
>
>  /**
> + *     ata_scsiop_maint_in - Simulate a subset of MAINTENANCE_IN
> + *     @args: device MAINTENANCE_IN data / SCSI command of interest.
> + *     @rbuf: Response buffer, to which simulated SCSI cmd output is sent.
> + *
> + *     Yields a subset to satisfy scsi_report_opcode()
> + *
> + *     LOCKING:
> + *     spin_lock_irqsave(host lock)
> + */
> +static unsigned int ata_scsiop_maint_in(struct ata_scsi_args *args, u8 *rbuf)
> +{
> +       struct ata_device *dev = args->dev;
> +       u8 *cdb = args->cmd->cmnd;
> +       u8 supported = 0;
> +       unsigned int err = 0;
> +
> +       if (cdb[2] != 1) {
> +               ata_dev_warn(dev, "invalid command format %d\n", cdb[2]);
> +               err = 2;
> +               goto out;
> +       }
> +       switch (cdb[3]) {
> +       case INQUIRY:
> +       case MODE_SENSE:
> +       case MODE_SENSE_10:
> +       case READ_CAPACITY:
> +       case SERVICE_ACTION_IN_16:
> +       case REPORT_LUNS:
> +       case REQUEST_SENSE:
> +       case SYNCHRONIZE_CACHE:
> +       case REZERO_UNIT:
> +       case SEEK_6:
> +       case SEEK_10:
> +       case TEST_UNIT_READY:
> +       case SEND_DIAGNOSTIC:
> +       case MAINTENANCE_IN:
> +       case READ_6:
> +       case READ_10:
> +       case READ_16:
> +       case WRITE_6:
> +       case WRITE_10:
> +       case WRITE_16:
> +       case ATA_12:
> +       case ATA_16:
> +       case VERIFY:
> +       case VERIFY_16:
> +       case MODE_SELECT:
> +       case MODE_SELECT_10:
> +       case START_STOP:
> +               supported = 3;
> +               break;
> +       case WRITE_SAME_16:
> +               if (ata_id_sct_write_same(dev->id))
> +                       supported = 3;
> +               break;
> +       case ZBC_IN:
> +       case ZBC_OUT:
> +               if (ata_id_zoned_cap(dev->id) ||
> +                   dev->class == ATA_DEV_ZAC)
> +                       supported = 3;
> +               break;
> +       default:
> +               break;
> +       }
> +out:
> +       rbuf[1] = supported; /* supported */
> +       return err;
> +}
> +
> +/**
>   *     ata_scsi_report_zones_complete - convert ATA output
>   *     @qc: command structure returning the data
>   *
> @@ -4193,6 +4328,13 @@ void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd)
>                         ata_scsi_invalid_field(dev, cmd, 1);
>                 break;
>
> +       case MAINTENANCE_IN:
> +               if (scsicmd[1] == MI_REPORT_SUPPORTED_OPERATION_CODES)
> +                       ata_scsi_rbuf_fill(&args, ata_scsiop_maint_in);
> +               else
> +                       ata_scsi_invalid_field(dev, cmd, 1);
> +               break;
> +
>         /* all other commands */
>         default:
>                 ata_scsi_set_sense(dev, cmd, ILLEGAL_REQUEST, 0x20, 0x0);
> @@ -4225,7 +4367,6 @@ int ata_scsi_add_hosts(struct ata_host *host, struct scsi_host_template *sht)
>                 shost->max_lun = 1;
>                 shost->max_channel = 1;
>                 shost->max_cmd_len = 16;
> -               shost->no_write_same = 1;
>
>                 /* Schedule policy is determined by ->qc_defer()
>                  * callback and it needs to see every deferred qc.
> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index 45a1d71..fdb1803 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -105,6 +105,7 @@ enum {
>         ATA_ID_CFA_KEY_MGMT     = 162,
>         ATA_ID_CFA_MODES        = 163,
>         ATA_ID_DATA_SET_MGMT    = 169,
> +       ATA_ID_SCT_CMD_XPORT    = 206,
>         ATA_ID_ROT_SPEED        = 217,
>         ATA_ID_PIO4             = (1 << 1),
>
> @@ -789,6 +790,48 @@ static inline bool ata_id_sense_reporting_enabled(const u16 *id)
>  }
>
>  /**
> + *
> + * Word: 206 - SCT Command Transport
> + *    15:12 - Vendor Specific
> + *     11:6 - Reserved
> + *        5 - SCT Command Transport Data Tables supported
> + *        4 - SCT Command Transport Features Control supported
> + *        3 - SCT Command Transport Error Recovery Control supported
> + *        2 - SCT Command Transport Write Same supported
> + *        1 - SCT Command Transport Long Sector Access supported
> + *        0 - SCT Command Transport supported
> + */
> +static inline bool ata_id_sct_data_tables(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 5) ? true : false;
> +}
> +
> +static inline bool ata_id_sct_features_ctrl(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 4) ? true : false;
> +}
> +
> +static inline bool ata_id_sct_error_recovery_ctrl(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 3) ? true : false;
> +}
> +
> +static inline bool ata_id_sct_write_same(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 2) ? true : false;
> +}
> +
> +static inline bool ata_id_sct_long_sector_access(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 1) ? true : false;
> +}
> +
> +static inline bool ata_id_sct_supported(const u16 *id)
> +{
> +       return id[ATA_ID_SCT_CMD_XPORT] & (1 << 0) ? true : false;
> +}
> +
> +/**
>   *     ata_id_major_version    -       get ATA level of drive
>   *     @id: Identify data
>   *
> --
> 2.9.3
>

^ permalink raw reply

* [PATCH] ata: do not hard code limit in ata_set_lba_range_entries()
From: tom.ty89 @ 2016-08-22 18:55 UTC (permalink / raw)
  To: tj, martin.petersen, shaun.tancheff, shaun; +Cc: linux-ide, linux-scsi, Tom Yan

From: Tom Yan <tom.ty89@gmail.com>

In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
n_block that exceeds limit"), it is made sure that
ata_set_lba_range_entries() will never be called with a request
size (n_block) that is larger than the number of blocks that a
512-byte block TRIM payload can describe (65535 * 64 = 4194240),
in addition to acknowlegding the SCSI/block layer with the same
limit by advertising it as the Maximum Write Same Length.

Therefore, it is unnecessary to hard code the same limit in
ata_set_lba_range_entries() itself, which would only cost extra
maintenance effort. Such effort can be noticed in, for example,
commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
number of TRIM ranges").

Signed-off-by: Tom Yan <tom.ty89@gmail.com>

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index be9c76c..9b74ecb 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
 	buf = page_address(sg_page(scsi_sglist(scmd)));
 
 	if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
-		size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
+		size = ata_set_lba_range_entries(buf, block, n_block);
 	} else {
 		fp = 2;
 		goto invalid_fld;
diff --git a/include/linux/ata.h b/include/linux/ata.h
index adbc812..5e2e9ad 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id)
  * TO NV CACHE PINNED SET.
  */
 static inline unsigned ata_set_lba_range_entries(void *_buffer,
-		unsigned num, u64 sector, unsigned long count)
+		u64 sector, unsigned long count)
 {
 	__le64 *buffer = _buffer;
 	unsigned i = 0, used_bytes;
 
-	while (i < num) {
-		u64 entry = sector |
-			((u64)(count > 0xffff ? 0xffff : count) << 48);
+	while (count > 0) {
+		u64 range, entry;
+
+		range = count > 0xffff ? 0xffff : count;
+	        entry = sector | (range << 48);
 		buffer[i++] = __cpu_to_le64(entry);
-		if (count <= 0xffff)
-			break;
-		count -= 0xffff;
-		sector += 0xffff;
+		count -= range;
+		sector += range;
 	}
 
 	used_bytes = ALIGN(i * 8, 512);
-- 
2.9.3


^ permalink raw reply related

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Tom Yan @ 2016-08-22 18:52 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAJVOszCjjjx39GsdFtKAdwVRG0dJ0Frf4MKSQVHorA_qSZzR7w@mail.gmail.com>

In that case I see no reason that my suggestion should not be adopted.
Currently speaking (as I mentioned in the commit message) it is
reasonable to decrease it per logical sector size in the TRIM-only
sense as well because of the block layer / bio size limit.

FWIW, as of ACS-4, RANGE LENGTH field in DSM(XL)/TRIM range entry
should be "number of logical sectors". Therefore, if there will be any
4Kn SSDs, _bytes_ TRIM'd per command is expected to be increased with
the sector size as well, just like SCT Write Same.

For the "payload block size" that is "always" 512-byte as per the same
spec, I don't think we need to concern about it. I think it only
matters if we want to enable multi-block TRIM payload according to the
reported limit in IDENTIFY DEVICE data. Probably it merely means that
the "each of the blocks" in the reported limit will always mean 64
TRIM entries, even on 4Kn drives, instead of 512.

On 23 August 2016 at 02:00, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Mon, Aug 22, 2016 at 12:02 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 22 August 2016 at 15:04, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>>> On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>>> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
>
>> Since I have no experience with SCT Write Same at all, and neither do
>> I own any spinning HDD at all, I cannot firmly suggest what to do. All
>> I can suggest is: should we decrease it per sector size? Or would 2G
>> per command still be too large to avoid timeout?
>
> Timeout for WS is 120 seconds so we should be fine there.
>
> The number to look for is the:
>    Max. Sustained Transfer Rate OD (MB/s): 190 8TB (180 5TB)
>
> Which means the above drives should complete a 2G write in
> about 10 to 11 seconds.
>
> If these were 4Kn drives and we allowed a 16G max then it
> would be 80-90 seconds, assuming the write speed didn't
> get any better.
>
> So holding the maximum to around 2G is probably the best
> overall, in my opinion.
>
> --
> Shaun Tancheff
>
> On Mon, Aug 22, 2016 at 12:02 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 22 August 2016 at 15:04, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>>> On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>>> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
>>>>> As mentioned before, as of the latest draft of ACS-4, nothing about a
>>>>> larger payload size is mentioned. Conservatively speaking, it sort of
>>>>
>>>> *payload block size
>>>>
>>>>> means that we are allowing four 512-byte block payload on 4Kn device
>>>>
>>>> *eight 512-byte-block payload
>>>>
>>>>> regardless of the reported limit in the IDENTIFY DEVICE data. I am
>>>>> really not sure if it's a good thing to do. Doesn't seem necessary
>>>>> anyway, especially when our block layer does not support such a large
>>>>> bio size (well, yet), so each request will end up using a payload of
>>>>> two 512-byte blocks at max anyway.
>>>>>
>>>>> Also, it's IMHO better to do it in a seperate patch (series) after the
>>>>> SCT Write Same support has entered libata's repo too, because this has
>>>>> nothing to with it but TRIM translation. In case the future ACS
>>>>> standards has clearer/better instruction on this, it will be easier
>>>>> for us to revert/follow up too.
>>>
>>> I am certainly fine with dropping this patch as it is not critical to
>>> the reset of the series.
>>>
>>> Nothing will break if we stick with the 512 byte fixed limit. This
>>> is at most a prep patch for handling increased limits should
>>> they be reported.
>>>
>>> All it really is doing is acknowledging that any write same
>>> must have a payload of sector_size which can be something
>>> larger than 512 bytes.
>>
>> Actually I am not sure if we should hard code the limit
>> ata_format_dsm_trim_descr() / ata_set_lba_range_entries() at all. The
>> current implementation (with or without your patch) seems redundant
>> and unnecessary to me.
>>
>> All we need to do should be: making sure that the block limits VPD
>> advertises a safe Maximum Write Same Length, and reject Write Same
>> (16) commands that have "number of blocks" that exceeds the limit
>> (which is what I introduced in commit 5c79097a28c2, "libata-scsi:
>> reject WRITE SAME (16) with n_block that exceeds limit").
>>
>> In that case, we don't need to hard code the limit in the
>> while-condition again; instead we should just make it end with the
>> request size, since the accepted request could never be larger than
>> the limit we advertise.
>>
>>>
>>>>> And you'll need to fix the Block Limits VPD simulation
>>>>> (ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
>>>>> Same Length dynamically as per the logical sector size, otherwise your
>>>>> effort will be completely in vain, even if our block layer is
>>>>> overhauled in the future.
>>>
>>> Martin had earlier suggested that I leave the write same defaults
>>> as is due to concerns with misbehaving hardware.
>>
>> It doesn't really apply in libata's anyway. SD_MAX_WS10_BLOCKS means
>> nothing to ATA drives, except from coincidentally being the same value
>> as ATA_MAX_SECTORS_LBA48 (which technically should have been 65536
>> instead).
>>
>>>
>>> I think your patch adjusting the reported limits is reasonable
>>> enough. It seems to me we should have the hardware
>>> report it's actual limits, for example, report what the spec
>>> allows.
>>
>> As you mentioned yourself before, technically SCT Write Same does not
>> have a limit. The only practical limit is the timeout in the SCSI
>> layer, so the actual bytes being (over)written is probably our only
>> concern.
>>
>> For the case of TRIM, devices do report a limit in their IDENTIFY
>> DEVICE data. However, as Martin always said, it is not an always-safe
>> piece of data for us to refer to, that's why we have been statically
>> allowing only 1-block payload.
>>
>> Therefore, it seems convenient (and consistent) that we make SCT Write
>> Same always use the same limit as TRIM, no matter if it is supported
>> on a certain device. And to make sure the actual bytes being written /
>> time required per command does not increase enormously as per the
>> sector size, we decrease the limit accordingly. Certainly that's not
>> necessary if 16G per command is fine on most devices.
>>
>> Also, does SCT Write Same commands that write 32M/256M per command
>> make any sense? I mean would we benefit from such small SCT Write Same
>> commands at all?
>>
>>>
>>> Of course there are lots of reasons to limit the absolute
>>> maximums.
>>>
>>> So in this case we are just enabling the limit to be
>>> increased but not changing the current black-listing
>>> that distrusts DSM Trim. Once we have 4Kn devices
>>> to test then we can start white-listing and see if there
>>> is an overall increase in performance.
>>>
>>>>> Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
>>>>> all, the reported Maximum Write Same Length will be:
>>>>>
>>>>> On device with TRIM support:
>>>>> - 4194240 LOGICAL sector per request split / command
>>>>> -- ~=2G on non-4Kn drives
>>>>> -- ~=16G on non-4Kn drives
>>>>>
>>>>> On device without TRIM support:
>>>>> - 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
>>>>> -- ~= 32M on non-4Kn drives
>>>>> -- ~=256M on non-4Kn drives
>>>>>
>>>>> Even if we ignore the upper limit(s) of the block layer, do we want
>>>>> such inconsistencies?
>>>
>>> Hmm. Overall I think it is still okay if a bit confusing.
>>> It is possible that for devices which support SCT Write Same
>>> and DSM TRIM will still Trim faster than they can Write Same,
>>> However the internal implementation is opaque so I can't
>>> say if Write Same is often implemented in terms of TRIM
>>> or not. I mean that's how _I_ do it [Write 1 block and map
>>> N blocks to it], But not every FTL will have come to the
>>> same conclusion.
>>
>> Why would SCT Write Same be implemented in terms of TRIM? Neither
>> would we need to care about that anyway. Considering we will unlikely
>> allow multi-block payload TRIM, and we probably have no reason to
>> touch the SCSI Write Same timeout, the only thing we need to consider
>> is whether we want to decrease the advertised limit base on the
>> typical SCT Write Same speed on traditional HDDs and the timeout,
>> especially in the 4Kn case.
>>
>> Since I have no experience with SCT Write Same at all, and neither do
>> I own any spinning HDD at all, I cannot firmly suggest what to do. All
>> I can suggest is: should we decrease it per sector size? Or would 2G
>> per command still be too large to avoid timeout?
>>
>>>
>>> I also suspect that given the choice for most use casess that
>>> TRIM is preferred over WS when TRIM supports returning
>>> zeroed blocks.
>>
>> Well, for devices with discard_zeroes_data = 1, the block layer will
>> not issue write same requests (see blkdev_issue_zeroout in
>> block/blk-lib.c). However, libata only consider the RZAT support bit
>> from a white list of devices (see ata_scsiop_read_cap in libata-scsi
>> and the white list in libata-core).
>>
>>>
>>> --
>>> Shaun Tancheff
>
>
>
> --
> Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Shaun Tancheff @ 2016-08-22 18:00 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSE=-kqLyQQ4dZ1=VQ=iBectjHmbF7wg_bxUJ_k+H6nq8bA@mail.gmail.com>

On Mon, Aug 22, 2016 at 12:02 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 15:04, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>> On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:

> Since I have no experience with SCT Write Same at all, and neither do
> I own any spinning HDD at all, I cannot firmly suggest what to do. All
> I can suggest is: should we decrease it per sector size? Or would 2G
> per command still be too large to avoid timeout?

Timeout for WS is 120 seconds so we should be fine there.

The number to look for is the:
   Max. Sustained Transfer Rate OD (MB/s): 190 8TB (180 5TB)

Which means the above drives should complete a 2G write in
about 10 to 11 seconds.

If these were 4Kn drives and we allowed a 16G max then it
would be 80-90 seconds, assuming the write speed didn't
get any better.

So holding the maximum to around 2G is probably the best
overall, in my opinion.

-- 
Shaun Tancheff

On Mon, Aug 22, 2016 at 12:02 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 15:04, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>> On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
>>>> As mentioned before, as of the latest draft of ACS-4, nothing about a
>>>> larger payload size is mentioned. Conservatively speaking, it sort of
>>>
>>> *payload block size
>>>
>>>> means that we are allowing four 512-byte block payload on 4Kn device
>>>
>>> *eight 512-byte-block payload
>>>
>>>> regardless of the reported limit in the IDENTIFY DEVICE data. I am
>>>> really not sure if it's a good thing to do. Doesn't seem necessary
>>>> anyway, especially when our block layer does not support such a large
>>>> bio size (well, yet), so each request will end up using a payload of
>>>> two 512-byte blocks at max anyway.
>>>>
>>>> Also, it's IMHO better to do it in a seperate patch (series) after the
>>>> SCT Write Same support has entered libata's repo too, because this has
>>>> nothing to with it but TRIM translation. In case the future ACS
>>>> standards has clearer/better instruction on this, it will be easier
>>>> for us to revert/follow up too.
>>
>> I am certainly fine with dropping this patch as it is not critical to
>> the reset of the series.
>>
>> Nothing will break if we stick with the 512 byte fixed limit. This
>> is at most a prep patch for handling increased limits should
>> they be reported.
>>
>> All it really is doing is acknowledging that any write same
>> must have a payload of sector_size which can be something
>> larger than 512 bytes.
>
> Actually I am not sure if we should hard code the limit
> ata_format_dsm_trim_descr() / ata_set_lba_range_entries() at all. The
> current implementation (with or without your patch) seems redundant
> and unnecessary to me.
>
> All we need to do should be: making sure that the block limits VPD
> advertises a safe Maximum Write Same Length, and reject Write Same
> (16) commands that have "number of blocks" that exceeds the limit
> (which is what I introduced in commit 5c79097a28c2, "libata-scsi:
> reject WRITE SAME (16) with n_block that exceeds limit").
>
> In that case, we don't need to hard code the limit in the
> while-condition again; instead we should just make it end with the
> request size, since the accepted request could never be larger than
> the limit we advertise.
>
>>
>>>> And you'll need to fix the Block Limits VPD simulation
>>>> (ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
>>>> Same Length dynamically as per the logical sector size, otherwise your
>>>> effort will be completely in vain, even if our block layer is
>>>> overhauled in the future.
>>
>> Martin had earlier suggested that I leave the write same defaults
>> as is due to concerns with misbehaving hardware.
>
> It doesn't really apply in libata's anyway. SD_MAX_WS10_BLOCKS means
> nothing to ATA drives, except from coincidentally being the same value
> as ATA_MAX_SECTORS_LBA48 (which technically should have been 65536
> instead).
>
>>
>> I think your patch adjusting the reported limits is reasonable
>> enough. It seems to me we should have the hardware
>> report it's actual limits, for example, report what the spec
>> allows.
>
> As you mentioned yourself before, technically SCT Write Same does not
> have a limit. The only practical limit is the timeout in the SCSI
> layer, so the actual bytes being (over)written is probably our only
> concern.
>
> For the case of TRIM, devices do report a limit in their IDENTIFY
> DEVICE data. However, as Martin always said, it is not an always-safe
> piece of data for us to refer to, that's why we have been statically
> allowing only 1-block payload.
>
> Therefore, it seems convenient (and consistent) that we make SCT Write
> Same always use the same limit as TRIM, no matter if it is supported
> on a certain device. And to make sure the actual bytes being written /
> time required per command does not increase enormously as per the
> sector size, we decrease the limit accordingly. Certainly that's not
> necessary if 16G per command is fine on most devices.
>
> Also, does SCT Write Same commands that write 32M/256M per command
> make any sense? I mean would we benefit from such small SCT Write Same
> commands at all?
>
>>
>> Of course there are lots of reasons to limit the absolute
>> maximums.
>>
>> So in this case we are just enabling the limit to be
>> increased but not changing the current black-listing
>> that distrusts DSM Trim. Once we have 4Kn devices
>> to test then we can start white-listing and see if there
>> is an overall increase in performance.
>>
>>>> Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
>>>> all, the reported Maximum Write Same Length will be:
>>>>
>>>> On device with TRIM support:
>>>> - 4194240 LOGICAL sector per request split / command
>>>> -- ~=2G on non-4Kn drives
>>>> -- ~=16G on non-4Kn drives
>>>>
>>>> On device without TRIM support:
>>>> - 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
>>>> -- ~= 32M on non-4Kn drives
>>>> -- ~=256M on non-4Kn drives
>>>>
>>>> Even if we ignore the upper limit(s) of the block layer, do we want
>>>> such inconsistencies?
>>
>> Hmm. Overall I think it is still okay if a bit confusing.
>> It is possible that for devices which support SCT Write Same
>> and DSM TRIM will still Trim faster than they can Write Same,
>> However the internal implementation is opaque so I can't
>> say if Write Same is often implemented in terms of TRIM
>> or not. I mean that's how _I_ do it [Write 1 block and map
>> N blocks to it], But not every FTL will have come to the
>> same conclusion.
>
> Why would SCT Write Same be implemented in terms of TRIM? Neither
> would we need to care about that anyway. Considering we will unlikely
> allow multi-block payload TRIM, and we probably have no reason to
> touch the SCSI Write Same timeout, the only thing we need to consider
> is whether we want to decrease the advertised limit base on the
> typical SCT Write Same speed on traditional HDDs and the timeout,
> especially in the 4Kn case.
>
> Since I have no experience with SCT Write Same at all, and neither do
> I own any spinning HDD at all, I cannot firmly suggest what to do. All
> I can suggest is: should we decrease it per sector size? Or would 2G
> per command still be too large to avoid timeout?
>
>>
>> I also suspect that given the choice for most use casess that
>> TRIM is preferred over WS when TRIM supports returning
>> zeroed blocks.
>
> Well, for devices with discard_zeroes_data = 1, the block layer will
> not issue write same requests (see blkdev_issue_zeroout in
> block/blk-lib.c). However, libata only consider the RZAT support bit
> from a white list of devices (see ata_scsiop_read_cap in libata-scsi
> and the white list in libata-core).
>
>>
>> --
>> Shaun Tancheff



-- 
Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Tom Yan @ 2016-08-22 17:02 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAJVOszCvXwjJCEwxOG0GhN9wq+Axq15zW7Ja=EybyZLRZBYkpA@mail.gmail.com>

On 22 August 2016 at 15:04, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
>>> As mentioned before, as of the latest draft of ACS-4, nothing about a
>>> larger payload size is mentioned. Conservatively speaking, it sort of
>>
>> *payload block size
>>
>>> means that we are allowing four 512-byte block payload on 4Kn device
>>
>> *eight 512-byte-block payload
>>
>>> regardless of the reported limit in the IDENTIFY DEVICE data. I am
>>> really not sure if it's a good thing to do. Doesn't seem necessary
>>> anyway, especially when our block layer does not support such a large
>>> bio size (well, yet), so each request will end up using a payload of
>>> two 512-byte blocks at max anyway.
>>>
>>> Also, it's IMHO better to do it in a seperate patch (series) after the
>>> SCT Write Same support has entered libata's repo too, because this has
>>> nothing to with it but TRIM translation. In case the future ACS
>>> standards has clearer/better instruction on this, it will be easier
>>> for us to revert/follow up too.
>
> I am certainly fine with dropping this patch as it is not critical to
> the reset of the series.
>
> Nothing will break if we stick with the 512 byte fixed limit. This
> is at most a prep patch for handling increased limits should
> they be reported.
>
> All it really is doing is acknowledging that any write same
> must have a payload of sector_size which can be something
> larger than 512 bytes.

Actually I am not sure if we should hard code the limit
ata_format_dsm_trim_descr() / ata_set_lba_range_entries() at all. The
current implementation (with or without your patch) seems redundant
and unnecessary to me.

All we need to do should be: making sure that the block limits VPD
advertises a safe Maximum Write Same Length, and reject Write Same
(16) commands that have "number of blocks" that exceeds the limit
(which is what I introduced in commit 5c79097a28c2, "libata-scsi:
reject WRITE SAME (16) with n_block that exceeds limit").

In that case, we don't need to hard code the limit in the
while-condition again; instead we should just make it end with the
request size, since the accepted request could never be larger than
the limit we advertise.

>
>>> And you'll need to fix the Block Limits VPD simulation
>>> (ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
>>> Same Length dynamically as per the logical sector size, otherwise your
>>> effort will be completely in vain, even if our block layer is
>>> overhauled in the future.
>
> Martin had earlier suggested that I leave the write same defaults
> as is due to concerns with misbehaving hardware.

It doesn't really apply in libata's anyway. SD_MAX_WS10_BLOCKS means
nothing to ATA drives, except from coincidentally being the same value
as ATA_MAX_SECTORS_LBA48 (which technically should have been 65536
instead).

>
> I think your patch adjusting the reported limits is reasonable
> enough. It seems to me we should have the hardware
> report it's actual limits, for example, report what the spec
> allows.

As you mentioned yourself before, technically SCT Write Same does not
have a limit. The only practical limit is the timeout in the SCSI
layer, so the actual bytes being (over)written is probably our only
concern.

For the case of TRIM, devices do report a limit in their IDENTIFY
DEVICE data. However, as Martin always said, it is not an always-safe
piece of data for us to refer to, that's why we have been statically
allowing only 1-block payload.

Therefore, it seems convenient (and consistent) that we make SCT Write
Same always use the same limit as TRIM, no matter if it is supported
on a certain device. And to make sure the actual bytes being written /
time required per command does not increase enormously as per the
sector size, we decrease the limit accordingly. Certainly that's not
necessary if 16G per command is fine on most devices.

Also, does SCT Write Same commands that write 32M/256M per command
make any sense? I mean would we benefit from such small SCT Write Same
commands at all?

>
> Of course there are lots of reasons to limit the absolute
> maximums.
>
> So in this case we are just enabling the limit to be
> increased but not changing the current black-listing
> that distrusts DSM Trim. Once we have 4Kn devices
> to test then we can start white-listing and see if there
> is an overall increase in performance.
>
>>> Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
>>> all, the reported Maximum Write Same Length will be:
>>>
>>> On device with TRIM support:
>>> - 4194240 LOGICAL sector per request split / command
>>> -- ~=2G on non-4Kn drives
>>> -- ~=16G on non-4Kn drives
>>>
>>> On device without TRIM support:
>>> - 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
>>> -- ~= 32M on non-4Kn drives
>>> -- ~=256M on non-4Kn drives
>>>
>>> Even if we ignore the upper limit(s) of the block layer, do we want
>>> such inconsistencies?
>
> Hmm. Overall I think it is still okay if a bit confusing.
> It is possible that for devices which support SCT Write Same
> and DSM TRIM will still Trim faster than they can Write Same,
> However the internal implementation is opaque so I can't
> say if Write Same is often implemented in terms of TRIM
> or not. I mean that's how _I_ do it [Write 1 block and map
> N blocks to it], But not every FTL will have come to the
> same conclusion.

Why would SCT Write Same be implemented in terms of TRIM? Neither
would we need to care about that anyway. Considering we will unlikely
allow multi-block payload TRIM, and we probably have no reason to
touch the SCSI Write Same timeout, the only thing we need to consider
is whether we want to decrease the advertised limit base on the
typical SCT Write Same speed on traditional HDDs and the timeout,
especially in the 4Kn case.

Since I have no experience with SCT Write Same at all, and neither do
I own any spinning HDD at all, I cannot firmly suggest what to do. All
I can suggest is: should we decrease it per sector size? Or would 2G
per command still be too large to avoid timeout?

>
> I also suspect that given the choice for most use casess that
> TRIM is preferred over WS when TRIM supports returning
> zeroed blocks.

Well, for devices with discard_zeroes_data = 1, the block layer will
not issue write same requests (see blkdev_issue_zeroout in
block/blk-lib.c). However, libata only consider the RZAT support bit
from a white list of devices (see ata_scsiop_read_cap in libata-scsi
and the white list in libata-core).

>
> --
> Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Shaun Tancheff @ 2016-08-22 15:04 UTC (permalink / raw)
  To: Tom Yan
  Cc: Shaun Tancheff, linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke
In-Reply-To: <CAGnHSEmFMx1Q6mRV-hXwR4_08OvvSR1xcxoT1dLjVMvOHYY_+g@mail.gmail.com>

On Mon, Aug 22, 2016 at 3:33 AM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
>> As mentioned before, as of the latest draft of ACS-4, nothing about a
>> larger payload size is mentioned. Conservatively speaking, it sort of
>
> *payload block size
>
>> means that we are allowing four 512-byte block payload on 4Kn device
>
> *eight 512-byte-block payload
>
>> regardless of the reported limit in the IDENTIFY DEVICE data. I am
>> really not sure if it's a good thing to do. Doesn't seem necessary
>> anyway, especially when our block layer does not support such a large
>> bio size (well, yet), so each request will end up using a payload of
>> two 512-byte blocks at max anyway.
>>
>> Also, it's IMHO better to do it in a seperate patch (series) after the
>> SCT Write Same support has entered libata's repo too, because this has
>> nothing to with it but TRIM translation. In case the future ACS
>> standards has clearer/better instruction on this, it will be easier
>> for us to revert/follow up too.

I am certainly fine with dropping this patch as it is not critical to
the reset of the series.

Nothing will break if we stick with the 512 byte fixed limit. This
is at most a prep patch for handling increased limits should
they be reported.

All it really is doing is acknowledging that any write same
must have a payload of sector_size which can be something
larger than 512 bytes.

>> And you'll need to fix the Block Limits VPD simulation
>> (ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
>> Same Length dynamically as per the logical sector size, otherwise your
>> effort will be completely in vain, even if our block layer is
>> overhauled in the future.

Martin had earlier suggested that I leave the write same defaults
as is due to concerns with misbehaving hardware.

I think your patch adjusting the reported limits is reasonable
enough. It seems to me we should have the hardware
report it's actual limits, for example, report what the spec
allows.

Of course there are lots of reasons to limit the absolute
maximums.

So in this case we are just enabling the limit to be
increased but not changing the current black-listing
that distrusts DSM Trim. Once we have 4Kn devices
to test then we can start white-listing and see if there
is an overall increase in performance.

>> Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
>> all, the reported Maximum Write Same Length will be:
>>
>> On device with TRIM support:
>> - 4194240 LOGICAL sector per request split / command
>> -- ~=2G on non-4Kn drives
>> -- ~=16G on non-4Kn drives
>>
>> On device without TRIM support:
>> - 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
>> -- ~= 32M on non-4Kn drives
>> -- ~=256M on non-4Kn drives
>>
>> Even if we ignore the upper limit(s) of the block layer, do we want
>> such inconsistencies?

Hmm. Overall I think it is still okay if a bit confusing.
It is possible that for devices which support SCT Write Same
and DSM TRIM will still Trim faster than they can Write Same,
However the internal implementation is opaque so I can't
say if Write Same is often implemented in terms of TRIM
or not. I mean that's how _I_ do it [Write 1 block and map
N blocks to it], But not every FTL will have come to the
same conclusion.

I also suspect that given the choice for most use casess that
TRIM is preferred over WS when TRIM supports returning
zeroed blocks.

-- 
Shaun Tancheff

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Tom Yan @ 2016-08-22  8:33 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <CAGnHSEn95Z8wKfq1Nf0ntKSzze-ScP5WeAahziD5Sk0QfmEXyw@mail.gmail.com>

On 22 August 2016 at 08:31, Tom Yan <tom.ty89@gmail.com> wrote:
> As mentioned before, as of the latest draft of ACS-4, nothing about a
> larger payload size is mentioned. Conservatively speaking, it sort of

*payload block size

> means that we are allowing four 512-byte block payload on 4Kn device

*eight 512-byte-block payload

> regardless of the reported limit in the IDENTIFY DEVICE data. I am
> really not sure if it's a good thing to do. Doesn't seem necessary
> anyway, especially when our block layer does not support such a large
> bio size (well, yet), so each request will end up using a payload of
> two 512-byte blocks at max anyway.
>
> Also, it's IMHO better to do it in a seperate patch (series) after the
> SCT Write Same support has entered libata's repo too, because this has
> nothing to with it but TRIM translation. In case the future ACS
> standards has clearer/better instruction on this, it will be easier
> for us to revert/follow up too.
>
> And you'll need to fix the Block Limits VPD simulation
> (ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
> Same Length dynamically as per the logical sector size, otherwise your
> effort will be completely in vain, even if our block layer is
> overhauled in the future.
>
> Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
> all, the reported Maximum Write Same Length will be:
>
> On device with TRIM support:
> - 4194240 LOGICAL sector per request split / command
> -- ~=2G on non-4Kn drives
> -- ~=16G on non-4Kn drives
>
> On device without TRIM support:
> - 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
> -- ~= 32M on non-4Kn drives
> -- ~=256M on non-4Kn drives
>
> Even if we ignore the upper limit(s) of the block layer, do we want
> such inconsistencies?
>

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Tom Yan @ 2016-08-22  8:31 UTC (permalink / raw)
  To: Shaun Tancheff
  Cc: linux-ide, LKML, Tejun Heo, Christoph Hellwig,
	Martin K . Petersen, Damien Le Moal, Hannes Reinecke,
	Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-4-shaun@tancheff.com>

As mentioned before, as of the latest draft of ACS-4, nothing about a
larger payload size is mentioned. Conservatively speaking, it sort of
means that we are allowing four 512-byte block payload on 4Kn device
regardless of the reported limit in the IDENTIFY DEVICE data. I am
really not sure if it's a good thing to do. Doesn't seem necessary
anyway, especially when our block layer does not support such a large
bio size (well, yet), so each request will end up using a payload of
two 512-byte blocks at max anyway.

Also, it's IMHO better to do it in a seperate patch (series) after the
SCT Write Same support has entered libata's repo too, because this has
nothing to with it but TRIM translation. In case the future ACS
standards has clearer/better instruction on this, it will be easier
for us to revert/follow up too.

And you'll need to fix the Block Limits VPD simulation
(ata_scsiop_inq_b0) too, so that it will advertise the Maximum Write
Same Length dynamically as per the logical sector size, otherwise your
effort will be completely in vain, even if our block layer is
overhauled in the future.

Please be noted that, since your haven't touched ata_scsiop_inq_b0 at
all, the reported Maximum Write Same Length will be:

On device with TRIM support:
- 4194240 LOGICAL sector per request split / command
-- ~=2G on non-4Kn drives
-- ~=16G on non-4Kn drives

On device without TRIM support:
- 0 --> SD_MAX_WS10_BLOCKS (65535) per request split / command
-- ~= 32M on non-4Kn drives
-- ~=256M on non-4Kn drives

Even if we ignore the upper limit(s) of the block layer, do we want
such inconsistencies?

On 22 August 2016 at 04:23, Shaun Tancheff <shaun@tancheff.com> wrote:
> Correct handling of devices with sector_size other that 512 bytes.
>
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> In the case of a 4Kn device sector_size it is possible to describe a much
> larger DSM Trim than the current fixed default of 512 bytes.
>
> This patch assumes the minimum descriptor is sector_size and fills out
> the descriptor accordingly.
>
> The ACS-2 specification is quite clear that the DSM command payload is
> sized as number of 512 byte transfers so a 4Kn device will operate
> correctly without this patch.
>
> v5:
>  - Added support for a sector_size descriptor other than 512 bytes.
>
>  drivers/ata/libata-scsi.c | 85 +++++++++++++++++++++++++++++++----------------
>  1 file changed, 57 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index ebf1a04..37f456e 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3283,7 +3283,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>  /**
>   * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
>   * @cmd: SCSI command being translated
> - * @num: Maximum number of entries (nominally 64).
> + * @trmax: Maximum number of entries that will fit in sector_size bytes.
>   * @sector: Starting sector
>   * @count: Total Range of request in logical sectors
>   *
> @@ -3298,63 +3298,80 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>   *  LBA's should be sorted order and not overlap.
>   *
>   * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
> + *
> + * Return: Number of bytes copied into sglist.
>   */
> -static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
> -                                             u64 sector, u32 count)
> +static size_t ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 trmax,
> +                                       u64 sector, u32 count)
>  {
> -       __le64 *buffer;
> -       u32 i = 0, used_bytes;
> +       struct scsi_device *sdp = cmd->device;
> +       size_t len = sdp->sector_size;
> +       size_t r;
> +       __le64 *buf;
> +       u32 i = 0;
>         unsigned long flags;
>
> -       BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
> +       WARN_ON(len > ATA_SCSI_RBUF_SIZE);
> +
> +       if (len > ATA_SCSI_RBUF_SIZE)
> +               len = ATA_SCSI_RBUF_SIZE;
>
>         spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> -       buffer = ((void *)ata_scsi_rbuf);
> -       while (i < num) {
> +       buf = ((void *)ata_scsi_rbuf);
> +       memset(buf, 0, len);
> +       while (i < trmax) {
>                 u64 entry = sector |
>                         ((u64)(count > 0xffff ? 0xffff : count) << 48);
> -               buffer[i++] = __cpu_to_le64(entry);
> +               buf[i++] = __cpu_to_le64(entry);
>                 if (count <= 0xffff)
>                         break;
>                 count -= 0xffff;
>                 sector += 0xffff;
>         }
> -
> -       used_bytes = ALIGN(i * 8, 512);
> -       memset(buffer + i, 0, used_bytes - i * 8);
> -       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
> +       r = sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buf, len);
>         spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>
> -       return used_bytes;
> +       return r;
>  }
>
>  /**
>   * ata_format_dsm_trim_descr() - SATL Write Same to ATA SCT Write Same
>   * @cmd: SCSI command being translated
>   * @lba: Starting sector
> - * @num: Number of logical sectors to be zero'd.
> + * @num: Number of sectors to be zero'd.
>   *
> - * Rewrite the WRITE SAME descriptor to be an SCT Write Same formatted
> + * Rewrite the WRITE SAME payload to be an SCT Write Same formatted
>   * descriptor.
>   * NOTE: Writes a pattern (0's) in the foreground.
> - *       Large write-same requents can timeout.
> + *
> + * Return: Number of bytes copied into sglist.
>   */
> -static void ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
> +static size_t ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
>  {
> -       u16 *sctpg;
> +       struct scsi_device *sdp = cmd->device;
> +       size_t len = sdp->sector_size;
> +       size_t r;
> +       u16 *buf;
>         unsigned long flags;
>
>         spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> -       sctpg = ((void *)ata_scsi_rbuf);
> +       buf = ((void *)ata_scsi_rbuf);
> +
> +       put_unaligned_le16(0x0002,  &buf[0]); /* SCT_ACT_WRITE_SAME */
> +       put_unaligned_le16(0x0101,  &buf[1]); /* WRITE PTRN FG */
> +       put_unaligned_le64(lba,     &buf[2]);
> +       put_unaligned_le64(num,     &buf[6]);
> +       put_unaligned_le32(0u,      &buf[10]); /* pattern */
> +
> +       WARN_ON(len > ATA_SCSI_RBUF_SIZE);
>
> -       put_unaligned_le16(0x0002,  &sctpg[0]); /* SCT_ACT_WRITE_SAME */
> -       put_unaligned_le16(0x0101,  &sctpg[1]); /* WRITE PTRN FG */
> -       put_unaligned_le64(lba,     &sctpg[2]);
> -       put_unaligned_le64(num,     &sctpg[6]);
> -       put_unaligned_le32(0u,      &sctpg[10]);
> +       if (len > ATA_SCSI_RBUF_SIZE)
> +               len = ATA_SCSI_RBUF_SIZE;
>
> -       sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), sctpg, 512);
> +       r = sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buf, len);
>         spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
> +
> +       return r;
>  }
>
>  /**
> @@ -3371,11 +3388,13 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  {
>         struct ata_taskfile *tf = &qc->tf;
>         struct scsi_cmnd *scmd = qc->scsicmd;
> +       struct scsi_device *sdp = scmd->device;
> +       size_t len = sdp->sector_size;
>         struct ata_device *dev = qc->dev;
>         const u8 *cdb = scmd->cmnd;
>         u64 block;
>         u32 n_block;
> -       const u32 trmax = ATA_MAX_TRIM_RNUM;
> +       const u32 trmax = len >> 3;
>         u32 size;
>         u16 fp;
>         u8 bp = 0xff;
> @@ -3420,8 +3439,16 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         if (!scsi_sg_count(scmd))
>                 goto invalid_param_len;
>
> +       /*
> +        * size must match sector size in bytes
> +        * For DATA SET MANAGEMENT TRIM in ACS-2 nsect (aka count)
> +        * is defined as number of 512 byte blocks to be transferred.
> +        */
>         if (unmap) {
>                 size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
> +               if (size != len)
> +                       goto invalid_param_len;
> +
>                 if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
>                         /* Newer devices support queued TRIM commands */
>                         tf->protocol = ATA_PROT_NCQ;
> @@ -3441,7 +3468,9 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>                         tf->command = ATA_CMD_DSM;
>                 }
>         } else {
> -               ata_format_sct_write_same(scmd, block, n_block);
> +               size = ata_format_sct_write_same(scmd, block, n_block);
> +               if (size != len)
> +                       goto invalid_param_len;
>
>                 tf->hob_feature = 0;
>                 tf->feature = 0;
> --
> 2.9.3
>

^ permalink raw reply

* Re: [PATCH v6 0/4] SCT Write Same
From: Hannes Reinecke @ 2016-08-22  6:32 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Tom Yan, Martin K . Petersen,
	Damien Le Moal, Josh Bingaman, Hannes Reinecke
In-Reply-To: <20160822042321.24367-1-shaun@tancheff.com>

On 08/22/2016 06:23 AM, Shaun Tancheff wrote:
> At some point the method of issuing Write Same for ATA drives changed.
> Currently write same is commonly available via SCT so expose the SCT
> capabilities and use SCT Write Same when it is available.
> 
> This is useful for zoned based media that prefers to support discard
> with lbprz set, aka discard zeroes data by mapping discard operations to
> reset write pointer operations. Conventional zones that do not support
> reset write pointer can still honor the discard zeroes data by issuing
> a write same over the zone.
> 
> It may also be nice to know if various controllers that currently
> disable WRITE SAME will work with the SCT Write Same code path:
>   aacraid, arcmsr, megaraid, 3w-9xxx, 3w-sas, 3w-xxxx, gdth, hpsa, ips,
>   megaraid, pmcraid, storvsc_drv
> 
> This patch against v4.8-rc2 is also at
>     https://github.com/stancheff/linux/tree/v4.8-rc2%2Bbiof.v9
> 
>     git@github.com:stancheff/linux.git v4.8-rc2+biof.v9
> 
> v6:
>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
>  - Added support for a sector_size descriptor other than 512 bytes.
> v5:
>  - Addressed review comments
>  - Report support for ZBC only for zoned devices.
>  - kmap page during rewrite
>  - Fix unmap set to require trim or error, if not unmap then sct write
>    same or error.
> v4:
>  - Added partial MAINTENANCE_IN opcode simulation
>  - Dropped all changes in drivers/scsi/*
>  - Changed to honor the UNMAP flag -> TRIM, no UNMAP -> SCT.
> v3:
>  - Demux UNMAP/TRIM from WRITE SAME
> v2:
>  - Remove fugly ata hacking from sd.c
> 
> Shaun Tancheff (4):
>   libata: Safely overwrite attached page in WRITE SAME xlat
>   Add support for SCT Write Same
>   SCT Write Same / DSM Trim
>   SCT Write Same handle ATA_DFLAG_PIO
> 
>  drivers/ata/libata-scsi.c | 280 +++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/ata.h       |  69 +++++++-----
>  2 files changed, 292 insertions(+), 57 deletions(-)
> 
Thanks for doing this.
It has been on my To-Do list for a long time, and it's good to see the
UNMAP/TRIM SATL cleaned up finally.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH v6 4/4] SCT Write Same handle ATA_DFLAG_PIO
From: Hannes Reinecke @ 2016-08-22  6:31 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Tom Yan, Martin K . Petersen,
	Damien Le Moal, Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-5-shaun@tancheff.com>

On 08/22/2016 06:23 AM, Shaun Tancheff wrote:
> Use non DMA write log when ATA_DFLAG_PIO is set.
> 
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6: Added check for ATA_DFLAG_PIO and fallback to non DMA write log for
>     SCT Write Same
> 
>  drivers/ata/libata-scsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 37f456e..e50b7a7 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3485,6 +3485,8 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  		tf->device = ATA_CMD_STANDBYNOW1;
>  		tf->protocol = ATA_PROT_DMA;
>  		tf->command = ATA_CMD_WRITE_LOG_DMA_EXT;
> +		if (unlikely(dev->flags & ATA_DFLAG_PIO))
> +			tf->command = ATA_CMD_WRITE_LOG_EXT;
>  	}
>  
>  	tf->flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE | ATA_TFLAG_LBA48 |
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH v6 3/4] SCT Write Same / DSM Trim
From: Hannes Reinecke @ 2016-08-22  6:30 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Tom Yan, Martin K . Petersen,
	Damien Le Moal, Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-4-shaun@tancheff.com>

On 08/22/2016 06:23 AM, Shaun Tancheff wrote:
> Correct handling of devices with sector_size other that 512 bytes.
> 
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> In the case of a 4Kn device sector_size it is possible to describe a much
> larger DSM Trim than the current fixed default of 512 bytes.
> 
> This patch assumes the minimum descriptor is sector_size and fills out
> the descriptor accordingly.
> 
> The ACS-2 specification is quite clear that the DSM command payload is
> sized as number of 512 byte transfers so a 4Kn device will operate
> correctly without this patch.
> 
Can you please reshuffle the description to have the 'Signed-off-by'
line following the entire description?
This makes things easier to read and avoid the last part of the
description being killed by overzealous patch programs.
THX.

> v5:
>  - Added support for a sector_size descriptor other than 512 bytes.
> 
>  drivers/ata/libata-scsi.c | 85 +++++++++++++++++++++++++++++++----------------
>  1 file changed, 57 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index ebf1a04..37f456e 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3283,7 +3283,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>  /**
>   * ata_format_dsm_trim_descr() - SATL Write Same to DSM Trim
>   * @cmd: SCSI command being translated
> - * @num: Maximum number of entries (nominally 64).
> + * @trmax: Maximum number of entries that will fit in sector_size bytes.
>   * @sector: Starting sector
>   * @count: Total Range of request in logical sectors
>   *
> @@ -3298,63 +3298,80 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
>   *  LBA's should be sorted order and not overlap.
>   *
>   * NOTE: this is the same format as ADD LBA(S) TO NV CACHE PINNED SET
> + *
> + * Return: Number of bytes copied into sglist.
>   */
> -static unsigned int ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 num,
> -					      u64 sector, u32 count)
> +static size_t ata_format_dsm_trim_descr(struct scsi_cmnd *cmd, u32 trmax,
> +					u64 sector, u32 count)
>  {
> -	__le64 *buffer;
> -	u32 i = 0, used_bytes;
> +	struct scsi_device *sdp = cmd->device;
> +	size_t len = sdp->sector_size;
> +	size_t r;
> +	__le64 *buf;
> +	u32 i = 0;
>  	unsigned long flags;
>  
> -	BUILD_BUG_ON(512 > ATA_SCSI_RBUF_SIZE);
> +	WARN_ON(len > ATA_SCSI_RBUF_SIZE);
> +
> +	if (len > ATA_SCSI_RBUF_SIZE)
> +		len = ATA_SCSI_RBUF_SIZE;
>  
>  	spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> -	buffer = ((void *)ata_scsi_rbuf);
> -	while (i < num) {
> +	buf = ((void *)ata_scsi_rbuf);
> +	memset(buf, 0, len);
> +	while (i < trmax) {
>  		u64 entry = sector |
>  			((u64)(count > 0xffff ? 0xffff : count) << 48);
> -		buffer[i++] = __cpu_to_le64(entry);
> +		buf[i++] = __cpu_to_le64(entry);
>  		if (count <= 0xffff)
>  			break;
>  		count -= 0xffff;
>  		sector += 0xffff;
>  	}
> -
> -	used_bytes = ALIGN(i * 8, 512);
> -	memset(buffer + i, 0, used_bytes - i * 8);
> -	sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buffer, 512);
> +	r = sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buf, len);
>  	spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
>  
> -	return used_bytes;
> +	return r;
>  }
>  
>  /**
>   * ata_format_dsm_trim_descr() - SATL Write Same to ATA SCT Write Same
>   * @cmd: SCSI command being translated
>   * @lba: Starting sector
> - * @num: Number of logical sectors to be zero'd.
> + * @num: Number of sectors to be zero'd.
>   *
> - * Rewrite the WRITE SAME descriptor to be an SCT Write Same formatted
> + * Rewrite the WRITE SAME payload to be an SCT Write Same formatted
>   * descriptor.
>   * NOTE: Writes a pattern (0's) in the foreground.
> - *       Large write-same requents can timeout.
> + *
> + * Return: Number of bytes copied into sglist.
>   */
> -static void ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
> +static size_t ata_format_sct_write_same(struct scsi_cmnd *cmd, u64 lba, u64 num)
>  {
> -	u16 *sctpg;
> +	struct scsi_device *sdp = cmd->device;
> +	size_t len = sdp->sector_size;
> +	size_t r;
> +	u16 *buf;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
> -	sctpg = ((void *)ata_scsi_rbuf);
> +	buf = ((void *)ata_scsi_rbuf);
> +
> +	put_unaligned_le16(0x0002,  &buf[0]); /* SCT_ACT_WRITE_SAME */
> +	put_unaligned_le16(0x0101,  &buf[1]); /* WRITE PTRN FG */
> +	put_unaligned_le64(lba,     &buf[2]);
> +	put_unaligned_le64(num,     &buf[6]);
> +	put_unaligned_le32(0u,      &buf[10]); /* pattern */
> +
> +	WARN_ON(len > ATA_SCSI_RBUF_SIZE);
>  
> -	put_unaligned_le16(0x0002,  &sctpg[0]); /* SCT_ACT_WRITE_SAME */
> -	put_unaligned_le16(0x0101,  &sctpg[1]); /* WRITE PTRN FG */
> -	put_unaligned_le64(lba,     &sctpg[2]);
> -	put_unaligned_le64(num,     &sctpg[6]);
> -	put_unaligned_le32(0u,      &sctpg[10]);
> +	if (len > ATA_SCSI_RBUF_SIZE)
> +		len = ATA_SCSI_RBUF_SIZE;
>  
> -	sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), sctpg, 512);
> +	r = sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd), buf, len);
>  	spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
> +
> +	return r;
>  }
>  
>  /**
> @@ -3371,11 +3388,13 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  {
>  	struct ata_taskfile *tf = &qc->tf;
>  	struct scsi_cmnd *scmd = qc->scsicmd;
> +	struct scsi_device *sdp = scmd->device;
> +	size_t len = sdp->sector_size;
>  	struct ata_device *dev = qc->dev;
>  	const u8 *cdb = scmd->cmnd;
>  	u64 block;
>  	u32 n_block;
> -	const u32 trmax = ATA_MAX_TRIM_RNUM;
> +	const u32 trmax = len >> 3;
>  	u32 size;
>  	u16 fp;
>  	u8 bp = 0xff;
> @@ -3420,8 +3439,16 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  	if (!scsi_sg_count(scmd))
>  		goto invalid_param_len;
>  
> +	/*
> +	 * size must match sector size in bytes
> +	 * For DATA SET MANAGEMENT TRIM in ACS-2 nsect (aka count)
> +	 * is defined as number of 512 byte blocks to be transferred.
> +	 */
>  	if (unmap) {
>  		size = ata_format_dsm_trim_descr(scmd, trmax, block, n_block);
> +		if (size != len)
> +			goto invalid_param_len;
> +
>  		if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
>  			/* Newer devices support queued TRIM commands */
>  			tf->protocol = ATA_PROT_NCQ;
> @@ -3441,7 +3468,9 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>  			tf->command = ATA_CMD_DSM;
>  		}
>  	} else {
> -		ata_format_sct_write_same(scmd, block, n_block);
> +		size = ata_format_sct_write_same(scmd, block, n_block);
> +		if (size != len)
> +			goto invalid_param_len;
>  
>  		tf->hob_feature = 0;
>  		tf->feature = 0;
> 
Why is this not folded into the previous patches?
This mostly patches code which you've already modified and/or initiated,
right?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH v6 1/4] libata: Safely overwrite attached page in WRITE SAME xlat
From: Hannes Reinecke @ 2016-08-22  6:27 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Tom Yan, Martin K . Petersen,
	Damien Le Moal, Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-2-shaun@tancheff.com>

On 08/22/2016 06:23 AM, Shaun Tancheff wrote:
> Safely overwriting the attached page to ATA format from the SCSI formatted
> variant.
> 
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6:
>  - Fix bisect bug reported by Tom Yan <tom.ty89@gmail.com>
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
> v5:
>  - Added prep patch to work with non-page aligned scatterlist pages
>    and use kmap_atomic() to lock page during modification.
> 
>  drivers/ata/libata-scsi.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
>  include/linux/ata.h       | 26 ----------------------
>  2 files changed, 51 insertions(+), 31 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH v6 2/4] Add support for SCT Write Same
From: Hannes Reinecke @ 2016-08-22  6:27 UTC (permalink / raw)
  To: Shaun Tancheff, linux-ide, linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Tom Yan, Martin K . Petersen,
	Damien Le Moal, Josh Bingaman, Hannes Reinecke, Shaun Tancheff
In-Reply-To: <20160822042321.24367-3-shaun@tancheff.com>

On 08/22/2016 06:23 AM, Shaun Tancheff wrote:
> SATA drives may support write same via SCT. This is useful
> for setting the drive contents to a specific pattern (0's).
> 
> Translate a SCSI WRITE SAME 16 command to be either a DSM TRIM
> command or an SCT Write Same command.
> 
> Based on the UNMAP flag:
>   - When set translate to DSM TRIM
>   - When not set translate to SCT Write Same
> 
> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
> ---
> v6:
>  - Change to use sg_copy_from_buffer as per Christoph Hellwig <hch@lst.de>
> v5:
>  - Addressed review comments
>  - Report support for ZBC only for zoned devices.
>  - kmap page during rewrite
>  - Fix unmap set to require trim or error, if not unmap then sct write
>    same or error.
> v4:
>  - Added partial MAINTENANCE_IN opcode simulation
>  - Dropped all changes in drivers/scsi/*
>  - Changed to honor the UNMAP flag -> TRIM, no UNMAP -> SCT.
> v3:
>  - Demux UNMAP/TRIM from WRITE SAME
> v2:
>  - Remove fugly ata hacking from sd.c
> 
>  drivers/ata/libata-scsi.c | 199 +++++++++++++++++++++++++++++++++++++++-------
>  include/linux/ata.h       |  43 ++++++++++
>  2 files changed, 213 insertions(+), 29 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox