public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Tom Yan <tom.ty89@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org
Subject: Re: configurable discard parameters
Date: Tue, 23 Jun 2015 22:55:57 -0400	[thread overview]
Message-ID: <yq1616d4xxe.fsf@sermon.lab.mkp.net> (raw)
In-Reply-To: <CAGnHSEnJ=_NV8Ed_=Ykr+30C+YSLFDPzyzeSY9b4p6pyaNOE4A@mail.gmail.com> (Tom Yan's message of "Wed, 24 Jun 2015 05:25:20 +0800")

>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:

Tom> total untrimmed sectors: (310635 - 1 - 2410) / 32 = 9632 total
Tom> ranges: (43 GiB * (1024 ^ 3) / 512) / 65535 ~= 1376 average
Tom> untrimmed sectors per range: (9632 / 1376) = 7 = (65535 % 8)

Every type of drive has its own internal restrictions. Unless the drive
guarantees zero after TRIM it is perfectly normal for heads, tails or
random pieces in-between to be left untouched.

I am surprised that the common case of contiguous range entries was not
handled properly by your drive. Most of them deal with that just fine
(regardless of their internal granularity and alignment constraints). It
is normally only the partial block tracking between command invocations
that causes grief.

In any case. Unless discard_zeroes_data is 1 (and that requires
guarantees above and beyond what can be discovered via the ATA Command
Set) all bets are off wrt. dependable behavior.

The code below is an untested proof of concept to see what it would take
to alleviate your particular issue. I am, however, not at all convinced
that introducing such a change is worth the risk. I know of at least a
couple of devices that would blow up as a result of this patch...

-- 
Martin K. Petersen	Oracle Linux Engineering

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 3131adcc1f87..9c270a303f0b 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2238,8 +2238,24 @@ static unsigned int ata_scsiop_inq_b0(struct ata_scsi_args *args, u8 *rbuf)
 	 * with the unmap bit set.
 	 */
 	if (ata_id_has_trim(args->id)) {
-		put_unaligned_be64(65535 * 512 / 8, &rbuf[36]);
-		put_unaligned_be32(1, &rbuf[28]);
+		/*
+		 * If the drive supports reliable zero after trim we set
+		 * the granularity to 1 and the blocks per range to
+		 * 0xffff. Otherwise we set set the granularity to 8 and
+		 * restrict the blocks per range to 0xfff8. This is done
+		 * to accommodate older drives which prefer 4K-alignment.
+		 */
+
+		if (ata_id_has_zero_after_trim(args->id) &&
+		    args->dev->horkage & ATA_HORKAGE_ZERO_AFTER_TRIM) {
+			put_unaligned_be64(ATA_MAX_TRIM_RANGE * 512 / 8,
+					   &rbuf[36]);
+			put_unaligned_be32(1, &rbuf[28]);
+		} else {
+			put_unaligned_be64(ATA_ALIGNED_TRIM_RANGE * 512 / 8,
+					   &rbuf[36]);
+			put_unaligned_be32(8, &rbuf[28]);
+		}
 	}
 
 	return 0;
@@ -3168,7 +3184,14 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
 		goto invalid_fld;
 
 	buf = page_address(sg_page(scsi_sglist(scmd)));
-	size = ata_set_lba_range_entries(buf, 512, block, n_block);
+
+	if (ata_id_has_zero_after_trim(dev->id) &&
+	    dev->horkage & ATA_HORKAGE_ZERO_AFTER_TRIM)
+		size = ata_set_lba_range_entries(buf, 512, block, n_block,
+						 ATA_MAX_TRIM_RANGE);
+	else
+		size = ata_set_lba_range_entries(buf, 512, block, n_block,
+						 ATA_ALIGNED_TRIM_RANGE);
 
 	if (ata_ncq_enabled(dev) && ata_fpdma_dsm_supported(dev)) {
 		/* Newer devices support queued TRIM commands */
diff --git a/include/linux/ata.h b/include/linux/ata.h
index fed36418dd1c..8a17fa22cdbe 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -47,6 +47,8 @@ enum {
 	ATA_MAX_SECTORS		= 256,
 	ATA_MAX_SECTORS_LBA48	= 65535,/* TODO: 65536? */
 	ATA_MAX_SECTORS_TAPE	= 65535,
+	ATA_MAX_TRIM_RANGE	= 0xffff,
+	ATA_ALIGNED_TRIM_RANGE	= 0xfff8,
 
 	ATA_ID_WORDS		= 256,
 	ATA_ID_CONFIG		= 0,
@@ -1012,19 +1014,20 @@ static inline void ata_id_to_hd_driveid(u16 *id)
  * TO NV CACHE PINNED SET.
  */
 static inline unsigned ata_set_lba_range_entries(void *_buffer,
-		unsigned buf_size, u64 sector, unsigned long count)
+		unsigned buf_size, u64 sector, unsigned long count,
+		u16 bpe)
 {
 	__le64 *buffer = _buffer;
 	unsigned i = 0, used_bytes;
 
 	while (i < buf_size / 8 ) { /* 6-byte LBA + 2-byte range per entry */
 		u64 entry = sector |
-			((u64)(count > 0xffff ? 0xffff : count) << 48);
+			((u64)(count > bpe ? bpe : count) << 48);
 		buffer[i++] = __cpu_to_le64(entry);
-		if (count <= 0xffff)
+		if (count <= bpe)
 			break;
-		count -= 0xffff;
-		sector += 0xffff;
+		count -= bpe;
+		sector += bpe;
 	}
 
 	used_bytes = ALIGN(i * 8, 512);

  reply	other threads:[~2015-06-24  2:55 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-20 17:12 configurable discard parameters Tom Yan
2015-06-21  0:20 ` Martin K. Petersen
2015-06-21  7:03   ` Tom Yan
2015-06-21  8:05     ` Tom Yan
2015-06-21 12:36       ` Tom Yan
2015-06-21 20:30         ` Tom Yan
2015-06-22 20:57           ` Martin K. Petersen
2015-06-23 14:16             ` Tom Yan
2015-06-23 15:36               ` Martin K. Petersen
2015-06-23 16:41                 ` Tom Yan
2015-06-23 17:03                   ` Martin K. Petersen
2015-06-23 17:24                     ` Tom Yan
2015-06-23 18:26                       ` Martin K. Petersen
2015-06-23 21:25                         ` Tom Yan
2015-06-24  2:55                           ` Martin K. Petersen [this message]
2015-06-24 12:46                             ` Tom Yan
2015-06-25  1:15                               ` Martin K. Petersen
2015-06-26  7:05                                 ` Tom Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yq1616d4xxe.fsf@sermon.lab.mkp.net \
    --to=martin.petersen@oracle.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=tom.ty89@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox