From: Douglas Gilbert <dgilbert@interlog.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	linux-scsi@vger.kernel.org, James.Bottomley@suse.de, hch@lst.de,
	axboe@kernel.dk
Subject: Re: scsi: convert discard to REQ_TYPE_FS instead of REQ_TYPE_BLOCK_PC
Date: Tue, 06 Jul 2010 23:35:53 -0400
Message-ID: <4C33F619.4010302@interlog.com>
In-Reply-To: <yq1ocek6pf4.fsf@sermon.lab.mkp.net>

On 10-07-06 09:39 PM, Martin K. Petersen wrote:
>>>>>> "Mike" == Mike Snitzer <snitzer@redhat.com> writes:
>
>>> That is 0x7fffff (over 8 million) blocks (4 GB) being unmapped in one
>>> operation! That may exceed the "maximum unmap lba count" field in the
>>> Block Limits VPD page.  The latest SBC draft (sbc3r22.pdf) says that
>>> field applies to the SCSI UNMAP command and does not mention the
>>> WRITE SAME (16) command but that is probably an oversight.
>
> Maximum Unmap LBA Count > 0 (in combination with the descriptor count)
> are what indicate that the device server supports UNMAP.

That has been superseded by the TPU and TPWS bits
in the Thin Provisioning VPD page (B2h) in sbc3r22.
TPU and TPWS indicate support for the UNMAP and WRITE
SAME (16) with UNMAP bit ** commands, respectively.
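
As a rough illustration of the bit test described above, here is a
minimal Python sketch. The byte layout is an assumption, not something
stated in this thread: it takes the page code to be in byte 1 and TPU
and TPWS to be bits 7 and 6 of byte 5 of the B2h page, which should be
checked against sbc3r22 before relying on it. The function name is
hypothetical.

```python
def thin_provisioning_support(vpd_b2):
    """Report the TPU and TPWS bits from a raw Thin Provisioning VPD page.

    ASSUMPTION: page code is byte 1; TPU is bit 7 and TPWS is bit 6 of
    byte 5. Verify these offsets against the SBC-3 draft.
    """
    if len(vpd_b2) < 6:
        raise ValueError("VPD page too short")
    if vpd_b2[1] != 0xB2:
        raise ValueError("not the Thin Provisioning VPD page (B2h)")
    flags = vpd_b2[5]
    return {
        "tpu": bool(flags & 0x80),   # UNMAP command supported
        "tpws": bool(flags & 0x40),  # WRITE SAME(16) with UNMAP bit supported
    }
```

The input would be the response to an INQUIRY with EVPD set and page
code B2h; how that buffer is fetched is left out here.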

> You could argue, then, that a Maximum Unmap LBA Count > 0 but a Maximum
> Unmap Descriptor Count of 0 would provide means to indicate the maximum
> range for WRITE SAME.  But the T10 people I have talked to all agree
> that the LBA count for WRITE SAME is gated by the command's LBA count
> and nothing else.  So no special casing for when the UNMAP bit is set.
> I.e. the max for WRITE SAME(16) is a 32-bit block count times
> logical_block_size.

I think sbc3r22 is just flaky in that area and will
be cleaned up soon. As the words stand now, in the
Block Limits VPD page "maximum unmap LBA count" only
applies to the UNMAP command while "optimal unmap
granularity" applies to both the UNMAP command and
the WRITE SAME(16) command. That is inconsistent.
And "maximum unmap LBA count"==0 implying no UNMAP
command is pointless given the TPU bit.

> Mike> # cat /sys/block/sda/queue/discard_granularity
> Mike> 512
> Mike> # cat /sys/block/sda/queue/discard_max_bytes
> Mike> 4294966784
>
> Mike> I'll look to understand why 'discard_max_bytes' is so large for
> Mike> this LUN despite the standard Block Limits VPD page not reflecting
> Mike> this.
>
> discard_max_bytes is 0xFFFFFFFF for WRITE SAME(16).
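
The numbers quoted above can be tied together with a little arithmetic.
The sketch below assumes the reported discard_max_bytes is 0xFFFFFFFF
rounded down to a multiple of the 512-byte logical block size; that
rounding step is an inference from the values in this thread, not a
statement about what the kernel code does. The names are made up for
the illustration.

```python
LOGICAL_BLOCK_SIZE = 512   # bytes, matching discard_granularity above
U32_MAX = 0xFFFFFFFF       # the WRITE SAME(16) figure quoted above


def round_down(value, multiple):
    """Round value down to the nearest multiple of `multiple`."""
    return value - (value % multiple)


# ASSUMPTION: the sysfs value is U32_MAX rounded down to a whole block.
discard_max_bytes = round_down(U32_MAX, LOGICAL_BLOCK_SIZE)
blocks_per_request = discard_max_bytes // LOGICAL_BLOCK_SIZE

print(discard_max_bytes)        # 4294966784
print(hex(blocks_per_request))  # 0x7fffff
```

That is the same 0x7fffff block count (just under 4 GiB at 512 bytes
per block) mentioned earlier in the thread.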

FORMAT UNIT has several associated mechanisms (e.g.
the IMMED bit and REQUEST SENSE polling) that let it
run for a long time. WRITE SAME has no such mechanisms.
There was a proposal put to T10 to place an upper limit
on WRITE SAME's LBA count but I think that has been
dropped. IMO if we want to give large block counts to
UNMAP or WRITE SAME in the absence of guidance from the
Block Limits VPD page, then we need to cope with the
device saying "nope".

Whatever device Mike has, it seems to be failing the
WRITE SAME(16) command due to the huge LBA block count.
Does the device work with a smaller LBA block count?
For example:
    sg_write_same --unmap --lba 0 --num 1024 /dev/sda
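
One way to automate that experiment is to start from the large count
and halve it until the device accepts the command. The Python sketch
below is only an illustration under assumptions: the helper names are
hypothetical, it uses only the sg_write_same options shown above, it
treats a zero exit status as success, and it finds a count that works
rather than the exact maximum. It is DESTRUCTIVE: every attempt that
succeeds unmaps real blocks starting at the given LBA.

```python
import subprocess


def find_working_count(try_count, start, floor=1):
    """Halve the block count from `start` until try_count(num) succeeds.

    Returns the first count that was accepted, or None if every count
    down to `floor` was rejected.
    """
    if floor < 1:
        raise ValueError("floor must be at least 1")
    num = start
    while num >= floor:
        if try_count(num):
            return num
        num //= 2
    return None


def unmap_via_write_same(device, lba, num):
    """Issue one WRITE SAME with the UNMAP bit through sg_write_same.

    DESTRUCTIVE: unmaps `num` blocks at `lba` on `device`.
    ASSUMPTION: exit status 0 means the device accepted the command.
    """
    result = subprocess.run(
        ["sg_write_same", "--unmap", "--lba", str(lba),
         "--num", str(num), device],
        capture_output=True,
    )
    return result.returncode == 0
```

On a scratch device it could be driven as
find_working_count(lambda n: unmap_via_write_same("/dev/sda", 0, n), 0x7fffff).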



** WRITE SAME (32) also has an UNMAP bit.

Doug Gilbert

