From: Boaz Harrosh <bharrosh@panasas.com>
To: Matthew Wilcox <matthew@wil.cx>
Cc: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
Tejun Heo <tj@kernel.org>
Subject: Re: Getting TRIM working
Date: Sun, 08 Mar 2009 12:28:27 +0200
Message-ID: <49B39DCB.3040203@panasas.com>
In-Reply-To: <20090306191620.GA25995@parisc-linux.org>
Matthew Wilcox wrote:
> On Wed, Mar 04, 2009 at 11:20:27AM +0200, Boaz Harrosh wrote:
>> Matthew Wilcox wrote:
>>> size = ALIGN(i * 8, 512);
>>> memset(buffer + i * 8, 0, size - i * 8);
>>> old_size = bio_iovec(bio)->bv_len;
>>> printk("before: bi_size %d, data_len %d, bv_len %d\n", bio->bi_size,
>>> req->data_len, old_size);
>>> if (size > old_size) {
>>> bio_add_pc_page(req->q, bio, bio_page(bio),
>>> size - old_size, old_size);
>>> req->data_len = size;
>>> }
>>> printk("after: bi_size %d, data_len %d, bv_len %d\n", bio->bi_size,
>>> req->data_len, bio_iovec(bio)->bv_len);
>>>
>>> Now req->data_len, bio->bi_size and bio_iovec(bio)->bv_len are all 512.
>>> Yet the AHCI driver still spits out 24 bytes and then stops (which hangs
>>> the drive). What am I missing?
>> What about the length embedded in the CDB, which is usually derived from
>> scsi_bufflen(), or other places that look at scsi_bufflen() and not at
>> the request and its bios? The latter might be bigger than scsi's in split
>> commands, but the drivers should only consume scsi_bufflen() bytes.
>
> A fine idea, completely true ... I fixed it like this:
>
> + old_size = bio_iovec(bio)->bv_len;
> +printk("before: bi_size %d, data_len %d, bv_len %d sdb length %d\n",
> + bio->bi_size, req->data_len, old_size, scmd->sdb.length);
> + if (size > old_size) {
> + bio_add_pc_page(req->q, bio, bio_page(bio),
> + size - old_size, old_size);
> + }
> + scmd->sdb.length = req->data_len = size;
> +printk("after: bi_size %d, data_len %d, bv_len %d sdb length %d\n",
> + bio->bi_size, req->data_len, bio_iovec(bio)->bv_len,
> + scmd->sdb.length);
>
> and it showed sdb.length being 24 before, and 512 after.
>
> And the damn thing still spit out 24 bytes onto the bus and stopped.
>
> To prove where the bug is, I lied to SCSI. I changed this:
>
> - if (bio_add_pc_page(q, bio, page, 24, 0) < 24) {
> + if (bio_add_pc_page(q, bio, page, 512, 0) < 512) {
>
> and we spat out a 512 byte sector to the disc, which accepted it and
> erased the trimmed sector. Yay.
>
> So we can go back to looking for a *fifth* place where we store the
> length of the data we're transferring. I'm not convinced this says good
> things about our storage stack that we have so many places where we
> store the length. There's more than this of course, because there's
> ATA's qc->nbytes, and tf->nsect+hob_nsect, but I already set those
> correctly.
>
That's because you are doing it at the wrong level, at the wrong stage:

1. The block level submits a request.
2. sd/sr, or whatever ULD, prepares a scsi_cmnd out of the request.
   The request's sizes are only a recommendation. The ULD or scsi-ml may
   prepare a smaller command than the request. Once the command is prepared
   the request is disregarded; you can bang on it all you want, the code
   will not care about it one bit.
3. The LLD executes the scsi-command (not the block request).
4. scsi-ml completes the command's bytes. At this stage the request might
   not be over, and a remainder is re-prepared so the request can
   be completed.

The code above, scmd->sdb.length = req->data_len = size;, is not allowed
and can cause data leaks.
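
To make the four stages concrete, here is a toy userspace model, not
kernel code. All of the toy_* names are made up for illustration and the
structs only stand in for struct request and struct scsi_cmnd. It just
shows that the size is snapshotted at stage 2, so later changes to the
request are invisible to the LLD:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct request / struct scsi_cmnd. */
struct toy_request { size_t data_len; };
struct toy_cmnd    { size_t bufflen;  };

/* Stage 2: the ULD snapshots the size it decides to transfer.
 * It may be smaller than what the request asked for. */
static struct toy_cmnd toy_prepare(const struct toy_request *rq,
                                   size_t max_xfer)
{
	struct toy_cmnd cmd;

	cmd.bufflen = rq->data_len < max_xfer ? rq->data_len : max_xfer;
	return cmd;
}

/* Stage 3: the LLD looks only at the command, never at the request. */
static size_t toy_lld_execute(const struct toy_cmnd *cmd)
{
	return cmd->bufflen;
}

/* Stage 4: account for the completed bytes and return the remainder
 * that would have to be re-prepared as a new command. */
static size_t toy_complete(struct toy_request *rq, size_t done)
{
	assert(done <= rq->data_len);
	rq->data_len -= done;
	return rq->data_len;
}
```

If you prepare a command from a 24-byte request and only afterwards bump
the request to 512, toy_lld_execute() still reports 24. That is the
behaviour you were seeing.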

You should ping Tejun: the block layer (1) and the ATA LLD (3) have a way to
communicate alignments and drain buffers that expose some other possible
lengths to ata.

And to your question, the missing length above is probably encoded inside the
submitted CDB (scsi_cmnd->cmnd). When you change the length before
stage (2) it works.
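
A rough sketch of what I mean by a length hiding in the CDB. This is a
toy, and the layout is an assumption: a 10-byte CDB with a 16-bit
big-endian length in bytes 7 and 8, in the style of WRITE(10). The
command in your patch may encode it elsewhere. The toy_* names are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Toy command: a CDB plus a separate buffer length, loosely mirroring
 * scsi_cmnd->cmnd and scsi_cmnd->sdb.length. Not the kernel structs. */
struct toy_cdb_cmnd {
	uint8_t  cdb[10];
	uint32_t sdb_length;
};

/* Built once at prepare time: the length goes into BOTH places. */
static void toy_build(struct toy_cdb_cmnd *c, uint16_t len)
{
	for (int i = 0; i < 10; i++)
		c->cdb[i] = 0;
	c->cdb[7] = (uint8_t)(len >> 8);   /* assumed length field, MSB */
	c->cdb[8] = (uint8_t)(len & 0xff); /* assumed length field, LSB */
	c->sdb_length = len;
}

/* What a consumer that decodes the CDB would see. */
static uint16_t toy_cdb_length(const struct toy_cdb_cmnd *c)
{
	return (uint16_t)((c->cdb[7] << 8) | c->cdb[8]);
}
```

Patching sdb_length to 512 after toy_build(&c, 24) leaves
toy_cdb_length(&c) at 24. Only rebuilding the command with the new
length changes both copies.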

I think you should be using the drain mechanisms built into ata.

Boaz