From: John Snow <jsnow@redhat.com>
To: Anton Nefedov <anton.nefedov@virtuozzo.com>,
Alberto Garcia <berto@igalia.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, den@virtuozzo.com, armbru@redhat.com,
mreitz@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 2/7] ide: account UNMAP (TRIM) operations
Date: Wed, 6 Dec 2017 17:09:46 -0500 [thread overview]
Message-ID: <beeaf694-b8d2-bac7-a94d-da7f57cc1748@redhat.com> (raw)
In-Reply-To: <1a1334cc-8fae-03bc-b0be-70ee4b8c4ac0@virtuozzo.com>
On 12/05/2017 12:14 PM, Anton Nefedov wrote:
>
>
> On 5/12/2017 6:21 PM, Alberto Garcia wrote:
>> On Mon 20 Nov 2017 05:50:59 PM CET, Anton Nefedov wrote:
>>> Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>> hw/ide/core.c | 12 ++++++++++++
>>> 1 file changed, 12 insertions(+)
>>>
>>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>>> index 471d0c9..2e4dea7 100644
>>> --- a/hw/ide/core.c
>>> +++ b/hw/ide/core.c
>>> @@ -389,6 +389,7 @@ typedef struct TrimAIOCB {
>>> QEMUIOVector *qiov;
>>> BlockAIOCB *aiocb;
>>> int i, j;
>>> + BlockAcctCookie acct;
>>> } TrimAIOCB;
>>> static void trim_aio_cancel(BlockAIOCB *acb)
>>> @@ -426,6 +427,14 @@ static void ide_trim_bh_cb(void *opaque)
>>> static void ide_issue_trim_cb(void *opaque, int ret)
>>> {
>>> TrimAIOCB *iocb = opaque;
>>> + if (iocb->i >= 0) {
>>> + if (ret >= 0) {
>>> + block_acct_done(blk_get_stats(iocb->blk), &iocb->acct);
>>> + } else {
>>> + block_acct_failed(blk_get_stats(iocb->blk), &iocb->acct);
>>> + }
>>> + }
>>
>> This part looks fine, but don't you also need to account for invalid
>> requests (in ide_dma_cb() or somewhere else) ?
>>
not ide_dma_cb this time, because the command does not use ATA registers
as input, see below :(
>> Berto
>>
>
> Good point; in fact, the TRIM sector range is never checked.
> (well it should be, down at the block layer, and then counted as error).
>
> The motivation was:
>
> commit d66168ed687325aa6d338ce3a3cff18ce3098ed6
> Author: Michael Tokarev <mjt@tls.msk.ru>
> Date: Wed Aug 13 11:23:31 2014 +0400
>
> ide: only constrain read/write requests to drive size, not other types
>
> Commit 58ac321135a introduced a check to ide dma processing which
> constrains all requests to drive size. However, apparently, some
> valid requests (like TRIM) does not fit in this constraint, and
> fails in 2.1. So check the range only for reads and writes.
>
I wound up at the same commit. The problem here is that the TRIM command
does not issue contiguous LBA+count requests in the same way using the
ATA registers like DMA R/W functions do, but instead works a bit more
like DMA commands in that it transmits a list of regions separately.
Kevin pointed this out in 2014:
https://lists.gnu.org/archive/html/qemu-devel/2014-08/msg02012.html
"I can't give you a clear answer, but it all depends on the value of
sector_num. This is the contents of the LBA registers and unused for
TRIM (spec says it's reserved, so we shouldn't be looking at it)."
He's right! This means the LBA/Count registers are undefined when we're
using TRIM in ide_dma_cb.
So, you have two things you can do:
(1) Modify ide_sector_start_dma to start accounting for TRIM early, and
then continue to mark it failed/complete where you do, but you'll need
to add in a new error case in ide_issue_trim_cb to check the range and
abort the job. We do not need to preprocess the entire "list" of TRIM
regions upfront as we are within our rights to process some of them
before aborting.
(2) Continue to start accounting where you do, per-region, which keeps
trim requests "per chunk", but add the error range checking almost
immediately after, if this is useful for statistical purposes. It will
look a little funny to start accounting and then immediately and
synchronously mark it invalid, but that's probably the most accurate thing.
I'm basing this off of the ATA8-AC3 spec, which defines the command
"DATA SET MANAGEMENT - 06h, DMA":
> If the Trim bit is set to one and:
> a) the device detects an invalid LBA Range Entry; or
> b) count is greater than IDENTIFY DEVICE data word 105 (see 7.16.7.55),
> then the device shall return command aborted.
> A device may trim one or more LBA Range Entries before it returns
command aborted. See table 209.
ATA8-ACS3 section seems pretty clear to me that we *may* abort the
command if we detect an "invalid LBA Range Entry" which is defined in
3.1.40 as:
"3.1.40 Invalid LBA Range: A range of LBAs that contains one or more
invalid LBAs."
which references 3.1.39:
3.1.39 Invalid LBA: An LBA that is greater than or equal to the largest
value reported in IDENTIFY
DEVICE data words 60..61 (see 7.16.7.22), IDENTIFY DEVICE data words
100..103
(see 7.16.7.53), or IDENTIFY DEVICE data words 230..233 (see 7.16.7.88).
Of course, we only know if a range could possibly be invalid by the time
we actually process it in ide_issue_trim_cb.
>
> It seems like the removed check was at the wrong place (trim request has
> to be parsed first to get offset and nbytes).
> Probably it should be put to ide_issue_trim_cb() instead.
>
> cc John (should have done it earlier)
>
Hi!
> /Anton
>
next prev parent reply other threads:[~2017-12-06 22:09 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-20 16:50 [Qemu-devel] [PATCH 0/7] discard blockstats Anton Nefedov
2017-11-20 16:50 ` [Qemu-devel] [PATCH 1/7] qapi: add unmap to BlockDeviceStats Anton Nefedov
2017-12-05 15:09 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-12-05 15:19 ` Eric Blake
2017-12-05 17:15 ` Anton Nefedov
2017-11-20 16:50 ` [Qemu-devel] [PATCH 2/7] ide: account UNMAP (TRIM) operations Anton Nefedov
2017-12-05 15:21 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-12-05 17:14 ` Anton Nefedov
2017-12-06 22:09 ` John Snow [this message]
2017-11-20 16:51 ` [Qemu-devel] [PATCH 3/7] scsi: store unmap offset and nb_sectors in request struct Anton Nefedov
2017-12-11 15:12 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-11-20 16:51 ` [Qemu-devel] [PATCH 4/7] scsi: move unmap error checking to the complete callback Anton Nefedov
2017-11-20 16:51 ` [Qemu-devel] [PATCH 5/7] scsi: account unmap operations Anton Nefedov
2017-11-20 16:51 ` [Qemu-devel] [PATCH 6/7] file-posix: account discard operations Anton Nefedov
2017-11-20 16:51 ` [Qemu-devel] [PATCH 7/7] qapi: query-blockstat: add driver specific file-posix stats Anton Nefedov
2018-08-17 17:27 ` [Qemu-devel] [PATCH 0/7] discard blockstats Paolo Bonzini
2018-08-20 10:02 ` Anton Nefedov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=beeaf694-b8d2-bac7-a94d-da7f57cc1748@redhat.com \
--to=jsnow@redhat.com \
--cc=anton.nefedov@virtuozzo.com \
--cc=armbru@redhat.com \
--cc=berto@igalia.com \
--cc=den@virtuozzo.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).