All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Peter Lieven <pl@kamp.de>, Kevin Wolf <kwolf@redhat.com>
Cc: stefanha@gmail.com, jcody@redhat.com, qemu-devel@nongnu.org,
	qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async
Date: Tue, 6 Oct 2015 13:07:09 -0400	[thread overview]
Message-ID: <5613FFBD.4080907@redhat.com> (raw)
In-Reply-To: <56139278.2090002@kamp.de>



On 10/06/2015 05:20 AM, Peter Lieven wrote:
> Am 06.10.2015 um 10:57 schrieb Kevin Wolf:
>> Am 05.10.2015 um 23:15 hat John Snow geschrieben:
>>>
>>> On 09/21/2015 08:25 AM, Peter Lieven wrote:
>>>> PIO read requests on the ATAPI interface used to be sync blk requests.
>>>> This has to siginificant drawbacks. First the main loop hangs util an
>>>> I/O request is completed and secondly if the I/O request does not
>>>> complete (e.g. due to an unresponsive storage) Qemu hangs completely.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>   hw/ide/atapi.c | 69
>>>> ++++++++++++++++++++++++++++++++++++----------------------
>>>>   1 file changed, 43 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
>>>> index 747f466..9257e1c 100644
>>>> --- a/hw/ide/atapi.c
>>>> +++ b/hw/ide/atapi.c
>>>> @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>>>>       memset(buf, 0, 288);
>>>>   }
>>>>   -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int
>>>> sector_size)
>>>> +static void cd_read_sector_cb(void *opaque, int ret)
>>>>   {
>>>> -    int ret;
>>>> +    IDEState *s = opaque;
>>>>   -    switch(sector_size) {
>>>> -    case 2048:
>>>> -        block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> -                         4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
>>>> -        block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> -        break;
>>>> -    case 2352:
>>>> -        block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> -                         4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
>>>> -        block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> -        if (ret < 0)
>>>> -            return ret;
>>>> -        cd_data_to_raw(buf, lba);
>>>> -        break;
>>>> -    default:
>>>> -        ret = -EIO;
>>>> -        break;
>>>> +    block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> +
>>>> +    if (ret < 0) {
>>>> +        ide_atapi_io_error(s, ret);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (s->cd_sector_size == 2352) {
>>>> +        cd_data_to_raw(s->io_buffer, s->lba);
>>>>       }
>>>> -    return ret;
>>>> +
>>>> +    s->lba++;
>>>> +    s->io_buffer_index = 0;
>>>> +    s->status &= ~BUSY_STAT;
>>>> +
>>>> +    ide_atapi_cmd_reply_end(s);
>>>> +}
>>>> +
>>>> +static int cd_read_sector(IDEState *s, int lba, void *buf, int
>>>> sector_size)
>>>> +{
>>>> +    if (sector_size != 2048 && sector_size != 2352) {
>>>> +        return -EINVAL;
>>>> +    }
>>>> +
>>>> +    s->iov.iov_base = buf;
>>>> +    if (sector_size == 2352) {
>>>> +        buf += 4;
>>>> +    }
>> This doesn't look quite right, buf is never read after this.
>>
>> Also, why +=4 when it was originally buf + 16?
> 
> You are right. I mixed that up.
> 
>>
>>>> +
>>>> +    s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
>>>> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>>>> +
>>>> +    if (blk_aio_readv(s->blk, (int64_t)lba << 2, &s->qiov, 4,
>>>> +                      cd_read_sector_cb, s) == NULL) {
>>>> +        return -EIO;
>>>> +    }
>>>> +
>>>> +    block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> +                     4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> +    s->status |= BUSY_STAT;
>>>> +    return 0;
>>>>   }
>>>>   
>>> We discussed this off-list a bit, but for upstream synchronization:
>>>
>>> Unfortunately, I believe making cd_read_sector here non-blocking makes
>>> ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
>>> s->end_transfer_func() nonblocking, which functions like ide_data_readw
>>> are not prepared to cope with.
>> I don't think that's a problem as long as BSY is set while the
>> asynchronous command is running and DRQ is cleared. The latter will
>> protect ide_data_readw(). ide_sector_read() does essentially the same
>> thing.
> 
>  I was thinking the same. Without the BSY its not working at all.
> 
>>
>> Or maybe I'm just missing what you're trying to say.
>>
>>> My suggestion is to buffer an entire DRQ block of data at once
>>> (byte_count_limit) to avoid the problem.
>> No matter whether there is a problem or not, buffering more data at once
>> (and therefore doing less requests) is better for performance anyway.
> 
> Its possible to do only one read in the backend and read the whole
> request into the IO buffer. I send a follow-up.
> 

Be cautious: we only have 128K (+4 bytes) to play with in the io_buffer
and the READ10 cdb can request up to 128MiB! For performance, it might
be nice to always buffer something like:

MIN(128K, nb_sectors * sector_size)

and then as the guest drains the DRQ block of size byte_count_limit
which can only be at largest 0xFFFE (we can fit in at least two of these
per io_buffer refill) we can just shift the data_ptr and data_end
pointers to utilize io_buffer like a ring buffer.

Because the guest can at most fetch 0xfffe bytes at a time, it will tend
to leave at least 4 bytes left over from a 64 block read. Luckily, we've
got 4 extra bytes in s->io_buffer, so with a ring buffer we can always
rebuffer *at least* two full DRQ blocks of data at a time.

The routine would basically look like this:

- No DRQ blocks buffered, so read up to 64 blocks or however many are
left for our transfer
- If we have at least one full DRQ block allocated, start the transfer
and send an interrupt
- If we ran out of DRQ blocks, go back to the top and buffer them.

This would eliminate the need for code stanza #3 in
ide_atapi_cmd_reply_end, which re-starts a transfer without signaling to
the guest. We'd only have:

ide_atapi_cmd_reply_end(...) {
  if (packet_transfer_size == 0) { end(...); return; }
  if (blocks_buffered < 1) { async_buffer_blocks(...); return; }
  ide_transfer_start(...)
  ide_set_irq(s->bus);
}


which is a good deal simpler than what we have now, though I need to
look into the formatting of raw CD data a little more to make sure my
numbers make sense... it may not be quite so easy to buffer multiple DRQ
blocks in some cases, but so it goes -- we should always be able to
buffer at least one.

> Maybe do you have a pointer to the test tool that John mentioned?
> 
> Peter
> 

  reply	other threads:[~2015-10-06 17:07 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-21 12:25 [Qemu-devel] [PATCH 0/5] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
2015-09-21 12:25 ` [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async Peter Lieven
2015-10-02 21:02   ` John Snow
2015-10-05 21:15   ` John Snow
2015-10-06  8:46     ` Peter Lieven
2015-10-06 12:08       ` Peter Lieven
2015-10-07 16:42       ` John Snow
2015-10-07 18:53         ` Peter Lieven
2015-10-08 12:06         ` Peter Lieven
2015-10-08 16:44           ` John Snow
2015-10-09  8:21             ` Kevin Wolf
2015-10-09 11:18               ` Peter Lieven
2015-10-09 16:32               ` John Snow
2015-10-14 18:19             ` Peter Lieven
2015-10-14 18:21               ` John Snow
2015-10-16 10:56                 ` Peter Lieven
2015-10-06  8:57     ` Kevin Wolf
2015-10-06  9:20       ` Peter Lieven
2015-10-06 17:07         ` John Snow [this message]
2015-10-06 17:12           ` Peter Lieven
2015-10-06 17:56             ` John Snow
2015-10-06 18:31               ` Peter Lieven
2015-10-06 18:34                 ` John Snow
2015-10-06 15:54       ` John Snow
2015-10-07  7:28         ` Kevin Wolf
2015-10-06 13:05   ` Laszlo Ersek
2015-09-21 12:25 ` [Qemu-devel] [PATCH 2/5] ide/atapi: blk_aio_readv may return NULL Peter Lieven
2015-09-21 12:25 ` [Qemu-devel] [PATCH 3/5] ide: add support for cancelable read requests Peter Lieven
2015-09-21 12:25 ` [Qemu-devel] [PATCH 4/5] ide/atapi: enable cancelable requests Peter Lieven
2015-09-21 12:25 ` [Qemu-devel] [PATCH 5/5] block/nfs: cache allocated filesize for read-only files Peter Lieven
2015-09-21 20:58 ` [Qemu-devel] [PATCH 0/5] ide: avoid main-loop hang on CDROM/NFS failure John Snow
2015-09-21 21:22   ` Peter Lieven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5613FFBD.4080907@redhat.com \
    --to=jsnow@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pl@kamp.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.