From: John Snow <jsnow@redhat.com>
To: Peter Lieven <pl@kamp.de>, Kevin Wolf <kwolf@redhat.com>
Cc: stefanha@gmail.com, jcody@redhat.com, qemu-devel@nongnu.org,
    qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async
Date: Tue, 6 Oct 2015 13:07:09 -0400
Message-ID: <5613FFBD.4080907@redhat.com>
In-Reply-To: <56139278.2090002@kamp.de>

On 10/06/2015 05:20 AM, Peter Lieven wrote:
> On 06.10.2015 at 10:57, Kevin Wolf wrote:
>> On 05.10.2015 at 23:15, John Snow wrote:
>>>
>>> On 09/21/2015 08:25 AM, Peter Lieven wrote:
>>>> PIO read requests on the ATAPI interface used to be sync blk requests.
>>>> This has two significant drawbacks. First, the main loop hangs until an
>>>> I/O request is completed, and second, if the I/O request does not
>>>> complete (e.g. due to unresponsive storage), QEMU hangs completely.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>> hw/ide/atapi.c | 69 ++++++++++++++++++++++++++++++++++++----------------------
>>>> 1 file changed, 43 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
>>>> index 747f466..9257e1c 100644
>>>> --- a/hw/ide/atapi.c
>>>> +++ b/hw/ide/atapi.c
>>>> @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>>>> memset(buf, 0, 288);
>>>> }
>>>> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int sector_size)
>>>> +static void cd_read_sector_cb(void *opaque, int ret)
>>>> {
>>>> - int ret;
>>>> + IDEState *s = opaque;
>>>> - switch(sector_size) {
>>>> - case 2048:
>>>> - block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> - ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
>>>> - block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> - break;
>>>> - case 2352:
>>>> - block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> - ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
>>>> - block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> - if (ret < 0)
>>>> - return ret;
>>>> - cd_data_to_raw(buf, lba);
>>>> - break;
>>>> - default:
>>>> - ret = -EIO;
>>>> - break;
>>>> + block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>> +
>>>> + if (ret < 0) {
>>>> + ide_atapi_io_error(s, ret);
>>>> + return;
>>>> + }
>>>> +
>>>> + if (s->cd_sector_size == 2352) {
>>>> + cd_data_to_raw(s->io_buffer, s->lba);
>>>> }
>>>> - return ret;
>>>> +
>>>> + s->lba++;
>>>> + s->io_buffer_index = 0;
>>>> + s->status &= ~BUSY_STAT;
>>>> +
>>>> + ide_atapi_cmd_reply_end(s);
>>>> +}
>>>> +
>>>> +static int cd_read_sector(IDEState *s, int lba, void *buf, int sector_size)
>>>> +{
>>>> + if (sector_size != 2048 && sector_size != 2352) {
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + s->iov.iov_base = buf;
>>>> + if (sector_size == 2352) {
>>>> + buf += 4;
>>>> + }
>> This doesn't look quite right, buf is never read after this.
>>
>> Also, why +=4 when it was originally buf + 16?
>
> You are right. I mixed that up.
>
>>
>>>> +
>>>> + s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
>>>> + qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>>>> +
>>>> + if (blk_aio_readv(s->blk, (int64_t)lba << 2, &s->qiov, 4,
>>>> + cd_read_sector_cb, s) == NULL) {
>>>> + return -EIO;
>>>> + }
>>>> +
>>>> + block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>> + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>> + s->status |= BUSY_STAT;
>>>> + return 0;
>>>> }
>>>>
>>> We discussed this off-list a bit, but for upstream synchronization:
>>>
>>> Unfortunately, I believe making cd_read_sector here non-blocking makes
>>> ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
>>> s->end_transfer_func() nonblocking, which functions like ide_data_readw
>>> are not prepared to cope with.
>> I don't think that's a problem as long as BSY is set while the
>> asynchronous command is running and DRQ is cleared. The latter will
>> protect ide_data_readw(). ide_sector_read() does essentially the same
>> thing.
>
> I was thinking the same. Without BSY it's not working at all.
>
>>
>> Or maybe I'm just missing what you're trying to say.
>>
>>> My suggestion is to buffer an entire DRQ block of data at once
>>> (byte_count_limit) to avoid the problem.
>> No matter whether there is a problem or not, buffering more data at once
>> (and therefore doing fewer requests) is better for performance anyway.
>
> It's possible to do only one read in the backend and read the whole
> request into the IO buffer. I'll send a follow-up.
>

Be cautious: we only have 128K (+4 bytes) to play with in the io_buffer
and the READ10 cdb can request up to 128MiB! For performance, it might
be nice to always buffer something like:

    MIN(128K, nb_sectors * sector_size)

and then as the guest drains the DRQ block of size byte_count_limit,
which can only be at largest 0xFFFE (we can fit in at least two of these
per io_buffer refill), we can just shift the data_ptr and data_end
pointers to utilize io_buffer like a ring buffer.

Because the guest can at most fetch 0xfffe bytes at a time, it will tend
to leave at least 4 bytes left over from a 64 block read. Luckily, we've
got 4 extra bytes in s->io_buffer, so with a ring buffer we can always
rebuffer *at least* two full DRQ blocks of data at a time.
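
As a quick sanity check of the arithmetic above, here is a minimal
sketch in C. The constant and helper names are made up for illustration
(they are not QEMU identifiers), and it only covers the 2048-byte sector
case; the 2352-byte raw format would need its own numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the figures quoted in this mail. */
#define SKETCH_IO_BUFFER_BYTES (128u * 1024u) /* io_buffer without the +4 slack */
#define SKETCH_DRQ_BLOCK_MAX   0xFFFEu        /* largest byte_count_limit */

/* Hypothetical helper: MIN(128K, nb_sectors * sector_size). */
static inline uint64_t sketch_refill_bytes(uint64_t nb_sectors,
                                           uint32_t sector_size)
{
    uint64_t want = nb_sectors * sector_size;
    return want < SKETCH_IO_BUFFER_BYTES ? want : SKETCH_IO_BUFFER_BYTES;
}

/* Hypothetical helper: number of full DRQ blocks in `buffered` bytes. */
static inline uint64_t sketch_full_drq_blocks(uint64_t buffered,
                                              uint32_t byte_count_limit)
{
    assert(byte_count_limit > 0 && byte_count_limit <= SKETCH_DRQ_BLOCK_MAX);
    return buffered / byte_count_limit;
}

/* Hypothetical helper: bytes left once all full DRQ blocks are drained. */
static inline uint64_t sketch_drq_leftover(uint64_t buffered,
                                           uint32_t byte_count_limit)
{
    assert(byte_count_limit > 0 && byte_count_limit <= SKETCH_DRQ_BLOCK_MAX);
    return buffered % byte_count_limit;
}
```

With these definitions, a 64 block read of 2048-byte sectors fills
exactly 128K, which holds two DRQ blocks of 0xFFFE bytes and leaves 4
bytes over, matching the numbers above.
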

The routine would basically look like this:

 - No DRQ blocks buffered, so read up to 64 blocks or however many are
   left for our transfer
 - If we have at least one full DRQ block allocated, start the transfer
   and send an interrupt
 - If we ran out of DRQ blocks, go back to the top and buffer them.

This would eliminate the need for code stanza #3 in
ide_atapi_cmd_reply_end, which re-starts a transfer without signaling to
the guest. We'd only have:

ide_atapi_cmd_reply_end(...) {
    if (packet_transfer_size == 0) { end(...); return; }
    if (blocks_buffered < 1) { async_buffer_blocks(...); return; }
    ide_transfer_start(...);
    ide_set_irq(s->bus);
}

which is a good deal simpler than what we have now, though I need to
look into the formatting of raw CD data a little more to make sure my
numbers make sense... it may not be quite so easy to buffer multiple DRQ
blocks in some cases, but so it goes -- we should always be able to
buffer at least one.
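
To make the control flow of that routine concrete, here is a toy model
in C. It is only a sketch under loud assumptions: every name is
hypothetical, the "async" refill completes immediately, the guest drain
is folded into the transfer step, and it counts plain bytes without
sector alignment or raw CD formatting. It is not the QEMU
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed capacity: 128K plus the 4 spare bytes mentioned above. */
#define TOY_BUFFER_BYTES (128u * 1024u + 4u)

enum toy_action { TOY_END, TOY_REFILL, TOY_TRANSFER };

/* Hypothetical state; loosely mirrors the fields named in the mail. */
struct toy_atapi {
    uint64_t packet_transfer_size; /* bytes the guest still has to fetch */
    uint64_t buffered;             /* bytes currently held in the buffer */
    uint32_t byte_count_limit;     /* DRQ block size requested by the guest */
    unsigned irqs;                 /* interrupts raised so far */
    unsigned refills;              /* backend reads issued so far */
};

/* Size of the next DRQ block: the limit, or whatever is left if smaller. */
static inline uint64_t toy_next_drq(const struct toy_atapi *s)
{
    assert(s->byte_count_limit > 0 && s->byte_count_limit <= 0xFFFEu);
    return s->packet_transfer_size < s->byte_count_limit
               ? s->packet_transfer_size : s->byte_count_limit;
}

/* One pass through the simplified reply_end logic. */
static inline enum toy_action toy_reply_end(struct toy_atapi *s)
{
    if (s->packet_transfer_size == 0) {
        return TOY_END;                 /* stands in for end(...) */
    }
    uint64_t drq = toy_next_drq(s);
    if (s->buffered < drq) {            /* no full DRQ block buffered */
        uint64_t unread = s->packet_transfer_size - s->buffered;
        uint64_t room = TOY_BUFFER_BYTES - s->buffered;
        s->buffered += unread < room ? unread : room;
        s->refills++;
        return TOY_REFILL;              /* real code would return and wait */
    }
    s->irqs++;                          /* transfer start plus interrupt */
    s->buffered -= drq;                 /* guest drains one DRQ block */
    s->packet_transfer_size -= drq;
    return TOY_TRANSFER;
}

/* Drive the model until the whole packet transfer has completed. */
static inline void toy_run(struct toy_atapi *s)
{
    while (toy_reply_end(s) != TOY_END) {
    }
}
```

In this model a 300000 byte transfer with byte_count_limit 0xFFFE needs
three refills and raises five interrupts, one per DRQ block, and no
transfer is ever restarted without an interrupt.
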

> Do you maybe have a pointer to the test tool that John mentioned?
>
> Peter
>