From: Anthony Liguori <anthony@codemonkey.ws>
To: Christoph Hellwig <hch@infradead.org>
Cc: Avi Kivity <avi@redhat.com>, qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] [PATCH][RFC] Linux AIO support when using O_DIRECT
Date: Mon, 23 Mar 2009 13:10:30 -0500
Message-ID: <49C7D096.3000302@codemonkey.ws>
In-Reply-To: <20090323172928.GB29449@infradead.org>
Christoph Hellwig wrote:
> On Mon, Mar 23, 2009 at 12:14:58PM -0500, Anthony Liguori wrote:
>
>> I'd like to see the O_DIRECT bounce buffering removed in favor of the
>> DMA API bouncing. Once that happens, raw_read and raw_pread can
>> disappear. block-raw-posix becomes much simpler.
>>
>
> See my vectored I/O patches for doing the bounce buffering at the
> optimal place for the aio path. Note that from my reading of the
> qcow/qcow2 code they might send down unaligned requests, which is
> something the dma api would not help with.
>
I was going to look today at applying those.
> For the buffered I/O path we will always have to do some sort of buffering
> due to all the partition header reading / etc. And given how that part
> isn't performance critical my preference would be to keep doing it in
> bdrv_pread/write and guarantee the lowlevel drivers proper alignment.
>
I really dislike having so many APIs. I'd rather have an aio API that
takes byte-granular accesses, or have pread/pwrite always be emulated
with a full sector read/write.
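As a rough illustration of the second option, here is a minimal sketch of a byte-granular read emulated with one sector-aligned read through a bounce buffer. The name sector_pread and the fixed 512-byte SECTOR_SIZE are assumptions made for this example; this is not the actual bdrv_pread code.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512 /* assumed sector size for this sketch */

/*
 * Hypothetical helper: read `count` bytes at byte offset `offset` by
 * issuing a single sector-aligned pread() into an aligned bounce buffer
 * and copying out the requested slice. Returns the number of bytes
 * copied (possibly short near end of file) or -1 on error.
 */
ssize_t sector_pread(int fd, void *buf, size_t count, off_t offset)
{
    off_t start, end;
    size_t span, skip, avail, n;
    void *bounce;
    ssize_t got;

    assert(offset >= 0);
    if (count == 0) {
        return 0;
    }

    /* Round the request outwards to whole sectors. */
    start = offset - (offset % SECTOR_SIZE);
    end = offset + (off_t)count;
    end = (end + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
    span = (size_t)(end - start);

    /* The bounce buffer is sector-aligned, as O_DIRECT would require. */
    if (posix_memalign(&bounce, SECTOR_SIZE, span) != 0) {
        return -1;
    }

    /* One read only; a real implementation would retry short reads. */
    got = pread(fd, bounce, span, start);
    if (got < 0) {
        free(bounce);
        return -1;
    }

    /* Copy out only the bytes the caller asked for. */
    skip = (size_t)(offset - start);
    avail = (size_t)got > skip ? (size_t)got - skip : 0;
    n = avail < count ? avail : count;
    memcpy(buf, (char *)bounce + skip, n);

    free(bounce);
    return (ssize_t)n;
}
```

With this shape, callers such as partition-header probing can keep asking for arbitrary byte ranges while the low-level driver only ever sees aligned requests.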
>> We would drop the signaling stuff and have the thread pool use an fd to
>> signal. The big problem with that right now is that it'll cause a
>> performance regression for certain platforms until we have the IO thread
>> in place.
>>
>
> Talking about signaling, does anyone remember why the Linux signalfd/
> eventfd support is only in kvm but not in upstream qemu?
>
Because upstream QEMU doesn't yet have an IO thread.
TCG chains together TBs and if you have a tight loop in a VCPU, then the
only way to break out of the loop is to receive a signal. The signal
handler will call cpu_interrupt() which will unchain TBs allowing TCG
execution to break once you return from the signal handler.
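A toy model of that mechanism, not QEMU code: the flag tb_chained stands in for the direct jump between chained TBs, and the handler stands in for cpu_interrupt() unchaining them. All names here are invented for the sketch, and a one-shot SIGALRM timer plays the role of the IO completion signal.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* 1 while the "translated block" jumps straight back into itself. */
static volatile sig_atomic_t tb_chained = 1;

/* Stand-in for the handler calling cpu_interrupt(): break the chain. */
static void unchain_handler(int sig)
{
    (void)sig;
    tb_chained = 0;
}

/*
 * Spin in a tight loop that nothing but a signal can break out of.
 * A one-shot timer delivers SIGALRM after `timer_usec` microseconds.
 * Returns 0 once the loop has been broken, -1 if setup failed.
 */
int run_chained_loop(long timer_usec)
{
    struct sigaction sa;
    struct itimerval tv;

    assert(timer_usec > 0 && timer_usec < 1000000);
    tb_chained = 1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = unchain_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0) {
        return -1;
    }

    memset(&tv, 0, sizeof(tv));
    tv.it_value.tv_usec = timer_usec;
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0) {
        return -1;
    }

    /* The guest's tight loop: only the handler can end it. */
    while (tb_chained) {
        /* spin */
    }
    return 0;
}
```

Without the signal, control never returns to the caller, which is the situation a single-threaded QEMU would be in if IO completion were only reported through an fd.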
An IO thread solves this in a different way by letting select() always
run in parallel to TCG VCPU execution. When select() returns you can
send a signal to the TCG VCPU thread to break it out of chained TBs.
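The same toy model extended with an IO thread, again as a sketch under assumptions rather than QEMU code: the calling thread plays the IO thread, a pipe stands in for the completion fd a thread pool would write to, and SIGUSR1 is an arbitrary choice for the kick signal. All function names are invented for the example.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t vcpu_kicked;

/* Runs in the VCPU thread; stands in for unchaining the TBs. */
static void kick_handler(int sig)
{
    (void)sig;
    vcpu_kicked = 1;
}

/* The "guest": a tight loop that only the kick signal can break. */
static void *vcpu_thread_fn(void *opaque)
{
    (void)opaque;
    while (!vcpu_kicked) {
        /* spin */
    }
    return NULL;
}

/* Returns 0 if select() saw the completion and the VCPU was kicked. */
int io_thread_demo(void)
{
    struct sigaction sa;
    pthread_t vcpu;
    int pipefd[2];
    fd_set rfds;
    char byte = 'x';
    int completed = 0;

    vcpu_kicked = 0;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0 || pipe(pipefd) != 0) {
        return -1;
    }
    if (pthread_create(&vcpu, NULL, vcpu_thread_fn, NULL) != 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* Stand-in for a worker thread reporting a finished request. */
    if (write(pipefd[1], &byte, 1) == 1) {
        /* IO thread: select() runs in parallel with the VCPU loop. */
        FD_ZERO(&rfds);
        FD_SET(pipefd[0], &rfds);
        if (select(pipefd[0] + 1, &rfds, NULL, NULL, NULL) == 1 &&
            FD_ISSET(pipefd[0], &rfds) &&
            read(pipefd[0], &byte, 1) == 1) {
            completed = 1;
        }
    }

    /* Kick the VCPU thread out of its loop (also on the error path,
     * so that the join below cannot hang). */
    pthread_kill(vcpu, SIGUSR1);
    pthread_join(vcpu, NULL);

    close(pipefd[0]);
    close(pipefd[1]);
    return (completed && vcpu_kicked) ? 0 : -1;
}
```

The point of the arrangement is that completion is noticed through the fd, and the signal is used only as a targeted kick to the one thread that needs it.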
Not all IO in qemu generates a signal, so this is a potential problem, but
in practice, if we don't generate a signal for disk IO completion, a number
of real-world guests break (mostly non-x86 boards).
Regards,
Anthony Liguori