All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH v2 2/4] Add block-queue
Date: Wed, 17 Nov 2010 14:41:22 +0100	[thread overview]
Message-ID: <4CE3DB82.6010702@redhat.com> (raw)
In-Reply-To: <AANLkTikXwU4GoD1JCh5U_QJnvgAQ+TBa+UZtrfG7H1CR@mail.gmail.com>

Am 17.11.2010 13:43, schrieb Stefan Hajnoczi:
> On Fri, Nov 5, 2010 at 6:38 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Instead of directly executing writes and fsyncs, queue them and execute them
>> asynchronously. What makes this interesting is that we can delay syncs and if
>> multiple syncs occur, we can merge them into one bdrv_flush.
> 
> The block-queue concept adds another layer to the cache=
> considerations.  We're going to complete write requests that have not
> yet been issued to the host.
> 
> This won't work for cluster filesystems where multiple VMs are
> accessing the block device, although that is a niche case.  I guess
> they can assume writes are visible after they receive a completion.

Right, but that doesn't work for anything but raw anyway. What I intend
block-queue for is just metadata writes, so nothing that raw would use.

> There must be a trade-off between holding back requests until a flush
> is forced and issuing requests as they come (what we do today).  At
> some point if you have too much data queued the flush is going to be
> painful and might result in odd timing patterns.  This feels similar
> to batching tx packets on the network.

I agree. Though as long as you use it only for metadata, the requests
are relatively small and they overwrite each other so that there's a
chance that you can drop some of the from the queue before submitting them.

What I implemented here is that the queue is flushed if it reaches some
maximum length. So this value is where you'd need to find the right
trade-off.

>> A typical sequence in qcow2 (simple cluster allocation) looks like this:
>>
>> 1. Update refcount table
>> 2. bdrv_flush
>> 3. Update L2 entry
>>
>> If we delay the operation and get three of these sequences queued before
>> actually executing, we end up with the following result, saving two syncs:
>>
>> 1. Update refcount table (req 1)
>> 2. Update refcount table (req 2)
>> 3. Update refcount table (req 3)
>> 4. bdrv_flush
>> 5. Update L2 entry (req 1)
>> 6. Update L2 entry (req 2)
>> 7. Update L2 entry (req 3)
> 
> How does block-queue group writes 1-3 and 5-7 together?  I thought
> reqs 1-3 will each have their own context but from a quick look at the
> code I don't think that is the case for qcow2 in patch 4.
> 
> Another way of asking is, how does block-queue know that it is safe to
> put writes 1-3 together before a bdrv_flush?  Why doesn't it also put
> writes 5-7 before the flush?
> 
> Perhaps your current qcow2 implementation isn't taking advantage of
> block-queue bdrv_flush() batching for concurrent write requests?
> 
> I'm missing something ;).

Yes, you are. ;-)

The contexts are indeed the mechanism that it uses to achieve this. Have
a look at qcow_aio_setup, patch 4/4 adds a blkqueue_init_context() call
there.

Kevin

  reply	other threads:[~2010-11-17 13:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-05 18:38 [Qemu-devel] [RFC PATCH v2 0/4] block-queue: Delay and batch metadata writes Kevin Wolf
2010-11-05 18:38 ` [Qemu-devel] [RFC PATCH v2 1/4] block: Fake a bdrv_aio_pwrite Kevin Wolf
2010-11-05 18:38 ` [Qemu-devel] [RFC PATCH v2 2/4] Add block-queue Kevin Wolf
2010-11-17 12:43   ` Stefan Hajnoczi
2010-11-17 13:41     ` Kevin Wolf [this message]
2010-11-17 16:23       ` Stefan Hajnoczi
2010-11-18  9:46         ` Kevin Wolf
2010-11-17 16:24       ` Stefan Hajnoczi
2010-11-05 18:38 ` [Qemu-devel] [RFC PATCH v2 3/4] Test cases for block-queue Kevin Wolf
2010-11-05 18:38 ` [Qemu-devel] [RFC PATCH v2 4/4] qcow2: Use block-queue Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CE3DB82.6010702@redhat.com \
    --to=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.