From: Paolo Bonzini <pbonzini@redhat.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
qemu-devel <qemu-devel@nongnu.org>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
Date: Thu, 03 Jul 2014 12:29:35 +0200
Message-ID: <53B5308F.3030008@redhat.com>
In-Reply-To: <CACVXFVNOpFBUwmZmZ0NczsKdnAbY=Fwvy3A-ryLUv0k7BqCKCA@mail.gmail.com>
On 03/07/2014 06:54, Ming Lei wrote:
> On Thu, Jul 3, 2014 at 12:21 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 02/07/2014 17:45, Ming Lei wrote:
>>> The attached debug patch skips aio_notify() if qemu_bh_schedule
>>> is running from the current aio context, but it looks like there are
>>> still 120K writes triggered. (Without the patch, 400K can be observed
>>> in the same test.)
>>
>> Nice. Another observation is that after aio_dispatch we'll always
>> re-evaluate everything (bottom halves, file descriptors and timeouts),
>
> The idea is very good.
>
> If aio_notify() is called from the 1st aio_dispatch() in aio_poll(),
> ctx->notifier might need to be set, but it can be handled easily.
Yes, you can just move the atomic_inc/atomic_dec in aio_poll.
>> so we can skip the aio_notify if we're inside aio_dispatch.
>>
>> So what about this untested patch:
>>
>> diff --git a/aio-posix.c b/aio-posix.c
>> index f921d4f..a23d85d 100644
>> --- a/aio-posix.c
>> +++ b/aio-posix.c
>
> #include "qemu/atomic.h"
>
>> @@ -124,6 +124,9 @@ static bool aio_dispatch(AioContext *ctx)
>>      AioHandler *node;
>>      bool progress = false;
>>
>> +    /* No need to set the event notifier during aio_notify. */
>> +    ctx->running++;
>> +
>>      /*
>>       * We have to walk very carefully in case qemu_aio_set_fd_handler is
>>       * called while we're walking.
>> @@ -169,6 +171,11 @@ static bool aio_dispatch(AioContext *ctx)
>>      /* Run our timers */
>>      progress |= timerlistgroup_run_timers(&ctx->tlg);
>>
>> +    smp_wmb();
>> +    ctx->iter_count++;
>> +    smp_wmb();
>> +    ctx->running--;
>> +
>>      return progress;
>>  }
>>
>> diff --git a/async.c b/async.c
>> index 5b6fe6b..1f56afa 100644
>> --- a/async.c
>> +++ b/async.c
>
> #include "qemu/atomic.h"
>
>> @@ -249,7 +249,19 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
>>
>>  void aio_notify(AioContext *ctx)
>>  {
>> -    event_notifier_set(&ctx->notifier);
>> +    uint32_t iter_count;
>> +    do {
>> +        iter_count = ctx->iter_count;
>> +        /* Read ctx->iter_count before ctx->running. */
>> +        smb_rmb();
>
> s/smb/smp
>
>> +        if (!ctx->running) {
>> +            event_notifier_set(&ctx->notifier);
>> +            return;
>> +        }
>> +        /* Read ctx->running before ctx->iter_count. */
>> +        smb_rmb();
>
> s/smb/smp
>
>> +        /* ctx might have gone to sleep. */
>> +    } while (iter_count != ctx->iter_count);
>>  }
>
> Since both 'running' and 'iter_count' may be read lockless, something
> like ACCESS_ONCE() should be used to avoid compiler optimization.
No, smp_rmb() is enough to prevent those optimizations. See also
include/qemu/seqlock.h.

The first access to ctx->iter_count _could_ be protected by
ACCESS_ONCE(), which in QEMU we call atomic_read()/atomic_set(), but
it's not necessary. See docs/atomics.txt for a description of QEMU's
atomic access functions.
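
For illustration only, here is a minimal standalone sketch of the counter
protocol from the patch above, written with C11 <stdatomic.h> so it builds
outside the QEMU tree. ToyCtx and the toy_* functions are hypothetical names
invented for this sketch, the wakeups counter merely stands in for
event_notifier_set(), and the acquire/release operations are an assumed
approximation of smp_rmb()/smp_wmb(), not a claim about what QEMU does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the AioContext fields used by the patch. */
typedef struct {
    atomic_uint running;    /* non-zero while the dispatch loop is running */
    atomic_uint iter_count; /* incremented once per completed dispatch */
    atomic_uint wakeups;    /* counts what would be event_notifier_set() */
} ToyCtx;

static void toy_init(ToyCtx *ctx)
{
    atomic_init(&ctx->running, 0);
    atomic_init(&ctx->iter_count, 0);
    atomic_init(&ctx->wakeups, 0);
}

/* Entry of the dispatch loop: notifications may now be skipped. */
static void toy_dispatch_begin(ToyCtx *ctx)
{
    atomic_fetch_add(&ctx->running, 1);
}

/* Exit of the dispatch loop: publish a new iteration, then clear running. */
static void toy_dispatch_end(ToyCtx *ctx)
{
    assert(atomic_load(&ctx->running) > 0);
    /* Release ordering approximates the smp_wmb() before each store. */
    atomic_fetch_add_explicit(&ctx->iter_count, 1, memory_order_release);
    atomic_fetch_sub_explicit(&ctx->running, 1, memory_order_release);
}

/* Returns true if a wakeup was sent, false if it was skipped. */
static bool toy_notify(ToyCtx *ctx)
{
    unsigned iter_count;

    do {
        /* Acquire: read iter_count before running. */
        iter_count = atomic_load_explicit(&ctx->iter_count,
                                          memory_order_acquire);
        /* Acquire: read running before re-reading iter_count. */
        if (atomic_load_explicit(&ctx->running, memory_order_acquire) == 0) {
            atomic_fetch_add(&ctx->wakeups, 1);
            return true;
        }
        /* If iter_count changed, the dispatcher may have gone to sleep. */
    } while (iter_count != atomic_load_explicit(&ctx->iter_count,
                                                memory_order_acquire));
    return false;
}
```

Calling toy_notify() between toy_dispatch_begin() and toy_dispatch_end()
skips the wakeup. Calling it outside that window always sends one. The
retry loop only matters when another thread finishes a dispatch while the
notifier is between its two reads of iter_count.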
> In my test, it does decrease the number of write() calls significantly,
> and I hope a formal version can be applied soon.
Can you take care of that (you can add my Signed-off-by), since you have
the best testing environment? v5 of the plug/unplug series will be good
to go, I think.
Paolo