From: Paul Brook <paul@codesourcery.com>
To: qemu-devel@nongnu.org
Cc: Dor Laor <dlaor@redhat.com>, Christoph Hellwig <hch@lst.de>,
Vadim Rozenfeld <vrozenfe@redhat.com>,
Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] Re: [RFC][PATCH] performance improvement for windows guests, running on top of virtio block device
Date: Thu, 25 Feb 2010 15:06:05 +0000 [thread overview]
Message-ID: <201002251506.05318.paul@codesourcery.com> (raw)
In-Reply-To: <4B853ED3.3060707@codemonkey.ws>
> > Idle bottom halves (i.e. qemu_bh_schedule_idle) are just bugs waiting to
> > happen, and should never be used for anything.
>
> Idle bottom halves make considerably more sense than the normal bottom
> halves.
>
> The fact that rescheduling a bottom half within a bottom half results in
> an infinite loop is absurd. It is equally absurd that bottom halves
> alter the select timeout. The result of that is that if a bottom half
> schedules another bottom half, and that bottom half schedules the
> previous, you get a tight infinite loop. Since bottom halves are often
> used deep within functions, the result is very subtle infinite loops
> (that we've absolutely encountered in the past).
I disagree. The "select timeout" is a completely irrelevant implementation
detail. Anything that relies on it is just plain wrong. If you require a delay
then you should be using a timer. If scheduling a BH directly then you should
expect it to be processed without delay.
If a BH reschedules itself (directly or indirectly) without useful work
occurring then you absolutely should expect an infinite loop. Rescheduling
itself after doing useful work should never cause an infinite loop. The only
way it can loop infinitely is if we have an infinite amount of work to do, in
which case you lose either way. Looping over work via recursive BHs is
probably not the most efficient way to do things, but I guess it may sometimes
be the simplest in practice.
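To make the loop concrete, here is a minimal sketch of a bottom-half dispatcher, loosely modelled on the 2010-era qemu-bh.c; the names (bh_new, bh_schedule, bh_poll) and structure are illustrative assumptions, not QEMU's actual API. Two BHs that schedule each other do no useful work, yet every dispatch pass finds something runnable, so the surrounding main loop never gets a chance to block in select():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct BH BH;
struct BH {
    void (*cb)(void *opaque);
    void *opaque;
    bool scheduled;
    BH *next;
};

static BH *bh_list;                     /* kept in creation order */

static BH *bh_new(void (*cb)(void *), void *opaque) {
    static BH pool[16];                 /* zero-initialised storage */
    static int used;
    BH *bh = &pool[used++];
    bh->cb = cb;
    bh->opaque = opaque;
    /* append so dispatch walks BHs in creation order */
    BH **p = &bh_list;
    while (*p) p = &(*p)->next;
    *p = bh;
    return bh;
}

static void bh_schedule(BH *bh) { bh->scheduled = true; }

/* One dispatch pass.  Returns true if any BH ran; while this keeps
 * returning true the main loop has no reason to sleep in select(). */
static bool bh_poll(void) {
    bool ran = false;
    for (BH *bh = bh_list; bh; bh = bh->next) {
        if (bh->scheduled) {
            bh->scheduled = false;
            bh->cb(bh->opaque);
            ran = true;
        }
    }
    return ran;
}

/* Two BHs that schedule each other: no useful work, yet every pass
 * finds something to run - the tight loop under discussion. */
static BH *ping_bh, *pong_bh;
static int pings;
static void pong_cb(void *o) { (void)o; bh_schedule(ping_bh); }
static void ping_cb(void *o) { (void)o; pings++; bh_schedule(pong_bh); }
```

The point of the sketch: whether the spin is a dispatcher bug or the expected consequence of rescheduling without doing work is exactly what is being argued.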
Interaction between multiple BHs is slightly trickier. By my reading BHs are
processed in the order they are created. It may be reasonable to guarantee
that BHs are processed in the order they are scheduled. However I'm reluctant
to even go that far without a good use-case. You could probably come up with
arguments for processing them in most-recently-scheduled order.
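For comparison, scheduled-order dispatch could be sketched with a simple FIFO (entirely hypothetical; this is not QEMU code). Snapshotting the queue tail at entry has the useful side effect of bounding each pass: a handler that reschedules itself runs once per pass rather than spinning inside the pass:

```c
#include <assert.h>

#define QSIZE 64
typedef void (*bh_cb)(void);

/* FIFO of scheduled handlers: dispatch order == schedule order. */
static bh_cb queue[QSIZE];
static int head, tail;

static void sched(bh_cb cb) { queue[tail++ % QSIZE] = cb; }

/* Run everything scheduled before this pass started.  Handlers
 * scheduled during the pass (reschedules) wait for the next one. */
static int poll_pass(void) {
    int end = tail, ran = 0;
    while (head != end) {
        queue[head++ % QSIZE]();
        ran++;
    }
    return ran;
}

/* A handler that reschedules itself twice before stopping. */
static int runs;
static void self_resched(void) { if (++runs < 3) sched(self_resched); }
```

Under this scheme a self-rescheduling handler still makes infinite progress if it never stops, but each individual pass terminates.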
>A main loop should have only a few characteristics. It should enable
>timeouts (based on select), it should enable fd event dispatch, and it
>should allow for idle functions to be registered. There should be no
>guarantees on when idle functions are executed other than they'll
>eventually be executed.
If you don't provide any guarantees, then surely processing them immediately
must be an acceptable implementation. I don't believe there is a useful
definition of "idle".
All existing uses of qemu_bh_schedule_idle are in fact poorly implemented
periodic polling. Furthermore these should not be using periodic polling; they
can and should be event driven. They only exist because no one has bothered to
fix the old code properly. Having arbitrary 10ms latency on DMA transfers is
just plain wrong.
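The difference is easy to see in miniature (all names here are hypothetical, not from the QEMU tree): a polled completion is noticed up to one timer period late, while an event-driven one runs its callback the moment the transfer finishes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool done;
    void (*complete_cb)(void *opaque);
    void *opaque;
} DmaOp;

/* Polling style: a periodic (e.g. 10ms) timer keeps asking "done
 * yet?".  Completion is noticed up to one timer period late, and
 * every negative tick is a wasted wakeup. */
static bool poll_tick(DmaOp *op) {
    if (op->done && op->complete_cb) {
        op->complete_cb(op->opaque);
        op->complete_cb = NULL;
        return true;
    }
    return false;               /* nothing yet: re-arm the timer */
}

/* Event-driven style: whoever finishes the transfer signals it,
 * and the completion callback runs with no added latency. */
static void dma_finish(DmaOp *op) {
    op->done = true;
    if (op->complete_cb) {
        op->complete_cb(op->opaque);
        op->complete_cb = NULL;
    }
}

static int completions;
static void count_complete(void *opaque) { (void)opaque; completions++; }
```

The claim above is that every qemu_bh_schedule_idle user is structurally the first function when it could be the second.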
>The way we use "bottom halves" today should be implemented in terms of a
>relative timeout of 0 or an absolute timeout of now. The fact that we
>can't do that in our main loop is due to the very strange dependencies
>deep within various devices on io dispatch ordering. I would love to
>eliminate this but I've not been able to spend any time on this recently.
I don't see how this helps. A self-triggering event with a timeout of "now" is
still an infinite loop. Any delay is a bug in the dispatch loop. "idle" BHs
are relying on this bug.
Paul