From: Jan Kiszka <jan.kiszka@web.de>
To: Blue Swirl <blauwirbel@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>,
Paul Brook <paul@codesourcery.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback
Date: Thu, 27 May 2010 21:08:18 +0200
Message-ID: <4BFEC322.3030207@web.de>
In-Reply-To: <AANLkTimNyQF6PqXqE6PO55eXvesbYz4FChxyaqNXr5o0@mail.gmail.com>
Blue Swirl wrote:
> On Thu, May 27, 2010 at 6:31 PM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> Blue Swirl wrote:
>>> On Wed, May 26, 2010 at 11:26 PM, Paul Brook <paul@codesourcery.com> wrote:
>>>>> At the other extreme, would it be possible to make the educated guests
>>>>> aware of the virtualization also in clock aspect: virtio-clock?
>>>> The guest doesn't even need to be aware of virtualization. It just needs to be
>>>> able to accommodate the lack of guaranteed realtime behavior.
>>>>
>>>> The fundamental problem here is that some guest operating systems assume that
>>>> the hardware provides certain realtime guarantees with respect to execution of
>>>> interrupt handlers. In particular they assume that the CPU will always be
>>>> able to complete execution of the timer IRQ handler before the periodic timer
>>>> triggers again. In most virtualized environments you have absolutely no
>>>> guarantee of realtime response.
>>>>
>>>> With Linux guests this was solved a long time ago by the introduction of
>>>> tickless kernels. These separate the timekeeping from wakeup events, so it
>>>> doesn't matter if several wakeup triggers end up getting merged (either at the
>>>> hardware level or via top/bottom half guest IRQ handlers).
>>>>
>>>>
>>>> It's worth mentioning that this problem also occurs on real hardware,
>>>> typically due to lame hardware/drivers which end up masking interrupts or
>>>> otherwise stall the CPU for long periods of time.
>>>>
>>>>
>>>> The PIT hack attempts to work around broken guests by adding artificial latency
>>>> to the timer event, ensuring that the guest "sees" them all. Unfortunately
>>>> guests vary on when it is safe for them to see the next timer event, and
>>>> trying to observe this behavior involves potentially harmful heuristics and
>>>> collusion between unrelated devices (e.g. interrupt controller and timer).
>>>>
>>>> In some cases we don't even do that, and just reschedule the event some
>>>> arbitrarily small amount of time later. This assumes the guest does useful
>>>> work in that time. In a single threaded environment this is probably true -
>>>> qemu got enough CPU to inject the first interrupt, so will probably manage to
>>>> execute some guest code before the end of its timeslice. In an environment
>>>> where interrupt processing/delivery and execution of the guest code happen in
>>>> different threads this becomes increasingly likely to fail.
>>> So any voodoo around timer events is doomed to fail in some cases.
>>> What amount of hacks do we want, then? Is there any generic
>> The aim of this patch is to reduce the amount of existing and upcoming
>> hacks. It may still require some refinements, but I think we haven't
>> found any smarter approach yet that fits existing use cases.
>
> I don't feel we have tried other possibilities hard enough.
Well, seeing prototypes wouldn't be bad, also to run real load against
them. But at least I'm currently clueless what to implement.
>
>>> solution, like slowing down the guest system to the point where we can
>>> guarantee the interrupt rate vs. CPU execution speed?
>> That's generally a non-option in virtualized production environments.
>> Specifically if the guest system lost interrupts due to host
>> overcommitment, you do not want it to slow down even further.
>
> I meant that the guest time could be scaled down, for example 2s in
> wall clock time would be presented to the guest as 1s.
But that is precisely what already happens when the guest loses timer
interrupts. There is no other time source for this kind of guest -
often except for some external events generated by systems which you
don't want to fall behind arbitrarily.
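
To illustrate the point, here is a minimal sketch of a guest that keeps
time purely by counting periodic timer interrupts. All names (TickClock,
tick_clock_deliver, ...) are hypothetical and not taken from QEMU; the
model only assumes that a coalesced tick is one the guest never sees.

```c
#include <stdint.h>

/* Hypothetical model, not QEMU code: a guest whose only time source is
 * the number of periodic timer interrupts it has handled. */
typedef struct {
    uint64_t tick_period_ns;  /* programmed timer period */
    uint64_t ticks_seen;      /* interrupts the guest actually handled */
} TickClock;

/* Guest-visible time is derived from the handled ticks alone. */
static uint64_t tick_clock_now_ns(const TickClock *c)
{
    return c->ticks_seen * c->tick_period_ns;
}

/* The timer fired 'fired' times, but 'coalesced' of those expirations
 * were merged into an already pending interrupt and never reached the
 * guest. */
static void tick_clock_deliver(TickClock *c, uint64_t fired,
                               uint64_t coalesced)
{
    if (coalesced > fired) {
        coalesced = fired;    /* cannot lose more ticks than were raised */
    }
    c->ticks_seen += fired - coalesced;
}

/* How far the guest clock has fallen behind the host wall clock. */
static uint64_t tick_clock_lag_ns(const TickClock *c, uint64_t wall_ns)
{
    uint64_t now = tick_clock_now_ns(c);

    return wall_ns > now ? wall_ns - now : 0;
}
```

With a 1 ms period, 2000 expirations and 1000 of them coalesced, the
guest clock has advanced by 1 s while 2 s of wall clock time passed,
i.e. the 2:1 scaling described above happens on its own.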
> Then the amount
> of CPU cycles between timer interrupts would increase and hopefully
> the guest can keep up. If the guest sleeps, time base could be
> accelerated to catch up with wall clock and then set back to 1:1 rate.
Can't follow you ATM, sorry. What should be slowed down then? And how
precisely?
Jan
>
> Slowing down could be triggered by measuring the guest load (for
> example, by checking for presence of halt instructions), if it's close
> to 1, time would be slowed down. If the guest starts to issue halt
> instructions because it's more idle, we can increase speed.
>
> If this approach worked, even APIC could be made ignorant about
> coalescing voodoo so it should be a major cleanup.
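
For reference, one possible reading of the proposal quoted above, as a
rough sketch only. Nothing like this exists in QEMU; the names
(ScaledClock, pick_rate_percent, scaled_clock_advance), the 90% load
threshold and the 50%/200% rates are all assumptions made up for
illustration, and the load figure is assumed to come from some
halt-instruction based measurement.

```c
#include <stdint.h>

/* Hypothetical sketch of a load-driven guest time base, not QEMU code. */
typedef struct {
    uint64_t guest_ns;  /* time presented to the guest so far */
    uint64_t wall_ns;   /* host wall clock time elapsed so far */
} ScaledClock;

/* Pick the guest time rate, in percent of wall clock rate.
 * load_percent is the measured guest load (100 = never halts).
 * Threshold and rates are arbitrary illustration values. */
static unsigned pick_rate_percent(const ScaledClock *c,
                                  unsigned load_percent)
{
    if (load_percent >= 90) {
        return 50;      /* guest is saturated: present 1s per 2s wall */
    }
    if (c->guest_ns < c->wall_ns) {
        return 200;     /* guest is idle and behind: catch up */
    }
    return 100;         /* guest is idle and in sync: run 1:1 */
}

/* Advance both clocks by wall_delta_ns of host time. */
static void scaled_clock_advance(ScaledClock *c, uint64_t wall_delta_ns,
                                 unsigned load_percent)
{
    unsigned rate = pick_rate_percent(c, load_percent);

    c->wall_ns += wall_delta_ns;
    c->guest_ns += wall_delta_ns * rate / 100;
    if (c->guest_ns > c->wall_ns) {
        c->guest_ns = c->wall_ns;   /* never run ahead of wall clock */
    }
}
```

In this model a fully loaded guest falls behind by 1 s for every 2 s of
wall clock time and recovers only once it goes idle, so the lag is
unbounded for as long as the load stays high.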