All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kiszka <jan.kiszka@web.de>
To: Blue Swirl <blauwirbel@gmail.com>
Cc: qemu-devel@nongnu.org, Gleb Natapov <gleb@redhat.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] Re: [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback
Date: Sat, 29 May 2010 11:36:45 +0200	[thread overview]
Message-ID: <4C00E02D.9050804@web.de> (raw)
In-Reply-To: <AANLkTimvdjU1zunypgO1VikK3Tw47TDZyrtR7m1YC2o_@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5786 bytes --]

Blue Swirl wrote:
>>> On the contrary, APIC is actually the only source of the IRQ ack
>>> information. RTC hack would not work without APIC (or the
>>> bidirectional IRQ) passing this info to RTC.
>>>
>>> What APIC doesn't have now is the timer frequency or period info. This
>>> is known by RTC and also higher levels managing the clocks.
>>>
>> So APIC has one bit of information and RTC everything else.
> 
> The information known by RTC (timer period) is also known by higher levels.

Curious to see where you'll find this.

> 
>> The current
>> approach (and proposed patch) brings this one bit of information to RTC,
>> you are arguing that RTC should be able to communicate all its info to
>> APIC. Sorry I don't see that your way has any advantage. Just more
>> complex interface and it is much easier to get it wrong for other time
>> sources.
> 
> I don't think anymore that APIC should be handling this but the
> generic stuff, like vl.c or exec.c. Then there would be only
> information passing from APIC to higher levels.

You neglect the the information required to associate a periodic source
(e.g. RTC) with an IRQ sink (e.g. APIC). Without that, you will have a
hard time figuring out if a reported IRQ coalescing requires any
activities or should simply be welcomed (for I/O IRQs).

> 
>>> I keep ignoring the idea that the current model, where both RTC and
>>> APIC must somehow work together to make coalescing work, is the only
>>> possible just because it is committed and it happens to work in some
>>> cases. It would be much better to concentrate this to one place, APIC
>>> or preferably higher level where it may benefit other timers too.
>>> Provided of course that the other models can be made to work.
>>>
>> So write the code and show us. You haven't show any evidence that RTC is
>> the wrong place. RTC knows when interrupt was acknowledge to RTC, it
>> know when clock frequency changes, it know when device reset happened.
>> APIC knows only that interrupt was coalesced. It doesn't even know that
>> it may be masked by a guest in IOAPIC (interrupts delivered while they
>> are masked not considered coalesced).
> 
> Oh, I thought interrupt masking was the reason for coalescing! What
> exactly is the reason then?

Missing acks, ie. the IRQ is still pending when the next one arrives.
You want to filter out masked/suppressed IRQs to avoid running the
de-coalescing logic on sources that are actually cut off (like the RTC
IRQ when the HPET took over).

> 
>> Time source knows only when
>> frequency changes and may be when device reset happens if timer is
>> stopped by device on reset. So RTC is actually a sweet spot if you want
>> to minimize amount of info you need to pass between various layers.
>>
>>>>> Maybe that version would not bend backwards as much as the current to
>>>>> cater for buggy hosts.
>>>>>
>>>> You mean "buggy guests"?
>>> Yes, sorry.
>>>
>>>> What guests are not buggy in your opinion?
>>>> Linux tries hard to be smart and as a result the only way to have stable
>>>> clock with it is to go paravirt.
>>> I'm not an OS designer, but I think an OS should never crash, even if
>>> a burst of IRQs is received. Reprogramming the timer should consider
>>> the pending IRQ situation (0 or 1 with real HW). Those bugs are one
>>> cause of the problem.
>> OS should never crash in the absence of HW bugs? I doubt you can design
>> an OS that can run in a face of any HW failure. Anyway here we are
>> trying to solve guests time keeping problem not crashes. Do you think
>> you can design OS that can keep time accurately no matter how crazy all
>> HW clock behaves?
> 
> I think my OS design skills are not relevant in this discussion, but
> IIRC there are fault tolerant operating systems for extreme conditions
> so it can be done.

No one can influence the design of released OS versions anymore.

> 
>>>>>> The fact is that timer device is not "just like any
>>>>>> other device" in virtual world. Any other device is easy: you just
>>>>>> implement spec as close as possible and everything works. For time
>>>>>> source device this is not enough. You can implement RTC+HPET to the
>>>>>> letter and your guest will drift like crazy.
>>>>> It's doable: a cycle accurate emulator will not cause any drift,
>>>>> without any voodoo. The interrupts would come after executing the same
>>>>> instruction as the real HW. For emulating any sufficiently buggy
>>>>> guests in any sufficiently desperate low resource conditions, this may
>>>>> be the only option that will always work.
>>>>>
>>>> Yes, but qemu and kvm are not cycle accurate emulators and don't strive
>>>> to be one. On the contrary KVM runs at native host CPU speed most of the
>>>> time, so any emulation done between two instruction is theoretically
>>>> noticeable for a guest. TSC is bypassed directly to a guest too, so
>>>> keeping all time source in perfect sync is also impossible.
>>> That is actually another cause of the problem. KVM gives the guest an
>>> illusion that the VCPU speed is equal to host speed. When they don't
>>> match, especially in critical code, there can be problems. It would be
>>> better to tell the guest a lower speed, which also can be guaranteed.
>>>
>> Not possible. It's that simple. You should take it into account in your
>> architecture design stage. In case of KVM real physical CPU executes guest
>> instruction and it does this as fast as it can. The only way we can hide
>> that from a guest is by intercepting each access to TSC and at that
>> point we can use bochs instead.
> 
> Well, as Paul pointed out, there's also icount option.

Which is not available in virtualization mode.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

  reply	other threads:[~2010-05-29 15:48 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-24 20:13 [Qemu-devel] [RFT][PATCH 00/15] HPET cleanups, fixes, enhancements Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 01/15] hpet: Catch out-of-bounds timer access Jan Kiszka
2010-05-24 20:34   ` [Qemu-devel] " Juan Quintela
2010-05-24 20:36     ` Jan Kiszka
2010-05-24 20:50       ` Juan Quintela
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 02/15] hpet: Coding style cleanups and some refactorings Jan Kiszka
2010-05-24 20:37   ` [Qemu-devel] " Juan Quintela
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 03/15] hpet: Silence warning on write to running main counter Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 04/15] hpet: Move static timer field initialization Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 05/15] hpet: Convert to qdev Jan Kiszka
2010-05-25  9:37   ` Paul Brook
2010-05-25 10:14     ` Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 06/15] hpet: Start/stop timer when HPET_TN_ENABLE is modified Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 07/15] qemu_irq: Add IRQ handlers with delivery feedback Jan Kiszka
2010-05-25  6:07   ` Gleb Natapov
2010-05-25  6:31     ` Jan Kiszka
2010-05-25  6:40       ` Gleb Natapov
2010-05-25  6:54         ` Jan Kiszka
2010-05-25 19:09   ` [Qemu-devel] " Blue Swirl
2010-05-25 20:16     ` Anthony Liguori
2010-05-25 21:44       ` Jan Kiszka
2010-05-26  8:08         ` Gleb Natapov
2010-05-26 20:14           ` Blue Swirl
2010-05-27  5:42             ` Gleb Natapov
2010-05-26 19:55         ` Blue Swirl
2010-05-26 20:09           ` Jan Kiszka
2010-05-26 20:35             ` Blue Swirl
2010-05-26 22:35               ` Jan Kiszka
2010-05-26 23:26               ` Paul Brook
2010-05-27 17:56                 ` Blue Swirl
2010-05-27 18:31                   ` Jan Kiszka
2010-05-27 18:53                     ` Blue Swirl
2010-05-27 19:08                       ` Jan Kiszka
2010-05-27 19:19                         ` Blue Swirl
2010-05-27 22:19                           ` Jan Kiszka
2010-05-28 19:00                             ` Blue Swirl
2010-05-30 12:00                             ` Avi Kivity
2010-05-27 22:21                           ` Paul Brook
2010-05-28 19:10                             ` Blue Swirl
2010-05-27 22:21                   ` Paul Brook
2010-05-27  6:13               ` Gleb Natapov
2010-05-27 18:37                 ` Blue Swirl
2010-05-28  7:31                   ` Gleb Natapov
2010-05-28 20:06                     ` Blue Swirl
2010-05-28 20:47                       ` Gleb Natapov
2010-05-29  7:58                         ` Jan Kiszka
2010-05-29  9:35                           ` Blue Swirl
2010-05-29  9:45                             ` Jan Kiszka
2010-05-29 10:04                               ` Blue Swirl
2010-05-29 10:16                                 ` Jan Kiszka
2010-05-29 10:26                                   ` Blue Swirl
2010-05-29 10:38                                     ` Jan Kiszka
2010-05-29 14:46                             ` Gleb Natapov
2010-05-29 16:13                               ` Blue Swirl
2010-05-29 16:37                                 ` Gleb Natapov
2010-05-29 21:21                                   ` Blue Swirl
2010-05-30  6:02                                     ` Gleb Natapov
2010-05-30 12:10                                       ` Blue Swirl
2010-05-30 12:24                                         ` Jan Kiszka
2010-05-30 12:58                                           ` Blue Swirl
2010-05-31  7:46                                             ` Jan Kiszka
2010-05-30 12:33                                         ` Gleb Natapov
2010-05-30 12:56                                           ` Blue Swirl
2010-05-30 13:49                                             ` Gleb Natapov
2010-05-30 16:54                                               ` Blue Swirl
2010-05-30 19:37                                               ` Blue Swirl
2010-05-30 20:07                                                 ` Gleb Natapov
2010-05-30 20:21                                                   ` Blue Swirl
2010-05-31  5:19                                                     ` Gleb Natapov
2010-06-01 18:00                                                       ` Blue Swirl
2010-06-01 18:30                                                         ` Gleb Natapov
2010-06-02 19:05                                                           ` Blue Swirl
2010-06-03  6:23                                                             ` Jan Kiszka
2010-06-03  6:34                                                               ` Gleb Natapov
2010-06-03  6:59                                                                 ` Jan Kiszka
2010-06-03  7:03                                                                   ` Gleb Natapov
2010-06-03  7:06                                                                     ` Gleb Natapov
2010-06-04 19:05                                                                       ` Blue Swirl
2010-06-05  0:04                                                                         ` Jan Kiszka
2010-06-05  7:20                                                                           ` Blue Swirl
2010-06-05  8:27                                                                             ` Jan Kiszka
2010-06-05  9:23                                                                               ` Blue Swirl
2010-06-05 12:14                                                                                 ` Jan Kiszka
2010-06-06  7:15                                                                           ` Gleb Natapov
2010-06-06  7:39                                                                             ` Jan Kiszka
2010-06-06  7:49                                                                               ` Gleb Natapov
2010-06-06  8:07                                                                                 ` Jan Kiszka
2010-06-06  9:23                                                                                   ` Gleb Natapov
2010-06-06 10:10                                                                                     ` Jan Kiszka
2010-06-06 10:27                                                                                       ` Gleb Natapov
2010-06-06  7:39                                                                             ` Blue Swirl
2010-06-06  8:07                                                                               ` Gleb Natapov
2010-05-30 13:22                                           ` Blue Swirl
2010-05-29  9:15                         ` Blue Swirl
2010-05-29  9:36                           ` Jan Kiszka [this message]
2010-05-29 14:38                           ` Gleb Natapov
2010-05-29 16:03                             ` Blue Swirl
2010-05-29 16:32                               ` Gleb Natapov
2010-05-29 20:52                                 ` Blue Swirl
2010-05-30  5:41                                   ` Gleb Natapov
2010-05-30 11:41                                     ` Blue Swirl
2010-05-30 11:52                                       ` Gleb Natapov
2010-05-30 12:05                           ` Avi Kivity
2010-05-27  5:58             ` Gleb Natapov
2010-05-26 19:49       ` Blue Swirl
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 08/15] x86: Refactor RTC IRQ coalescing workaround Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 09/15] hpet/rtc: Rework RTC IRQ replacement by HPET Jan Kiszka
2010-05-25  9:29   ` Paul Brook
2010-05-25 10:23     ` Jan Kiszka
2010-05-25 11:05       ` Paul Brook
2010-05-25 11:19         ` Jan Kiszka
2010-05-25 11:23           ` Paul Brook
2010-05-25 11:26             ` Jan Kiszka
2010-05-25 12:03               ` Paul Brook
2010-05-25 12:39                 ` Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 10/15] hpet: Drop static state Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 11/15] hpet: Add support for level-triggered interrupts Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 12/15] vmstate: Add VMSTATE_STRUCT_VARRAY_UINT8 Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 13/15] hpet: Make number of timers configurable Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 14/15] hpet: Add MSI support Jan Kiszka
2010-05-24 20:13 ` [Qemu-devel] [RFT][PATCH 15/15] monitor/QMP: Drop info hpet / query-hpet Jan Kiszka
2010-05-24 22:16 ` [Qemu-devel] [RFT][PATCH 00/15] HPET cleanups, fixes, enhancements Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C00E02D.9050804@web.de \
    --to=jan.kiszka@web.de \
    --cc=blauwirbel@gmail.com \
    --cc=gleb@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.