From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1Ky1JB-0000Fh-TE
	for qemu-devel@nongnu.org; Thu, 06 Nov 2008 04:37:45 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1Ky1JA-0000FV-PV
	for qemu-devel@nongnu.org; Thu, 06 Nov 2008 04:37:44 -0500
Received: from [199.232.76.173] (port=50596 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ky1JA-0000FS-JY
	for qemu-devel@nongnu.org; Thu, 06 Nov 2008 04:37:44 -0500
Received: from rv-out-0708.google.com ([209.85.198.246]:21199)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from <balrogg@gmail.com>) id 1Ky1JA-00039I-U0
	for qemu-devel@nongnu.org; Thu, 06 Nov 2008 04:37:45 -0500
Received: by rv-out-0708.google.com with SMTP id f25so491902rvb.22
	for <qemu-devel@nongnu.org>; Thu, 06 Nov 2008 01:37:43 -0800 (PST)
Message-ID: <fb249edb0811060137m20e9ac75we6c7fdd38b8796d5@mail.gmail.com>
Date: Thu, 6 Nov 2008 10:37:43 +0100
From: "andrzej zaborowski" <balrogg@gmail.com>
Subject: Re: [Qemu-devel] [RESEND][PATCH 0/3] Fix guest time drift under heavy
	load.
In-Reply-To: <20081106071624.GC3820@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20081029152236.14831.15193.stgit@dhcp-1-237.local>
	<490B59BF.3000205@codemonkey.ws> <20081102130441.GD16809@redhat.com>
	<49119551.2070704@redhat.com>
	<fb249edb0811050748k1e2b8d77nd68dddc01af06e08@mail.gmail.com>
	<20081106071624.GC3820@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Gleb Natapov <gleb@redhat.com>
Cc: dlaor@redhat.com, qemu-devel@nongnu.org

2008/11/6 Gleb Natapov <gleb@redhat.com>:
> On Wed, Nov 05, 2008 at 04:48:32PM +0100, andrzej zaborowski wrote:
>> > Btw: I ack the whole thing, including the problem, the scenario and the
>> > solution.
>>
>> I don't, as far as I understand it's a -win2k-hack type of addition,
>> i.e. the hardware doesn't do this but we want to improve usability by
>> working around a bad guest behaviour.  Modifying qemu_irq abstraction
>> doesn't sound like the right place for that, qemu_irq contrary to what
>> the name suggests doesn't have to be connected to any interrupt.
>>
> It is nothing like a -win2k-hack since there is no any guest "bad
> behaviour" that cause the problem. Yes real hardware doesn't do this,
> but real hardware also provides OS with enough CPU power to handle every
> single timer interrupt.

A guest that counts on having enough CPU for something is
timing-dependent (buggy).

> And even if _some_ interrupts are dropped the
> drift is easily fixed with NTP. Try to run Windows XP on very slow machine
> and I am sure you'll see very noticeable time drift.

Exactly.  You'll find the drift on real hardware, so you should find
it in the emulator too.  You're trying to hack around it.

Linux doesn't see this because the clocksource and the
clockevents device come from separate clocks there.  It is a Windows
problem.  It *is* "bad behaviour".

>
>> Instead you can have the interrupt sources register a callback in the
>> PIC that the PIC calls when the interrupt wasn't delivered.  Or.. in
> It requires the mapping from interrupt vector inside the PIC to
> interrupt source.

Of course.

> This approach was rejected long time ago.

Then you'll have to find a different one.  qemu_irq is the wrong place.

>
>> the case of mc146818rtc.c wouldn't it be enough to check if the irq
>> has been acked by reading RTC_REG_C?  e.g.
>>
>> static void rtc_periodic_timer(void *opaque)
>> {
>>     RTCState *s = opaque;
>>
>>     rtc_timer_update(s, s->next_periodic_time);
>> +   if (s->cmos_data[RTC_REG_C] & 0xc0)
>> +         s->irq_coalesced++;
>>     s->cmos_data[RTC_REG_C] |= 0xc0;
>>     qemu_irq_raise(s->irq);
>> }
>>
> PIC/APIC in effect has a queue of one interrupt. This means that if
> timer tick is still not acknowledged it doesn't mean that interrupt
> was not queued for delivery inside a PIC.

This doesn't matter.  A tick that arrives while the previous interrupt
has not been acked yet is lost anyway, i.e. it has been coalesced.  So
this'll give you the right number of interrupts to re-inject.

Of course this, as well as your approach, is wrong because the
guest may be intentionally ignoring the irq and expecting the
interrupts to coalesce.  Once it starts processing the RTC interrupts
it will get an unexpected storm.
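
A rough sketch of the size of that storm, as a toy model only:
interrupts_after_unmask() is a hypothetical helper, and it assumes
that every coalesced tick is re-injected once the guest starts acking.

```c
/* Hypothetical model: the guest ignores the RTC irq for `ignored_ticks`
 * timer periods and then starts acking.  Returns how many interrupts it
 * receives for that interval if every coalesced tick is re-injected. */
static unsigned interrupts_after_unmask(unsigned ignored_ticks)
{
    unsigned coalesced = 0;
    int pending = 0;

    for (unsigned t = 0; t < ignored_ticks; t++) {
        if (pending) {
            coalesced++;   /* previous tick was never acked */
        }
        pending = 1;
    }

    /* the one pending interrupt plus all the re-injected ones */
    return (pending ? 1u : 0u) + coalesced;
}
```

A guest that ignored 1000 ticks and expected them to coalesce into a
single interrupt gets 1000 in this model.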

Cheers