Date: Wed, 5 Nov 2008 16:48:32 +0100
From: "andrzej zaborowski"
Subject: Re: [Qemu-devel] [RESEND][PATCH 0/3] Fix guest time drift under heavy load.
In-Reply-To: <49119551.2070704@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20081029152236.14831.15193.stgit@dhcp-1-237.local>
 <490B59BF.3000205@codemonkey.ws>
 <20081102130441.GD16809@redhat.com>
 <49119551.2070704@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: Gleb Natapov, dlaor@redhat.com, qemu-devel@nongnu.org

2008/11/5 Dor Laor:
> Gleb Natapov wrote:
>
> On Fri, Oct 31, 2008 at 02:17:19PM -0500, Anthony Liguori wrote:
>
> Gleb Natapov wrote:
>
> Qemu device emulation for timers might be inaccurate and
> causes coalescing of several IRQs into one. It happens when the
> load on the host is high and the guest did not manage to ack the
> previous IRQ. The problem can be reproduced by copying of a big
> file or many small ones inside Windows guest. When you do that guest
> clock start to lag behind the host one.
>
> The first patch in the series changes qemu_irq subsystem to return
> IRQ delivery status information. If device is notified that IRQs
> where lost it can regenerate them as needed. The following two
> patches add IRQ regeneration to PIC and RTC devices.
>
> I don't think any of the problems raised when this was initially posted.
>
> So? I raise them now. Have you tried suggested scenario and was able to
> reproduce the problem?
>
> It is the same issue, just another scenario.
>
> Further, I don't think that always playing catch-up with interrupts is
> always the best course of action.
>
> Agree. Playing catch-up with interrupts is not always the best course of
> action. But sometimes there is no other choice.
>
> As I've said repeatedly in the past, any sort of time drift fixes needs
> to have a lot of data posted with it that is repeatable.
>
> How much does this improve things with Windows?
>
> The time drift is eliminated. If there is a spike in a load time may
> slow down, but after that it catches up (this happens only during very
> high loads though).
>
> Gleb, can you please provide more details:
> - What's the host's kernel version exactly (including the high-res, dyn tick
> configured)
> - What's the windows version? Is it standard HAL (pit) or ACPI (rtc) or
> both?
> - The detailed scenario you use (example: I copied the entire c:/windows
> directory, etc)
> - Without the patch, what the time drift after x seconds on the host.
> - With the patch, is there a drift? Is there increased cpu consumption, etc
>
> Btw: I ack the whole thing, including the problem, the scenario and the
> solution.

I don't. As far as I understand, it's a -win2k-hack type of addition,
i.e. the hardware doesn't do this, but we want to improve usability by
working around bad guest behaviour.
Modifying the qemu_irq abstraction doesn't sound like the right place
for that: contrary to what the name suggests, a qemu_irq doesn't have
to be connected to any interrupt. Instead, you can have the interrupt
sources register a callback in the PIC, which the PIC calls when an
interrupt wasn't delivered.

Or, in the case of mc146818rtc.c, wouldn't it be enough to check
whether the irq has been acked by reading RTC_REG_C? E.g.:

 static void rtc_periodic_timer(void *opaque)
 {
     RTCState *s = opaque;

     rtc_timer_update(s, s->next_periodic_time);
+    /* Flags still set: the guest has not acked the previous IRQ yet */
+    if (s->cmos_data[RTC_REG_C] & 0xc0)
+        s->irq_coalesced++;
     s->cmos_data[RTC_REG_C] |= 0xc0;
     qemu_irq_raise(s->irq);
 }

Cheers
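
P.S. To make the callback idea above a bit more concrete, here is a toy,
self-contained sketch. None of these names (ToyPIC, ToyTimer,
toy_pic_register_lost_cb and friends) exist in qemu; they are made up
for illustration. The "one pending bit per line" model is also an
assumption and is much simpler than a real i8259.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_IRQS 16

/* Hypothetical callback type: called when an IRQ could not be delivered. */
typedef void (*irq_lost_cb)(void *opaque, int irq);

/* Toy interrupt controller, NOT qemu's i8259 model. */
typedef struct ToyPIC {
    unsigned pending;               /* bit n set: IRQ n raised, not acked */
    irq_lost_cb lost_cb[NUM_IRQS];  /* per-line "interrupt lost" callback */
    void *lost_opaque[NUM_IRQS];
} ToyPIC;

/* Toy interrupt source that counts the ticks it still owes the guest. */
typedef struct ToyTimer {
    int coalesced;
} ToyTimer;

/* The interrupt source registers its callback with the PIC. */
static void toy_pic_register_lost_cb(ToyPIC *pic, int irq,
                                     irq_lost_cb cb, void *opaque)
{
    assert(irq >= 0 && irq < NUM_IRQS);
    pic->lost_cb[irq] = cb;
    pic->lost_opaque[irq] = opaque;
}

/* Raise a line; if the previous IRQ is still pending, report it as lost. */
static void toy_pic_raise(ToyPIC *pic, int irq)
{
    assert(irq >= 0 && irq < NUM_IRQS);
    unsigned mask = 1u << irq;

    if (pic->pending & mask) {
        /* Guest has not acked the previous one: this IRQ is coalesced. */
        if (pic->lost_cb[irq] != NULL)
            pic->lost_cb[irq](pic->lost_opaque[irq], irq);
        return;
    }
    pic->pending |= mask;
}

/* Guest acknowledges the interrupt. */
static void toy_pic_ack(ToyPIC *pic, int irq)
{
    assert(irq >= 0 && irq < NUM_IRQS);
    pic->pending &= ~(1u << irq);
}

/* Callback supplied by the timer: just remember that a tick was lost. */
static void toy_timer_irq_lost(void *opaque, int irq)
{
    ToyTimer *t = opaque;
    (void)irq;
    t->coalesced++;
}

/* Three ticks without an ack, then an ack and one more tick. */
static int toy_demo(void)
{
    ToyPIC pic = {0};
    ToyTimer timer = {0};

    toy_pic_register_lost_cb(&pic, 8, toy_timer_irq_lost, &timer);
    toy_pic_raise(&pic, 8);   /* delivered */
    toy_pic_raise(&pic, 8);   /* lost: still pending */
    toy_pic_raise(&pic, 8);   /* lost: still pending */
    toy_pic_ack(&pic, 8);
    toy_pic_raise(&pic, 8);   /* delivered again */
    return timer.coalesced;   /* number of ticks the timer must re-inject */
}
```

The point of the sketch is only that qemu_irq itself stays untouched:
the lost-interrupt bookkeeping lives between the PIC and the device
that cares about it, and devices that don't care register nothing.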