From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: Bug: Windows 2003 fails to install on xen-unstable tip
Date: Thu, 25 Apr 2013 17:40:56 +0100
Message-ID: <51795C98.4050909@eu.citrix.com>
References: <516EE63E.4090008@eu.citrix.com>
 <5171332602000078000CEE8A@nat28.tlf.novell.com>
 <51712018.4010600@citrix.com>
 <5171591402000078000CF023@nat28.tlf.novell.com>
 <51714BE5.8080909@eu.citrix.com>
 <517535A702000078000CF75C@nat28.tlf.novell.com>
 <517557F3.9020605@eu.citrix.com>
 <51767DC402000078000CFE48@nat28.tlf.novell.com>
 <517664BE.401@eu.citrix.com>
 <51796FC902000078000D0DD4@nat28.tlf.novell.com>
 <20130425163426.GG37678@ocelot.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20130425163426.GG37678@ocelot.phlegethon.org>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Tim Deegan
Cc: "xen-devel@lists.xen.org", Eddie Dong, Suravee Suthikulpanit, Jan Beulich, Roger Pau Monne
List-Id: xen-devel@lists.xenproject.org

On 04/25/2013 05:34 PM, Tim Deegan wrote:
> At 17:02 +0100 on 25 Apr (1366909369), Jan Beulich wrote:
>>>>> On 23.04.13 at 12:38, George Dunlap wrote:
>>> On 23/04/13 11:25, Jan Beulich wrote:
>>>> Just to double check - could you comment out entirely the first
>>>> (normal code, i.e. not the one marked //todo?) "else if" in
>>>> rtc_periodic_interrupt() (including its body of course)? I would
>>>> expect this to not make a difference, and if so I don't see how
>>>> Windows expects to be woken up again (I would guess that
>>>> they internally have some gating logic preventing the normal IRQ8
>>>> handling to happen, yet of course we don't know what would
>>>> reset that state).
>>>
>>> In fact, when I comment out that region, then it hangs in the guest BIOS
>>> before even attempting to boot the CD.
>>
>> That's due to a different issue, that I meanwhile found a fix for
>> (caused by the general vpt code expecting interrupts to be
>> delivered, yet the BIOS doesn't enable the interrupt, and hence
>> the RTC periodic timer doesn't get advanced to the next tick,
>> and it having an earlier expiration time than the PIT one prevents
>> pt_update_irq() to pick that one up for processing).
>
> How inconvenient - I had been solving the same problem today. Yes, if
> rtc_periodic_interrupt() doesn't actually inject an interrupt, the vpt
> code needs to know so it can increment the timer.
>
>> With that fixed and the mentioned code block removed, things
>> work as I had expected. But the logging that I added in the
>> course of all this shows that it really just happens to work,
>> I can't really explain why (other than the myriads of superfluous
>> interrupts attempted to be injected into the guest keeping the
>> VM alive). In particular, almost none of the injected IRQs
>> actually reach their handler (there are only very few REG_C
>> reads), but the handler also doesn't do anything really
>> interesting (i.e. we don't actually need the handler to execute,
>> we just need to keep a flow of interrupts going into the VM).
>
> Really? Does injecting spurious interrupts work too? Presumably the
> handler does _something_.
>
>> Yet I did verify that the correct values get actually written
>> through vmx_inject_extint()/__vmx_inject_exception().
>>
>> So at this point I'm of the opinion that the RTC changes really
>> just exposed a completely different issue, and I'm of the opinion
>> that this is what needs fixing, not papering over it by reverting
>> the RTC stuff.
>
> On the contrary, I think this shows that the RTC/vpt/guest interactions
> are so badly understood as to merit backing the whole lot out for 4.3.
> This has been dragging on for weeks now, and AFAICS it's going to need a
> serious overhaul, plus someone manually testing all the OSes that are
> likely to be affected (i.e. the crufty old ones).

Yes, I think "first, do no harm" should be what we aim for here...

 -George