From: Sam King
Date: Tue, 28 Sep 2010 14:57:57 -0500
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] PATCH: debugging apic
Message-ID: <4CA248C5.2020409@gmail.com>
In-Reply-To: <4CA2069D.9040104@uiuc.edu>

Thanks to Bernhard Kauer for pointing out the problem.  Apparently if software disables LVT_LINT0 while a CPU_INTERRUPT_HARD request is pending, you can get into trouble.
I attached a patch that fixes the problem by clearing the pending CPU_INTERRUPT_HARD bit in interrupt_request when LINT0 gets masked.  I am not sure if we need to do the same for LINT1, but this fixed the spurious GPF I was getting.
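
A rough sketch of the race described above, as a toy model rather than QEMU code. Every name and value here (toy_cpu, TOY_INTERRUPT_HARD, the 0x20 vector, and so on) is made up for illustration; the only things taken from the message are the ordering of events and the idea of dropping the pending request when LINT0 is masked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names and values are illustrative, not QEMU's. */
#define TOY_INTERRUPT_HARD 0x02u        /* "a PIC interrupt is pending" flag */
#define TOY_LVT_MASKED     (1u << 16)   /* mask bit in an LVT entry */
#define TOY_NOTHING        (-2)         /* no delivery attempted */

struct toy_cpu {
    unsigned interrupt_request;  /* pending-request bits seen by the CPU loop */
    unsigned lvt_lint0;          /* LINT0 entry of the local vector table */
};

/* The PIC asserts INTR: remember that a hard interrupt is pending. */
static void toy_pic_raise(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->interrupt_request |= TOY_INTERRUPT_HARD;
}

/* Guest writes LINT0. With the fix, masking it drops the stale request. */
static void toy_write_lint0(struct toy_cpu *cpu, unsigned val, bool with_fix)
{
    assert(cpu != NULL);
    cpu->lvt_lint0 = val;
    if (with_fix && (val & TOY_LVT_MASKED) &&
        (cpu->interrupt_request & TOY_INTERRUPT_HARD)) {
        cpu->interrupt_request &= ~TOY_INTERRUPT_HARD;
    }
}

/*
 * CPU loop step: returns the vector to deliver, -1 when a request was
 * pending but LINT0 refuses PIC interrupts, or TOY_NOTHING when idle.
 */
static int toy_service(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    if (!(cpu->interrupt_request & TOY_INTERRUPT_HARD)) {
        return TOY_NOTHING;
    }
    cpu->interrupt_request &= ~TOY_INTERRUPT_HARD;
    if (cpu->lvt_lint0 & TOY_LVT_MASKED) {
        return -1;               /* the bogus vector that led to the GPF */
    }
    return 0x20;                 /* arbitrary example vector */
}

/* Replays the sequence: raise, then mask LINT0, then service. */
static int toy_replay(bool with_fix)
{
    struct toy_cpu cpu = { 0, 0 };
    toy_pic_raise(&cpu);
    toy_write_lint0(&cpu, TOY_LVT_MASKED, with_fix);
    return toy_service(&cpu);
}
```

In this model toy_replay(false) hands back -1 as the vector, while toy_replay(true) finds nothing pending, which is the behavior the patch is aiming for.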

--Sam

On 9/28/10 10:15 AM, Sam King wrote:
Hello,

I am seeing a weird crash in my system and I am trying to figure out if it is a software bug or a QEMU emulation bug.  From the software perspective I am getting a GP fault at a point where it looks like everything should be running normally.  After digging into the QEMU source code I found out where the GPF was coming from.  It looks like intno = -1 when it was being passed into do_interrupt64, which was triggering one of the GPF checks.  From what I can tell, intno was being set to -1 while handling an interrupt_request in cpu-exec.c, which was taking the following else-if branch around line 409 of that file:

else if ((interrupt_request & CPU_INTERRUPT_HARD) &&
         (((env->hflags2 & HF2_VINTR_MASK) &&
           (env->hflags2 & HF2_HIF_MASK)) ||
          (!(env->hflags2 & HF2_VINTR_MASK) &&
           (env->eflags & IF_MASK &&
            !(env->hflags & HF_INHIBIT_IRQ_MASK)))))

and from within that else if statement, env has the following state:

hflags2 = 0x00000001
eflags = 0x00003202
hflags = 0x0040c0b7
interrupt_request = 0x00000002
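
Plugging the reported values into that condition shows why the branch is taken. The sketch below is a standalone restatement, not QEMU code. The mask values are assumptions recalled from QEMU's i386 headers of that era (CPU_INTERRUPT_HARD = 0x02, HF2_HIF_MASK = bit 1, HF2_VINTR_MASK = bit 3, IF_MASK = 0x200, HF_INHIBIT_IRQ_MASK = bit 3) and should be checked against the tree in use; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask values; verify against target-i386/cpu.h and cpu-all.h. */
#define CPU_INTERRUPT_HARD   0x02u
#define HF2_HIF_MASK         (1u << 1)
#define HF2_VINTR_MASK       (1u << 3)
#define IF_MASK              0x00000200u
#define HF_INHIBIT_IRQ_MASK  (1u << 3)

/* Same boolean structure as the else-if above, on plain integers. */
static int hard_irq_deliverable(uint32_t interrupt_request, uint32_t hflags,
                                uint32_t hflags2, uint32_t eflags)
{
    return (interrupt_request & CPU_INTERRUPT_HARD) &&
           (((hflags2 & HF2_VINTR_MASK) && (hflags2 & HF2_HIF_MASK)) ||
            (!(hflags2 & HF2_VINTR_MASK) &&
             ((eflags & IF_MASK) && !(hflags & HF_INHIBIT_IRQ_MASK))));
}

/* Checks the state listed above, then the same state with IF cleared. */
static void check_reported_state(void)
{
    /* VINTR clear, IF set, IRQs not inhibited: the branch is taken. */
    assert(hard_irq_deliverable(0x00000002u, 0x0040c0b7u,
                                0x00000001u, 0x00003202u));
    /* Clearing IF (0x3202 -> 0x3002) would block the branch. */
    assert(!hard_irq_deliverable(0x00000002u, 0x0040c0b7u,
                                 0x00000001u, 0x00003002u));
}
```

Under those assumed masks, the condition itself behaves as expected for this state: a hard interrupt is pending and the guest has interrupts enabled.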

But cpu_get_pic_interrupt is returning -1 for intno, because its call to apic_accept_pic_intr returns 0.  If I change the cpu_get_pic_interrupt code to this:

int cpu_get_pic_interrupt(CPUState *env)
{
    int intno;

    intno = apic_get_interrupt(env);
    if (intno >= 0) {
        /* set irq request if a PIC irq is still pending */
        /* XXX: improve that */
        pic_update_irq(isa_pic);
        return intno;
    }
    /* read the irq from the PIC */
    if (!apic_accept_pic_intr(env)) {
        //return -1;
    }

    intno = pic_read_irq(isa_pic);

    return intno;
}

Then the issue manifests as a spurious interrupt and the software ignores it, avoiding the GPF.  Does anyone have any ideas as to what is going wrong here?  Should I look more closely at the QEMU emulation code or my software?  Any help is appreciated.
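
One plausible reason intno = -1 ends in a GPF is that an IDT bounds check compares a signed expression against an unsigned limit. The sketch below assumes the check has the form intno * 16 + 15 > limit with a 32-bit unsigned limit, which is how I recall do_interrupt64, but this is not confirmed against the source here. The function names and the 256-entry limit are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if a 64-bit IDT lookup for intno is out of bounds. */
static int toy_idt_out_of_bounds(int intno, uint32_t idt_limit)
{
    /* Each long-mode gate is 16 bytes; its last byte must be within the limit. */
    uint32_t last_byte = (uint32_t)(intno * 16 + 15);  /* -1 wraps to 0xFFFFFFFF */
    return last_byte > idt_limit;
}

/* Valid vectors pass the check; the bogus -1 vector does not. */
static void toy_idt_examples(void)
{
    const uint32_t limit = 256 * 16 - 1;   /* assumed full 256-entry IDT */
    assert(!toy_idt_out_of_bounds(0x20, limit));
    assert(!toy_idt_out_of_bounds(255, limit));
    assert(toy_idt_out_of_bounds(-1, limit));
}
```

If the real check has this shape, any negative intno fails it regardless of how large the guest's IDT is, which would match a GPF appearing while the software otherwise looks healthy.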

Thanks!

--Sam

Attachment: apic-race.patch

*** hw/apic.c 2010-07-22 07:39:04.000000000 -0500
--- ../qemu-0.12.5-fixed/hw/apic.c 2010-09-28 14:45:55.476945540 -0500
***************
*** 841,846 ****
--- 841,851 ----
              s->lvt[n] = val;
              if (n == APIC_LVT_TIMER)
                  apic_timer_update(s, qemu_get_clock(vm_clock));
+ 
+             if(n == APIC_LVT_LINT0) {
+                 if((val & APIC_LVT_MASKED) && (env->interrupt_request & CPU_INTERRUPT_HARD))
+                     cpu_reset_interrupt(env, CPU_INTERRUPT_HARD);
+             }
          }
          break;
      case 0x38: