kvm.vger.kernel.org archive mirror
  • * Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
    From: Gonglei (Arei) @ 2015-12-19  1:08 UTC (permalink / raw)
      To: Kevin O'Connor, Xulei (Stone), Paolo Bonzini
      Cc: seabios@seabios.org, Huangweidong (C), qemu-devel,
    	kvm@vger.kernel.org
    
    Hi Kevin & Paolo,
    
    Luckily, I reproduced this problem last night and captured the log below while SeaBIOS was stuck.
    
    [BTW, the whole SeaBIOS log is attached.]
    
    [2015-12-18 10:38:10] >>>>>gonglei: enter smp_setup()...
    [2015-12-18 10:38:10] >>>>>gonglei: begine to enable local APIC...
    [2015-12-18 10:38:10] >>>>>gonglei: finish enable local APIC...
    [2015-12-18 10:38:10] >>>gonglei: cmos_smp_count=8
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=5
    [2015-12-18 10:38:10] ===: CountCPUs=2, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=7
    [2015-12-18 10:38:10] ===: CountCPUs=3, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=1
    [2015-12-18 10:38:10] ===: CountCPUs=4, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=2
    [2015-12-18 10:38:10] ===: CountCPUs=5, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=4
    [2015-12-18 10:38:10] ===: CountCPUs=6, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=3
    [2015-12-18 10:38:10] ===: CountCPUs=7, SMPStack=0x6d84
    [2015-12-18 10:38:10] >>> enter handle_smp...
    [2015-12-18 10:38:10] handle_smp: apic_id=6
    [2015-12-18 10:38:10] ===: CountCPUs=8, SMPStack=0x6d84
    [2015-12-18 10:38:10]  gonglei: finish while   
    
    [pid 31509 is the vcpu thread that is using 100% CPU]
    
    # cat /proc/31509/stack  
    [<ffffffffa05c337c>] vmx_vcpu_run+0x35c/0x580 [kvm_intel]
    [<ffffffffa06b0f10>] em_push+0x0/0x20 [kvm]
    [<ffffffffa06a30dc>] x86_emulate_instruction+0x20c/0x440 [kvm]
    [<ffffffffa05c9224>] handle_exception+0xe4/0x1b58 [kvm_intel]
    [<ffffffffa06a24c5>] vcpu_enter_guest+0x565/0x790 [kvm]
    [<ffffffffa05bf990>] vmx_get_segment_base+0x0/0xb0 [kvm_intel]
    [<ffffffffa06a2888>] __vcpu_run+0x198/0x260 [kvm]
    [<ffffffffa06a3508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
    [<ffffffffa068f92e>] vcpu_load+0x4e/0x80 [kvm]
    [<ffffffffa068fcee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
    [<ffffffff8109618d>] futex_wake+0xfd/0x110
    [<ffffffff811ed57c>] security_file_permission+0x1c/0xa0
    [<ffffffff8116bedb>] do_vfs_ioctl+0x8b/0x3b0
    [<ffffffff8116c2a1>] sys_ioctl+0xa1/0xb0
    [<ffffffff81469272>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff
    
    And kvm tracing information:
    
    <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180078: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180078: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180079: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180079: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180079: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180080: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180080: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180080: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180081: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180081: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180081: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180081: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180082: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180083: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180083: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180083: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180084: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180084: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180084: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180084: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180085: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    <...>-31509 [035] 154753.180085: kvm_emulate_insn: 0:3:f0 53 (real)
    <...>-31509 [035] 154753.180085: kvm_inj_exception: #UD (0x0)
    <...>-31509 [035] 154753.180085: kvm_entry: vcpu 0
    <...>-31509 [035] 154753.180086: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
    
    Now it's very clear that the guest is stuck in yield() ("finish while" is logged, but "finish yield" never appears), and that KVM then repeatedly encounters a #UD exception.
    
    Do you have any thoughts? Thanks!
    
    
    The SeaBIOS debug patch is below:
    
    diff --git a/roms/seabios/src/boot.c b/roms/seabios/src/boot.c
    index f23e9e1..552914a 100644
    --- a/roms/seabios/src/boot.c
    +++ b/roms/seabios/src/boot.c
    @@ -93,7 +93,7 @@ glob_prefix(const char *glob, const char *str)
     static int
     find_prio(const char *glob)
     {
    -    dprintf(1, "Searching bootorder for: %s\n", glob);
    +    //dprintf(1, "Searching bootorder for: %s\n", glob);
         int i;
         for (i = 0; i < BootorderCount; i++)
             if (glob_prefix(glob, Bootorder[i]))
    diff --git a/roms/seabios/src/fw/smp.c b/roms/seabios/src/fw/smp.c
    index a466ea6..46ec607 100644
    --- a/roms/seabios/src/fw/smp.c
    +++ b/roms/seabios/src/fw/smp.c
    @@ -46,12 +46,16 @@ int apic_id_is_present(u8 apic_id)
         return !!(FoundAPICIDs[apic_id/32] & (1ul << (apic_id % 32)));
     }
     
    +// Atomic lock for shared stack across processors.
    +u32 SMPLock __VISIBLE;
    +u32 SMPStack __VISIBLE;
    +
     void VISIBLE32FLAT
     handle_smp(void)
     {
         if (!CONFIG_QEMU)
             return;
    -
    +    dprintf(1, ">>> enter handle_smp...\n");
         // Enable CPU caching
         setcr0(getcr0() & ~(CR0_CD|CR0_NW));
     
    @@ -70,19 +74,16 @@ handle_smp(void)
         FoundAPICIDs[apic_id/32] |= (1 << (apic_id % 32));
     
         CountCPUs++;
    +    dprintf(1, "===: CountCPUs=%d, SMPStack=0x%x\n", CountCPUs, SMPStack);
     }
     
    -// Atomic lock for shared stack across processors.
    -u32 SMPLock __VISIBLE;
    -u32 SMPStack __VISIBLE;
    -
     // find and initialize the CPUs by launching a SIPI to them
     void
     smp_setup(void)
     {
         if (!CONFIG_QEMU)
             return;
    -
    +    dprintf(1, ">>>>>gonglei: enter smp_setup()...\n");
         ASSERT32FLAT();
         u32 eax, ebx, ecx, cpuid_features;
         cpuid(1, &eax, &ebx, &ecx, &cpuid_features);
    @@ -106,7 +107,7 @@ smp_setup(void)
         u64 new = (0xea | ((u64)SEG_BIOS<<24)
                    | (((u32)entry_smp - BUILD_BIOS_ADDR) << 8));
         *(u64*)BUILD_AP_BOOT_ADDR = new;
    -
    +    dprintf(1, ">>>>>gonglei: begine to enable local APIC...\n");
         // enable local APIC
         u32 val = readl(APIC_SVR);
         writel(APIC_SVR, val | APIC_ENABLED);
    @@ -125,10 +126,21 @@ smp_setup(void)
         writel(APIC_ICR_LOW, 0x000C4500);
         u32 sipi_vector = BUILD_AP_BOOT_ADDR >> 12;
         writel(APIC_ICR_LOW, 0x000C4600 | sipi_vector);
    -
    +    dprintf(1, ">>>>>gonglei: finish enable local APIC...\n");
         // Wait for other CPUs to process the SIPI.
         u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
    -    while (cmos_smp_count != CountCPUs)
    +    dprintf(1, ">>>gonglei: cmos_smp_count=%d\n", cmos_smp_count);
    +    while (cmos_smp_count != CountCPUs) {
             asm volatile(
                 // Release lock and allow other processors to use the stack.
                 "  movl %%esp, %1\n"
    @@ -139,8 +151,11 @@ smp_setup(void)
                 "  jc 1b\n"
                 : "+m" (SMPLock), "+m" (SMPStack)
                 : : "cc", "memory");
    +       //yield();
    +    }
    +    dprintf(1, " gonglei: finish while\n");
         yield();
    -
    +    dprintf(1, " gonglei: finish yield\n");
         // Restore memory.
         *(u64*)BUILD_AP_BOOT_ADDR = old;
     
    diff --git a/roms/seabios/src/misc.c b/roms/seabios/src/misc.c
    index 8caaf31..77f6be3 100644
    --- a/roms/seabios/src/misc.c
    +++ b/roms/seabios/src/misc.c
    @@ -64,6 +64,10 @@ void VISIBLE16
     handle_02(void)
     {
         debug_isr(DEBUG_ISR_02);
    +    dprintf(1, "gonglei hand nmi inject, write rtc \n");
     }
     
     void
    diff --git a/roms/seabios/src/stacks.c b/roms/seabios/src/stacks.c
    index 1dbdfe9..c1b5203 100644
    --- a/roms/seabios/src/stacks.c
    +++ b/roms/seabios/src/stacks.c
    @@ -174,6 +174,7 @@ call16_smm(u32 eax, u32 edx, void *func)
     static void
     call32_sloppy_prep(void)
     {
    +    dprintf(1, ">>> enter call32_sloppy_prep...\n");
         // Backup cmos index register and disable nmi
         u8 cmosindex = inb(PORT_CMOS_INDEX);
         outb(cmosindex | NMI_DISABLE_BIT, PORT_CMOS_INDEX);
    
    
    Regards,
    -Gonglei
    
    
    > -----Original Message-----
    > From: Kevin O'Connor [mailto:kevin@koconnor.net]
    > Sent: Thursday, November 19, 2015 9:41 PM
    > To: Xulei (Stone)
    > Cc: Gonglei (Arei); qemu-devel; seabios@seabios.org; Huangweidong (C)
    > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
    > problem on qemu-kvm platform
    > 
    > On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
    > > Kevin,
    > >
    > > After analyzing this in depth, I think there are 3 possible causes:
    > > 1) A wrong CountCPUs value. CountCPUs++ in handle_smp() does not appear
    > > to be protected by a lock, so 2 or more vcpus may sometimes read the
    > > same current value of CountCPUs. We would then get a single increment
    > > instead of 2 or more, and "while (cmos_smp_count != CountCPUs)" would
    > > loop forever;
    > 
    > The handle_smp() code is called from romlayout.S:entry_smp() which does take
    > a lock.  So, all of handle_smp() should run synchronous.
    > 
    > > 2) A wrong cmos_smp_count value. Does the SeaBIOS rtc read return an
    > > incorrect number?
    > 
    > Not sure - the last time there were problems in this area of the code others
    > used kvmtrace to try and track this down.  Since you are getting dprintf
    > statements, you could also try outputting cmos_smp_count prior to the loop
    > (see patch below).
    > 
    > > 3) yield() is stuck. Is it possible that SeaBIOS gets stuck during yield()?
    > > In my tests, SeaBIOS does not seem to have created any threads other than
    > > the main thread by the time yield() runs, so I don't know what the
    > > purpose of yield() is here.
    > 
    > The yield() allows hardware interrupts to occur.  But note that
    > yield() isn't called in the loop - it is only called after the loop completes.
    > 
    > If you are only getting this on massive repetitive reboot requests, there are
    > some other possible explanations:
    > 
    > - perhaps the SIPI is getting lost because one of the CPUs is still
    >   resetting or still processing a SIPI from the last reboot?
    > 
    > - the seabios code itself may have been corrupted if the memcpy() in
    >   qemu_prep_reset() got far enough along to clear HaveRunPost, but did
    >   not get far enough along to fully complete the memcpy().
    > 
    > If the failure is reproducible, the patch below could help narrow the
    > possibilities.
    > 
    > -Kevin
    > 
    > 
    > --- a/src/fw/smp.c
    > +++ b/src/fw/smp.c
    > @@ -125,6 +125,7 @@ smp_setup(void)
    > 
    >      // Wait for other CPUs to process the SIPI.
    >      u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
    > +    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
    >      while (cmos_smp_count != CountCPUs)
    >          asm volatile(
    >              // Release lock and allow other processors to use the stack.
    > @@ -136,6 +137,7 @@ smp_setup(void)
    >              "  jc 1b\n"
    >              : "+m" (SMPLock), "+m" (SMPStack)
    >              : : "cc", "memory");
    > +    dprintf(1, "finish smp\n");
    >      yield();
    > 
    >      // Restore memory.
    

