From mboxrd@z Thu Jan 1 00:00:00 1970
From: zhanghailiang
Subject: Re: [BUG/RFC] Two cpus are not brought up normally in SLES11 sp3 VM after reboot
Date: Tue, 7 Jul 2015 19:43:35 +0800
Message-ID: <559BBB67.4000503@huawei.com>
References: <559A342C.6020207@huawei.com> <559A4010.30808@redhat.com> <559A516E.1070000@huawei.com> <20150707132344.04476183@nial.brq.redhat.com>
In-Reply-To: <20150707132344.04476183@nial.brq.redhat.com>
To: Igor Mammedov
Cc: Paolo Bonzini, "qemu-devel@nongnu.org"

On 2015/7/7 19:23, Igor Mammedov wrote:
> On Mon, 6 Jul 2015 17:59:10 +0800
> zhanghailiang wrote:
>
>> On 2015/7/6 16:45, Paolo Bonzini wrote:
>>>
>>> On 06/07/2015 09:54, zhanghailiang wrote:
>>>>
>>>> From the host, we found that the QEMU vcpu1 and vcpu7 threads were not
>>>> consuming any cpu (they should be in an idle state).
>>>> All of the VCPUs' stacks on the host look like the following:
>>>>
>>>> [] kvm_vcpu_block+0x65/0xa0 [kvm]
>>>> [] __vcpu_run+0xd1/0x260 [kvm]
>>>> [] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
>>>> [] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
>>>> [] do_vfs_ioctl+0x8b/0x3b0
>>>> [] sys_ioctl+0xa1/0xb0
>>>> [] system_call_fastpath+0x16/0x1b
>>>> [<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
>>>> [] 0xffffffffffffffff
>>>>
>>>> We looked into the kernel code paths that could lead to the above 'Stuck'
>>>> warning,
> in current upstream there isn't any printk(...Stuck...) left, since that code path
> has been reworked.
> I've often seen this on an over-committed host during guest CPU up/down torture tests.
> Could you update the guest kernel to upstream and see if the issue reproduces?
>

Hmm, unfortunately it is very hard to reproduce, and we are still trying to reproduce it.
For your test case, was it a kernel bug? And has any related patch that could
solve your test problem been merged into upstream?

Thanks,
zhanghailiang

>>>> and found that the only possibility is that the emulation of the 'cpuid' instruction in
>>>> kvm/qemu has something wrong.
>>>> But since we can't reproduce this problem, we are not quite sure.
>>>> Is it possible that the cpuid emulation in kvm/qemu has some bug?
>>>
>>> Can you explain the relationship to the cpuid emulation?  What do the
>>> traces say about vcpus 1 and 7?
>>
>> OK, we searched the VM's kernel code for the 'Stuck' message, and it is located in
>> do_boot_cpu(). It runs in BSP context; the call chain is:
>> the BSP executes start_kernel() -> smp_init() -> smp_boot_cpus() -> do_boot_cpu() -> wakeup_secondary_via_INIT() to trigger the APs.
>> It will wait 5s for the APs to start up; if some AP does not start up normally, it prints 'CPU%d Stuck' or 'CPU%d: Not responding'.
>>
>> If it prints 'Stuck', it means the AP has received the SIPI interrupt and begun to execute the code at
>> ENTRY(trampoline_data) (trampoline_64.S), but got stuck somewhere before smp_callin() (smpboot.c).
>> The following is the startup process of the BSP and the APs.
>> BSP:
>> start_kernel()
>>   ->smp_init()
>>     ->smp_boot_cpus()
>>       ->do_boot_cpu()
>>         ->start_ip = trampoline_address();   // set the address the AP will jump to and execute
>>         ->wakeup_secondary_cpu_via_init();   // kick the secondary CPU
>>         ->for (timeout = 0; timeout < 50000; timeout++)
>>             if (cpumask_test_cpu(cpu, cpu_callin_mask)) break;  // check whether the AP has started up
>>
>> APs:
>> ENTRY(trampoline_data) (trampoline_64.S)
>>   ->ENTRY(secondary_startup_64) (head_64.S)
>>     ->start_secondary() (smpboot.c)
>>       ->cpu_init();
>>       ->smp_callin();
>>         ->cpumask_set_cpu(cpuid, cpu_callin_mask);  // Note: if the AP reaches here, the BSP will not print the error message.
>>
>> From the above call process, we can be sure that the AP got stuck between trampoline_data and the cpumask_set_cpu() in
>> smp_callin(). We looked through these code paths carefully, and the only thing we found that could block the process is a 'hlt' instruction.
>> It is located in trampoline_data():
>>
>> ENTRY(trampoline_data)
>> 	...
>>
>> 	call	verify_cpu		# Verify the cpu supports long mode
>> 	testl	%eax, %eax		# Check for return code
>> 	jnz	no_longmode
>>
>> 	...
>>
>> no_longmode:
>> 	hlt
>> 	jmp	no_longmode
>>
>> In verify_cpu(),
>> the only sensitive instruction we can find that could cause a VM exit from non-root mode is 'cpuid'.
>> This is why we suspect that cpuid emulation in KVM/QEMU is wrong, leading to the failure in verify_cpu.
>>
>> From the messages in the VM, we know something is wrong with vcpu1 and vcpu7.
>> [    5.060042] CPU1: Stuck ??
>> [   10.170815] CPU7: Stuck ??
>> [   10.171648] Brought up 6 CPUs
>>
>> Besides, the following is the CPU info we got from the host.
>> 80FF72F5-FF6D-E411-A8C8-000000821800:/home/fsp/hrg # virsh qemu-monitor-command instance-0000000
>> * CPU #0: pc=0x00007f64160c683d thread_id=68570
>>   CPU #1: pc=0xffffffff810301f1 (halted) thread_id=68573
>>   CPU #2: pc=0xffffffff810301e2 (halted) thread_id=68575
>>   CPU #3: pc=0xffffffff810301e2 (halted) thread_id=68576
>>   CPU #4: pc=0xffffffff810301e2 (halted) thread_id=68577
>>   CPU #5: pc=0xffffffff810301e2 (halted) thread_id=68578
>>   CPU #6: pc=0xffffffff810301e2 (halted) thread_id=68583
>>   CPU #7: pc=0xffffffff810301f1 (halted) thread_id=68584
>>
>> Oh, I also forgot to mention in the above message that we have bound each vCPU to a different physical CPU on the
>> host.
>>
>> Thanks,
>> zhanghailiang