On 09.08.12 at 11:42, "zhenzhong.duan" <zhenzhong.duan@oracle.com> wrote:
> On 2012-08-08 23:01, Jan Beulich wrote:
>> On 08.08.12 at 11:48, "zhenzhong.duan" <zhenzhong.duan@oracle.com> wrote:
>>> On 2012-08-07 16:37, Jan Beulich wrote:
>>>>> Some spin at stop_machine after finishing their job.
>>>> And here you'd need to find out what they're waiting for, and what
>>>> those CPUs are doing.
>>> They are waiting for the vcpu calling generic_set_all, and those spin at
>>> set_atomicity_lock. In fact, all are waiting for generic_set_all
>> I think we're moving in circles - what is the vCPU currently in
>> generic_set_all() then doing?
> Added some debug prints, generic_set_all->prepare_set->write_cr0 took much time,
>>>> There's not that much being done in generic_set_all(), so the code
>>>> should finish reasonably quickly. Are you perhaps having more vCPU-s
>>>> in the guest than pCPU-s they can run on?
>>> System env is an Exalogic node with 24 cores + 100G mem (2 sockets,
>>> 6 cores per socket, 2 HT threads per core).
>>> Boot up a pvhvm with 12 vCPUs (or 24) + 90 GB + PCI passthrough device.
>> So you're indeed over-committing the system. How many vCPU-s does your
>> Dom0 have? Are there any other VMs? Is there any vCPU pinning in effect?
> dom0 boots with 24 vcpus (same result with dom0_max_vcpus=4). No other VM
> except dom0. All 24 vcpus spin according to xentop. Below is xentop clip.
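For readers following the over-commit question, the arithmetic can be sketched as below. The topology figures (2 sockets, 6 cores per socket, 2 HT threads per core) and the vCPU counts (24 for Dom0, 12 for the guest) come from the thread; the function names are hypothetical, and "over-commit" is taken here simply as total vCPUs divided by logical pCPUs, which is an assumption about how the term is being used.

```python
# Minimal sketch: compare the vCPUs handed out with the logical CPUs available.
# Topology numbers are those quoted in the thread; helper names are illustrative.

def logical_cpus(sockets, cores_per_socket, threads_per_core):
    """Number of logical pCPUs the hypervisor can schedule on."""
    return sockets * cores_per_socket * threads_per_core


def overcommit_ratio(vcpus_per_domain, pcpus):
    """Total vCPUs across all domains divided by logical pCPUs."""
    return sum(vcpus_per_domain) / pcpus


pcpus = logical_cpus(sockets=2, cores_per_socket=6, threads_per_core=2)

# Dom0 with 24 vCPUs plus a 12-vCPU guest, one combination mentioned above.
ratio = overcommit_ratio([24, 12], pcpus)

print(f"logical pCPUs: {pcpus}, vCPU/pCPU ratio: {ratio:.2f}")
```

A ratio above 1.0 means some vCPUs must wait for a pCPU, which matters when every vCPU is expected to reach a rendezvous (as in stop_machine) at about the same time. With Dom0 alone at 24 vCPUs the ratio is exactly 1.0.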
>>>> Does your hardware support Pause-Loop-Exiting (or the AMD equivalent,
>>>> don't recall their term right now)?
>>> I have no access to serial line, could I get the info by a command?
>> "xl dmesg" run early enough (i.e. before the log buffer wraps).
> Below is xl dmesg result for your reference. Thanks
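As a rough illustration of the "xl dmesg" suggestion, the sketch below captures the hypervisor log and searches it for a Pause-Loop-Exiting mention. The helper names are hypothetical, and the search pattern is an assumption: the exact wording of the boot-time feature line may differ between Xen versions, so a miss does not prove the feature is absent, especially if the log buffer has already wrapped.

```python
import re
import subprocess


def read_xl_dmesg():
    """Return the Xen hypervisor log as text (needs root in Dom0)."""
    result = subprocess.run(
        ["xl", "dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


def mentions_ple(log_text):
    """True if the log text contains a Pause-Loop-Exiting style feature line.

    The pattern is an assumed, deliberately loose match ("pause-loop" or
    "pause loop", any case), not a documented log format.
    """
    return re.search(r"pause[- ]loop", log_text, re.IGNORECASE) is not None
```

Calling mentions_ple(read_xl_dmesg()) shortly after boot gives a quick yes/no; reading the full "xl dmesg" output by hand remains the more reliable check.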