From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: test report for Xen 4.3 RC1
Date: Tue, 2 Jul 2013 09:36:12 -0400
Message-ID: <20130702133612.GA4143@phenom.dumpdata.com>
References: <1B4B44D9196EFF41AE41FDA404FC0A1001B15762@SHSMSX102.ccr.corp.intel.com>
 <20130617142304.GP30071@phenom.dumpdata.com>
 <1B4B44D9196EFF41AE41FDA404FC0A1001B19F25@SHSMSX102.ccr.corp.intel.com>
 <20130621181752.GE15809@phenom.dumpdata.com>
 <1B4B44D9196EFF41AE41FDA404FC0A1001B2B2A8@SHSMSX102.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1B4B44D9196EFF41AE41FDA404FC0A1001B2B2A8@SHSMSX102.ccr.corp.intel.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Ren, Yongjie"
Cc: "george.dunlap@eu.citrix.com", "Xu, YongweiX", "Liu, SongtaoX",
 "Tian, Yongxue", "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On Tue, Jul 02, 2013 at 08:09:48AM +0000, Ren, Yongjie wrote:
> > -----Original Message-----
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > Sent: Saturday, June 22, 2013 2:18 AM
> > To: Ren, Yongjie
> > Cc: george.dunlap@eu.citrix.com; Xu, YongweiX; Liu, SongtaoX;
> > Tian, Yongxue; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] test report for Xen 4.3 RC1
> >
> > > > > > > > > > > http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1851
> > > > > > > > > >
> > > > > > > > > > That looks like you are hitting the udev race.
> > > > > > > > > >
> > > > > > > > > > Could you verify that these patches:
> > > > > > > > > > https://lkml.org/lkml/2013/5/13/520
> > > > > > > > > > fix the issue (they are destined for v3.11)?
> > > > > > > > >
> > > > > > > > > Not tried yet. I'll update you later.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > >
> > > > > > > We tested kernel 3.9.3 with the 2 patches you mentioned, and found
> > > > > > > this bug still exists. For example, we did CPU online-offline for Dom0
> > > > > > > 100 times, and found that 2 of the 100 attempts failed.
> > > > > >
> > > > > > Hm, does it fail b/c udev can't online the sysfs entry?
> > > > > >
> > > > > I think no.
> > > > > When it fails to online CPU #3 (trying to online #1~#3), it doesn't show
> > > > > any info about CPU #3 in the output of the "udevadm monitor --env"
> > > > > command. It does show info about #1 and #2, which are onlined
> > > > > successfully.
> > > >
> > > > And if you re-trigger the 'xl vcpu-set' it eventually comes back up, right?
> > > >
> > > We don't use the 'xl vcpu-set' command when doing the CPU hot-plug.
> > > We just call xc_cpu_online/offline() in tools/libxc/xc_cpu_hotplug.c to test.
> >
> > Oh. That is very different than what I thought. You are not offlining/onlining
> > vCPUs - you are offlining/onlining pCPUs! So Xen has to cram the dom0 vCPUs
> > into the remaining pCPUs.
> >
> > There should be no vCPU re-sizing, correct?
> >
> Yes, for this case we do online/offline for pCPUs, not vCPUs.
> (vCPU number doesn't change.)

OK, so this is nothing to do with Linux but mostly with the Xen hypervisor. Do
you know who added this functionality? Can they help?

> > > (See the attachment with my test code in that bugzilla.)
> > > And, yes, if a CPU failed to online, it can also be onlined again when we
> > > re-trigger the online function.
> > >
> > > > > > > > > > > 4. 'xl vcpu-set' can't decrease the vCPU number of a HVM
> > > > > > > > > > > guest
> > > > > > > > > > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1822
> > > > > > > > > >
> > > > > > > > > > That I believe was a QEMU bug:
> > > > > > > > > > http://lists.xen.org/archives/html/xen-devel/2013-05/msg01054.html
> > > > > > > > > > which should be in QEMU traditional now (05-21 was when it
> > > > > > > > > > went in the tree)
> > > > > > > > >
> > > > > > > > > This bug has existed for at least the past year (in our testing):
> > > > > > > > > 'xl vcpu-set' can't decrease the vCPU number of a HVM guest.
> > > > > > > >
> > > > > > > > Could you retry with Xen 4.3 please?
> > > > > > >
> > > > > > > With Xen 4.3 & Linux 3.10.0-rc3, I can't decrease the vCPU number
> > > > > > > of a guest.
> > > > > >
> > > > > Sorry - when I sent that message, I was still using the RHEL 6.4 kernel
> > > > > in the guest. After upgrading the guest kernel to 3.10.0-rc3, the result
> > > > > became better. Basically, vCPU increment/decrement works fine. I'll
> > > > > close that bug.
> > > >
> > > > Excellent!
> > > >
> > > > > But there's still a minor issue, as follows.
> > > > > After booting the guest with 'vcpus=4' and 'maxvcpus=32', change its
> > > > > vCPU number:
> > > > > # xl vcpu-set $domID 32
> > > > > Then you get fewer than 32 (e.g. 19) CPUs in the guest; if you set the
> > > > > vCPU number to 32 again (from 19), the guest does get 32 vCPUs.
> > > > > But 'xl vcpu-set $domID 8' works fine, as we expected.
> > > > > vCPU decrement has the same result.
> > > > > Can you also have a try to reproduce my issue?
> > > >
> > > This issue doesn't exist when using the latest QEMU traditional tree.
> > > My previous QEMU was old (March 2013), and I found some of your patches
> > > were applied in May 2013. These fixes fix the issue we reported.
> > > Close this bug.
> >
> > Yes!
> > > But it introduced another issue: when doing 'xl vcpu-set' for HVM several
> > > times (e.g. 5 times), the guest will panic. Log is attached.
> > > Before your patches went into the qemu traditional tree in May 2013, we
> > > never met a guest kernel panic.
> > > dom0: 3.10.0-rc3
> > > Xen: 4.3.0-RCx
> > > QEMU: the latest traditional tree
> > > guest kernel: 3.10.0-rc3
> > > Shall I file another bug to track this?
> >
> > Please.
> >
> > > Can you reproduce this?
> >
> > Could you tell me how you are doing 'xl vcpu-set'? Is there a particular
> > test script you are using?
> >
> 1. xl vcpu-set $domID 2
> 2. xl vcpu-set $domID 20
> 3. Repeat steps #1 and #2 several times. (guest kernel panic ...)
>
> I also filed a bug in bugzilla to track this.
> You can get more info at the following link:
> http://bugzilla.xenproject.org/bugzilla/show_bug.cgi?id=1860

OK, thank you. I am a bit busy right now tracking down some other bugs that I
promised I would look after. But after that I should have some time.

> --
> Jay
>
> > > > Sure. Now how many pCPUs do you have? And what version of QEMU
> > > > traditional were you using?
> > > >
> > > There're 32 pCPUs in the system we used.
> > >
> > > Best Regards,
> > > Yongjie (Jay)
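
For reference, the 'xl vcpu-set' repro steps quoted above can be wrapped in a
small POSIX shell loop. This is only a sketch of the reported procedure: the
vcpu_stress function name, the default iteration count, and the XL override are
illustrative, not part of the original report.

```shell
#!/bin/sh
# Sketch of the reported 'xl vcpu-set' stress test: alternate between a low
# and a high vCPU count for a few rounds, which reportedly panics the guest.
# XL can be overridden (e.g. XL=echo) for a dry run on a machine without Xen.
vcpu_stress() {
    domid=$1
    iterations=${2:-5}
    i=0
    while [ "$i" -lt "$iterations" ]; do
        ${XL:-xl} vcpu-set "$domid" 2    # step 1: shrink to 2 vCPUs
        ${XL:-xl} vcpu-set "$domid" 20   # step 2: grow to 20 vCPUs
        i=$((i + 1))
    done
}
```

Running it as `XL=echo vcpu_stress 1 2` just prints the four commands that
would be issued, which is a convenient smoke test of the loop itself.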