From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Egger
Subject: Re: libxl: cannot start guest
Date: Tue, 22 May 2012 15:21:33 +0200
Message-ID: <4FBB92DD.3060100@amd.com>
References: <4FB64BDC.6010801@amd.com>
 <1337347821.22316.122.camel@zakaz.uk.xensource.com>
 <4FB65B61.7000902@amd.com>
 <4FB66FED.5080704@amd.com>
 <1337356698.22316.138.camel@zakaz.uk.xensource.com>
 <4FBA185A.3080306@amd.com>
 <1337602541.24660.105.camel@zakaz.uk.xensource.com>
 <4FBA3EC8.3060104@amd.com>
 <1337608191.24660.138.camel@zakaz.uk.xensource.com>
 <4FBA62F7.9080308@gmx.de>
 <1337615835.24660.169.camel@zakaz.uk.xensource.com>
 <4FBB882B.1020902@amd.com>
 <1337691225.10118.114.camel@zakaz.uk.xensource.com>
 <4FBB9228.70001@gmx.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <4FBB9228.70001@gmx.de>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 05/22/12 15:18, Christoph Egger wrote:
> On 05/22/12 14:53, Ian Campbell wrote:
>
>> On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
>>> On 05/21/12 17:57, Ian Campbell wrote:
>>>
>>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda spec.backend=unknown
>>>>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda, using backend phy
>>>>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>>>>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>>>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>>>   Loader:        0000000000100000->000000000019bd04
>>>>>   TOTAL:         0000000000000000->00000000ff800000
>>>>>   ENTRY ADDRESS: 0000000000100000
>>>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>>>   4KB PAGES: 0x0000000000000200
>>>>>   2MB PAGES: 0x00000000000003fb
>>>>>   1GB PAGES: 0x0000000000000002
>>>>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>>>>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>>>>> of range, valid values are within range from 1 to 65535
>>>>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>>>>> libxl_sched_credit_domain_set failed -6
>>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda spec.backend=phy
>>>>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>>>>> dompath for 7: Bad file descriptor
>>>>
>>>> This is back to the original issue, I think the last couple of mails
>>>> have been something of a tangent since you weren't getting as far as
>>>> this failure.
>>>>
>>>> I'm not really sure what to suggest here -- something is either closing
>>>> the fd or scribbling over the memory which contains it.
>>>>
>>>> I suppose you could sprinkle calls to libxl__xs_get_dompath() around
>>>> between libxl__sched_set_params and libxl__device_disk_set_backend and
>>>> see where it starts failing -- that's going to be pretty tedious though.
>>>
>>> It starts failing in libxl__build_post() right after
>>> xs_introduce_domain().
>>
>> What method did you use to determine that?
>
> What you said:
>
> "sprinkle calls to libxl__xs_get_dompath() around between
> libxl__sched_set_params and libxl__device_disk_set_backend and
> see where it starts failing"
>
>> So at the xs_transaction_end right before that ctx->xsh is valid, but
>> right after...
>>     xs_introduce_domain(ctx->xsh, domid, state->store_mfn, state->store_port);
>> ...it is invalid? i.e. before the free(vmpath) it is already corrupt?
>
> Yes, you got it.
>
>> (Aside: why isn't vmpath in the gc, instead of done manually,
>> nevermind...)
>>
>> Does the xs_introduce_domain itself succeed?
>
> No, it fails.
>
>> Or do you mean that the next use of xsh after this fails
>> (where is that, somewhere back up the callchain? store_libxl_entry
>> perhaps?)
>>
>> xs_introduce_domain doesn't seem to do much which is untoward with the
>> handle.
>
> I think something in xs_talkv() must fail.
>
>> The only thing which springs to mind is that it may generate an
>> @IntroduceDomain watch event. However xl is single threaded so we won't
>> process that event until we unwind to whichever point we do an event
>> loop iteration, in which case the corruption would have to happen later
>> than right after xs_introduce_domain().
>>
>> Did you manage to determine if "Bad file descriptor" was due to it being
>> closed vs. the value being corrupted?
>
> My suspicion is that
>
>     if (msg.type != type)
>
> in xs_talkv() is true.
>
> Christoph

--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
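
On the question of whether "Bad file descriptor" comes from the fd being closed or from its stored value being corrupted: one way to tell the two apart might be to log the raw descriptor number together with the result of a harmless fcntl(F_GETFD) probe at each of the "sprinkled" checkpoints. The following is only a minimal sketch under assumptions. fd_is_open() and trace_fd() are hypothetical helper names, not libxl or libxenstore functions, and how the descriptor is read out of ctx->xsh is deliberately left out because the handle is opaque.

```c
#include <assert.h>   /* only needed for the sanity checks in the usage note */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libxl/libxenstore):
 * returns 1 if fd currently refers to an open descriptor, 0 otherwise.
 * fcntl(F_GETFD) has no side effects and fails with EBADF exactly when
 * fd is not an open descriptor.
 */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/*
 * Hypothetical tracing helper: print the descriptor number and its state.
 * If the number changes between two checkpoints, the memory holding it
 * was overwritten; if the number stays the same but the state flips to
 * "not open", something closed (or deliberately invalidated) it.
 */
static void trace_fd(const char *where, int fd)
{
    fprintf(stderr, "%s: fd=%d (%s)\n", where, fd,
            fd_is_open(fd) ? "open" : "not open, EBADF");
}
```

Usage would be something like trace_fd("before xs_introduce_domain", fd) and trace_fd("after xs_introduce_domain", fd), where fd is whatever descriptor the handle holds. A value of -1 afterwards would hint at a deliberate close rather than random corruption. A quick sanity check of the helper itself: after pipe(p), fd_is_open(p[0]) is 1, and after close(p[0]) it is 0.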