* libxl: cannot start guest
@ 2012-05-18 13:17 Christoph Egger
2012-05-18 13:30 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-18 13:17 UTC (permalink / raw)
To: xen-devel
Hi,
I am currently using c/s 25371:e9058654ca08.
When I try to start a HVM guest I get this failure:
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000019bd04
TOTAL: 0000000000000000->00000000ff800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000002
libxl: error: libxl.c:3208:libxl_sched_credit_domain_set: Cpu weight out
of range, valid values are within range from 1 to 65535
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 1: Bad file descriptor
libxl: error: libxl_event.c:468:libxl__ev_xswatch_register: create watch
for path /local/domain/0/device-model/1/state: Bad file descriptor
libxl: error: libxl_dm.c:1069:device_model_spawn_outcome: domain 1
device model: spawn failed (rc=-3)
assertion "ao->in_initiator" failed: file "libxl_event.c", line 1388,
function "libxl__ao_complete_check_progress_reports"
[1] Abort trap (core dumped) xl create -c ${F...
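As a cross-check on the log above, the three page counts can be summed and compared with the TOTAL range. A minimal sketch in C follows; the helper names are invented for illustration and are not libxc functions.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes of the three page granularities reported in the log, in bytes. */
#define PAGE_4K (UINT64_C(4) << 10)
#define PAGE_2M (UINT64_C(2) << 20)
#define PAGE_1G (UINT64_C(1) << 30)

/* Total bytes backed by the given page counts (hypothetical helper). */
uint64_t total_bytes(uint64_t n4k, uint64_t n2m, uint64_t n1g)
{
    return n4k * PAGE_4K + n2m * PAGE_2M + n1g * PAGE_1G;
}

/* Convert a byte count that is a whole number of MiB into MiB. */
uint64_t bytes_to_mib(uint64_t bytes)
{
    assert((bytes & ((UINT64_C(1) << 20) - 1)) == 0);
    return bytes >> 20;
}
```

With the counts from the log, total_bytes(0x200, 0x3fb, 0x2) comes to 0xff800000 bytes, i.e. 4088 MiB, which matches the upper end of the TOTAL line. So the memory allocation itself looks consistent, and the failure is elsewhere.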
Christoph
--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
* Re: libxl: cannot start guest
2012-05-18 13:17 libxl: cannot start guest Christoph Egger
@ 2012-05-18 13:30 ` Ian Campbell
2012-05-18 14:23 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-18 13:30 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Fri, 2012-05-18 at 14:17 +0100, Christoph Egger wrote:
> Hi,
>
> I am currently using c/s 25371:e9058654ca08.
> When I try to start a HVM guest I get this failure:
>
>
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000019bd04
> TOTAL: 0000000000000000->00000000ff800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000003fb
> 1GB PAGES: 0x0000000000000002
> libxl: error: libxl.c:3208:libxl_sched_credit_domain_set: Cpu weight out
> of range, valid values are within range from 1 to 65535
> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
> dompath for 1: Bad file descriptor
This is on NetBSD?
These sorts of symptoms are similar to those fixed by 25364:8dce7a4121b9
but you've already got that. It might be worth doing a full clean and
rebuild, just in case.
What does your guest config look like? What is your command line?
Do you know when it last worked?
The places which close ctx->xsh are very few -- might be worth
annotating any call to xs_daemon_close() with a printf.
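One way to do that kind of annotation is a small wrapper macro that logs the call site before closing. This is only a sketch: struct demo_handle and demo_daemon_close() are hypothetical stand-ins so that it compiles on its own; in the real tree the macro body would call xs_daemon_close() on ctx->xsh instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the real handle type. */
struct demo_handle { int fd; };

/* Counts how often the stand-in close function ran. */
static int demo_close_calls;

/* Hypothetical stand-in for xs_daemon_close(): marks the handle closed. */
void demo_daemon_close(struct demo_handle *h)
{
    assert(h != NULL);
    h->fd = -1;
    demo_close_calls++;
}

/* Log file, line and function of the caller, then close the handle.
 * Note that the argument is evaluated twice, so pass a plain pointer. */
#define TRACED_CLOSE(h) do { \
    fprintf(stderr, "%s:%d:%s: closing handle (fd=%d)\n", \
            __FILE__, __LINE__, __func__, (h)->fd); \
    demo_daemon_close(h); \
} while (0)
```

Replacing each close call with TRACED_CLOSE(...) makes every close show up in the output with its location, which should be enough to tell whether the handle is closed earlier than expected.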
* Re: libxl: cannot start guest
2012-05-18 13:30 ` Ian Campbell
@ 2012-05-18 14:23 ` Christoph Egger
2012-05-18 15:51 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-18 14:23 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/18/12 15:30, Ian Campbell wrote:
> On Fri, 2012-05-18 at 14:17 +0100, Christoph Egger wrote:
>> Hi,
>>
>> I am currently using c/s 25371:e9058654ca08.
>> When I try to start a HVM guest I get this failure:
>>
>>
>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>> Loader: 0000000000100000->000000000019bd04
>> TOTAL: 0000000000000000->00000000ff800000
>> ENTRY ADDRESS: 0000000000100000
>> xc: info: PHYSICAL MEMORY ALLOCATION:
>> 4KB PAGES: 0x0000000000000200
>> 2MB PAGES: 0x00000000000003fb
>> 1GB PAGES: 0x0000000000000002
>> libxl: error: libxl.c:3208:libxl_sched_credit_domain_set: Cpu weight out
>> of range, valid values are within range from 1 to 65535
>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>> dompath for 1: Bad file descriptor
>
> This is on NetBSD?
Yes.
>
> These sorts of symptoms are similar to those fixed by 25364:8dce7a4121b9
> but you've already got that. It might be worth doing a full clean and
> rebuild, just in case.
This is a clean build.
> What does your guest config look like?
builder='hvm'
memory=4096
nestedhvm=1
name="guest"
vcpus=4
cpuid="host,page1gb=k,hypervisor=0"
acpi=1
apic=1
vif = [ 'type=ioemu, bridge=bridge0, model=e1000' ]
disk = [ 'file:/hvm-guest/guest.img,ioemu:hda,w' ]
serial='pty'
vnc=1
sdl=0
> What is your command line?
xl create -c guest.conf
> Do you know when it last worked?
Changeset 24462 worked. I need to bisect the exact
changeset.
> The places which close ctx->xsh are very few -- might be worth
> annotating any call to xs_daemon_close() with a printf.
ok.
Christoph
* Re: libxl: cannot start guest
2012-05-18 14:23 ` Christoph Egger
@ 2012-05-18 15:51 ` Christoph Egger
2012-05-18 15:58 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-18 15:51 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/18/12 16:23, Christoph Egger wrote:
> On 05/18/12 15:30, Ian Campbell wrote:
>
>> On Fri, 2012-05-18 at 14:17 +0100, Christoph Egger wrote:
>>> Hi,
>>>
>>> I am currently using c/s 25371:e9058654ca08.
>>> When I try to start a HVM guest I get this failure:
>>>
>>>
>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>> Loader: 0000000000100000->000000000019bd04
>>> TOTAL: 0000000000000000->00000000ff800000
>>> ENTRY ADDRESS: 0000000000100000
>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>> 4KB PAGES: 0x0000000000000200
>>> 2MB PAGES: 0x00000000000003fb
>>> 1GB PAGES: 0x0000000000000002
>>> libxl: error: libxl.c:3208:libxl_sched_credit_domain_set: Cpu weight out
>>> of range, valid values are within range from 1 to 65535
>>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>>> dompath for 1: Bad file descriptor
>>
>> This is on NetBSD?
>
>
> Yes.
>
>>
>> These sorts of symptoms are similar to those fixed by 25364:8dce7a4121b9
>> but you've already got that. It might be worth doing a full clean and
>> rebuild, just in case.
>
>
> This is a clean build.
>
>
>> What does your guest config look like?
>
>
> builder='hvm'
> memory=4096
> nestedhvm=1
> name="guest"
> vcpus=4
> cpuid="host,page1gb=k,hypervisor=0"
> acpi=1
> apic=1
> vif = [ 'type=ioemu, bridge=bridge0, model=e1000' ]
> disk = [ 'file:/hvm-guest/guest.img,ioemu:hda,w' ]
> serial='pty'
> vnc=1
> sdl=0
>
>> What is your command line?
>
> xl create -c guest.conf
>
>
>> Do you know when it last worked?
>
>
> Changeset 24462 worked. I need to bisect the exact
> changeset.
>
>
>> The places which close ctx->xsh are very few -- might be worth
>> annotating any call to xs_daemon_close() with a printf.
>
>
> ok.
In libxl__build_post() I check the return value
of libxl__sched_set_params().
Now trying to start a guest results in this failure:
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000019bd04
TOTAL: 0000000000000000->00000000ff800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000002
libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
of range, valid values are within range from 1 to 65535
libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
(re-)build domain: -6
libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
device model's pid: No such file or directory
libxl: error: libxl.c:1162:libxl_domain_destroy:
libxl__destroy_device_model failed for 1
libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
xs_daemon_close <-- the printf annotation
Christoph
* Re: libxl: cannot start guest
2012-05-18 15:51 ` Christoph Egger
@ 2012-05-18 15:58 ` Ian Campbell
2012-05-21 10:26 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-18 15:58 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
> In libxl__build_post() I check the return value
> of libxl__sched_set_params().
The messages about scheduler params are a known and benign issue.
> Now trying to start a guest results in this failure:
>
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000019bd04
> TOTAL: 0000000000000000->00000000ff800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000003fb
> 1GB PAGES: 0x0000000000000002
> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
> of range, valid values are within range from 1 to 65535
> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> (re-)build domain: -6
> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> device model's pid: No such file or directory
Is your device model dying for some reason? Anything
in /var/log/xen/*guest*.log about it?
You could try "xl -vvv cr ..." too, not sure what it will say.
> libxl: error: libxl.c:1162:libxl_domain_destroy:
> libxl__destroy_device_model failed for 1
> libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
> xs_daemon_close <-- the printf annotation
>
>
> Christoph
>
>
* Re: libxl: cannot start guest
2012-05-18 15:58 ` Ian Campbell
@ 2012-05-21 10:26 ` Christoph Egger
2012-05-21 12:15 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-21 10:26 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/18/12 17:58, Ian Campbell wrote:
>
>> In libxl__build_post() I check the return value
>> of libxl__sched_set_params().
>
> The messages about scheduler params are a known and benign issue.
>
>> Now trying to start a guest results in this failure:
>>
>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>> Loader: 0000000000100000->000000000019bd04
>> TOTAL: 0000000000000000->00000000ff800000
>> ENTRY ADDRESS: 0000000000100000
>> xc: info: PHYSICAL MEMORY ALLOCATION:
>> 4KB PAGES: 0x0000000000000200
>> 2MB PAGES: 0x00000000000003fb
>> 1GB PAGES: 0x0000000000000002
>> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
>> of range, valid values are within range from 1 to 65535
>> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
>> (re-)build domain: -6
>> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
>> device model's pid: No such file or directory
>
> Is your device model dying for some reason? Anything
> in /var/log/xen/*guest*.log about it?
The guest logfile doesn't exist. Does that mean the error happens
before the device model has been started at all?
>
> You could try "xl -vvv cr ..." too, not sure what it will say.
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
vdev=hda spec.backend=unknown
libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
vdev=hda, using backend phy
xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000019bd04
TOTAL: 0000000000000000->00000000ff800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000002
xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
of range, valid values are within range from 1 to 65535
libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
(re-)build domain: -6
libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
device model's pid: No such file or directory
libxl: error: libxl.c:1162:libxl_domain_destroy:
libxl__destroy_device_model failed for 2
xc: debug: hypercall buffer: total allocations:1251 total releases:1251
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:1248 misses:2 toobig:1
libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
xs_daemon_close <-- the printf annotation
Christoph
* Re: libxl: cannot start guest
2012-05-21 10:26 ` Christoph Egger
@ 2012-05-21 12:15 ` Ian Campbell
2012-05-21 13:10 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-21 12:15 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Mon, 2012-05-21 at 11:26 +0100, Christoph Egger wrote:
> On 05/18/12 17:58, Ian Campbell wrote:
>
> >
> >> In libxl__build_post() I check the return value
> >> of libxl__sched_set_params().
> >
> > The messages about scheduler params are a known and benign issue.
> >
> >> Now trying to start a guest results in this failure:
> >>
> >> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >> Loader: 0000000000100000->000000000019bd04
> >> TOTAL: 0000000000000000->00000000ff800000
> >> ENTRY ADDRESS: 0000000000100000
> >> xc: info: PHYSICAL MEMORY ALLOCATION:
> >> 4KB PAGES: 0x0000000000000200
> >> 2MB PAGES: 0x00000000000003fb
> >> 1GB PAGES: 0x0000000000000002
> >> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
> >> of range, valid values are within range from 1 to 65535
> >> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> >> (re-)build domain: -6
> >> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> >> device model's pid: No such file or directory
> >
> > Is your device model dying for some reason? Anything
> > in /var/log/xen/*guest*.log about it?
>
>
> The guest logfile doesn't exist.
Sorry, I meant guest as in $GUEST_NAME rather than literally "guest" (I
was totally non-obvious about that, sorry!).
> Does that mean the error happens before the device model has been started at all?
I think/hope if that were the case you would get messages about failure
to exec etc rather than timeouts.
> >
> > You could try "xl -vvv cr ..." too, not sure what it will say.
>
>
> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> vdev=hda spec.backend=unknown
> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> vdev=hda, using backend phy
> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000019bd04
> TOTAL: 0000000000000000->00000000ff800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000003fb
> 1GB PAGES: 0x0000000000000002
No messages about "xs transaction failed: Bad file descriptor" any more?
> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
> of range, valid values are within range from 1 to 65535
> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> (re-)build domain: -6
> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> device model's pid: No such file or directory
> libxl: error: libxl.c:1162:libxl_domain_destroy:
> libxl__destroy_device_model failed for 2
Hrm, actually, the device model stuff might be a red-herring -- that's
trying to tear down the device model on failure and it is entirely
reasonable for the device model to not be running if we didn't get as
far as starting it...
The interesting message is just:
> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> (re-)build domain: -6
Which is unhelpfully just a general failure from libxl__domain_build.
It seems that we have a non-logging failure path in there somewhere. I'm
afraid that the easiest way to fix this is probably just to dive into
libxl__domain_build and add prints on the various error cases of sub
functions, then recurse as you identify which one is failing, etc.
Ian.
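A sketch of what such prints could look like: a macro that runs a sub-step and, on failure, logs which call failed and with what rc before returning. All demo_* names are hypothetical stand-ins, not libxl functions; only the value -6 is taken from the log above.

```c
#include <stdio.h>

/* Mirrors the -6 seen in the log; the name is made up for this sketch. */
#define DEMO_ERROR_INVAL (-6)

/* Hypothetical sub-steps of a build function. */
int demo_step_ok(void)  { return 0; }
int demo_step_bad(void) { return DEMO_ERROR_INVAL; }

/* Run a sub-step; on failure, log the failing call and its rc,
 * then return that rc from the enclosing function. */
#define CHECK_STEP(call) do { \
    int rc_ = (call); \
    if (rc_) { \
        fprintf(stderr, "%s:%d:%s: %s failed: %d\n", \
                __FILE__, __LINE__, __func__, #call, rc_); \
        return rc_; \
    } \
} while (0)

/* Hypothetical build function: the second step fails and is logged. */
int demo_build(void)
{
    CHECK_STEP(demo_step_ok());
    CHECK_STEP(demo_step_bad());
    return 0;
}
```

Applying the same macro one level down in whichever sub-function reports the failure narrows the search step by step, which is the "recurse" part of the suggestion.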
* Re: libxl: cannot start guest
2012-05-21 12:15 ` Ian Campbell
@ 2012-05-21 13:10 ` Christoph Egger
2012-05-21 13:49 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-21 13:10 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/21/12 14:15, Ian Campbell wrote:
> On Mon, 2012-05-21 at 11:26 +0100, Christoph Egger wrote:
>> On 05/18/12 17:58, Ian Campbell wrote:
>>
>>>
>>>> In libxl__build_post() I check the return value
>>>> of libxl__sched_set_params().
>>>
>>> The messages about scheduler params are a known and benign issue.
>>>
>>>> Now trying to start a guest results in this failure:
>>>>
>>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>> Loader: 0000000000100000->000000000019bd04
>>>> TOTAL: 0000000000000000->00000000ff800000
>>>> ENTRY ADDRESS: 0000000000100000
>>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>> 4KB PAGES: 0x0000000000000200
>>>> 2MB PAGES: 0x00000000000003fb
>>>> 1GB PAGES: 0x0000000000000002
>>>> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
>>>> of range, valid values are within range from 1 to 65535
>>>> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
>>>> (re-)build domain: -6
>>>> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
>>>> device model's pid: No such file or directory
>>>
>>> Is your device model dying for some reason? Anything
>>> in /var/log/xen/*guest*.log about it?
>>
>>
>> The guest logfile doesn't exist.
>
> Sorry, I meant guest as in $GUEST_NAME rather than literally "guest" (I
> was totally non-obvious about that, sorry!).
I understood it that way. The guest logfile doesn't exist.
>
>> Does that mean the error happens before the device model has been started at all?
>
> I think/hope if that were the case you would get messages about failure
> to exec etc rather than timeouts.
>
>>>
>>> You could try "xl -vvv cr ..." too, not sure what it will say.
>>
>>
>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>> vdev=hda spec.backend=unknown
>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>> vdev=hda, using backend phy
>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>> Loader: 0000000000100000->000000000019bd04
>> TOTAL: 0000000000000000->00000000ff800000
>> ENTRY ADDRESS: 0000000000100000
>> xc: info: PHYSICAL MEMORY ALLOCATION:
>> 4KB PAGES: 0x0000000000000200
>> 2MB PAGES: 0x00000000000003fb
>> 1GB PAGES: 0x0000000000000002
>
> No messages about "xs transaction failed: Bad file descriptor" any more?
>
>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
>> of range, valid values are within range from 1 to 65535
>> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
>> (re-)build domain: -6
>> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
>> device model's pid: No such file or directory
>> libxl: error: libxl.c:1162:libxl_domain_destroy:
>> libxl__destroy_device_model failed for 2
>
> Hrm, actually, the device model stuff might be a red-herring -- that's
> trying to tear down the device model on failure and it is entirely
> reasonable for the device model to not be running if we didn't get as
> far as starting it...
>
> The interesting message is just:
>> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
>> (re-)build domain: -6
>
> Which is unhelpfully just a general failure from libxl__domain_build.
>
> It seems that we have a non-logging failure path in there somewhere. I'm
> afraid that the easiest way to fix this is probably just to dive into
> libxl__domain_build and add prints on the various error cases of sub
> functions, then recurse as you identify which one is failing, etc.
I did that:
Parsing config from /root/hvm-guest/netbsd_64b.conf
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
vdev=hda spec.backend=unknown
libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
vdev=hda, using backend phy
xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000019bd04
TOTAL: 0000000000000000->00000000ff800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000002
xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
of range, valid values are within range from 1 to 65535
libxl: error: libxl_dom.c:74:libxl__sched_set_params:
libxl_sched_credit_domain_set failed -6
libxl: error: libxl_dom.c:192:libxl__build_post: libxl__sched_set_params
failed -6
libxl: error: libxl_create.c:322:libxl__domain_build: libxl__build_post
failed: -6
libxl: error: libxl_create.c:709:domcreate_bootloader_done: cannot
(re-)build domain: -6
libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
device model's pid: No such file or directory
libxl: error: libxl.c:1162:libxl_domain_destroy:
libxl__destroy_device_model failed for 6
xc: debug: hypercall buffer: total allocations:1264 total releases:1264
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:1261 misses:2 toobig:1
libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
xs_daemon_close
So it is indeed that ERROR_INVAL from that 'benign' error.
Christoph
* Re: libxl: cannot start guest
2012-05-21 13:10 ` Christoph Egger
@ 2012-05-21 13:49 ` Ian Campbell
2012-05-21 15:43 ` Christoph Egger
2012-05-21 15:44 ` Christoph Egger
0 siblings, 2 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-21 13:49 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Mon, 2012-05-21 at 14:10 +0100, Christoph Egger wrote:
> On 05/21/12 14:15, Ian Campbell wrote:
>
> > On Mon, 2012-05-21 at 11:26 +0100, Christoph Egger wrote:
> >> On 05/18/12 17:58, Ian Campbell wrote:
> >>
> >>>
> >>>> In libxl__build_post() I check the return value
> >>>> of libxl__sched_set_params().
> >>>
> >>> The messages about scheduler params are a known and benign issue.
> >>>
> >>>> Now trying to start a guest results in this failure:
> >>>>
> >>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >>>> Loader: 0000000000100000->000000000019bd04
> >>>> TOTAL: 0000000000000000->00000000ff800000
> >>>> ENTRY ADDRESS: 0000000000100000
> >>>> xc: info: PHYSICAL MEMORY ALLOCATION:
> >>>> 4KB PAGES: 0x0000000000000200
> >>>> 2MB PAGES: 0x00000000000003fb
> >>>> 1GB PAGES: 0x0000000000000002
> >>>> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
> >>>> of range, valid values are within range from 1 to 65535
> >>>> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> >>>> (re-)build domain: -6
> >>>> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> >>>> device model's pid: No such file or directory
> >>>
> >>> Is your device model dying for some reason? Anything
> >>> in /var/log/xen/*guest*.log about it?
> >>
> >>
> >> The guest logfile doesn't exist.
> >
> > Sorry, I meant guest as in $GUEST_NAME rather than literally "guest" (I
> > was totally non-obvious about that, sorry!).
>
>
> I understood it that way. The guest logfile doesn't exist.
>
> >
> >> Does that mean the error happens before the device model has been started at all?
> >
> > I think/hope if that were the case you would get messages about failure
> > to exec etc rather than timeouts.
> >
> >>>
> >>> You could try "xl -vvv cr ..." too, not sure what it will say.
> >>
> >>
> >> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> >> vdev=hda spec.backend=unknown
> >> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> >> vdev=hda, using backend phy
> >> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> >> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> >> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >> Loader: 0000000000100000->000000000019bd04
> >> TOTAL: 0000000000000000->00000000ff800000
> >> ENTRY ADDRESS: 0000000000100000
> >> xc: info: PHYSICAL MEMORY ALLOCATION:
> >> 4KB PAGES: 0x0000000000000200
> >> 2MB PAGES: 0x00000000000003fb
> >> 1GB PAGES: 0x0000000000000002
> >
> > No messages about "xs transaction failed: Bad file descriptor" any more?
> >
> >> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> >> libxl: error: libxl.c:3211:libxl_sched_credit_domain_set: Cpu weight out
> >> of range, valid values are within range from 1 to 65535
> >> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> >> (re-)build domain: -6
> >> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> >> device model's pid: No such file or directory
> >> libxl: error: libxl.c:1162:libxl_domain_destroy:
> >> libxl__destroy_device_model failed for 2
> >
> > Hrm, actually, the device model stuff might be a red-herring -- that's
> > trying to tear down the device model on failure and it is entirely
> > reasonable for the device model to not be running if we didn't get as
> > far as starting it...
> >
> > The interesting message is just:
> >> libxl: error: libxl_create.c:694:domcreate_bootloader_done: cannot
> >> (re-)build domain: -6
> >
> > Which is unhelpfully just a general failure from libxl__domain_build.
> >
> > It seems that we have a non-logging failure path in there somewhere. I'm
> > afraid that the easiest way to fix this is probably just to dive into
> > libxl__domain_build and add prints on the various error cases of sub
> > functions, then recurse as you identify which one is failing, etc.
>
> I did that:
>
> Parsing config from /root/hvm-guest/netbsd_64b.conf
> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> vdev=hda spec.backend=unknown
> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> vdev=hda, using backend phy
> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000019bd04
> TOTAL: 0000000000000000->00000000ff800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000003fb
> 1GB PAGES: 0x0000000000000002
> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
> of range, valid values are within range from 1 to 65535
> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
> libxl_sched_credit_domain_set failed -6
> libxl: error: libxl_dom.c:192:libxl__build_post: libxl__sched_set_params
> failed -6
> libxl: error: libxl_create.c:322:libxl__domain_build: libxl__build_post
> failed: -6
> libxl: error: libxl_create.c:709:domcreate_bootloader_done: cannot
> (re-)build domain: -6
> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
> device model's pid: No such file or directory
> libxl: error: libxl.c:1162:libxl_domain_destroy:
> libxl__destroy_device_model failed for 6
> xc: debug: hypercall buffer: total allocations:1264 total releases:1264
> xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
> xc: debug: hypercall buffer: cache current size:2
> xc: debug: hypercall buffer: cache hits:1261 misses:2 toobig:1
> libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
> xs_daemon_close
>
>
> So it is indeed that ERROR_INVAL from that 'benign' error
In my version of libxl libxl__build_post doesn't even look at the return
value of libxl__sched_set_params.
....
libxl__sched_set_params (gc, domid, &(info->sched_params));
....
the only other exit path from that function is:
    dom_path = libxl__xs_get_dompath(gc, domid);
    if (!dom_path) {
        return ERROR_FAIL;
    }
which is consistent with the original errors you had (but that would be
ERROR_FAIL, not ERROR_INVAL). This doesn't really help me figure out
what is going on though :-/
Ian.
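To make the difference between the two trees concrete, here is a sketch contrasting a caller that discards the setter's return value with one that propagates it. The demo_* functions are hypothetical stand-ins and not the libxl code; the range 1..65535 and the value -6 are taken from the log, and the out-of-range input 0 is an arbitrary choice for illustration.

```c
/* Mirrors the -6 seen in the log; the name is made up for this sketch. */
#define DEMO_ERROR_INVAL (-6)

/* Hypothetical stand-in for a setter that rejects weights outside 1..65535. */
int demo_set_weight(int weight)
{
    if (weight < 1 || weight > 65535)
        return DEMO_ERROR_INVAL;
    return 0;
}

/* Variant A: return value discarded, so a bad weight is at most a log line. */
int demo_build_post_ignoring(int weight)
{
    demo_set_weight(weight);
    return 0;
}

/* Variant B: return value propagated, so the same weight aborts the build. */
int demo_build_post_checking(int weight)
{
    int rc = demo_set_weight(weight);
    if (rc)
        return rc;
    return 0;
}
```

With an out-of-range weight such as 0, variant A still returns 0 while variant B returns -6. That is the same pattern as in this thread: the error message appears in both cases, but only the tree with the added check turns it into a failed domain build.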
* Re: libxl: cannot start guest
2012-05-21 13:49 ` Ian Campbell
@ 2012-05-21 15:43 ` Christoph Egger
2012-05-21 15:44 ` Christoph Egger
1 sibling, 0 replies; 30+ messages in thread
From: Christoph Egger @ 2012-05-21 15:43 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/21/12 15:49, Ian Campbell wrote:
>>> It seems that we have a non-logging failure path in there somewhere. I'm
>>> afraid that the easiest way to fix this is probably just to dive into
>>> libxl__domain_build and add prints on the various error cases of sub
>>> functions, then recurse as you identify which one is failing, etc.
>>
>> I did that:
>>
>> Parsing config from /root/hvm-guest/netbsd_64b.conf
>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>> vdev=hda spec.backend=unknown
>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>> vdev=hda, using backend phy
>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>> Loader: 0000000000100000->000000000019bd04
>> TOTAL: 0000000000000000->00000000ff800000
>> ENTRY ADDRESS: 0000000000100000
>> xc: info: PHYSICAL MEMORY ALLOCATION:
>> 4KB PAGES: 0x0000000000000200
>> 2MB PAGES: 0x00000000000003fb
>> 1GB PAGES: 0x0000000000000002
>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>> of range, valid values are within range from 1 to 65535
>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>> libxl_sched_credit_domain_set failed -6
>> libxl: error: libxl_dom.c:192:libxl__build_post: libxl__sched_set_params
>> failed -6
>> libxl: error: libxl_create.c:322:libxl__domain_build: libxl__build_post
>> failed: -6
>> libxl: error: libxl_create.c:709:domcreate_bootloader_done: cannot
>> (re-)build domain: -6
>> libxl: error: libxl_dm.c:1104:libxl__destroy_device_model: Couldn't find
>> device model's pid: No such file or directory
>> libxl: error: libxl.c:1162:libxl_domain_destroy:
>> libxl__destroy_device_model failed for 6
>> xc: debug: hypercall buffer: total allocations:1264 total releases:1264
>> xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
>> xc: debug: hypercall buffer: cache current size:2
>> xc: debug: hypercall buffer: cache hits:1261 misses:2 toobig:1
>> libxl: error: libxl.c:155:libxl_ctx_free: libxl_ctx_free: call
>> xs_daemon_close
>>
>>
>> So it is indeed that ERROR_INVAL from that 'benign' error
>
> In my version of libxl libxl__build_post doesn't even look at the return
> value of libxl__sched_set_params.
> ....
> libxl__sched_set_params (gc, domid, &(info->sched_params));
> ....
I reverted my local change and retried. See below.
> the only other exit path from that function is:
> dom_path = libxl__xs_get_dompath(gc, domid);
> if (!dom_path) {
> return ERROR_FAIL;
> }
> which is consistent with the original errors you had (but is ERROR_FAIL,
> not ERROR_INVAL). This doesn't really help me figure out what is going
> on though :-/
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
vdev=hda spec.backend=unknown
libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
vdev=hda, using backend phy
xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000019bd04
TOTAL: 0000000000000000->00000000ff800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000002
xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
of range, valid values are within range from 1 to 65535
libxl: error: libxl_dom.c:74:libxl__sched_set_params:
libxl_sched_credit_domain_set failed -6
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
vdev=hda spec.backend=phy
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
vdev=hdb spec.backend=phy
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 0: Bad file descriptor
libxl: error: libxl_device.c:107:libxl__device_generic_add: xs
transaction failed: Bad file descriptor
libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
dompath for 7: Bad file descriptor
libxl: debug: libxl_dm.c:1008:libxl__spawn_local_dm: Spawning
device-model /usr/local.25371.netbsd/libexec/qemu-dm with arguments:
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm:
/usr/local.25371.netbsd/libexec/qemu-dm
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -d
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: 7
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -domain-name
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: HVM64-NetBSD
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -vnc
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: 0.0.0.0:0
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -vncunused
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -serial
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: pty
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -videoram
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: 8
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -boot
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: cd
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -acpi
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -vcpus
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: 4
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -vcpu_avail
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: 0xf
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -net
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm:
nic,vlan=0,macaddr=00:16:3e:00:ce:01,model=e1000
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -net
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm:
tap,vlan=0,ifname=vif7.0-emu,bridge=bridge0,script=/usr/local.25371.netbsd/etc/xen/scripts/qemu-ifup,downscript=/usr/local.25371.netbsd/etc/xen/scripts/qemu-ifup
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: -M
libxl: debug: libxl_dm.c:1010:libxl__spawn_local_dm: xenfv
libxl: error: libxl_event.c:468:libxl__ev_xswatch_register: create watch
for path /local/domain/0/device-model/7/state: Bad file descriptor
libxl: error: libxl_dm.c:1072:device_model_spawn_outcome: domain 7
device model: spawn failed (rc=-3)
assertion "ao->in_initiator" failed: file "libxl_event.c", line 1388,
function "libxl__ao_complete_check_progress_reports"
Abort (core dumped)
(gdb) bt
#0 0x00007f7ff65059aa in _lwp_kill () from /usr/lib/libc.so.12
#1 0x00007f7ff6505612 in abort () from /usr/lib/libc.so.12
#2 0x00007f7ff65052dd in __assert13 () from /usr/lib/libc.so.12
#3 0x00007f7ff742d114 in libxl__ao_complete_check_progress_reports (
egc=0x7f7fffffd140, ao=0x7f7ff7b210e0) at libxl_event.c:1388
#4 0x00007f7ff742d2ec in egc_run_callbacks (egc=0x7f7fffffd140)
at libxl_event.c:971
#5 libxl__egc_cleanup (egc=0x7f7fffffd140) at libxl_event.c:991
#6 0x00007f7ff741890f in do_domain_create (ctx=0x7f7ff7b210b8,
d_config=<optimized out>, domid=<optimized out>,
restore_fd=<optimized out>,
ao_how=<optimized out>, aop_console_how=0x7f7fffffffff) at
libxl_create.c:905
#7 0x00007f7ff741893e in libxl_domain_create_new (ctx=<optimized out>,
d_config=<optimized out>, domid=<optimized out>, ao_how=<optimized
out>,
aop_console_how=<optimized out>) at libxl_create.c:926
#8 0x000000000040c4d9 in create_domain (dom_info=0x7f7fffffd630)
at xl_cmdimpl.c:1760
#9 0x0000000000410161 in main_create (argc=3, argv=<optimized out>)
at xl_cmdimpl.c:3730
#10 0x0000000000406d86 in main (argc=3, argv=0x7f7fffffdba0) at xl.c:208
Christoph
--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
* Re: libxl: cannot start guest
2012-05-21 13:49 ` Ian Campbell
2012-05-21 15:43 ` Christoph Egger
@ 2012-05-21 15:44 ` Christoph Egger
2012-05-21 15:57 ` Ian Campbell
1 sibling, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-21 15:44 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
* Re: libxl: cannot start guest
2012-05-21 15:44 ` Christoph Egger
@ 2012-05-21 15:57 ` Ian Campbell
2012-05-21 16:11 ` Ian Jackson
2012-05-22 12:35 ` Christoph Egger
0 siblings, 2 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-21 15:57 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Mon, 2012-05-21 at 16:44 +0100, Christoph Egger wrote:
> I reverted my local change and retried. See below.
>
> > the only other exit path from that function is:
>
> > dom_path = libxl__xs_get_dompath(gc, domid);
> > if (!dom_path) {
> > return ERROR_FAIL;
> > }
> > which is consistent with the original errors you had (but is ERROR_FAIL,
> > not ERROR_INVAL). This doesn't really help me figure out what is going
> > on though :-/
>
>
>
>
> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> vdev=hda spec.backend=unknown
> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> vdev=hda, using backend phy
> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000019bd04
> TOTAL: 0000000000000000->00000000ff800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000003fb
> 1GB PAGES: 0x0000000000000002
> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
> of range, valid values are within range from 1 to 65535
> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
> libxl_sched_credit_domain_set failed -6
> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> vdev=hda spec.backend=phy
> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
> dompath for 7: Bad file descriptor
This is back to the original issue. I think the last couple of mails
have been something of a tangent, since you weren't getting as far as
this failure.
I'm not really sure what to suggest here -- something is either closing
the fd or scribbling over the memory which contains it.
I suppose you could sprinkle calls to libxl__xs_get_dompath() around
between libxl__sched_set_params and libxl__device_disk_set_backend and
see where it starts failing -- that's going to be pretty tedious though.
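
One way to make that sprinkling less tedious is a small probe macro that
reports its call site and whether a given descriptor is still valid. This is
only a sketch: fd_is_open() and PROBE_FD() are hypothetical helpers, not part
of libxl, and the descriptor to pass in is assumed to be obtainable from the
xenstore handle (for instance via xs_fileno(), if your tree provides it).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Returns 1 if fd refers to an open descriptor, 0 if the kernel reports
 * EBADF for it. F_GETFD has no side effects on the descriptor. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical probe: log the call site and whether fd is still usable. */
#define PROBE_FD(fd) \
    fprintf(stderr, "%s:%d: fd %d is %s\n", __FILE__, __LINE__, (fd), \
            fd_is_open(fd) ? "open" : "NOT open")
```

Placing PROBE_FD(fd) before and after each suspect call narrows the failure
down to the first line that reports "NOT open".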
If you've got the gdb-fu you might be able to set a write watch on the
location in the ctx with the fd -- could tell you something perhaps.
Otherwise perhaps bisection is the best bet?
> for path /local/domain/0/device-model/7/state: Bad file descriptor
> libxl: error: libxl_dm.c:1072:device_model_spawn_outcome: domain 7
> device model: spawn failed (rc=-3)
> assertion "ao->in_initiator" failed: file "libxl_event.c", line 1388,
> function "libxl__ao_complete_check_progress_reports"
> Abort (core dumped)
>
> (gdb) bt
Can you tell if the xs fd is still actually open at this point? On Linux
I would look in /proc/<pid>/fd for the socket.
Also can you print out the xsh from the ctx (perhaps that's easier from
e.g. frame #7 below?)
Also the ao failure smells like bad error handling resulting from the
underlying issue, which might be worth someone investigating separately.
> #0 0x00007f7ff65059aa in _lwp_kill () from /usr/lib/libc.so.12
> #1 0x00007f7ff6505612 in abort () from /usr/lib/libc.so.12
> #2 0x00007f7ff65052dd in __assert13 () from /usr/lib/libc.so.12
> #3 0x00007f7ff742d114 in libxl__ao_complete_check_progress_reports (
> egc=0x7f7fffffd140, ao=0x7f7ff7b210e0) at libxl_event.c:1388
> #4 0x00007f7ff742d2ec in egc_run_callbacks (egc=0x7f7fffffd140)
> at libxl_event.c:971
> #5 libxl__egc_cleanup (egc=0x7f7fffffd140) at libxl_event.c:991
> #6 0x00007f7ff741890f in do_domain_create (ctx=0x7f7ff7b210b8,
> d_config=<optimized out>, domid=<optimized out>,
> restore_fd=<optimized out>,
> ao_how=<optimized out>, aop_console_how=0x7f7fffffffff) at
> libxl_create.c:905
> #7 0x00007f7ff741893e in libxl_domain_create_new (ctx=<optimized out>,
> d_config=<optimized out>, domid=<optimized out>, ao_how=<optimized
> out>,
> aop_console_how=<optimized out>) at libxl_create.c:926
> #8 0x000000000040c4d9 in create_domain (dom_info=0x7f7fffffd630)
> at xl_cmdimpl.c:1760
> #9 0x0000000000410161 in main_create (argc=3, argv=<optimized out>)
> at xl_cmdimpl.c:3730
> #10 0x0000000000406d86 in main (argc=3, argv=0x7f7fffffdba0) at xl.c:208
>
> Christoph
>
>
* Re: libxl: cannot start guest
2012-05-21 15:57 ` Ian Campbell
@ 2012-05-21 16:11 ` Ian Jackson
2012-05-22 12:35 ` Christoph Egger
1 sibling, 0 replies; 30+ messages in thread
From: Ian Jackson @ 2012-05-21 16:11 UTC (permalink / raw)
To: Ian Campbell; +Cc: Christoph Egger, xen-devel@lists.xen.org
Ian Campbell writes ("Re: [Xen-devel] libxl: cannot start guest"):
> On Mon, 2012-05-21 at 16:44 +0100, Christoph Egger wrote:
> > libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
> > dompath for 7: Bad file descriptor
>
> This is back to the original issue, I think the last couple of mails
> have been something of a tangent since you weren't getting as far as
> this failure.
>
> I'm not really sure what to suggest here -- something is either closing
> the fd or scribbling over the memory which contains it.
I would strace (on BSD, ktrace?) the process. That would tell you
whether the fd was being closed and if so when.
If it's not being closed then the fd value is being overwritten and a
gdb hardware watchpoint will find where.
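
The two cases can also be told apart from inside the process. The sketch
below assumes you record the descriptor value once, right after the handle is
opened, and compare against it later; diagnose_fd() and the enum are
hypothetical names used only for illustration.

```c
#include <errno.h>
#include <fcntl.h>

enum fd_state { FD_OK, FD_CLOSED, FD_OVERWRITTEN };

/* saved_fd:   the value recorded when the connection was opened.
 * current_fd: the value now stored in the handle.
 * A changed value means the memory was scribbled over; an unchanged value
 * that the kernel rejects with EBADF means the descriptor was closed. */
static enum fd_state diagnose_fd(int saved_fd, int current_fd)
{
    if (current_fd != saved_fd)
        return FD_OVERWRITTEN;
    if (fcntl(current_fd, F_GETFD) == -1 && errno == EBADF)
        return FD_CLOSED;
    return FD_OK;
}
```

FD_CLOSED points at a close() that a syscall trace should show, while
FD_OVERWRITTEN is the case a hardware watchpoint is suited to.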
Ian.
* Re: libxl: cannot start guest
2012-05-21 15:57 ` Ian Campbell
2012-05-21 16:11 ` Ian Jackson
@ 2012-05-22 12:35 ` Christoph Egger
2012-05-22 12:53 ` Ian Campbell
1 sibling, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 12:35 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/21/12 17:57, Ian Campbell wrote:
>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>> vdev=hda spec.backend=unknown
>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>> vdev=hda, using backend phy
>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>> Loader: 0000000000100000->000000000019bd04
>> TOTAL: 0000000000000000->00000000ff800000
>> ENTRY ADDRESS: 0000000000100000
>> xc: info: PHYSICAL MEMORY ALLOCATION:
>> 4KB PAGES: 0x0000000000000200
>> 2MB PAGES: 0x00000000000003fb
>> 1GB PAGES: 0x0000000000000002
>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>> of range, valid values are within range from 1 to 65535
>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>> libxl_sched_credit_domain_set failed -6
>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>> vdev=hda spec.backend=phy
>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>> dompath for 7: Bad file descriptor
>
> This is back to the original issue, I think the last couple of mails
> have been something of a tangent since you weren't getting as far as
> this failure.
>
> I'm not really sure what to suggest here -- something is either closing
> the fd or scribbling over the memory which contains it.
>
> I suppose you could sprinkle calls to libxl__xs_get_dompath() around
> between libxl__sched_set_params and libxl__device_disk_set_backend and
> see where it starts failing -- that's going to be pretty tedious though.
It starts failing in libxl__build_post() right after
xs_introduce_domain().
Christoph
--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
* Re: libxl: cannot start guest
2012-05-22 12:35 ` Christoph Egger
@ 2012-05-22 12:53 ` Ian Campbell
2012-05-22 13:17 ` Christoph Egger
2012-05-22 13:18 ` Christoph Egger
0 siblings, 2 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-22 12:53 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
> On 05/21/12 17:57, Ian Campbell wrote:
>
> >> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> >> vdev=hda spec.backend=unknown
> >> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> >> vdev=hda, using backend phy
> >> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> >> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> >> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >> Loader: 0000000000100000->000000000019bd04
> >> TOTAL: 0000000000000000->00000000ff800000
> >> ENTRY ADDRESS: 0000000000100000
> >> xc: info: PHYSICAL MEMORY ALLOCATION:
> >> 4KB PAGES: 0x0000000000000200
> >> 2MB PAGES: 0x00000000000003fb
> >> 1GB PAGES: 0x0000000000000002
> >> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> >> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
> >> of range, valid values are within range from 1 to 65535
> >> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
> >> libxl_sched_credit_domain_set failed -6
> >> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> >> vdev=hda spec.backend=phy
> >> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
> >> dompath for 7: Bad file descriptor
> >
> > This is back to the original issue, I think the last couple of mails
> > have been something of a tangent since you weren't getting as far as
> > this failure.
> >
> > I'm not really sure what to suggest here -- something is either closing
> > the fd or scribbling over the memory which contains it.
> >
> > I suppose you could sprinkle calls to libxl__xs_get_dompath() around
> > between libxl__sched_set_params and libxl__device_disk_set_backend and
> > see where it starts failing -- that's going to be pretty tedious though.
>
>
> It starts failing in libxl__build_post() right after
> xs_introduce_domain().
What method did you use to determine that?
So at the xs_transaction_end right before that ctx->xsh is valid, but
right after...
xs_introduce_domain(ctx->xsh, domid, state->store_mfn, state->store_port);
...it is invalid? i.e. before the free(vmpath) it is already corrupt?
(Aside: why isn't vmpath in the gc instead of being freed manually?
Never mind...)
Does the xs_introduce_domain itself succeed? Or do you mean that the
next use of xsh after this fails (where is that, somewhere back up the
callchain? store_libxl_entry perhaps?)
xs_introduce_domain doesn't seem to do much which is untoward with the
handle.
The only thing which springs to mind is that it may generate an
@IntroduceDomain watch event. However xl is single threaded so we won't
process that event until we unwind to whichever point we do an event
loop iteration, in which case the corruption would have to happen later
than right after xs_introduce_domain().
Did you manage to determine if "Bad file descriptor" was due to it being
closed vs. the value being corrupted?
Ian.
* Re: libxl: cannot start guest
2012-05-22 12:53 ` Ian Campbell
@ 2012-05-22 13:17 ` Christoph Egger
2012-05-22 13:18 ` Christoph Egger
1 sibling, 0 replies; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 13:17 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/22/12 14:53, Ian Campbell wrote:
> On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
>> On 05/21/12 17:57, Ian Campbell wrote:
>>
>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>> vdev=hda spec.backend=unknown
>>>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>>>> vdev=hda, using backend phy
>>>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>>>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>> Loader: 0000000000100000->000000000019bd04
>>>> TOTAL: 0000000000000000->00000000ff800000
>>>> ENTRY ADDRESS: 0000000000100000
>>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>> 4KB PAGES: 0x0000000000000200
>>>> 2MB PAGES: 0x00000000000003fb
>>>> 1GB PAGES: 0x0000000000000002
>>>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>>>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>>>> of range, valid values are within range from 1 to 65535
>>>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>>>> libxl_sched_credit_domain_set failed -6
>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>> vdev=hda spec.backend=phy
>>>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>>>> dompath for 7: Bad file descriptor
>>>
>>> This is back to the original issue, I think the last couple of mails
>>> have been something of a tangent since you weren't getting as far as
>>> this failure.
>>>
>>> I'm not really sure what to suggest here -- something is either closing
>>> the fd or scribbling over the memory which contains it.
>>>
>>> I suppose you could sprinkle calls to libxl__xs_get_dompath() around
>>> between libxl__sched_set_params and libxl__device_disk_set_backend and
>>> see where it starts failing -- that's going to be pretty tedious though.
>>
>>
>> It starts failing in libxl__build_post() right after
>> xs_introduce_domain().
>
> What method did you use to determine that?
What you said:
"sprinkle calls to libxl__xs_get_dompath() around between
libxl__sched_set_params and libxl__device_disk_set_backend and
see where it starts failing"
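A minimal sketch of that kind of probing, with one deliberate substitution: instead of calling libxl__xs_get_dompath() it checks a raw descriptor with fcntl(F_GETFD), which also distinguishes a closed descriptor (EBADF) from a live one. The names probe_fd, PROBE_FD and probe_demo are hypothetical and do not exist in libxl.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 if fd is an open descriptor,
 * otherwise logs the call site and returns -1. */
static int probe_fd(int fd, const char *file, int line)
{
    if (fcntl(fd, F_GETFD) == -1) {
        fprintf(stderr, "%s:%d: fd %d unusable: %s\n",
                file, line, fd, strerror(errno));
        return -1;
    }
    return 0;
}

/* Sprinkle PROBE_FD(fd) between the suspect calls; the first
 * call site that logs a failure brackets the culprit. */
#define PROBE_FD(fd) probe_fd((fd), __FILE__, __LINE__)

/* Demo: probe one end of a pipe before and after closing it.
 * Returns 1 if the probe said "ok" first and "bad" afterwards. */
static int probe_demo(void)
{
    int fds[2];
    int rc, before, after;

    rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    before = PROBE_FD(fds[0]);  /* descriptor is still open */
    close(fds[0]);
    after = PROBE_FD(fds[0]);   /* closed: fcntl() fails with EBADF */
    close(fds[1]);

    return before == 0 && after == -1;
}
```

If the probe keeps succeeding on the descriptor number while the library call still reports "Bad file descriptor", the stored value rather than the descriptor itself is the more likely suspect.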
> So at the xs_transaction_end right before that ctx->xsh is valid, but
> right after...
> xs_introduce_domain(ctx->xsh, domid, state->store_mfn, state->store_port);
> ...it is invalid? i.e. before the free(vmpath) it is already corrupt?
Yes, you got it.
>
> (Aside: why isn't vmpath in the gc, instead of done manually,
> nevermind...)
>
> Does the xs_introduce_domain itself succeed?
No, it fails.
> Or do you mean that the next use of xsh after this fails
> (where is that, somewhere back up the callchain? store_libxl_entry
> perhaps?)
>
> xs_introduce_domain doesn't seem to do much which is untoward with the
> handle.
I think, in xs_talkv() something must fail.
> The only thing which springs to mind is that it may generate an
> @IntroduceDomain watch event. However xl is single threaded so we won't
> process that event until we unwind to whichever point we do an event
> loop iteration, in which case the corruption would have to happen later
> than right after xs_introduce_domain().
>
> Did you manage to determine if "Bad file descriptor" was due to it being
> closed vs. the value being corrupted?
I'm looking into it. My suspicion is that
if (msg.type != type)
in xs_talkv() is true.
Christoph
--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
* Re: libxl: cannot start guest
2012-05-22 12:53 ` Ian Campbell
2012-05-22 13:17 ` Christoph Egger
@ 2012-05-22 13:18 ` Christoph Egger
2012-05-22 13:21 ` Ian Campbell
2012-05-22 13:21 ` Christoph Egger
1 sibling, 2 replies; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 13:18 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/22/12 14:53, Ian Campbell wrote:
> On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
>> On 05/21/12 17:57, Ian Campbell wrote:
>>
>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>> vdev=hda spec.backend=unknown
>>>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>>>> vdev=hda, using backend phy
>>>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>>>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>> Loader: 0000000000100000->000000000019bd04
>>>> TOTAL: 0000000000000000->00000000ff800000
>>>> ENTRY ADDRESS: 0000000000100000
>>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>> 4KB PAGES: 0x0000000000000200
>>>> 2MB PAGES: 0x00000000000003fb
>>>> 1GB PAGES: 0x0000000000000002
>>>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>>>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>>>> of range, valid values are within range from 1 to 65535
>>>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>>>> libxl_sched_credit_domain_set failed -6
>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>> vdev=hda spec.backend=phy
>>>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>>>> dompath for 7: Bad file descriptor
>>>
>>> This is back to the original issue, I think the last couple of mails
>>> have been something of a tangent since you weren't getting as far as
>>> this failure.
>>>
>>> I'm not really sure what to suggest here -- something is either closing
>>> the fd or scribbling over the memory which contains it.
>>>
>>> I suppose you could sprinkle calls to libxl__xs_get_dompath() around
>>> between libxl__sched_set_params and libxl__device_disk_set_backend and
>>> see where it starts failing -- that's going to be pretty tedious though.
>>
>>
>> It starts failing in libxl__build_post() right after
>> xs_introduce_domain().
>
> What method did you use to determine that?
What you said:
"sprinkle calls to libxl__xs_get_dompath() around between
libxl__sched_set_params and libxl__device_disk_set_backend and
see where it starts failing"
> So at the xs_transaction_end right before that ctx->xsh is valid, but
> right after...
> xs_introduce_domain(ctx->xsh, domid, state->store_mfn, state->store_port);
> ...it is invalid? i.e. before the free(vmpath) it is already corrupt?
Yes, you got it.
>
> (Aside: why isn't vmpath in the gc, instead of done manually,
> nevermind...)
>
> Does the xs_introduce_domain itself succeed?
No, it fails.
> Or do you mean that the next use of xsh after this fails
> (where is that, somewhere back up the callchain? store_libxl_entry
> perhaps?)
>
> xs_introduce_domain doesn't seem to do much which is untoward with the
> handle.
I think in xs_talkv() something must fail.
> The only thing which springs to mind is that it may generate an
> @IntroduceDomain watch event. However xl is single threaded so we won't
> process that event until we unwind to whichever point we do an event
> loop iteration, in which case the corruption would have to happen later
> than right after xs_introduce_domain().
>
> Did you manage to determine if "Bad file descriptor" was due to it being
> closed vs. the value being corrupted?
My suspicion is that
if (msg.type != type)
in xs_talkv() is true.
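To make the suspicion concrete, here is a toy model, not the libxenstore source. It assumes that an unexpected reply type is treated as a fatal protocol error: errno is set to EBADF and the connection is closed, so every later call on the same handle also fails with "Bad file descriptor". The names toy_handle and toy_check_reply are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical, simplified stand-in for a request/reply client handle. */
struct toy_handle {
    int fd;     /* -1 once the connection has been torn down */
};

/* Compare the type of the reply with the type of the request.
 * Returns 0 on a match. On a mismatch the handle is considered
 * unusable: the descriptor is closed and errno is set to EBADF. */
static int toy_check_reply(struct toy_handle *h,
                           unsigned sent_type, unsigned reply_type)
{
    assert(h != NULL);

    if (h->fd < 0) {            /* already torn down earlier */
        errno = EBADF;
        return -1;
    }
    if (reply_type != sent_type) {
        close(h->fd);           /* protocol is out of sync: give up */
        h->fd = -1;
        errno = EBADF;
        return -1;
    }
    return 0;
}
```

Under that assumption nothing needs to scribble over memory: a single mismatched reply is enough to make all subsequent operations on the handle fail the same way.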
Christoph
* Re: libxl: cannot start guest
2012-05-22 13:18 ` Christoph Egger
@ 2012-05-22 13:21 ` Ian Campbell
2012-05-22 14:03 ` Christoph Egger
2012-05-22 13:21 ` Christoph Egger
1 sibling, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-22 13:21 UTC (permalink / raw)
To: Christoph Egger; +Cc: xen-devel@lists.xen.org
On Tue, 2012-05-22 at 14:18 +0100, Christoph Egger wrote:
> I think in xs_talkv() something must fail.
>
> > The only thing which springs to mind is that it may generate an
> > @IntroduceDomain watch event. However xl is single threaded so we won't
> > process that event until we unwind to whichever point we do an event
> > loop iteration, in which case the corruption would have to happen later
> > than right after xs_introduce_domain().
> >
> > Did you manage to determine if "Bad file descriptor" was due to it being
> > closed vs. the value being corrupted?
>
> My suspicion is that
>
> if (msg.type != type)
>
> in xs_talkv() is true.
>
Yes, that definitely seems worth investigating.
Ian.
* Re: libxl: cannot start guest
2012-05-22 13:18 ` Christoph Egger
2012-05-22 13:21 ` Ian Campbell
@ 2012-05-22 13:21 ` Christoph Egger
1 sibling, 0 replies; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 13:21 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/22/12 15:18, Christoph Egger wrote:
> On 05/22/12 14:53, Ian Campbell wrote:
>
>> On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
>>> On 05/21/12 17:57, Ian Campbell wrote:
>>>
>>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda spec.backend=unknown
>>>>> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda, using backend phy
>>>>> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
>>>>> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
>>>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>>> Loader: 0000000000100000->000000000019bd04
>>>>> TOTAL: 0000000000000000->00000000ff800000
>>>>> ENTRY ADDRESS: 0000000000100000
>>>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>>> 4KB PAGES: 0x0000000000000200
>>>>> 2MB PAGES: 0x00000000000003fb
>>>>> 1GB PAGES: 0x0000000000000002
>>>>> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
>>>>> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
>>>>> of range, valid values are within range from 1 to 65535
>>>>> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
>>>>> libxl_sched_credit_domain_set failed -6
>>>>> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
>>>>> vdev=hda spec.backend=phy
>>>>> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
>>>>> dompath for 7: Bad file descriptor
>>>>
>>>> This is back to the original issue, I think the last couple of mails
>>>> have been something of a tangent since you weren't getting as far as
>>>> this failure.
>>>>
>>>> I'm not really sure what to suggest here -- something is either closing
>>>> the fd or scribbling over the memory which contains it.
>>>>
>>>> I suppose you could sprinkle calls to libxl__xs_get_dompath() around
>>>> between libxl__sched_set_params and libxl__device_disk_set_backend and
>>>> see where it starts failing -- that's going to be pretty tedious though.
>>>
>>>
>>> It starts failing in libxl__build_post() right after
>>> xs_introduce_domain().
>>
>> What method did you use to determine that?
>
>
>
>
> What you said:
>
> "sprinkle calls to libxl__xs_get_dompath() around between
> libxl__sched_set_params and libxl__device_disk_set_backend and
> see where it starts failing"
>
> > So at the xs_transaction_end right before that ctx->xsh is valid, but
>
>> right after...
>> xs_introduce_domain(ctx->xsh, domid, state->store_mfn, state->store_port);
>> ...it is invalid? i.e. before the free(vmpath) it is already corrupt?
>
>
>
>
> Yes, you got it.
>
>>
>> (Aside: why isn't vmpath in the gc, instead of done manually,
>> nevermind...)
>>
>> Does the xs_introduce_domain itself succeed?
>
>
>
>
> No, it fails.
>
>> Or do you mean that the next use of xsh after this fails
>
>> (where is that, somewhere back up the callchain? store_libxl_entry
>> perhaps?)
>
>>
>> xs_introduce_domain doesn't seem to do much which is untoward with the
>> handle.
>
>
>
> I think in xs_talkv() something must fail.
>
>> The only thing which springs to mind is that it may generate an
>> @IntroduceDomain watch event. However xl is single threaded so we won't
>> process that event until we unwind to whichever point we do an event
>> loop iteration, in which case the corruption would have to happen later
>> than right after xs_introduce_domain().
>>
>> Did you manage to determine if "Bad file descriptor" was due to it being
>> closed vs. the value being corrupted?
>
> My suspicion is that
>
> if (msg.type != type)
>
> in xs_talkv() is true.
>
> Christoph
* Re: libxl: cannot start guest
2012-05-22 13:21 ` Ian Campbell
@ 2012-05-22 14:03 ` Christoph Egger
2012-05-22 14:20 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 14:03 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xen.org
On 05/22/12 15:21, Ian Campbell wrote:
> On Tue, 2012-05-22 at 14:18 +0100, Christoph Egger wrote:
>> I think in xs_talkv() something must fail.
>>
>>> The only thing which springs to mind is that it may generate an
>>> @IntroduceDomain watch event. However xl is single threaded so we won't
>>> process that event until we unwind to whichever point we do an event
>>> loop iteration, in which case the corruption would have to happen later
>>> than right after xs_introduce_domain().
>>>
>>> Did you manage to determine if "Bad file descriptor" was due to it being
>>> closed vs. the value being corrupted?
>>
>> My suspicion is that
>>
>> if (msg.type != type)
>>
>> in xs_talkv() is true.
>>
>
> Yes, that definitely seems worth investigating.
Ok, I got it.
xenstored crashes due to dereferencing NULL pointer.
In xenstored_domain.c, map_interface() *xcg_handle is NULL
and in xc_gnttab.c, xc_gnttab_map_grant_ref() it is dereferenced.
Christoph
* Re: libxl: cannot start guest
2012-05-22 14:03 ` Christoph Egger
@ 2012-05-22 14:20 ` Ian Campbell
2012-05-22 15:16 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-22 14:20 UTC (permalink / raw)
To: Christoph Egger; +Cc: Daniel De Graaf, xen-devel@lists.xen.org
On Tue, 2012-05-22 at 15:03 +0100, Christoph Egger wrote:
> On 05/22/12 15:21, Ian Campbell wrote:
>
> > On Tue, 2012-05-22 at 14:18 +0100, Christoph Egger wrote:
> >> I think in xs_talkv() something must fail.
> >>
> >>> The only thing which springs to mind is that it may generate an
> >>> @IntroduceDomain watch event. However xl is single threaded so we won't
> >>> process that event until we unwind to whichever point we do an event
> >>> loop iteration, in which case the corruption would have to happen later
> >>> than right after xs_introduce_domain().
> >>>
> >>> Did you manage to determine if "Bad file descriptor" was due to it being
> >>> closed vs. the value being corrupted?
> >>
> >> My suspicion is that
> >>
> >> if (msg.type != type)
> >>
> >> in xs_talkv() is true.
> >>
> >
> > Yes, that definitely seems worth investigating.
>
>
> Ok, I got it.
>
> xenstored crashes due to dereferencing NULL pointer.
Huh, xenstore hasn't materially changed for quite a while (since February).
> In xenstored_domain.c, map_interface() *xcg_handle is NULL
> and in xc_gnttab.c, xc_gnttab_map_grant_ref() it is dereferenced.
This comes from 24757:aae516b78fce. Diego and Alex aren't around any
more but CCing Daniel in case he remembers anything.
I guess the original xc_gnttab_open which sets *xcg_handle is failing
for you, I suppose that is to be expected on NetBSD? Either way it
should still work after this has failed.
All the >= checks on *xcg_handle seem wrong to me. Really they should be
checking != NULL, since otherwise they don't actually discriminate the
two cases! Does making that change help?
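To illustrate the point: a NULL handle passes a ">= 0" style test just like a valid one does, so the preferred path is taken and the NULL is dereferenced further down. A minimal sketch of the intended discrimination, using hypothetical names (toy_backend, map_path, choose_map_path) rather than the real xc_gnttab types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an optional backend handle (e.g. one
 * that could not be opened on a platform without the driver). */
struct toy_backend {
    int id;
};

enum map_path {
    MAP_VIA_BACKEND,    /* preferred method, needs a valid handle */
    MAP_VIA_FALLBACK    /* used when the backend was never opened */
};

/* Pick the mapping path. The handle slot itself must exist, but the
 * handle stored in it may legitimately be NULL if the open failed. */
static enum map_path choose_map_path(struct toy_backend **handle)
{
    assert(handle != NULL);

    if (*handle != NULL)
        return MAP_VIA_BACKEND;

    return MAP_VIA_FALLBACK;
}
```

With the explicit NULL test the failed-open case falls through to the fallback instead of reaching code that dereferences the handle.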
Ian.
* Re: libxl: cannot start guest
2012-05-22 14:20 ` Ian Campbell
@ 2012-05-22 15:16 ` Christoph Egger
2012-05-22 15:21 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 15:16 UTC (permalink / raw)
To: Ian Campbell; +Cc: Daniel De Graaf, xen-devel@lists.xen.org
On 05/22/12 16:20, Ian Campbell wrote:
> On Tue, 2012-05-22 at 15:03 +0100, Christoph Egger wrote:
>> On 05/22/12 15:21, Ian Campbell wrote:
>>
>>> On Tue, 2012-05-22 at 14:18 +0100, Christoph Egger wrote:
>>>> I think in xs_talkv() something must fail.
>>>>
>>>>> The only thing which springs to mind is that it may generate an
>>>>> @IntroduceDomain watch event. However xl is single threaded so we won't
>>>>> process that event until we unwind to whichever point we do an event
>>>>> loop iteration, in which case the corruption would have to happen later
>>>>> than right after xs_introduce_domain().
>>>>>
>>>>> Did you manage to determine if "Bad file descriptor" was due to it being
>>>>> closed vs. the value being corrupted?
>>>>
>>>> My suspicion is that
>>>>
>>>> if (msg.type != type)
>>>>
>>>> in xs_talkv() is true.
>>>>
>>>
>>> Yes, that definitely seems worth investigating.
>>
>>
>> Ok, I got it.
>>
>> xenstored crashes due to dereferencing NULL pointer.
>
> Huh, xenstore hasn't materially changed for quite a while (since February).
>
>> In xenstored_domain.c, map_interface() *xcg_handle is NULL
>> and in xc_gnttab.c, xc_gnttab_map_grant_ref() it is dereferenced.
>
> This comes from 24757:aae516b78fce. Diego and Alex aren't around any
> more but CCing Daniel in case he remembers anything.
>
> I guess the original xc_gnttab_open which sets *xcg_handle is failing
> for you, I suppose that is to be expected on NetBSD? Either way it
> should still work after this has failed.
>
> All the >= checks on *xcg_handle seem wrong to me. Really they should be
> checking != NULL, since otherwise they don't actually discriminate the
> two cases! Does making that change help?
Yes, that helps! I can start guests again.
Christoph
* Re: libxl: cannot start guest
2012-05-22 15:16 ` Christoph Egger
@ 2012-05-22 15:21 ` Ian Campbell
2012-05-22 15:32 ` Christoph Egger
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2012-05-22 15:21 UTC (permalink / raw)
To: Christoph Egger; +Cc: Daniel De Graaf, xen-devel@lists.xen.org
On Tue, 2012-05-22 at 16:16 +0100, Christoph Egger wrote:
> On 05/22/12 16:20, Ian Campbell wrote:
> > All the >= checks on *xcg_handle seem wrong to me. Really they should be
> > checking != NULL, since otherwise they don't actually discriminate the
> > two cases! Does making that change help?
>
> Yes, that helps! I can start guests again.
Excellent, I assume you are going to submit the patch (i.e. I don't need
to..)
Ian.
* Re: libxl: cannot start guest
2012-05-22 15:21 ` Ian Campbell
@ 2012-05-22 15:32 ` Christoph Egger
2012-05-23 10:11 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-22 15:32 UTC (permalink / raw)
To: Ian Campbell; +Cc: Daniel De Graaf, xen-devel@lists.xen.org
[-- Attachment #1: Type: text/plain, Size: 954 bytes --]
On 05/22/12 17:21, Ian Campbell wrote:
> On Tue, 2012-05-22 at 16:16 +0100, Christoph Egger wrote:
>> On 05/22/12 16:20, Ian Campbell wrote:
>>> All the >= checks on *xcg_handle seem wrong to me. Really they should be
>>> checking != NULL, since otherwise they don't actually discriminate the
>>> two cases! Does making that change help?
>>
>> Yes, that helps! I can start guests again.
>
> Excellent, I assume you are going to submit the patch (i.e. I don't need
> to..)
Yes, patch attached.
Fix pointer checks introduced in changeset 24757:aae516b78fce.
This fixes a xenstored crash on platforms with no gnttab implementation.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
[-- Attachment #2: xen_tools_xenstore.diff --]
[-- Type: text/plain, Size: 780 bytes --]
diff -r 99263132665b tools/xenstore/xenstored_domain.c
--- a/tools/xenstore/xenstored_domain.c Fri May 18 12:38:55 2012 +0200
+++ b/tools/xenstore/xenstored_domain.c Tue May 22 17:25:03 2012 +0200
@@ -167,7 +167,7 @@ static int readchn(struct connection *co
static void *map_interface(domid_t domid, unsigned long mfn)
{
- if (*xcg_handle >= 0) {
+ if (*xcg_handle != NULL) {
/* this is the preferred method */
return xc_gnttab_map_grant_ref(*xcg_handle, domid,
GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
@@ -179,7 +179,7 @@ static void *map_interface(domid_t domid
static void unmap_interface(void *interface)
{
- if (*xcg_handle >= 0)
+ if (*xcg_handle != NULL)
xc_gnttab_munmap(*xcg_handle, interface, 1);
else
munmap(interface, getpagesize());
* Re: libxl: cannot start guest
2012-05-22 15:32 ` Christoph Egger
@ 2012-05-23 10:11 ` Ian Campbell
2012-05-24 9:44 ` Christoph Egger
2012-05-29 10:02 ` Ian Jackson
0 siblings, 2 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-23 10:11 UTC (permalink / raw)
To: Christoph Egger, Ian Jackson; +Cc: Daniel De Graaf, xen-devel@lists.xen.org
On Tue, 2012-05-22 at 16:32 +0100, Christoph Egger wrote:
> On 05/22/12 17:21, Ian Campbell wrote:
>
> > On Tue, 2012-05-22 at 16:16 +0100, Christoph Egger wrote:
> >> On 05/22/12 16:20, Ian Campbell wrote:
> >>> All the >= checks on *xcg_handle seem wrong to me. Really they should be
> >>> checking != NULL, since otherwise they don't actually discriminate the
> >>> two cases! Does making that change help?
> >>
> >> Yes, that helps! I can start guests again.
> >
> > Excellent, I assume you are going to submit the patch (i.e. I don't need
> > to..)
>
> Yes, patch attached.
I fixed up the commit message as follows. I'll apply if IanJ agrees or
acks it.
8<-----------------------------
From 6b43ca97f5f8c4fa9bf24101253af21bc66ddf96 Mon Sep 17 00:00:00 2001
From: Christoph Egger <Christoph.Egger@amd.com>
Date: Tue, 22 May 2012 17:32:21 +0200
Subject: [PATCH] xenstore: fix crash on platforms with no gntdev driver implementation.
Fix pointer checks introduced in changeset 24757:aae516b78fce.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
tools/xenstore/xenstored_domain.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index f8c822f..bf83d58 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -167,7 +167,7 @@ static int readchn(struct connection *conn, void *data, unsigned int len)
static void *map_interface(domid_t domid, unsigned long mfn)
{
- if (*xcg_handle >= 0) {
+ if (*xcg_handle != NULL) {
/* this is the preferred method */
return xc_gnttab_map_grant_ref(*xcg_handle, domid,
GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
@@ -179,7 +179,7 @@ static void *map_interface(domid_t domid, unsigned long mfn)
static void unmap_interface(void *interface)
{
- if (*xcg_handle >= 0)
+ if (*xcg_handle != NULL)
xc_gnttab_munmap(*xcg_handle, interface, 1);
else
munmap(interface, getpagesize());
--
1.7.2.5
* Re: libxl: cannot start guest
2012-05-23 10:11 ` Ian Campbell
@ 2012-05-24 9:44 ` Christoph Egger
2012-05-25 14:56 ` Roger Pau Monne
2012-05-29 10:02 ` Ian Jackson
1 sibling, 1 reply; 30+ messages in thread
From: Christoph Egger @ 2012-05-24 9:44 UTC (permalink / raw)
To: Ian Campbell; +Cc: Daniel De Graaf, Ian Jackson, xen-devel@lists.xen.org
On 05/23/12 12:11, Ian Campbell wrote:
> On Tue, 2012-05-22 at 16:32 +0100, Christoph Egger wrote:
>> On 05/22/12 17:21, Ian Campbell wrote:
>>
>>> On Tue, 2012-05-22 at 16:16 +0100, Christoph Egger wrote:
>>>> On 05/22/12 16:20, Ian Campbell wrote:
>>>>> All the >= checks on *xcg_handle seem wrong to me. Really they should be
>>>>> checking != NULL, since otherwise they don't actually discriminate the
>>>>> two cases! Does making that change help?
>>>>
>>>> Yes, that helps! I can start guests again.
>>>
>>> Excellent, I assume you are going to submit the patch (i.e. I don't need
>>> to..)
>>
>> Yes, patch attached.
>
> I fixed up the commit message as follows. I'll apply if IanJ agrees or
> acks it.
Thank you. Ian J. what do you say?
Christoph
> 8<-----------------------------
>
> From 6b43ca97f5f8c4fa9bf24101253af21bc66ddf96 Mon Sep 17 00:00:00 2001
> From: Christoph Egger <Christoph.Egger@amd.com>
> Date: Tue, 22 May 2012 17:32:21 +0200
> Subject: [PATCH] xenstore: fix crash on platforms with no gntdev driver implementation.
>
> Fix pointer checks introduced in changeset 24757:aae516b78fce.
>
> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> ---
> tools/xenstore/xenstored_domain.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
> index f8c822f..bf83d58 100644
> --- a/tools/xenstore/xenstored_domain.c
> +++ b/tools/xenstore/xenstored_domain.c
> @@ -167,7 +167,7 @@ static int readchn(struct connection *conn, void *data, unsigned int len)
>
> static void *map_interface(domid_t domid, unsigned long mfn)
> {
> - if (*xcg_handle >= 0) {
> + if (*xcg_handle != NULL) {
> /* this is the preferred method */
> return xc_gnttab_map_grant_ref(*xcg_handle, domid,
> GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
> @@ -179,7 +179,7 @@ static void *map_interface(domid_t domid, unsigned long mfn)
>
> static void unmap_interface(void *interface)
> {
> - if (*xcg_handle >= 0)
> + if (*xcg_handle != NULL)
> xc_gnttab_munmap(*xcg_handle, interface, 1);
> else
> munmap(interface, getpagesize());
* Re: libxl: cannot start guest
2012-05-24 9:44 ` Christoph Egger
@ 2012-05-25 14:56 ` Roger Pau Monne
2012-05-25 15:42 ` Ian Campbell
0 siblings, 1 reply; 30+ messages in thread
From: Roger Pau Monne @ 2012-05-25 14:56 UTC (permalink / raw)
To: Christoph Egger
Cc: Daniel De Graaf, Ian Jackson, Ian Campbell,
xen-devel@lists.xen.org
Christoph Egger wrote:
> On 05/23/12 12:11, Ian Campbell wrote:
>
>> On Tue, 2012-05-22 at 16:32 +0100, Christoph Egger wrote:
>>> On 05/22/12 17:21, Ian Campbell wrote:
>>>
>>>> On Tue, 2012-05-22 at 16:16 +0100, Christoph Egger wrote:
>>>>> On 05/22/12 16:20, Ian Campbell wrote:
>>>>>> All the >= checks on *xcg_handle seem wrong to me. Really they should be
>>>>>> checking != NULL, since otherwise they don't actually discriminate the
>>>>>> two cases! Does making that change help?
>>>>> Yes, that helps! I can start guests again.
>>>> Excellent, I assume you are going to submit the patch (i.e. I don't need
>>>> to..)
>>> Yes, patch attached.
>> I fixed up the commit message as follows. I'll apply if IanJ agrees or
>> acks it.
>
> Thank you. Ian J. what do you say?
>
> Christoph
>
>
>> 8<-----------------------------
>>
>> From 6b43ca97f5f8c4fa9bf24101253af21bc66ddf96 Mon Sep 17 00:00:00 2001
>> From: Christoph Egger <Christoph.Egger@amd.com>
>> Date: Tue, 22 May 2012 17:32:21 +0200
>> Subject: [PATCH] xenstore: fix crash on platforms with no gntdev driver implementation.
>>
>> Fix pointer checks introduced in changeset 24757:aae516b78fce.
>>
>> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>> ---
>> tools/xenstore/xenstored_domain.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
>> index f8c822f..bf83d58 100644
>> --- a/tools/xenstore/xenstored_domain.c
>> +++ b/tools/xenstore/xenstored_domain.c
>> @@ -167,7 +167,7 @@ static int readchn(struct connection *conn, void *data, unsigned int len)
>>
>> static void *map_interface(domid_t domid, unsigned long mfn)
>> {
>> - if (*xcg_handle >= 0) {
>> + if (*xcg_handle != NULL) {
>> /* this is the preferred method */
>> return xc_gnttab_map_grant_ref(*xcg_handle, domid,
>> GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
>> @@ -179,7 +179,7 @@ static void *map_interface(domid_t domid, unsigned long mfn)
>>
>> static void unmap_interface(void *interface)
>> {
>> - if (*xcg_handle >= 0)
>> + if (*xcg_handle != NULL)
>> xc_gnttab_munmap(*xcg_handle, interface, 1);
>> else
>> munmap(interface, getpagesize());
I also see an error when starting xencommons on NetBSD:
test# /usr/xen42/etc/rc.d/xencommons onestart
Cleaning xenstore database.
Starting xenservices: xenstored, xenconsoled, xenbackendd.xc: error:
OSDEP: interface 2 (gnttab) not supported on this platform: Internal error
Which is quite annoying, but I'm not really sure of the most elegant way
to solve this. The error comes from tools/libxc/xc_private.c:177, so
maybe just removing that message would be ok, or something like this:
--- a/tools/libxc/xc_private.c
+++ b/tools/libxc/xc_private.c
@@ -265,8 +265,12 @@ int xc_evtchn_close(xc_evtchn *xce)
xc_gnttab *xc_gnttab_open(xentoollog_logger *logger,
unsigned open_flags)
{
+#ifndef __NetBSD__
return xc_interface_open_common(logger, NULL, open_flags,
XC_OSDEP_GNTTAB);
+#else
+ return NULL;
+#endif
}
Which is not really pretty.
* Re: libxl: cannot start guest
2012-05-25 14:56 ` Roger Pau Monne
@ 2012-05-25 15:42 ` Ian Campbell
0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-25 15:42 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Christoph Egger, Ian Jackson, Daniel De Graaf,
xen-devel@lists.xen.org
On Fri, 2012-05-25 at 15:56 +0100, Roger Pau Monne wrote:
> I also see an error when starting xencommons on NetBSD:
>
> test# /usr/xen42/etc/rc.d/xencommons onestart
> Cleaning xenstore database.
> Starting xenservices: xenstored, xenconsoled, xenbackendd.xc: error:
> OSDEP: interface 2 (gnttab) not supported on this platform: Internal error
>
> Which is quite annoying, but I'm not really sure of the most elegant way
> to solve this. The error comes from tools/libxc/xc_private.c:177, so
> maybe just removing that message would be ok,
I think removing the message is fine. This interface is intentionally
"optional" so making a load of noise when the option is exercised seems
silly...
If you make it a DPRINTF is it silent in this context? If not then just
nuke it entirely...
Ian.
> or something like this:
>
> --- a/tools/libxc/xc_private.c
> +++ b/tools/libxc/xc_private.c
> @@ -265,8 +265,12 @@ int xc_evtchn_close(xc_evtchn *xce)
> xc_gnttab *xc_gnttab_open(xentoollog_logger *logger,
> unsigned open_flags)
> {
> +#ifndef __NetBSD__
> return xc_interface_open_common(logger, NULL, open_flags,
> XC_OSDEP_GNTTAB);
> +#else
> + return NULL;
> +#endif
> }
>
> Which is not really pretty.
* Re: libxl: cannot start guest
2012-05-23 10:11 ` Ian Campbell
2012-05-24 9:44 ` Christoph Egger
@ 2012-05-29 10:02 ` Ian Jackson
2012-05-29 10:13 ` Ian Campbell
1 sibling, 1 reply; 30+ messages in thread
From: Ian Jackson @ 2012-05-29 10:02 UTC (permalink / raw)
To: Ian Campbell; +Cc: Christoph Egger, Daniel De Graaf, xen-devel@lists.xen.org
Ian Campbell writes ("Re: [Xen-devel] libxl: cannot start guest"):
> From: Christoph Egger <Christoph.Egger@amd.com>
> Date: Tue, 22 May 2012 17:32:21 +0200
> Subject: [PATCH] xenstore: fix crash on platforms with no gntdev driver implementation.
>
> Fix pointer checks introduced in changeset 24757:aae516b78fce.
>
> Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* Re: libxl: cannot start guest
2012-05-29 10:02 ` Ian Jackson
@ 2012-05-29 10:13 ` Ian Campbell
0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2012-05-29 10:13 UTC (permalink / raw)
To: Ian Jackson; +Cc: Christoph Egger, Daniel De Graaf, xen-devel@lists.xen.org
On Tue, 2012-05-29 at 11:02 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] libxl: cannot start guest"):
> > From: Christoph Egger <Christoph.Egger@amd.com>
> > Date: Tue, 22 May 2012 17:32:21 +0200
> > Subject: [PATCH] xenstore: fix crash on platforms with no gntdev driver implementation.
> >
> > Fix pointer checks introduced in changeset 24757:aae516b78fce.
> >
> > Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed, thanks.
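[Editorial sketch, not part of the original message.] The patch body is not quoted in this part of the thread, so the following only illustrates the general kind of check its subject line describes: when an open function returns a pointer, a missing driver shows up as NULL, and the caller has to test for that before dereferencing. All names (`sketch_gnt_open`, `sketch_map_ring`, the `fd` field) are hypothetical and are not taken from xenstored or libxc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle type standing in for a grant-table handle. */
struct sketch_gnt { int fd; };

/* Hypothetical opener: returns NULL when the platform has no driver.
 * The value 3 is an arbitrary placeholder descriptor. */
struct sketch_gnt *sketch_gnt_open(struct sketch_gnt *storage, int driver_present)
{
    if (!driver_present)
        return NULL;
    assert(storage != NULL);
    storage->fd = 3;
    return storage;
}

/* Returns the descriptor to map through, or -1 so the caller can fall
 * back to another mechanism. The NULL test must come before any use of
 * the handle; comparing a pointer with "< 0" would not detect failure. */
int sketch_map_ring(const struct sketch_gnt *handle)
{
    if (handle == NULL)
        return -1;
    return handle->fd;
}
```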
Thread overview: 30+ messages
2012-05-18 13:17 libxl: cannot start guest Christoph Egger
2012-05-18 13:30 ` Ian Campbell
2012-05-18 14:23 ` Christoph Egger
2012-05-18 15:51 ` Christoph Egger
2012-05-18 15:58 ` Ian Campbell
2012-05-21 10:26 ` Christoph Egger
2012-05-21 12:15 ` Ian Campbell
2012-05-21 13:10 ` Christoph Egger
2012-05-21 13:49 ` Ian Campbell
2012-05-21 15:43 ` Christoph Egger
2012-05-21 15:44 ` Christoph Egger
2012-05-21 15:57 ` Ian Campbell
2012-05-21 16:11 ` Ian Jackson
2012-05-22 12:35 ` Christoph Egger
2012-05-22 12:53 ` Ian Campbell
2012-05-22 13:17 ` Christoph Egger
2012-05-22 13:18 ` Christoph Egger
2012-05-22 13:21 ` Ian Campbell
2012-05-22 14:03 ` Christoph Egger
2012-05-22 14:20 ` Ian Campbell
2012-05-22 15:16 ` Christoph Egger
2012-05-22 15:21 ` Ian Campbell
2012-05-22 15:32 ` Christoph Egger
2012-05-23 10:11 ` Ian Campbell
2012-05-24 9:44 ` Christoph Egger
2012-05-25 14:56 ` Roger Pau Monne
2012-05-25 15:42 ` Ian Campbell
2012-05-29 10:02 ` Ian Jackson
2012-05-29 10:13 ` Ian Campbell
2012-05-22 13:21 ` Christoph Egger