* Unable to start a domain with no disks
@ 2012-10-12 15:45 Ian Campbell
2012-10-12 15:50 ` Ian Campbell
0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 15:45 UTC (permalink / raw)
To: Ian Jackson, Roger Pau Monne; +Cc: xen-devel
Trying to start a very simple domain with no disk from:
name="x"
memory=128
kernel="/scratch/debian/squeeze/amd64/vmlinuz"
ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
extra="console=hvc0"
vcpus=1
Results in:
xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
if I try to autoconnnect the console (ie. xl cr -c).
If I don't try and auto connect to the console then everything is fine
and I can subsequently use "xl cons" to connect.
Backtrace is:
xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
Program received signal SIGABRT, Aborted.
0xff7fe424 in __kernel_vsyscall ()
(gdb) bt
#0 0xff7fe424 in __kernel_vsyscall ()
#1 0xb7e18781 in raise () from /lib/i686/cmov/libc.so.6
#2 0xb7e1bbb2 in abort () from /lib/i686/cmov/libc.so.6
#3 0xb7e118e8 in __assert_fail () from /lib/i686/cmov/libc.so.6
#4 0xb7faf7c1 in libxl__ao_complete_check_progress_reports (egc=0xbffff42c, ao=0x806a870) at libxl_event.c:1522
#5 0xb7fb035b in egc_run_callbacks (egc=0xbffff42c) at libxl_event.c:1095
#6 libxl__egc_cleanup (egc=0xbffff42c) at libxl_event.c:1116
#7 0xb7f95059 in do_domain_create (ctx=<value optimized out>, d_config=<value optimized out>, domid=0xbffff74c, restore_fd=-1, ao_how=0x0, aop_console_how=0xbffff54c) at libxl_create.c:1197
#8 0xb7f9510f in libxl_domain_create_new (ctx=0x806a030, d_config=0xbffff59c, domid=0xbffff74c, ao_how=0x0, aop_console_how=0xbffff54c) at libxl_create.c:1218
#9 0x0805c5a7 in create_domain (dom_info=<value optimized out>) at xl_cmdimpl.c:1932
#10 0x0805ded2 in main_create (argc=3, argv=0xbffffe5a) at xl_cmdimpl.c:3973
#11 0x0804d41e in main (argc=4, argv=0xbffffd24) at xl.c:285
Full log follows below. Any ideas?
Ian.
# xl -vvv cr -c x
Parsing config from x
libxl: debug: libxl_create.c:1184:do_domain_create: ao 0x806a870: create: how=(nil) callback=(nil) poller=0x806a8b0
libxl: debug: libxl_create.c:677:initiate_domain_create: running bootloader
libxl: debug: libxl_bootloader.c:327:libxl__bootloader_run: no bootloader configured, using user supplied kernel
libxl: debug: libxl_event.c:561:libxl__ev_xswatch_deregister: watch w=0x806aa84: deregister unregistered
libxl: debug: libxl_numa.c:435:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=4, nr_vcpus=5, free_memkb=3467
libxl: detail: libxl_dom.c:192:numa_place_domain: NUMA placement candidate with 1 nodes, 4 cpus and 3467 KB free selected
domainbuilder: detail: xc_dom_allocate: cmdline="console=hvc0", features="(null)"
libxl: debug: libxl_dom.c:380:libxl__build_pv: pv kernel mapped 0 path /scratch/debian/squeeze/amd64/vmlinuz
domainbuilder: detail: xc_dom_kernel_file: filename="/scratch/debian/squeeze/amd64/vmlinuz"
domainbuilder: detail: xc_dom_malloc_filemap : 2367 kB
domainbuilder: detail: xc_dom_ramdisk_file: filename="/scratch/debian/squeeze/amd64/initrd.gz"
domainbuilder: detail: xc_dom_malloc_filemap : 19090 kB
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.3, caps xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ...
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ...
domainbuilder: detail: xc_dom_malloc : 11914 kB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x2485c8 -> 0xba2b00
domainbuilder: detail: loader probe OK
xc: detail: elf_parse_binary: phdr: paddr=0x1000000 memsz=0x42d000
xc: detail: elf_parse_binary: phdr: paddr=0x142d000 memsz=0xd6150
xc: detail: elf_parse_binary: phdr: paddr=0x1504000 memsz=0x888
xc: detail: elf_parse_binary: phdr: paddr=0x1505000 memsz=0x160d8
xc: detail: elf_parse_binary: phdr: paddr=0x151c000 memsz=0x1de000
xc: detail: elf_parse_binary: memory: 0x1000000 -> 0x16fa000
xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff8151c200
xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81009000
xc: detail: elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
xc: detail: elf_xen_parse_note: PAE_MODE = "yes"
xc: detail: elf_xen_parse_note: LOADER = "generic"
xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
xc: detail: elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
xc: detail: elf_xen_addr_calc_check: addresses:
xc: detail: virt_base = 0xffffffff80000000
xc: detail: elf_paddr_offset = 0x0
xc: detail: virt_offset = 0xffffffff80000000
xc: detail: virt_kstart = 0xffffffff81000000
xc: detail: virt_kend = 0xffffffff816fa000
xc: detail: virt_entry = 0xffffffff8151c200
xc: detail: p2m_base = 0xffffffffffffffff
domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff81000000 -> 0xffffffff816fa000
domainbuilder: detail: xc_dom_mem_init: mem 128 MB, pages 0x8000 pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0x8000 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
domainbuilder: detail: xc_dom_malloc : 128 kB
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_alloc_segment: kernel : 0xffffffff81000000 -> 0xffffffff816fa000 (pfn 0x1000 + 0x6fa pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0x6fa at 0xb4ce6000
xc: detail: elf_load_binary: phdr 0 at 0x0xb4ce6000 -> 0x0xb5113000
xc: detail: elf_load_binary: phdr 1 at 0x0xb5113000 -> 0x0xb51e9150
xc: detail: elf_load_binary: phdr 2 at 0x0xb51ea000 -> 0x0xb51ea888
xc: detail: elf_load_binary: phdr 3 at 0x0xb51eb000 -> 0x0xb52010d8
xc: detail: elf_load_binary: phdr 4 at 0x0xb5202000 -> 0x0xb5288000
domainbuilder: detail: xc_dom_alloc_segment: ramdisk : 0xffffffff816fa000 -> 0xffffffff8481c000 (pfn 0x16fa + 0x3122 pages)
domainbuilder: detail: xc_dom_malloc : 294 kB
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x16fa+0x3122 at 0xb1b7a000
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x12a4954 -> 0x3121e10
domainbuilder: detail: xc_dom_alloc_segment: phys2mach : 0xffffffff8481c000 -> 0xffffffff8485c000 (pfn 0x481c + 0x40 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x481c+0x40 at 0xb1b3a000
domainbuilder: detail: xc_dom_alloc_page : start info : 0xffffffff8485c000 (pfn 0x485c)
domainbuilder: detail: xc_dom_alloc_page : xenstore : 0xffffffff8485d000 (pfn 0x485d)
domainbuilder: detail: xc_dom_alloc_page : console : 0xffffffff8485e000 (pfn 0x485e)
domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff84bfffff, 38 table(s)
domainbuilder: detail: xc_dom_alloc_segment: page tables : 0xffffffff8485f000 -> 0xffffffff84888000 (pfn 0x485f + 0x29 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x485f+0x29 at 0xb1b11000
domainbuilder: detail: xc_dom_alloc_page : boot stack : 0xffffffff84888000 (pfn 0x4888)
domainbuilder: detail: xc_dom_build_image : virt_alloc_end : 0xffffffff84889000
domainbuilder: detail: xc_dom_build_image : virt_pgtab_end : 0xffffffff84c00000
domainbuilder: detail: xc_dom_boot_image: called
domainbuilder: detail: arch_setup_bootearly: doing nothing
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x8000
domainbuilder: detail: clear_page: pfn 0x485e, mfn 0xdc43d
domainbuilder: detail: clear_page: pfn 0x485d, mfn 0xdc43e
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x485c+0x1 at 0xb1b0e000
domainbuilder: detail: start_info_x86_64: called
domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff81009000 pfn=0x1009
domainbuilder: detail: domain builder memory footprint
domainbuilder: detail: allocated
domainbuilder: detail: malloc : 12383 kB
domainbuilder: detail: anon mmap : 0 bytes
domainbuilder: detail: mapped
domainbuilder: detail: file mmap : 21457 kB
domainbuilder: detail: domU mmap : 56 MB
domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0xdccc8
domainbuilder: detail: shared_info_x86_64: called
domainbuilder: detail: vcpu_x86_64: called
domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x485f mfn 0xdc43c
domainbuilder: detail: launch_vm: called, ctxt=0xbfd9c2ec
domainbuilder: detail: xc_dom_release: called
libxl: debug: libxl_event.c:1677:libxl__ao_progress_report: ao 0x806a870: progress report: callback queued aop=0x806c010
libxl: debug: libxl_event.c:1497:libxl__ao_complete: ao 0x806a870: complete, rc=0
libxl: debug: libxl_create.c:1197:do_domain_create: ao 0x806a870: inprogress: poller=0x806a8b0, flags=ic
libxl: debug: libxl_event.c:1469:libxl__ao__destroy: ao 0x806a870: destroy
libxl: debug: libxl_event.c:1090:egc_run_callbacks: ao 0x806a870: progress report: callback aop=0x806c010
xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
Aborted
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Unable to start a domain with no disks
2012-10-12 15:45 Unable to start a domain with no disks Ian Campbell
@ 2012-10-12 15:50 ` Ian Campbell
2012-10-12 16:06 ` Ian Campbell
0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 15:50 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, Roger Pau Monne
On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> Trying to start a very simple domain with no disk from:
> name="x"
> memory=128
> kernel="/scratch/debian/squeeze/amd64/vmlinuz"
> ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
> extra="console=hvc0"
> vcpus=1
> Results in:
> xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
> if I try to autoconnnect the console (ie. xl cr -c).
>
> If I don't try and auto connect to the console then everything is fine
> and I can subsequently use "xl cons" to connect.
This makes the issue go away but since this function also free's the ao
I think there must be a use after free somewhere.
diff -r 8aadb3204cfa tools/libxl/libxl_event.c
--- a/tools/libxl/libxl_event.c Thu Sep 06 21:41:27 2012 +0200
+++ b/tools/libxl/libxl_event.c Fri Oct 12 16:49:23 2012 +0100
@@ -1468,6 +1468,7 @@ void libxl__ao__destroy(libxl_ctx *ctx,
if (!ao) return;
LOG(DEBUG,"ao %p: destroy",ao);
if (ao->poller) libxl__poller_put(ctx, ao->poller);
+ ao->poller = NULL;
ao->magic = LIBXL__AO_MAGIC_DESTROYED;
libxl__free_all(&ao->gc);
free(ao);
Ian.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Unable to start a domain with no disks
2012-10-12 15:50 ` Ian Campbell
@ 2012-10-12 16:06 ` Ian Campbell
2012-10-12 16:29 ` Ian Jackson
0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 16:06 UTC (permalink / raw)
To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel
On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > Trying to start a very simple domain with no disk from:
> > name="x"
> > memory=128
> > kernel="/scratch/debian/squeeze/amd64/vmlinuz"
> > ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
> > extra="console=hvc0"
> > vcpus=1
> > Results in:
> > xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
> > if I try to autoconnnect the console (ie. xl cr -c).
> >
> > If I don't try and auto connect to the console then everything is fine
> > and I can subsequently use "xl cons" to connect.
>
> This makes the issue go away but since this function also free's the ao
> I think there must be a use after free somewhere.
valgrind shows a couple of these sorts of use after free:
==5986== Invalid read of size 4
==5986== at 0x4079323: libxl__egc_cleanup (libxl_event.c:1091)
==5986== by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
==5986== Address 0x42af134 is 28 bytes inside a block of size 1,780 free'd
==5986== at 0x4024F99: free (vg_replace_malloc.c:446)
==5986== by 0x406E5A4: libxl__free_all (libxl_internal.c:74)
==5986== by 0x4077B31: libxl__ao__destroy (libxl_event.c:1472)
==5986== by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986== by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
And a bunch of:
==5986== Invalid read of size 4
==5986== at 0x4079378: libxl__egc_cleanup (libxl_event.c:1094)
==5986== by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
==5986== Address 0x42aef60 is 8 bytes inside a block of size 56 free'd
==5986== at 0x4024F99: free (vg_replace_malloc.c:446)
==5986== by 0x4077B39: libxl__ao__destroy (libxl_event.c:1473)
==5986== by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986== by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
The last of which is:
==5986== Invalid read of size 4
==5986== at 0x40787BC: libxl__ao_complete_check_progress_reports (libxl_event.c:1521)
==5986== by 0x407938A: libxl__egc_cleanup (libxl_event.c:1095)
==5986== by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
==5986== Address 0x42aef80 is 40 bytes inside a block of size 56 free'd
==5986== at 0x4024F99: free (vg_replace_malloc.c:446)
==5986== by 0x4077B39: libxl__ao__destroy (libxl_event.c:1473)
==5986== by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986== by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986== by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986== by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986== by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986== by 0x804D41D: main (xl.c:285)
==5986==
xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Unable to start a domain with no disks
2012-10-12 16:06 ` Ian Campbell
@ 2012-10-12 16:29 ` Ian Jackson
2012-10-15 9:18 ` Ian Campbell
2012-10-18 8:35 ` Ian Campbell
0 siblings, 2 replies; 6+ messages in thread
From: Ian Jackson @ 2012-10-12 16:29 UTC (permalink / raw)
To: Ian Campbell; +Cc: Roger Pau Monne, xen-devel
Ian Campbell writes ("Re: [Xen-devel] Unable to start a domain with no disks"):
> On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> > On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > > Trying to start a very simple domain with no disk from:
> > This makes the issue go away but since this function also free's the ao
> > I think there must be a use after free somewhere.
In this case what happens is that the tty available event gets queued
in the outer egc, which survives until after libxl__ao_inprogress has
returned. You only see the bug with no disks because with disks the
events are only generated later in the event loop inside
libxl__ao_inprogress.
Ian.
Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events
There are two egcs in an ao initiator: the one in the AO_CREATE
function, and the one in libxl__ao_inprogress. If synchronous ao
operation generates progress events and completes immediately, the
progress callbacks end up queued in the outer egc. These callbacks
are currently only called after libxl__ao_inprogress has returned, and
keep the ao alive until they happen. This is not good because the
principle is that a synchronous ao is not supposed to survive beyond
libxl__ao_inprogress's return.
The fix is to ensure that the callbacks queued in the outer egc are
called early enough that they don't preserve the ao. This is
straightforward in the AO_INPROGRESS macro because AO_CREATE's egc is
not used inside that macro other than to destroy it. All we have to
do is destroy it a bit sooner.
This involves unlocking and relocking the ctx since EGC_FREE expects
to be called with the lock released but libxl__ao_inprogress needs it
locked. The is hole in our lock tenure is fine - libxl__ao_inprogress
has such holes already.
It is still possible to use the CTX_LOCK macros for this unlock/lock
because the gc we are using is destroyed only afterwards by
libxl__ao_inprogress.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h Thu Sep 06 21:41:27 2012 +0200
+++ b/tools/libxl/libxl_internal.h Fri Oct 12 17:20:10 2012 +0100
@@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
#define AO_INPROGRESS ({ \
libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
+ CTX_UNLOCK; \
+ EGC_FREE; \
+ CTX_LOCK; \
int ao__rc = libxl__ao_inprogress(ao, \
__FILE__, __LINE__, __func__); \
libxl__ctx_unlock(ao__ctx); /* gc is now invalid */ \
- EGC_FREE; \
(ao__rc); \
})
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Unable to start a domain with no disks
2012-10-12 16:29 ` Ian Jackson
@ 2012-10-15 9:18 ` Ian Campbell
2012-10-18 8:35 ` Ian Campbell
1 sibling, 0 replies; 6+ messages in thread
From: Ian Campbell @ 2012-10-15 9:18 UTC (permalink / raw)
To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel
On Fri, 2012-10-12 at 17:29 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] Unable to start a domain with no disks"):
> > On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> > > On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > > > Trying to start a very simple domain with no disk from:
>
> > > This makes the issue go away but since this function also free's the ao
> > > I think there must be a use after free somewhere.
>
> In this case what happens is that the tty available event gets queued
> in the outer egc, which survives until after libxl__ao_inprogress has
> returned. You only see the bug with no disks because with disks the
> events are only generated later in the event loop inside
> libxl__ao_inprogress.
>
> Ian.
>
>
> Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events
>
> There are two egcs in an ao initiator: the one in the AO_CREATE
> function, and the one in libxl__ao_inprogress. If synchronous ao
> operation generates progress events and completes immediately, the
> progress callbacks end up queued in the outer egc. These callbacks
> are currently only called after libxl__ao_inprogress has returned, and
> keep the ao alive until they happen. This is not good because the
> principle is that a synchronous ao is not supposed to survive beyond
> libxl__ao_inprogress's return.
>
> The fix is to ensure that the callbacks queued in the outer egc are
> called early enough that they don't preserve the ao. This is
> straightforward in the AO_INPROGRESS macro because AO_CREATE's egc is
> not used inside that macro other than to destroy it. All we have to
> do is destroy it a bit sooner.
>
> This involves unlocking and relocking the ctx since EGC_FREE expects
> to be called with the lock released but libxl__ao_inprogress needs it
> locked. The is hole in our lock tenure is fine - libxl__ao_inprogress
> has such holes already.
>
> It is still possible to use the CTX_LOCK macros for this unlock/lock
> because the gc we are using is destroyed only afterwards by
> libxl__ao_inprogress.
>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
> --- a/tools/libxl/libxl_internal.h Thu Sep 06 21:41:27 2012 +0200
> +++ b/tools/libxl/libxl_internal.h Fri Oct 12 17:20:10 2012 +0100
> @@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
>
> #define AO_INPROGRESS ({ \
> libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
> + CTX_UNLOCK; \
> + EGC_FREE; \
> + CTX_LOCK; \
> int ao__rc = libxl__ao_inprogress(ao, \
> __FILE__, __LINE__, __func__); \
> libxl__ctx_unlock(ao__ctx); /* gc is now invalid */ \
> - EGC_FREE; \
> (ao__rc); \
> })
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Unable to start a domain with no disks
2012-10-12 16:29 ` Ian Jackson
2012-10-15 9:18 ` Ian Campbell
@ 2012-10-18 8:35 ` Ian Campbell
1 sibling, 0 replies; 6+ messages in thread
From: Ian Campbell @ 2012-10-18 8:35 UTC (permalink / raw)
To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel
On Fri, 2012-10-12 at 17:29 +0100, Ian Jackson wrote:
> Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events
[...]
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <Ian.campbell@citrix.com>
and applied, thanks.
>
> diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
> --- a/tools/libxl/libxl_internal.h Thu Sep 06 21:41:27 2012 +0200
> +++ b/tools/libxl/libxl_internal.h Fri Oct 12 17:20:10 2012 +0100
> @@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
>
> #define AO_INPROGRESS ({ \
> libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
> + CTX_UNLOCK; \
> + EGC_FREE; \
> + CTX_LOCK; \
> int ao__rc = libxl__ao_inprogress(ao, \
> __FILE__, __LINE__, __func__); \
> libxl__ctx_unlock(ao__ctx); /* gc is now invalid */ \
> - EGC_FREE; \
> (ao__rc); \
> })
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-10-18 8:35 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-12 15:45 Unable to start a domain with no disks Ian Campbell
2012-10-12 15:50 ` Ian Campbell
2012-10-12 16:06 ` Ian Campbell
2012-10-12 16:29 ` Ian Jackson
2012-10-15 9:18 ` Ian Campbell
2012-10-18 8:35 ` Ian Campbell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).