xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* Unable to start a domain with no disks
@ 2012-10-12 15:45 Ian Campbell
  2012-10-12 15:50 ` Ian Campbell
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 15:45 UTC (permalink / raw)
  To: Ian Jackson, Roger Pau Monne; +Cc: xen-devel

Trying to start a very simple domain with no disk from:
        name="x"
        memory=128
        kernel="/scratch/debian/squeeze/amd64/vmlinuz"
        ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
        extra="console=hvc0"
        vcpus=1
Results in:
        xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
if I try to autoconnnect the console (ie. xl cr -c).

If I don't try and auto connect to the console then everything is fine
and I can subsequently use "xl cons" to connect.

Backtrace is:
        xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
        
        Program received signal SIGABRT, Aborted.
        0xff7fe424 in __kernel_vsyscall ()
        (gdb) bt
        #0  0xff7fe424 in __kernel_vsyscall ()
        #1  0xb7e18781 in raise () from /lib/i686/cmov/libc.so.6
        #2  0xb7e1bbb2 in abort () from /lib/i686/cmov/libc.so.6
        #3  0xb7e118e8 in __assert_fail () from /lib/i686/cmov/libc.so.6
        #4  0xb7faf7c1 in libxl__ao_complete_check_progress_reports (egc=0xbffff42c, ao=0x806a870) at libxl_event.c:1522
        #5  0xb7fb035b in egc_run_callbacks (egc=0xbffff42c) at libxl_event.c:1095
        #6  libxl__egc_cleanup (egc=0xbffff42c) at libxl_event.c:1116
        #7  0xb7f95059 in do_domain_create (ctx=<value optimized out>, d_config=<value optimized out>, domid=0xbffff74c, restore_fd=-1, ao_how=0x0, aop_console_how=0xbffff54c) at libxl_create.c:1197
        #8  0xb7f9510f in libxl_domain_create_new (ctx=0x806a030, d_config=0xbffff59c, domid=0xbffff74c, ao_how=0x0, aop_console_how=0xbffff54c) at libxl_create.c:1218
        #9  0x0805c5a7 in create_domain (dom_info=<value optimized out>) at xl_cmdimpl.c:1932
        #10 0x0805ded2 in main_create (argc=3, argv=0xbffffe5a) at xl_cmdimpl.c:3973
        #11 0x0804d41e in main (argc=4, argv=0xbffffd24) at xl.c:285

Full log follows below. Any ideas?

Ian.



        # xl -vvv cr -c x
        Parsing config from x
        libxl: debug: libxl_create.c:1184:do_domain_create: ao 0x806a870: create: how=(nil) callback=(nil) poller=0x806a8b0
        libxl: debug: libxl_create.c:677:initiate_domain_create: running bootloader
        libxl: debug: libxl_bootloader.c:327:libxl__bootloader_run: no bootloader configured, using user supplied kernel
        libxl: debug: libxl_event.c:561:libxl__ev_xswatch_deregister: watch w=0x806aa84: deregister unregistered
        libxl: debug: libxl_numa.c:435:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=4, nr_vcpus=5, free_memkb=3467
        libxl: detail: libxl_dom.c:192:numa_place_domain: NUMA placement candidate with 1 nodes, 4 cpus and 3467 KB free selected
        domainbuilder: detail: xc_dom_allocate: cmdline="console=hvc0", features="(null)"
        libxl: debug: libxl_dom.c:380:libxl__build_pv: pv kernel mapped 0 path /scratch/debian/squeeze/amd64/vmlinuz
        
        domainbuilder: detail: xc_dom_kernel_file: filename="/scratch/debian/squeeze/amd64/vmlinuz"
        domainbuilder: detail: xc_dom_malloc_filemap    : 2367 kB
        domainbuilder: detail: xc_dom_ramdisk_file: filename="/scratch/debian/squeeze/amd64/initrd.gz"
        domainbuilder: detail: xc_dom_malloc_filemap    : 19090 kB
        domainbuilder: detail: xc_dom_boot_xen_init: ver 4.3, caps xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
        domainbuilder: detail: xc_dom_parse_image: called
        domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ... 
        domainbuilder: detail: loader probe failed
        domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ... 
        domainbuilder: detail: xc_dom_malloc            : 11914 kB
        domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x2485c8 -> 0xba2b00
        domainbuilder: detail: loader probe OK
        xc: detail: elf_parse_binary: phdr: paddr=0x1000000 memsz=0x42d000
        xc: detail: elf_parse_binary: phdr: paddr=0x142d000 memsz=0xd6150
        xc: detail: elf_parse_binary: phdr: paddr=0x1504000 memsz=0x888
        xc: detail: elf_parse_binary: phdr: paddr=0x1505000 memsz=0x160d8
        xc: detail: elf_parse_binary: phdr: paddr=0x151c000 memsz=0x1de000
        xc: detail: elf_parse_binary: memory: 0x1000000 -> 0x16fa000
        xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
        xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
        xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
        xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
        xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff8151c200
        xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81009000
        xc: detail: elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
        xc: detail: elf_xen_parse_note: PAE_MODE = "yes"
        xc: detail: elf_xen_parse_note: LOADER = "generic"
        xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
        xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
        xc: detail: elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
        xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
        xc: detail: elf_xen_addr_calc_check: addresses:
        xc: detail:     virt_base        = 0xffffffff80000000
        xc: detail:     elf_paddr_offset = 0x0
        xc: detail:     virt_offset      = 0xffffffff80000000
        xc: detail:     virt_kstart      = 0xffffffff81000000
        xc: detail:     virt_kend        = 0xffffffff816fa000
        xc: detail:     virt_entry       = 0xffffffff8151c200
        xc: detail:     p2m_base         = 0xffffffffffffffff
        domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff81000000 -> 0xffffffff816fa000
        domainbuilder: detail: xc_dom_mem_init: mem 128 MB, pages 0x8000 pages, 4k each
        domainbuilder: detail: xc_dom_mem_init: 0x8000 pages
        domainbuilder: detail: xc_dom_boot_mem_init: called
        domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
        domainbuilder: detail: xc_dom_malloc            : 128 kB
        domainbuilder: detail: xc_dom_build_image: called
        domainbuilder: detail: xc_dom_alloc_segment:   kernel       : 0xffffffff81000000 -> 0xffffffff816fa000  (pfn 0x1000 + 0x6fa pages)
        domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0x6fa at 0xb4ce6000
        xc: detail: elf_load_binary: phdr 0 at 0x0xb4ce6000 -> 0x0xb5113000
        xc: detail: elf_load_binary: phdr 1 at 0x0xb5113000 -> 0x0xb51e9150
        xc: detail: elf_load_binary: phdr 2 at 0x0xb51ea000 -> 0x0xb51ea888
        xc: detail: elf_load_binary: phdr 3 at 0x0xb51eb000 -> 0x0xb52010d8
        xc: detail: elf_load_binary: phdr 4 at 0x0xb5202000 -> 0x0xb5288000
        domainbuilder: detail: xc_dom_alloc_segment:   ramdisk      : 0xffffffff816fa000 -> 0xffffffff8481c000  (pfn 0x16fa + 0x3122 pages)
        domainbuilder: detail: xc_dom_malloc            : 294 kB
        domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x16fa+0x3122 at 0xb1b7a000
        domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x12a4954 -> 0x3121e10
        domainbuilder: detail: xc_dom_alloc_segment:   phys2mach    : 0xffffffff8481c000 -> 0xffffffff8485c000  (pfn 0x481c + 0x40 pages)
        domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x481c+0x40 at 0xb1b3a000
        domainbuilder: detail: xc_dom_alloc_page   :   start info   : 0xffffffff8485c000 (pfn 0x485c)
        domainbuilder: detail: xc_dom_alloc_page   :   xenstore     : 0xffffffff8485d000 (pfn 0x485d)
        domainbuilder: detail: xc_dom_alloc_page   :   console      : 0xffffffff8485e000 (pfn 0x485e)
        domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
        domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
        domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
        domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff84bfffff, 38 table(s)
        domainbuilder: detail: xc_dom_alloc_segment:   page tables  : 0xffffffff8485f000 -> 0xffffffff84888000  (pfn 0x485f + 0x29 pages)
        domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x485f+0x29 at 0xb1b11000
        domainbuilder: detail: xc_dom_alloc_page   :   boot stack   : 0xffffffff84888000 (pfn 0x4888)
        domainbuilder: detail: xc_dom_build_image  : virt_alloc_end : 0xffffffff84889000
        domainbuilder: detail: xc_dom_build_image  : virt_pgtab_end : 0xffffffff84c00000
        domainbuilder: detail: xc_dom_boot_image: called
        domainbuilder: detail: arch_setup_bootearly: doing nothing
        domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
        domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
        domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
        domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
        domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
        domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x8000
        domainbuilder: detail: clear_page: pfn 0x485e, mfn 0xdc43d
        domainbuilder: detail: clear_page: pfn 0x485d, mfn 0xdc43e
        domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x485c+0x1 at 0xb1b0e000
        domainbuilder: detail: start_info_x86_64: called
        domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff81009000 pfn=0x1009
        domainbuilder: detail: domain builder memory footprint
        domainbuilder: detail:    allocated
        domainbuilder: detail:       malloc             : 12383 kB
        domainbuilder: detail:       anon mmap          : 0 bytes
        domainbuilder: detail:    mapped
        domainbuilder: detail:       file mmap          : 21457 kB
        domainbuilder: detail:       domU mmap          : 56 MB
        domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0xdccc8
        domainbuilder: detail: shared_info_x86_64: called
        domainbuilder: detail: vcpu_x86_64: called
        domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x485f mfn 0xdc43c
        domainbuilder: detail: launch_vm: called, ctxt=0xbfd9c2ec
        domainbuilder: detail: xc_dom_release: called
        libxl: debug: libxl_event.c:1677:libxl__ao_progress_report: ao 0x806a870: progress report: callback queued aop=0x806c010
        libxl: debug: libxl_event.c:1497:libxl__ao_complete: ao 0x806a870: complete, rc=0
        libxl: debug: libxl_create.c:1197:do_domain_create: ao 0x806a870: inprogress: poller=0x806a8b0, flags=ic
        libxl: debug: libxl_event.c:1469:libxl__ao__destroy: ao 0x806a870: destroy
        libxl: debug: libxl_event.c:1090:egc_run_callbacks: ao 0x806a870: progress report: callback aop=0x806c010
        xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
        Aborted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Unable to start a domain with no disks
  2012-10-12 15:45 Unable to start a domain with no disks Ian Campbell
@ 2012-10-12 15:50 ` Ian Campbell
  2012-10-12 16:06   ` Ian Campbell
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 15:50 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Roger Pau Monne

On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> Trying to start a very simple domain with no disk from:
>         name="x"
>         memory=128
>         kernel="/scratch/debian/squeeze/amd64/vmlinuz"
>         ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
>         extra="console=hvc0"
>         vcpus=1
> Results in:
>         xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
> if I try to autoconnnect the console (ie. xl cr -c).
> 
> If I don't try and auto connect to the console then everything is fine
> and I can subsequently use "xl cons" to connect.

This makes the issue go away but since this function also free's the ao
I think there must be a use after free somewhere.

diff -r 8aadb3204cfa tools/libxl/libxl_event.c
--- a/tools/libxl/libxl_event.c	Thu Sep 06 21:41:27 2012 +0200
+++ b/tools/libxl/libxl_event.c	Fri Oct 12 16:49:23 2012 +0100
@@ -1468,6 +1468,7 @@ void libxl__ao__destroy(libxl_ctx *ctx, 
     if (!ao) return;
     LOG(DEBUG,"ao %p: destroy",ao);
     if (ao->poller) libxl__poller_put(ctx, ao->poller);
+    ao->poller = NULL;
     ao->magic = LIBXL__AO_MAGIC_DESTROYED;
     libxl__free_all(&ao->gc);
     free(ao);

Ian.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Unable to start a domain with no disks
  2012-10-12 15:50 ` Ian Campbell
@ 2012-10-12 16:06   ` Ian Campbell
  2012-10-12 16:29     ` Ian Jackson
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Campbell @ 2012-10-12 16:06 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel

On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > Trying to start a very simple domain with no disk from:
> >         name="x"
> >         memory=128
> >         kernel="/scratch/debian/squeeze/amd64/vmlinuz"
> >         ramdisk="/scratch/debian/squeeze/amd64/initrd.gz"
> >         extra="console=hvc0"
> >         vcpus=1
> > Results in:
> >         xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.
> > if I try to autoconnnect the console (ie. xl cr -c).
> > 
> > If I don't try and auto connect to the console then everything is fine
> > and I can subsequently use "xl cons" to connect.
> 
> This makes the issue go away but since this function also free's the ao
> I think there must be a use after free somewhere.

valgrind shows a couple of these sorts of use after free:
==5986== Invalid read of size 4
==5986==    at 0x4079323: libxl__egc_cleanup (libxl_event.c:1091)
==5986==    by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)
==5986==  Address 0x42af134 is 28 bytes inside a block of size 1,780 free'd
==5986==    at 0x4024F99: free (vg_replace_malloc.c:446)
==5986==    by 0x406E5A4: libxl__free_all (libxl_internal.c:74)
==5986==    by 0x4077B31: libxl__ao__destroy (libxl_event.c:1472)
==5986==    by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986==    by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)

And a bunch of:
==5986== Invalid read of size 4
==5986==    at 0x4079378: libxl__egc_cleanup (libxl_event.c:1094)
==5986==    by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)
==5986==  Address 0x42aef60 is 8 bytes inside a block of size 56 free'd
==5986==    at 0x4024F99: free (vg_replace_malloc.c:446)
==5986==    by 0x4077B39: libxl__ao__destroy (libxl_event.c:1473)
==5986==    by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986==    by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)

The last of which is:
==5986== Invalid read of size 4
==5986==    at 0x40787BC: libxl__ao_complete_check_progress_reports (libxl_event.c:1521)
==5986==    by 0x407938A: libxl__egc_cleanup (libxl_event.c:1095)
==5986==    by 0x405E088: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)
==5986==  Address 0x42aef80 is 40 bytes inside a block of size 56 free'd
==5986==    at 0x4024F99: free (vg_replace_malloc.c:446)
==5986==    by 0x4077B39: libxl__ao__destroy (libxl_event.c:1473)
==5986==    by 0x4079881: libxl__ao_inprogress (libxl_event.c:1638)
==5986==    by 0x405E042: do_domain_create (libxl_create.c:1204)
==5986==    by 0x405E13E: libxl_domain_create_new (libxl_create.c:1225)
==5986==    by 0x805C5A6: create_domain (xl_cmdimpl.c:1932)
==5986==    by 0x805DED1: main_create (xl_cmdimpl.c:3973)
==5986==    by 0x804D41D: main (xl.c:285)
==5986== 
xl: libxl_event.c:1522: libxl__ao_complete_check_progress_reports: Assertion `ao->in_initiator' failed.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Unable to start a domain with no disks
  2012-10-12 16:06   ` Ian Campbell
@ 2012-10-12 16:29     ` Ian Jackson
  2012-10-15  9:18       ` Ian Campbell
  2012-10-18  8:35       ` Ian Campbell
  0 siblings, 2 replies; 6+ messages in thread
From: Ian Jackson @ 2012-10-12 16:29 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Roger Pau Monne, xen-devel

Ian Campbell writes ("Re: [Xen-devel] Unable to start a domain with no disks"):
> On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> > On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > > Trying to start a very simple domain with no disk from:

> > This makes the issue go away but since this function also free's the ao
> > I think there must be a use after free somewhere.

In this case what happens is that the tty available event gets queued
in the outer egc, which survives until after libxl__ao_inprogress has
returned.  You only see the bug with no disks because with disks the
events are only generated later in the event loop inside
libxl__ao_inprogress.

Ian.


Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events

There are two egcs in an ao initiator: the one in the AO_CREATE
function, and the one in libxl__ao_inprogress.  If synchronous ao
operation generates progress events and completes immediately, the
progress callbacks end up queued in the outer egc.  These callbacks
are currently only called after libxl__ao_inprogress has returned, and
keep the ao alive until they happen.  This is not good because the
principle is that a synchronous ao is not supposed to survive beyond
libxl__ao_inprogress's return.

The fix is to ensure that the callbacks queued in the outer egc are
called early enough that they don't preserve the ao.  This is
straightforward in the AO_INPROGRESS macro because AO_CREATE's egc is
not used inside that macro other than to destroy it.  All we have to
do is destroy it a bit sooner.

This involves unlocking and relocking the ctx since EGC_FREE expects
to be called with the lock released but libxl__ao_inprogress needs it
locked.  The is hole in our lock tenure is fine - libxl__ao_inprogress
has such holes already.

It is still possible to use the CTX_LOCK macros for this unlock/lock
because the gc we are using is destroyed only afterwards by
libxl__ao_inprogress.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h	Thu Sep 06 21:41:27 2012 +0200
+++ b/tools/libxl/libxl_internal.h	Fri Oct 12 17:20:10 2012 +0100
@@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
 
 #define AO_INPROGRESS ({                                        \
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
+        CTX_UNLOCK;                                             \
+        EGC_FREE;                                               \
+        CTX_LOCK;                                               \
         int ao__rc = libxl__ao_inprogress(ao,                   \
                                __FILE__, __LINE__, __func__);   \
         libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
-        EGC_FREE;                                               \
         (ao__rc);                                               \
    })

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Unable to start a domain with no disks
  2012-10-12 16:29     ` Ian Jackson
@ 2012-10-15  9:18       ` Ian Campbell
  2012-10-18  8:35       ` Ian Campbell
  1 sibling, 0 replies; 6+ messages in thread
From: Ian Campbell @ 2012-10-15  9:18 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel

On Fri, 2012-10-12 at 17:29 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] Unable to start a domain with no disks"):
> > On Fri, 2012-10-12 at 16:50 +0100, Ian Campbell wrote:
> > > On Fri, 2012-10-12 at 16:45 +0100, Ian Campbell wrote:
> > > > Trying to start a very simple domain with no disk from:
> 
> > > This makes the issue go away but since this function also free's the ao
> > > I think there must be a use after free somewhere.
> 
> In this case what happens is that the tty available event gets queued
> in the outer egc, which survives until after libxl__ao_inprogress has
> returned.  You only see the bug with no disks because with disks the
> events are only generated later in the event loop inside
> libxl__ao_inprogress.
> 
> Ian.
> 
> 
> Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events
> 
> There are two egcs in an ao initiator: the one in the AO_CREATE
> function, and the one in libxl__ao_inprogress.  If synchronous ao
> operation generates progress events and completes immediately, the
> progress callbacks end up queued in the outer egc.  These callbacks
> are currently only called after libxl__ao_inprogress has returned, and
> keep the ao alive until they happen.  This is not good because the
> principle is that a synchronous ao is not supposed to survive beyond
> libxl__ao_inprogress's return.
> 
> The fix is to ensure that the callbacks queued in the outer egc are
> called early enough that they don't preserve the ao.  This is
> straightforward in the AO_INPROGRESS macro because AO_CREATE's egc is
> not used inside that macro other than to destroy it.  All we have to
> do is destroy it a bit sooner.
> 
> This involves unlocking and relocking the ctx since EGC_FREE expects
> to be called with the lock released but libxl__ao_inprogress needs it
> locked.  The is hole in our lock tenure is fine - libxl__ao_inprogress
> has such holes already.
> 
> It is still possible to use the CTX_LOCK macros for this unlock/lock
> because the gc we are using is destroyed only afterwards by
> libxl__ao_inprogress.
> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> 
> diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
> --- a/tools/libxl/libxl_internal.h	Thu Sep 06 21:41:27 2012 +0200
> +++ b/tools/libxl/libxl_internal.h	Fri Oct 12 17:20:10 2012 +0100
> @@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
>  
>  #define AO_INPROGRESS ({                                        \
>          libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
> +        CTX_UNLOCK;                                             \
> +        EGC_FREE;                                               \
> +        CTX_LOCK;                                               \
>          int ao__rc = libxl__ao_inprogress(ao,                   \
>                                 __FILE__, __LINE__, __func__);   \
>          libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
> -        EGC_FREE;                                               \
>          (ao__rc);                                               \
>     })
>  

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Unable to start a domain with no disks
  2012-10-12 16:29     ` Ian Jackson
  2012-10-15  9:18       ` Ian Campbell
@ 2012-10-18  8:35       ` Ian Campbell
  1 sibling, 0 replies; 6+ messages in thread
From: Ian Campbell @ 2012-10-18  8:35 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Roger Pau Monne, xen-devel

On Fri, 2012-10-12 at 17:29 +0100, Ian Jackson wrote:

> Subject: [PATCH] libxl: ao: cope with fast ao completion with progess events
[...]
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Acked-by: Ian Campbell <Ian.campbell@citrix.com>

and applied, thanks.

> 
> diff -r 8aadb3204cfa tools/libxl/libxl_internal.h
> --- a/tools/libxl/libxl_internal.h	Thu Sep 06 21:41:27 2012 +0200
> +++ b/tools/libxl/libxl_internal.h	Fri Oct 12 17:20:10 2012 +0100
> @@ -1709,10 +1709,12 @@ _hidden void libxl__egc_cleanup(libxl__e
>  
>  #define AO_INPROGRESS ({                                        \
>          libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
> +        CTX_UNLOCK;                                             \
> +        EGC_FREE;                                               \
> +        CTX_LOCK;                                               \
>          int ao__rc = libxl__ao_inprogress(ao,                   \
>                                 __FILE__, __LINE__, __func__);   \
>          libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
> -        EGC_FREE;                                               \
>          (ao__rc);                                               \
>     })
>  

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-10-18  8:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-12 15:45 Unable to start a domain with no disks Ian Campbell
2012-10-12 15:50 ` Ian Campbell
2012-10-12 16:06   ` Ian Campbell
2012-10-12 16:29     ` Ian Jackson
2012-10-15  9:18       ` Ian Campbell
2012-10-18  8:35       ` Ian Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).