* Re: [libvirt] [vbox-dev] Assert with libvirt + xen hvm
From: Cole Robinson @ 2015-11-11 1:33 UTC
To: Klaus Espenlaub, Michal Privoznik, CloudPatch Staff
Cc: libvir-list, vbox-dev, Xen Devel
Reviving old thread, this came up again via a Fedora bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1278847
https://retrace.fedoraproject.org/faf/reports/597209/
I took a cursory look at libxl's sigchld handling... it's intense to say the
least, but there are some driver options that tweak the handling. Maybe there's
a simple fix.
jfehlig, any suggestions here? The summary is that virtualbox loads a stub
SIGCHLD handler of its own, which causes libxl to assert in
sigchld_installhandler_core.
Thanks,
Cole
On 01/26/2015 08:17 AM, Klaus Espenlaub wrote:
> Hi all,
>
> On 26.01.2015 10:52, Michal Privoznik wrote:
>> [CC'ing vbox-dev list]
>>
>> On 23.01.2015 21:42, CloudPatch Staff wrote:
>>> After some debugging we found what was causing the assert. In our
>>> configuration we have
>>> two kernels to boot, one is a pv-linux for Xen dom0 and another just a
>>> normal linux kernel.
>>> We have libvirt built with both Xen and vbox support. When running with
>>> Xen, the libxl
>>> driver is used, so it ends up calling libxenlight, which doesn't want any
>>> SIGCHLD handler set.
>>> Normally this is the case but, since we have vbox support in libvirt the
>>> vbox driver loads
>>> some of the vbox libs and one of them sets a SIGCHLD handler. When
>>> libxenlight checks
>>> if there is any handler for SIGCHLD it finds that one and fails.
>>>
>>> Here is the backtrace from when vbox is setting the handler:
>>>
>>> (gdb) bt
>>> #0 0x00007ffff71d8250 in sigaction () from /lib64/libpthread.so.0
>>> #1 0x00007fffedbf3716 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
>>> #2 0x00007fffedbf3960 in ?? () from /usr/lib64/virtualbox/VBoxRT.so
>>> #3 0x00007fffedeea485 in VBoxGetCAPIFunctions () from
>>> /usr/lib64/virtualbox/VBoxXPCOMC.so
> [...]
>
> That's
> https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Main/cbinding/VBoxCAPI.cpp#L718
> - initializing the runtime our code uses which abstracts platform specifics.
>
> The rationale why the runtime installs a dummy signal handler is that if
> SIGCHLD is set to be ignored (that's the default) then POSIX compliant
> waitpid() won't work. It's mentioned in the wait(2) man page on my linux
> system, see the notes about the ECHILD error code.
>
> It's really excessive that some other code prevents using the full
> functionality. VirtualBox checks if the SIGCHLD handler is already set. If the
> signal isn't ignored it happily continues without touching SIGCHLD. So if
> libvirt wants to establish a sane default it would also work.
>
> To state the obvious: running an API client with crippled waitpid() can lead
> to extremely strange behavior. XPCOM creates worker threads on demand and
> expects to be able to wait on their termination.
>
>>> After building libvirt without vbox support the assert disappeared.
>>
>> So what you are saying is that vbox interferes with libxenlight? While I
>> see why the vbox library wants a SIGCHLD handler, maybe they became fault
>> tolerant meanwhile. Or?
>
> Hope the explanations help resolving the conflict. It's not that we want to
> sabotage anyone else.
>
> Klaus
>>
>>>
>>>
>>> On Fri, Jan 23, 2015 at 5:14 AM, Michal Privoznik <mprivozn@redhat.com
>>> <mailto:mprivozn@redhat.com>> wrote:
>>>
>>> On 22.01.2015 17:49, CloudPatch Staff wrote:
>>> > We're hitting an assert whenever we try to create an HVM instance under
>>> > Xen via libvirtd.
>>> >
>>> > System is running on Gentoo, package information as follows:
>>> >
>>> > app-emulation/xen-4.5.0 USE="api debug flask hvm pam pygrub python qemu
>>> > screen"
>>> > app-emulation/xen-tools-4.5.0 USE="api debug flask hvm pam pygrub
>>> python
>>> > qemu screen"
>>> > app-emulation/libvirt-1.2.11-r2:0/1.2.11 USE="caps libvirtd lvm macvtap
>>> > nls qemu udev vepa virtualbox xen"
>>> >
>>> > The following commands are run in parallel:
>>> >
>>> > vmmachine ~ # libvirtd --listen
>>> > 2015-01-22 16:33:13.596+0000: 2620: info : libvirt version: 1.2.11
>>> > 2015-01-22 16:33:13.596+0000: 2620: error : udevGetDMIData:1607 :
>>> Failed
>>> > to get udev device for syspath '/sys/devices/virtual/dmi/id' or
>>> > '/sys/class/dmi/id'
>>> > libvirtd: libxl_fork.c:350: sigchld_installhandler_core: Assertion
>>> > `((void)"application must negotiate with libxl about SIGCHLD",
>>> > !(sigchld_saved_action.sa_flags & 4) &&
>>> > (sigchld_saved_action.__sigaction_handler.sa_handler ==
>>> > ((__sighandler_t) 0) ||
>>> > sigchld_saved_action.__sigaction_handler.sa_handler ==
>>> ((__sighandler_t)
>>> > 1)))' failed.
>>> > Aborted
>>>
>>> Interesting. Can you attach a debugger so we can see stacktrace?
>>>
>>> Michal
>>>
>>>
>>
>> Michal
>>
>
* Re: [libvirt] [Xen-devel] [vbox-dev] Assert with libvirt + xen hvm
From: Ian Jackson @ 2015-11-11 10:23 UTC
To: Cole Robinson
Cc: vbox-dev, Xen Devel, libvir-list, Michal Privoznik,
CloudPatch Staff, Klaus Espenlaub
Cole Robinson writes ("Re: [Xen-devel] [libvirt] [vbox-dev] Assert with libvirt + xen hvm"):
> I took a cursory look at libxl's sigchld handling... it's intense to say the
> least, but there's some driver options that tweak the handling. Maybe there's
> a simple fix.
Sadly the way that there's a single SIGCHLD handler in the Unix API is
extremely awkward in multithreaded programs. The complicated code in
libxl is trying to help cope with that.
> On 01/26/2015 08:17 AM, Klaus Espenlaub wrote:
> > The rationale why the runtime installs a dummy signal handler is that if
> > SIGCHLD is set to be ignored (that's the default) then POSIX compliant
> > waitpid() won't work. It's mentioned in the wait(2) man page on my linux
> > system, see the notes about the ECHILD error code.
I think you are wrong about this.
SIGCHLD (like all signals) is set to SIG_DFL by default. But it is
only SIG_IGN that causes automatic reaping of children. I normally
use the online Open Group spec to check this kind of thing - it's more
authoritative than manpages.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03
SIG_IGN
...
If the action for the SIGCHLD signal is set to SIG_IGN, child
processes of the calling processes shall not be transformed into
zombie processes when they terminate. If the calling process
subsequently waits for its children, and the process has no
unwaited-for children that were transformed into zombie processes,
it shall block until all of its children terminate, and wait(),
waitid(), and waitpid() shall fail and set errno to
[ECHILD].
I don't know exactly what virtualbox does with children but if the
only reason it is setting a SIGCHLD handler is to avoid its children
being automatically reaped, leaving it set to SIG_DFL will work just
fine.
If virtualbox always reaps its children synchronously (ie, it does not
use SIGCHLD to get notified of child death) then setting the libxl
signal mode to
libxl_sigchld_owner_libxl_always_selective_reap
in libvirt should work. This is what libvirt (at least in master)
does.
There would then be no need for virtualbox to do anything to SIGCHLD.
Although, it might be worth checking that it isn't SIG_IGN and
screaming somewhere if it is.
Incidentally it seems to me possible that the reason why virtualbox
explicitly sets a SIGCHLD handler may be that at some point in the past
you had encountered a bug with someone in the same process setting
SIGCHLD to SIG_IGN.
Ian.