From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4FAAA9F7.7020702@us.ibm.com>
Date: Wed, 09 May 2012 12:31:35 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <4FA97596.4000807@siemens.com> <4FAA42EB.2080407@redhat.com> <4FAA5721.9060201@siemens.com> <4FAAA6AA.2040400@codemonkey.ws> <4FAAA893.9050506@siemens.com>
In-Reply-To: <4FAAA893.9050506@siemens.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] coroutine-ucontext broken for x86-32
To: Jan Kiszka
Cc: Kevin Wolf, Peter Maydell, Michael Tokarev, qemu-devel, Anthony Liguori

On 05/09/2012 12:25 PM,
Jan Kiszka wrote:
> On 2012-05-09 14:17, Anthony Liguori wrote:
>> On 05/09/2012 06:38 AM, Jan Kiszka wrote:
>>> On 2012-05-09 08:15, Peter Maydell wrote:
>>>> On 9 May 2012 11:11, Kevin Wolf wrote:
>>>>> On 08.05.2012 21:35, Jan Kiszka wrote:
>>>>>> I hunted down a fairly subtle corruption of the VCPU thread signal mask
>>>>>> in KVM mode when using the ucontext version of coroutines:
>>>>>>
>>>>>> coroutine_new calls getcontext, makecontext, swapcontext. Those
>>>>>> functions also get/set the signal mask of the caller. Unfortunately,
>>>>>> they only use the sigprocmask syscall on i386, not the rt_sigprocmask
>>>>>> version. So they do not properly save/restore the blocked RT signals,
>>>>>> namely our SIG_IPI - it becomes unblocked this way.
>>>>>
>>>>> If other coroutine backends work (sigaltstack?), we could try to detect
>>>>> the situation in configure and set the right default. Not sure what the
>>>>> condition is, glibc + i386?
>>>>
>>>> I don't think you can do a compile-time test for this short of
>>>> just disabling use of the ucontext code on all i386/Linux platforms.
>>>>
>>>> I think it's becoming increasingly obvious that the setcontext/getcontext
>>>> code path is not very well used and prone to nasty libc bugs. Trying
>>>> to implement coroutines in C is just a really bad idea and I think
>>>> we should be trying to reduce our use of them if we possibly can,
>>>> presumably by switching to actually using threads where we really
>>>> need the parallelism.
>>>
>>> I tend to agree.
>>>
>>> FWIW, sigaltstack works around the issue here, but I'm still looking a
>>> bit skeptical at its implementation.
>>
>> Is there any downside to using SIGUSR1?
>
> You mean for SIG_IPI? I don't think so. But the point is that the, well,
> limitation of ucontext will continue to break RT signals,

Yes, but we currently don't use RT signals, right?

So we could switch to SIGUSR1, fix the problem in glibc, and call it a day, no?

Regards,

Anthony Liguori

> and this in a
> very nasty way as only a specific setup is affected. I can't imagine we
> want this.
>
> Jan
>