From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:54301) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Rmuhd-0002Tk-U8 for qemu-devel@nongnu.org; Mon, 16 Jan 2012 17:06:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Rmuhc-00021g-5T for qemu-devel@nongnu.org; Mon, 16 Jan 2012 17:06:57 -0500 Received: from e32.co.us.ibm.com ([32.97.110.150]:54776) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Rmuhb-00021P-RV for qemu-devel@nongnu.org; Mon, 16 Jan 2012 17:06:56 -0500 Received: from /spool/local by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 16 Jan 2012 15:06:54 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 7C8883E40053 for ; Mon, 16 Jan 2012 15:06:51 -0700 (MST) Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q0GM6kbg075844 for ; Mon, 16 Jan 2012 15:06:48 -0700 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q0GM6kml021233 for ; Mon, 16 Jan 2012 15:06:46 -0700 Message-ID: <4F149F75.1030405@linux.vnet.ibm.com> Date: Mon, 16 Jan 2012 16:06:45 -0600 From: Michael Roth MIME-Version: 1.0 References: <1326482122-12619-1-git-send-email-lcapitulino@redhat.com> <1326482122-12619-3-git-send-email-lcapitulino@redhat.com> <4F10A694.8030900@redhat.com> <20120116150853.40626823@doriath> <20120116171339.GA2297@redhat.com> <20120116151837.4d90a96c@doriath> <20120116152352.3f97d12e@doriath> <4F148240.6070900@linux.vnet.ibm.com> <20120116203552.GB12789@redhat.com> In-Reply-To: <20120116203552.GB12789@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 2/2] qemu-ga: Add the guest-suspend command List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Daniel P. Berrange" Cc: jcody@redhat.com, Eric Blake , qemu-devel@nongnu.org, Luiz Capitulino On 01/16/2012 02:35 PM, Daniel P. Berrange wrote: > On Mon, Jan 16, 2012 at 02:02:08PM -0600, Michael Roth wrote: >> On 01/16/2012 11:23 AM, Luiz Capitulino wrote: >>> On Mon, 16 Jan 2012 15:18:37 -0200 >>> Luiz Capitulino wrote: >>> >>>> On Mon, 16 Jan 2012 17:13:39 +0000 >>>> "Daniel P. Berrange" wrote: >>>> >>>>> On Mon, Jan 16, 2012 at 03:08:53PM -0200, Luiz Capitulino wrote: >>>>>> On Fri, 13 Jan 2012 14:48:04 -0700 >>>>>> Eric Blake wrote: >>>>>> >>>>>>>> + >>>>>>>> + pid = fork(); >>>>>>>> + if (!pid) { >>>>>>>> + char buf[32]; >>>>>>>> + FILE *sysfile; >>>>>>>> + const char *arg; >>>>>>>> + const char *pmutils_bin = "pm-is-supported"; >>>>>>>> + >>>>>>>> + if (strcmp(mode, "hibernate") == 0) { >>>>>>> >>>>>>> Strangely enough, POSIX doesn't include strcmp() in its list of >>>>>>> async-signal-safe functions (which is what you should be restricting >>>>>>> yourself to, if qemu-ga is multi-threaded), but in practice, I think >>>>>>> that is a bug of omission in POSIX, and not something you have to change >>>>>>> in your code. >>>>>> >>>>>> memset() ins't either... sigaction() either, which begins to get >>>>>> annoying. >>>>>> >>>>>> For those familiar with glib: isn't it possible to confirm it's using >>>>>> threads and/or acquire a global mutex or something? >>> >>> Misread, sigaction() is there. The ones that aren't are strcmp(), strstr() >>> and memset(). Interestingly, they are all "string functions". >>> >> >> There seem to be things beyond that list required to be implemented >> as thread/signal safe: >> >> http://www.unix.org/whitepapers/reentrant.html >> >> fread()/fwrite()/f* for instance, more at `man flockfile`: >> >> The stdio functions are thread-safe. >> This is achieved by assigning to each >> FILE object a lockcount and (if the >> lockcount is nonzero) an owning >> thread. For each library call, these >> functions wait until the FILE object >> is no longer locked by a different >> thread, then lock it, do the requested >> I/O, and unlock the object again. > > You need to be careful with terminology here. > > Threadsafe != async signal safe > > STDIO is one of the major areas of code that is definitely not > async signal safe. Consider Thread A doing something like > fwrite(stderr, "Foo\n"), while another thread forks, and then > its child also does an fwrite(stderr, "Foo\n"). Given that > every stdio function will lock/unlock a mutex, you easily get > this sequence of events: > > 1. Thread A: lock(stderr) > 2. Thread A: write(stderr, "foo\n"); > 3. Thread B: fork() -> Process B1 > 4. Thread A: unlock(stderr) > 5. Process B1: lock(stderr) > > When the child process is started at step 3, the FILE *stderr > object will be locked by thread A. When Thread A does the > unlock in step 4, it has no effect on Process B1. So process > B1 hangs forever in step 5. > Ahh, thanks for the example. I missed that these issues were specifically WRT to code that was fork()'d from a multi-threaded application. Seemed pretty scary otherwise :) >> In practice, are these functions really a problem for multi-threaded >> applications (beyond concurrent access to shared storage)? Maybe it >> would be sufficient to just check the glibc sources? > > In libvirt we have seen the hang scenarios I describe in the real world. > Causes I rememeber were use of malloc (via asprintf()), or use of stdio > FILE * functions, and use of syslog. The libvirt code still isn't 100% > in compliance with avoiding async signal safe functions, but we have > cleaned up many problems in this area. Thanks, looks like I have so fixes to do in qmp_guest_shutdown. syslog is really unfortunate... > > Regards, > Daniel