Date: Thu, 6 Nov 2008 00:53:12 +0000
From: Jamie Lokier
To: "M. Warner Losh"
Cc: jan.kiszka@web.de, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s
Message-ID: <20081106005312.GA26173@shareable.org>
In-Reply-To: <20081105.091015.232928302.imp@bsdimp.com>

M. Warner Losh wrote:
> : > Which ones have a good kernel implementation of it?  FreeBSD's is
> : > currently approximately:
> : >
> : >     if (!mask)
> : >         _sigprocmask(mask, &oldmask);
> : >     /* here */
> : >     select();
> : >     if (!mask)
> : >         _sigprocmask(oldmask, NULL);
> : >
> : > I'm assuming that the problem is due to a signal arriving at /* here */.
> :
> : If that's _kernel_ code and the kernel behaves like Linux, it's not a
> : problem because signals don't affect the control flow until returning
> : to userspace, meaning the select() will return EINTR.
>
> It is currently user level code, and I'm looking at moving it into the
> kernel, but I need to understand the race being talked about here.

Ugh, I had imagined FreeBSD would have got that right, since it's
quite good in other areas.  I've added FreeBSD to my blacklist of
broken pselect() implementations, thanks for the info.

Do you know if FreeBSD's pread() and pwrite() are also thread-unsafe
userspace wrappers using lseek+read/write?  They are harder to avoid
when you're looking at high performance code.

> Why is it no good.  What is the race here?  Is it just the oldmask
> thing and multiple callers to select, or is it something else?

It's racy with a single caller.

The race is: the program's signal handler sets a flag like
"alarm_happened = 1".  The program's main loop checks the flag before
calling select().

If the signal is delivered before that check, the program doesn't call
select() and handles the reason for the flag.

If the signal is delivered during select(), that returns EINTR and the
program handles the reason for the flag.

But if the signal is delivered _between_ checking the flag and calling
select(), the program gets stuck: the flag is set, but select() keeps
waiting and nothing interrupts it.

pselect() avoids that stuck state.  The program blocks the signal
before it checks the flag, and pselect() atomically unblocks it for
the duration of the wait, so a signal that arrives at any point after
the block is guaranteed to make pselect() return EINTR.

It's sort of analogous to pthread_cond_wait() needing a mutex.

> And if it is the oldmask thing, why wouldn't multiple callers of
> pselect mess it up depending on what order they have.

Signal masks are per-thread anyway, so multiple callers aren't an issue.

-- Jamie
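
The check-then-wait pattern described above can be sketched in C
roughly as follows.  This is only an illustration, assuming a platform
whose pselect() swaps the signal mask atomically; the flag name
alarm_happened comes from the text above, while on_alarm(),
install_alarm_handler() and wait_for_alarm() are hypothetical names
made up for the example, and SIGALRM is just one possible signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Flag set by the handler and checked by the main loop. */
static volatile sig_atomic_t alarm_happened = 0;

/* Hypothetical handler: it only records that the signal arrived. */
static void on_alarm(int signo)
{
    (void)signo;
    alarm_happened = 1;
}

/* Hypothetical helper: install the handler. Returns 0 on success. */
int install_alarm_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}

/*
 * Hypothetical helper: one iteration of "check the flag, then wait".
 * Returns 1 if the alarm was seen, 0 on timeout, -1 on any other error.
 */
int wait_for_alarm(long timeout_sec)
{
    sigset_t block, orig, waitmask;
    struct timespec ts;
    int n, result;

    assert(timeout_sec >= 0);
    ts.tv_sec = timeout_sec;
    ts.tv_nsec = 0;

    /* Block SIGALRM *before* looking at the flag. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &orig) < 0)
        return -1;

    /* Mask used during the wait: the original mask minus SIGALRM. */
    waitmask = orig;
    sigdelset(&waitmask, SIGALRM);

    if (alarm_happened) {
        result = 1;
    } else {
        /*
         * A signal arriving here stays pending, because it is blocked.
         * pselect() unblocks it and waits in one atomic step, so the
         * handler runs and the call fails with EINTR instead of hanging.
         */
        n = pselect(0, NULL, NULL, NULL, &ts, &waitmask);
        if (n < 0 && errno == EINTR && alarm_happened)
            result = 1;
        else if (n == 0)
            result = 0;
        else
            result = -1;
    }

    alarm_happened = 0;
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return result;
}
```

A caller would run install_alarm_handler() once and then call
wait_for_alarm() from its main loop.  With plain select() in place of
pselect(), a signal landing between the flag check and the wait would
leave the loop asleep until the timeout.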