From: "andrzej zaborowski"
Date: Tue, 4 Nov 2008 09:33:27 +0100
Subject: Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s
To: Jan Kiszka
Cc: qemu-devel@nongnu.org
In-Reply-To: <49100654.6020108@web.de>
References: <490B53F0.9040200@codemonkey.ws> <490DFAA2.7040900@web.de> <490F5942.6020100@codemonkey.ws> <490F60E4.1040304@web.de> <490F723E.7070308@web.de> <490F7489.6040403@codemonkey.ws> <49100654.6020108@web.de>

2008/11/4 Jan Kiszka :
> andrzej zaborowski wrote:
>> 2008/11/3 Anthony Liguori :
>>> Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>
>>>> There is a race between the alarm_timer firing SIGALRM and
>>>> main_loop_wait reaching the safe harbor of select (with that
>>>> infamous 5 second timeout). If the signal comes when already
>>>> blocked in select, it will properly resume the latter immediately.
>>>> But if the timer fired BEFORE that point, host_alarm_handler will
>>>> only set a flag that the host timer has fired; the actual rearming
>>>> will be done AFTER return from select. Ooops....
>>>>
>>> Ah, so before this was causing the timer to potentially come 10ms later than
>>> it should have. I was hoping that this change would shake out this stuff
>>> :-)
>>>
>>>> So, select should actually include the host timer as an event. timerfd?
>>>> Unfortunately a recent Linux-only feature :-/. I don't think we can
>>>> rearm the timer from within the signal handler, at least not without
>>>> running all the pending qemu timers. And that is surely not a signal
>>>> handler job (qemu timer handlers aren't thread-safe in general).
>>>>
>>>> Anyone any ideas? /me is thinking a bit more about it as well.
>>
>> The select() man page on Linux mentions this race explicitly and
>> explains that pselect() is a solution.
>>
>>> host_alarm_handler should write to a file descriptor instead of setting a
>>> flag. That file descriptor should then be select()'d on (just like we do
>>> for SIGUSR2 in block-raw-posix.c).
>>
>> Or you can do this.
>
> I think this is safer. Or what's the state of pselect on all supported
> platforms (including WIN32)?

Supposedly it's in POSIX, but I have no idea about Win32. Maybe the
pipe is safer.

> My man page even warns that the Linux
> kernel is not implementing it yet, though I don't think this still
> applies to recent 2.6.2x kernels.

According to the man page it moved into the kernel in 2.6.16, but the
glibc wrapper should be OK too.

Cheers