* [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s @ 2008-10-31 18:40 Anthony Liguori 2008-10-31 18:52 ` Anthony Liguori 2008-10-31 19:41 ` [Qemu-devel] " Jamie Lokier 0 siblings, 2 replies; 28+ messages in thread From: Anthony Liguori @ 2008-10-31 18:40 UTC (permalink / raw) To: qemu-devel Revision: 5578 http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5578 Author: aliguori Date: 2008-10-31 18:40:25 +0000 (Fri, 31 Oct 2008) Log Message: ----------- Increase default IO timeout from 10ms to 5s With the recent changes to the main loop, we no longer have unconditional polling. This means we can now sleep in select() for much longer than we previously did. This patch increases our select() sleep time from 10ms to 5s which is effectively unlimited since we're going to wake up sooner than that in almost all circumstances. With this patch, I see the number of wake-ups with an idle dynamic ticks guest drop from 80 per second to about 15 times per second. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Modified Paths: -------------- trunk/vl.c Modified: trunk/vl.c =================================================================== --- trunk/vl.c 2008-10-31 18:07:17 UTC (rev 5577) +++ trunk/vl.c 2008-10-31 18:40:25 UTC (rev 5578) @@ -8182,7 +8182,7 @@ timeout = 0; } } else { - timeout = 10; + timeout = 5000; } } else { timeout = 0; @@ -8192,7 +8192,7 @@ ret = EXCP_INTERRUPT; break; } - timeout = 10; + timeout = 5000; } #ifdef CONFIG_PROFILER ti = profile_getclock(); ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s 2008-10-31 18:40 [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s Anthony Liguori @ 2008-10-31 18:52 ` Anthony Liguori 2008-11-02 19:08 ` [Qemu-devel] " Jan Kiszka 2008-10-31 19:41 ` [Qemu-devel] " Jamie Lokier 1 sibling, 1 reply; 28+ messages in thread From: Anthony Liguori @ 2008-10-31 18:52 UTC (permalink / raw) To: qemu-devel If anyone notices a slow down, I'd check this revision. There may be broken bits of code out there that depends on wake up rate. I've done a fair bit of testing and haven't found any but you never know. Regards, Anthony Liguori Anthony Liguori wrote: > Revision: 5578 > http://svn.sv.gnu.org/viewvc/?view=rev&root=qemu&revision=5578 > Author: aliguori > Date: 2008-10-31 18:40:25 +0000 (Fri, 31 Oct 2008) > > Log Message: > ----------- > Increase default IO timeout from 10ms to 5s > > With the recent changes to the main loop, we no longer have unconditional > polling. This means we can now sleep in select() for much longer than we > previously did. This patch increases our select() sleep time from 10ms to 5s > which is effectively unlimited since we're going to wake up sooner than that > in almost all circumstances. > > With this patch, I see the number of wake-ups with an idle dynamic ticks guest > drop from 80 per second to about 15 times per second. > > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> > > Modified Paths: > -------------- > trunk/vl.c > > Modified: trunk/vl.c > =================================================================== > --- trunk/vl.c 2008-10-31 18:07:17 UTC (rev 5577) > +++ trunk/vl.c 2008-10-31 18:40:25 UTC (rev 5578) > @@ -8182,7 +8182,7 @@ > timeout = 0; > } > } else { > - timeout = 10; > + timeout = 5000; > } > } else { > timeout = 0; > @@ -8192,7 +8192,7 @@ > ret = EXCP_INTERRUPT; > break; > } > - timeout = 10; > + timeout = 5000; > } > #ifdef CONFIG_PROFILER > ti = profile_getclock(); > > > > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-10-31 18:52 ` Anthony Liguori @ 2008-11-02 19:08 ` Jan Kiszka 2008-11-03 20:04 ` Anthony Liguori 0 siblings, 1 reply; 28+ messages in thread From: Jan Kiszka @ 2008-11-02 19:08 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 562 bytes --] Anthony Liguori wrote: > If anyone notices a slow down, I'd check this revision. There may be > broken bits of code out there that depends on wake up rate. I've done a > fair bit of testing and haven't found any but you never know. OK, here is one brokenness for you: Musicpal (qemu-arm-system) becomes unusable, sounds is stopping for seconds, even keyboard input is inpossible, including switching to the monitor console. What should I check for? Note that this board uses a ptimer to a PIT to the guest. May this make a difference here? Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-02 19:08 ` [Qemu-devel] " Jan Kiszka @ 2008-11-03 20:04 ` Anthony Liguori 2008-11-03 20:36 ` Jan Kiszka 0 siblings, 1 reply; 28+ messages in thread From: Anthony Liguori @ 2008-11-03 20:04 UTC (permalink / raw) To: Jan Kiszka; +Cc: qemu-devel Jan Kiszka wrote: > Anthony Liguori wrote: > >> If anyone notices a slow down, I'd check this revision. There may be >> broken bits of code out there that depends on wake up rate. I've done a >> fair bit of testing and haven't found any but you never know. >> > > OK, here is one brokenness for you: Musicpal (qemu-arm-system) becomes > unusable, sounds is stopping for seconds, even keyboard input is > inpossible, including switching to the monitor console. What should I > check for? Note that this board uses a ptimer to a PIT to the guest. May > this make a difference here? > Keyboard input should be governed by the VGA refresh timer which should be firing 30 times a second. How do you have qemu configured? Are you using SDL? What sound driver are you using on the host? Regards, Anthony Liguori > Jan > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 20:04 ` Anthony Liguori @ 2008-11-03 20:36 ` Jan Kiszka 2008-11-03 21:50 ` Jan Kiszka 0 siblings, 1 reply; 28+ messages in thread From: Jan Kiszka @ 2008-11-03 20:36 UTC (permalink / raw) To: Anthony Liguori; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1162 bytes --] Anthony Liguori wrote: > Jan Kiszka wrote: >> Anthony Liguori wrote: >> >>> If anyone notices a slow down, I'd check this revision. There may be >>> broken bits of code out there that depends on wake up rate. I've done a >>> fair bit of testing and haven't found any but you never know. >>> >> >> OK, here is one brokenness for you: Musicpal (qemu-arm-system) becomes >> unusable, sounds is stopping for seconds, even keyboard input is >> inpossible, including switching to the monitor console. What should I >> check for? Note that this board uses a ptimer to a PIT to the guest. May >> this make a difference here? >> > > Keyboard input should be governed by the VGA refresh timer which should > be firing 30 times a second. If you mean *the* VGA emulation - there is none, the target is an embedded ARM board. > > How do you have qemu configured? qemu-system-arm -M musicpal -pflash musicpal.image -snapshot \ -kernel u-boot.image -serial stdio (I can provide you both privately if you want to play with it.) > Are you using SDL? Yes. > What sound driver > are you using on the host? ALSA. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 20:36 ` Jan Kiszka @ 2008-11-03 21:50 ` Jan Kiszka 2008-11-03 22:00 ` Anthony Liguori 2008-11-04 8:29 ` Avi Kivity 0 siblings, 2 replies; 28+ messages in thread From: Jan Kiszka @ 2008-11-03 21:50 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 2736 bytes --] Jan Kiszka wrote: > Anthony Liguori wrote: >> Jan Kiszka wrote: >>> Anthony Liguori wrote: >>> >>>> If anyone notices a slow down, I'd check this revision. There may be >>>> broken bits of code out there that depends on wake up rate. I've done a >>>> fair bit of testing and haven't found any but you never know. >>>> >>> OK, here is one brokenness for you: Musicpal (qemu-arm-system) becomes >>> unusable, sounds is stopping for seconds, even keyboard input is >>> inpossible, including switching to the monitor console. What should I >>> check for? Note that this board uses a ptimer to a PIT to the guest. May >>> this make a difference here? >>> I think I understood the problem: [...] timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0 --- SIGALRM (Alarm clock) @ 0 (0) --- rt_sigreturn(0xca8d90) = 0 select(24, [0 5 6 8 23], [], [], {0, 0}) = 0 (Timeout) ioctl(20, 0x4122, 0xbb) = 0 ioctl(20, 0x4122, 0x54) = 0 ioctl(20, 0x4122, 0x3) = 0 ioctl(20, 0x4122, 0) = 0 ioctl(20, 0x4122, 0xac) = 0 ioctl(20, 0x4122, 0) = 0 ioctl(20, 0x4122, 0x400) = 0 ioctl(21, USBDEVFS_IOCTL, 0xd64080) = 0 ioctl(20, 0x4122, 0xc000000000020400) = 0 semop(34242562, 0x7fff65de5960, 2) = 0 semop(34242562, 0x7fff65de5970, 1) = 0 read(19, "\1\0\0\0\0\201\377\377\321Z\2\0\0\0\0\0\'PO&\0\0\0\0\1\0\0\0\377\377\377\377", 128) = 32 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0 --- SIGALRM (Alarm clock) @ 0 (0) --- rt_sigreturn(0xca8d90) = 1 select(24, [0 5 6 8 23], [], [], {5, 0}^C <unfinished ...> There is a race between the alarm_timer firing SIGALRM and main_loop_wait reaching the safe harbor of select (with that infamous 5 second timeout). If the signal comes when already blocked in select, it will properly resume the latter immediately. But if the timer fired BEFORE that point, host_alarm_handler will only set a flag that the host timer has fired, the actual rearming will be done AFTER return from select. Ooops.... So, select should actually include the host timer as event. timerfd? Unfortunately a recent Linux-only feature :-/. I don't think we can rearm the timer from within the signal handler, at least not without running all the pending qemu timers. And that is surely not a signal handler job (qemu timer handler aren't thread-safe in general). Anyone any ideas? /me is thinking a bit more about it as well. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 21:50 ` Jan Kiszka @ 2008-11-03 22:00 ` Anthony Liguori 2008-11-03 22:03 ` Jan Kiszka 2008-11-04 8:07 ` andrzej zaborowski 2008-11-04 8:29 ` Avi Kivity 1 sibling, 2 replies; 28+ messages in thread From: Anthony Liguori @ 2008-11-03 22:00 UTC (permalink / raw) To: Jan Kiszka; +Cc: qemu-devel Jan Kiszka wrote: > Jan Kiszka wrote: > > > There is a race between the alarm_timer firing SIGALRM and > main_loop_wait reaching the safe harbor of select (with that infamous 5 > second timeout). If the signal comes when already blocked in select, it > will properly resume the latter immediately. But if the timer fired > BEFORE that point, host_alarm_handler will only set a flag that the host > timer has fired, the actual rearming will be done AFTER return from > select. Ooops.... > Ah, so before this was causing the timer to potentially come 10ms later than it should have. I was hoping that this change would shake out this stuff :-) > So, select should actually include the host timer as event. timerfd? > Unfortunately a recent Linux-only feature :-/. I don't think we can > rearm the timer from within the signal handler, at least not without > running all the pending qemu timers. And that is surely not a signal > handler job (qemu timer handler aren't thread-safe in general). > > Anyone any ideas? /me is thinking a bit more about it as well. > host_alarm_handler should write to a file descriptor instead of setting a flag. That file descriptor should then be select()'d on (just like we do for SIGUSR2 in block-raw-posix.c). Regards, Anthony Liguori > Jan > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 22:00 ` Anthony Liguori @ 2008-11-03 22:03 ` Jan Kiszka 2008-11-04 8:07 ` andrzej zaborowski 1 sibling, 0 replies; 28+ messages in thread From: Jan Kiszka @ 2008-11-03 22:03 UTC (permalink / raw) To: Anthony Liguori; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1423 bytes --] Anthony Liguori wrote: > Jan Kiszka wrote: >> Jan Kiszka wrote: >> >> There is a race between the alarm_timer firing SIGALRM and >> main_loop_wait reaching the safe harbor of select (with that infamous 5 >> second timeout). If the signal comes when already blocked in select, it >> will properly resume the latter immediately. But if the timer fired >> BEFORE that point, host_alarm_handler will only set a flag that the host >> timer has fired, the actual rearming will be done AFTER return from >> select. Ooops.... >> > > Ah, so before this was causing the timer to potentially come 10ms later > than it should have. I was hoping that this change would shake out this > stuff :-) > >> So, select should actually include the host timer as event. timerfd? >> Unfortunately a recent Linux-only feature :-/. I don't think we can >> rearm the timer from within the signal handler, at least not without >> running all the pending qemu timers. And that is surely not a signal >> handler job (qemu timer handler aren't thread-safe in general). >> >> Anyone any ideas? /me is thinking a bit more about it as well. >> > > host_alarm_handler should write to a file descriptor instead of setting > a flag. That file descriptor should then be select()'d on (just like we > do for SIGUSR2 in block-raw-posix.c). A pipe, that slowly came to my mind as well. OK, will play with it. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 22:00 ` Anthony Liguori 2008-11-03 22:03 ` Jan Kiszka @ 2008-11-04 8:07 ` andrzej zaborowski 2008-11-04 8:22 ` Jan Kiszka 1 sibling, 1 reply; 28+ messages in thread From: andrzej zaborowski @ 2008-11-04 8:07 UTC (permalink / raw) To: qemu-devel; +Cc: Jan Kiszka 2008/11/3 Anthony Liguori <anthony@codemonkey.ws>: > Jan Kiszka wrote: >> >> Jan Kiszka wrote: >> >> There is a race between the alarm_timer firing SIGALRM and >> main_loop_wait reaching the safe harbor of select (with that infamous 5 >> second timeout). If the signal comes when already blocked in select, it >> will properly resume the latter immediately. But if the timer fired >> BEFORE that point, host_alarm_handler will only set a flag that the host >> timer has fired, the actual rearming will be done AFTER return from >> select. Ooops.... >> > > Ah, so before this was causing the timer to potentially come 10ms later than > it should have. I was hoping that this change would shake out this stuff > :-) > >> So, select should actually include the host timer as event. timerfd? >> Unfortunately a recent Linux-only feature :-/. I don't think we can >> rearm the timer from within the signal handler, at least not without >> running all the pending qemu timers. And that is surely not a signal >> handler job (qemu timer handler aren't thread-safe in general). >> >> Anyone any ideas? /me is thinking a bit more about it as well. The select() man page on Linux mentions this race explicitely and explains that pselect() is a solution. >> > > host_alarm_handler should write to a file descriptor instead of setting a > flag. That file descriptor should then be select()'d on (just like we do > for SIGUSR2 in block-raw-posix.c). Or you can do this. Cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 8:07 ` andrzej zaborowski @ 2008-11-04 8:22 ` Jan Kiszka 2008-11-04 8:33 ` andrzej zaborowski 0 siblings, 1 reply; 28+ messages in thread From: Jan Kiszka @ 2008-11-04 8:22 UTC (permalink / raw) To: andrzej zaborowski; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1804 bytes --] andrzej zaborowski wrote: > 2008/11/3 Anthony Liguori <anthony@codemonkey.ws>: >> Jan Kiszka wrote: >>> Jan Kiszka wrote: >>> >>> There is a race between the alarm_timer firing SIGALRM and >>> main_loop_wait reaching the safe harbor of select (with that infamous 5 >>> second timeout). If the signal comes when already blocked in select, it >>> will properly resume the latter immediately. But if the timer fired >>> BEFORE that point, host_alarm_handler will only set a flag that the host >>> timer has fired, the actual rearming will be done AFTER return from >>> select. Ooops.... >>> >> Ah, so before this was causing the timer to potentially come 10ms later than >> it should have. I was hoping that this change would shake out this stuff >> :-) >> >>> So, select should actually include the host timer as event. timerfd? >>> Unfortunately a recent Linux-only feature :-/. I don't think we can >>> rearm the timer from within the signal handler, at least not without >>> running all the pending qemu timers. And that is surely not a signal >>> handler job (qemu timer handler aren't thread-safe in general). >>> >>> Anyone any ideas? /me is thinking a bit more about it as well. > > The select() man page on Linux mentions this race explicitely and > explains that pselect() is a solution. > >> host_alarm_handler should write to a file descriptor instead of setting a >> flag. That file descriptor should then be select()'d on (just like we do >> for SIGUSR2 in block-raw-posix.c). > > Or you can do this. I think this is safer. Or what's the state of pselect on all supported platforms (including WIN32)? My man page even warns that the Linux kernel is not implementing it yet, though I don't think this still applies to recent 2.6.2x kernels. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 8:22 ` Jan Kiszka @ 2008-11-04 8:33 ` andrzej zaborowski 2008-11-04 11:32 ` Jamie Lokier 0 siblings, 1 reply; 28+ messages in thread From: andrzej zaborowski @ 2008-11-04 8:33 UTC (permalink / raw) To: Jan Kiszka; +Cc: qemu-devel 2008/11/4 Jan Kiszka <jan.kiszka@web.de>: > andrzej zaborowski wrote: >> 2008/11/3 Anthony Liguori <anthony@codemonkey.ws>: >>> Jan Kiszka wrote: >>>> Jan Kiszka wrote: >>>> >>>> There is a race between the alarm_timer firing SIGALRM and >>>> main_loop_wait reaching the safe harbor of select (with that infamous 5 >>>> second timeout). If the signal comes when already blocked in select, it >>>> will properly resume the latter immediately. But if the timer fired >>>> BEFORE that point, host_alarm_handler will only set a flag that the host >>>> timer has fired, the actual rearming will be done AFTER return from >>>> select. Ooops.... >>>> >>> Ah, so before this was causing the timer to potentially come 10ms later than >>> it should have. I was hoping that this change would shake out this stuff >>> :-) >>> >>>> So, select should actually include the host timer as event. timerfd? >>>> Unfortunately a recent Linux-only feature :-/. I don't think we can >>>> rearm the timer from within the signal handler, at least not without >>>> running all the pending qemu timers. And that is surely not a signal >>>> handler job (qemu timer handler aren't thread-safe in general). >>>> >>>> Anyone any ideas? /me is thinking a bit more about it as well. >> >> The select() man page on Linux mentions this race explicitely and >> explains that pselect() is a solution. >> >>> host_alarm_handler should write to a file descriptor instead of setting a >>> flag. That file descriptor should then be select()'d on (just like we do >>> for SIGUSR2 in block-raw-posix.c). >> >> Or you can do this. > > I think this is safer. Or what's the state of pselect on all supported > platforms (including WIN32)? Supposedly it's in posix, but no idea about win32. Maybe the pipe is safer. > My man page even warns that the Linux > kernel is not implementing it yet, though I don't think this still > applies to recent 2.6.2x kernels. According to the man page it moved to kernel at 2.6.16 but the glibc wrapper should be ok too. Cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 8:33 ` andrzej zaborowski @ 2008-11-04 11:32 ` Jamie Lokier 2008-11-04 16:22 ` M. Warner Losh 0 siblings, 1 reply; 28+ messages in thread From: Jamie Lokier @ 2008-11-04 11:32 UTC (permalink / raw) To: qemu-devel; +Cc: Jan Kiszka andrzej zaborowski wrote: > > My man page even warns that the Linux > > kernel is not implementing it yet, though I don't think this still > > applies to recent 2.6.2x kernels. > > According to the man page it moved to kernel at 2.6.16 but the glibc > wrapper should be ok too. If there's a glibc wrapper, it cannot be reliable... *Looks at glibc source* That's right. The glibc pselect() wrapper has the same race condition which prompted this QEMU bug. If the signal arrives after unmasking and before select() in the wrapper, then blocks. In other words, don't use pselect() if you might run on a kernel older than 2.6.16, or on a host architecture which adds pselect() in a later kernel version. Also, I wouldn't be surprised if older versions of some BSDs have similar dodgy wrappers. -- Jamie ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 11:32 ` Jamie Lokier @ 2008-11-04 16:22 ` M. Warner Losh 2008-11-04 17:10 ` Jan Kiszka 2008-11-05 15:00 ` Jamie Lokier 0 siblings, 2 replies; 28+ messages in thread From: M. Warner Losh @ 2008-11-04 16:22 UTC (permalink / raw) To: qemu-devel, jamie; +Cc: jan.kiszka In message: <20081104113204.GA32125@shareable.org> Jamie Lokier <jamie@shareable.org> writes: : andrzej zaborowski wrote: : > > My man page even warns that the Linux : > > kernel is not implementing it yet, though I don't think this still : > > applies to recent 2.6.2x kernels. : > : > According to the man page it moved to kernel at 2.6.16 but the glibc : > wrapper should be ok too. : : If there's a glibc wrapper, it cannot be reliable... : : *Looks at glibc source* : : That's right. The glibc pselect() wrapper has the same race condition : which prompted this QEMU bug. If the signal arrives after unmasking : and before select() in the wrapper, then blocks. : : In other words, don't use pselect() if you might run on a kernel older : than 2.6.16, or on a host architecture which adds pselect() in a later : kernel version. Also, I wouldn't be surprised if older versions of : some BSDs have similar dodgy wrappers. Which ones have a good kernel implementation of it? FreeBSD's is currently approximately: if (!mask) _sigprocmask(mask, &oldmask); /* here */ select(); if (!mask) _sigprocmask(oldmask, NULL); I'm assuming that the problem is due to a signal arriving at /* here */. Warner ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 16:22 ` M. Warner Losh @ 2008-11-04 17:10 ` Jan Kiszka 2008-11-04 17:55 ` M. Warner Losh 2008-11-05 15:00 ` Jamie Lokier 1 sibling, 1 reply; 28+ messages in thread From: Jan Kiszka @ 2008-11-04 17:10 UTC (permalink / raw) To: M. Warner Losh; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1546 bytes --] M. Warner Losh wrote: > In message: <20081104113204.GA32125@shareable.org> > Jamie Lokier <jamie@shareable.org> writes: > : andrzej zaborowski wrote: > : > > My man page even warns that the Linux > : > > kernel is not implementing it yet, though I don't think this still > : > > applies to recent 2.6.2x kernels. > : > > : > According to the man page it moved to kernel at 2.6.16 but the glibc > : > wrapper should be ok too. > : > : If there's a glibc wrapper, it cannot be reliable... > : > : *Looks at glibc source* > : > : That's right. The glibc pselect() wrapper has the same race condition > : which prompted this QEMU bug. If the signal arrives after unmasking > : and before select() in the wrapper, then blocks. > : > : In other words, don't use pselect() if you might run on a kernel older > : than 2.6.16, or on a host architecture which adds pselect() in a later > : kernel version. Also, I wouldn't be surprised if older versions of > : some BSDs have similar dodgy wrappers. > > Which ones have a good kernel implementation of it? FreeBSD's is > currently approximately: > > if (!mask) > _sigprocmask(mask, &oldmask); > /* here */ > select(); > if (!mask) > _sigprocmask(oldmask, NULL); > > I'm assuming that the problem is due to a signal arriving at /* here */. I guess those things happen under some kind of preemption lock, otherwise it would be a really poor implementation. However, think we buried the idea of using pselect for this anyway. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 17:10 ` Jan Kiszka @ 2008-11-04 17:55 ` M. Warner Losh 2008-11-04 19:08 ` Jan Kiszka 0 siblings, 1 reply; 28+ messages in thread From: M. Warner Losh @ 2008-11-04 17:55 UTC (permalink / raw) To: jan.kiszka; +Cc: qemu-devel In message: <491081EB.5070905@web.de> Jan Kiszka <jan.kiszka@web.de> writes: : M. Warner Losh wrote: : > In message: <20081104113204.GA32125@shareable.org> : > Jamie Lokier <jamie@shareable.org> writes: : > : andrzej zaborowski wrote: : > : > > My man page even warns that the Linux : > : > > kernel is not implementing it yet, though I don't think this still : > : > > applies to recent 2.6.2x kernels. : > : > : > : > According to the man page it moved to kernel at 2.6.16 but the glibc : > : > wrapper should be ok too. : > : : > : If there's a glibc wrapper, it cannot be reliable... : > : : > : *Looks at glibc source* : > : : > : That's right. The glibc pselect() wrapper has the same race condition : > : which prompted this QEMU bug. If the signal arrives after unmasking : > : and before select() in the wrapper, then blocks. : > : : > : In other words, don't use pselect() if you might run on a kernel older : > : than 2.6.16, or on a host architecture which adds pselect() in a later : > : kernel version. Also, I wouldn't be surprised if older versions of : > : some BSDs have similar dodgy wrappers. : > : > Which ones have a good kernel implementation of it? FreeBSD's is : > currently approximately: : > : > if (!mask) : > _sigprocmask(mask, &oldmask); : > /* here */ : > select(); : > if (!mask) : > _sigprocmask(oldmask, NULL); : > : > I'm assuming that the problem is due to a signal arriving at /* here */. : : I guess those things happen under some kind of preemption lock, : otherwise it would be a really poor implementation. Why? There's races there anyway. IF you have a mutex to prevent multiple calls, doesn't that serialize calls to select? Warner ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 17:55 ` M. Warner Losh @ 2008-11-04 19:08 ` Jan Kiszka 0 siblings, 0 replies; 28+ messages in thread From: Jan Kiszka @ 2008-11-04 19:08 UTC (permalink / raw) To: M. Warner Losh; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 2217 bytes --] M. Warner Losh wrote: > In message: <491081EB.5070905@web.de> > Jan Kiszka <jan.kiszka@web.de> writes: > : M. Warner Losh wrote: > : > In message: <20081104113204.GA32125@shareable.org> > : > Jamie Lokier <jamie@shareable.org> writes: > : > : andrzej zaborowski wrote: > : > : > > My man page even warns that the Linux > : > : > > kernel is not implementing it yet, though I don't think this still > : > : > > applies to recent 2.6.2x kernels. > : > : > > : > : > According to the man page it moved to kernel at 2.6.16 but the glibc > : > : > wrapper should be ok too. > : > : > : > : If there's a glibc wrapper, it cannot be reliable... > : > : > : > : *Looks at glibc source* > : > : > : > : That's right. The glibc pselect() wrapper has the same race condition > : > : which prompted this QEMU bug. If the signal arrives after unmasking > : > : and before select() in the wrapper, then blocks. > : > : > : > : In other words, don't use pselect() if you might run on a kernel older > : > : than 2.6.16, or on a host architecture which adds pselect() in a later > : > : kernel version. Also, I wouldn't be surprised if older versions of > : > : some BSDs have similar dodgy wrappers. > : > > : > Which ones have a good kernel implementation of it? FreeBSD's is > : > currently approximately: > : > > : > if (!mask) > : > _sigprocmask(mask, &oldmask); > : > /* here */ > : > select(); > : > if (!mask) > : > _sigprocmask(oldmask, NULL); > : > > : > I'm assuming that the problem is due to a signal arriving at /* here */. > : > : I guess those things happen under some kind of preemption lock, > : otherwise it would be a really poor implementation. > > Why? There's races there anyway. IF you have a mutex to prevent > multiple calls, doesn't that serialize calls to select? [ sorting my mind ] What I meant was that some care will likely be taken to prevent signal delivery _to userspace_ between the unmasking and the select() call. At least Linux does delivery on syscall return, so should FreeBSD do. I guess that signals arriving around "/* here */" will simply prevent select() to block (or even run). Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 258 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-04 16:22 ` M. Warner Losh 2008-11-04 17:10 ` Jan Kiszka @ 2008-11-05 15:00 ` Jamie Lokier 2008-11-05 16:10 ` M. Warner Losh 1 sibling, 1 reply; 28+ messages in thread From: Jamie Lokier @ 2008-11-05 15:00 UTC (permalink / raw) To: M. Warner Losh; +Cc: jan.kiszka, qemu-devel M. Warner Losh wrote: > : In other words, don't use pselect() if you might run on a kernel older > : than 2.6.16, or on a host architecture which adds pselect() in a later > : kernel version. Also, I wouldn't be surprised if older versions of > : some BSDs have similar dodgy wrappers. > > Which ones have a good kernel implementation of it? FreeBSD's is > currently approximately: > > if (!mask) > _sigprocmask(mask, &oldmask); > /* here */ > select(); > if (!mask) > _sigprocmask(oldmask, NULL); > > I'm assuming that the problem is due to a signal arriving at /* here */. If that's _kernel_ code and the kernel behaves like Linux, it's not a problem because signals don't affect the control flow until returning to userspace, meaning the select() will return EINTR. If that's userspace (libc) code, then it is no good. Nobody should ever have written crappy pselect() wrappers in userspace (i.e. Glibc), it just causes portable software to have to keep a whitelist of reliable pselect() platforms (i.e. not Linux) instead of just using it. Same for crappy broken pread() and pwrite() wrappers. -- Jamie ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 15:00 ` Jamie Lokier @ 2008-11-05 16:10 ` M. Warner Losh 2008-11-05 18:21 ` Jan Kiszka 2008-11-06 0:53 ` Jamie Lokier 0 siblings, 2 replies; 28+ messages in thread From: M. Warner Losh @ 2008-11-05 16:10 UTC (permalink / raw) To: jamie; +Cc: jan.kiszka, qemu-devel In message: <20081105150042.GJ13630@shareable.org> Jamie Lokier <jamie@shareable.org> writes: : M. Warner Losh wrote: : > : In other words, don't use pselect() if you might run on a kernel older : > : than 2.6.16, or on a host architecture which adds pselect() in a later : > : kernel version. Also, I wouldn't be surprised if older versions of : > : some BSDs have similar dodgy wrappers. : > : > Which ones have a good kernel implementation of it? FreeBSD's is : > currently approximately: : > : > if (!mask) : > _sigprocmask(mask, &oldmask); : > /* here */ : > select(); : > if (!mask) : > _sigprocmask(oldmask, NULL); : > : > I'm assuming that the problem is due to a signal arriving at /* here */. : : If that's _kernel_ code and the kernel behaves like Linux, it's not a : problem because signals don't affect the control flow until returning : to userspace, meaning the select() will return EINTR. It is currently user level code, and I'm looking at moving it into the kernel, but I need to understand the race being talked about here. : If that's userspace (libc) code, then it is no good. Nobody should : ever have written crappy pselect() wrappers in userspace (i.e. Glibc), : it just causes portable software to have to keep a whitelist of : reliable pselect() platforms (i.e. not Linux) instead of just using : it. Same for crappy broken pread() and pwrite() wrappers. Why is it no good. What is the race here? Is it just the oldmask thing and multiple callers to select, or is it something else? And if it is the oldmask thing, why wouldn't multiple callers of pselect mess it up depending on what order they have. I must have missed the original description of the race and why it matters.. Warner ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 16:10 ` M. Warner Losh @ 2008-11-05 18:21 ` Jan Kiszka 2008-11-05 18:41 ` Daniel P. Berrange 2008-11-05 20:16 ` andrzej zaborowski 2008-11-06 0:53 ` Jamie Lokier 1 sibling, 2 replies; 28+ messages in thread From: Jan Kiszka @ 2008-11-05 18:21 UTC (permalink / raw) To: M. Warner Losh; +Cc: qemu-devel [-- Attachment #1: Type: text/plain, Size: 1915 bytes --] M. Warner Losh wrote: > In message: <20081105150042.GJ13630@shareable.org> > Jamie Lokier <jamie@shareable.org> writes: > : M. Warner Losh wrote: > : > : In other words, don't use pselect() if you might run on a kernel older > : > : than 2.6.16, or on a host architecture which adds pselect() in a later > : > : kernel version. Also, I wouldn't be surprised if older versions of > : > : some BSDs have similar dodgy wrappers. > : > > : > Which ones have a good kernel implementation of it? FreeBSD's is > : > currently approximately: > : > > : > if (!mask) > : > _sigprocmask(mask, &oldmask); > : > /* here */ > : > select(); > : > if (!mask) > : > _sigprocmask(oldmask, NULL); > : > > : > I'm assuming that the problem is due to a signal arriving at /* here */. > : > : If that's _kernel_ code and the kernel behaves like Linux, it's not a > : problem because signals don't affect the control flow until returning > : to userspace, meaning the select() will return EINTR. > > It is currently user level code, and I'm looking at moving it into the > kernel, but I need to understand the race being talked about here. From the Linux man page on [p]select: "The reason that pselect() is needed is that if one wants to wait for either a signal or for a file descriptor to become ready, then an atomic test is needed to prevent race conditions. (Suppose the signal handler sets a global flag and returns. Then a test of this global flag followed by a call of select() could hang indefinitely if the signal arrived just after the test but just before the call. By contrast, pselect() allows one to first block signals, handle the signals that have come in, then call pselect() with the desired sigmask, avoiding the race.)" So the unmasking and possible blocking on select must be done atomically. And that is only feasible in kernel land. Jan [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 18:21 ` Jan Kiszka @ 2008-11-05 18:41 ` Daniel P. Berrange 2008-11-05 20:16 ` andrzej zaborowski 1 sibling, 0 replies; 28+ messages in thread From: Daniel P. Berrange @ 2008-11-05 18:41 UTC (permalink / raw) To: qemu-devel On Wed, Nov 05, 2008 at 07:21:40PM +0100, Jan Kiszka wrote: > M. Warner Losh wrote: > > In message: <20081105150042.GJ13630@shareable.org> > > Jamie Lokier <jamie@shareable.org> writes: > > : M. Warner Losh wrote: > > : > : In other words, don't use pselect() if you might run on a kernel older > > : > : than 2.6.16, or on a host architecture which adds pselect() in a later > > : > : kernel version. Also, I wouldn't be surprised if older versions of > > : > : some BSDs have similar dodgy wrappers. > > : > > > : > Which ones have a good kernel implementation of it? FreeBSD's is > > : > currently approximately: > > : > > > : > if (!mask) > > : > _sigprocmask(mask, &oldmask); > > : > /* here */ > > : > select(); > > : > if (!mask) > > : > _sigprocmask(oldmask, NULL); > > : > > > : > I'm assuming that the problem is due to a signal arriving at /* here */. > > : > > : If that's _kernel_ code and the kernel behaves like Linux, it's not a > > : problem because signals don't affect the control flow until returning > > : to userspace, meaning the select() will return EINTR. > > > > It is currently user level code, and I'm looking at moving it into the > > kernel, but I need to understand the race being talked about here. > > From the Linux man page on [p]select: > > "The reason that pselect() is needed is that if one wants to wait for > either a signal or for a file descriptor to become ready, then an atomic > test is needed to prevent race conditions. (Suppose the signal handler > sets a global flag and returns. Then a test of this global flag followed > by a call of select() could hang indefinitely if the signal arrived just > after the test but just before the call. By contrast, pselect() allows > one to first block signals, handle the signals that have come in, then > call pselect() with the desired sigmask, avoiding the race.)" There's another good expanded description on LWN too: http://lwn.net/Articles/176911/ Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 18:21 ` Jan Kiszka 2008-11-05 18:41 ` Daniel P. Berrange @ 2008-11-05 20:16 ` andrzej zaborowski 2008-11-05 20:28 ` Daniel P. Berrange 1 sibling, 1 reply; 28+ messages in thread From: andrzej zaborowski @ 2008-11-05 20:16 UTC (permalink / raw) To: qemu-devel 2008/11/5 Jan Kiszka <jan.kiszka@web.de>: > M. Warner Losh wrote: >> In message: <20081105150042.GJ13630@shareable.org> >> Jamie Lokier <jamie@shareable.org> writes: >> : M. Warner Losh wrote: >> : > : In other words, don't use pselect() if you might run on a kernel older >> : > : than 2.6.16, or on a host architecture which adds pselect() in a later >> : > : kernel version. Also, I wouldn't be surprised if older versions of >> : > : some BSDs have similar dodgy wrappers. >> : > >> : > Which ones have a good kernel implementation of it? FreeBSD's is >> : > currently approximately: >> : > >> : > if (!mask) >> : > _sigprocmask(mask, &oldmask); >> : > /* here */ >> : > select(); >> : > if (!mask) >> : > _sigprocmask(oldmask, NULL); >> : > >> : > I'm assuming that the problem is due to a signal arriving at /* here */. >> : >> : If that's _kernel_ code and the kernel behaves like Linux, it's not a >> : problem because signals don't affect the control flow until returning >> : to userspace, meaning the select() will return EINTR. >> >> It is currently user level code, and I'm looking at moving it into the >> kernel, but I need to understand the race being talked about here. > > From the Linux man page on [p]select: > > "The reason that pselect() is needed is that if one wants to wait for > either a signal or for a file descriptor to become ready, then an atomic > test is needed to prevent race conditions. (Suppose the signal handler > sets a global flag and returns. Then a test of this global flag followed > by a call of select() could hang indefinitely if the signal arrived just > after the test but just before the call. By contrast, pselect() allows > one to first block signals, handle the signals that have come in, then > call pselect() with the desired sigmask, avoiding the race.)" > > So the unmasking and possible blocking on select must be done > atomically. And that is only feasible in kernel land. To be exact, it *was* possible for glibc to implement a pselect free of races: that is by using the same trick as your patch, i.e. making a pipe and adding it to select()ed fd's and mangling the sigmask. Cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 20:16 ` andrzej zaborowski @ 2008-11-05 20:28 ` Daniel P. Berrange 2008-11-05 23:38 ` Jamie Lokier 0 siblings, 1 reply; 28+ messages in thread From: Daniel P. Berrange @ 2008-11-05 20:28 UTC (permalink / raw) To: qemu-devel On Wed, Nov 05, 2008 at 09:16:59PM +0100, andrzej zaborowski wrote: > 2008/11/5 Jan Kiszka <jan.kiszka@web.de>: > > M. Warner Losh wrote: > >> In message: <20081105150042.GJ13630@shareable.org> > >> Jamie Lokier <jamie@shareable.org> writes: > >> : M. Warner Losh wrote: > >> : > : In other words, don't use pselect() if you might run on a kernel older > >> : > : than 2.6.16, or on a host architecture which adds pselect() in a later > >> : > : kernel version. Also, I wouldn't be surprised if older versions of > >> : > : some BSDs have similar dodgy wrappers. > >> : > > >> : > Which ones have a good kernel implementation of it? FreeBSD's is > >> : > currently approximately: > >> : > > >> : > if (!mask) > >> : > _sigprocmask(mask, &oldmask); > >> : > /* here */ > >> : > select(); > >> : > if (!mask) > >> : > _sigprocmask(oldmask, NULL); > >> : > > >> : > I'm assuming that the problem is due to a signal arriving at /* here */. > >> : > >> : If that's _kernel_ code and the kernel behaves like Linux, it's not a > >> : problem because signals don't affect the control flow until returning > >> : to userspace, meaning the select() will return EINTR. > >> > >> It is currently user level code, and I'm looking at moving it into the > >> kernel, but I need to understand the race being talked about here. > > > > From the Linux man page on [p]select: > > > > "The reason that pselect() is needed is that if one wants to wait for > > either a signal or for a file descriptor to become ready, then an atomic > > test is needed to prevent race conditions. (Suppose the signal handler > > sets a global flag and returns. Then a test of this global flag followed > > by a call of select() could hang indefinitely if the signal arrived just > > after the test but just before the call. By contrast, pselect() allows > > one to first block signals, handle the signals that have come in, then > > call pselect() with the desired sigmask, avoiding the race.)" > > > > So the unmasking and possible blocking on select must be done > > atomically. And that is only feasible in kernel land. > > To be exact, it *was* possible for glibc to implement a pselect free of races: > that is by using the same trick as your patch, i.e. making a pipe and > adding it to select()ed fd's and mangling the sigmask. Yes & no. The trouble with glibc using pipes behind your back is that then it creates a totally different race in threaded apps, where a FD could be leaked to a child process between glibc opening its secret pipe and setting the O_CLOSEXEC flag. Indeed it already suffers from this problem with name resolving Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 20:28 ` Daniel P. Berrange @ 2008-11-05 23:38 ` Jamie Lokier 0 siblings, 0 replies; 28+ messages in thread From: Jamie Lokier @ 2008-11-05 23:38 UTC (permalink / raw) To: Daniel P. Berrange, qemu-devel Daniel P. Berrange wrote: > > To be exact, it *was* possible for glibc to implement a pselect > > free of races: that is by using the same trick as your patch, > > i.e. making a pipe and adding it to select()ed fd's and mangling > > the sigmask. > > Yes & no. The trouble with glibc using pipes behind your back is that > then it creates a totally different race in threaded apps, where a FD > could be leaked to a child process between glibc opening its secret > pipe and setting the O_CLOSEXEC flag. Indeed it already suffers from > this problem with name resolving That involves wrapping every signal handler too, and because of threads, handlers would need to be wrapped all the time by a sigaction() replacement - quite a lot of cruft in libc would be required! If you're writing libc, and willing to go to such crazy lengths to implement pselect() "properly", you can fix the close-on-exec race by wrapping execve() too. Or you can avoid close-on-exec by using siglongjmp() from the wrapped signal handlers to jump out of select(), not requiring an fd at all. Unsurprisingly no libc that I know goes to these lengths. -- Jamie ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-05 16:10 ` M. Warner Losh 2008-11-05 18:21 ` Jan Kiszka @ 2008-11-06 0:53 ` Jamie Lokier 2008-11-06 5:19 ` M. Warner Losh 1 sibling, 1 reply; 28+ messages in thread From: Jamie Lokier @ 2008-11-06 0:53 UTC (permalink / raw) To: M. Warner Losh; +Cc: jan.kiszka, qemu-devel M. Warner Losh wrote: > : > Which ones have a good kernel implementation of it? FreeBSD's is > : > currently approximately: > : > > : > if (!mask) > : > _sigprocmask(mask, &oldmask); > : > /* here */ > : > select(); > : > if (!mask) > : > _sigprocmask(oldmask, NULL); > : > > : > I'm assuming that the problem is due to a signal arriving at /* here */. > : > : If that's _kernel_ code and the kernel behaves like Linux, it's not a > : problem because signals don't affect the control flow until returning > : to userspace, meaning the select() will return EINTR. > > It is currently user level code, and I'm looking at moving it into the > kernel, but I need to understand the race being talked about here. Ugh, I had imagined FreeBSD would have got that right, since it's quite good in other areas. I've added FreeBSD to my blacklist of broken pselect() implementations, thanks for the info. Do you know if FreeBSD's pread() and pwrite() are also thread-unsafe userspace wrappers using lseek+read/write? They are harder to avoid when you're looking at high performance code. > Why is it no good. What is the race here? Is it just the oldmask > thing and multiple callers to select, or is it something else? It's racy with a single caller. The race is: program's signal handler sets a flag like "alarm_happened = 1". The program's main loop checks the flag before calling select(). If the signal is delivered before that check, the program doesn't call select() and handles the reason for the flag. If the signal is delivered during select(), that returns EINTR and the program handles the reason for the flag. But if the signal is delivered _between_ checking the flag and calling select(), the program gets stuck. pselect() avoids that stuck state, by blocking the signal before the program checks the flag, and guaranteeing if the signal is delivered after that point, pselect() returns EINTR. It's sort of analogous to pthread_cond_wait() needing a mutex. > And if it is the oldmask thing, why wouldn't multiple callers of > pselect mess it up depending on what order they have. Signal masks are per-thread anyway, multiple callers isn't an issue. -- Jamie ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-06 0:53 ` Jamie Lokier @ 2008-11-06 5:19 ` M. Warner Losh 0 siblings, 0 replies; 28+ messages in thread From: M. Warner Losh @ 2008-11-06 5:19 UTC (permalink / raw) To: jamie; +Cc: jan.kiszka, qemu-devel In message: <20081106005312.GA26173@shareable.org> Jamie Lokier <jamie@shareable.org> writes: : M. Warner Losh wrote: : > : > Which ones have a good kernel implementation of it? FreeBSD's is : > : > currently approximately: : > : > : > : > if (!mask) : > : > _sigprocmask(mask, &oldmask); : > : > /* here */ : > : > select(); : > : > if (!mask) : > : > _sigprocmask(oldmask, NULL); : > : > : > : > I'm assuming that the problem is due to a signal arriving at /* here */. : > : : > : If that's _kernel_ code and the kernel behaves like Linux, it's not a : > : problem because signals don't affect the control flow until returning : > : to userspace, meaning the select() will return EINTR. : > : > It is currently user level code, and I'm looking at moving it into the : > kernel, but I need to understand the race being talked about here. : : Ugh, I had imagined FreeBSD would have got that right, since it's : quite good in other areas. I've added FreeBSD to my blacklist of : broken pselect() implementations, thanks for the info. : : Do you know if FreeBSD's pread() and pwrite() are also thread-unsafe : userspace wrappers using lseek+read/write? They are harder to avoid : when you're looking at high performance code. I haven't looked... : > Why is it no good. What is the race here? Is it just the oldmask : > thing and multiple callers to select, or is it something else? : : It's racy with a single caller. The race is: program's signal handler : sets a flag like "alarm_happened = 1". The program's main loop checks : the flag before calling select(). If the signal is delivered before : that check, the program doesn't call select() and handles the reason : for the flag. If the signal is delivered during select(), that : returns EINTR and the program handles the reason for the flag. But if : the signal is delivered _between_ checking the flag and calling : select(), the program gets stuck. : : pselect() avoids that stuck state, by blocking the signal before the : program checks the flag, and guaranteeing if the signal is delivered : after that point, pselect() returns EINTR. It's sort of analogous to : pthread_cond_wait() needing a mutex. OK. that makes sense. : > And if it is the oldmask thing, why wouldn't multiple callers of : > pselect mess it up depending on what order they have. : : Signal masks are per-thread anyway, multiple callers isn't an issue. OK. I have never had good things happen with signals and threads... Warner ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] Re: [5578] Increase default IO timeout from 10ms to 5s 2008-11-03 21:50 ` Jan Kiszka 2008-11-03 22:00 ` Anthony Liguori @ 2008-11-04 8:29 ` Avi Kivity 1 sibling, 0 replies; 28+ messages in thread From: Avi Kivity @ 2008-11-04 8:29 UTC (permalink / raw) To: qemu-devel Jan Kiszka wrote: > There is a race between the alarm_timer firing SIGALRM and > main_loop_wait reaching the safe harbor of select (with that infamous 5 > second timeout). If the signal comes when already blocked in select, it > will properly resume the latter immediately. But if the timer fired > BEFORE that point, host_alarm_handler will only set a flag that the host > timer has fired, the actual rearming will be done AFTER return from > select. Ooops.... > > So, select should actually include the host timer as event. timerfd? > Unfortunately a recent Linux-only feature :-/. I don't think we can > rearm the timer from within the signal handler, at least not without > running all the pending qemu timers. And that is surely not a signal > handler job (qemu timer handler aren't thread-safe in general). > > Anyone any ideas? /me is thinking a bit more about it as well. > pselect(): 1. run tcg code 2. block signals 3. pselect (enabling signals) 4. process pselect results (signals disabled again) 5. unblock signals 6. goto 1 Prior to Linux 2.6.16, pselect() was racy since it was implemented in userspace rather than the kernel. I don't think it's a problem now. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s 2008-10-31 18:40 [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s Anthony Liguori 2008-10-31 18:52 ` Anthony Liguori @ 2008-10-31 19:41 ` Jamie Lokier 2008-10-31 20:13 ` Anthony Liguori 1 sibling, 1 reply; 28+ messages in thread From: Jamie Lokier @ 2008-10-31 19:41 UTC (permalink / raw) To: qemu-devel Anthony Liguori wrote: > With this patch, I see the number of wake-ups with an idle dynamic > ticks guest drop from 80 per second to about 15 times per second. That's great, it really is, but shouldn't it be less than that? Did you look at the wakeup rate shown by powertop in the guest, to see how many extra wakeups are caused by QEMU internals? -- Jamie ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s 2008-10-31 19:41 ` [Qemu-devel] " Jamie Lokier @ 2008-10-31 20:13 ` Anthony Liguori 0 siblings, 0 replies; 28+ messages in thread From: Anthony Liguori @ 2008-10-31 20:13 UTC (permalink / raw) To: qemu-devel Jamie Lokier wrote: > Anthony Liguori wrote: > >> With this patch, I see the number of wake-ups with an idle dynamic >> ticks guest drop from 80 per second to about 15 times per second. >> > > That's great, it really is, but shouldn't it be less than that? > > Did you look at the wakeup rate shown by powertop in the guest, to see > how many extra wakeups are caused by QEMU internals? > I don't have a guest handy that I could run powertop on, but I know of at least a few timers that can be gotten rid of (like the rtc timers). Regards, Anthony Liguori > -- Jamie > > > ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2008-11-06 5:20 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-10-31 18:40 [Qemu-devel] [5578] Increase default IO timeout from 10ms to 5s Anthony Liguori 2008-10-31 18:52 ` Anthony Liguori 2008-11-02 19:08 ` [Qemu-devel] " Jan Kiszka 2008-11-03 20:04 ` Anthony Liguori 2008-11-03 20:36 ` Jan Kiszka 2008-11-03 21:50 ` Jan Kiszka 2008-11-03 22:00 ` Anthony Liguori 2008-11-03 22:03 ` Jan Kiszka 2008-11-04 8:07 ` andrzej zaborowski 2008-11-04 8:22 ` Jan Kiszka 2008-11-04 8:33 ` andrzej zaborowski 2008-11-04 11:32 ` Jamie Lokier 2008-11-04 16:22 ` M. Warner Losh 2008-11-04 17:10 ` Jan Kiszka 2008-11-04 17:55 ` M. Warner Losh 2008-11-04 19:08 ` Jan Kiszka 2008-11-05 15:00 ` Jamie Lokier 2008-11-05 16:10 ` M. Warner Losh 2008-11-05 18:21 ` Jan Kiszka 2008-11-05 18:41 ` Daniel P. Berrange 2008-11-05 20:16 ` andrzej zaborowski 2008-11-05 20:28 ` Daniel P. Berrange 2008-11-05 23:38 ` Jamie Lokier 2008-11-06 0:53 ` Jamie Lokier 2008-11-06 5:19 ` M. Warner Losh 2008-11-04 8:29 ` Avi Kivity 2008-10-31 19:41 ` [Qemu-devel] " Jamie Lokier 2008-10-31 20:13 ` Anthony Liguori
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).