* [patch] sys_epoll_wait() timeout saga ... @ 2005-09-23 18:13 Davide Libenzi 2005-09-23 18:24 ` Nish Aravamudan 2005-09-24 4:05 ` Willy Tarreau 0 siblings, 2 replies; 33+ messages in thread From: Davide Libenzi @ 2005-09-23 18:13 UTC (permalink / raw) To: Linux Kernel Mailing List; +Cc: Andrew Morton [-- Attachment #1: Type: TEXT/PLAIN, Size: 249 bytes --] The sys_epoll_wait() function was not handling correctly negative timeouts (besides -1), and like sys_poll(), was comparing millisec to secs in testing the upper timeout limit. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> - Davide [-- Attachment #2: Type: TEXT/PLAIN, Size: 475 bytes --] --- a/fs/eventpoll.c 2005-09-23 10:56:57.000000000 -0700 +++ b/fs/eventpoll.c 2005-09-23 11:00:06.000000000 -0700 @@ -1507,7 +1507,7 @@ * and the overflow condition. The passed timeout is in milliseconds, * that why (t * HZ) / 1000. */ - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; retry: ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-23 18:13 [patch] sys_epoll_wait() timeout saga Davide Libenzi @ 2005-09-23 18:24 ` Nish Aravamudan 2005-09-24 4:05 ` Willy Tarreau 1 sibling, 0 replies; 33+ messages in thread From: Nish Aravamudan @ 2005-09-23 18:24 UTC (permalink / raw) To: Davide Libenzi; +Cc: Linux Kernel Mailing List, Andrew Morton On 9/23/05, Davide Libenzi <davidel@xmailserver.org> wrote: > > The sys_epoll_wait() function was not handling correctly negative timeouts > (besides -1), and like sys_poll(), was comparing millisec to secs in > testing the upper timeout limit. > > > Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Looks a lot more correct :) Probably want to eventually convert the code path to be similar to sys_poll(), though? Maybe with a helper to do the converting? I think the epoll code can probable use msecs_to_jiffies() + 1 as well, no? I will wait for your patch to go in and send a patch for these ideas later. Thanks, Nish ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-23 18:13 [patch] sys_epoll_wait() timeout saga Davide Libenzi 2005-09-23 18:24 ` Nish Aravamudan @ 2005-09-24 4:05 ` Willy Tarreau 2005-09-24 4:44 ` Nish Aravamudan 1 sibling, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 4:05 UTC (permalink / raw) To: Davide Libenzi; +Cc: Linux Kernel Mailing List, Andrew Morton Hi Davide, On Fri, Sep 23, 2005 at 11:13:30AM -0700, Davide Libenzi wrote: > > The sys_epoll_wait() function was not handling correctly negative > timeouts (besides -1), and like sys_poll(), was comparing millisec to > secs in testing the upper timeout limit. > > > Signed-off-by: Davide Libenzi <davidel@xmailserver.org> > > > - Davide > --- a/fs/eventpoll.c 2005-09-23 10:56:57.000000000 -0700 > +++ b/fs/eventpoll.c 2005-09-23 11:00:06.000000000 -0700 > @@ -1507,7 +1507,7 @@ > * and the overflow condition. The passed timeout is in milliseconds, > * that why (t * HZ) / 1000. > */ > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; Here, I'm not certain that gcc will optimize the divide. It would be better anyway to write this which is equivalent, and a pure integer comparison : + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; gcc will also spit a warning if the constant is too big for an int, depending on MAX_SCHEDULE_TIMEOUT and HZ, while in the previous case, it would remain silent, and possibly, timeout/1000 would never reach the limit. Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 4:05 ` Willy Tarreau @ 2005-09-24 4:44 ` Nish Aravamudan 2005-09-24 6:15 ` Willy Tarreau 0 siblings, 1 reply; 33+ messages in thread From: Nish Aravamudan @ 2005-09-24 4:44 UTC (permalink / raw) To: Willy Tarreau; +Cc: Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On 9/23/05, Willy Tarreau <willy@w.ods.org> wrote: > Hi Davide, > > On Fri, Sep 23, 2005 at 11:13:30AM -0700, Davide Libenzi wrote: > > > > The sys_epoll_wait() function was not handling correctly negative > > timeouts (besides -1), and like sys_poll(), was comparing millisec to > > secs in testing the upper timeout limit. > > > > > > Signed-off-by: Davide Libenzi <davidel@xmailserver.org> > > > > > > - Davide > > > --- a/fs/eventpoll.c 2005-09-23 10:56:57.000000000 -0700 > > +++ b/fs/eventpoll.c 2005-09-23 11:00:06.000000000 -0700 > > @@ -1507,7 +1507,7 @@ > > * and the overflow condition. The passed timeout is in milliseconds, > > * that why (t * HZ) / 1000. > > */ > > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > Here, I'm not certain that gcc will optimize the divide. It would be better > anyway to write this which is equivalent, and a pure integer comparison : > > + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; Just a question here, maybe it's dumb. * and / have the same priority in the order of operations, yes? If so, won't the the 1000 * MAX_SCHEDULE_TIMEOUT overflow (MAX_SCHEDULE_TIMEOUT is LONG_MAX)? I really think this code just move to the same thing that sys_poll() does to avoid overlflow (I fixed the bug Alexey was experiencing, so I think the changes are safe now). In any case, this code is approaching unreadable with lots of jiffies <--> human-time units manipulations done in non-standard ways, which the updated sys_poll() also tries to avoid. Thanks, Nish ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 4:44 ` Nish Aravamudan @ 2005-09-24 6:15 ` Willy Tarreau 2005-09-24 7:33 ` Vadim Lobanov ` (2 more replies) 0 siblings, 3 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 6:15 UTC (permalink / raw) To: Nish Aravamudan Cc: Willy Tarreau, Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On Fri, Sep 23, 2005 at 09:44:10PM -0700, Nish Aravamudan wrote: > > > * that why (t * HZ) / 1000. > > > */ > > > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > > > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > Here, I'm not certain that gcc will optimize the divide. It would be better > > anyway to write this which is equivalent, and a pure integer comparison : > > > > + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > Just a question here, maybe it's dumb. Your question is not dumb, this code is not trivial at all ! > * and / have the same priority in the order of operations, yes? If so, > won't the the 1000 * MAX_SCHEDULE_TIMEOUT overflow > (MAX_SCHEDULE_TIMEOUT is LONG_MAX)? Yes it can, and that's why I said that gcc should send a warning when comparing an int with something too large for an int. But I should have forced the constant to be evaluated as long long. At the moment, the constant cannot overflow, but it can reach a value so high that timeout/1000 will never reach it. Example : MAX_SCHEDULE_TIMEOUT=LONG_MAX HZ=250 timeout=LONG_MAX-1 => timeout/1000 < MAX_SCHEDULE_TIMEOUT/HZ but (timeout * HZ + 999) / 1000 will still overflow ! So I finally think that the safest test would be to avoid the timeout range which can overflow in the computation, using something like this (but which will limit the timeout to 49 days on HZ=1000 machines) : + jtimeout = timeout < 0 || \ + timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || \ + timeout >= (LONG_MAX / HZ - 1000) ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; as both are constants, they can be optimized. Otherwise, we can resort to using a MAX() macro to reduce this to only one test which will catch all corner cases. > I really think this code just move > to the same thing that sys_poll() does to avoid overlflow (I fixed the > bug Alexey was experiencing, so I think the changes are safe now). I'm not totally certain that all overflows are avoided, see above. If you play with timeout values close to LONG_MAX / HZ, you're still not caught by the test and can overflow in the multiply. > In any case, this code is approaching unreadable with lots of jiffies > <--> human-time units manipulations done in non-standard ways, which > the updated sys_poll() also tries to avoid. I've not checked sys_poll(), but I agree with you that it's rather difficult to imagine all corner cases this way. Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 6:15 ` Willy Tarreau @ 2005-09-24 7:33 ` Vadim Lobanov 2005-09-24 7:51 ` Willy Tarreau 2005-09-24 15:10 ` Davide Libenzi 2005-09-24 17:19 ` Nishanth Aravamudan 2 siblings, 1 reply; 33+ messages in thread From: Vadim Lobanov @ 2005-09-24 7:33 UTC (permalink / raw) To: Willy Tarreau Cc: Nish Aravamudan, Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On Sat, 24 Sep 2005, Willy Tarreau wrote: > On Fri, Sep 23, 2005 at 09:44:10PM -0700, Nish Aravamudan wrote: > > > > * that why (t * HZ) / 1000. > > > > */ > > > > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > > > > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > > > Here, I'm not certain that gcc will optimize the divide. It would be better > > > anyway to write this which is equivalent, and a pure integer comparison : > > > > > > + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > Just a question here, maybe it's dumb. > > Your question is not dumb, this code is not trivial at all ! > > > * and / have the same priority in the order of operations, yes? If so, > > won't the the 1000 * MAX_SCHEDULE_TIMEOUT overflow > > (MAX_SCHEDULE_TIMEOUT is LONG_MAX)? > > Yes it can, and that's why I said that gcc should send a warning when > comparing an int with something too large for an int. But I should have > forced the constant to be evaluated as long long. At the moment, the > constant cannot overflow, but it can reach a value so high that > timeout/1000 will never reach it. Example : > MAX_SCHEDULE_TIMEOUT=LONG_MAX > HZ=250 > timeout=LONG_MAX-1 > => timeout/1000 < MAX_SCHEDULE_TIMEOUT/HZ > but (timeout * HZ + 999) / 1000 will still overflow ! > > So I finally think that the safest test would be to avoid the timeout > range which can overflow in the computation, using something like this > (but which will limit the timeout to 49 days on HZ=1000 machines) : > > + jtimeout = timeout < 0 || \ > + timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || \ > + timeout >= (LONG_MAX / HZ - 1000) ? > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; It seems that we can make the second overflow test be less strict by doing the following instead: timeout >= (LONG_MAX - 1000) / HZ Unless I'm confused. :-) > as both are constants, they can be optimized. Otherwise, we can resort to > using a MAX() macro to reduce this to only one test which will catch all > corner cases. > > > I really think this code just move > > to the same thing that sys_poll() does to avoid overlflow (I fixed the > > bug Alexey was experiencing, so I think the changes are safe now). > > I'm not totally certain that all overflows are avoided, see above. If > you play with timeout values close to LONG_MAX / HZ, you're still not > caught by the test and can overflow in the multiply. > > > In any case, this code is approaching unreadable with lots of jiffies > > <--> human-time units manipulations done in non-standard ways, which > > the updated sys_poll() also tries to avoid. > > I've not checked sys_poll(), but I agree with you that it's rather > difficult to imagine all corner cases this way. > > Regards, > Willy > -Vadim Lobanov ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 7:33 ` Vadim Lobanov @ 2005-09-24 7:51 ` Willy Tarreau 0 siblings, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 7:51 UTC (permalink / raw) To: Vadim Lobanov Cc: Nish Aravamudan, Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On Sat, Sep 24, 2005 at 12:33:05AM -0700, Vadim Lobanov wrote: > On Sat, 24 Sep 2005, Willy Tarreau wrote: > > > On Fri, Sep 23, 2005 at 09:44:10PM -0700, Nish Aravamudan wrote: > > > > > * that why (t * HZ) / 1000. > > > > > */ > > > > > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > > > > > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > > > > > Here, I'm not certain that gcc will optimize the divide. It would be better > > > > anyway to write this which is equivalent, and a pure integer comparison : > > > > > > > > + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > > > Just a question here, maybe it's dumb. > > > > Your question is not dumb, this code is not trivial at all ! > > > > > * and / have the same priority in the order of operations, yes? If so, > > > won't the the 1000 * MAX_SCHEDULE_TIMEOUT overflow > > > (MAX_SCHEDULE_TIMEOUT is LONG_MAX)? > > > > Yes it can, and that's why I said that gcc should send a warning when > > comparing an int with something too large for an int. But I should have > > forced the constant to be evaluated as long long. At the moment, the > > constant cannot overflow, but it can reach a value so high that > > timeout/1000 will never reach it. Example : > > MAX_SCHEDULE_TIMEOUT=LONG_MAX > > HZ=250 > > timeout=LONG_MAX-1 > > => timeout/1000 < MAX_SCHEDULE_TIMEOUT/HZ > > but (timeout * HZ + 999) / 1000 will still overflow ! > > > > So I finally think that the safest test would be to avoid the timeout > > range which can overflow in the computation, using something like this > > (but which will limit the timeout to 49 days on HZ=1000 machines) : > > > > + jtimeout = timeout < 0 || \ > > + timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || \ > > + timeout >= (LONG_MAX / HZ - 1000) ? > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > It seems that we can make the second overflow test be less strict by > doing the following instead: > timeout >= (LONG_MAX - 1000) / HZ > Unless I'm confused. :-) oops, you're right. Then it produces the following patch : diff -purN linux-2.6.13/fs/eventpoll.c linux-2.6.13-epoll/fs/eventpoll.c --- linux-2.6.13/fs/eventpoll.c Sun Sep 11 08:25:26 2005 +++ linux-2.6.13-epoll/fs/eventpoll.c Sat Sep 24 09:49:43 2005 @@ -1504,9 +1504,12 @@ static int ep_poll(struct eventpoll *ep, /* * Calculate the timeout by checking for the "infinite" value ( -1 ) * and the overflow condition. The passed timeout is in milliseconds, - * that why (t * HZ) / 1000. + * that why (t * HZ) / 1000. Note that we also want to avoid an + * overflow in the multiply. */ - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? + jtimeout = timeout < 0 || + timeout > (MAX_SCHEDULE_TIMEOUT * 1000ULL / HZ) || + timeout > (LONG_MAX - 1000) / HZ ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; retry: Interestingly, as long as MAC_SCHEDULE_TIMEOUT == LONG_MAX, the check is identical to the initial one (and does not add any divide) ! Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 6:15 ` Willy Tarreau 2005-09-24 7:33 ` Vadim Lobanov @ 2005-09-24 15:10 ` Davide Libenzi 2005-09-24 17:20 ` Willy Tarreau 2005-09-24 17:19 ` Nishanth Aravamudan 2 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2005-09-24 15:10 UTC (permalink / raw) To: Willy Tarreau; +Cc: Nish Aravamudan, Linux Kernel Mailing List, Andrew Morton On Sat, 24 Sep 2005, Willy Tarreau wrote: > Yes it can, and that's why I said that gcc should send a warning when > comparing an int with something too large for an int. But I should have > forced the constant to be evaluated as long long. At the moment, the > constant cannot overflow, but it can reach a value so high that > timeout/1000 will never reach it. Example : > MAX_SCHEDULE_TIMEOUT=LONG_MAX > HZ=250 > timeout=LONG_MAX-1 > => timeout/1000 < MAX_SCHEDULE_TIMEOUT/HZ > but (timeout * HZ + 999) / 1000 will still overflow ! > > So I finally think that the safest test would be to avoid the timeout > range which can overflow in the computation, using something like this > (but which will limit the timeout to 49 days on HZ=1000 machines) : > > + jtimeout = timeout < 0 || \ > + timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || \ > + timeout >= (LONG_MAX / HZ - 1000) ? > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > as both are constants, they can be optimized. Otherwise, we can resort to > using a MAX() macro to reduce this to only one test which will catch all > corner cases. Using the MIN() macro would be better so we have a single check, and the compiler optimize that automatically. Or we can force 'timeout * HZ' to use ULL math. I don't think it makes a lot of difference for something that is in a likely sleep path ;) - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 15:10 ` Davide Libenzi @ 2005-09-24 17:20 ` Willy Tarreau 2005-09-24 18:19 ` Davide Libenzi 0 siblings, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 17:20 UTC (permalink / raw) To: Davide Libenzi Cc: Willy Tarreau, Nish Aravamudan, Linux Kernel Mailing List, Andrew Morton On Sat, Sep 24, 2005 at 08:10:32AM -0700, Davide Libenzi wrote: > >+ jtimeout = timeout < 0 || \ > >+ timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || > >\ > >+ timeout >= (LONG_MAX / HZ - 1000) ? > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > >as both are constants, they can be optimized. Otherwise, we can resort to > >using a MAX() macro to reduce this to only one test which will catch all > >corner cases. > > Using the MIN() macro would be better so we have a single check, and the > compiler optimize that automatically. you're right, it's MIN() not MAX() ;-) Anyway, I've checked the code and the compiler does a single test with -O2. > Or we can force 'timeout * HZ' to use ULL math. I don't think it makes a lot of difference for something that is in a likely sleep path ;) "likely", yes, but not necessarily. Under a high load, you can have enough events queued so that epoll() will not wait at all. I've already encountered such cases during benchmarks, and I noticed that epoll() took more time than select() for small numbers of FDs (something like 20% below 100 FDs), but of course, it is considerably faster above. So turning the multiply to an ULL may increase this overhead on some architectures, while the double check will leave the code identical. Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 17:20 ` Willy Tarreau @ 2005-09-24 18:19 ` Davide Libenzi 2005-09-25 6:05 ` Andrew Morton 0 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2005-09-24 18:19 UTC (permalink / raw) To: Willy Tarreau; +Cc: Nish Aravamudan, Linux Kernel Mailing List, Andrew Morton [-- Attachment #1: Type: TEXT/PLAIN, Size: 1851 bytes --] On Sat, 24 Sep 2005, Willy Tarreau wrote: > On Sat, Sep 24, 2005 at 08:10:32AM -0700, Davide Libenzi wrote: >>> + jtimeout = timeout < 0 || \ >>> + timeout >= (1000ULL * MAX_SCHEDULE_TIMEOUT / HZ) || >>> \ >>> + timeout >= (LONG_MAX / HZ - 1000) ? >>> MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; >>> >>> as both are constants, they can be optimized. Otherwise, we can resort to >>> using a MAX() macro to reduce this to only one test which will catch all >>> corner cases. >> >> Using the MIN() macro would be better so we have a single check, and the >> compiler optimize that automatically. > > you're right, it's MIN() not MAX() ;-) > Anyway, I've checked the code and the compiler does a single test with -O2. > >> Or we can force 'timeout * HZ' to use ULL math. I don't think it makes a lot of difference for something that is in a likely sleep path ;) > > "likely", yes, but not necessarily. Under a high load, you can have enough > events queued so that epoll() will not wait at all. I've already encountered > such cases during benchmarks, and I noticed that epoll() took more time than > select() for small numbers of FDs (something like 20% below 100 FDs), but of > course, it is considerably faster above. So turning the multiply to an ULL > may increase this overhead on some architectures, while the double check > will leave the code identical. The attached patch uses the kernel min() macro, that is optimized has single compare by gcc-O2. Andrew, this goes over (hopefully ;) the bits you already have in -mm. PS: It might be possible to move the currently epoll-local EP_MAX_MSTIMEO macro, to a MAX_LONG_MSTIMEO (or whatever name you like) somewhere in the tree, to be used where needed. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> - Davide [-- Attachment #2: Type: TEXT/plain, Size: 799 bytes --] --- a/fs/eventpoll.c 2005-09-24 11:07:04.000000000 -0700 +++ b/fs/eventpoll.c 2005-09-24 11:11:06.000000000 -0700 @@ -101,6 +101,10 @@ /* Maximum number of poll wake up nests we are allowing */ #define EP_MAX_POLLWAKE_NESTS 4 +/* Maximum msec timeout value storeable in a long int */ +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) + + struct epoll_filefd { struct file *file; int fd; @@ -1507,8 +1511,7 @@ * and the overflow condition. The passed timeout is in milliseconds, * that why (t * HZ) / 1000. */ - jtimeout = (timeout < 0 || - (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ)) ? + jtimeout = (timeout < 0 || timeout >= EP_MAX_MSTIMEO) ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; retry: ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 18:19 ` Davide Libenzi @ 2005-09-25 6:05 ` Andrew Morton 2005-09-25 6:20 ` Willy Tarreau 2005-09-25 7:08 ` Vadim Lobanov 0 siblings, 2 replies; 33+ messages in thread From: Andrew Morton @ 2005-09-25 6:05 UTC (permalink / raw) To: Davide Libenzi; +Cc: willy, nish.aravamudan, linux-kernel Davide Libenzi <davidel@xmailserver.org> wrote: > > The attached patch uses the kernel min() macro, that is optimized has > single compare by gcc-O2. Andrew, this goes over (hopefully ;) the bits > you already have in -mm. OK, well I've rather lost the plot with all the patches flying around. I now have one single patch against epoll.c, below. Please confirm that this is what was intended. If not, I'll drop it and let's start again. From: Davide Libenzi <davidel@xmailserver.org> The sys_epoll_wait() function was not handling correctly negative timeouts (besides -1), and like sys_poll(), was comparing millisec to secs in testing the upper timeout limit. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@osdl.org> --- fs/eventpoll.c | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff -puN fs/eventpoll.c~sys_epoll_wait-fix-handling-of-negative-timeouts fs/eventpoll.c --- devel/fs/eventpoll.c~sys_epoll_wait-fix-handling-of-negative-timeouts 2005-09-24 23:01:00.000000000 -0700 +++ devel-akpm/fs/eventpoll.c 2005-09-24 23:02:50.000000000 -0700 @@ -101,6 +101,10 @@ /* Maximum number of poll wake up nests we are allowing */ #define EP_MAX_POLLWAKE_NESTS 4 +/* Maximum msec timeout value storeable in a long int */ +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) + + struct epoll_filefd { struct file *file; int fd; @@ -1506,8 +1510,8 @@ static int ep_poll(struct eventpoll *ep, * and the overflow condition. The passed timeout is in milliseconds, * that why (t * HZ) / 1000. */ - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? - MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; + jtimeout = (timeout < 0 || timeout >= EP_MAX_MSTIMEO) ? + MAX_SCHEDULE_TIMEOUT : (timeout * HZ + 999) / 1000; retry: write_lock_irqsave(&ep->lock, flags); _ ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-25 6:05 ` Andrew Morton @ 2005-09-25 6:20 ` Willy Tarreau 2005-09-25 6:32 ` Andrew Morton 2005-09-25 7:08 ` Vadim Lobanov 1 sibling, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-25 6:20 UTC (permalink / raw) To: Andrew Morton; +Cc: Davide Libenzi, willy, nish.aravamudan, linux-kernel Hi Andrew, On Sat, Sep 24, 2005 at 11:05:45PM -0700, Andrew Morton wrote: > Davide Libenzi <davidel@xmailserver.org> wrote: > > > > The attached patch uses the kernel min() macro, that is optimized has > > single compare by gcc-O2. Andrew, this goes over (hopefully ;) the bits > > you already have in -mm. > > OK, well I've rather lost the plot with all the patches flying around. > > I now have one single patch against epoll.c, below. Please confirm that > this is what was intended. If not, I'll drop it and let's start again. Well, this one is fine. However, since there are possible overflows in msec_to_jiffies() that I fixed in another patch, it would be cleaner to only rely on this function's ability to detect overflows. Moreover, using this function removes the divide for most common values of HZ, which is interesting on some archs. As Davide mentionned it, the EP_MAX_MSTIMEO constant would be useful for other parts of the kernel (starting with sys_poll()). That's also what my fix to jiffies.h does. If you don't feel comfortable with so much changes at once, it's better to first apply Davide's fix, then later the jiffies one, and then after, I could resend the ep_poll() and sys_poll() patches to use the functions in jiffies.h instead of doing their own magics. Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-25 6:20 ` Willy Tarreau @ 2005-09-25 6:32 ` Andrew Morton 0 siblings, 0 replies; 33+ messages in thread From: Andrew Morton @ 2005-09-25 6:32 UTC (permalink / raw) To: Willy Tarreau; +Cc: davidel, willy, nish.aravamudan, linux-kernel Willy Tarreau <willy@w.ods.org> wrote: > > If you don't feel comfortable with so much changes at once, it's better > to first apply Davide's fix, then later the jiffies one, and then after, > I could resend the ep_poll() and sys_poll() patches to use the functions > in jiffies.h instead of doing their own magics. That works for me. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-25 6:05 ` Andrew Morton 2005-09-25 6:20 ` Willy Tarreau @ 2005-09-25 7:08 ` Vadim Lobanov 2005-09-25 8:03 ` Willy Tarreau 1 sibling, 1 reply; 33+ messages in thread From: Vadim Lobanov @ 2005-09-25 7:08 UTC (permalink / raw) To: Andrew Morton, Davide Libenzi; +Cc: willy, nish.aravamudan, linux-kernel On Sat, 24 Sep 2005, Andrew Morton wrote: > Davide Libenzi <davidel@xmailserver.org> wrote: > > > > The attached patch uses the kernel min() macro, that is optimized has > > single compare by gcc-O2. Andrew, this goes over (hopefully ;) the bits > > you already have in -mm. > > OK, well I've rather lost the plot with all the patches flying around. > > I now have one single patch against epoll.c, below. Please confirm that > this is what was intended. If not, I'll drop it and let's start again. > I hate to be the squeaky wheel here, but the attached patch is not 100% right - > > From: Davide Libenzi <davidel@xmailserver.org> > > The sys_epoll_wait() function was not handling correctly negative timeouts > (besides -1), and like sys_poll(), was comparing millisec to secs in > testing the upper timeout limit. > > Signed-off-by: Davide Libenzi <davidel@xmailserver.org> > Signed-off-by: Andrew Morton <akpm@osdl.org> > --- > > fs/eventpoll.c | 8 ++++++-- > 1 files changed, 6 insertions(+), 2 deletions(-) > > diff -puN fs/eventpoll.c~sys_epoll_wait-fix-handling-of-negative-timeouts fs/eventpoll.c > --- devel/fs/eventpoll.c~sys_epoll_wait-fix-handling-of-negative-timeouts 2005-09-24 23:01:00.000000000 -0700 > +++ devel-akpm/fs/eventpoll.c 2005-09-24 23:02:50.000000000 -0700 > @@ -101,6 +101,10 @@ > /* Maximum number of poll wake up nests we are allowing */ > #define EP_MAX_POLLWAKE_NESTS 4 > > +/* Maximum msec timeout value storeable in a long int */ > +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) This should instead be: #define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, (LONG_MAX - 999ULL) / HZ) Here's why: We want to avoid overflow of (timeout * HZ + 999), or, in other words, the case where (timeout * HZ + 999) >= LONG_MAX Unwrapping the equation, we get timeout >= (LONG_MAX - 999) / HZ The original code isn't _wrong_, but more restrictive than it should be. In any case, better to fix up the base patch now, before all the other patches go in. I could do this, or Davide can... it's all good. :-) > + > + > struct epoll_filefd { > struct file *file; > int fd; > @@ -1506,8 +1510,8 @@ static int ep_poll(struct eventpoll *ep, > * and the overflow condition. The passed timeout is in milliseconds, > * that why (t * HZ) / 1000. > */ > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > - MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > + jtimeout = (timeout < 0 || timeout >= EP_MAX_MSTIMEO) ? > + MAX_SCHEDULE_TIMEOUT : (timeout * HZ + 999) / 1000; > > retry: > write_lock_irqsave(&ep->lock, flags); > _ > > - -Vadim Lobanov ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-25 7:08 ` Vadim Lobanov @ 2005-09-25 8:03 ` Willy Tarreau 0 siblings, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-25 8:03 UTC (permalink / raw) To: Vadim Lobanov Cc: Andrew Morton, Davide Libenzi, nish.aravamudan, linux-kernel Hi, On Sun, Sep 25, 2005 at 12:08:03AM -0700, Vadim Lobanov wrote: (...) > > +/* Maximum msec timeout value storeable in a long int */ > > +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) > > This should instead be: > #define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, (LONG_MAX - 999ULL) / HZ) > Here's why: > We want to avoid overflow of (timeout * HZ + 999), or, in other words, > the case where (timeout * HZ + 999) >= LONG_MAX > Unwrapping the equation, we get timeout >= (LONG_MAX - 999) / HZ > > The original code isn't _wrong_, but more restrictive than it should be. > In any case, better to fix up the base patch now, before all the other > patches go in. I could do this, or Davide can... it's all good. :-) I think it's because with the numerous changes we brought, the '>' test became '>=' but the old timeout was still used with '>'. With '>=', I agree with you that it must be -999. Andrew, Vadim is right. Anyway, this proves why we must really move all those complicated tests to jiffies.h ASAP ! BTW, Andrew, could you merge the jiffies fix before -mm3, so that we can remove those annoying tests quickly ? Thanks in advance, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 6:15 ` Willy Tarreau 2005-09-24 7:33 ` Vadim Lobanov 2005-09-24 15:10 ` Davide Libenzi @ 2005-09-24 17:19 ` Nishanth Aravamudan 2005-09-24 18:25 ` Davide Libenzi 2005-09-24 18:30 ` Willy Tarreau 2 siblings, 2 replies; 33+ messages in thread From: Nishanth Aravamudan @ 2005-09-24 17:19 UTC (permalink / raw) To: Willy Tarreau Cc: Nish Aravamudan, Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On 24.09.2005 [08:15:00 +0200], Willy Tarreau wrote: > On Fri, Sep 23, 2005 at 09:44:10PM -0700, Nish Aravamudan wrote: > > > > * that why (t * HZ) / 1000. > > > > */ > > > > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > > > > + jtimeout = timeout < 0 || (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ) ? > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > > > Here, I'm not certain that gcc will optimize the divide. It would be better > > > anyway to write this which is equivalent, and a pure integer comparison : > > > > > > + jtimeout = timeout < 0 || timeout >= 1000 * MAX_SCHEDULE_TIMEOUT / HZ ? > > > > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > > > Just a question here, maybe it's dumb. > > Your question is not dumb, this code is not trivial at all ! > > > * and / have the same priority in the order of operations, yes? If so, > > won't the the 1000 * MAX_SCHEDULE_TIMEOUT overflow > > (MAX_SCHEDULE_TIMEOUT is LONG_MAX)? > > Yes it can, and that's why I said that gcc should send a warning when > comparing an int with something too large for an int. But I should have > forced the constant to be evaluated as long long. At the moment, the > constant cannot overflow, but it can reach a value so high that > timeout/1000 will never reach it. Example : Ok, makes sense! <snip> > > In any case, this code is approaching unreadable with lots of jiffies > > <--> human-time units manipulations done in non-standard ways, which > > the updated sys_poll() also tries to avoid. > > I've not checked sys_poll(), but I agree with you that it's rather > difficult to imagine all corner cases this way. Here is the patch I was referring to, adapted to the epoll use of schedule_timeout(). A similar fix is in -mm for almost identical code in sys_poll(). Description: The @timeout parameter to ep_poll() is in milliseconds but we compare it to (MAX_SCHEDULE_TIMEOUT - 1000 / HZ), which is (jiffies/jiffies-per-sec) or seconds. That seems blatantly broken. This led to improper overflow checking for @timeout. As Andrew Morton pointed out in a similar fix for sys_poll(), the best solution is to to check for potential overflow first, then either select an indefinite value or convert @timeout. To achieve this and clean-up the code, change the prototype of the ep_poll() to make it clear that the parameter is in milliseconds and rename jtimeout to timeout_jiffies to hold the corresonding jiffies value. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- fs/eventpoll.c | 46 ++++++++++++++++++++++++++++++++++++++-------- 1 files changed, 38 insertions(+), 8 deletions(-) diff -urpN 2.6.14-rc2-mm1/fs/eventpoll.c 2.6.14-rc2-mm1-dev/fs/eventpoll.c --- 2.6.14-rc2-mm1/fs/eventpoll.c 2005-09-21 23:50:28.000000000 -0700 +++ 2.6.14-rc2-mm1-dev/fs/eventpoll.c 2005-09-24 10:07:51.000000000 -0700 @@ -260,7 +260,7 @@ static int ep_events_transfer(struct eve struct epoll_event __user *events, int maxevents); static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, - int maxevents, long timeout); + int maxevents, long timeout_msecs); static int eventpollfs_delete_dentry(struct dentry *dentry); static struct inode *ep_eventpoll_inode(void); static struct super_block *eventpollfs_get_sb(struct file_system_type *fs_type, @@ -1494,11 +1494,12 @@ static int ep_events_transfer(struct eve static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, - int maxevents, long timeout) + int maxevents, long timeout_msecs) { int res, eavail; + int overflow; unsigned long flags; - long jtimeout; + long timeout_jiffies; wait_queue_t wait; /* @@ -1506,8 +1507,36 @@ static int ep_poll(struct eventpoll *ep, * and the overflow condition. The passed timeout is in milliseconds, * that why (t * HZ) / 1000. */ - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? - MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; + + if (timeout_msecs) { + /* + * We compare HZ with 1000 to work out which side of the + * expression needs conversion. Because we want to + * avoid converting any value to a numerically higher + * value, which could overflow. + */ +#if HZ > 1000 + overflow = timeout_msecs >= + jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT); +#else + overflow = msecs_to_jiffies(timeout_msecs) >= + MAX_SCHEDULE_TIMEOUT; +#endif + + /* + * If we would overflow in the conversion or a negative + * timeout is requested, sleep indefinitely. + */ + if (overflow || timeout_msecs < 0) + timeout_jiffies = MAX_SCHEDULE_TIMEOUT; + else + timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1; + } else { + /* + * 0 millisecond requests become 0 jiffy requests + */ + timeout_jiffies = 0; + } retry: write_lock_irqsave(&ep->lock, flags); @@ -1529,7 +1558,7 @@ retry: * to TASK_INTERRUPTIBLE before doing the checks. */ set_current_state(TASK_INTERRUPTIBLE); - if (!list_empty(&ep->rdllist) || !jtimeout) + if (!list_empty(&ep->rdllist) || !timeout_jiffies) break; if (signal_pending(current)) { res = -EINTR; @@ -1537,7 +1566,7 @@ retry: } write_unlock_irqrestore(&ep->lock, flags); - jtimeout = schedule_timeout(jtimeout); + timeout_jiffies = schedule_timeout(timeout_jiffies); write_lock_irqsave(&ep->lock, flags); } remove_wait_queue(&ep->wq, &wait); @@ -1556,7 +1585,8 @@ retry: * more luck. */ if (!res && eavail && - !(res = ep_events_transfer(ep, events, maxevents)) && jtimeout) + !(res = ep_events_transfer(ep, events, maxevents)) && + timeout_jiffies) goto retry; return res; ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 17:19 ` Nishanth Aravamudan @ 2005-09-24 18:25 ` Davide Libenzi 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau 2005-09-24 21:25 ` [patch] sys_epoll_wait() timeout saga Vadim Lobanov 2005-09-24 18:30 ` Willy Tarreau 1 sibling, 2 replies; 33+ messages in thread From: Davide Libenzi @ 2005-09-24 18:25 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Willy Tarreau, Nish Aravamudan, Linux Kernel Mailing List, Andrew Morton On Sat, 24 Sep 2005, Nishanth Aravamudan wrote: > Description: The @timeout parameter to ep_poll() is in milliseconds but > we compare it to (MAX_SCHEDULE_TIMEOUT - 1000 / HZ), which is > (jiffies/jiffies-per-sec) or seconds. That seems blatantly broken. This > led to improper overflow checking for @timeout. As Andrew Morton pointed > out in a similar fix for sys_poll(), the best solution is to to check > for potential overflow first, then either select an indefinite value or > convert @timeout. > > To achieve this and clean-up the code, change the prototype of the > ep_poll() to make it clear that the parameter is in milliseconds and > rename jtimeout to timeout_jiffies to hold the corresonding jiffies > value. > > fs/eventpoll.c | 46 ++++++++++++++++++++++++++++++++++++++-------- > 1 files changed, 38 insertions(+), 8 deletions(-) Eh? "To achieve this and clean-up the code"? :) - Davide eventpoll.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) --- a/fs/eventpoll.c 2005-09-24 11:07:04.000000000 -0700 +++ b/fs/eventpoll.c 2005-09-24 11:11:06.000000000 -0700 @@ -101,6 +101,10 @@ /* Maximum number of poll wake up nests we are allowing */ #define EP_MAX_POLLWAKE_NESTS 4 +/* Maximum msec timeout value storeable in a long int */ +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) + + struct epoll_filefd { struct file *file; int fd; @@ -1507,8 +1511,7 @@ * and the overflow condition. The passed timeout is in milliseconds, * that why (t * HZ) / 1000. */ - jtimeout = (timeout < 0 || - (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ)) ? + jtimeout = (timeout < 0 || timeout >= EP_MAX_MSTIMEO) ? MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; retry: ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() 2005-09-24 18:25 ` Davide Libenzi @ 2005-09-24 19:38 ` Willy Tarreau 2005-09-24 19:44 ` [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() Willy Tarreau ` (4 more replies) 2005-09-24 21:25 ` [patch] sys_epoll_wait() timeout saga Vadim Lobanov 1 sibling, 5 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 19:38 UTC (permalink / raw) To: Davide Libenzi, Andrew Morton Cc: Nishanth Aravamudan, Willy Tarreau, Nish Aravamudan, Linux Kernel Mailing List Hello, After the discussion around epoll() timeout, I noticed that the functions used to detect the timeout could themselves overflow for some values of HZ. So I decided to fix them by defining a macro which represents the maximal acceptable argument which is guaranteed not to overflow. As an added bonus, those functions can now be used in poll() and ep_poll() and remove the divide if HZ == 1000, or replace it with a shift if (1000 % HZ) or (HZ % 1000) is a power of two. Patches against 2.6.14-rc2-mm1 sent as replies to this mail. Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau @ 2005-09-24 19:44 ` Willy Tarreau 2005-09-29 9:43 ` Andrew Morton 2005-09-24 19:47 ` [PATCH 2/3] 2.6.14-rc2-mm1: fixes for overflow in epoll() Willy Tarreau ` (3 subsequent siblings) 4 siblings, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 19:44 UTC (permalink / raw) To: Davide Libenzi, Andrew Morton Cc: Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List On Sat, Sep 24, 2005 at 09:38:39PM +0200, Willy Tarreau wrote: msecs_to_jiffies() can compute a wrong timeout limit for some values of HZ : #define MAX_JIFFY_OFFSET ((~0UL >> 1)-1) static inline unsigned int jiffies_to_msecs(const unsigned long j) { #if HZ <= 1000 && !(1000 % HZ) return (1000 / HZ) * j; #elif HZ > 1000 && !(HZ % 1000) return (j + (HZ / 1000) - 1)/(HZ / 1000); #else return (j * 1000) / HZ; #endif } static inline unsigned long msecs_to_jiffies(const unsigned int m) { if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) return MAX_JIFFY_OFFSET; #if HZ <= 1000 && !(1000 % HZ) return (m + (1000 / HZ) - 1) / (1000 / HZ); #elif HZ > 1000 && !(HZ % 1000) return m * (HZ / 1000); #else return (m * HZ + 999) / 1000; #endif } Calling jiffies_to_msecs(MAX_JIFFY_OFFSET) overflows if 1000/HZ >=2 or if (1000 % HZ != 0 || HZ % 1000 != 0). Eg: for HZ=300, the max is set to 0xda7406 while it is set to 0xfffffff8 for HZ=250 ! The following patch implements two macros MAX_MSEC_OFFSET and MAX_USEC_OFFSET which are used to fix msecs_to_jiffies() and usecs_to_jiffies() respectively. They are defined depending on the operation which will be performed by the function, so that they try to limit to the absolute maximum acceptable value. If someone has better naming, feel free to update the patch. This patch applies to 2.6.14-rc2 and 2.6.14-rc2-mm1. Please review and apply. Signed-off-by: Willy Tarreau <willy@w.ods.org> - Willy diff -purN linux-2.6.14-rc2-mm1/include/linux/jiffies.h linux-2.6.14-rc2-mm1-jiffies/include/linux/jiffies.h --- linux-2.6.14-rc2-mm1/include/linux/jiffies.h Sat Sep 24 21:11:51 2005 +++ linux-2.6.14-rc2-mm1-jiffies/include/linux/jiffies.h Sat Sep 24 21:17:28 2005 @@ -246,6 +246,37 @@ static inline u64 get_jiffies_64(void) #endif + +/* + * We define MAX_MSEC_OFFSET and MAX_USEC_OFFSET as maximal values that can be + * accepted by msecs_to_jiffies() and usec_to_jiffies() respectively, without + * risking a multiply overflow. Those functions return MAX_JIFFY_OFFSET for + * arguments above those values. + */ + +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) +# define MAX_MSEC_OFFSET \ + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) +#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) +# define MAX_MSEC_OFFSET \ + ULONG_MAX / (HZ / MSEC_PER_SEC) +#else +# define MAX_MSEC_OFFSET \ + (ULONG_MAX - (MSEC_PER_SEC - 1)) / HZ +#endif + +#if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ) +# define MAX_USEC_OFFSET \ + (ULONG_MAX - (USEC_PER_SEC / HZ) + 1) +#elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC) +# define MAX_USEC_OFFSET \ + ULONG_MAX / (HZ / USEC_PER_SEC) +#else +# define MAX_USEC_OFFSET \ + (ULONG_MAX - (USEC_PER_SEC - 1)) / HZ +#endif + + /* * Convert jiffies to milliseconds and back. * @@ -276,7 +307,7 @@ static inline unsigned int jiffies_to_us static inline unsigned long msecs_to_jiffies(const unsigned int m) { - if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) + if (m > MAX_MSEC_OFFSET) return MAX_JIFFY_OFFSET; #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ); @@ -289,7 +320,7 @@ static inline unsigned long msecs_to_jif static inline unsigned long usecs_to_jiffies(const unsigned int u) { - if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET)) + if (u > MAX_USEC_OFFSET) return MAX_JIFFY_OFFSET; #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ) return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ); ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-24 19:44 ` [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() Willy Tarreau @ 2005-09-29 9:43 ` Andrew Morton 2005-09-29 19:41 ` Willy Tarreau 0 siblings, 1 reply; 33+ messages in thread From: Andrew Morton @ 2005-09-29 9:43 UTC (permalink / raw) To: Willy Tarreau; +Cc: davidel, nacc, nish.aravamudan, linux-kernel Willy Tarreau <willy@w.ods.org> wrote: > > +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) > +# define MAX_MSEC_OFFSET \ > + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) That generates numbers which don't fit into unsigned ints, yielding vast numbers of include/linux/jiffies.h: In function `msecs_to_jiffies': include/linux/jiffies.h:310: warning: comparison is always false due to limited range of data type include/linux/jiffies.h: In function `usecs_to_jiffies': include/linux/jiffies.h:323: warning: comparison is always false due to limited range of data type ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-29 9:43 ` Andrew Morton @ 2005-09-29 19:41 ` Willy Tarreau 2005-09-29 19:52 ` Andrew Morton 0 siblings, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-29 19:41 UTC (permalink / raw) To: Andrew Morton; +Cc: davidel, nacc, nish.aravamudan, linux-kernel Thanks Andrew, I'm very sorry because I have verified the code with gcc-2.95.3, gcc-3.3.6 and gcc-3.4.4 on x86 and alpha to ensure that everything went smooth on archs where sizeof(long) > sizeof(int). But I've tested all the combinations in user-space for obvious ease of validation. I believe I forgot to use -Wall. What architecture gave you this, and with which compiler please ? I'm willing to fix this as soon as I can understand the root of the problem. Regards, Willy On Thu, Sep 29, 2005 at 02:43:12AM -0700, Andrew Morton wrote: > Willy Tarreau <willy@w.ods.org> wrote: > > > > +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) > > +# define MAX_MSEC_OFFSET \ > > + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) > > That generates numbers which don't fit into unsigned ints, yielding vast > numbers of > > include/linux/jiffies.h: In function `msecs_to_jiffies': > include/linux/jiffies.h:310: warning: comparison is always false due to limited range of data type > include/linux/jiffies.h: In function `usecs_to_jiffies': > include/linux/jiffies.h:323: warning: comparison is always false due to limited range of data type > ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-29 19:41 ` Willy Tarreau @ 2005-09-29 19:52 ` Andrew Morton 2005-09-29 20:55 ` Willy Tarreau 2005-10-01 17:39 ` Willy Tarreau 0 siblings, 2 replies; 33+ messages in thread From: Andrew Morton @ 2005-09-29 19:52 UTC (permalink / raw) To: Willy Tarreau; +Cc: davidel, nacc, nish.aravamudan, linux-kernel Willy Tarreau <willy@w.ods.org> wrote: > > Thanks Andrew, > > I'm very sorry because I have verified the code with gcc-2.95.3, > gcc-3.3.6 and gcc-3.4.4 on x86 and alpha to ensure that everything > went smooth on archs where sizeof(long) > sizeof(int). But I've > tested all the combinations in user-space for obvious ease of > validation. I believe I forgot to use -Wall. What architecture > gave you this, and with which compiler please ? I'm willing to > fix this as soon as I can understand the root of the problem. > http://www.zip.com.au/~akpm/linux/patches/stuff/top-posting.txt > > > On Thu, Sep 29, 2005 at 02:43:12AM -0700, Andrew Morton wrote: > > Willy Tarreau <willy@w.ods.org> wrote: > > > > > > +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) > > > +# define MAX_MSEC_OFFSET \ > > > + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) > > > > That generates numbers which don't fit into unsigned ints, yielding vast > > numbers of > > > > include/linux/jiffies.h: In function `msecs_to_jiffies': > > include/linux/jiffies.h:310: warning: comparison is always false due to limited range of data type > > include/linux/jiffies.h: In function `usecs_to_jiffies': > > include/linux/jiffies.h:323: warning: comparison is always false due to limited range of data type > > This was a ppc64 build, gcc-3.3.3, CONFIG_HZ=250 Look a the value which MAX_MSEC_OFFSET will take (it's 2^63 minus a bit). Comparing that to an unsigned int will generate the always-true or always-false warning. ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-29 19:52 ` Andrew Morton @ 2005-09-29 20:55 ` Willy Tarreau 2005-10-01 17:39 ` Willy Tarreau 1 sibling, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-29 20:55 UTC (permalink / raw) To: Andrew Morton; +Cc: davidel, nacc, nish.aravamudan, linux-kernel On Thu, Sep 29, 2005 at 12:52:07PM -0700, Andrew Morton wrote: > Willy Tarreau <willy@w.ods.org> wrote: > > > > Thanks Andrew, > > > > I'm very sorry because I have verified the code with gcc-2.95.3, > > gcc-3.3.6 and gcc-3.4.4 on x86 and alpha to ensure that everything > > went smooth on archs where sizeof(long) > sizeof(int). But I've > > tested all the combinations in user-space for obvious ease of > > validation. I believe I forgot to use -Wall. What architecture > > gave you this, and with which compiler please ? I'm willing to > > fix this as soon as I can understand the root of the problem. > > > > http://www.zip.com.au/~akpm/linux/patches/stuff/top-posting.txt Guess what ? I hate it too when others do it, but I often think that my mail will be a one-liner which will be easier to read this way, and of course I'm wrong. Nice FAQ BTW. > > On Thu, Sep 29, 2005 at 02:43:12AM -0700, Andrew Morton wrote: > > > Willy Tarreau <willy@w.ods.org> wrote: > > > > > > > > +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) > > > > +# define MAX_MSEC_OFFSET \ > > > > + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) > > > > > > That generates numbers which don't fit into unsigned ints, yielding vast > > > numbers of > > > > > > include/linux/jiffies.h: In function `msecs_to_jiffies': > > > include/linux/jiffies.h:310: warning: comparison is always false due to limited range of data type > > > include/linux/jiffies.h: In function `usecs_to_jiffies': > > > include/linux/jiffies.h:323: warning: comparison is always false due to limited range of data type > > > > > This was a ppc64 build, gcc-3.3.3, CONFIG_HZ=250 OK, I have a free account on a ppc64 machine in case I cannot reproduce on anything else. > Look a the value which MAX_MSEC_OFFSET will take (it's 2^63 minus a bit). > Comparing that to an unsigned int will generate the always-true or > always-false warning. I see it. I've done the ulong magic only on the constant computation, while in theory, I should have casted m to ulong before the comparison because m is implicitly casted to ulong in the return (common type). In practise, it should be better to cast MAX_MSEC_OFFSET to unsigned int in the comparison, but there's still a risk of warning if MAX_MSEC_OFFSET becomes equal to ~0. I'll makes a few tests and check that gcc is smart enough to remove the cast code when unneeded if I cast m to ulong. Thanks for the details, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() 2005-09-29 19:52 ` Andrew Morton 2005-09-29 20:55 ` Willy Tarreau @ 2005-10-01 17:39 ` Willy Tarreau 1 sibling, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-10-01 17:39 UTC (permalink / raw) To: Andrew Morton; +Cc: davidel, nacc, nish.aravamudan, linux-kernel Hi Andrew, On Thu, Sep 29, 2005 at 12:52:07PM -0700, Andrew Morton wrote: > > On Thu, Sep 29, 2005 at 02:43:12AM -0700, Andrew Morton wrote: > > > Willy Tarreau <willy@w.ods.org> wrote: > > > > > > > > +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) > > > > +# define MAX_MSEC_OFFSET \ > > > > + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) > > > > > > That generates numbers which don't fit into unsigned ints, yielding vast > > > numbers of > > > > > > include/linux/jiffies.h: In function `msecs_to_jiffies': > > > include/linux/jiffies.h:310: warning: comparison is always false due to limited range of data type > > > include/linux/jiffies.h: In function `usecs_to_jiffies': > > > include/linux/jiffies.h:323: warning: comparison is always false due to limited range of data type > > > > > This was a ppc64 build, gcc-3.3.3, CONFIG_HZ=250 > > Look a the value which MAX_MSEC_OFFSET will take (it's 2^63 minus a bit). > Comparing that to an unsigned int will generate the always-true or > always-false warning. OK, this was not trivial because gcc is smart enough to detect that even after a cast, the comparison is still always false. So I had to cut the comparison in two parts : one which tests whether we need to compare, and one which does the test on the unsigned int part if necessary. It has shut the warnings on my alpha (HZ=1024) and on my ultrasparc (HZ=250). I've noticed that the code in previous patch could still overflow in the cases where a multiply was used first, because the common type was still int. My test case in user-space used MSEC_PER_SEC = 1000UL so it did not happen. I've fixed this too. I've also checked the code produced on x86 (because alpha code is unreadable to me), and it resumes to this : - HZ=1000 : no code generated - HZ=250 : comparison, add, right shift - HZ=100 : comparison, add, divide - HZ=1024 : comparison, left shift, add, mul, right shift I tried to compile on ppc64, but unfortunately, the kernel does not build there because sizeof(long) == 4 ! I guess this is because gcc's target is powerpc-linux-gnu. I'm trying to recompile it with powerpc64-linux-gnu. Could you send me the output of gcc -v on your ppc64, please ? I've also added missing parenthesis in the #defines which might have caused trouble to external users of MAX_?SEC_OFFSET (none at the moment). Here's the new patch, I've build everything on 2.6.14-rc2-mm1 but verified that this code was not touched in -mm2 and the patch still applies. Could you please retest it on your ppc64 and apply it if you're OK with it ? Thanks, Willy Signed-off-by: Willy Tarreau <willy@w.ods.org> --- linux-2.6.14-rc2-mm1/include/linux/jiffies.h Thu Sep 29 23:04:49 2005 +++ linux-2.6.14-rc2-mm1-jiffies2/include/linux/jiffies.h Sat Oct 1 19:12:13 2005 @@ -246,6 +246,37 @@ #endif + +/* + * We define MAX_MSEC_OFFSET and MAX_USEC_OFFSET as maximal values that can be + * accepted by msecs_to_jiffies() and usec_to_jiffies() respectively, without + * risking a multiply overflow. Those functions return MAX_JIFFY_OFFSET for + * arguments above those values. + */ + +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) +# define MAX_MSEC_OFFSET \ + (ULONG_MAX - (MSEC_PER_SEC / HZ) + 1) +#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) +# define MAX_MSEC_OFFSET \ + (ULONG_MAX / (HZ / MSEC_PER_SEC)) +#else +# define MAX_MSEC_OFFSET \ + ((ULONG_MAX - (MSEC_PER_SEC - 1)) / HZ) +#endif + +#if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ) +# define MAX_USEC_OFFSET \ + (ULONG_MAX - (USEC_PER_SEC / HZ) + 1) +#elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC) +# define MAX_USEC_OFFSET \ + (ULONG_MAX / (HZ / USEC_PER_SEC)) +#else +# define MAX_USEC_OFFSET \ + ((ULONG_MAX - (USEC_PER_SEC - 1)) / HZ) +#endif + + /* * Convert jiffies to milliseconds and back. * @@ -276,27 +307,29 @@ static inline unsigned long msecs_to_jiffies(const unsigned int m) { - if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) + if (MAX_MSEC_OFFSET < UINT_MAX && m > (unsigned int)MAX_MSEC_OFFSET) return MAX_JIFFY_OFFSET; #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) - return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ); + return ((unsigned long)m + (MSEC_PER_SEC / HZ) - 1) / + (MSEC_PER_SEC / HZ); #elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) - return m * (HZ / MSEC_PER_SEC); + return (unsigned long)m * (HZ / MSEC_PER_SEC); #else - return (m * HZ + MSEC_PER_SEC - 1) / MSEC_PER_SEC; + return ((unsigned long)m * HZ + MSEC_PER_SEC - 1) / MSEC_PER_SEC; #endif } static inline unsigned long usecs_to_jiffies(const unsigned int u) { - if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET)) + if (MAX_USEC_OFFSET < UINT_MAX && u > (unsigned int)MAX_USEC_OFFSET) return MAX_JIFFY_OFFSET; #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ) - return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ); + return ((unsigned long)u + (USEC_PER_SEC / HZ) - 1) / + (USEC_PER_SEC / HZ); #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC) - return u * (HZ / USEC_PER_SEC); + return (unsigned long)u * (HZ / USEC_PER_SEC); #else - return (u * HZ + USEC_PER_SEC - 1) / USEC_PER_SEC; + return ((unsigned long)u * HZ + USEC_PER_SEC - 1) / USEC_PER_SEC; #endif } ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 2/3] 2.6.14-rc2-mm1: fixes for overflow in epoll() 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau 2005-09-24 19:44 ` [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() Willy Tarreau @ 2005-09-24 19:47 ` Willy Tarreau 2005-09-24 19:52 ` [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() Willy Tarreau ` (2 subsequent siblings) 4 siblings, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 19:47 UTC (permalink / raw) To: Davide Libenzi, Andrew Morton Cc: Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List This patch fixes the potential timeout overflow in ep_poll(), tries to get the best timeout and also removes the /1000 divide whenever possible. It mostly ensures that timeout computation is uniform with other places in the kernel. This patch applies both to 2.6.14-rc2-mm1 and 2.6.14-rc2. Please review and apply. Signed-off-by: Willy Tarreau <willy@w.ods.org> - Willy diff -purN linux-2.6.14-rc2-mm1/fs/eventpoll.c linux-2.6.14-rc2-mm1-epoll/fs/eventpoll.c --- linux-2.6.14-rc2-mm1/fs/eventpoll.c Sat Sep 24 21:11:49 2005 +++ linux-2.6.14-rc2-mm1-epoll/fs/eventpoll.c Sat Sep 24 21:20:10 2005 @@ -1502,12 +1502,11 @@ static int ep_poll(struct eventpoll *ep, wait_queue_t wait; /* - * Calculate the timeout by checking for the "infinite" value ( -1 ) - * and the overflow condition. The passed timeout is in milliseconds, - * that why (t * HZ) / 1000. + * Calculate the timeout by checking for the "infinite" value ( < 0 ) + * and the overflow condition. The passed timeout is in milliseconds. */ - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? - MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; + jtimeout = (timeout < 0) ? + MAX_SCHEDULE_TIMEOUT : msecs_to_jiffies(timeout); retry: write_lock_irqsave(&ep->lock, flags); ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau 2005-09-24 19:44 ` [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() Willy Tarreau 2005-09-24 19:47 ` [PATCH 2/3] 2.6.14-rc2-mm1: fixes for overflow in epoll() Willy Tarreau @ 2005-09-24 19:52 ` Willy Tarreau 2005-10-01 20:39 ` Willy Tarreau 2005-09-24 20:08 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Davide Libenzi 2005-09-25 20:55 ` Nishanth Aravamudan 4 siblings, 1 reply; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 19:52 UTC (permalink / raw) To: Davide Libenzi, Andrew Morton Cc: Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List This patch simplifies the overflow fix which was applied to sys_poll() in 2.6.14-rc2-mm1 by relying on the overflow detection code added to jiffies.h by patch 1/3. This also removes a useless divide for HZ <= 1000. It might be worth applying too. It's only for -mm1, and does *not* apply to 2.6.14-rc2 because this code has already received patches in -mm1. Signed-off-by: Willy Tarreau <willy@w.ods.org> - Willy diff -purN linux-2.6.14-rc2-mm1/fs/select.c linux-2.6.14-rc2-mm1-poll/fs/select.c --- linux-2.6.14-rc2-mm1/fs/select.c Sat Sep 24 21:12:36 2005 +++ linux-2.6.14-rc2-mm1-poll/fs/select.c Sat Sep 24 21:23:21 2005 @@ -469,7 +469,6 @@ asmlinkage long sys_poll(struct pollfd _ { struct poll_wqueues table; int fdcount, err; - int overflow; unsigned int i; struct poll_list *head; struct poll_list *walk; @@ -486,22 +485,11 @@ asmlinkage long sys_poll(struct pollfd _ return -EINVAL; /* - * We compare HZ with 1000 to work out which side of the - * expression needs conversion. Because we want to avoid - * converting any value to a numerically higher value, which - * could overflow. - */ -#if HZ > 1000 - overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT); -#else - overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT; -#endif - - /* * If we would overflow in the conversion or a negative timeout - * is requested, sleep indefinitely. + * is requested, sleep indefinitely. Note: msecs_to_jiffies checks + * for the overflow. */ - if (overflow || timeout_msecs < 0) + if (timeout_msecs < 0) timeout_jiffies = MAX_SCHEDULE_TIMEOUT; else timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1; ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() 2005-09-24 19:52 ` [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() Willy Tarreau @ 2005-10-01 20:39 ` Willy Tarreau 0 siblings, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-10-01 20:39 UTC (permalink / raw) To: Davide Libenzi, Andrew Morton Cc: Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List Hi, On Sat, Sep 24, 2005 at 09:52:05PM +0200, Willy Tarreau wrote: > This patch simplifies the overflow fix which was applied to sys_poll() in > 2.6.14-rc2-mm1 by relying on the overflow detection code added to jiffies.h > by patch 1/3. This also removes a useless divide for HZ <= 1000. > > It might be worth applying too. It's only for -mm1, and does *not* apply > to 2.6.14-rc2 because this code has already received patches in -mm1. I noticed that the overflow check in the original code is either useless or buggy, so this patch is even more worth applying. Look below at original code : if HZ <= 1000, the overflow is set only when msecs_to_jiffies() >= MAX_SCHEDULE_TIMEOUT. But the maximal value that msecs_to_jiffies() can return is MAX_JIFFY_OFFSET. Guess what ? include/linux/jiffies.h:#define MAX_JIFFY_OFFSET ((~0UL >> 1)-1) include/linux/sched.h:#define MAX_SCHEDULE_TIMEOUT LONG_MAX include/linux/kernel.h:#define LONG_MAX ((long)(~0UL>>1)) => MAX_JIFFY_OFFSET == (MAX_SCHEDULE_TIMEOUT - 1) => So msecs_to_jiffies() cannot be >= MAX_SCHEDULE_TIMEOUT, and 'overflow' can never be set for HZ <= 1000. Given how msecs_to_jiffies() is used everywhere, I wonder if we should not provide a separate overflow detection function which would be used where needed, and possibly remove the test from msecs_to_jiffies() which is mostly called with constants or small delays which will never overflow. Has anyone an opinion on this ? Regards, Willy > diff -purN linux-2.6.14-rc2-mm1/fs/select.c linux-2.6.14-rc2-mm1-poll/fs/select.c > --- linux-2.6.14-rc2-mm1/fs/select.c Sat Sep 24 21:12:36 2005 > +++ linux-2.6.14-rc2-mm1-poll/fs/select.c Sat Sep 24 21:23:21 2005 > @@ -469,7 +469,6 @@ asmlinkage long sys_poll(struct pollfd _ > { > struct poll_wqueues table; > int fdcount, err; > - int overflow; > unsigned int i; > struct poll_list *head; > struct poll_list *walk; > @@ -486,22 +485,11 @@ asmlinkage long sys_poll(struct pollfd _ > return -EINVAL; > > /* > - * We compare HZ with 1000 to work out which side of the > - * expression needs conversion. Because we want to avoid > - * converting any value to a numerically higher value, which > - * could overflow. > - */ > -#if HZ > 1000 > - overflow = timeout_msecs >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT); > -#else > - overflow = msecs_to_jiffies(timeout_msecs) >= MAX_SCHEDULE_TIMEOUT; > -#endif > - > - /* > * If we would overflow in the conversion or a negative timeout > - * is requested, sleep indefinitely. > + * is requested, sleep indefinitely. Note: msecs_to_jiffies checks > + * for the overflow. > */ > - if (overflow || timeout_msecs < 0) > + if (timeout_msecs < 0) > timeout_jiffies = MAX_SCHEDULE_TIMEOUT; > else > timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1; > > ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau ` (2 preceding siblings ...) 2005-09-24 19:52 ` [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() Willy Tarreau @ 2005-09-24 20:08 ` Davide Libenzi 2005-09-24 20:21 ` Willy TARREAU 2005-09-25 20:55 ` Nishanth Aravamudan 4 siblings, 1 reply; 33+ messages in thread From: Davide Libenzi @ 2005-09-24 20:08 UTC (permalink / raw) To: Willy Tarreau Cc: Andrew Morton, Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List On Sat, 24 Sep 2005, Willy Tarreau wrote: > Hello, > > After the discussion around epoll() timeout, I noticed that the functions used > to detect the timeout could themselves overflow for some values of HZ. > > So I decided to fix them by defining a macro which represents the maximal > acceptable argument which is guaranteed not to overflow. As an added bonus, > those functions can now be used in poll() and ep_poll() and remove the divide > if HZ == 1000, or replace it with a shift if (1000 % HZ) or (HZ % 1000) is a > power of two. Why all that code, when you can have it with: #define MAX_LONG_MSTIMEO (long) min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) that gcc-O2 collapses into a single constant? - Davide ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() 2005-09-24 20:08 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Davide Libenzi @ 2005-09-24 20:21 ` Willy TARREAU 0 siblings, 0 replies; 33+ messages in thread From: Willy TARREAU @ 2005-09-24 20:21 UTC (permalink / raw) To: Davide Libenzi Cc: Andrew Morton, Nishanth Aravamudan, Nish Aravamudan, Linux Kernel Mailing List On Sat, Sep 24, 2005 at 01:08:22PM -0700, Davide Libenzi wrote: > On Sat, 24 Sep 2005, Willy Tarreau wrote: > > >Hello, > > > >After the discussion around epoll() timeout, I noticed that the functions > >used > >to detect the timeout could themselves overflow for some values of HZ. > > > >So I decided to fix them by defining a macro which represents the maximal > >acceptable argument which is guaranteed not to overflow. As an added bonus, > >those functions can now be used in poll() and ep_poll() and remove the > >divide > >if HZ == 1000, or replace it with a shift if (1000 % HZ) or (HZ % 1000) is > >a > >power of two. > > Why all that code, when you can have it with: > > #define MAX_LONG_MSTIMEO (long) min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, > LONG_MAX / HZ - 1000ULL) > > that gcc-O2 collapses into a single constant? It is because I wanted to ensure that it matched exactly the limits of the functions that we call. And since msec_to_jiffies() is defined in 3 possible ways depending on HZ, 1000 % HZ and HZ % 1000, there are 3 different limits. Then, once the msec_to_jiffies() function is fixed, it's valuable to use it in ep_poll(), because its worst case does exactly what you already have (timeout * HZ + 999) / 1000, and other optimal cases can do better (for HZ=100, 250, 1000, there will be no divide at all). Don't worry, in my case, gcc also produces a single constant. That's just that it depends on how it will be used (check include/linux/jiffies.h, you'll understand what I mean). Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau ` (3 preceding siblings ...) 2005-09-24 20:08 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Davide Libenzi @ 2005-09-25 20:55 ` Nishanth Aravamudan 2005-09-25 22:06 ` Willy Tarreau 4 siblings, 1 reply; 33+ messages in thread From: Nishanth Aravamudan @ 2005-09-25 20:55 UTC (permalink / raw) To: Willy Tarreau Cc: Davide Libenzi, Andrew Morton, Nish Aravamudan, Linux Kernel Mailing List On 24.09.2005 [21:38:39 +0200], Willy Tarreau wrote: > Hello, > > After the discussion around epoll() timeout, I noticed that the functions used > to detect the timeout could themselves overflow for some values of HZ. > > So I decided to fix them by defining a macro which represents the maximal > acceptable argument which is guaranteed not to overflow. As an added bonus, > those functions can now be used in poll() and ep_poll() and remove the divide > if HZ == 1000, or replace it with a shift if (1000 % HZ) or (HZ % 1000) is a > power of two. > > Patches against 2.6.14-rc2-mm1 sent as replies to this mail. These look really good, Willy. Thanks for the fixes! -Nish ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() 2005-09-25 20:55 ` Nishanth Aravamudan @ 2005-09-25 22:06 ` Willy Tarreau 0 siblings, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-25 22:06 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Davide Libenzi, Andrew Morton, Nish Aravamudan, Linux Kernel Mailing List On Sun, Sep 25, 2005 at 01:55:18PM -0700, Nishanth Aravamudan wrote: > On 24.09.2005 [21:38:39 +0200], Willy Tarreau wrote: > > Hello, > > > > After the discussion around epoll() timeout, I noticed that the functions used > > to detect the timeout could themselves overflow for some values of HZ. > > > > So I decided to fix them by defining a macro which represents the maximal > > acceptable argument which is guaranteed not to overflow. As an added bonus, > > those functions can now be used in poll() and ep_poll() and remove the divide > > if HZ == 1000, or replace it with a shift if (1000 % HZ) or (HZ % 1000) is a > > power of two. > > > > Patches against 2.6.14-rc2-mm1 sent as replies to this mail. > > These look really good, Willy. Thanks for the fixes! Thanks. I noticed that there still existed a high risk of overflow with HZ values for which (HZ % 1000) != 0 && (1000 % HZ) != 0, because we hit the only situation where we start with a multiply by HZ, which is even worse as HZ grows. The only cases I've found are : - alpha : 1024 or 1200 - ia64 : 1024 Fortunately, they're both 64 bits, and since the result is an unsigned long, the multiply returns 64 bits result so it won't overflow in a while. However, we should avoid those combinations on 32-bits archs, because the limit is then rather low. For example, using HZ=1200 on an x86 will set MAX_MSEC_OFFSET to 3579138 ms, which is just below one hour. For reference, with HZ=1000, MAX_MSEC_OFFSET would be 2^32-1 ms, or 49.7 days, which makes quite a difference. It is possible that some self-made setups hit the overflow before the patch. For this reason, I wonder how we could discourage people from using such bastard HZ values on 32 bits platforms :-/ Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 18:25 ` Davide Libenzi 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau @ 2005-09-24 21:25 ` Vadim Lobanov 1 sibling, 0 replies; 33+ messages in thread From: Vadim Lobanov @ 2005-09-24 21:25 UTC (permalink / raw) To: Davide Libenzi Cc: Nishanth Aravamudan, Willy Tarreau, Nish Aravamudan, Linux Kernel Mailing List, Andrew Morton On Sat, 24 Sep 2005, Davide Libenzi wrote: > On Sat, 24 Sep 2005, Nishanth Aravamudan wrote: > > > Description: The @timeout parameter to ep_poll() is in milliseconds but > > we compare it to (MAX_SCHEDULE_TIMEOUT - 1000 / HZ), which is > > (jiffies/jiffies-per-sec) or seconds. That seems blatantly broken. This > > led to improper overflow checking for @timeout. As Andrew Morton pointed > > out in a similar fix for sys_poll(), the best solution is to to check > > for potential overflow first, then either select an indefinite value or > > convert @timeout. > > > > To achieve this and clean-up the code, change the prototype of the > > ep_poll() to make it clear that the parameter is in milliseconds and > > rename jtimeout to timeout_jiffies to hold the corresonding jiffies > > value. > > > > fs/eventpoll.c | 46 ++++++++++++++++++++++++++++++++++++++-------- > > 1 files changed, 38 insertions(+), 8 deletions(-) > > Eh? "To achieve this and clean-up the code"? :) > > > - Davide > > > eventpoll.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > --- a/fs/eventpoll.c 2005-09-24 11:07:04.000000000 -0700 > +++ b/fs/eventpoll.c 2005-09-24 11:11:06.000000000 -0700 > @@ -101,6 +101,10 @@ > /* Maximum number of poll wake up nests we are allowing */ > #define EP_MAX_POLLWAKE_NESTS 4 > > +/* Maximum msec timeout value storeable in a long int */ > +#define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULE_TIMEOUT / HZ, LONG_MAX / HZ - 1000ULL) Not quite. It really should be: #define EP_MAX_MSTIMEO min(1000ULL * MAX_SCHEDULETIMEOUT / HZ, (LONG_MAX - 999ULL) / HZ); > + > + > struct epoll_filefd { > struct file *file; > int fd; > @@ -1507,8 +1511,7 @@ > * and the overflow condition. The passed timeout is in milliseconds, > * that why (t * HZ) / 1000. > */ > - jtimeout = (timeout < 0 || > - (timeout / 1000) >= (MAX_SCHEDULE_TIMEOUT / HZ)) ? > + jtimeout = (timeout < 0 || timeout >= EP_MAX_MSTIMEO) ? > MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > > retry: > - -Vadim Lobanov ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [patch] sys_epoll_wait() timeout saga ... 2005-09-24 17:19 ` Nishanth Aravamudan 2005-09-24 18:25 ` Davide Libenzi @ 2005-09-24 18:30 ` Willy Tarreau 1 sibling, 0 replies; 33+ messages in thread From: Willy Tarreau @ 2005-09-24 18:30 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Nish Aravamudan, Davide Libenzi, Linux Kernel Mailing List, Andrew Morton On Sat, Sep 24, 2005 at 10:19:28AM -0700, Nishanth Aravamudan wrote: > Here is the patch I was referring to, adapted to the epoll use of > schedule_timeout(). A similar fix is in -mm for almost identical code in > sys_poll(). > > Description: The @timeout parameter to ep_poll() is in milliseconds but > we compare it to (MAX_SCHEDULE_TIMEOUT - 1000 / HZ), which is > (jiffies/jiffies-per-sec) or seconds. That seems blatantly broken. as you've seen in my last mail, it happens to be valid because it checks for the multiply overflow before doing it. But it's only after a math demonstration that it appears valid anyway ;-) > This > led to improper overflow checking for @timeout. As Andrew Morton pointed > out in a similar fix for sys_poll(), the best solution is to to check > for potential overflow first, then either select an indefinite value or > convert @timeout. > > To achieve this and clean-up the code, change the prototype of the > ep_poll() to make it clear that the parameter is in milliseconds and > rename jtimeout to timeout_jiffies to hold the corresonding jiffies > value. > > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> > > --- > > fs/eventpoll.c | 46 ++++++++++++++++++++++++++++++++++++++-------- > 1 files changed, 38 insertions(+), 8 deletions(-) > > diff -urpN 2.6.14-rc2-mm1/fs/eventpoll.c 2.6.14-rc2-mm1-dev/fs/eventpoll.c > --- 2.6.14-rc2-mm1/fs/eventpoll.c 2005-09-21 23:50:28.000000000 -0700 > +++ 2.6.14-rc2-mm1-dev/fs/eventpoll.c 2005-09-24 10:07:51.000000000 -0700 > @@ -260,7 +260,7 @@ static int ep_events_transfer(struct eve > struct epoll_event __user *events, > int maxevents); > static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, > - int maxevents, long timeout); > + int maxevents, long timeout_msecs); > static int eventpollfs_delete_dentry(struct dentry *dentry); > static struct inode *ep_eventpoll_inode(void); > static struct super_block *eventpollfs_get_sb(struct file_system_type *fs_type, > @@ -1494,11 +1494,12 @@ static int ep_events_transfer(struct eve > > > static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, > - int maxevents, long timeout) > + int maxevents, long timeout_msecs) > { > int res, eavail; > + int overflow; > unsigned long flags; > - long jtimeout; > + long timeout_jiffies; > wait_queue_t wait; > > /* > @@ -1506,8 +1507,36 @@ static int ep_poll(struct eventpoll *ep, > * and the overflow condition. The passed timeout is in milliseconds, > * that why (t * HZ) / 1000. > */ > - jtimeout = timeout == -1 || timeout > (MAX_SCHEDULE_TIMEOUT - 1000) / HZ ? > - MAX_SCHEDULE_TIMEOUT: (timeout * HZ + 999) / 1000; > + > + if (timeout_msecs) { Why not check for < 0 here ? it's the right moment, instead of doing all the math first ? gcc will emit only one check with 3 results resumed in just two consecutive cond jumps for 0, <0, >0. > + /* > + * We compare HZ with 1000 to work out which side of the > + * expression needs conversion. Because we want to > + * avoid converting any value to a numerically higher > + * value, which could overflow. > + */ > +#if HZ > 1000 > + overflow = timeout_msecs >= > + jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT); > +#else > + overflow = msecs_to_jiffies(timeout_msecs) >= > + MAX_SCHEDULE_TIMEOUT; > +#endif Although this seems clean, what worries me here is that for HZ <= 1000, we're doing maths on a variable to compare it to a constant. The msec_to_jiffies() function exists to return exact conversions, while we're just checking for bounds. The magics here clearly depends on the jiffies_to_msecs() implementation. So the real problem is not to check for which values for timeout_msecs we can get an overflow, but to check if timeout_msecs is higher than an HZ-dependant constant above which some conversion functions will overflow. Oh, BTW, while reading msecs_to_jiffies(), I noticed that it can overflow too for small values of HZ (HZ < 1000) : #define MAX_JIFFY_OFFSET ((~0UL >> 1)-1) static inline unsigned int jiffies_to_msecs(const unsigned long j) { #if HZ <= 1000 && !(1000 % HZ) return (1000 / HZ) * j; #elif HZ > 1000 && !(HZ % 1000) return (j + (HZ / 1000) - 1)/(HZ / 1000); #else return (j * 1000) / HZ; #endif } static inline unsigned long msecs_to_jiffies(const unsigned int m) { if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET)) return MAX_JIFFY_OFFSET; #if HZ <= 1000 && !(1000 % HZ) return (m + (1000 / HZ) - 1) / (1000 / HZ); #elif HZ > 1000 && !(HZ % 1000) return m * (HZ / 1000); #else return (m * HZ + 999) / 1000; #endif } Calling jiffies_to_msecs(MAX_JIFFY_OFFSET) is not valid if 1000/HZ >=2 or if (1000 % HZ != 0 || HZ % 1000 != 0). Eg: for HZ=300, the max is set to 0xda7406 while it is set to 0xfffffff8 for HZ=250 ! This clearly shows that we need some common macros to define some absolute bounds for each unit and avoid any computation on the timers before having checked them against the bounds. I'll see if I can get something working. > + > + /* > + * If we would overflow in the conversion or a negative > + * timeout is requested, sleep indefinitely. > + */ > + if (overflow || timeout_msecs < 0) > + timeout_jiffies = MAX_SCHEDULE_TIMEOUT; > + else > + timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1; > + } else { > + /* > + * 0 millisecond requests become 0 jiffy requests > + */ > + timeout_jiffies = 0; > + } > > retry: > write_lock_irqsave(&ep->lock, flags); > @@ -1529,7 +1558,7 @@ retry: > * to TASK_INTERRUPTIBLE before doing the checks. > */ > set_current_state(TASK_INTERRUPTIBLE); > - if (!list_empty(&ep->rdllist) || !jtimeout) > + if (!list_empty(&ep->rdllist) || !timeout_jiffies) > break; > if (signal_pending(current)) { > res = -EINTR; > @@ -1537,7 +1566,7 @@ retry: > } > > write_unlock_irqrestore(&ep->lock, flags); > - jtimeout = schedule_timeout(jtimeout); > + timeout_jiffies = schedule_timeout(timeout_jiffies); > write_lock_irqsave(&ep->lock, flags); > } > remove_wait_queue(&ep->wq, &wait); > @@ -1556,7 +1585,8 @@ retry: > * more luck. > */ > if (!res && eavail && > - !(res = ep_events_transfer(ep, events, maxevents)) && jtimeout) > + !(res = ep_events_transfer(ep, events, maxevents)) && > + timeout_jiffies) > goto retry; > > return res; Regards, Willy ^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2005-10-01 20:45 UTC | newest] Thread overview: 33+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-09-23 18:13 [patch] sys_epoll_wait() timeout saga Davide Libenzi 2005-09-23 18:24 ` Nish Aravamudan 2005-09-24 4:05 ` Willy Tarreau 2005-09-24 4:44 ` Nish Aravamudan 2005-09-24 6:15 ` Willy Tarreau 2005-09-24 7:33 ` Vadim Lobanov 2005-09-24 7:51 ` Willy Tarreau 2005-09-24 15:10 ` Davide Libenzi 2005-09-24 17:20 ` Willy Tarreau 2005-09-24 18:19 ` Davide Libenzi 2005-09-25 6:05 ` Andrew Morton 2005-09-25 6:20 ` Willy Tarreau 2005-09-25 6:32 ` Andrew Morton 2005-09-25 7:08 ` Vadim Lobanov 2005-09-25 8:03 ` Willy Tarreau 2005-09-24 17:19 ` Nishanth Aravamudan 2005-09-24 18:25 ` Davide Libenzi 2005-09-24 19:38 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Willy Tarreau 2005-09-24 19:44 ` [PATCH 1/3] 2.6.14-rc2-mm1: fixes for overflow msec_to_jiffies() Willy Tarreau 2005-09-29 9:43 ` Andrew Morton 2005-09-29 19:41 ` Willy Tarreau 2005-09-29 19:52 ` Andrew Morton 2005-09-29 20:55 ` Willy Tarreau 2005-10-01 17:39 ` Willy Tarreau 2005-09-24 19:47 ` [PATCH 2/3] 2.6.14-rc2-mm1: fixes for overflow in epoll() Willy Tarreau 2005-09-24 19:52 ` [PATCH 3/3] 2.6.14-rc2-mm1 : fixes for overflow in sys_poll() Willy Tarreau 2005-10-01 20:39 ` Willy Tarreau 2005-09-24 20:08 ` [PATCH 0/3] fixes for overflow in poll(), epoll(), and msec_to_jiffies() Davide Libenzi 2005-09-24 20:21 ` Willy TARREAU 2005-09-25 20:55 ` Nishanth Aravamudan 2005-09-25 22:06 ` Willy Tarreau 2005-09-24 21:25 ` [patch] sys_epoll_wait() timeout saga Vadim Lobanov 2005-09-24 18:30 ` Willy Tarreau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox