* compat_sys_times() bogus until jiffies >= 0. @ 2007-11-07 22:47 David Brown 2007-11-07 23:28 ` Andrew Morton 0 siblings, 1 reply; 37+ messages in thread From: David Brown @ 2007-11-07 22:47 UTC (permalink / raw) To: LKML compat_sys_times() has bogus return until jiffies is >= 0. I discovered this running LTP within 5 minutes of booting. The return result return compat_jiffies_to_clock_t(jiffies); will return '-1' to user space and set the negated clock_t value to errno. I'm not sure what the correct fix for this is. I can come up with a patch if anyone has ideas on how to fix it. At minimum, perhaps it should return a sane errno value. Thanks, David Brown ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-07 22:47 compat_sys_times() bogus until jiffies >= 0 David Brown @ 2007-11-07 23:28 ` Andrew Morton 2007-11-08 0:18 ` Andrew Morton ` (2 more replies) 0 siblings, 3 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-07 23:28 UTC (permalink / raw) To: David Brown; +Cc: linux-kernel, Ulrich Drepper, Michael Kerrisk > On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <lkml@davidb.org> wrote: > compat_sys_times() has bogus return until jiffies is >= 0. I discovered > this running LTP within 5 minutes of booting. > > The return result > > return compat_jiffies_to_clock_t(jiffies); > > will return '-1' to user space and set the negated clock_t value to errno. > > I'm not sure what the correct fix for this is. I can come up with a patch > if anyone has ideas on how to fix it. > > At minimum, perhaps it should return a sane errno value. RETURN VALUE times() returns the number of clock ticks that have elapsed since an arbitrary point in the past. For Linux 2.4 and earlier this point is the moment the system was booted. Since Linux 2.6, this point is (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot time. The return value may overflow the possible range of type clock_t. On error, (clock_t) -1 is returned, and errno is set appro- priately. Perhaps this is a bug in glibc: it is interpreting the times() return value in the same way as other syscalls. It would have been sensible for us to add INITIAL_JIFFIES to the value instead of exposing this kernel-only detail to the world, although the problem will of course reoccur once jiffies hits 0x80000000. Unfortunately we've even gone and enshrined this bogon in the manpage. Proposed fix: - return compat_jiffies_to_clock_t(jiffies); + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & + 0x7fffffff); ? ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-07 23:28 ` Andrew Morton @ 2007-11-08 0:18 ` Andrew Morton 2007-11-08 0:54 ` Andreas Schwab 2007-11-08 1:53 ` Paul Mackerras 2007-11-08 0:50 ` David Miller 2007-11-08 6:00 ` David Brown 2 siblings, 2 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 0:18 UTC (permalink / raw) To: lkml, linux-kernel, drepper, mtk-manpages > On Wed, 7 Nov 2007 15:28:33 -0800 Andrew Morton <akpm@linux-foundation.org> wrote: > > On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <lkml@davidb.org> wrote: > > compat_sys_times() has bogus return until jiffies is >= 0. I discovered > > this running LTP within 5 minutes of booting. > > > > The return result > > > > return compat_jiffies_to_clock_t(jiffies); > > > > will return '-1' to user space and set the negated clock_t value to errno. > > > > I'm not sure what the correct fix for this is. I can come up with a patch > > if anyone has ideas on how to fix it. > > > > At minimum, perhaps it should return a sane errno value. > > RETURN VALUE > times() returns the number of clock ticks that have elapsed since an > arbitrary point in the past. For Linux 2.4 and earlier this point is > the moment the system was booted. Since Linux 2.6, this point is > (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot > time. The return value may overflow the possible range of type > clock_t. On error, (clock_t) -1 is returned, and errno is set appro- > priately. > > > Perhaps this is a bug in glibc: it is interpreting the times() return value > in the same way as other syscalls. > > It would have been sensible for us to add INITIAL_JIFFIES to the value > instead of exposing this kernel-only detail to the world, although the > problem will of course reoccur once jiffies hits 0x80000000. Unfortunately > we've even gone and enshrined this bogon in the manpage. > > Proposed fix: > > - return compat_jiffies_to_clock_t(jiffies); > + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & > + 0x7fffffff); > > ? Like this? It gets messy. From: Andrew Morton <akpm@linux-foundation.org> David Brown points out that compat_sys_times() (and sys_times()) can return arbitrary 32-bit (or 64-bit values). If these happen to be negative (jiffy wrap, or before INITIAL_JIFFIES) then libc will interpret this as an error and will return -1 to the libc user and will set errno. The manpage for times(2) says: times() returns the number of clock ticks that have elapsed since an arbitrary point in the past. For Linux 2.4 and earlier this point is the moment the system was booted. Since Linux 2.6, this point is (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot time. The return value may overflow the possible range of type clock_t. On error, (clock_t) -1 is returned, and errno is set appro- priately. We can fix this by masking the return value down to a 31-bit (63-bit) value. Also, let's correct for INTIAL_JIFFIES - this isn't a detail which should be exposed to userspace. Unfortunately this change can break userspace. If a program was (correctly) doing: unsigned long start = times(...); ... unsigned long end = times(...); unsigned long delta = end - start; then `delta' can be grossly wrong if we wrapped in the interval. Instead userspace will need to mask `delta' by 0x7fffffff (0x7fffffffffffffff) to get the correct number. But userspace was already busted in the presence of wraparound, due to glibc's convert-to-negative-one behaviour. Given all this stuff, the return value from sys_times() doesn't seem a particularly useful or reliable kernel interface. Cc: David Brown <lkml@davidb.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- kernel/compat.c | 3 ++- kernel/sys.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff -puN kernel/sys.c~a kernel/sys.c --- a/kernel/sys.c~a +++ a/kernel/sys.c @@ -897,7 +897,8 @@ asmlinkage long sys_times(struct tms __u if (copy_to_user(tbuf, &tmp, sizeof(struct tms))) return -EFAULT; } - return (long) jiffies_64_to_clock_t(get_jiffies_64()); + return jiffies_64_to_clock_t((get_jiffies_64() + INITIAL_JIFFIES) & + LONG_MAX); } /* diff -puN kernel/compat.c~a kernel/compat.c --- a/kernel/compat.c~a +++ a/kernel/compat.c @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct if (copy_to_user(tbuf, &tmp, sizeof(tmp))) return -EFAULT; } - return compat_jiffies_to_clock_t(jiffies); + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & + LONG_MAX); } /* _ ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 0:18 ` Andrew Morton @ 2007-11-08 0:54 ` Andreas Schwab 2007-11-08 1:17 ` Andrew Morton 2007-11-08 1:53 ` Paul Mackerras 1 sibling, 1 reply; 37+ messages in thread From: Andreas Schwab @ 2007-11-08 0:54 UTC (permalink / raw) To: Andrew Morton; +Cc: lkml, linux-kernel, drepper, mtk-manpages Andrew Morton <akpm@linux-foundation.org> writes: > diff -puN kernel/compat.c~a kernel/compat.c > --- a/kernel/compat.c~a > +++ a/kernel/compat.c > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct > if (copy_to_user(tbuf, &tmp, sizeof(tmp))) > return -EFAULT; > } > - return compat_jiffies_to_clock_t(jiffies); > + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & > + LONG_MAX); Are you sure you want LONG_MAX here, not 0x7fffffff? Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 0:54 ` Andreas Schwab @ 2007-11-08 1:17 ` Andrew Morton 0 siblings, 0 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 1:17 UTC (permalink / raw) To: Andreas Schwab; +Cc: lkml, linux-kernel, drepper, mtk-manpages > On Thu, 08 Nov 2007 01:54:40 +0100 Andreas Schwab <schwab@suse.de> wrote: > Andrew Morton <akpm@linux-foundation.org> writes: > > > diff -puN kernel/compat.c~a kernel/compat.c > > --- a/kernel/compat.c~a > > +++ a/kernel/compat.c > > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct > > if (copy_to_user(tbuf, &tmp, sizeof(tmp))) > > return -EFAULT; > > } > > - return compat_jiffies_to_clock_t(jiffies); > > + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & > > + LONG_MAX); > > Are you sure you want LONG_MAX here, not 0x7fffffff? > I'm not sure of anything - I'm just trolling ;) That's 0x7fffffffffffffff for architectures which implement this function. I think that lines up correctly with jiffies and the return value from compat_sys_times(). Perhaps formally it should be USERSPACE_CLOCK_T_MAX, but we don't have that. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 0:18 ` Andrew Morton 2007-11-08 0:54 ` Andreas Schwab @ 2007-11-08 1:53 ` Paul Mackerras 2007-11-08 2:09 ` David Miller 2007-11-08 3:07 ` Andrew Morton 1 sibling, 2 replies; 37+ messages in thread From: Paul Mackerras @ 2007-11-08 1:53 UTC (permalink / raw) To: Andrew Morton; +Cc: lkml, linux-kernel, drepper, mtk-manpages Andrew Morton writes: > Given all this stuff, the return value from sys_times() doesn't seem a > particularly useful or reliable kernel interface. I think the best thing would be to ignore any error from copy_to_user and always return the number of clock ticks. We should call force_successful_syscall_return, and glibc on x86 should be taught not to interpret negative values as an error. POSIX doesn't require us to return an EFAULT error if the buf argument is bogus. If userspace does supply a bogus buf pointer, then either it will dereference it itself and get a segfault, or it won't dereference it, in which case it obviously didn't care about the values we tried to put there. If we try to return an error under some circumstances, then there is at least one 32-bit value for the number of ticks that will cause confusion. We can either change that value (or values) to some other value, which seems pretty bogus, or we can just decide not to return any errors. The latter seems to me to have no significant downside and to be the simplest solution to the problem. Paul. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 1:53 ` Paul Mackerras @ 2007-11-08 2:09 ` David Miller 2007-11-08 10:20 ` Andreas Schwab 2007-11-08 19:25 ` Denys Vlasenko 2007-11-08 3:07 ` Andrew Morton 1 sibling, 2 replies; 37+ messages in thread From: David Miller @ 2007-11-08 2:09 UTC (permalink / raw) To: paulus; +Cc: akpm, lkml, linux-kernel, drepper, mtk-manpages From: Paul Mackerras <paulus@samba.org> Date: Thu, 8 Nov 2007 12:53:57 +1100 > Andrew Morton writes: > > > Given all this stuff, the return value from sys_times() doesn't seem a > > particularly useful or reliable kernel interface. > > I think the best thing would be to ignore any error from copy_to_user > and always return the number of clock ticks. We should call > force_successful_syscall_return, and glibc on x86 should be taught not > to interpret negative values as an error. > > POSIX doesn't require us to return an EFAULT error if the buf argument > is bogus. If userspace does supply a bogus buf pointer, then either > it will dereference it itself and get a segfault, or it won't > dereference it, in which case it obviously didn't care about the > values we tried to put there. > > If we try to return an error under some circumstances, then there is > at least one 32-bit value for the number of ticks that will cause > confusion. We can either change that value (or values) to some other > value, which seems pretty bogus, or we can just decide not to return > any errors. The latter seems to me to have no significant downside > and to be the simplest solution to the problem. I agree with this analysis. The Linux man page for times() explicitly lists (clock_t) -1 as a return value meaning error. So even if we did make some effort to return errors "properly" (via force_successful_syscall_return() et al.) userspace would still be screwed because (clock_t) -1 would be interpreted as an error. Actually I think this basically proves we cannot return (clock_t) -1 ever because all existing userland (I'm not talking about inside glibc, I'm talking about inside of applications) will see this as an error. User applications have no other way to check for error. This API is definitely very poorly designed, no matter which way we "fix" this some case will remain broken. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 2:09 ` David Miller @ 2007-11-08 10:20 ` Andreas Schwab 2007-11-08 14:42 ` Chris Friesen 2007-11-08 19:25 ` Denys Vlasenko 1 sibling, 1 reply; 37+ messages in thread From: Andreas Schwab @ 2007-11-08 10:20 UTC (permalink / raw) To: David Miller; +Cc: paulus, akpm, lkml, linux-kernel, drepper, mtk-manpages David Miller <davem@davemloft.net> writes: > I agree with this analysis. > > The Linux man page for times() explicitly lists (clock_t) -1 as a > return value meaning error. > > So even if we did make some effort to return errors "properly" (via > force_successful_syscall_return() et al.) userspace would still be > screwed because (clock_t) -1 would be interpreted as an error. > > Actually I think this basically proves we cannot return (clock_t) -1 > ever because all existing userland (I'm not talking about inside > glibc, I'm talking about inside of applications) will see this as an > error. > > User applications have no other way to check for error. > > This API is definitely very poorly designed, no matter which way we > "fix" this some case will remain broken. A possible remedy is to return the ticks since process start time, which delays the wrap around much further. POSIX only demands consistency within the same process. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 10:20 ` Andreas Schwab @ 2007-11-08 14:42 ` Chris Friesen 2007-11-09 18:20 ` Ulrich Drepper 0 siblings, 1 reply; 37+ messages in thread From: Chris Friesen @ 2007-11-08 14:42 UTC (permalink / raw) To: Andreas Schwab Cc: David Miller, paulus, akpm, lkml, linux-kernel, drepper, mtk-manpages Andreas Schwab wrote: > A possible remedy is to return the ticks since process start time, which > delays the wrap around much further. POSIX only demands consistency > within the same process. This would be an interesting solution. The man page for linux states that the return code is time since system boot, so that could realistically be expected to correlate between different processes. Could we get away with changing the man page and breaking any apps relying on this previously-documented behaviour? Chris ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 14:42 ` Chris Friesen @ 2007-11-09 18:20 ` Ulrich Drepper 2007-12-20 11:36 ` Michael Kerrisk 0 siblings, 1 reply; 37+ messages in thread From: Ulrich Drepper @ 2007-11-09 18:20 UTC (permalink / raw) To: Chris Friesen Cc: Andreas Schwab, David Miller, paulus, akpm, lkml, linux-kernel, mtk-manpages -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Friesen wrote: >> A possible remedy is to return the ticks since process start time, which >> delays the wrap around much further. POSIX only demands consistency >> within the same process. > > This would be an interesting solution. > > The man page for linux states that the return code is time since system > boot, so that could realistically be expected to correlate between > different processes. The Linux man page is documenting existing functionality on top of what the standard requires. Programmers should ever only require what the standard guarantees. I am perfectly willing to support a solution where the time is measured from process startup time. The only code using times() I found is cross-platform and most likely does not depend on the value returned is usable in isolation (only in a difference). - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) iD8DBQFHNKTZ2ijCOnn/RHQRAv2wAJsHOnWRrbE2N2Z4R35bsU1+BIZEGQCguaxL zY9f4XEhJnAoNF5jFxm76qI= =0nsU -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-09 18:20 ` Ulrich Drepper @ 2007-12-20 11:36 ` Michael Kerrisk 2007-12-20 11:51 ` David Miller 0 siblings, 1 reply; 37+ messages in thread From: Michael Kerrisk @ 2007-12-20 11:36 UTC (permalink / raw) To: akpm, lkml, paulus Cc: Ulrich Drepper, Chris Friesen, Andreas Schwab, David Miller, linux-kernel David, Andrew, Paul, A late coda to this thread, but I'll just note some changes I'm making to the man page (which I'd like you to review -- please see below), and note a few other points. Andrew, you asked about what happens for x86 with the -1 to -4095 return for other syscalls. At least two other syscalls suffer the same problem. >From the fcntl(2) man page: BUGS A limitation of the Linux system call conventions on some architectures (notably x86) means that if a (negative) process group ID to be returned by F_GETOWN falls in the range -1 to -4095, then the return value is wrongly interpreted by glibc as an error in the system call; that is, the return value of fcntl() will be -1, and errno will contain the (positive) process group ID. Some testing just now shows me that lseek() on /dev/mem suffers similar problems when seeking to bytes 0xfffff001 through to 0xffffffff. Ulrich Drepper wrote: > Chris Friesen wrote: >>> A possible remedy is to return the ticks since process start time, which >>> delays the wrap around much further. POSIX only demands consistency >>> within the same process. >> This would be an interesting solution. > >> The man page for linux states that the return code is time since system >> boot, so that could realistically be expected to correlate between >> different processes. > > The Linux man page is documenting existing functionality on top of what > the standard requires. Programmers should ever only require what the > standard guarantees. > > I am perfectly willing to support a solution where the time is measured >>from process startup time. The only code using times() I found is > cross-platform and most likely does not depend on the value returned is > usable in isolation (only in a difference). Did I miss anything? Is anyone actually working on a solution along these lines? In the meantime, for man-pages-2.74, I've reworked the description of the return value: RETURN VALUE times() returns the number of clock ticks that have elapsed since an arbitrary point in the past. The return value may overflow the possible range of type clock_t. On error, (clock_t) -1 is returned, and errno is set appropriately. I moved the Linux specific details of the return value to NOTES, and added a warning about relying on those details: NOTES ... On Linux, the "arbitrary point in the past" from which the return value of times() is measured has varied across kernel versions. On Linux 2.4 and earlier this point is the moment the system was booted. Since Linux 2.6, this point is (2^32/HZ) - 300 (i.e., about 429 million) sec- onds before system boot time. This variability across kernel versions (and across Unix implementations), com- bined with the fact that the returned value may overflow the range of clock_t, means that a portable application would be wise to avoid using this value. To measure changes in elapsed time, use gettimeofday(2) instead. Under BUGS I added: BUGS A limitation of the Linux system call conventions on some architectures (notably x86) means that on Linux 2.6 there is a small time window (41 seconds) soon after boot when times(2) can return -1, falsely indicating that an error occurred. The same problem can occur when the return value wraps passed the maximum value that can be stored in clockid_t. Look okay to you folks? Cheers, Michael -- Michael Kerrisk Maintainer of the Linux man-pages project http://www.kernel.org/doc/man-pages/ Want to report a man-pages bug? Look here: http://www.kernel.org/doc/man-pages/reporting_bugs.html ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-20 11:36 ` Michael Kerrisk @ 2007-12-20 11:51 ` David Miller 2007-12-22 0:42 ` Andi Kleen 0 siblings, 1 reply; 37+ messages in thread From: David Miller @ 2007-12-20 11:51 UTC (permalink / raw) To: mtk.manpages; +Cc: akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel From: Michael Kerrisk <mtk.manpages@googlemail.com> Date: Thu, 20 Dec 2007 12:36:52 +0100 > Some testing just now shows me that lseek() on /dev/mem suffers similar > problems when seeking to bytes 0xfffff001 through to 0xffffffff. Only on x86 platforms. Sparc, IA64, MIPS, powerpc, etc. all get this case right. Yes it's another unfortunate side effect of how error status is indicated for x86 system calls. I would suggest, that we put something in place to fix this in the long term: 1) Start setting the condition codes properly to indicate error in the system call return path on x86 like other platforms do now. Make sure that it tips off on force_successful_syscall_return(), as needed. 2) Come up with a transition plan for glibc et al. to take advantage of this. It actually sounds like the kind of problem that could be solved well using the VDSO page. :-) ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-20 11:51 ` David Miller @ 2007-12-22 0:42 ` Andi Kleen 2007-12-22 1:41 ` David Miller 0 siblings, 1 reply; 37+ messages in thread From: Andi Kleen @ 2007-12-22 0:42 UTC (permalink / raw) To: David Miller Cc: mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel David Miller <davem@davemloft.net> writes: > Only on x86 platforms. Sparc, IA64, MIPS, powerpc, etc. all get this > case right. Do they for 32bit kernels or for 64bit userland? > Yes it's another unfortunate side effect of how error status is > indicated for x86 system calls. Maybe I'm dense, but doesn't all the kernel code pass it the same way as the x86 syscall code? For your proposal you would need a separate error bit coming out of the sys_* to handle this case. Basically rewrite all code that ever returns errors in the kernel. Or do I miss something? Or are you talking about solving it only for 32bit compat emulation on 64bit kernels? There it would be possible, but only doing it there and not on native 32bit systems wouldn't seem clean to me. At least on x86-64 the compat code's (near) only goal in live is to be as compatible to 32it as possible not better. -Andi ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 0:42 ` Andi Kleen @ 2007-12-22 1:41 ` David Miller 2007-12-22 1:45 ` David Miller 2007-12-22 1:49 ` Andi Kleen 0 siblings, 2 replies; 37+ messages in thread From: David Miller @ 2007-12-22 1:41 UTC (permalink / raw) To: andi Cc: mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel From: Andi Kleen <andi@firstfloor.org> Date: Sat, 22 Dec 2007 01:42:02 +0100 > David Miller <davem@davemloft.net> writes: > > > Only on x86 platforms. Sparc, IA64, MIPS, powerpc, etc. all get this > > case right. > > Do they for 32bit kernels or for 64bit userland? Both. This is not a compat issue. > > Yes it's another unfortunate side effect of how error status is > > indicated for x86 system calls. > > Maybe I'm dense, but doesn't all the kernel code pass it the > same way as the x86 syscall code? For your proposal you > would need a separate error bit coming out of the sys_* to > handle this case. Basically rewrite all code that ever returns > errors in the kernel. Or do I miss something? I'm suggesting that you set the condition codes based upon whether there is an error or not. That is the critical thing x86 doesn't do that all the other platforms do. x86 relies on interpretation of purely the integer returned from the system call to userspace, and that means a certain chunk of the return value space can never represent valid values. If you use the condition codes to signal "the return value is an error" you don't have these problems. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 1:41 ` David Miller @ 2007-12-22 1:45 ` David Miller 2007-12-22 1:53 ` Andi Kleen 2007-12-22 1:49 ` Andi Kleen 1 sibling, 1 reply; 37+ messages in thread From: David Miller @ 2007-12-22 1:45 UTC (permalink / raw) To: andi Cc: mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel From: David Miller <davem@davemloft.net> Date: Fri, 21 Dec 2007 17:41:24 -0800 (PST) > I'm suggesting that you set the condition codes based upon whether > there is an error or not. That is the critical thing x86 doesn't do > that all the other platforms do. And if you still don't get it, I'm saying that x86, in the syscall trap return path, should set the conditon codes based upon whether the system call is really signalling an error or not. And to handle potentially ambiguous cases we, for a long time, have the force_successful_syscall_return() arch hook. System call implementations use this when the return values they give could be mis-construed as error values. And if you'll notice x86 makes no attempt to implement that hook, because it currently can't. That's what needs to be fixed. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 1:45 ` David Miller @ 2007-12-22 1:53 ` Andi Kleen 2007-12-22 4:36 ` David Miller 0 siblings, 1 reply; 37+ messages in thread From: Andi Kleen @ 2007-12-22 1:53 UTC (permalink / raw) To: David Miller Cc: andi, mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel > And to handle potentially ambiguous cases we, for a long time, have > the force_successful_syscall_return() arch hook. Ah I see what you mean now. Thanks for the clarification. Ok that could be in theory made to work yes. The migration would probably be ugly though (how would glibc figure out if the kernel does that or not?) -Andi ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 1:53 ` Andi Kleen @ 2007-12-22 4:36 ` David Miller 2007-12-22 12:47 ` Andi Kleen 0 siblings, 1 reply; 37+ messages in thread From: David Miller @ 2007-12-22 4:36 UTC (permalink / raw) To: andi Cc: mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel From: Andi Kleen <andi@firstfloor.org> Date: Sat, 22 Dec 2007 02:53:11 +0100 > > And to handle potentially ambiguous cases we, for a long time, have > > the force_successful_syscall_return() arch hook. > > Ah I see what you mean now. > > Thanks for the clarification. Thanks for continuing to insist it's "impossible" :-) ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 4:36 ` David Miller @ 2007-12-22 12:47 ` Andi Kleen 0 siblings, 0 replies; 37+ messages in thread From: Andi Kleen @ 2007-12-22 12:47 UTC (permalink / raw) To: David Miller Cc: andi, mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel On Fri, Dec 21, 2007 at 08:36:40PM -0800, David Miller wrote: > From: Andi Kleen <andi@firstfloor.org> > Date: Sat, 22 Dec 2007 02:53:11 +0100 > > > > And to handle potentially ambiguous cases we, for a long time, have > > > the force_successful_syscall_return() arch hook. > > > > Ah I see what you mean now. > > > > Thanks for the clarification. > > Thanks for continuing to insist it's "impossible" :-) It's still hard -- e.g. i'm not sure your condition flag setting would be even possible for i386 SYSEXIT which does not restore EFLAGS from memory and has some other constraints too. And there is no free other register to use for this either on i386 nor x86-64. Ok you could always disable SYSEXIT on force_successfull_return(), but then e.g. all lseek()s would use the slow path which might not be a good idea. -Andi ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-12-22 1:41 ` David Miller 2007-12-22 1:45 ` David Miller @ 2007-12-22 1:49 ` Andi Kleen 1 sibling, 0 replies; 37+ messages in thread From: Andi Kleen @ 2007-12-22 1:49 UTC (permalink / raw) To: David Miller Cc: andi, mtk.manpages, akpm, lkml, paulus, drepper, cfriesen, schwab, linux-kernel > I'm suggesting that you set the condition codes based upon whether > there is an error or not. And the only way the syscall code could find out if there is an error is by checking err < 0 && err >= -4096 like glibc (except for the compat syscall on 64bit kernel case) Or rewrite all code that returns errors to system calls to pass a separate flag too. > That is the critical thing x86 doesn't do > that all the other platforms do. It doesn't do it because it's useless without a kernel rewrite. I frankly doubt it really works on Sparc :-) Maybe it could work there on a hypothetical rewritten kernel, but not today. -Andi ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 2:09 ` David Miller 2007-11-08 10:20 ` Andreas Schwab @ 2007-11-08 19:25 ` Denys Vlasenko 1 sibling, 0 replies; 37+ messages in thread From: Denys Vlasenko @ 2007-11-08 19:25 UTC (permalink / raw) To: David Miller; +Cc: paulus, akpm, lkml, linux-kernel, drepper, mtk-manpages On Thursday 08 November 2007 02:09, David Miller wrote: > > I think the best thing would be to ignore any error from copy_to_user > > and always return the number of clock ticks. We should call > > force_successful_syscall_return, and glibc on x86 should be taught not > > to interpret negative values as an error. > > > > POSIX doesn't require us to return an EFAULT error if the buf argument > > is bogus. If userspace does supply a bogus buf pointer, then either > > it will dereference it itself and get a segfault, or it won't > > dereference it, in which case it obviously didn't care about the > > values we tried to put there. > > > > If we try to return an error under some circumstances, then there is > > at least one 32-bit value for the number of ticks that will cause > > confusion. We can either change that value (or values) to some other > > value, which seems pretty bogus, or we can just decide not to return > > any errors. The latter seems to me to have no significant downside > > and to be the simplest solution to the problem. > > I agree with this analysis. > > The Linux man page for times() explicitly lists (clock_t) -1 as a > return value meaning error. > > So even if we did make some effort to return errors "properly" (via > force_successful_syscall_return() et al.) userspace would still be > screwed because (clock_t) -1 would be interpreted as an error. > > Actually I think this basically proves we cannot return (clock_t) -1 > ever because all existing userland (I'm not talking about inside > glibc, I'm talking about inside of applications) will see this as an > error. What error? I'd argue it's perfectly sane for application to assume that times() never fails. struct tms t; clock_t start = times(&t); ... clock_t end = times(&t); clock_t delta = end - start; The only error form kernel POV is that passed pointer can be invalid. But from application POV in the above example it cannot be true and if (start == -1) error("error in times!"); would be and exercise in wasting CPU cycles, producing dead code and feeding one's paranoia. > User applications have no other way to check for error. And in all realistic scenarios it doesn't need to. In this particular case, it makes sense to ignore standards and never return an error. If user indeed passed invalid pointer, just don't store anything there, but still return valid value. -- vda ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 1:53 ` Paul Mackerras 2007-11-08 2:09 ` David Miller @ 2007-11-08 3:07 ` Andrew Morton 2007-11-08 3:13 ` David Miller ` (2 more replies) 1 sibling, 3 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 3:07 UTC (permalink / raw) To: Paul Mackerras; +Cc: lkml, linux-kernel, drepper, mtk-manpages > On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <paulus@samba.org> wrote: > Andrew Morton writes: > > > Given all this stuff, the return value from sys_times() doesn't seem a > > particularly useful or reliable kernel interface. > > I think the best thing would be to ignore any error from copy_to_user > and always return the number of clock ticks. We should call > force_successful_syscall_return, and glibc on x86 should be taught not > to interpret negative values as an error. Changing glibc might be hard ;) > POSIX doesn't require us to return an EFAULT error if the buf argument > is bogus. If userspace does supply a bogus buf pointer, then either > it will dereference it itself and get a segfault, or it won't > dereference it, in which case it obviously didn't care about the > values we tried to put there. > > If we try to return an error under some circumstances, then there is > at least one 32-bit value for the number of ticks that will cause > confusion. We can either change that value (or values) to some other > value, which seems pretty bogus, or we can just decide not to return > any errors. The latter seems to me to have no significant downside > and to be the simplest solution to the problem. "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff. It appears that glibc treats all of 0x80000000-0xffffffff as an error. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 3:07 ` Andrew Morton @ 2007-11-08 3:13 ` David Miller 2007-11-08 5:15 ` Paul Mackerras 2007-11-08 4:59 ` Paul Mackerras 2007-11-08 19:27 ` Denys Vlasenko 2 siblings, 1 reply; 37+ messages in thread From: David Miller @ 2007-11-08 3:13 UTC (permalink / raw) To: akpm; +Cc: paulus, lkml, linux-kernel, drepper, mtk-manpages From: Andrew Morton <akpm@linux-foundation.org> Date: Wed, 7 Nov 2007 19:07:14 -0800 > It appears that glibc treats all of 0x80000000-0xffffffff as an > error. glibc treats it as an error if the system call returns with the carry condition code set. At least that's how I've understood it to work and at a minimum this is how it works on sparc, ppc, ia64, mips, etc. The error indication is being created by the system call return path in the kernel. It tests for values between -512 and 0, and marks those as errors unless force_successful_syscall() has been called. I can't see where x86 is doing this though, so perhaps for x86 glibc does make the negative value check. But I doubt it is checking the range 0x80000000-0xffffffff, otherwise mmap() would be busted. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 3:13 ` David Miller @ 2007-11-08 5:15 ` Paul Mackerras 2007-11-08 6:24 ` David Miller 0 siblings, 1 reply; 37+ messages in thread From: Paul Mackerras @ 2007-11-08 5:15 UTC (permalink / raw) To: David Miller; +Cc: akpm, lkml, linux-kernel, drepper, mtk-manpages David Miller writes: > I can't see where x86 is doing this though, so perhaps for x86 > glibc does make the negative value check. But I doubt it is > checking the range 0x80000000-0xffffffff, otherwise mmap() would > be busted. At least for the INTERNAL_SYSCALL macro in glibc, the error check is: #define INTERNAL_SYSCALL_ERROR_P(val, err) \ ((unsigned int) (val) >= 0xfffff001u) in sysdeps/unix/sysv/linux/i386/sysdep.h. Similarly the PSEUDO macro in that file does a cmpl $-4095,%eax to test for error. (There is also a PSEUDO_NOERRNO which doesn't test for error.) So the convention on (32-bit) x86 is that -4095 .. -1 are error values, and other values are successful return values. Paul. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 5:15 ` Paul Mackerras @ 2007-11-08 6:24 ` David Miller 0 siblings, 0 replies; 37+ messages in thread From: David Miller @ 2007-11-08 6:24 UTC (permalink / raw) To: paulus; +Cc: akpm, lkml, linux-kernel, drepper, mtk-manpages From: Paul Mackerras <paulus@samba.org> Date: Thu, 8 Nov 2007 16:15:51 +1100 > David Miller writes: > > > I can't see where x86 is doing this though, so perhaps for x86 > > glibc does make the negative value check. But I doubt it is > > checking the range 0x80000000-0xffffffff, otherwise mmap() would > > be busted. > > At least for the INTERNAL_SYSCALL macro in glibc, the error check is: > > #define INTERNAL_SYSCALL_ERROR_P(val, err) \ > ((unsigned int) (val) >= 0xfffff001u) > > in sysdeps/unix/sysv/linux/i386/sysdep.h. Similarly the PSEUDO macro > in that file does a cmpl $-4095,%eax to test for error. (There is also > a PSEUDO_NOERRNO which doesn't test for error.) > > So the convention on (32-bit) x86 is that -4095 .. -1 are error > values, and other values are successful return values. Thanks for figuring that out. Really there is no way to fix sys_times() return values universally. Each proposed solution either doesn't fix the problem, or adds a new failure mode. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 3:07 ` Andrew Morton 2007-11-08 3:13 ` David Miller @ 2007-11-08 4:59 ` Paul Mackerras 2007-11-08 5:20 ` Andrew Morton 2007-11-08 6:22 ` David Miller 2007-11-08 19:27 ` Denys Vlasenko 2 siblings, 2 replies; 37+ messages in thread From: Paul Mackerras @ 2007-11-08 4:59 UTC (permalink / raw) To: Andrew Morton; +Cc: lkml, linux-kernel, drepper, mtk-manpages Andrew Morton writes: > "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff. > It appears that glibc treats all of 0x80000000-0xffffffff as an error. Not on powerpc. On powerpc the error indication is carried separately in a condition register bit. So a force_successful_syscall_return() call will make glibc automatically do the right thing without any glibc changes on powerpc. Wrapping at 0x7fffffff will cause programs to see large negative deltas between successive calls when the wrap occurs. I can see that giving userspace fits. :) Paul. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 4:59 ` Paul Mackerras @ 2007-11-08 5:20 ` Andrew Morton 2007-11-08 5:36 ` Paul Mackerras 2007-11-08 6:25 ` David Miller 2007-11-08 6:22 ` David Miller 1 sibling, 2 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 5:20 UTC (permalink / raw) To: Paul Mackerras; +Cc: lkml, linux-kernel, drepper, mtk-manpages > On Thu, 8 Nov 2007 15:59:12 +1100 Paul Mackerras <paulus@samba.org> wrote: > Andrew Morton writes: > > > "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff. > > It appears that glibc treats all of 0x80000000-0xffffffff as an error. > > Not on powerpc. On powerpc the error indication is carried separately > in a condition register bit. So a force_successful_syscall_return() > call will make glibc automatically do the right thing without any > glibc changes on powerpc. OK > Wrapping at 0x7fffffff will cause programs to see large negative > deltas between successive calls when the wrap occurs. I can see that > giving userspace fits. :) > Yup. But userspace will already have a fit if either the start or end time advanced into the glibc-thought-that-was-an-error range. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 5:20 ` Andrew Morton @ 2007-11-08 5:36 ` Paul Mackerras 2007-11-08 6:12 ` Andrew Morton 2007-11-08 6:25 ` David Miller 1 sibling, 1 reply; 37+ messages in thread From: Paul Mackerras @ 2007-11-08 5:36 UTC (permalink / raw) To: Andrew Morton; +Cc: lkml, linux-kernel, drepper, mtk-manpages Andrew Morton writes: > Yup. But userspace will already have a fit if either the start or end time > advanced into the glibc-thought-that-was-an-error range. Not nearly as much of a fit. The effect on x86 is that values between -4095 and -1 are reported as -1, so the end-start difference will be out by less than 41 seconds. That's not nearly as dramatic as a difference of 21 million seconds (over 16 years). :) I really think that wrapping at 0x7fffffff makes the situation worse, not better. Paul. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 5:36 ` Paul Mackerras @ 2007-11-08 6:12 ` Andrew Morton 0 siblings, 0 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 6:12 UTC (permalink / raw) To: Paul Mackerras; +Cc: lkml, linux-kernel, drepper, mtk-manpages > On Thu, 8 Nov 2007 16:36:08 +1100 Paul Mackerras <paulus@samba.org> wrote: > Andrew Morton writes: > > > Yup. But userspace will already have a fit if either the start or end time > > advanced into the glibc-thought-that-was-an-error range. > > Not nearly as much of a fit. The effect on x86 is that values between > -4095 and -1 are reported as -1, so the end-start difference will be > out by less than 41 seconds. That's not nearly as dramatic as a > difference of 21 million seconds (over 16 years). :) > > I really think that wrapping at 0x7fffffff makes the situation worse, > not better. > Sure. So we need to do what you say: never return an error from sys_times() and change glibc to not perform error-interpretation on sys_times() return values and recommend that people bypass libc and go direct to the syscall so they'll work correctly on older glibc. Lovely. I wonder what happens with things like F_GETOWN, shmat() and lseek(/dev/mem) on x86 (things which use force_successful_syscall_return()). According to the comment in include/linux/ptrace.h, glibc should be special-casing these. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 5:20 ` Andrew Morton 2007-11-08 5:36 ` Paul Mackerras @ 2007-11-08 6:25 ` David Miller 2007-11-08 7:09 ` Andrew Morton 2007-11-08 8:53 ` Paul Mackerras 1 sibling, 2 replies; 37+ messages in thread From: David Miller @ 2007-11-08 6:25 UTC (permalink / raw) To: akpm; +Cc: paulus, lkml, linux-kernel, drepper, mtk-manpages From: Andrew Morton <akpm@linux-foundation.org> Date: Wed, 7 Nov 2007 21:20:05 -0800 > Yup. But userspace will already have a fit if either the start or end time > advanced into the glibc-thought-that-was-an-error range. On x86 only. We could use force_successful_syscall_return() to make sure the condition codes get set correctly on other platforms. But even in that case we'd still be broken when the return value is exactly -1 and that's what the application is going to compare against to test for errors. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 6:25 ` David Miller @ 2007-11-08 7:09 ` Andrew Morton 2007-11-08 7:14 ` David Miller 2007-11-08 8:53 ` Paul Mackerras 1 sibling, 1 reply; 37+ messages in thread From: Andrew Morton @ 2007-11-08 7:09 UTC (permalink / raw) To: David Miller; +Cc: paulus, lkml, linux-kernel, drepper, mtk-manpages > On Wed, 07 Nov 2007 22:25:30 -0800 (PST) David Miller <davem@davemloft.net> wrote: > From: Andrew Morton <akpm@linux-foundation.org> > Date: Wed, 7 Nov 2007 21:20:05 -0800 > > > Yup. But userspace will already have a fit if either the start or end time > > advanced into the glibc-thought-that-was-an-error range. > > On x86 only. We could use force_successful_syscall_return() > to make sure the condition codes get set correctly on > other platforms. > > But even in that case we'd still be broken when the return > value is exactly -1 and that's what the application is going > to compare against to test for errors. I don't think that's a big problem? This syscall can (oddly) return any 32-bit (64-bit) number and a smart application developer (after saying wtf) would realise that he just can't check for errors and have correctly working code. Then again, if he was smart he just wouldn't use times(2)'s return value for anything. But what is the alternative? I don't think there is one, apart from much saner things like gettimeofday(). ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 7:09 ` Andrew Morton @ 2007-11-08 7:14 ` David Miller 0 siblings, 0 replies; 37+ messages in thread From: David Miller @ 2007-11-08 7:14 UTC (permalink / raw) To: akpm; +Cc: paulus, lkml, linux-kernel, drepper, mtk-manpages From: Andrew Morton <akpm@linux-foundation.org> Date: Wed, 7 Nov 2007 23:09:16 -0800 > I don't think that's a big problem? This syscall can (oddly) return any > 32-bit (64-bit) number and a smart application developer (after saying wtf) > would realise that he just can't check for errors and have correctly > working code. > > Then again, if he was smart he just wouldn't use times(2)'s return value > for anything. But what is the alternative? I don't think there is one, > apart from much saner things like gettimeofday(). You and I would say "wtf", but the manual states what it does: On error, (clock_t) -1 is returned, and errno is set appro- priately. And I think this (obviously bogus) convention is something we are really stuck with. Another awful aspect of this is that glibc is going to overwrite 'errno' for this return value range. That will likely cause more application misbehavior than some of the other side effects we've been discussing. In short we have two problems: 1) glibc thinks -4096 < x < 0 is an error, and will write this value into errno and return -1 to the application 2) the manual states that -1 means error ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 6:25 ` David Miller 2007-11-08 7:09 ` Andrew Morton @ 2007-11-08 8:53 ` Paul Mackerras 1 sibling, 0 replies; 37+ messages in thread From: Paul Mackerras @ 2007-11-08 8:53 UTC (permalink / raw) To: David Miller; +Cc: akpm, lkml, linux-kernel, drepper, mtk-manpages David Miller writes: > On x86 only. We could use force_successful_syscall_return() > to make sure the condition codes get set correctly on > other platforms. > > But even in that case we'd still be broken when the return > value is exactly -1 and that's what the application is going > to compare against to test for errors. We could special-case that and turn it into 0. That would introduce a 0.01 second blip, which would be better than a 41 second window for bad behaviour like we have at the moment. It's also possible that many applications already don't check for errors. For example, glibc deliberately doesn't check for errors when it calls __times in the clock() implementation. There is a comment in sysdeps/unix/sysv/linux/clock.c that says this: /* We don't check for errors here. The only error the kernel returns is EFAULT if the value cannot be written to the struct we pass a pointer to. Otherwise the kernel returns an `unsigned long' value which is the number of jiffies since system start. But this number can be negative (when read as `long') when the system is up for some time. Ignoring errors should therefore have no negative impacts but solve the problem. */ __times (&buf); Paul. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 4:59 ` Paul Mackerras 2007-11-08 5:20 ` Andrew Morton @ 2007-11-08 6:22 ` David Miller 1 sibling, 0 replies; 37+ messages in thread From: David Miller @ 2007-11-08 6:22 UTC (permalink / raw) To: paulus; +Cc: akpm, lkml, linux-kernel, drepper, mtk-manpages From: Paul Mackerras <paulus@samba.org> Date: Thu, 8 Nov 2007 15:59:12 +1100 > Not on powerpc. On powerpc the error indication is carried separately > in a condition register bit. So a force_successful_syscall_return() > call will make glibc automatically do the right thing without any > glibc changes on powerpc. It still won't fix the problem. When the return value is (clock_t) -1, all the force_successful_syscall_return() calls and glibc condition codes checks in the world are not going to fix the application code which checks for error using -1. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 3:07 ` Andrew Morton 2007-11-08 3:13 ` David Miller 2007-11-08 4:59 ` Paul Mackerras @ 2007-11-08 19:27 ` Denys Vlasenko 2 siblings, 0 replies; 37+ messages in thread From: Denys Vlasenko @ 2007-11-08 19:27 UTC (permalink / raw) To: Andrew Morton; +Cc: Paul Mackerras, lkml, linux-kernel, drepper, mtk-manpages On Thursday 08 November 2007 03:07, Andrew Morton wrote: > > On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <paulus@samba.org> wrote: > > Andrew Morton writes: > > > > > Given all this stuff, the return value from sys_times() doesn't seem a > > > particularly useful or reliable kernel interface. > > > > I think the best thing would be to ignore any error from copy_to_user > > and always return the number of clock ticks. We should call > > force_successful_syscall_return, and glibc on x86 should be taught not > > to interpret negative values as an error. > > Changing glibc might be hard ;) > > > POSIX doesn't require us to return an EFAULT error if the buf argument > > is bogus. If userspace does supply a bogus buf pointer, then either > > it will dereference it itself and get a segfault, or it won't > > dereference it, in which case it obviously didn't care about the > > values we tried to put there. > > > > If we try to return an error under some circumstances, then there is > > at least one 32-bit value for the number of ticks that will cause > > confusion. We can either change that value (or values) to some other > > value, which seems pretty bogus, or we can just decide not to return > > any errors. The latter seems to me to have no significant downside > > and to be the simplest solution to the problem. > > "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff. > It appears that glibc treats all of 0x80000000-0xffffffff as an error. The best solution is to change the kernel to never return an error and to change glibc to never treat return as an error. -- vda ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-07 23:28 ` Andrew Morton 2007-11-08 0:18 ` Andrew Morton @ 2007-11-08 0:50 ` David Miller 2007-11-08 1:13 ` Andrew Morton 2007-11-08 6:00 ` David Brown 2 siblings, 1 reply; 37+ messages in thread From: David Miller @ 2007-11-08 0:50 UTC (permalink / raw) To: akpm; +Cc: lkml, linux-kernel, drepper, mtk-manpages From: Andrew Morton <akpm@linux-foundation.org> Date: Wed, 7 Nov 2007 15:28:33 -0800 > Perhaps this is a bug in glibc: it is interpreting the times() return value > in the same way as other syscalls. The problem is more likely that we are failing to invoke force_successful_syscall_return() here. Otherwise the syscall return path interprets negative values as errors, and sets the cpu condition codes. And that is what userspace is actually checking for to determine if there is an error or not. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-08 0:50 ` David Miller @ 2007-11-08 1:13 ` Andrew Morton 0 siblings, 0 replies; 37+ messages in thread From: Andrew Morton @ 2007-11-08 1:13 UTC (permalink / raw) To: David Miller; +Cc: lkml, linux-kernel, drepper, mtk-manpages > On Wed, 07 Nov 2007 16:50:22 -0800 (PST) David Miller <davem@davemloft.net> wrote: > From: Andrew Morton <akpm@linux-foundation.org> > Date: Wed, 7 Nov 2007 15:28:33 -0800 > > > Perhaps this is a bug in glibc: it is interpreting the times() return value > > in the same way as other syscalls. > > The problem is more likely that we are failing to > invoke force_successful_syscall_return() here. > > Otherwise the syscall return path interprets negative > values as errors, and sets the cpu condition codes. > > And that is what userspace is actually checking for > to determine if there is an error or not. hm, I'd forgotten about that. It seems to be a no-op on lots of architectures? ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: compat_sys_times() bogus until jiffies >= 0. 2007-11-07 23:28 ` Andrew Morton 2007-11-08 0:18 ` Andrew Morton 2007-11-08 0:50 ` David Miller @ 2007-11-08 6:00 ` David Brown 2 siblings, 0 replies; 37+ messages in thread From: David Brown @ 2007-11-08 6:00 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, Ulrich Drepper, Michael Kerrisk On Wed, Nov 07, 2007 at 03:28:33PM -0800, Andrew Morton wrote: >> On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <lkml@davidb.org> wrote: >> will return '-1' to user space and set the negated clock_t value to errno. >> >> At minimum, perhaps it should return a sane errno value. > >RETURN VALUE > times() returns the number of clock ticks that have elapsed since an > arbitrary point in the past. For Linux 2.4 and earlier this point is > the moment the system was booted. Since Linux 2.6, this point is > (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot > time. The return value may overflow the possible range of type > clock_t. On error, (clock_t) -1 is returned, and errno is set appro- > priately. The strange -1 behavior is enshrined in history. I think a better answer is to tell people to use getrusage() if they want a return result without this problem. Adding INITIAL_JIFFIES will fix the case where an embedded system is booted up to run a test and then shut down, and the mask, although it causes discontinuities periodically at least moves them away from the early boot. INITIAL_JIFFIES was a good idea, but it is probably best to keep it inside of the kernel. David Brown ^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2007-12-22 12:46 UTC | newest] Thread overview: 37+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-11-07 22:47 compat_sys_times() bogus until jiffies >= 0 David Brown 2007-11-07 23:28 ` Andrew Morton 2007-11-08 0:18 ` Andrew Morton 2007-11-08 0:54 ` Andreas Schwab 2007-11-08 1:17 ` Andrew Morton 2007-11-08 1:53 ` Paul Mackerras 2007-11-08 2:09 ` David Miller 2007-11-08 10:20 ` Andreas Schwab 2007-11-08 14:42 ` Chris Friesen 2007-11-09 18:20 ` Ulrich Drepper 2007-12-20 11:36 ` Michael Kerrisk 2007-12-20 11:51 ` David Miller 2007-12-22 0:42 ` Andi Kleen 2007-12-22 1:41 ` David Miller 2007-12-22 1:45 ` David Miller 2007-12-22 1:53 ` Andi Kleen 2007-12-22 4:36 ` David Miller 2007-12-22 12:47 ` Andi Kleen 2007-12-22 1:49 ` Andi Kleen 2007-11-08 19:25 ` Denys Vlasenko 2007-11-08 3:07 ` Andrew Morton 2007-11-08 3:13 ` David Miller 2007-11-08 5:15 ` Paul Mackerras 2007-11-08 6:24 ` David Miller 2007-11-08 4:59 ` Paul Mackerras 2007-11-08 5:20 ` Andrew Morton 2007-11-08 5:36 ` Paul Mackerras 2007-11-08 6:12 ` Andrew Morton 2007-11-08 6:25 ` David Miller 2007-11-08 7:09 ` Andrew Morton 2007-11-08 7:14 ` David Miller 2007-11-08 8:53 ` Paul Mackerras 2007-11-08 6:22 ` David Miller 2007-11-08 19:27 ` Denys Vlasenko 2007-11-08 0:50 ` David Miller 2007-11-08 1:13 ` Andrew Morton 2007-11-08 6:00 ` David Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox