linuxppc-dev.lists.ozlabs.org archive mirror
From: Arnd Bergmann <arnd@arndb.de>
To: y2038@lists.linaro.org
Cc: pang.xunlei@linaro.org, Peter Zijlstra <peterz@infradead.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Paul Mackerras <paulus@samba.org>,
	cl@linux.com, Ingo Molnar <mingo@kernel.org>,
	heenasirwani@gmail.com, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, rafael.j.wysocki@intel.com,
	ahh@google.com, Frederic Weisbecker <fweisbec@gmail.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	pjt@google.com, riel@redhat.com, richardcochran@gmail.com,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	John Stultz <john.stultz@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	rth@twiddle.net, Baolin Wang <baolin.wang@linaro.org>,
	gregkh@linuxfoundation.org, LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	linux390@de.ibm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [Y2038] [PATCH 04/11] posix timers:Introduce the 64bit methods with timespec64 type for k_clock structure
Date: Wed, 22 Apr 2015 13:07:44 +0200
Message-ID: <2233518.Z2Q4dpO62C@wuerfel>
In-Reply-To: <alpine.DEB.2.11.1504220946460.13914@nanos>

On Wednesday 22 April 2015 10:45:23 Thomas Gleixner wrote:
> On Tue, 21 Apr 2015, Thomas Gleixner wrote:

> So we could save one translation step if we implement new syscalls
> which have a scalar nsec interface instead of the timespec/timeval
> cruft and let user space do the translation to whatever it wants.
> 
> So
> 
> sys_clock_nanosleep(const clockid_t which_clock, int flags,
> 		    const struct timespec __user *expires,
> 		    struct timespec __user *remainder)
> 
> would get the new syscall variant:
> 
> sys_clock_nanosleep_ns(const clockid_t which_clock, int flags,
> 		       const s64 expires, s64 __user *remainder)

As you might expect, there are a number of complications with this
approach:

- John Stultz likes to point out that it's easier to do one change
  at a time, so extending the interface to 64-bit has less potential
  of breaking things than a more fundamental change. I think it's
  useful to drop a lot of the syscalls when a more modern version
  is around (e.g. let libc implement usleep and nanosleep through
  clock_nanosleep), but keep the syscalls as close to the known-working
  64-bit versions as we can.
- The inode timestamp related syscalls (stat, utimes and variants
  thereof) require the full range of time64_t and cannot use ktime_t.
- Converting between timespec types of different size is cheap,
  converting timespec to ktime_t is still relatively cheap, but
  converting ktime_t to timespec is rather expensive: without native
  64-bit arithmetic it takes at least eight 32-bit multiplies, plus
  a few shifts and additions.
- ioctls that pass a timespec need to keep doing that or would require
  a source-level change in user space instead of recompiling.
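
To make the conversion-cost point concrete, here is a rough user-space
sketch of the two directions. The names struct ts64, ts64_to_ns() and
ns_to_ts64() are made up for illustration and are not the kernel's
helpers; the sketch also assumes the value fits in a signed 64-bit
nanosecond count.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for a 64-bit timespec, not the kernel definition. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;	/* normalized: 0 .. NSEC_PER_SEC - 1 */
};

/*
 * timespec -> scalar nanoseconds: one multiply and one add.
 * The caller must make sure the result fits into 64 bits.
 */
static int64_t ts64_to_ns(struct ts64 ts)
{
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
	return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/*
 * Scalar nanoseconds -> timespec: needs a 64-bit division and a
 * remainder, which is the expensive part on 32-bit hardware.
 */
static struct ts64 ns_to_ts64(int64_t ns)
{
	struct ts64 ts;

	ts.tv_sec = ns / NSEC_PER_SEC;
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	if (ts.tv_nsec < 0) {
		/* C division truncates toward zero; renormalize. */
		ts.tv_sec--;
		ts.tv_nsec += NSEC_PER_SEC;
	}
	return ts;
}
```

On a machine without a 64-bit divide instruction, the division in
ns_to_ts64() is typically turned into a multiply-by-reciprocal
sequence or a library call, which is where the extra multiplies
come from.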

> I personally would welcome such an interface as it makes user space
> programming simpler. Just (re)arming a periodic nanosleep based on
> absolute expiry time is horribly stupid today:
> 
> 	 struct timespec expires;
> 	 ....
> 	 while (1) {
> 	       expires.tv_nsec += period.tv_nsec;
> 	       expires.tv_sec += period.tv_sec;
> 	       normalize_timespec(&expires);
> 	       sys_clock_nanosleep(CLOCK_ID, ABS, &expires, NULL);
> 	 }
> 
> So with a scalar interface this would reduce to:
> 
> 	 s64 expires;
> 	 ....
> 	 while (1) {
> 	       expires += period;
> 	       /* expires is passed by value in the proposed prototype */
> 	       sys_clock_nanosleep_ns(CLOCK_ID, ABS, expires, NULL);
> 	 }
> 
> There is a difference both in text and storage size plus the avoidance
> of the two translation steps (one translation step on 64bit).
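
For comparison, the timespec-based loop quoted above looks roughly
like this when written against the existing POSIX clock_nanosleep()
interface. This is only a sketch: timespec_add() and periodic_sleep()
are names invented here, the period is assumed to be normalized, and
CLOCK_MONOTONIC is picked arbitrarily.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Add a normalized period to a normalized timespec, keeping it normalized. */
static void timespec_add(struct timespec *t, const struct timespec *period)
{
	t->tv_sec += period->tv_sec;
	t->tv_nsec += period->tv_nsec;
	if (t->tv_nsec >= NSEC_PER_SEC) {
		t->tv_nsec -= NSEC_PER_SEC;
		t->tv_sec++;
	}
}

/*
 * Sleep for 'iterations' periods using absolute expiry times.
 * Returns 0 on success or an error number.
 */
static int periodic_sleep(const struct timespec *period, int iterations)
{
	struct timespec expires;
	int i, err;

	if (clock_gettime(CLOCK_MONOTONIC, &expires) != 0)
		return errno;

	for (i = 0; i < iterations; i++) {
		timespec_add(&expires, period);
		/* Absolute sleeps can simply be restarted after a signal. */
		do {
			err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					      &expires, NULL);
		} while (err == EINTR);
		if (err)
			return err;
	}
	return 0;
}
```

With a scalar nanosecond interface, timespec_add() would collapse to
a single 64-bit addition in the caller.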

We should probably look at it separately for each syscall. It's
quite possible that we find a number of them for which it helps
and others for which it hurts, so we need to see the big picture.

There are also a few other calls that will never need 64-bit
time_t because the range is limited by the need to only ever
pass relative timeouts (select, poll, io_getevents, recvmmsg,
clock_getres, rt_sigtimedwait, sched_rr_get_interval, getrusage,
waitid, semtimedop, sysinfo), so we could actually leave them
using a 32-bit structure and have the libc do the conversion.
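
A libc-side conversion for such relative timeouts could look roughly
like the sketch below. The struct and function names are hypothetical,
and clamping oversized values to the largest 32-bit timeout (about 68
years) is an assumption made for this example, not something that has
been agreed on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts: what the application sees vs. what the kernel takes. */
struct rel_ts64 { int64_t tv_sec; int64_t tv_nsec; };
struct rel_ts32 { int32_t tv_sec; int32_t tv_nsec; };

/*
 * Convert a normalized, non-negative relative timeout to the 32-bit
 * layout. Values that do not fit are clamped to the maximum, which is
 * an assumed policy for this sketch.
 */
static struct rel_ts32 rel_ts64_to_ts32(struct rel_ts64 in)
{
	struct rel_ts32 out;

	assert(in.tv_sec >= 0);
	assert(in.tv_nsec >= 0 && in.tv_nsec < 1000000000);

	if (in.tv_sec > INT32_MAX) {
		out.tv_sec = INT32_MAX;
		out.tv_nsec = 999999999;
	} else {
		out.tv_sec = (int32_t)in.tv_sec;
		out.tv_nsec = (int32_t)in.tv_nsec;
	}
	return out;
}
```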

> I know that this is non portable, but OTOH if I look at the non
> portable mechanisms which are used by data bases, java VMs and other
> apps which exist to squeeze the last cycles out of the system, there
> is certainly some value to that.
> 
> The portable/spec conforming apps can still use the user space
> assisted translated timespec/timeval mechanisms.
> 
> There is one caveat though: sys_clock_gettime and sys_gettimeofday
> will still need a syscall_timespec64 variant. We have no double
> translation steps there because we maintain the timespec
> representation in the timekeeping code for performance reasons to
> avoid the division in the syscall interface. But everything else can
> do nicely without the timespec cruft.
> 
> We really should talk to libc folks and high performance users about
> this before blindly adding a gazillion of new timespec64 based
> interfaces.

I've started a list of affected syscalls at
https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T_0YiQwis/edit?usp=sharing

I'm still adding more calls and descriptions; let me know if you want
edit permissions.

	Arnd
