public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: john stultz <johnstul@us.ibm.com>
To: Tomas Janousek <tjanouse@redhat.com>
Cc: linux-kernel@vger.kernel.org, tsmetana@redhat.com, riel@redhat.com
Subject: Re: Broken process startup times after suspend (regression)
Date: Thu, 03 May 2007 12:59:20 -0700	[thread overview]
Message-ID: <1178222360.6085.14.camel@localhost.localdomain> (raw)
In-Reply-To: <20070503150320.GA21471@redhat.com>

On Thu, 2007-05-03 at 17:03 +0200, Tomas Janousek wrote:
> Hi,
> 
> the commits
>   411187fb05cd11676b0979d9fbf3291db69dbce2 (GTOD: persistent clock support)
>   c1d370e167d66b10bca3b602d3740405469383de (i386: use GTOD persistent clock
>     support)
> caused a change in the way uptime is measured and introduced a regression such
> that it's no longer possible to obtain a correct process start time or the
> system boot time.
> 
> These two commits make the monotonic time not jump by hudreds of seconds after
> the kernel resumes from suspend, but they achieve it by moving the
> wall_to_monotonic offset, which is used for all boot time / uptime
> calculations. It's not possible to read the real boot time / process start
> time then.

Indeed. The monotonic clock's behavior around suspend and resume is
poorly defined. When we increased it, folks didn't like the fact that
uptime would increase while a system was suspended. 

> I was thinking about a solution and I am posting the least intrusive one I
> found. I add a total_sleep_time variable which is to be added to
> wall_to_monotonic when one wants the proper boot time. I do by no means say
> it's the best one, I expect comments.

While I'd prefer wrapping it in a interface and keeping it in
timekeeping (rather then exporting a collection of values that should be
combined in varying ways), I think this approach is probably the best.
Rather then forcing one behavior or the other we can provide both smooth
monotonic and monotonic w/ sleep.

> This requires a patch to procps that makes it use /proc/stat's btime instead
> of "now - uptime". This was written by Tomas Smetana and I'm attaching it as
> well. (It was posted upstream as it fixes another bug in ps as well.) Or we
> could just make the /proc/uptime contain "now - btime" like it used to before
> those two commits.
> 
> ---
>  fs/proc/proc_misc.c  |    2 +-
>  include/linux/time.h |    1 +
>  kernel/fork.c        |    1 +
>  kernel/timer.c       |    6 ++++++
>  4 files changed, 9 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/proc/proc_misc.c b/fs/proc/proc_misc.c
> index e2c4c0a..a29309e 100644
> --- a/fs/proc/proc_misc.c
> +++ b/fs/proc/proc_misc.c
> @@ -456,7 +456,7 @@ static int show_stat(struct seq_file *p, void *v)
> 
>  	user = nice = system = idle = iowait =
>  		irq = softirq = steal = cputime64_zero;
> -	jif = - wall_to_monotonic.tv_sec;
> +	jif = - (wall_to_monotonic.tv_sec + total_sleep_time);
>  	if (wall_to_monotonic.tv_nsec)
>  		--jif;

As I said, could we wrap this in a function? 

Maybe something like get_monotonicwithsleep_time() (feel free to come up
with a better name :).

> diff --git a/include/linux/time.h b/include/linux/time.h
> index 8ea8dea..cb87616 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -90,6 +90,7 @@ static inline struct timespec timespec_sub(struct timespec lhs,
> 
>  extern struct timespec xtime;
>  extern struct timespec wall_to_monotonic;
> +extern unsigned long total_sleep_time;
>  extern seqlock_t xtime_lock __attribute__((weak));
> 
>  extern unsigned long read_persistent_clock(void);
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 6af959c..c051f17 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1056,6 +1056,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> 
>  	p->lock_depth = -1;		/* -1 = no lock */
>  	do_posix_clock_monotonic_gettime(&p->start_time);
> +	p->start_time.tv_sec += total_sleep_time;
>  	p->security = NULL;
>  	p->io_context = NULL;
>  	p->io_wait = NULL;
> diff --git a/kernel/timer.c b/kernel/timer.c
> index dd6c2c1..13eb201 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -759,9 +759,13 @@ unsigned long next_timer_interrupt(void)
>   * at zero at system boot time, so wall_to_monotonic will be negative,
>   * however, we will ALWAYS keep the tv_nsec part positive so we can use
>   * the usual normalization.
> + *
> + * One needs to add total_sleep_time to wall_to_monotonic to get the real boot
> + * time.
>   */
>  struct timespec xtime __attribute__ ((aligned (16)));
>  struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
> +unsigned long total_sleep_time;
> 
>  EXPORT_SYMBOL(xtime);
> 
> @@ -975,6 +979,7 @@ void __init timekeeping_init(void)
>  	xtime.tv_nsec = 0;
>  	set_normalized_timespec(&wall_to_monotonic,
>  		-xtime.tv_sec, -xtime.tv_nsec);
> +	total_sleep_time = 0;
> 
>  	write_sequnlock_irqrestore(&xtime_lock, flags);
>  }
> @@ -1004,6 +1009,7 @@ static int timekeeping_resume(struct sys_device *dev)
> 
>  		xtime.tv_sec += sleep_length;
>  		wall_to_monotonic.tv_sec -= sleep_length;
> +		total_sleep_time += sleep_length;
>  	}
>  	/* re-base the last cycle value */
>  	clock->cycle_last = clocksource_read(clock);

Otherwise it looks good.

Thanks for bringing this up!
-john



  reply	other threads:[~2007-05-03 19:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-03 15:03 Broken process startup times after suspend (regression) Tomas Janousek
2007-05-03 19:59 ` john stultz [this message]
2007-05-04 16:13   ` Tomas Janousek
2007-05-04 18:18     ` john stultz
2007-05-04 19:09       ` Tomas Janousek
2007-05-10 17:10       ` [PATCH] Introduce boot based time Tomas Janousek
2007-05-10 17:10         ` [PATCH] Use boot based time for process start time and boot time in /proc Tomas Janousek
2007-05-10 18:44           ` john stultz
2007-05-10 23:40           ` Andrew Morton
2007-05-11  8:45             ` Tomas Janousek
2007-05-11 19:51               ` Andrew Morton
2007-05-11 23:37                 ` Tomas Janousek
2007-05-11  8:46             ` Tomas Janousek
2007-05-10 17:10         ` [PATCH] Use boot based time for uptime " Tomas Janousek
2007-05-10 18:47           ` john stultz
2007-05-10 18:42         ` [PATCH] Introduce boot based time john stultz
2007-05-10 19:48         ` Ingo Oeser
2007-05-10 20:00           ` Tomas Janousek
2007-05-10 20:02           ` john stultz
2007-05-10 22:22             ` Ingo Oeser
2007-05-10 23:40         ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2007-05-06  3:16 Broken process startup times after suspend (regression) Albert Cahalan
2007-05-10 11:36 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1178222360.6085.14.camel@localhost.localdomain \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@redhat.com \
    --cc=tjanouse@redhat.com \
    --cc=tsmetana@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox