All of lore.kernel.org
 help / color / mirror / Atom feed
From: john stultz <johnstul@us.ibm.com>
To: Tomas Janousek <tjanouse@redhat.com>
Cc: linux-kernel@vger.kernel.org, tsmetana@redhat.com, riel@redhat.com
Subject: Re: Broken process startup times after suspend (regression)
Date: Thu, 03 May 2007 12:59:20 -0700	[thread overview]
Message-ID: <1178222360.6085.14.camel@localhost.localdomain> (raw)
In-Reply-To: <20070503150320.GA21471@redhat.com>

On Thu, 2007-05-03 at 17:03 +0200, Tomas Janousek wrote:
> Hi,
> 
> the commits
>   411187fb05cd11676b0979d9fbf3291db69dbce2 (GTOD: persistent clock support)
>   c1d370e167d66b10bca3b602d3740405469383de (i386: use GTOD persistent clock
>     support)
> caused a change in the way uptime is measured and introduced a regression such
> that it's no longer possible to obtain a correct process start time or the
> system boot time.
> 
> These two commits make the monotonic time not jump by hudreds of seconds after
> the kernel resumes from suspend, but they achieve it by moving the
> wall_to_monotonic offset, which is used for all boot time / uptime
> calculations. It's not possible to read the real boot time / process start
> time then.

Indeed. The monotonic clock's behavior around suspend and resume is
poorly defined. When we increased it, folks didn't like the fact that
uptime would increase while a system was suspended. 

> I was thinking about a solution and I am posting the least intrusive one I
> found. I add a total_sleep_time variable which is to be added to
> wall_to_monotonic when one wants the proper boot time. I do by no means say
> it's the best one, I expect comments.

While I'd prefer wrapping it in a interface and keeping it in
timekeeping (rather then exporting a collection of values that should be
combined in varying ways), I think this approach is probably the best.
Rather then forcing one behavior or the other we can provide both smooth
monotonic and monotonic w/ sleep.

> This requires a patch to procps that makes it use /proc/stat's btime instead
> of "now - uptime". This was written by Tomas Smetana and I'm attaching it as
> well. (It was posted upstream as it fixes another bug in ps as well.) Or we
> could just make the /proc/uptime contain "now - btime" like it used to before
> those two commits.
> 
> ---
>  fs/proc/proc_misc.c  |    2 +-
>  include/linux/time.h |    1 +
>  kernel/fork.c        |    1 +
>  kernel/timer.c       |    6 ++++++
>  4 files changed, 9 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/proc/proc_misc.c b/fs/proc/proc_misc.c
> index e2c4c0a..a29309e 100644
> --- a/fs/proc/proc_misc.c
> +++ b/fs/proc/proc_misc.c
> @@ -456,7 +456,7 @@ static int show_stat(struct seq_file *p, void *v)
> 
>  	user = nice = system = idle = iowait =
>  		irq = softirq = steal = cputime64_zero;
> -	jif = - wall_to_monotonic.tv_sec;
> +	jif = - (wall_to_monotonic.tv_sec + total_sleep_time);
>  	if (wall_to_monotonic.tv_nsec)
>  		--jif;

As I said, could we wrap this in a function? 

Maybe something like get_monotonicwithsleep_time() (feel free to come up
with a better name :).

> diff --git a/include/linux/time.h b/include/linux/time.h
> index 8ea8dea..cb87616 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -90,6 +90,7 @@ static inline struct timespec timespec_sub(struct timespec lhs,
> 
>  extern struct timespec xtime;
>  extern struct timespec wall_to_monotonic;
> +extern unsigned long total_sleep_time;
>  extern seqlock_t xtime_lock __attribute__((weak));
> 
>  extern unsigned long read_persistent_clock(void);
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 6af959c..c051f17 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1056,6 +1056,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> 
>  	p->lock_depth = -1;		/* -1 = no lock */
>  	do_posix_clock_monotonic_gettime(&p->start_time);
> +	p->start_time.tv_sec += total_sleep_time;
>  	p->security = NULL;
>  	p->io_context = NULL;
>  	p->io_wait = NULL;
> diff --git a/kernel/timer.c b/kernel/timer.c
> index dd6c2c1..13eb201 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -759,9 +759,13 @@ unsigned long next_timer_interrupt(void)
>   * at zero at system boot time, so wall_to_monotonic will be negative,
>   * however, we will ALWAYS keep the tv_nsec part positive so we can use
>   * the usual normalization.
> + *
> + * One needs to add total_sleep_time to wall_to_monotonic to get the real boot
> + * time.
>   */
>  struct timespec xtime __attribute__ ((aligned (16)));
>  struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
> +unsigned long total_sleep_time;
> 
>  EXPORT_SYMBOL(xtime);
> 
> @@ -975,6 +979,7 @@ void __init timekeeping_init(void)
>  	xtime.tv_nsec = 0;
>  	set_normalized_timespec(&wall_to_monotonic,
>  		-xtime.tv_sec, -xtime.tv_nsec);
> +	total_sleep_time = 0;
> 
>  	write_sequnlock_irqrestore(&xtime_lock, flags);
>  }
> @@ -1004,6 +1009,7 @@ static int timekeeping_resume(struct sys_device *dev)
> 
>  		xtime.tv_sec += sleep_length;
>  		wall_to_monotonic.tv_sec -= sleep_length;
> +		total_sleep_time += sleep_length;
>  	}
>  	/* re-base the last cycle value */
>  	clock->cycle_last = clocksource_read(clock);

Otherwise it looks good.

Thanks for bringing this up!
-john



  reply	other threads:[~2007-05-03 19:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-03 15:03 Broken process startup times after suspend (regression) Tomas Janousek
2007-05-03 19:59 ` john stultz [this message]
2007-05-04 16:13   ` Tomas Janousek
2007-05-04 18:18     ` john stultz
2007-05-04 19:09       ` Tomas Janousek
2007-05-10 17:10       ` [PATCH] Introduce boot based time Tomas Janousek
2007-05-10 17:10         ` [PATCH] Use boot based time for process start time and boot time in /proc Tomas Janousek
2007-05-10 18:44           ` john stultz
2007-05-10 23:40           ` Andrew Morton
2007-05-11  8:45             ` Tomas Janousek
2007-05-11 19:51               ` Andrew Morton
2007-05-11 23:37                 ` Tomas Janousek
2007-05-11  8:46             ` Tomas Janousek
2007-05-10 17:10         ` [PATCH] Use boot based time for uptime " Tomas Janousek
2007-05-10 18:47           ` john stultz
2007-05-10 18:42         ` [PATCH] Introduce boot based time john stultz
2007-05-10 19:48         ` Ingo Oeser
2007-05-10 20:00           ` Tomas Janousek
2007-05-10 20:02           ` john stultz
2007-05-10 22:22             ` Ingo Oeser
2007-05-10 23:40         ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2007-05-06  3:16 Broken process startup times after suspend (regression) Albert Cahalan
2007-05-10 11:36 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1178222360.6085.14.camel@localhost.localdomain \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@redhat.com \
    --cc=tjanouse@redhat.com \
    --cc=tsmetana@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.