public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: John Stultz <johnstul@us.ibm.com>
To: Prarit Bhargava <prarit@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Salman Qazi <sqazi@google.com>,
	stable@kernel.org
Subject: Re: [PATCH] clocksource, prevent overflow in clocksource_cyc2ns
Date: Thu, 05 Apr 2012 09:45:48 -0700	[thread overview]
Message-ID: <4F7DCC3C.70208@us.ibm.com> (raw)
In-Reply-To: <4F7D8FA1.1010107@redhat.com>

On 04/05/2012 05:27 AM, Prarit Bhargava wrote:
>> So what kernel version are you using?
> I retested using top of the linux.git tree, running
>
> echo 1>  /proc/sys/kernel/sysrq
> for i in `seq 10000`; do sleep 1000&  done
> echo t>  /proc/sysrq-trigger
>
> and I no longer see a problem.  However, if I increase the number of threads to
> 1000/cpu I get
>
> Clocksource %s unstable (delta = -429565427)
> Clocksource switching to hpet

Yea, so this is with the 12minute + stall starving the system from irqs? 
Again, its all a matter of degree.

>
>> to narrow down if you're  problem  is currently present in mainline or only in
>> older kernels, as that will help us find the proper fix.
> If I hack in (sorry for the cut-and-paste)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index c958338..f38b8d0 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -279,11 +279,16 @@ static void clocksource_watchdog(unsigned long data)
>                          continue;
>                  }
>
> -               wd_nsec = clocksource_cyc2ns((wdnow - cs->wd_last)&  watchdog->m
> -                                            watchdog->mult, watchdog->shift);
> +               /*wd_nsec = clocksource_cyc2ns((wdnow - cs->wd_last)&  watchdog-
> +                                            watchdog->mult, watchdog->shift);*/
> +               wd_nsec = mult_frac(((wdnow - cs->wd_last), watchdog->mult,
> +                                   1UL<<  watchdog->shift);
> +
> +               /*cs_nsec = clocksource_cyc2ns((csnow - cs->cs_last)&
> +                                            cs->mask, cs->mult, cs->shift);*/
> +               cs_nsec = mult_frac(((csnow - cs->cs_last), cs->mult,
> +                                   1UL<<  cs->shift);
>
> -               cs_nsec = clocksource_cyc2ns((csnow - cs->cs_last)&
> -                                            cs->mask, cs->mult, cs->shift);
>                  cs->cs_last = csnow;
>                  cs->wd_last = wdnow;
>
>
> then I don't see unstable messages.
>
> I think the problem is still here but it only happens in extreme cases.

I suspect if you were closely watching the clock  you'd probably see 
some odd time inconsistencies as well. Its all due to running outside 
the expected bounds of the system.  If you were to use the acpi_pm 
clocksource and stall the system for 10+ minutes, when you returned the 
system clock would be 10 minutes behind.

I think if 10 minutes printk dumps is going to be expected behavior 
(which given the size of these machines, isn't unreasonable), we may 
need to find a way to allow printk to take a break every so often and 
allow irqs to happen. Disabling irqs for 10+ minutes can cause all sorts 
of system trouble.  I've seen scsi adapters hang if their heart beat 
code didn't get an irq every 10 minutes or so.

I had proposed something like this awhile back, but the issue cropping 
up there was lost time due to long printks holding off irqs while we're 
using the jiffies clocks in early boot, so it was at a much smaller scale.

Could you give the following a try and let me know if it helps?

thanks
-john


Only flush 1k chunks max from console_unlock to allow IRQ starvation 
from happening.

NOT FOR INCLUSION

Signed-off-by: John Stultz <john.stultz@linaro.org>

diff --git a/kernel/printk.c b/kernel/printk.c
index b663c2c..b699cc7 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1272,6 +1272,7 @@ void console_unlock(void)
  	unsigned long flags;
  	unsigned _con_start, _log_end;
  	unsigned wake_klogd = 0, retry = 0;
+	unsigned chunk_size, length;

  	if (console_suspended) {
  		up(&console_sem);
@@ -1280,15 +1281,18 @@ void console_unlock(void)

  	console_may_schedule = 0;

+	chunk_size = min(LOG_BUF_MASK, 1024); /* 1k chunks */
  again:
  	for ( ; ; ) {
  		raw_spin_lock_irqsave(&logbuf_lock, flags);
  		wake_klogd |= log_start - log_end;
  		if (con_start == log_end)
  			break;			/* Nothing to print */
+		length = (log_end - con_start)&  LOG_BUF_MASK;
+		length = min(length , chunk_size);
  		_con_start = con_start;
-		_log_end = log_end;
-		con_start = log_end;		/* Flush */
+		_log_end = (con_start + length)&  LOG_BUF_MASK;
+		con_start = _log_end;		/* Flush */
  		raw_spin_unlock(&logbuf_lock);
  		stop_critical_timings();	/* don't trace print latency */
  		call_console_drivers(_con_start, _log_end);


  reply	other threads:[~2012-04-05 16:46 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-04 15:11 [PATCH] clocksource, prevent overflow in clocksource_cyc2ns Prarit Bhargava
2012-04-04 18:00 ` John Stultz
2012-04-04 18:33   ` Prarit Bhargava
2012-04-05  1:08     ` John Stultz
2012-04-05 11:00       ` Prarit Bhargava
2012-04-05 16:23         ` John Stultz
2012-04-05 12:27       ` Prarit Bhargava
2012-04-05 16:45         ` John Stultz [this message]
2012-04-06 23:29         ` Thomas Gleixner
2012-04-07 13:47           ` Prarit Bhargava
2012-04-18 23:20         ` John Stultz
2012-04-18 23:59           ` Prarit Bhargava
2012-04-19  0:18             ` John Stultz
2012-04-19 11:56               ` Prarit Bhargava
2012-04-19 12:50               ` Thomas Gleixner
2012-04-19 12:52                 ` Thomas Gleixner
2012-04-19 13:06                   ` Prarit Bhargava
2012-04-19 13:18                     ` Thomas Gleixner
2012-04-19 18:12                   ` John Stultz
2012-04-25 12:29                     ` Prarit Bhargava
2012-04-19 12:37             ` Thomas Gleixner
2012-04-19 12:51               ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F7DCC3C.70208@us.ibm.com \
    --to=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=prarit@redhat.com \
    --cc=sqazi@google.com \
    --cc=stable@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox