From: Prarit Bhargava <prarit@redhat.com>
To: John Stultz <johnstul@us.ibm.com>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Salman Qazi <sqazi@google.com>,
stable@kernel.org
Subject: Re: [PATCH] clocksource, prevent overflow in clocksource_cyc2ns
Date: Thu, 05 Apr 2012 07:00:27 -0400
Message-ID: <4F7D7B4B.7050203@redhat.com>
In-Reply-To: <4F7CF094.5020201@us.ibm.com>

On 04/04/2012 09:08 PM, John Stultz wrote:
> On 04/04/2012 11:33 AM, Prarit Bhargava wrote:
>>> One idea might be to replace the cyc2ns w/ mult_frac in only the watchdog code.
>>> I need to think on that some more (and maybe have you provide some debug output)
>>> to really understand how that's solving the issue for you, but it would be able
>>> to be done w/o affecting the other assumptions of the timekeeping core.
>>>
>> Hey John,
>>
>> After reading the initial part of your reply I was thinking about calling
>> mult_frac() directly from the watchdog code as well.
>>
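For readers following the mult_frac() idea quoted above, here is a minimal sketch of how it sidesteps the wrap. It is not the patch under discussion: cyc2ns_plain() and cyc2ns_frac() are hypothetical names, the product is assumed to be an unsigned 64-bit multiply, and the denominator is taken to be 2^shift, so the division and modulo of mult_frac() reduce to a shift and a mask.

```c
#include <assert.h>
#include <stdint.h>

/* Straight conversion: the 64-bit product cycles * mult wraps
 * silently once it exceeds 2^64 - 1. */
static uint64_t cyc2ns_plain(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    assert(shift < 64);
    return (cycles * mult) >> shift;
}

/* mult_frac-style conversion: split cycles into quotient and remainder
 * with respect to 2^shift and scale each part separately. The result is
 * identical to the exact value as long as the result itself fits in
 * 64 bits. */
static uint64_t cyc2ns_frac(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    /* rem < 2^shift and mult < 2^32, so rem * mult cannot wrap
     * for shift <= 32. */
    assert(shift <= 32);
    uint64_t quot = cycles >> shift;
    uint64_t rem = cycles & ((UINT64_C(1) << shift) - 1);
    return quot * mult + ((rem * mult) >> shift);
}
```

With the tsc values from the debug output (mult 797281036, shift 31), the two functions agree for small deltas and diverge once the plain product wraps, for example at a delta of 2^35 cycles.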
>> Here's some debug output I cobbled together to get an idea of how quickly the
>> overflow was happening.
>>
>> [ 5.435323] clocksource_watchdog: {0} cs tsc csfirst 227349443638728 mask 0xFFFFFFFFFFFFFFFF mult 797281036 shift 31
>> [ 5.444930] clocksource_watchdog: {0} wd hpet wdfirst 78332535 mask 0xFFFFFFFF mult 292935555 shift 22
>>
>> These, of course, are just the basic data from the clocksources tsc and hpet.
>
> If I'm doing the math right, these are ~2.7 GHz cpus?
Yes.
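
The ~2.7 GHz estimate can be checked from the mult/shift pair alone, since one cycle is converted to mult / 2^shift nanoseconds. The helper below is a hypothetical illustration (clocksource_freq_hz() is not a kernel function) and uses floating point only for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the counter frequency implied by a mult/shift pair.
 * One cycle maps to mult / 2^shift nanoseconds, so the frequency
 * in Hz is 1e9 divided by that value. */
static double clocksource_freq_hz(uint32_t mult, uint32_t shift)
{
    assert(mult != 0 && shift < 64);
    double ns_per_cycle = (double)mult / (double)(UINT64_C(1) << shift);
    return 1e9 / ns_per_cycle;
}
```

Fed the values from the debug output, this gives roughly 2.69 GHz for the tsc (mult 797281036, shift 31) and roughly 14.3 MHz for the hpet (mult 292935555, shift 22).
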
>
> So what kernel version are you using?
I was on an earlier version of Fedora (F16) ... but I'll jump forward and see if
I can still hit it.
>
> In trying to reproduce this locally against Linus' HEAD on a much smaller system
> (single core + HT 1.6 GHz), I got:
> [ 6.611366] clocksource_watchdog: {0} cs tsc csfirst 36177888648 mask ffffffffffffffff mult 10485747 shift 24
> [ 6.611596] clocksource_watchdog: {0} wd hpet wdfirst 169168400 mask ffffffff mult 2684354560 shift 26
>
> Note the smaller shift values. Not too long ago the shift calculation was
> adjusted to allow for longer periods between interrupts, so I suspect you're on
> an older kernel.
>
> Further, using your debug patch on my system, it was well beyond 10 minutes
> before the debug overflow occurred. And similarly I couldn't trip the watchdog
> trigger using sysrq-t (but again, only two threads here, so not nearly as much
> data to print as you have).
I'm going to try this on a 32-cpu system (running the previously mentioned test)
with linux.git HEAD.
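
Why the shift value matters can be sketched numerically. Assuming the conversion is an unsigned 64-bit multiply followed by a right shift, the product wraps after about 2^(64 - shift) nanoseconds' worth of cycles, regardless of mult. The helpers below are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest cycle delta whose product with mult still fits in 64 bits. */
static uint64_t max_cycles_before_wrap(uint32_t mult)
{
    assert(mult != 0);
    return UINT64_MAX / mult;
}

/* The same limit expressed in seconds of elapsed time. */
static double wrap_horizon_seconds(uint32_t mult, uint32_t shift)
{
    assert(shift < 64);
    uint64_t max_cycles = max_cycles_before_wrap(mult);
    /* max_cycles * mult <= UINT64_MAX by construction, so no wrap here. */
    uint64_t max_ns = (max_cycles * mult) >> shift;
    return (double)max_ns / 1e9;
}
```

Under that assumption the tsc values above with shift 31 give a horizon of about 8.6 seconds, while John's values with shift 24 give about 1100 seconds (a little over 18 minutes), which is consistent with his overflow taking well beyond 10 minutes to appear.
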
>
> Could you verify that the issue you're seeing is still present w/ current
> mainline? Please don't take this as me dismissing your problem! As I mentioned
Absolutely :) I didn't take it that way at all. When I get in this AM I'll
bang out a test and see if I can cause this to happen with sysrq-t. Keep in
mind that 10000 threads is the *minimum* I was able to cause this with, which is
only ~315 threads/cpu, which isn't a lot :/. At that number of threads the dump
takes about 6 mins. Doubling it, IIRC, exceeded 10 mins.
> earlier there are some known issues w/ the clocksource watchdog code. But I want
> to narrow down if your problem is currently present in mainline or only in
> older kernels, as that will help us find the proper fix.
Thanks John,
P.
>
> thanks
> -john
>