From: John Stultz <john.stultz@linaro.org>
To: Josh Boyer <jwboyer@redhat.com>
Cc: Dave Jones <davej@redhat.com>,
Fedora Kernel Team <kernel-team@fedoraproject.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: WARNING: Adjusting tsc more then 11%
Date: Mon, 05 Mar 2012 12:24:37 -0800
Message-ID: <1330979077.2191.96.camel@work-vm>
In-Reply-To: <20120305195619.GB17489@zod.bos.redhat.com>

On Mon, 2012-03-05 at 14:56 -0500, Josh Boyer wrote:
> On Mon, Mar 05, 2012 at 11:50:10AM -0800, John Stultz wrote:
> > On Mon, 2012-03-05 at 14:23 -0500, Dave Jones wrote:
> > > On Mon, Mar 05, 2012 at 10:32:03AM -0800, John Stultz wrote:
> > > > On Mon, 2012-03-05 at 10:44 -0500, Dave Jones wrote:
> > >
> > > > > any idea what could have changed to start tripping that up ?
> > > > >
> > > > > The reports seem to have started around 3.3-rc4.
> > > >
> > > > Huh. No I don't know what would have started causing such a warning. I
> > > > had expected that there would be some edge hardware that might trip that
> > > > warning, but I'd expect the noise to start there w/ 3.2 after it was
> > > > introduced. There's only been spelling & comment changes to the
> > > > timekeeping core in the 3.3-rc series.
> > >
> > > thinking about this some more, while the reports starts around rc4, this
> > > may have been caused by something prior to that, as anyone moving from
> > > Fedora 16 or earlier to F17 alpha would have jumped quite a kernel version or two.
> >
> > Was F16 3.1 based? The warning was added in 3.2, so if you skipped it,
> > it may not be new behavior then.
> >
> > > > Do you know if this is an occasional thing on any of the affected
> > > > hardware, or if it happens after every reboot?
> > >
> > > Out of all the people running the Fedora 17 alpha, this has only shown
> > > up those four times, so it does seem to be a rare thing.
> > > I suspect we'll get more instances of it as more people start testing.
> > >
> > > Three of the reporters noted that it happened on boot.
> > >
> > > > Are any of the reported boxes systems you have access to in order to
> > > > reproduce?
> > >
> > > unfortunately not.
> >
> > Ok. Well, just to level set: the warning is informative, and points to
> > unexpected, but not necessarily unsafe behavior.
> >
> > In fact, the risk (where mult is adjusted to be large enough to cause an
> > overflow) we're warning about have been present 2.6.36 or even possibly
> > before. The change in 3.2 which added the warning also added a more
> > conservative mult calculation, so we're less likely to get overflow
> > prone large mult values.
>
> Is there a reason you decided to use a WARN_ONCE, which dumps a full stack
> trace, instead of just printk(KERN_ERR ?

Well, the WARN_ONCE behavior is really nice: a plain printk could end
up filling the logs, since you might get one every tick.

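
To illustrate the warn-once point, here is a rough userspace sketch, not
the kernel code. All demo_* names are hypothetical, and the 11% limit is
only an assumption taken from the warning text in the subject line. It
checks an adjusted multiplier against a limit on every call, but reports
only the first violation:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a clocksource: a base multiplier plus the
 * largest adjustment treated as "expected" (assumed here: 11% of base). */
struct demo_clock {
	uint32_t mult;
	uint32_t maxadj;
};

/* 11% of the base multiplier, computed in 64 bits to avoid overflow. */
static uint32_t demo_max_adjustment(uint32_t mult)
{
	return (uint32_t)(((uint64_t)mult * 11) / 100);
}

/* Counts how many times the warning was actually emitted. */
static int demo_warn_count;

/* Returns true whenever the adjusted multiplier exceeds the limit,
 * but prints the warning only the first time that happens. */
static bool demo_check_adjust(const struct demo_clock *c,
			      uint32_t cur_mult, int32_t adj)
{
	static bool warned;	/* the "once" state */
	int64_t next = (int64_t)cur_mult + adj;
	int64_t limit = (int64_t)c->mult + c->maxadj;
	bool over = next > limit;

	if (over && !warned) {
		warned = true;
		demo_warn_count++;
		fprintf(stderr, "demo: adjusting more than 11%% (%lld vs %lld)\n",
			(long long)next, (long long)limit);
	}
	return over;
}
```

Calling demo_check_adjust() from a loop that stands in for the tick
prints at most one line, however many iterations exceed the limit. An
unconditional print in the same place would emit one line per iteration.
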
> > So it would be great to get further feedback from folks who are seeing
> > this warning, so we can really hammer this out, but I don't want the
> > warning spooking anyone into thinking things are terribly broken.
>
> Right... people see backtraces and start thinking "my kernel is broken."
>
> I'm certainly not meaning to pick on you for this. Lately it seems all
> the rage to throw WARN_ONs for all kinds of error paths and leave the user
> to figure out how screwed they are.

It's a trade-off, since we really do want to know if our code has been
pushed outside of its expected boundaries (either by unexpected hardware
behavior or by expectations being raised, like long nohz idle times), so
we have to get folks' attention somewhat. The type of error reporting
Dave's managed to collect here is really great.

But at the same time, I agree there have been a few cases where the code
is limited more narrowly than the reality of existing hardware, and we
end up with a constant stream of error messages that get waved off as
broken hardware.

There we need to either fix the code or drop the warnings, but I think
it gets hard when we really want to know about "unexpected behavior,
except on some wide swath of hardware that always acts poorly", where
conditionalizing the warnings isn't easy.

thanks
-john