From: Feng Tang <feng.tang@intel.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
John Stultz <john.stultz@linaro.org>,
Stephen Boyd <sboyd@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
Mark Rutland <Mark.Rutland@arm.com>,
Marc Zyngier <maz@kernel.org>, Andi Kleen <ak@linux.intel.com>,
Chris Mason <clm@fb.com>, LKML <linux-kernel@vger.kernel.org>,
lkp@lists.01.org, lkp@intel.com
Subject: Re: [LKP] Re: [clocksource] 6c52b5f3cf: stress-ng.opcode.ops_per_sec -14.4% regression
Date: Sun, 25 Apr 2021 11:14:37 +0800
Message-ID: <20210425031437.GA38485@shbuild999.sh.intel.com>
In-Reply-To: <20210425021438.GA2942@shbuild999.sh.intel.com>
On Sun, Apr 25, 2021 at 10:14:38AM +0800, Feng Tang wrote:
> On Sat, Apr 24, 2021 at 10:53:22AM -0700, Paul E. McKenney wrote:
> > And if your 2/2 goes in, those who still distrust TSC will simply
> > revert it. In their defense, their distrust was built up over a very
> > long period of time for very good reasons.
> >
> > > > This last sentence is not a theoretical statement. In the past, I have
> > > > suggested using the existing "tsc=reliable" kernel boot parameter,
> > > > which disables watchdogs on TSC, similar to your patch 2/2 above.
> > > > The discussion was short and that boot parameter was not set. And the
> > > > discussion motivated my current clocksource series. ;-)
> > > >
> > > > I therefore suspect that someone will want a "tsc=unreliable" boot
> > > > parameter (or similar) to go with your patch 2/2.
> > >
> > > Possibly :)
> > >
> > > But I wonder if tsc is disabled on that 'large system', what will be
> > > used instead? HPET is known to be much slower for clocksource, as shown
> > > in this regression report :), not to mention the 'acpi_pm' timer.
> >
> > Indeed, the default switch to HPET often causes the system to be taken
> > out of service due to the resulting performance shortfall. There is
> > of course some automated recovery, and no, I am not familiar with the
> > details, but I suspect that a simple reboot is an early recovery step.
> > However, if the problem were to persist, the system would of course be
> > considered to be permanently broken.
>
> Thanks for the info. If a server is taken out of service just because
> of a false alarm on tsc, then it's a big waste!
>
> > > Again, I want to know about the real tsc unstable cases. I have spent
> > > lots of time searching for this info in git logs and mail archives
> > > before writing the patches.
> >
> > So do I, which is why I put together this patch series. My employer has
> > a fairly strict upstream-first policy for things like this, which are annoyances
> > that are likely hiding other bugs, but which are not causing significant
> > outages, which was of course the motivation for the fault-injection
> > patches.
> >
> > As I said earlier, it would have been very helpful to you for a patch
> > series like this to have been applied many years ago. If it had been,
> > we would already have the failure-rate data that you requested. And of
> > course if that failure-rate data indicated that TSC was reliable, there
> > would be far fewer people still distrusting TSC.
>
> Yes, if they can share the detailed info (like what the 'watchdog' is)
> and debug info, it can enable people to debug and root-cause the
> problem as either a false alarm or a real silicon/platform issue.
> Personally, for newer platforms I tend to trust tsc much more than
> other clocksources.
I understand people may 'distrust' tsc after seeing those 'tsc unstable'
cases. But on 'newer platforms', if tsc was judged unstable by hpet,
acpi_pm_timer or the software 'refined-jiffies', then it could possibly
be just a false alarm, and that is not too difficult to root-cause.
And if there is real evidence of a broken tsc, then the distrust
is not just an impression from the old days :)
Thanks,
Feng