From: Frederic Weisbecker <fweisbec@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Gilad Ben Yossef <giladb@mellanox.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: clocksource_watchdog causing scheduling of timers every second (was [v13] support "task_isolation" mode)
Date: Thu, 11 Aug 2016 13:58:48 +0200
Message-ID: <20160811115845.GA4214@lerouge>
In-Reply-To: <20160811084002.GV30192@twins.programming.kicks-ass.net>

On Thu, Aug 11, 2016 at 10:40:02AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 11, 2016 at 12:16:58AM +0200, Frederic Weisbecker wrote:
> > I had similar issues, this seems to happen when the tsc is considered not reliable
> > (which doesn't necessarily mean unstable. I think it has to do with some x86 CPU feature
> > flag).
> 
> Right, as per the other email, in general we cannot know/assume the TSC
> to be working as intended :/

Yeah, I remember you explained that to me a little while ago.

> 
> > IIRC, this _has_ to execute on all online CPUs because every TSCs of running CPUs
> > are concerned.
> 
> With modern Intel we could run it on one CPU per package I think, but at
> the same time, too much in NOHZ_FULL assumes the TSC is indeed sane so
> it doesn't make sense to me to keep the watchdog running, when it
> triggers it would also have to kill all NOHZ_FULL stuff, which would
> probably bring the entire machine down..
> 
> Arguably we should issue a boot time warning if NOHZ_FULL is configured
> and the TSC watchdog is running.

That's a very good idea! We already warn when the tsc is unstable, but indeed
we can't seriously run NOHZ_FULL on a non-reliable tsc either.

I'll take care of that warning.

> 
> > I personally override that with passing the tsc=reliable kernel
> > parameter. Of course use it at your own risk.
> 
> Yes, that is (sadly) our only option. Manually assert our hardware is
> solid under the intended workload and then manually disabling the
> watchdog.

Right, I'll mention that in the warning.
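
To illustrate the kind of check discussed above, here is a rough user-space
sketch. It is not the kernel patch. It only looks at a kernel command line
string and reports whether nohz_full= is requested without tsc=reliable. The
function name needs_tsc_warning is made up for this example, and the
assumption that the command line alone tells us whether the watchdog runs is
a simplification: the kernel also decides based on CPU feature flags.

```python
def needs_tsc_warning(cmdline: str) -> bool:
    """Return True if nohz_full= is requested without tsc=reliable.

    Hypothetical helper for illustration only. It inspects the command
    line text and nothing else, so it cannot see CPU feature flags that
    may already mark the TSC as reliable.
    """
    # Kernel parameters are separated by whitespace.
    params = cmdline.split()
    has_nohz_full = any(p.startswith("nohz_full=") for p in params)
    has_tsc_reliable = "tsc=reliable" in params
    return has_nohz_full and not has_tsc_reliable
```

On a running system the string could be read from /proc/cmdline and passed to
this function. A True result only suggests that the clocksource watchdog may
still be scheduling its timer on the isolated CPUs.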

Thanks for those details!
