All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	John Stultz <john.stultz@linaro.org>,
	Eric Dumazet <edumazet@google.com>,
	Anna-Maria Gleixner <anna-maria@linutronix.de>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-pm@vger.kernel.org, Arjan van de Ven <arjan@infradead.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Rik van Riel <riel@redhat.com>
Subject: Re: [patch V2 00/10] timer: Move from a push remote at enqueue to a pull at expiry model
Date: Fri, 21 Apr 2017 12:28:53 -0700	[thread overview]
Message-ID: <20170421192853.GD3956@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170418111102.490432548@linutronix.de>

On Tue, Apr 18, 2017 at 01:11:02PM +0200, Thomas Gleixner wrote:
> Placing timers at enqueue time on a target CPU based on dubious heuristics
> does not make any sense:
> 
>  1) Most timer wheel timers are canceled or rearmed before they expire.
> 
>  2) The heuristics to predict which CPU will be busy when the timer expires
>     are wrong by definition.
> 
> So we waste precious cycles to place timers at enqueue time.
> 
> The proper solution to this problem is to always queue the timers on the
> local CPU and allow the non pinned timers to be pulled onto a busy CPU at
> expiry time.
> 
> To achieve this the timer storage has been split into local pinned and
> global timers. Local pinned timers are always expired on the CPU on which
> they have been queued. Global timers can be expired on any CPU.
> 
> As long as a CPU is busy it expires both local and global timers. When a
> CPU goes idle it arms for the first expiring local timer. If the first
> expiring pinned (local) timer is before the first expiring movable timer,
> then no action is required because the CPU will wake up before the first
> movable timer expires. If the first expiring movable timer is before the
> first expiring pinned (local) timer, then this timer is queued into a idle
> timerqueue and eventually expired by some other active CPU.
> 
> To avoid global locking the timerqueues are implemented as a hierarchy. The
> lowest level of the hierarchy holds the CPUs. The CPUs are associated to
> groups of 8, which are seperated per node. If more than one CPU group
> exist, then a second level in the hierarchy collects the groups. Depending
> on the size of the system more than 2 levels are required. Each group has a
> "migrator" which checks the timerqueue during the tick for remote expirable
> timers.
> 
> If the last CPU in a group goes idle it reports the first expiring event in
> the group up to the next group(s) in the hierarchy. If the last CPU goes
> idle it arms its timer for the first system wide expiring timer to ensure
> that no timer event is missed.

OK, after several attempts this week, I formally state that I officially
suck at reviewing, though I must confess that I am quite partial to use
of hierarchies to avoid scalability bottlenecks.  I therefore applied
these patches and ran rcutorture on them.  Two of 19 scenarios resulted
in failures, namely TREE04 and TREE07.  I have posted their .config and
console.log files here:

http://www2.rdrop.com/users/paulmck/submission/TREE04.2017.04.21a.config
http://www2.rdrop.com/users/paulmck/submission/TREE04.2017.04.21a.console.log
http://www2.rdrop.com/users/paulmck/submission/TREE07.2017.04.21a.config
http://www2.rdrop.com/users/paulmck/submission/TREE07.2017.04.21a.console.log

The first splat from TREE04 is:

[    3.513964] WARNING: CPU: 1 PID: 755 at /home/paulmck/public_git/linux-rcu/kernel/time/timer_migration.c:387 tmigr_set_cpu_active+0xd7/0xf0
[    3.517320] Modules linked in:
[    3.518173] CPU: 1 PID: 755 Comm: rcu_torture_rea Not tainted 4.11.0-rc2+ #1
[    3.520134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[    3.522692] Call Trace:
[    3.523354]  <IRQ>
[    3.523953]  dump_stack+0x4d/0x65
[    3.524846]  __warn+0xc6/0xe0
[    3.525680]  warn_slowpath_null+0x18/0x20
[    3.526787]  tmigr_set_cpu_active+0xd7/0xf0
[    3.527866]  tmigr_cpu_activate+0x36/0x40
[    3.528872]  tick_nohz_stop_sched_tick+0x248/0x330
[    3.530103]  tick_nohz_irq_exit+0x109/0x140
[    3.531214]  irq_exit+0x74/0xf0
[    3.532068]  smp_apic_timer_interrupt+0x38/0x50
[    3.533239]  apic_timer_interrupt+0x90/0xa0
[    3.534363] RIP: 0010:__local_bh_enable_ip+0x3e/0x80
[    3.535722] RSP: 0000:ffffc900014ffe60 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
[    3.537640] RAX: 0000000000000000 RBX: ffffffff8319c0e0 RCX: 000000000000000a
[    3.539461] RDX: ffffffff81e46b40 RSI: 0000000000000200 RDI: ffffffff810b0630
[    3.541307] RBP: ffffc900014ffe60 R08: 0000000000000001 R09: 0000000000000101
[    3.543497] R10: 0000000000000000 R11: 0000000000000000 R12: fffffffffffffee5
[    3.545556] R13: 0000000000000000 R14: 00000000dfa1ef74 R15: fffffffffffffee6
[    3.547186]  </IRQ>
[    3.547689]  ? rcu_bh_torture_deferred_free+0x20/0x20
[    3.548837]  rcu_bh_torture_read_unlock+0x15/0x20
[    3.549861]  rcu_torture_reader+0xe4/0x2b0
[    3.550801]  ? rcu_torture_reader+0x2b0/0x2b0
[    3.551768]  kthread+0xfc/0x130
[    3.552465]  ? rcu_torture_fqs+0xe0/0xe0
[    3.553387]  ? kthread_create_on_node+0x40/0x40
[    3.554460]  ret_from_fork+0x29/0x40

Line 387 is this line in tmigr_set_cpu_active():

	if (WARN_ON(group->active == group->num_childs)) {

The splat from TREE07 looks quite similar, although it has three splats
rather than only two.

							Thanx, Paul

> The series is also available from git:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers
> 
> Changes vs. V1:
>   - Add missing inline stubs
>   - Bail out when running on UP
>   - Don't compile migration code for SMP=n
>   - Reorder trace point storage
> 
> Thanks,
> 
> 	tglx
> ---
>  /timer_migration.h              |  173 ++++++++++
>  b/kernel/time/timer_migration.c |  666 ++++++++++++++++++++++++++++++++++++++++
>  b/kernel/time/timer_migration.h |   83 ++++
>  include/linux/cpuhotplug.h      |    1 
>  kernel/time/Makefile            |    3 
>  kernel/time/tick-internal.h     |    4 
>  kernel/time/tick-sched.c        |  121 ++++++-
>  kernel/time/tick-sched.h        |    3 
>  kernel/time/timer.c             |  239 +++++++++-----
>  lib/timerqueue.c                |    8 
>  10 files changed, 1205 insertions(+), 96 deletions(-)
> 
> 
> 
> 

      parent reply	other threads:[~2017-04-21 19:28 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-18 11:11 [patch V2 00/10] timer: Move from a push remote at enqueue to a pull at expiry model Thomas Gleixner
2017-04-18 11:11 ` [patch V2 01/10] timer: Invoke timer_start_debug() where it makes sense Thomas Gleixner
2017-04-18 11:11 ` [patch V2 02/10] timerqueue: Document return values of timerqueue_add/del() Thomas Gleixner
2017-04-18 11:11 ` [patch V2 03/10] timers: Rework idle logic Thomas Gleixner
2017-04-19  6:50   ` Peter Zijlstra
2017-04-21 14:43     ` Frederic Weisbecker
2017-04-18 11:11 ` [patch V2 04/10] timer: Keep the pinned timers separate from the others Thomas Gleixner
2017-04-18 11:11 ` [patch V2 05/10] timer: Retrieve next expiry of pinned/non-pinned timers seperately Thomas Gleixner
2017-04-19  7:05   ` Peter Zijlstra
2017-04-19  9:56     ` Thomas Gleixner
2017-04-18 11:11 ` [patch V2 06/10] timer: Restructure internal locking Thomas Gleixner
2017-04-19  7:07   ` Peter Zijlstra
2017-04-18 11:11 ` [patch V2 07/10] tick/sched: Split out jiffies update helper function Thomas Gleixner
2017-04-18 11:11 ` [patch V2 08/10] timer: Implement the hierarchical pull model Thomas Gleixner
2017-04-19  7:20   ` Peter Zijlstra
2017-04-19  7:24   ` Peter Zijlstra
2017-04-19  7:34   ` Peter Zijlstra
2017-04-19  7:38   ` Peter Zijlstra
2017-04-19  8:11   ` Peter Zijlstra
2017-04-19  8:31     ` Thomas Gleixner
2017-04-19  8:36       ` Peter Zijlstra
2017-04-19  9:03         ` Thomas Gleixner
2017-04-19  8:52   ` Peter Zijlstra
2017-04-19  9:09   ` Peter Zijlstra
2017-04-19  9:43     ` Thomas Gleixner
2017-04-19  9:52       ` Peter Zijlstra
2017-04-19  9:44     ` Peter Zijlstra
2017-04-19  9:53       ` Peter Zijlstra
2017-04-19  9:20   ` Peter Zijlstra
2017-04-19 10:22   ` Peter Zijlstra
2017-04-18 11:11 ` [patch V2 09/10] timer_migration: Add tracepoints Thomas Gleixner
2017-04-18 11:11 ` [patch V2 10/10] timer: Always queue timers on the local CPU Thomas Gleixner
2017-04-21 19:28 ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170421192853.GD3956@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=anna-maria@linutronix.de \
    --cc=arjan@infradead.org \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=john.stultz@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.