From: Juri Lelli <juri.lelli@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: fweisbec@gmail.com, tglx@linutronix.de, mingo@kernel.org,
linux-rt-users@vger.kernel.org, peterz@infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] tick/sched: update full_nohz status after SCHED dep is cleared
Date: Wed, 20 May 2020 18:49:25 +0200 [thread overview]
Message-ID: <20200520164925.GM10078@localhost.localdomain> (raw)
In-Reply-To: <20200520162400.GA8800@lenoir>
On 20/05/20 18:24, Frederic Weisbecker wrote:
> Hi Juri,
>
> On Wed, May 20, 2020 at 04:04:02PM +0200, Juri Lelli wrote:
> > After tasks enter or leave a runqueue (wakeup/block) SCHED full_nohz
> > dependency is checked (via sched_update_tick_dependency()). In case tick
> > can be stopped on a CPU (see sched_can_stop_tick() for details), SCHED
> > dependency for such CPU is cleared. However, this new information is not
> > used right away to actually stop the tick.
> >
> > In CONFIG_PREEMPT systems booted with threadirqs option, sched clock
> > tick is serviced by an actual task (ksoftirqd corresponding to the CPU
> > where tick timer fired).
>
> I must confess I haven't tested threaded IRQs but I was
> pretty sure that the timer tick is always serviced on hardirq.
>
> Now the timer list callbacks are executed on softirqs. So if the
> tick itself, executed on hardirq, sees pending timers, it raise
> the softirq which wakes up ksoftirqd on forced irq thread mode
> while calling irq_exit(). Then tick_nohz_irq_exit() sees ksoftirqd
> and the current task on the runqueue, which together can indeed prevent
> from turning off the tick. But then the root cause is pending timer
> list callbacks.
>
> > So, in case a CPU was running a single task,
> > servicing the timer involves exiting full nozh mode. Problem at this
> > point is that we might lose chances to enter back into full nozh mode,
> > since info about ksoftirqd thread going back to sleep is not used (as
> > mentioned above).
>
> It should enter into nohz_full mode in the next tick, which is usually
> a reasonable delay. If you need guarantee that the tick is stopped before
> resuming userspace, you need a stronger machinery such as the task isolation
> patchset.
>
> Let's have a look at the trace below:
>
> > ksoftirqd/19-125 [019] 170.700754: softirq_entry: vec=1 [action=TIMER]
> > ksoftirqd/19-125 [019] 170.700755: softirq_exit: vec=1 [action=TIMER]
> > ksoftirqd/19-125 [019] 170.700756: softirq_entry: vec=7 [action=SCHED]
> > ksoftirqd/19-125 [019] 170.700757: softirq_exit: vec=7 [action=SCHED]
> > ksoftirqd/19-125 [019] 170.700759: sched_switch: ksoftirqd/19:125 [120] S ==> sysjitter:2459 [120]
> > sysjitter-2459 [019] 170.701740: local_timer_entry: vector=236
> > sysjitter-2459 [019] 170.701742: softirq_raise: vec=1 [action=TIMER]
>
> See here the tick sees pending timer callbacks so it raises the timer softirq.
>
> > sysjitter-2459 [019] 170.701743: softirq_raise: vec=7 [action=SCHED]
>
> Oh and the scheduler tick activates the scheduler softirq as well.
>
> > sysjitter-2459 [019] 170.701744: local_timer_exit: vector=236
> > sysjitter-2459 [019] 170.701747: sched_wakeup: ksoftirqd/19:125 [120] success=1 CPU:019
> > sysjitter-2459 [019] 170.701748: tick_stop: success=0 dependency=SCHED
> > sysjitter-2459 [019] 170.701749: irq_work_entry: vector=246
> > sysjitter-2459 [019] 170.701750: irq_work_exit: vector=246
> > sysjitter-2459 [019] 170.701751: tick_stop: success=0 dependency=SCHED
> > sysjitter-2459 [019] 170.701753: sched_switch: sysjitter:2459 [120] R ==> ksoftirqd/19:125 [120]
> > ksoftirqd/19-125 [019] 170.701754: softirq_entry: vec=1 [action=TIMER]
> > ksoftirqd/19-125 [019] 170.701756: softirq_exit: vec=1 [action=TIMER]
> > ksoftirqd/19-125 [019] 170.701756: softirq_entry: vec=7 [action=SCHED]
> > ksoftirqd/19-125 [019] 170.701758: softirq_exit: vec=7 [action=SCHED]
> > ksoftirqd/19-125 [019] 170.701759: sched_switch: ksoftirqd/19:125 [120] S ==> sysjitter:2459 [120]
> > sysjitter-2459 [019] 170.702740: local_timer_entry: vector=236
> > sysjitter-2459 [019] 170.702742: softirq_raise: vec=1 [action=TIMER]
>
> A new tick but we still have pending timer callback so we'll need to wakeup ksoftirqd
> again.
>
> I think you should trace timers and check which one is concerned here.
Hummm, so I enabled 'timer:*', anything else you think I should be
looking at?
...
ksoftirqd/13-117 [013] 148.265945: softirq_entry: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.265947: softirq_exit: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.265948: softirq_entry: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.265950: softirq_exit: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.265952: sched_switch: ksoftirqd/13:117 [120] S ==> sysjitter:2536 [120]
sysjitter-2536 [013] 148.266912: local_timer_entry: vector=236
sysjitter-2536 [013] 148.266914: hrtimer_cancel: hrtimer=0xffff9807df91bee0
sysjitter-2536 [013] 148.266916: hrtimer_expire_entry: hrtimer=0xffff9807df91bee0 now=148249107837 function=tick_sched_timer/0x0
sysjitter-2536 [013] 148.266918: softirq_raise: vec=1 [action=TIMER]
sysjitter-2536 [013] 148.266920: softirq_raise: vec=7 [action=SCHED]
sysjitter-2536 [013] 148.266922: hrtimer_expire_exit: hrtimer=0xffff9807df91bee0
sysjitter-2536 [013] 148.266924: hrtimer_start: hrtimer=0xffff9807df91bee0 function=tick_sched_timer/0x0 expires=148250105422 softexpires=148250105422
sysjitter-2536 [013] 148.266926: local_timer_exit: vector=236
sysjitter-2536 [013] 148.266930: sched_wakeup: ksoftirqd/13:117 [120] success=1 CPU:013
sysjitter-2536 [013] 148.266932: tick_stop: success=0 dependency=SCHED
sysjitter-2536 [013] 148.266934: irq_work_entry: vector=246
sysjitter-2536 [013] 148.266936: irq_work_exit: vector=246
sysjitter-2536 [013] 148.266937: tick_stop: success=0 dependency=SCHED
sysjitter-2536 [013] 148.266940: sched_switch: sysjitter:2536 [120] R ==> ksoftirqd/13:117 [120]
ksoftirqd/13-117 [013] 148.266943: softirq_entry: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.266946: softirq_exit: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.266946: softirq_entry: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.266948: softirq_exit: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.266950: sched_switch: ksoftirqd/13:117 [120] S ==> sysjitter:2536 [120]
sysjitter-2536 [013] 148.267912: local_timer_entry: vector=236
sysjitter-2536 [013] 148.267914: hrtimer_cancel: hrtimer=0xffff9807df91bee0
sysjitter-2536 [013] 148.267916: hrtimer_expire_entry: hrtimer=0xffff9807df91bee0 now=148250107750 function=tick_sched_timer/0x0
sysjitter-2536 [013] 148.267918: softirq_raise: vec=1 [action=TIMER]
sysjitter-2536 [013] 148.267920: softirq_raise: vec=7 [action=SCHED]
sysjitter-2536 [013] 148.267922: hrtimer_expire_exit: hrtimer=0xffff9807df91bee0
sysjitter-2536 [013] 148.267924: hrtimer_start: hrtimer=0xffff9807df91bee0 function=tick_sched_timer/0x0 expires=148251105422 softexpires=148251105422
sysjitter-2536 [013] 148.267926: local_timer_exit: vector=236
sysjitter-2536 [013] 148.267929: sched_wakeup: ksoftirqd/13:117 [120] success=1 CPU:013
sysjitter-2536 [013] 148.267931: tick_stop: success=0 dependency=SCHED
sysjitter-2536 [013] 148.267934: irq_work_entry: vector=246
sysjitter-2536 [013] 148.267936: irq_work_exit: vector=246
sysjitter-2536 [013] 148.267937: tick_stop: success=0 dependency=SCHED
sysjitter-2536 [013] 148.267940: sched_switch: sysjitter:2536 [120] R ==> ksoftirqd/13:117 [120]
ksoftirqd/13-117 [013] 148.267943: softirq_entry: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.267945: softirq_exit: vec=1 [action=TIMER]
ksoftirqd/13-117 [013] 148.267946: softirq_entry: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.267947: softirq_exit: vec=7 [action=SCHED]
ksoftirqd/13-117 [013] 148.267949: sched_switch: ksoftirqd/13:117 [120] S ==> sysjitter:2536 [120]
sysjitter-2536 [013] 148.268865: call_function_single_entry: vector=251
sysjitter-2536 [013] 148.268867: call_function_single_exit: vector=251
sysjitter-2536 [013] 148.268870: tick_stop: success=1 dependency=NONE
sysjitter-2536 [013] 148.268872: hrtimer_cancel: hrtimer=0xffff9807df91bee0
sysjitter-2536 [013] 148.268876: call_function_single_entry: vector=251
sysjitter-2536 [013] 148.268878: call_function_single_exit: vector=251
...
it looks like tick sched timer is kept armed until tick_stop
succedes (because of the check performed after _single_exit)?
Thanks for your quick reply!
next prev parent reply other threads:[~2020-05-20 16:49 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-20 14:04 [RFC PATCH] tick/sched: update full_nohz status after SCHED dep is cleared Juri Lelli
2020-05-20 16:24 ` Frederic Weisbecker
2020-05-20 16:49 ` Juri Lelli [this message]
2020-05-20 17:02 ` Frederic Weisbecker
2020-05-20 18:47 ` Juri Lelli
2020-05-21 0:44 ` Frederic Weisbecker
2020-05-21 17:00 ` Juri Lelli
2020-06-15 15:05 ` Juri Lelli
2020-06-15 21:07 ` Frederic Weisbecker
2020-06-16 6:57 ` Juri Lelli
2020-06-16 20:46 ` Frederic Weisbecker
2020-06-29 12:36 ` Juri Lelli
2020-06-29 12:42 ` Frederic Weisbecker
2020-06-29 12:53 ` Juri Lelli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200520164925.GM10078@localhost.localdomain \
--to=juri.lelli@redhat.com \
--cc=frederic@kernel.org \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.