linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <frederic@kernel.org>
Cc: rafael@kernel.org, daniel.lezcano@linaro.org,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	len.brown@intel.com, Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [RFC/RFT PATCH 1/2] sched/core: Check and schedule ksoftirq
Date: Mon, 19 Dec 2022 14:54:55 -0800	[thread overview]
Message-ID: <5ae0d53990c29aa25648cbf32ef3b16e9bec911c.camel@linux.intel.com> (raw)
In-Reply-To: <Y6BMAp6A731F8ZL4@hirez.programming.kicks-ass.net>

On Mon, 2022-12-19 at 12:33 +0100, Peter Zijlstra wrote:
> On Fri, Dec 16, 2022 at 11:07:48PM +0100, Frederic Weisbecker wrote:
> > On Fri, Dec 16, 2022 at 12:19:34PM +0100, Peter Zijlstra wrote:
> > > On Thu, Dec 15, 2022 at 10:42:59AM -0800, Srinivas Pandruvada
> > > wrote:
> > > > +               /* Give ksoftirqd 1 jiffy to get a chance to
> > > > start its job */
> > > > +               if (!READ_ONCE(it.done) &&
> > > > task_is_running(__this_cpu_read(ksoftirqd))) {
> > > > +                       __set_current_state(TASK_UNINTERRUPTIBL
> > > > E);
> > > > +                       schedule_timeout(1);
> > > > +               }
> > > 
> > > That's absolutely disgusting :-/
> > 
> > I know, and I hate checking
> > task_is_running(__this_cpu_read(ksoftirqd))
> > everywhere in idle. And in fact it doesn't work because some
> > cpuidle drivers
> > also do need_resched() checks.
> 
> quite a few indeed.
> 
> > I guess that either we assume that the idle injection is more
> > important
> > than serving softirqs and we shutdown the warnings accordingly, or
> > we arrange
> > for idle injection to have a lower prio than ksoftirqd.
> 
> ksoftirq is typically a CFS task while idle injection is required to
> be
> a FIFO (typically FIFO-1) task -- so that would require lifting
> ksoftirqd to FIFO and that has other problems.
> 
> I'm guessing the problem case is where idle injection comes in while
> ksoftirq is running (and preempted), because in that case you cannot
> run
> the softirq stuff in-place.
> 
> The 'right' thing to do would be to PI boost ksoftirqd in that case I
> suppose. Perhaps something like so, it would boost ksoftirq when it's
> running, and otherwise runs the ksoftirqd thing in-situ.
> 
> I've not really throught hard about this though, so perhaps I
> completely
> wrecked things.
Had to fix some compile issues in the patch but able to test. This
fixes the case when softirq is pending before play_idle(). For example:

<idle>-0[004]   254.440116: softirq_raise:   vec=9 [action=RCU]
<idle>-0[004]   254.440116: softirq_exit:    vec=9 [action=RCU]
<idle>-0[004]   254.440116: sched_waking:    comm=ksoftirqd/4 pid=41
prio=120 target_cpu=004
<idle>-0[004]   254.440117: sched_wakeup:    ksoftirqd/4:41 [120]<CANT
FIND FIELD success> CPU:004
<idle>-0[004]   254.440119: sched_switch:    swapper/4:0 [120] R ==>
kidle_inj/4:1180 [49]
kidle_inj/4-1180[004]   254.440120: sched_kthread_work_execute_start:
work struct 0xffffd1dcbfd25230: function clamp_idle_injection_func
kidle_inj/4-1180[004]   254.440121: play_idle_enter: state=24000000
cpu_id=4
kidle_inj/4-1180[004]   254.440122: bputs: __run_ksoftirqd: running
from play_idle

"calliing __do_softirq() here"

kidle_inj/4-1180[004]   254.440122: softirq_entry:   vec=9 [action=RCU]
kidle_inj/4-1180[004]   254.440125: softirq_raise:   vec=9 [action=RCU]
kidle_inj/4-1180[004]   254.440126: softirq_exit:    vec=9 [action=RCU]

But after ksoftirqd_run_end(), which will renable local irq, this may
further cause more soft irq pending. Here RCU soft irq entry continues


kidle_inj/4-1180  [004]254.440141: softirq_exit:    vec=9 [action=RCU]
kidle_inj/4-1180  [004]254.440141: softirq_entry:   vec=9 [action=RCU]
kidle_inj/4-1180  [004]254.440143: softirq_raise:   vec=9 [action=RCU]
kidle_inj/4-1180  [004]254.440143: softirq_exit:    vec=9 [action=RCU]
kidle_inj/4-1180  [004]254.440144: bputs: can_stop_idle_tick.isra.0:
soft irq pending

Another warning above.


This log is with ping test.

Adding additional __run_ksoftirqd(), inside do_idle()  while
(!need_resched()) doesn't help.

If we add check to can_stop_idle_tick() before reporting helps. But
since __do_softirq() can't guarantee that there are no pending soft
irqs, it still have chance to get another warning.
 
Thanks,
Srinivas

> 
> 
> ---
>  include/linux/interrupt.h   |  3 +++
>  kernel/sched/build_policy.c |  1 +
>  kernel/sched/idle.c         |  4 ++++
>  kernel/softirq.c            | 20 +++++++++++++++++---
>  4 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index a92bce40b04b..a2db811d6947 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -606,6 +606,9 @@ extern void raise_softirq_irqoff(unsigned int
> nr);
>  extern void raise_softirq(unsigned int nr);
>  
>  DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
> +DECLARE_PER_CPU(struct rt_mutex, ksoftirq_lock);
> +
> +extern bool __run_ksoftirqd(unsigned int cpu);
>  
>  static inline struct task_struct *this_cpu_ksoftirqd(void)
>  {
> diff --git a/kernel/sched/build_policy.c
> b/kernel/sched/build_policy.c
> index d9dc9ab3773f..731c595e551c 100644
> --- a/kernel/sched/build_policy.c
> +++ b/kernel/sched/build_policy.c
> @@ -28,6 +28,7 @@
>  #include <linux/suspend.h>
>  #include <linux/tsacct_kern.h>
>  #include <linux/vtime.h>
> +#include <linux/interrupt.h>
>  
>  #include <uapi/linux/sched/types.h>
>  
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index f26ab2675f7d..093bfdfca2f1 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -370,6 +370,10 @@ void play_idle_precise(u64 duration_ns, u64
> latency_ns)
>         WARN_ON_ONCE(!duration_ns);
>         WARN_ON_ONCE(current->mm);
>  
> +       rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
> +       __run_ksoftirqd(smp_processor_id());
> +       rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
> +
>         rcu_sleep_check();
>         preempt_disable();
>         current->flags |= PF_IDLE;
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index c8a6913c067d..a2cb600383a4 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -59,6 +59,7 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
>  static struct softirq_action softirq_vec[NR_SOFTIRQS]
> __cacheline_aligned_in_smp;
>  
>  DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
> +DEFINE_PER_CPU(struct rt_mutex, ksoftirq_lock);
>  
>  const char * const softirq_to_name[NR_SOFTIRQS] = {
>         "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
> @@ -912,6 +913,7 @@ void __init softirq_init(void)
>                         &per_cpu(tasklet_vec, cpu).head;
>                 per_cpu(tasklet_hi_vec, cpu).tail =
>                         &per_cpu(tasklet_hi_vec, cpu).head;
> +               rt_mutex_init(&per_cpu(ksoftirq_mutex, cpu));
>         }
>  
>         open_softirq(TASKLET_SOFTIRQ, tasklet_action);
> @@ -923,7 +925,7 @@ static int ksoftirqd_should_run(unsigned int cpu)
>         return local_softirq_pending();
>  }
>  
> -static void run_ksoftirqd(unsigned int cpu)
> +bool __run_ksoftirqd(unsigned int cpu)
>  {
>         ksoftirqd_run_begin();
>         if (local_softirq_pending()) {
> @@ -933,10 +935,22 @@ static void run_ksoftirqd(unsigned int cpu)
>                  */
>                 __do_softirq();
>                 ksoftirqd_run_end();
> -               cond_resched();
> -               return;
> +               return true;
>         }
>         ksoftirqd_run_end();
> +       return false;
> +}
> +
> +static void run_ksoftirqd(unsigned int cpu)
> +{
> +       bool run;
> +
> +       rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
> +       ran = __run_ksoftirqd(cpu);
> +       rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
> +
> +       if (ran)
> +               cond_resched();
>  }
>  
>  #ifdef CONFIG_HOTPLUG_CPU


  parent reply	other threads:[~2022-12-19 22:55 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-15 18:42 [RFC/RFT PATCH 0/2] Forced idle and Non-RCU local softirq pending Srinivas Pandruvada
2022-12-15 18:42 ` [RFC/RFT PATCH 1/2] sched/core: Check and schedule ksoftirq Srinivas Pandruvada
2022-12-16 11:19   ` Peter Zijlstra
2022-12-16 16:58     ` srinivas pandruvada
2022-12-16 22:07     ` Frederic Weisbecker
2022-12-19 11:33       ` Peter Zijlstra
2022-12-19 11:46         ` Sebastian Andrzej Siewior
2022-12-19 14:56           ` Peter Zijlstra
2022-12-19 22:54         ` srinivas pandruvada [this message]
2022-12-20 20:51           ` Peter Zijlstra
2022-12-20 21:18             ` Peter Zijlstra
2023-01-10  2:33               ` srinivas pandruvada
2022-12-15 18:43 ` [RFC/RFT PATCH 2/2] sched/core: Add max duration to play_precise_idle() Srinivas Pandruvada

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5ae0d53990c29aa25648cbf32ef3b16e9bec911c.camel@linux.intel.com \
    --to=srinivas.pandruvada@linux.intel.com \
    --cc=bigeasy@linutronix.de \
    --cc=daniel.lezcano@linaro.org \
    --cc=frederic@kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).