public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	rafael@kernel.org, daniel.lezcano@linaro.org,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	len.brown@intel.com, Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [RFC/RFT PATCH 1/2] sched/core: Check and schedule ksoftirq
Date: Mon, 19 Dec 2022 12:33:22 +0100	[thread overview]
Message-ID: <Y6BMAp6A731F8ZL4@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20221216220748.GA1967978@lothringen>

On Fri, Dec 16, 2022 at 11:07:48PM +0100, Frederic Weisbecker wrote:
> On Fri, Dec 16, 2022 at 12:19:34PM +0100, Peter Zijlstra wrote:
> > On Thu, Dec 15, 2022 at 10:42:59AM -0800, Srinivas Pandruvada wrote:
> > > +		/* Give ksoftirqd 1 jiffy to get a chance to start its job */
> > > +		if (!READ_ONCE(it.done) && task_is_running(__this_cpu_read(ksoftirqd))) {
> > > +			__set_current_state(TASK_UNINTERRUPTIBLE);
> > > +			schedule_timeout(1);
> > > +		}
> > 
> > That's absolutely disgusting :-/
> 
> I know, and I hate checking task_is_running(__this_cpu_read(ksoftirqd))
> everywhere in idle. And in fact it doesn't work because some cpuidle drivers
> also do need_resched() checks.

quite a few indeed.

> I guess that either we assume that the idle injection is more important
> than serving softirqs and we shutdown the warnings accordingly, or we arrange
> for idle injection to have a lower prio than ksoftirqd.

ksoftirq is typically a CFS task while idle injection is required to be
a FIFO (typically FIFO-1) task -- so that would require lifting
ksoftirqd to FIFO and that has other problems.

I'm guessing the problem case is where idle injection comes in while
ksoftirq is running (and preempted), because in that case you cannot run
the softirq stuff in-place.

The 'right' thing to do would be to PI boost ksoftirqd in that case I
suppose. Perhaps something like so, it would boost ksoftirq when it's
running, and otherwise runs the ksoftirqd thing in-situ.

I've not really throught hard about this though, so perhaps I completely
wrecked things.


---
 include/linux/interrupt.h   |  3 +++
 kernel/sched/build_policy.c |  1 +
 kernel/sched/idle.c         |  4 ++++
 kernel/softirq.c            | 20 +++++++++++++++++---
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce40b04b..a2db811d6947 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -606,6 +606,9 @@ extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 
 DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+DECLARE_PER_CPU(struct rt_mutex, ksoftirq_lock);
+
+extern bool __run_ksoftirqd(unsigned int cpu);
 
 static inline struct task_struct *this_cpu_ksoftirqd(void)
 {
diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
index d9dc9ab3773f..731c595e551c 100644
--- a/kernel/sched/build_policy.c
+++ b/kernel/sched/build_policy.c
@@ -28,6 +28,7 @@
 #include <linux/suspend.h>
 #include <linux/tsacct_kern.h>
 #include <linux/vtime.h>
+#include <linux/interrupt.h>
 
 #include <uapi/linux/sched/types.h>
 
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index f26ab2675f7d..093bfdfca2f1 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -370,6 +370,10 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
 	WARN_ON_ONCE(!duration_ns);
 	WARN_ON_ONCE(current->mm);
 
+	rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
+	__run_ksoftirqd(smp_processor_id());
+	rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
+
 	rcu_sleep_check();
 	preempt_disable();
 	current->flags |= PF_IDLE;
diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..a2cb600383a4 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -59,6 +59,7 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct rt_mutex, ksoftirq_lock);
 
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
@@ -912,6 +913,7 @@ void __init softirq_init(void)
 			&per_cpu(tasklet_vec, cpu).head;
 		per_cpu(tasklet_hi_vec, cpu).tail =
 			&per_cpu(tasklet_hi_vec, cpu).head;
+		rt_mutex_init(&per_cpu(ksoftirq_mutex, cpu));
 	}
 
 	open_softirq(TASKLET_SOFTIRQ, tasklet_action);
@@ -923,7 +925,7 @@ static int ksoftirqd_should_run(unsigned int cpu)
 	return local_softirq_pending();
 }
 
-static void run_ksoftirqd(unsigned int cpu)
+bool __run_ksoftirqd(unsigned int cpu)
 {
 	ksoftirqd_run_begin();
 	if (local_softirq_pending()) {
@@ -933,10 +935,22 @@ static void run_ksoftirqd(unsigned int cpu)
 		 */
 		__do_softirq();
 		ksoftirqd_run_end();
-		cond_resched();
-		return;
+		return true;
 	}
 	ksoftirqd_run_end();
+	return false;
+}
+
+static void run_ksoftirqd(unsigned int cpu)
+{
+	bool run;
+
+	rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
+	ran = __run_ksoftirqd(cpu);
+	rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
+
+	if (ran)
+		cond_resched();
 }
 
 #ifdef CONFIG_HOTPLUG_CPU

  reply	other threads:[~2022-12-19 11:33 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-15 18:42 [RFC/RFT PATCH 0/2] Forced idle and Non-RCU local softirq pending Srinivas Pandruvada
2022-12-15 18:42 ` [RFC/RFT PATCH 1/2] sched/core: Check and schedule ksoftirq Srinivas Pandruvada
2022-12-16 11:19   ` Peter Zijlstra
2022-12-16 16:58     ` srinivas pandruvada
2022-12-16 22:07     ` Frederic Weisbecker
2022-12-19 11:33       ` Peter Zijlstra [this message]
2022-12-19 11:46         ` Sebastian Andrzej Siewior
2022-12-19 14:56           ` Peter Zijlstra
2022-12-19 22:54         ` srinivas pandruvada
2022-12-20 20:51           ` Peter Zijlstra
2022-12-20 21:18             ` Peter Zijlstra
2023-01-10  2:33               ` srinivas pandruvada
2022-12-15 18:43 ` [RFC/RFT PATCH 2/2] sched/core: Add max duration to play_precise_idle() Srinivas Pandruvada

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y6BMAp6A731F8ZL4@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bigeasy@linutronix.de \
    --cc=daniel.lezcano@linaro.org \
    --cc=frederic@kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox