public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
@ 2024-10-04 10:17 Sebastian Andrzej Siewior
  2024-10-04 10:17 ` [PATCH 1/1] " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-04 10:17 UTC (permalink / raw)
  To: linux-kernel, rcu
  Cc: Paul E. McKenney, Anna-Maria Behnsen, Davidlohr Bueso,
	Frederic Weisbecker, Ingo Molnar, Josh Triplett, Thomas Gleixner

Hi,

the following was in the PREEMPT_RT queue since last softirq rework. The
result is that timer wake ups (hrtimer, timer_list) happens in hardirq
processing them requires to wake ksoftirqd leading two:
- ksoftirqd will consume all further softirqs. That means all
  soft interrupts that would be processed in the threaded interrupt will
  no be delayed to ksoftirqd. The threaded interrupt runs at higher
  priority than ksoftirqd.

- ksoftirqd runs at SCHED_OTHER so it will compete for resources with
  all other tasks in the system, potentially delayed the processing
  further.

The idea was to let the timers be processed by a dedicated thread
running at low SCHED_FIFO priority.
While looking at it again, it might make sense to have the
pending_softirq flag per-thread to avoid threads with higher priority
picking up softirqs from low-priority threads. This isn't yet a problem
because adding softirqs for processing happens only from threaded
interrupts. So the low-priority thread will wait until the high-priority
thread is done. And the high-priority thread will PI-boost the
low-priority thread until it is done. It would only make sense to make
the flags per-thread once the BH lock is gone.

The patch is limited to PREEMPT_RT. The ksoftirqd bullets from above
apply also to !PREEMPT_RT +threadirqs. Would it make sense to restrict
it to force_irqthreads() instead?

Sebastian


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-04 10:17 [PATCH 0/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT Sebastian Andrzej Siewior
@ 2024-10-04 10:17 ` Sebastian Andrzej Siewior
  2024-10-21 23:17   ` Paul E. McKenney
  2024-10-22 13:28   ` Frederic Weisbecker
  0 siblings, 2 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-04 10:17 UTC (permalink / raw)
  To: linux-kernel, rcu
  Cc: Paul E. McKenney, Anna-Maria Behnsen, Davidlohr Bueso,
	Frederic Weisbecker, Ingo Molnar, Josh Triplett, Thomas Gleixner,
	Sebastian Andrzej Siewior

A timer/ hrtimer softirq is raised in-IRQ context. With threaded
interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
for the processing of the softirq.
Once the ksoftirqd is marked as pending (or is running) it will collect
all raised softirqs. This in turn means that a softirq which would have
been processed at the end of the threaded interrupt, which runs at an
elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
priority and competes with every regular task for CPU resources.
This introduces long delays on heavy loaded systems and is not desired
especially if the system is not overloaded by the softirqs.

Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
timers thread and let it run at the lowest SCHED_FIFO priority.
Wake-ups for RT tasks happen from hardirq context so only timer_list timers
and hrtimers for "regular" tasks are processed here. The higher priority
ensures that wakeups are performed before scheduling SCHED_OTHER tasks.

Using a dedicated variable to store the pending softirq bits values
ensure that the timer are not accidentally picked up by ksoftirqd and
other threaded interrupts.
It shouldn't be picked up by ksoftirqd since it runs at lower priority.
However if ksoftirqd is already running while a timer fires, then
ksoftird will be PI-boosted due to the BH-lock to ktimer's priority.
Ideally we try to avoid having ksoftirqd running.

The timer thread can pick up pending softirqs from ksoftirqd but only
if the softirq load is high. It is not be desired that the picked up
softirqs are processed at SCHED_FIFO priority under high softirq load
but this can already happen by a PI-boost by a force-threaded interrupt.

[ frederic@kernel.org: rcutorture.c fixes, storm fix by introduction of
  local_pending_timers() for tick_nohz_next_event() ]

[ junxiao.chang@intel.com: Ensure ktimersd gets woken up even if a
  softirq is currently served. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/interrupt.h | 29 ++++++++++++++
 kernel/rcu/rcutorture.c   |  6 +++
 kernel/softirq.c          | 82 ++++++++++++++++++++++++++++++++++++++-
 kernel/time/hrtimer.c     |  4 +-
 kernel/time/tick-sched.c  |  2 +-
 kernel/time/timer.c       |  2 +-
 6 files changed, 120 insertions(+), 5 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 457151f9f263d..4a4f367cd6864 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -616,6 +616,35 @@ extern void __raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 
+#ifdef CONFIG_PREEMPT_RT
+DECLARE_PER_CPU(struct task_struct *, timersd);
+DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
+
+extern void raise_timer_softirq(void);
+extern void raise_hrtimer_softirq(void);
+
+static inline unsigned int local_pending_timers(void)
+{
+	return __this_cpu_read(pending_timer_softirq);
+}
+
+#else
+static inline void raise_timer_softirq(void)
+{
+	raise_softirq(TIMER_SOFTIRQ);
+}
+
+static inline void raise_hrtimer_softirq(void)
+{
+	raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+}
+
+static inline unsigned int local_pending_timers(void)
+{
+	return local_softirq_pending();
+}
+#endif
+
 DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
 
 static inline struct task_struct *this_cpu_ksoftirqd(void)
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index bb75dbf5c800c..609687fd742d5 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2440,6 +2440,12 @@ static int rcutorture_booster_init(unsigned int cpu)
 		WARN_ON_ONCE(!t);
 		sp.sched_priority = 2;
 		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+#ifdef CONFIG_PREEMPT_RT
+		t = per_cpu(timersd, cpu);
+		WARN_ON_ONCE(!t);
+		sp.sched_priority = 2;
+		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+#endif
 	}
 
 	/* Don't allow time recalculation while creating a new task. */
diff --git a/kernel/softirq.c b/kernel/softirq.c
index d082e7840f880..2d847405e5a7f 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -624,6 +624,24 @@ static inline void tick_irq_exit(void)
 #endif
 }
 
+#ifdef CONFIG_PREEMPT_RT
+DEFINE_PER_CPU(struct task_struct *, timersd);
+DEFINE_PER_CPU(unsigned long, pending_timer_softirq);
+
+static void wake_timersd(void)
+{
+	struct task_struct *tsk = __this_cpu_read(timersd);
+
+	if (tsk)
+		wake_up_process(tsk);
+}
+
+#else
+
+static inline void wake_timersd(void) { }
+
+#endif
+
 static inline void __irq_exit_rcu(void)
 {
 #ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
@@ -636,6 +654,10 @@ static inline void __irq_exit_rcu(void)
 	if (!in_interrupt() && local_softirq_pending())
 		invoke_softirq();
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && local_pending_timers() &&
+	    !(in_nmi() | in_hardirq()))
+		wake_timersd();
+
 	tick_irq_exit();
 }
 
@@ -971,12 +993,70 @@ static struct smp_hotplug_thread softirq_threads = {
 	.thread_comm		= "ksoftirqd/%u",
 };
 
+#ifdef CONFIG_PREEMPT_RT
+static void timersd_setup(unsigned int cpu)
+{
+	sched_set_fifo_low(current);
+}
+
+static int timersd_should_run(unsigned int cpu)
+{
+	return local_pending_timers();
+}
+
+static void run_timersd(unsigned int cpu)
+{
+	unsigned int timer_si;
+
+	ksoftirqd_run_begin();
+
+	timer_si = local_pending_timers();
+	__this_cpu_write(pending_timer_softirq, 0);
+	or_softirq_pending(timer_si);
+
+	__do_softirq();
+
+	ksoftirqd_run_end();
+}
+
+static void raise_ktimers_thread(unsigned int nr)
+{
+	trace_softirq_raise(nr);
+	__this_cpu_or(pending_timer_softirq, 1 << nr);
+}
+
+void raise_hrtimer_softirq(void)
+{
+	raise_ktimers_thread(HRTIMER_SOFTIRQ);
+}
+
+void raise_timer_softirq(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	raise_ktimers_thread(TIMER_SOFTIRQ);
+	wake_timersd();
+	local_irq_restore(flags);
+}
+
+static struct smp_hotplug_thread timer_threads = {
+	.store			= &timersd,
+	.setup			= timersd_setup,
+	.thread_should_run	= timersd_should_run,
+	.thread_fn		= run_timersd,
+	.thread_comm		= "ktimers/%u",
+};
+#endif
+
 static __init int spawn_ksoftirqd(void)
 {
 	cpuhp_setup_state_nocalls(CPUHP_SOFTIRQ_DEAD, "softirq:dead", NULL,
 				  takeover_tasklets);
 	BUG_ON(smpboot_register_percpu_thread(&softirq_threads));
-
+#ifdef CONFIG_PREEMPT_RT
+	BUG_ON(smpboot_register_percpu_thread(&timer_threads));
+#endif
 	return 0;
 }
 early_initcall(spawn_ksoftirqd);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index cddcd08ea827f..133d49f703d93 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1811,7 +1811,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
 		cpu_base->softirq_expires_next = KTIME_MAX;
 		cpu_base->softirq_activated = 1;
-		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+		raise_hrtimer_softirq();
 	}
 
 	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
@@ -1906,7 +1906,7 @@ void hrtimer_run_queues(void)
 	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
 		cpu_base->softirq_expires_next = KTIME_MAX;
 		cpu_base->softirq_activated = 1;
-		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+		raise_hrtimer_softirq();
 	}
 
 	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 753a184c70907..efa3181607a2b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -859,7 +859,7 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
 
 static inline bool local_timer_softirq_pending(void)
 {
-	return local_softirq_pending() & BIT(TIMER_SOFTIRQ);
+	return local_pending_timers() & BIT(TIMER_SOFTIRQ);
 }
 
 /*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 0fc9d066a7be4..79f0dc73ac436 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -2499,7 +2499,7 @@ static void run_local_timers(void)
 		 */
 		if (time_after_eq(jiffies, READ_ONCE(base->next_expiry)) ||
 		    (i == BASE_DEF && tmigr_requires_handle_remote())) {
-			raise_softirq(TIMER_SOFTIRQ);
+			raise_timer_softirq();
 			return;
 		}
 	}
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-04 10:17 ` [PATCH 1/1] " Sebastian Andrzej Siewior
@ 2024-10-21 23:17   ` Paul E. McKenney
  2024-10-22 13:28   ` Frederic Weisbecker
  1 sibling, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2024-10-21 23:17 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, rcu, Anna-Maria Behnsen, Davidlohr Bueso,
	Frederic Weisbecker, Ingo Molnar, Josh Triplett, Thomas Gleixner

On Fri, Oct 04, 2024 at 12:17:04PM +0200, Sebastian Andrzej Siewior wrote:
> A timer/ hrtimer softirq is raised in-IRQ context. With threaded
> interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
> for the processing of the softirq.
> Once the ksoftirqd is marked as pending (or is running) it will collect
> all raised softirqs. This in turn means that a softirq which would have
> been processed at the end of the threaded interrupt, which runs at an
> elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> priority and competes with every regular task for CPU resources.
> This introduces long delays on heavy loaded systems and is not desired
> especially if the system is not overloaded by the softirqs.
> 
> Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
> timers thread and let it run at the lowest SCHED_FIFO priority.
> Wake-ups for RT tasks happen from hardirq context so only timer_list timers
> and hrtimers for "regular" tasks are processed here. The higher priority
> ensures that wakeups are performed before scheduling SCHED_OTHER tasks.
> 
> Using a dedicated variable to store the pending softirq bits values
> ensure that the timer are not accidentally picked up by ksoftirqd and
> other threaded interrupts.
> It shouldn't be picked up by ksoftirqd since it runs at lower priority.
> However if ksoftirqd is already running while a timer fires, then
> ksoftird will be PI-boosted due to the BH-lock to ktimer's priority.
> Ideally we try to avoid having ksoftirqd running.
> 
> The timer thread can pick up pending softirqs from ksoftirqd but only
> if the softirq load is high. It is not be desired that the picked up
> softirqs are processed at SCHED_FIFO priority under high softirq load
> but this can already happen by a PI-boost by a force-threaded interrupt.
> 
> [ frederic@kernel.org: rcutorture.c fixes, storm fix by introduction of
>   local_pending_timers() for tick_nohz_next_event() ]
> 
> [ junxiao.chang@intel.com: Ensure ktimersd gets woken up even if a
>   softirq is currently served. ]
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

For the rcutorture pieces:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/interrupt.h | 29 ++++++++++++++
>  kernel/rcu/rcutorture.c   |  6 +++
>  kernel/softirq.c          | 82 ++++++++++++++++++++++++++++++++++++++-
>  kernel/time/hrtimer.c     |  4 +-
>  kernel/time/tick-sched.c  |  2 +-
>  kernel/time/timer.c       |  2 +-
>  6 files changed, 120 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 457151f9f263d..4a4f367cd6864 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -616,6 +616,35 @@ extern void __raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq(unsigned int nr);
>  
> +#ifdef CONFIG_PREEMPT_RT
> +DECLARE_PER_CPU(struct task_struct *, timersd);
> +DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
> +
> +extern void raise_timer_softirq(void);
> +extern void raise_hrtimer_softirq(void);
> +
> +static inline unsigned int local_pending_timers(void)
> +{
> +	return __this_cpu_read(pending_timer_softirq);
> +}
> +
> +#else
> +static inline void raise_timer_softirq(void)
> +{
> +	raise_softirq(TIMER_SOFTIRQ);
> +}
> +
> +static inline void raise_hrtimer_softirq(void)
> +{
> +	raise_softirq_irqoff(HRTIMER_SOFTIRQ);
> +}
> +
> +static inline unsigned int local_pending_timers(void)
> +{
> +	return local_softirq_pending();
> +}
> +#endif
> +
>  DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
>  
>  static inline struct task_struct *this_cpu_ksoftirqd(void)
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index bb75dbf5c800c..609687fd742d5 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2440,6 +2440,12 @@ static int rcutorture_booster_init(unsigned int cpu)
>  		WARN_ON_ONCE(!t);
>  		sp.sched_priority = 2;
>  		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +#ifdef CONFIG_PREEMPT_RT
> +		t = per_cpu(timersd, cpu);
> +		WARN_ON_ONCE(!t);
> +		sp.sched_priority = 2;
> +		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +#endif
>  	}
>  
>  	/* Don't allow time recalculation while creating a new task. */
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d082e7840f880..2d847405e5a7f 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -624,6 +624,24 @@ static inline void tick_irq_exit(void)
>  #endif
>  }
>  
> +#ifdef CONFIG_PREEMPT_RT
> +DEFINE_PER_CPU(struct task_struct *, timersd);
> +DEFINE_PER_CPU(unsigned long, pending_timer_softirq);
> +
> +static void wake_timersd(void)
> +{
> +	struct task_struct *tsk = __this_cpu_read(timersd);
> +
> +	if (tsk)
> +		wake_up_process(tsk);
> +}
> +
> +#else
> +
> +static inline void wake_timersd(void) { }
> +
> +#endif
> +
>  static inline void __irq_exit_rcu(void)
>  {
>  #ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
> @@ -636,6 +654,10 @@ static inline void __irq_exit_rcu(void)
>  	if (!in_interrupt() && local_softirq_pending())
>  		invoke_softirq();
>  
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && local_pending_timers() &&
> +	    !(in_nmi() | in_hardirq()))
> +		wake_timersd();
> +
>  	tick_irq_exit();
>  }
>  
> @@ -971,12 +993,70 @@ static struct smp_hotplug_thread softirq_threads = {
>  	.thread_comm		= "ksoftirqd/%u",
>  };
>  
> +#ifdef CONFIG_PREEMPT_RT
> +static void timersd_setup(unsigned int cpu)
> +{
> +	sched_set_fifo_low(current);
> +}
> +
> +static int timersd_should_run(unsigned int cpu)
> +{
> +	return local_pending_timers();
> +}
> +
> +static void run_timersd(unsigned int cpu)
> +{
> +	unsigned int timer_si;
> +
> +	ksoftirqd_run_begin();
> +
> +	timer_si = local_pending_timers();
> +	__this_cpu_write(pending_timer_softirq, 0);
> +	or_softirq_pending(timer_si);
> +
> +	__do_softirq();
> +
> +	ksoftirqd_run_end();
> +}
> +
> +static void raise_ktimers_thread(unsigned int nr)
> +{
> +	trace_softirq_raise(nr);
> +	__this_cpu_or(pending_timer_softirq, 1 << nr);
> +}
> +
> +void raise_hrtimer_softirq(void)
> +{
> +	raise_ktimers_thread(HRTIMER_SOFTIRQ);
> +}
> +
> +void raise_timer_softirq(void)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	raise_ktimers_thread(TIMER_SOFTIRQ);
> +	wake_timersd();
> +	local_irq_restore(flags);
> +}
> +
> +static struct smp_hotplug_thread timer_threads = {
> +	.store			= &timersd,
> +	.setup			= timersd_setup,
> +	.thread_should_run	= timersd_should_run,
> +	.thread_fn		= run_timersd,
> +	.thread_comm		= "ktimers/%u",
> +};
> +#endif
> +
>  static __init int spawn_ksoftirqd(void)
>  {
>  	cpuhp_setup_state_nocalls(CPUHP_SOFTIRQ_DEAD, "softirq:dead", NULL,
>  				  takeover_tasklets);
>  	BUG_ON(smpboot_register_percpu_thread(&softirq_threads));
> -
> +#ifdef CONFIG_PREEMPT_RT
> +	BUG_ON(smpboot_register_percpu_thread(&timer_threads));
> +#endif
>  	return 0;
>  }
>  early_initcall(spawn_ksoftirqd);
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index cddcd08ea827f..133d49f703d93 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1811,7 +1811,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>  	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
>  		cpu_base->softirq_expires_next = KTIME_MAX;
>  		cpu_base->softirq_activated = 1;
> -		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
> +		raise_hrtimer_softirq();
>  	}
>  
>  	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
> @@ -1906,7 +1906,7 @@ void hrtimer_run_queues(void)
>  	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
>  		cpu_base->softirq_expires_next = KTIME_MAX;
>  		cpu_base->softirq_activated = 1;
> -		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
> +		raise_hrtimer_softirq();
>  	}
>  
>  	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 753a184c70907..efa3181607a2b 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -859,7 +859,7 @@ static void tick_nohz_restart(struct tick_sched *ts, ktime_t now)
>  
>  static inline bool local_timer_softirq_pending(void)
>  {
> -	return local_softirq_pending() & BIT(TIMER_SOFTIRQ);
> +	return local_pending_timers() & BIT(TIMER_SOFTIRQ);
>  }
>  
>  /*
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 0fc9d066a7be4..79f0dc73ac436 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -2499,7 +2499,7 @@ static void run_local_timers(void)
>  		 */
>  		if (time_after_eq(jiffies, READ_ONCE(base->next_expiry)) ||
>  		    (i == BASE_DEF && tmigr_requires_handle_remote())) {
> -			raise_softirq(TIMER_SOFTIRQ);
> +			raise_timer_softirq();
>  			return;
>  		}
>  	}
> -- 
> 2.45.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-04 10:17 ` [PATCH 1/1] " Sebastian Andrzej Siewior
  2024-10-21 23:17   ` Paul E. McKenney
@ 2024-10-22 13:28   ` Frederic Weisbecker
  2024-10-22 15:34     ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 10+ messages in thread
From: Frederic Weisbecker @ 2024-10-22 13:28 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

Le Fri, Oct 04, 2024 at 12:17:04PM +0200, Sebastian Andrzej Siewior a écrit :
> A timer/ hrtimer softirq is raised in-IRQ context. With threaded
> interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
> for the processing of the softirq.

It took me some time to understand the actual problem (yeah I know...)

Can this be rephrased as: "Timer and hrtimer softirq vectors are special
in that they are always raised in-IRQ context whereas other vectors are
more likely to be raised from threaded interrupts or any regular tasks
when threaded interrupts or PREEMPT_RT are enabled. This leads to
waking ksoftirqd for the processing of the softirqs whenever timer
vectors are involved.

>
> Once the ksoftirqd is marked as pending (or is running) it will collect
> all raised softirqs. This in turn means that a softirq which would have
> been processed at the end of the threaded interrupt, which runs at an
> elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> priority and competes with every regular task for CPU resources.

But for ksoftirqd to collect other non-timers softirqs, those vectors must
have been raised before from a threaded interrupt, right? So why those
vectors didn't get a chance to execute at the end of that threaded interrupt?

OTOH one problem I can imagine is a threaded interrupt preempting ksoftirqd
which must wait for ksoftirqd to complete due to the local_bh_disable()
in the beginning of irq_forced_thread_fn(). But then isn't there some
PI involved on the local lock?

Sorry I'm probably missing something...

>
> This introduces long delays on heavy loaded systems and is not desired
> especially if the system is not overloaded by the softirqs.
> 
> Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
> timers thread and let it run at the lowest SCHED_FIFO priority.
> Wake-ups for RT tasks happen from hardirq context so only timer_list timers
> and hrtimers for "regular" tasks are processed here.

That last sentence confuses me. How are timers for non regular task processed
elsewhere? Ah you mean RT tasks rely on hard hrtimers?

> The higher priority
> ensures that wakeups are performed before scheduling SCHED_OTHER tasks.
> 
> Using a dedicated variable to store the pending softirq bits values
> ensure that the timer are not accidentally picked up by ksoftirqd and
> other threaded interrupts.
> It shouldn't be picked up by ksoftirqd since it runs at lower priority.
> However if ksoftirqd is already running while a timer fires, then
> ksoftird will be PI-boosted due to the BH-lock to ktimer's priority.
> Ideally we try to avoid having ksoftirqd running.
> 
> The timer thread can pick up pending softirqs from ksoftirqd but only
> if the softirq load is high. It is not be desired that the picked up
> softirqs are processed at SCHED_FIFO priority under high softirq load
> but this can already happen by a PI-boost by a force-threaded interrupt.
> 
> [ frederic@kernel.org: rcutorture.c fixes, storm fix by introduction of
>   local_pending_timers() for tick_nohz_next_event() ]
> 
> [ junxiao.chang@intel.com: Ensure ktimersd gets woken up even if a
>   softirq is currently served. ]
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  include/linux/interrupt.h | 29 ++++++++++++++
>  kernel/rcu/rcutorture.c   |  6 +++
>  kernel/softirq.c          | 82 ++++++++++++++++++++++++++++++++++++++-
>  kernel/time/hrtimer.c     |  4 +-
>  kernel/time/tick-sched.c  |  2 +-
>  kernel/time/timer.c       |  2 +-
>  6 files changed, 120 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 457151f9f263d..4a4f367cd6864 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -616,6 +616,35 @@ extern void __raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq(unsigned int nr);
>  
> +#ifdef CONFIG_PREEMPT_RT

This needs a comment section to explain why a dedicated
timers processing is needed.

> +DECLARE_PER_CPU(struct task_struct *, timersd);
> +DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
> +
> +extern void raise_timer_softirq(void);
> +extern void raise_hrtimer_softirq(void);
> +
> +static inline unsigned int local_pending_timers(void)

Let's align with local_softirq_pending() naming and rather
have local_timers_pending() ?

> +{
> +	return __this_cpu_read(pending_timer_softirq);
> +}
> +
> +#ifdef CONFIG_PREEMPT_RT
> +static void timersd_setup(unsigned int cpu)
> +{

That also needs a comment.

> +	sched_set_fifo_low(current);
> +}
> +
> +static int timersd_should_run(unsigned int cpu)
> +{
> +	return local_pending_timers();
> +}
> +
> +static void run_timersd(unsigned int cpu)
> +{
> +	unsigned int timer_si;
> +
> +	ksoftirqd_run_begin();
> +
> +	timer_si = local_pending_timers();
> +	__this_cpu_write(pending_timer_softirq, 0);
> +	or_softirq_pending(timer_si);
> +
> +	__do_softirq();
> +
> +	ksoftirqd_run_end();
> +}
> +
> +static void raise_ktimers_thread(unsigned int nr)
> +{
> +	trace_softirq_raise(nr);
> +	__this_cpu_or(pending_timer_softirq, 1 << nr);
> +}
> +
> +void raise_hrtimer_softirq(void)
> +{
> +	raise_ktimers_thread(HRTIMER_SOFTIRQ);
> +}
> +
> +void raise_timer_softirq(void)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	raise_ktimers_thread(TIMER_SOFTIRQ);
> +	wake_timersd();

This is supposed to be called from hardirq only, right?
Can't irq_exit_rcu() take care of it? Why is it different
from HRTIMER_SOFTIRQ ?

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-22 13:28   ` Frederic Weisbecker
@ 2024-10-22 15:34     ` Sebastian Andrzej Siewior
  2024-10-22 22:27       ` Frederic Weisbecker
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-22 15:34 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

On 2024-10-22 15:28:56 [+0200], Frederic Weisbecker wrote:
> Le Fri, Oct 04, 2024 at 12:17:04PM +0200, Sebastian Andrzej Siewior a écrit :
> > A timer/ hrtimer softirq is raised in-IRQ context. With threaded
> > interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
> > for the processing of the softirq.
> 
> It took me some time to understand the actual problem (yeah I know...)
> 
> Can this be rephrased as: "Timer and hrtimer softirq vectors are special
> in that they are always raised in-IRQ context whereas other vectors are
> more likely to be raised from threaded interrupts or any regular tasks
> when threaded interrupts or PREEMPT_RT are enabled. This leads to
> waking ksoftirqd for the processing of the softirqs whenever timer
> vectors are involved.

Oki.

> > Once the ksoftirqd is marked as pending (or is running) it will collect
> > all raised softirqs. This in turn means that a softirq which would have
> > been processed at the end of the threaded interrupt, which runs at an
> > elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> > priority and competes with every regular task for CPU resources.
> 
> But for ksoftirqd to collect other non-timers softirqs, those vectors must
> have been raised before from a threaded interrupt, right? So why those
> vectors didn't get a chance to execute at the end of that threaded interrupt?

This statement is no longer accurate since
	d15121be74856 ("Revert "softirq: Let ksoftirqd do its job"")

So the "collect all" part is no longer.

> OTOH one problem I can imagine is a threaded interrupt preempting ksoftirqd
> which must wait for ksoftirqd to complete due to the local_bh_disable()
> in the beginning of irq_forced_thread_fn(). But then isn't there some
> PI involved on the local lock?

Yes, there is PI involved on the local lock. So this will happen.

> Sorry I'm probably missing something...

Try again without the "ksoftirqd will collect it all" since this won't
happen since the revert I mentioned.

> > This introduces long delays on heavy loaded systems and is not desired
> > especially if the system is not overloaded by the softirqs.
> > 
> > Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
> > timers thread and let it run at the lowest SCHED_FIFO priority.
> > Wake-ups for RT tasks happen from hardirq context so only timer_list timers
> > and hrtimers for "regular" tasks are processed here.
> 
> That last sentence confuses me. How are timers for non regular task processed
> elsewhere? Ah you mean RT tasks rely on hard hrtimers?

Correct. A clock_nanosleep() for a RT task will result in wake_up() from
hardirq. A clock_nanosleep() for a !RT task will result in wake_up()
from ksoftirqd or now the suggested ktimersd.

Quick question: Do we want this in forced-threaded mode, too? The timer
(timer_list timer) and all HRTIMER_MODE_SOFT are handled in ksoftirqd.

> > The higher priority
> > ensures that wakeups are performed before scheduling SCHED_OTHER tasks.
> > 
> > Using a dedicated variable to store the pending softirq bits values
> > ensure that the timer are not accidentally picked up by ksoftirqd and
> > other threaded interrupts.
> > It shouldn't be picked up by ksoftirqd since it runs at lower priority.
> > However if ksoftirqd is already running while a timer fires, then
> > ksoftird will be PI-boosted due to the BH-lock to ktimer's priority.
> > Ideally we try to avoid having ksoftirqd running.
> > 
> > The timer thread can pick up pending softirqs from ksoftirqd but only
> > if the softirq load is high. It is not be desired that the picked up
> > softirqs are processed at SCHED_FIFO priority under high softirq load
> > but this can already happen by a PI-boost by a force-threaded interrupt.
> > 
> > [ frederic@kernel.org: rcutorture.c fixes, storm fix by introduction of
> >   local_pending_timers() for tick_nohz_next_event() ]
> > 
> > [ junxiao.chang@intel.com: Ensure ktimersd gets woken up even if a
> >   softirq is currently served. ]
> > 
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  include/linux/interrupt.h | 29 ++++++++++++++
> >  kernel/rcu/rcutorture.c   |  6 +++
> >  kernel/softirq.c          | 82 ++++++++++++++++++++++++++++++++++++++-
> >  kernel/time/hrtimer.c     |  4 +-
> >  kernel/time/tick-sched.c  |  2 +-
> >  kernel/time/timer.c       |  2 +-
> >  6 files changed, 120 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> > index 457151f9f263d..4a4f367cd6864 100644
> > --- a/include/linux/interrupt.h
> > +++ b/include/linux/interrupt.h
> > @@ -616,6 +616,35 @@ extern void __raise_softirq_irqoff(unsigned int nr);
> >  extern void raise_softirq_irqoff(unsigned int nr);
> >  extern void raise_softirq(unsigned int nr);
> >  
> > +#ifdef CONFIG_PREEMPT_RT
> 
> This needs a comment section to explain why a dedicated
> timers processing is needed.

Okay.

> > +DECLARE_PER_CPU(struct task_struct *, timersd);
> > +DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
> > +
> > +extern void raise_timer_softirq(void);
> > +extern void raise_hrtimer_softirq(void);
> > +
> > +static inline unsigned int local_pending_timers(void)
> 
> Let's align with local_softirq_pending() naming and rather
> have local_timers_pending() ?

good.

> > +{
> > +	return __this_cpu_read(pending_timer_softirq);
> > +}
> > +
> > +#ifdef CONFIG_PREEMPT_RT
> > +static void timersd_setup(unsigned int cpu)
> > +{
> 
> That also needs a comment.

Why we want the priority I guess.

…
> > +void raise_hrtimer_softirq(void)
> > +{
> > +	raise_ktimers_thread(HRTIMER_SOFTIRQ);
> > +}
> > +
> > +void raise_timer_softirq(void)
> > +{
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > +	wake_timersd();
> 
> This is supposed to be called from hardirq only, right?
> Can't irq_exit_rcu() take care of it? Why is it different
> from HRTIMER_SOFTIRQ ?

Good question. This shouldn't be any different compared to the hrtimer
case. This is only raised in hardirq, so yes, the irq_save can go away
and the wake call, too.

> Thanks.

Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-22 15:34     ` Sebastian Andrzej Siewior
@ 2024-10-22 22:27       ` Frederic Weisbecker
  2024-10-23  6:30         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Frederic Weisbecker @ 2024-10-22 22:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

Le Tue, Oct 22, 2024 at 05:34:21PM +0200, Sebastian Andrzej Siewior a écrit :
> On 2024-10-22 15:28:56 [+0200], Frederic Weisbecker wrote:
> > > Once the ksoftirqd is marked as pending (or is running) it will collect
> > > all raised softirqs. This in turn means that a softirq which would have
> > > been processed at the end of the threaded interrupt, which runs at an
> > > elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> > > priority and competes with every regular task for CPU resources.
> > 
> > But for ksoftirqd to collect other non-timers softirqs, those vectors must
> > have been raised before from a threaded interrupt, right? So why those
> > vectors didn't get a chance to execute at the end of that threaded interrupt?
> 
> This statement is no longer accurate since
> 	d15121be74856 ("Revert "softirq: Let ksoftirqd do its job"")
> 
> So the "collect all" part is no longer.
> 
> > OTOH one problem I can imagine is a threaded interrupt preempting ksoftirqd
> > which must wait for ksoftirqd to complete due to the local_bh_disable()
> > in the beginning of irq_forced_thread_fn(). But then isn't there some
> > PI involved on the local lock?
> 
> Yes, there is PI involved on the local lock. So this will happen.
> 
> > Sorry I'm probably missing something...
> 
> Try again without the "ksoftirqd will collect it all" since this won't
> happen since the revert I mentioned.

I still don't get it, this makes:

"""
Once the ksoftirqd is marked as pending (or is running), a softirq which
would have been processed at the end of the threaded interrupt, which runs
at an elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
priority and competes with every regular task for CPU resources.
"""

ksoftirqd raised for timers still doesn't prevent a threaded IRQ from running
softirqs, unless it preempts ksoftirqd and waits with PI. So is it what you're
trying to solve?

Or is the problem that timer softirqs are executed with SCHED_NORMAL?

> 
> > > This introduces long delays on heavy loaded systems and is not desired
> > > especially if the system is not overloaded by the softirqs.
> > > 
> > > Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
> > > timers thread and let it run at the lowest SCHED_FIFO priority.
> > > Wake-ups for RT tasks happen from hardirq context so only timer_list timers
> > > and hrtimers for "regular" tasks are processed here.
> > 
> > That last sentence confuses me. How are timers for non regular task processed
> > elsewhere? Ah you mean RT tasks rely on hard hrtimers?
> 
> Correct. A clock_nanosleep() for a RT task will result in wake_up() from
> hardirq. A clock_nanosleep() for a !RT task will result in wake_up()
> from ksoftirqd or now the suggested ktimersd.

Oh I see now, that's all in __hrtimer_init_sleeper().

> 
> Quick question: Do we want this in forced-threaded mode, too? The timer
> (timer_list timer) and all HRTIMER_MODE_SOFT are handled in ksoftirqd.

It's hard to tell for me as I don't know the !RT usecases for forced-threaded mode.
"If in doubt say N" ;-)

> > > +#ifdef CONFIG_PREEMPT_RT
> > > +static void timersd_setup(unsigned int cpu)
> > > +{
> > 
> > That also needs a comment.
> 
> Why we want the priority I guess.

Yes.

> 
> …
> > > +void raise_hrtimer_softirq(void)
> > > +{
> > > +	raise_ktimers_thread(HRTIMER_SOFTIRQ);
> > > +}
> > > +
> > > +void raise_timer_softirq(void)
> > > +{
> > > +	unsigned long flags;
> > > +
> > > +	local_irq_save(flags);
> > > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > > +	wake_timersd();
> > 
> > This is supposed to be called from hardirq only, right?
> > Can't irq_exit_rcu() take care of it? Why is it different
> > from HRTIMER_SOFTIRQ ?
> 
> Good question. This shouldn't be any different compared to the hrtimer
> case. This is only raised in hardirq, so yes, the irq_save can go away
> and the wake call, too.

Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
some well deserved relief :-)

Thanks.

> 
> > Thanks.
> 
> Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-22 22:27       ` Frederic Weisbecker
@ 2024-10-23  6:30         ` Sebastian Andrzej Siewior
  2024-10-23 10:10           ` Frederic Weisbecker
  2024-10-23 10:52           ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-23  6:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

On 2024-10-23 00:27:34 [+0200], Frederic Weisbecker wrote:
> > Try again without the "ksoftirqd will collect it all" since this won't
> > happen since the revert I mentioned.
> 
> I still don't get it, this makes:
> 
> """
> Once the ksoftirqd is marked as pending (or is running), a softirq which
> would have been processed at the end of the threaded interrupt, which runs
> at an elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> priority and competes with every regular task for CPU resources.
> """
> 
> ksoftirqd raised for timers still doesn't prevent a threaded IRQ from running
> softirqs, unless it preempts ksoftirqd and waits with PI. So is it what you're
> trying to solve?
> 
> Or is the problem that timer softirqs are executed with SCHED_NORMAL?

Exactly. It runs at SCHED_NORMAL and competes with everything else. It
can delay tasks wakes depending on system load.

> > Quick question: Do we want this in forced-threaded mode, too? The timer
> > (timer_list timer) and all HRTIMER_MODE_SOFT are handled in ksoftirqd.
> 
> It's hard to tell for me as I don't know the !RT usecases for forced-threaded mode.
> "If in doubt say N" ;-)

Oki.

> > > > +void raise_timer_softirq(void)
> > > > +{
> > > > +	unsigned long flags;
> > > > +
> > > > +	local_irq_save(flags);
> > > > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > +	wake_timersd();
> > > 
> > > This is supposed to be called from hardirq only, right?
> > > Can't irq_exit_rcu() take care of it? Why is it different
> > > from HRTIMER_SOFTIRQ ?
> > 
> > Good question. This shouldn't be any different compared to the hrtimer
> > case. This is only raised in hardirq, so yes, the irq_save can go away
> > and the wake call, too.
> 
> Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> some well deserved relief :-)

If you want to, sure. I would add them to both raise functions.

> Thanks.

Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-23  6:30         ` Sebastian Andrzej Siewior
@ 2024-10-23 10:10           ` Frederic Weisbecker
  2024-10-23 10:52           ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 10+ messages in thread
From: Frederic Weisbecker @ 2024-10-23 10:10 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

Le Wed, Oct 23, 2024 at 08:30:14AM +0200, Sebastian Andrzej Siewior a écrit :
> On 2024-10-23 00:27:34 [+0200], Frederic Weisbecker wrote:
> > > Try again without the "ksoftirqd will collect it all" since this won't
> > > happen since the revert I mentioned.
> > 
> > I still don't get it, this makes:
> > 
> > """
> > Once the ksoftirqd is marked as pending (or is running), a softirq which
> > would have been processed at the end of the threaded interrupt, which runs
> > at an elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> > priority and competes with every regular task for CPU resources.
> > """
> > 
> > ksoftirqd raised for timers still doesn't prevent a threaded IRQ from running
> > softirqs, unless it preempts ksoftirqd and waits with PI. So is it what you're
> > trying to solve?
> > 
> > Or is the problem that timer softirqs are executed with SCHED_NORMAL?
> 
> Exactly. It runs at SCHED_NORMAL and competes with everything else. It
> can delay tasks wakes depending on system load.

Ok so that narrows down the problem and it's much clearer, thanks.

> > > > > +void raise_timer_softirq(void)
> > > > > +{
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	local_irq_save(flags);
> > > > > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > > +	wake_timersd();
> > > > 
> > > > This is supposed to be called from hardirq only, right?
> > > > Can't irq_exit_rcu() take care of it? Why is it different
> > > > from HRTIMER_SOFTIRQ ?
> > > 
> > > Good question. This shouldn't be any different compared to the hrtimer
> > > case. This is only raised in hardirq, so yes, the irq_save can go away
> > > and the wake call, too.
> > 
> > Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> > some well deserved relief :-)
> 
> If you want to, sure. I would add them to both raise functions.

Yeah, just in case. Thanks!

> 
> > Thanks.
> 
> Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-23  6:30         ` Sebastian Andrzej Siewior
  2024-10-23 10:10           ` Frederic Weisbecker
@ 2024-10-23 10:52           ` Sebastian Andrzej Siewior
  2024-10-24 14:02             ` Frederic Weisbecker
  1 sibling, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-23 10:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

On 2024-10-23 08:30:18 [+0200], To Frederic Weisbecker wrote:
> > > > > +void raise_timer_softirq(void)
> > > > > +{
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	local_irq_save(flags);
> > > > > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > > +	wake_timersd();
> > > > 
> > > > This is supposed to be called from hardirq only, right?
> > > > Can't irq_exit_rcu() take care of it? Why is it different
> > > > from HRTIMER_SOFTIRQ ?
> > > 
> > > Good question. This shouldn't be any different compared to the hrtimer
> > > case. This is only raised in hardirq, so yes, the irq_save can go away
> > > and the wake call, too.
> > 
> > Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> > some well deserved relief :-)
> 
> If you want to, sure. I would add them to both raise functions.

That function (run_local_timers()) was in past also called from other
places like the APIC IRQ but all this is gone now. The reason why I
added the wake and the local_irq_save() is because it uses
raise_softirq() instead raise_softirq_irqoff(). And raise_softirq() was
used since it was separated away from tasklets.

Now, raise_timer_softirq() is a function within softirq.c because it
needs to access task_struct timersd which was only accessible there. It
has been made public later due to the rcutorture bits so it could be
very much be made inline and reduced to just raise_ktimers_thread().
I tend to make TIMER_SOFTIRQ use also raise_softirq_irqoff() to make it
look the same. That lockdep_assert_in_irq() is probably cheap but it
might look odd why RT needs or just TIMER and not HRTIMER.

> > Thanks.

Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
  2024-10-23 10:52           ` Sebastian Andrzej Siewior
@ 2024-10-24 14:02             ` Frederic Weisbecker
  0 siblings, 0 replies; 10+ messages in thread
From: Frederic Weisbecker @ 2024-10-24 14:02 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, rcu, Paul E. McKenney, Anna-Maria Behnsen,
	Davidlohr Bueso, Ingo Molnar, Josh Triplett, Thomas Gleixner

Le Wed, Oct 23, 2024 at 12:52:57PM +0200, Sebastian Andrzej Siewior a écrit :
> On 2024-10-23 08:30:18 [+0200], To Frederic Weisbecker wrote:
> > > > > > +void raise_timer_softirq(void)
> > > > > > +{
> > > > > > +	unsigned long flags;
> > > > > > +
> > > > > > +	local_irq_save(flags);
> > > > > > +	raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > > > +	wake_timersd();
> > > > > 
> > > > > This is supposed to be called from hardirq only, right?
> > > > > Can't irq_exit_rcu() take care of it? Why is it different
> > > > > from HRTIMER_SOFTIRQ ?
> > > > 
> > > > Good question. This shouldn't be any different compared to the hrtimer
> > > > case. This is only raised in hardirq, so yes, the irq_save can go away
> > > > and the wake call, too.
> > > 
> > > Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> > > some well deserved relief :-)
> > 
> > If you want to, sure. I would add them to both raise functions.
> 
> That function (run_local_timers()) was in past also called from other
> places like the APIC IRQ but all this is gone now. The reason why I
> added the wake and the local_irq_save() is because it uses
> raise_softirq() instead raise_softirq_irqoff(). And raise_softirq() was
> used since it was separated away from tasklets.
> 
> Now, raise_timer_softirq() is a function within softirq.c because it
> needs to access task_struct timersd which was only accessible there. It
> has been made public later due to the rcutorture bits so it could be
> very much be made inline and reduced to just raise_ktimers_thread().
> I tend to make TIMER_SOFTIRQ use also raise_softirq_irqoff() to make it
> look the same.

Sounds good!

> That lockdep_assert_in_irq() is probably cheap but it
> might look odd why RT needs or just TIMER and not HRTIMER.

I guess adding the same test on inline !RT functions in bottom_half.h
will be challening... Perhaps forget about that idea...

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-10-24 14:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-04 10:17 [PATCH 0/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT Sebastian Andrzej Siewior
2024-10-04 10:17 ` [PATCH 1/1] " Sebastian Andrzej Siewior
2024-10-21 23:17   ` Paul E. McKenney
2024-10-22 13:28   ` Frederic Weisbecker
2024-10-22 15:34     ` Sebastian Andrzej Siewior
2024-10-22 22:27       ` Frederic Weisbecker
2024-10-23  6:30         ` Sebastian Andrzej Siewior
2024-10-23 10:10           ` Frederic Weisbecker
2024-10-23 10:52           ` Sebastian Andrzej Siewior
2024-10-24 14:02             ` Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox