From: Schspa Shi <schspa@gmail.com>
To: bigeasy@linutronix.de, rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
Schspa Shi <schspa@gmail.com>
Subject: [RFC PATCH] irq_work: wakeup irq_workd when queued first rt_lazy work
Date: Tue, 12 Jul 2022 01:23:15 +0800 [thread overview]
Message-ID: <20220711172314.603717-1-schspa@gmail.com> (raw)
Commit b4c6f86ec2f64
("irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT")
moves all irq_work items without the IRQ_WORK_HARD_IRQ flag to the
lazy_list on PREEMPT_RT. But this kind of irq_work still differs from
work with IRQ_WORK_LAZY set. The differences are as follows:
- With IRQ_WORK_LAZY: (type1)
  This kind of work is executed after the next timer tick by waking up
  irq_workd, so it can see additional scheduling delay.
- Without IRQ_WORK_LAZY and without IRQ_WORK_HARD_IRQ: (type2)
  This kind of irq_work gets a faster response because irq_workd is
  woken up from the IPI interrupt.
I want to know if this difference is by design.
If it is by design, we have a problem: a type2 irq_work will not
execute as quickly as expected, because it may be delayed behind a
type1 irq_work.
Please consider the following scenario:
The CPU queues a type1 irq_work A, and then a type2 irq_work B.
B will not be executed quickly, because no IPI is issued to wake up
irq_workd (the llist_add() call returns false since A is already on
the list).
This patch issues IPI_IRQ_WORK to make B execute quickly.
One thing that needs to be optimized: lazy_list.node.llist and
lazy_work_raised must now be updated atomically with respect to each
other, and this patch disables local CPU IRQs to achieve that. There
should be a better way to keep these two variables consistent, and I
can dig deeper if this behavior is not by design and needs to be
fixed.
If these two types of irq_work should have the same priority, maybe we
should change:
if (!lazy_work || tick_nohz_tick_stopped()) {
arch_irq_work_raise();
}
to
if (!(lazy_work || rt_lazy_work) || tick_nohz_tick_stopped()) {
arch_irq_work_raise();
}
I'm posting this patch just to explain the problem; please don't pay
too much attention to the ugly changes below.
Signed-off-by: Schspa Shi <schspa@gmail.com>
---
kernel/irq_work.c | 40 +++++++++++++++++++++++++++++++++++++---
1 file changed, 37 insertions(+), 3 deletions(-)
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 7afa40fe5cc43..d5d0b720fac15 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -25,6 +25,7 @@
static DEFINE_PER_CPU(struct llist_head, raised_list);
static DEFINE_PER_CPU(struct llist_head, lazy_list);
static DEFINE_PER_CPU(struct task_struct *, irq_workd);
+static DEFINE_PER_CPU(bool, lazy_work_raised);
static void wake_irq_workd(void)
{
@@ -81,6 +82,7 @@ static void __irq_work_queue_local(struct irq_work *work)
bool rt_lazy_work = false;
bool lazy_work = false;
int work_flags;
+ unsigned long flags;
work_flags = atomic_read(&work->node.a_flags);
if (work_flags & IRQ_WORK_LAZY)
@@ -94,12 +96,35 @@ static void __irq_work_queue_local(struct irq_work *work)
else
list = this_cpu_ptr(&raised_list);
- if (!llist_add(&work->node.llist, list))
+ local_irq_save(flags);
+ if (!llist_add(&work->node.llist, list)) {
+ bool irq_raised;
+ /*
+ * On PREEMPT_RT, if a lazy work was already added to the
+ * list, the IPI may not have been raised. An extra check
+ * is needed here.
+ */
+ irq_raised = !xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
+ if (unlikely(!irq_raised))
+ arch_irq_work_raise();
+
return;
+ }
/* If the work is "lazy", handle it from next tick if any */
- if (!lazy_work || tick_nohz_tick_stopped())
+ if (!lazy_work || tick_nohz_tick_stopped()) {
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
arch_irq_work_raise();
+ } else if (lazy_work || rt_lazy_work) {
+ /*
+ * The first queued irq_work did not raise an IPI; we need
+ * to raise one for the next queued irq_work.
+ */
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), false);
+ local_irq_restore(flags);
+ }
}
/* Enqueue the irq work @work on the current CPU */
@@ -151,9 +176,18 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
*/
if (IS_ENABLED(CONFIG_PREEMPT_RT) &&
!(atomic_read(&work->node.a_flags) & IRQ_WORK_HARD_IRQ)) {
+ unsigned long flags;
+
+ local_irq_save(flags);
+ if (!llist_add(&work->node.llist, &per_cpu(lazy_list, cpu))) {
+ if (!xchg(this_cpu_ptr(&lazy_work_raised), true))
+ arch_irq_work_raise();
- if (!llist_add(&work->node.llist, &per_cpu(lazy_list, cpu)))
+ local_irq_restore(flags);
goto out;
+ }
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
work = &per_cpu(irq_work_wakeup, cpu);
if (!irq_work_claim(work))
--
2.37.0
Thread overview: 4+ messages
2022-07-11 17:23 Schspa Shi [this message]
2022-08-18 16:28 ` [RFC PATCH] irq_work: wakeup irq_workd when queued first rt_lazy work Sebastian Andrzej Siewior
2022-08-18 19:56 ` Schspa Shi
2022-08-18 20:42 ` Schspa Shi