From: Schspa Shi <schspa@gmail.com>
To: bigeasy@linutronix.de, rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
Schspa Shi <schspa@gmail.com>
Subject: [RFC PATCH] irq_work: wakeup irq_workd when queued first rt_lazy work
Date: Tue, 12 Jul 2022 01:23:15 +0800 [thread overview]
Message-ID: <20220711172314.603717-1-schspa@gmail.com> (raw)
Commit b4c6f86ec2f64
("irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT")
treat all irq_work without IRQ_WORK_HARD_IRQ flags to the lazy_list
in PREEMPT_RT. But this kind of irq_work still has some difference
with IRQ work set IRQ_WORK_LAZY. The difference as fellowing:
- With IRQ_WORK_LAZY: (type1)
This kind of work will be executed after the next time tick by wakeup
irq_workd, there will be more scheduling delays.
Let's mark this as type1
- Without IRQ_WORK_LAZY and IRQ_WORK_HARD_IRQ: (type2)
This kind of irq_work will have a faster response speed by wakeup
irq_workd from IPI interrupt.
Let's mark it as type2
I want to know if this difference is by design.
If this is by design, we have a problem that the irq_work of type2
will not execute as quickly as expected, it may be delayed by the
irq_work of type1.
Please consider the following scenarios:
If the CPU queued a type1 irq_work A, and then a type2 irq_work B.
But we won't make B executed quickly, because we won't issue the IPI
interrupt to wakeup irq_workd (the llist_add call will return false).
This PATCH will issue the IPI_IRQ_WORK to make B execute quickly.
One thing that needs to be optimized is that we now have
lazy_list.node.llist and lazy_work_raised which need to be granted
to be atomicity, disabled the local CPU IRQ to make this atomic.
There should be a better way to make these two variants to be atomically
and I can go in deep if this little problem is not by design, and need
to be fixed.
If these two types of irq_work should be the same with the priority.
maybe we should change.
if (!lazy_work || tick_nohz_tick_stopped()) {
arch_irq_work_raise();
}
to
if (!(lazy_work || rt_lazy_work) || tick_nohz_tick_stopped()) {
arch_irq_work_raise();
}
I'm uploading this patch just to explain the problem, hopefully
don't pay too much attention to the ugly changes below.
Signed-off-by: Schspa Shi <schspa@gmail.com>
---
kernel/irq_work.c | 40 +++++++++++++++++++++++++++++++++++++---
1 file changed, 37 insertions(+), 3 deletions(-)
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 7afa40fe5cc43..d5d0b720fac15 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -25,6 +25,7 @@
static DEFINE_PER_CPU(struct llist_head, raised_list);
static DEFINE_PER_CPU(struct llist_head, lazy_list);
static DEFINE_PER_CPU(struct task_struct *, irq_workd);
+static DEFINE_PER_CPU(bool, lazy_work_raised);
static void wake_irq_workd(void)
{
@@ -81,6 +82,7 @@ static void __irq_work_queue_local(struct irq_work *work)
bool rt_lazy_work = false;
bool lazy_work = false;
int work_flags;
+ unsigned long flags;
work_flags = atomic_read(&work->node.a_flags);
if (work_flags & IRQ_WORK_LAZY)
@@ -94,12 +96,35 @@ static void __irq_work_queue_local(struct irq_work *work)
else
list = this_cpu_ptr(&raised_list);
- if (!llist_add(&work->node.llist, list))
+ local_irq_save(flags);
+ if (!llist_add(&work->node.llist, list)) {
+ bool irq_raised;
+ /*
+ * In PREEMPT_RT, if we add a lazy work added to the list
+ * before, the work maybe not raised. We need a extra check
+ * for PREEMPT_RT.
+ */
+ irq_raised = !xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
+ if (unlikely(!irq_raised))
+ arch_irq_work_raise();
+
return;
+ }
/* If the work is "lazy", handle it from next tick if any */
- if (!lazy_work || tick_nohz_tick_stopped())
+ if (!lazy_work || tick_nohz_tick_stopped()) {
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
arch_irq_work_raise();
+ } else if (lazy_work || rt_lazy_work) {
+ /*
+ * The first added irq work not raise a irq work, we need to
+ * raise one for the next added irq work.
+ */
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), false);
+ local_irq_restore(flags);
+ }
}
/* Enqueue the irq work @work on the current CPU */
@@ -151,9 +176,18 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
*/
if (IS_ENABLED(CONFIG_PREEMPT_RT) &&
!(atomic_read(&work->node.a_flags) & IRQ_WORK_HARD_IRQ)) {
+ unsigned long flags;
+
+ local_irq_save(flags);
+ if (!llist_add(&work->node.llist, &per_cpu(lazy_list, cpu))) {
+ if (!xchg(this_cpu_ptr(&lazy_work_raised), true))
+ arch_irq_work_raise();
- if (!llist_add(&work->node.llist, &per_cpu(lazy_list, cpu)))
+ local_irq_restore(flags);
goto out;
+ }
+ (void) xchg(this_cpu_ptr(&lazy_work_raised), true);
+ local_irq_restore(flags);
work = &per_cpu(irq_work_wakeup, cpu);
if (!irq_work_claim(work))
--
2.37.0
next reply other threads:[~2022-07-11 17:24 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-11 17:23 Schspa Shi [this message]
2022-08-18 16:28 ` [RFC PATCH] irq_work: wakeup irq_workd when queued first rt_lazy work Sebastian Andrzej Siewior
2022-08-18 19:56 ` Schspa Shi
2022-08-18 20:42 ` Schspa Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220711172314.603717-1-schspa@gmail.com \
--to=schspa@gmail.com \
--cc=bigeasy@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.