From: Jason Xing <kerneljasonxing@gmail.com>
To: paulmck@kernel.org, peterz@infradead.org, tglx@linutronix.de,
bigeasy@linutronix.de, frederic@kernel.org
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
kerneljasonxing@gmail.com, Jason Xing <kernelxing@tencent.com>
Subject: [PATCH] softirq: let the userside tune the SOFTIRQ_NOW_MASK with sysctl
Date: Mon, 10 Apr 2023 10:30:41 +0800
Message-ID: <20230410023041.49857-1-kerneljasonxing@gmail.com>
From: Jason Xing <kernelxing@tencent.com>
Currently two softirqs are exempt from being deferred to ksoftirqd when
softirqs are invoked: HI_SOFTIRQ and TASKLET_SOFTIRQ. The exemption was
introduced in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
synchronous"), which says that not masking them causes excessive
latencies in some cases.

That commit also mentioned that the timer softirq may deserve the same
treatment:

"We should probably also consider the timer softirqs to be synchronous
and not be delayed to ksoftirqd."

The same reasoning applies here. In production workloads, we found that
some latency-sensitive applications complain about high latency in the
networking tx/rx path, because some packets are delayed in the ksoftirqd
kthread, which can sit blocked in the runqueue for a while (say, 10-70
ms), especially in a guest OS. Marking the tx/rx softirqs synchronous,
for instance NET_RX_SOFTIRQ, solves this issue.

We measured how often the rx path latency exceeded 50ms in a real
workload:

without masking: over 100 times hitting the limit per hour
with masking: less than 10 times for a whole day

The default configuration cannot satisfy everyone's requirements. With
this patch applied, the softirq mask is exported to userspace, so
different cases can be served by tuning it with sysctl.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
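A userspace sketch of the mask semantics described above may help when choosing a value. It is not kernel code: the enum repeats the softirq order from include/linux/interrupt.h, while DEFAULT_SOFTIRQ_MASK, mask_add() and defer_to_ksoftirqd() are hypothetical names introduced only for illustration. It models the intended behaviour, namely that a pending softirq whose bit is set in the mask is handled immediately instead of being left to ksoftirqd.

```c
#include <stdbool.h>

/* Softirq indices, in the order used by include/linux/interrupt.h. */
enum {
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	IRQ_POLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ,
	RCU_SOFTIRQ,
	NR_SOFTIRQS
};

/* Default from the patch: only HI and TASKLET are synchronous. */
#define DEFAULT_SOFTIRQ_MASK \
	((1u << HI_SOFTIRQ) | (1u << TASKLET_SOFTIRQ))

/* Hypothetical helper: mark one more softirq as synchronous. */
static unsigned int mask_add(unsigned int mask, unsigned int nr)
{
	return mask | (1u << nr);
}

/*
 * Hypothetical userspace model of the intended decision: returns true
 * when the pending work should be left to ksoftirqd, false when it
 * should be processed right away.
 */
static bool defer_to_ksoftirqd(unsigned long pending, unsigned int mask,
			       bool ksoftirqd_is_running)
{
	if (pending & mask)
		return false;	/* a synchronous softirq is pending */
	return ksoftirqd_is_running;
}
```

Under these assumptions the default mask is 65 (0x41), and also marking NET_RX_SOFTIRQ synchronous gives 73 (0x49).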
kernel/softirq.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..aa6e52ca2c55 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -65,6 +65,8 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
"TASKLET", "SCHED", "HRTIMER", "RCU"
};
+unsigned int sysctl_softirq_mask = 1 << HI_SOFTIRQ | 1 << TASKLET_SOFTIRQ;
+
/*
* we cannot loop indefinitely here to avoid userspace starvation,
* but we also don't want to introduce a worst case 1/HZ latency
@@ -80,17 +82,23 @@ static void wakeup_softirqd(void)
wake_up_process(tsk);
}
+static bool softirq_now_mask(unsigned long pending)
+{
+ if (pending & sysctl_softirq_mask)
+ return true;
+ return false;
+}
+
/*
* If ksoftirqd is scheduled, we do not want to process pending softirqs
* right now. Let ksoftirqd handle this at its own rate, to get fairness,
* unless we're doing some of the synchronous softirqs.
*/
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
static bool ksoftirqd_running(unsigned long pending)
{
struct task_struct *tsk = __this_cpu_read(ksoftirqd);
- if (pending & SOFTIRQ_NOW_MASK)
+ if (softirq_now_mask(pending))
return false;
return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
}
@@ -903,6 +911,22 @@ void tasklet_unlock_wait(struct tasklet_struct *t)
EXPORT_SYMBOL_GPL(tasklet_unlock_wait);
#endif
+static struct ctl_table softirq_sysctls[] = {
+ {
+ .procname = "softirq_mask",
+ .data = &sysctl_softirq_mask,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
+ {}
+};
+
+static void __init softirq_mask_sysctl_init(void)
+{
+ register_sysctl_init("kernel", softirq_sysctls);
+}
+
void __init softirq_init(void)
{
int cpu;
@@ -916,6 +940,7 @@ void __init softirq_init(void)
open_softirq(TASKLET_SOFTIRQ, tasklet_action);
open_softirq(HI_SOFTIRQ, tasklet_hi_action);
+ softirq_mask_sysctl_init();
}
static int ksoftirqd_should_run(unsigned int cpu)
--
2.37.3
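For reference, tuning the knob from userspace could look roughly like the sketch below. The path is only inferred from register_sysctl_init("kernel", ...) and .procname = "softirq_mask" in the patch, so it exists only on a kernel carrying this change. write_softirq_mask() and read_softirq_mask() are hypothetical helpers, not an existing API, and the value is written in decimal on the assumption that the proc_dointvec handler is used.

```c
#include <errno.h>
#include <stdio.h>

/* Assumed path; present only on a kernel with this patch applied. */
#define SOFTIRQ_MASK_SYSCTL "/proc/sys/kernel/softirq_mask"

/* Hypothetical helper: write the mask in decimal. 0 or -errno. */
static int write_softirq_mask(const char *path, unsigned int mask)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%u\n", mask) < 0)
		err = -EIO;
	/* Buffered data reaches the file on close, so check it too. */
	if (fclose(f) != 0 && !err)
		err = -errno;
	return err;
}

/* Hypothetical helper: read the current mask back. 0 or -errno. */
static int read_softirq_mask(const char *path, unsigned int *mask)
{
	FILE *f = fopen(path, "r");
	int err = 0;

	if (!f)
		return -errno;
	if (fscanf(f, "%u", mask) != 1)
		err = -EINVAL;
	fclose(f);
	return err;
}
```

On a patched kernel, calling write_softirq_mask(SOFTIRQ_MASK_SYSCTL, 73) as root would additionally mark NET_RX_SOFTIRQ synchronous; this should be equivalent to `sysctl -w kernel.softirq_mask=73`.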
Thread overview: 3+ messages
2023-04-10 2:30 Jason Xing [this message]
2023-05-09 13:05 ` [PATCH] softirq: let the userside tune the SOFTIRQ_NOW_MASK with sysctl Thomas Gleixner
2023-05-09 13:25 ` Jason Xing