* [PATCH] softirq: punt to ksoftirqd if __do_softirq recently looped
From: Rik van Riel @ 2014-04-10 15:57 UTC
  To: linux-kernel
  Cc: netdev, Jiri Benc, Peter Zijlstra, Thomas Gleixner, David Miller,
	Frederic Weisbecker

Jiri noticed that netperf throughput had gotten worse in recent years,
for smaller message sizes. In the past, ksoftirqd would take around 80%
of a CPU, and netserver would take around 100% of another CPU.

On current kernels, sometimes all the softirq processing is done in the
context of the netperf process, which can result in as much as a 50%
performance drop, due to netserver spending all its CPU time "delivering"
packets to a socket it rarely empties, and dropping the packets on the
floor as a result.

This seems silly in an age where even cell phones are multi-core, and
we could simply let the ksoftirqd thread handle the softirq load, so
the scheduler can migrate the userspace task to another CPU.

This patch accomplishes that in a very simplistic way. The code
remembers when __do_softirq last looped, and will punt softirq
handling to ksoftirqd if another softirq happens in the same jiffy.

Netperf results (message size in bytes; each cell is send / receive
throughput):

			without patch		with patch
UDP_STREAM      1472    957.17 / 954.18         957.15 / 951.73
UDP_STREAM      978     936.85 / 930.06         936.84 / 927.63
UDP_STREAM      466     875.98 / 865.62         875.98 / 868.65
UDP_STREAM      210     760.88 / 748.70         760.88 / 748.61
UDP_STREAM      82      554.06 / 329.96         554.06 / 505.95
                        unstable ^^^^^^
UDP_STREAM      18      158.99 / 108.95         160.73 / 112.68

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: Jiri Benc <jbenc@redhat.com>
Reported-by: Jiri Benc <jbenc@redhat.com>
---
 kernel/softirq.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 787b3a0..020be2f 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -56,6 +56,7 @@ EXPORT_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(unsigned long, softirq_looped);
 
 char *softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
@@ -271,6 +272,9 @@ asmlinkage void __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
+		/* Still busy? Remember this for invoke_softirq() below... */
+		this_cpu_write(softirq_looped, jiffies);
+
 		if (time_before(jiffies, end) && !need_resched() &&
 		    --max_restart)
 			goto restart;
@@ -330,7 +334,11 @@ void irq_enter(void)
 
 static inline void invoke_softirq(void)
 {
-	if (!force_irqthreads) {
+	/*
+	 * If force_irqthreads is set, or if we looped in __do_softirq this
+	 * jiffy, punt to ksoftirqd to prevent userland starvation.
+	 */
+	if (!force_irqthreads && this_cpu_read(softirq_looped) != jiffies) {
 		/*
 		 * We can safely execute softirq on the current stack if
 		 * it is the irq stack, because it should be near empty

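(Context, not part of the diff: in this kernel's invoke_softirq(), when the
rewritten condition is false the code falls through to the existing else
branch, which wakes the per-CPU ksoftirqd thread; "punting" simply means
taking this path:)

	} else {
		/* Defer pending softirq work to the per-cpu ksoftirqd thread. */
		wakeup_softirqd();
	}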

* Re: [PATCH] softirq: punt to ksoftirqd if __do_softirq recently looped
From: Eric Dumazet @ 2014-04-10 18:15 UTC
  To: Rik van Riel
  Cc: linux-kernel, netdev, Jiri Benc, Peter Zijlstra, Thomas Gleixner,
	David Miller, Frederic Weisbecker

On Thu, 2014-04-10 at 11:57 -0400, Rik van Riel wrote:
> Jiri noticed that netperf throughput had gotten worse in recent years,
> for smaller message sizes. In the past, ksoftirqd would take around 80%
> of a CPU, and netserver would take around 100% of another CPU.
> 
> On current kernels, sometimes all the softirq processing is done in the
> context of the netperf process, which can result in as much as a 50%
> performance drop, due to netserver spending all its CPU time "delivering"
> packets to a socket it rarely empties, and dropping the packets on the
> floor as a result.
> 
> This seems silly in an age where even cell phones are multi-core, and
> we could simply let the ksoftirqd thread handle the softirq load, so
> the scheduler can migrate the userspace task to another CPU.
> 
> This patch accomplishes that in a very simplistic way. The code
> remembers when __do_softirq last looped, and will punt softirq
> handling to ksoftirqd if another softirq happens in the same jiffy.
> 
> Netperf results (message size in bytes; each cell is send / receive
> throughput):
> 
> 			without patch		with patch
> UDP_STREAM      1472    957.17 / 954.18         957.15 / 951.73
> UDP_STREAM      978     936.85 / 930.06         936.84 / 927.63
> UDP_STREAM      466     875.98 / 865.62         875.98 / 868.65
> UDP_STREAM      210     760.88 / 748.70         760.88 / 748.61
> UDP_STREAM      82      554.06 / 329.96         554.06 / 505.95
>                         unstable ^^^^^^
> UDP_STREAM      18      158.99 / 108.95         160.73 / 112.68
> 
> [...]

I guess this is the tradeoff between latency and throughput.

Have you tried some TCP_RR / UDP_RR tests with one or multiple
instances? And have you tried drivers that use deferred skb freeing
(hard irq calling the TX completion handler, then dev_kfree_skb_any()
scheduling the TX softirq), which increases the chance of a non-zero
local_softirq_pending()?
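
(Illustration, not from the thread: a minimal sketch of that deferred-free
pattern. The foo_* names are hypothetical placeholders; dev_kfree_skb_any(),
the headers and the irq handler types are real kernel API.)

#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct foo_priv;					/* hypothetical driver state */
struct sk_buff *foo_reap_tx(struct foo_priv *priv);	/* hypothetical: next completed TX skb */

/* Hypothetical TX completion handler, runs in hard-irq context. */
static irqreturn_t foo_tx_irq(int irq, void *dev_id)
{
	struct foo_priv *priv = dev_id;
	struct sk_buff *skb;

	while ((skb = foo_reap_tx(priv)) != NULL) {
		/*
		 * In hard-irq context this cannot free the skb directly:
		 * it queues the skb and raises the NET_TX softirq, so
		 * local_softirq_pending() is non-zero on irq exit.
		 */
		dev_kfree_skb_any(skb);
	}

	return IRQ_HANDLED;
}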

Calling the skb destructor on a different CPU can have a huge
false-sharing effect. A TCP socket is really big.

Your UDP_STREAM test only stresses the RX part, and UDP sockets don't
use the complex callbacks TCP sockets use.


* Re: [PATCH] softirq: punt to ksoftirqd if __do_softirq recently looped
From: David Miller @ 2014-04-11 20:33 UTC
  To: riel; +Cc: linux-kernel, netdev, jbenc, peterz, tglx, fweisbec

From: Rik van Riel <riel@redhat.com>
Date: Thu, 10 Apr 2014 11:57:06 -0400

> @@ -330,7 +334,11 @@ void irq_enter(void)
>  
>  static inline void invoke_softirq(void)
>  {
> -	if (!force_irqthreads) {
> +	/*
> +	 * If force_irqthreads is set, or if we looped in __do_softirq this
> +	 * jiffy, punt to ksoftirqd to prevent userland starvation.
> +	 */
> +	if (!force_irqthreads && this_cpu_read(softirq_looped) != jiffies) {

If we do this, which I'm not convinced of yet, I think we should use two
jiffies as the cutoff.

Because if we are at the tail end of one jiffy, we'll prematurely go to
ksoftirqd when we really have no reason to do so.
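
(Sketch, not posted in the thread: with the patch's softirq_looped counter,
David's two-jiffies cutoff could look roughly like this; time_after() is the
standard <linux/jiffies.h> helper.)

	/*
	 * Run softirq inline only if __do_softirq has not looped within
	 * the last two jiffies; otherwise punt to ksoftirqd. A loop at
	 * the tail end of one jiffy then no longer forces a punt on the
	 * first tick of the next.
	 */
	if (!force_irqthreads &&
	    time_after(jiffies, this_cpu_read(softirq_looped) + 1)) {
		/* handle the softirq on the current stack, as before */
	}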


* Re: [PATCH] softirq: punt to ksoftirqd if __do_softirq recently looped
From: Rik van Riel @ 2014-04-12  0:02 UTC
  To: David Miller; +Cc: linux-kernel, netdev, jbenc, peterz, tglx, fweisbec

On 04/11/2014 04:33 PM, David Miller wrote:
> From: Rik van Riel <riel@redhat.com>
> Date: Thu, 10 Apr 2014 11:57:06 -0400
> 
>> @@ -330,7 +334,11 @@ void irq_enter(void)
>>  
>>  static inline void invoke_softirq(void)
>>  {
>> -	if (!force_irqthreads) {
>> +	/*
>> +	 * If force_irqthreads is set, or if we looped in __do_softirq this
>> +	 * jiffy, punt to ksoftirqd to prevent userland starvation.
>> +	 */
>> +	if (!force_irqthreads && this_cpu_read(softirq_looped) != jiffies) {
> 
> If we do this, which I'm not convinced of yet, I think we should use two
> jiffies as the cutoff.

I am not fully convinced, either.  This patch mostly just illustrates
the issue, and gives Jiri something that solves his immediate problem.

It is quite likely that there is a better way to solve the problem:
1) softirq handling starves out userspace processing,
2) which could be addressed by moving the userspace process elsewhere, or
3) by shifting softirq processing to ksoftirqd.

A working patch seems to be one of the better ways to start a
discussion, though.

If anybody has a nicer idea for how to solve the problem, I'd even be
willing to implement it and give Jiri another patch to test :)

-- 
All rights reversed
