Re: [RFC PATCH 2/2] softirq: Per vector thread deferment


From: Paolo Abeni <pabeni@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Cc: Levin Alexander <alexander.levin@verizon.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
	Wanpeng Li <wanpeng.li@hotmail.com>,
	Dmitry Safonov <dima@arista.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Eric Dumazet <edumazet@google.com>,
	Radu Rendec <rrendec@arista.com>, Ingo Molnar <mingo@kernel.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC PATCH 2/2] softirq: Per vector thread deferment
Date: Fri, 12 Jan 2018 10:07:25 +0100
Message-ID: <1515748045.2648.10.camel@redhat.com>
In-Reply-To: <1515735354-19279-3-git-send-email-frederic@kernel.org>

On Fri, 2018-01-12 at 06:35 +0100, Frederic Weisbecker wrote:
> Some softirq vectors can be more CPU hungry than others. Especially
> networking may sometimes deal with packet storm and need more CPU than
> IRQ tail can offer without inducing scheduler latencies. In this case
> the current code defers to ksoftirqd that behaves nicer. Now this nice
> behaviour can be bad for other IRQ vectors that usually need quick
> processing.
> 
> To solve this we only defer to threading the vectors that outreached the
> time limit on IRQ tail processing and leave the others inline on real
> Soft-IRQs service. This is achieved using workqueues with
> per-CPU/per-vector worklets.
> 
> Note ksoftirqd is not removed as it is still needed for threaded IRQs
> mode.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Dmitry Safonov <dima@arista.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Levin Alexander <alexander.levin@verizon.com>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Radu Rendec <rrendec@arista.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  kernel/softirq.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index fa267f7..0c817ec6 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -74,6 +74,13 @@ struct softirq_stat {
>  
>  static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
>  
> +struct vector_work {
> +	int vec;
> +	struct work_struct work;
> +};
> +
> +static DEFINE_PER_CPU(struct vector_work[NR_SOFTIRQS], vector_work_cpu);
> +
>  /*
>   * we cannot loop indefinitely here to avoid userspace starvation,
>   * but we also don't want to introduce a worst case 1/HZ latency
> @@ -251,6 +258,70 @@ static inline bool lockdep_softirq_start(void) { return false; }
>  static inline void lockdep_softirq_end(bool in_hardirq) { }
>  #endif
>  
> +static void vector_work_func(struct work_struct *work)
> +{
> +	struct vector_work *vector_work;
> +	u32 pending;
> +	int vec;
> +
> +	vector_work = container_of(work, struct vector_work, work);
> +	vec = vector_work->vec;
> +
> +	local_irq_disable();
> +	pending = local_softirq_pending();
> +	account_irq_enter_time(current);
> +	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> +	lockdep_softirq_enter();
> +	set_softirq_pending(pending & ~(1 << vec));
> +	local_irq_enable();
> +
> +	if (pending & (1 << vec)) {
> +		struct softirq_action *sa = &softirq_vec[vec];
> +
> +		kstat_incr_softirqs_this_cpu(vec);
> +		trace_softirq_entry(vec);
> +		sa->action(sa);
> +		trace_softirq_exit(vec);
> +	}
> +
> +	local_irq_disable();
> +
> +	pending = local_softirq_pending();
> +	if (pending & (1 << vec))
> +		schedule_work_on(smp_processor_id(), work);

If we check for the overrun condition here, as done in the
__do_softirq() main loop, we could avoid ksoftirqd completely and
probably have less code duplication.
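
To make the idea concrete, here is a minimal userspace sketch of one
possible reading of it: the work function keeps servicing its vector
while it stays pending, and only requeues itself once a budget is
exceeded. Everything in it is hypothetical (VEC_BUDGET,
vector_work_model(), handler(), the global counters), and the budget is
counted in handler runs rather than in time, since the accounting from
patch 1/2 is not shown here. It models the control flow only; it is not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SOFTIRQS 10	/* same value as the kernel enum, hardcoded for the model */
#define VEC_BUDGET  3	/* hypothetical limit: handler runs per work invocation */

static uint32_t pending_mask;	/* stands in for local_softirq_pending() */
static int requeue_count;	/* counts would-be schedule_work_on() calls */
static int handler_runs;	/* counts would-be sa->action(sa) calls */
static int raises_left;		/* how many more times the handler re-raises itself */

/* Model of a softirq handler that re-raises its own vector under load. */
static void handler(int vec)
{
	handler_runs++;
	if (raises_left > 0) {
		raises_left--;
		pending_mask |= 1u << vec;
	}
}

/*
 * Service one vector until it is no longer pending or the budget is
 * used up. Returns true when the overrun condition hit and the work
 * was requeued with the vector still pending.
 */
static bool vector_work_model(int vec)
{
	int runs = 0;

	assert(vec >= 0 && vec < NR_SOFTIRQS);

	while (pending_mask & (1u << vec)) {
		if (runs >= VEC_BUDGET) {
			requeue_count++;	/* overrun: defer the rest */
			return true;
		}
		pending_mask &= ~(1u << vec);
		handler(vec);
		runs++;
	}
	return false;
}
```

With the vector pending and the handler re-raising itself five times,
the first call stops at the budget and requeues; a second call drains
the remainder without requeueing.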

> +
> +	lockdep_softirq_exit();
> +	account_irq_exit_time(current);
> +	__local_bh_enable(SOFTIRQ_OFFSET);
> +	local_irq_enable();
> +}
> +
> +static int do_softirq_overrun(u32 overrun, u32 pending)
> +{
> +	struct softirq_action *h = softirq_vec;
> +	int softirq_bit;
> +
> +	if (!overrun)
> +		return pending;
> +
> +	overrun &= pending;
> +	pending &= ~overrun;
> +
> +	while ((softirq_bit = ffs(overrun))) {
> +		struct vector_work *work;
> +		unsigned int vec_nr;
> +
> +		h += softirq_bit - 1;
> +		vec_nr = h - softirq_vec;
> +		work = this_cpu_ptr(&vector_work_cpu[vec_nr]);
> +		schedule_work_on(smp_processor_id(), &work->work);
> +		h++;
> +		overrun >>= softirq_bit;
> +	}
> +
> +	return pending;
> +}
> +
>  asmlinkage __visible void __softirq_entry __do_softirq(void)
>  {
>  	struct softirq_stat *sstat = this_cpu_ptr(&softirq_stat_cpu);
> @@ -321,10 +392,13 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>  
>  	pending = local_softirq_pending();
>  	if (pending) {
> -		if (overrun || need_resched())
> +		if (need_resched()) {
>  			wakeup_softirqd();
> -		else
> -			goto restart;
> +		} else {
> +			pending = do_softirq_overrun(overrun, pending);
> +			if (pending)
> +				goto restart;
> +		}
>  	}
>  
>  	lockdep_softirq_end(in_hardirq);

This way the 'overrun' branch is not taken when we (also) need to
resched. Should we test for overrun first?
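
The difference between the two orderings can be shown with a small
userspace model of the tail of __do_softirq(). This is only a sketch
under assumptions: struct tail_decision, defer_overrun(),
tail_as_posted() and tail_overrun_first() are hypothetical names, the
"overrun first" variant is just one way the reordering could look, and
the actual scheduling calls are replaced by bitmasks and flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

#define NR_SOFTIRQS 10	/* same value as the kernel enum, hardcoded for the model */

struct tail_decision {
	uint32_t deferred;	/* vectors handed to per-vector work */
	uint32_t remaining;	/* vectors left for inline or ksoftirqd service */
	bool wake_ksoftirqd;
	bool restart;
};

/*
 * Same mask split and ffs() walk as do_softirq_overrun() in the patch,
 * but recording the deferred vectors instead of scheduling work.
 */
static uint32_t defer_overrun(uint32_t overrun, uint32_t pending,
			      uint32_t *deferred)
{
	int bit, base = 0;

	assert(overrun < (1u << NR_SOFTIRQS));

	overrun &= pending;
	pending &= ~overrun;
	*deferred = 0;

	while ((bit = ffs((int)overrun))) {
		base += bit;			/* 1-based position in the original mask */
		*deferred |= 1u << (base - 1);
		overrun >>= bit;
	}
	return pending;
}

/* Ordering as posted: need_resched() wins, overrun is only checked otherwise. */
static struct tail_decision tail_as_posted(uint32_t overrun, uint32_t pending,
					   bool resched)
{
	struct tail_decision d = { 0, pending, false, false };

	if (!pending)
		return d;
	if (resched) {
		d.wake_ksoftirqd = true;
	} else {
		d.remaining = defer_overrun(overrun, pending, &d.deferred);
		d.restart = d.remaining != 0;
	}
	return d;
}

/* Hypothetical alternative: defer overrun vectors first, then look at resched. */
static struct tail_decision tail_overrun_first(uint32_t overrun,
					       uint32_t pending, bool resched)
{
	struct tail_decision d = { 0, pending, false, false };

	if (!pending)
		return d;
	d.remaining = defer_overrun(overrun, pending, &d.deferred);
	if (d.remaining) {
		if (resched)
			d.wake_ksoftirqd = true;
		else
			d.restart = true;
	}
	return d;
}
```

With vectors 1 and 3 pending, vector 3 in overrun and a reschedule
needed, the posted ordering leaves both vectors to ksoftirqd, while the
alternative defers vector 3 to its work item and leaves only vector 1
to ksoftirqd. Without a pending reschedule the two orderings agree.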

Cheers,

Paolo


Thread overview: 10+ messages
2018-01-12  5:35 [RFC PATCH 0/2] softirq: Per vector threading Frederic Weisbecker
2018-01-12  5:35 ` [RFC PATCH 1/2] softirq: Account time and iteration stats per vector Frederic Weisbecker
2018-01-12  6:22   ` Eric Dumazet
2018-01-12 14:34     ` Frederic Weisbecker
2018-01-12 18:12       ` Linus Torvalds
2018-01-12 18:54         ` Frederic Weisbecker
2018-01-12  5:35 ` [RFC PATCH 2/2] softirq: Per vector thread deferment Frederic Weisbecker
2018-01-12  6:27   ` Frederic Weisbecker
2018-01-12  9:07   ` Paolo Abeni [this message]
2018-01-12 14:56     ` Frederic Weisbecker