public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Hurley <peter@hurleysoftware.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Xiaotian Feng <dannyfeng@tencent.com>,
	Xiaotian Feng <xtfeng@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	linux-next@vger.kernel.org
Subject: Re: [BUG -next-20121127] kernel BUG at kernel/softirq.c:471!
Date: Wed, 28 Nov 2012 17:12:35 -0500	[thread overview]
Message-ID: <1354140755.3588.27.camel@thor> (raw)
In-Reply-To: <1354125623.2788.23.camel@thor>

[cc'ing linux-next]

On Wed, 2012-11-28 at 13:00 -0500, Peter Hurley wrote:
> Hi all,
> 
> I couldn't find the v2 patch of this on linux-kernel but this commit
> 
>   4660e32 "tasklet: ignore disabled tasklet in tasklet_action()"
> 
> BUGS in -next-20121127.
> 
> -----------[cut here ]----------
> kernel BUG at /home/peter/src/kernels/next/kernel/softirq.c:471!
> invalid opcode: 0000 [#1] PREEMPT SMP
> ....
> 
> The registers/stack dump isn't useful so I didn't include it here.
> 
> I'm still trying to track down the execution sequence that causes this,
> but the high-level trigger is a firewire bus reset.
> 
> Hopefully I'll have more information soon.

>From the original v1 of this patch [where is v2?] ...

On Fri, 2012-11-02 at 10:48 +0800, Xiaotian Feng wrote:
> We met a ksoftirqd 100% issue, the perf top shows kernel is busy
> with tasklet_action(), but no actual action is shown. From dumped
> kernel, there's only one disabled tasklet on the tasklet_vec.
> 
> tasklet_action might be handled after tasklet is disabled, this will
> make disabled tasklet stayed on tasklet_vec. tasklet_action will not
> handle disabled tasklet, but place it on the tail of tasklet_vec,
> still raise softirq for this tasklet. Things will become worse if
> device driver uses tasklet_disable on its device remove/close code.
> The disabled tasklet will stay on the vec, frequently __raise_softirq_off()
> and make ksoftirqd wakeup even if no tasklets need to be handled.
> 
> This patch introduced a new TASKLET_STATE_HI bit to indicate HI_SOFTIRQ,
> in tasklet_action(), simply ignore the disabled tasklet and don't raise
> the softirq nr. In my previous patch, I remove tasklet_hi_enable() since
> it is the same as tasklet_enable(). So only tasklet_enable() needs to be
> modified, if tasklet state is changed from disable to enable, use
> __tasklet_schedule() to put it on the right vec.
> 
> Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  include/linux/interrupt.h |   12 ++++++++++--
>  kernel/softirq.c          |   10 +++++-----
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 5e4e617..7e5bb00 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -521,7 +521,8 @@ struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
>  enum
>  {
>  	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
> -	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
> +	TASKLET_STATE_RUN,	/* Tasklet is running (SMP only) */
> +	TASKLET_STATE_HI	/* Tasklet is HI_SOFTIRQ */
>  };
>  
>  #ifdef CONFIG_SMP
> @@ -593,7 +594,14 @@ static inline void tasklet_disable(struct tasklet_struct *t)
>  static inline void tasklet_enable(struct tasklet_struct *t)
>  {
>  	smp_mb__before_atomic_dec();
> -	atomic_dec(&t->count);
> +	if (atomic_dec_and_test(&t->count)) {
> +		if (!test_bit(TASKLET_STATE_SCHED, &t->state))
> +			return;
> +		if (test_bit(TASKLET_STATE_HI, &t->state))
> +			__tasklet_hi_schedule(t);
> +		else
> 

Since this isn't protected by locks, all of the conditions that __were__
met to arrive here (t->count == 0 && t->state & TASKLET_STATE_SCHED) may
no longer be true now because another cpu may have run tasklet_action(),
so now this tasklet will be scheduled when it should not be.

Plus __tasklet_schedule() can't just be called from any cpu that happens
to be calling tasklet_enable(). What you're doing here means that the
tasklet could be scheduled on multiple cpus at the same time -- that's
not going to work.

+			__tasklet_schedule(t);
> 



      reply	other threads:[~2012-11-28 22:12 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-02  2:48 [RFC PATCH] tasklet: ignore disabled tasklet in tasklet_action Xiaotian Feng
2012-11-05 22:52 ` Andrew Morton
2012-11-06  1:22   ` Xiaotian Feng
2012-11-06  1:37     ` Andrew Morton
2012-11-28 18:00       ` [BUG -next-20121127] kernel BUG at kernel/softirq.c:471! Peter Hurley
2012-11-28 22:12         ` Peter Hurley [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1354140755.3588.27.camel@thor \
    --to=peter@hurleysoftware.com \
    --cc=akpm@linux-foundation.org \
    --cc=dannyfeng@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    --cc=xtfeng@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox