public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org, fweisbec@gmail.com,
	Li Zefan <lizf@cn.fujitsu.com>,
	Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH RFC] softirq: fix ksoftirq starved
Date: Fri, 12 Jun 2009 17:51:24 +0800	[thread overview]
Message-ID: <4A32251C.8060001@cn.fujitsu.com> (raw)
In-Reply-To: <4A31BF1D.1050400@cn.fujitsu.com>

Lai Jiangshan wrote:
> Ingo Molnar wrote:
>> * Zhaolei <zhaolei@cn.fujitsu.com> wrote:
>>
>>> * From: "Xiao Guangrong" <xiaoguangrong@cn.fujitsu.com>
>>>> Steven Rostedt wrote:
>>>>> On Thu, 14 May 2009, Mathieu Desnoyers wrote:
>>>>>>> From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
>>>>>>>
>>>>>>> This patch is modified from Mathieu Desnoyers' patch. The original patch
>>>>>>> can be found here: 
>>>>>>> http://marc.info/?l=linux-kernel&m=123791201816245&w=2
>>>>>>> This tracepoint can trace the time stamp when softirq action is raised. 
>>>>>>>
>>>>>>> Changelog for v1 -> v2: 
>>>>>>> 1: Use TRACE_EVENT instead of DEFINE_TRACE
>>>>>>> 2: Move the tracepoint from raise_softirq_irqoff() to
>>>>>>>    __raise_softirq_irqoff()
>>>>>>>
>>>>>>> Changelog for v2 -> v3: 
>>>>>>> Move the definition of __raise_softifq_irqoff() to .c file when
>>>>>>> CONFIG_TRACEPOINTS is enabled, to avoid recursive includes
>>>>>>>
>>>>>>> Changelog for v3 -> v4: 
>>>>>>> 1: Come back to v2, and use forward declarations to avoid
>>>>>>>    recursive includes as Mathieu's suggestion
>>>>>>> 2: Modifiy the tracepoint name
>>>>>>> 3: Add comments for this tracepoint
>>>>>>>
>>>>>> This is a step in the right direction, but please see my email to Lai
>>>>>> about the fact that this assumes correct and undocumented include
>>>>>> dependencies in kernel/trace/events.c. Not explicitely stating the
>>>>>> include dependencies is a build error waiting to happen.
>>>>>>
>>>>>> Including interrupt.h under a ifdef would allow keeping track of
>>>>>> TRACE_EVENT specific build dependencies neatly on a per header basis.
>>>>> This is all moot, the events.c file no longer exists and as not an issue.
>>>>>
>>>> As Steve's says, use ftrace in ftrace.h not in events.c now. 
>>>> So, this mistake does not exist.
>>>> Dose this patch has other error? I expect for your views.
>>>>
>>>> Thanks for your review, is great help to me. ;-) 
>>> Hello,
>>>
>>> It seems Mathieu has no other comments on this patch now.
>>> Ingo, what is your opinion on this patch?
>> There's a complication: this area of the softirq code needs fixes 
>> (unrelated to tracing).
>>  
>> This API:
>>
>> inline void raise_softirq_irqoff(unsigned int nr)
>> {
>>         __raise_softirq_irqoff(nr);
>>
>>         /*
>>          * If we're in an interrupt or softirq, we're done
>>          * (this also catches softirq-disabled code). We will
>>          * actually run the softirq once we return from
>>          * the irq or softirq.
>>          *
>>          * Otherwise we wake up ksoftirqd to make sure we
>>          * schedule the softirq soon.
>>          */
>>         if (!in_interrupt())
>>                 wakeup_softirqd();
>> }
>>
>> is broken with RT tasks (as recently reported to lkml), as when a 
>> real-time task wakes up ksoftirqd (which has lower priority) it wont 
>> execute and we starve softirq execution.
>>
>> The proper solution would be to have a new API:
>>
>> 	raise_softirq_check()
>>
>> and to remove the wakeup_softirqd() hack from raise_softirq_irqoff() 
>> - and put raise_softirq_check() to all places that use 
>> raise_softirq*() from process context.
> 
> It's a nice solution. But I think it would be nicer when it is changed a little.
> 
> The new API raise_softirq_check() will become a very hard _burden_ to
> the users of raise_softirq*(). They have to find out a proper place to place
> the "raise_softirq_check();". It's not an easy things, functions are complicated
> and hard to be determined WHERE is the process context and WHEN(a function may be
> called from multi kinds of context).
> 
> Instead, I prefer that raise_softirq_check() is hidden from users. We call
> raise_softirq_check() from schedule(), it will handle the un-handle softirqs in time
> (if un-handle softirqs are too much, it'll wakeup the ksoftirqd).
> 
> 
> Lai
> 
> 
>>  
>> raise_softirq_check() would execute softirq handlers from process 
>> context, if there's any pending ones. It has to be called outside of 
>> bh critical sections - i.e. often a bit after the raise_softirq() 
>> has been done.
>>
>> __raise_softirq_irqoff() would be made private to kernel/softirq.c, 
>> and we'd only have two public APIs to trigger softirqs: 
>> raise_softirq() and raise_softirq_irqoff(). Both just set the 
>> pending flag and dont do any wakeup.
>>
>> As a side-effect of these fixes, the tracepoints will be sorted out 
>> as well - there wont be any need to hack into 
>> __raise_softirq_irqoff().
>>
>> 	Ingo
>>
>>
>>
> 

Subject: [PATCH RFC] softirq: fix ksoftirq starved

The API:
void raise_softirq_irqoff(unsigned int nr)
{
	__raise_softirq_irqoff(nr);

	/*
	 * If we're in an interrupt or softirq, we're done
	 * (this also catches softirq-disabled code). We will
	 * actually run the softirq once we return from
	 * the irq or softirq.
	 *
	 * Otherwise we wake up ksoftirqd to make sure we
	 * schedule the softirq soon.
	 */
	if (!in_interrupt())
		wakeup_softirqd();
}

is broken with RT tasks, as when a real-time task wakes up ksoftirqd
(which has lower priority) it wont execute and we starve softirq execution.

To make pending softirq is called in time, we check and run the pending
softirq in schedule().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index dd574d5..f59e552 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -359,6 +359,7 @@ struct softirq_action
 
 asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
+extern void schedule_softirq_check(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action *));
 extern void softirq_init(void);
 #define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
diff --git a/kernel/sched.c b/kernel/sched.c
index cfb0456..21b5f67 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5307,6 +5307,7 @@ need_resched:
 	release_kernel_lock(prev);
 need_resched_nonpreemptible:
 
+	schedule_softirq_check();
 	schedule_debug(prev);
 
 	if (sched_feat(HRTICK))
diff --git a/kernel/softirq.c b/kernel/softirq.c
index b41fb71..59d142b 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -55,6 +55,7 @@ EXPORT_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 static DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+static DEFINE_PER_CPU(int, schedule_softirq_check);
 
 char *softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
@@ -76,6 +77,14 @@ void wakeup_softirqd(void)
 		wake_up_process(tsk);
 }
 
+static void set_schedule_softirq_check(void)
+{
+	if (current != __get_cpu_var(ksoftirqd)) {
+		__get_cpu_var(schedule_softirq_check) = 1;
+		set_tsk_need_resched(current);
+	}
+}
+
 /*
  * This one is for softirq.c-internal use,
  * where hardirqs are disabled legitimately:
@@ -243,6 +252,7 @@ restart:
 
 	lockdep_softirq_exit();
 
+	__get_cpu_var(schedule_softirq_check) = 0;
 	account_system_vtime(current);
 	_local_bh_enable();
 }
@@ -269,6 +279,12 @@ asmlinkage void do_softirq(void)
 
 #endif
 
+void schedule_softirq_check(void)
+{
+	if (__get_cpu_var(schedule_softirq_check))
+		do_softirq();
+}
+
 /*
  * Enter an interrupt context.
  */
@@ -323,11 +339,16 @@ inline void raise_softirq_irqoff(unsigned int nr)
 	 * actually run the softirq once we return from
 	 * the irq or softirq.
 	 *
-	 * Otherwise we wake up ksoftirqd to make sure we
-	 * schedule the softirq soon.
+	 * Otherwise we check and run the pending softirq in schedule().
+	 *
+	 * NOTE: It's sometimes helpless to wakeup the ksoftirqd here.
+	 *       Because when current task is a real-time task(which
+	 *       has higher priority) or a real-time task is on standby
+	 *       in this cpu, ksoftirqd won't execute, we starve softirq
+	 *       execution.
 	 */
 	if (!in_interrupt())
-		wakeup_softirqd();
+		set_schedule_softirq_check();
 }
 
 void raise_softirq(unsigned int nr)


  reply	other threads:[~2009-06-12  9:50 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-05  6:41 [PATCH v3] ftrace: add a tracepoint for __raise_softirq_irqoff() Xiao Guangrong
2009-05-05  6:53 ` Li Zefan
2009-05-07  0:57   ` Xiao Guangrong
2009-05-05 16:16 ` Mathieu Desnoyers
2009-05-11  7:28   ` Xiao Guangrong
2009-05-11 13:40     ` Mathieu Desnoyers
2009-05-11 14:09       ` Steven Rostedt
2009-05-11 14:27         ` Mathieu Desnoyers
2009-05-11 14:53           ` Steven Rostedt
2009-05-11 15:13             ` Mathieu Desnoyers
2009-05-12  9:50               ` Xiao Guangrong
2009-05-12 13:14                 ` Mathieu Desnoyers
2009-05-14 10:53                 ` [PATCH v4] " Xiao Guangrong
2009-05-14 12:40                   ` Mathieu Desnoyers
2009-05-14 13:26                     ` Steven Rostedt
2009-05-14 13:51                       ` Mathieu Desnoyers
2009-05-15  1:53                       ` Xiao Guangrong
2009-05-18  3:06                         ` Zhaolei
2009-05-19  8:24                           ` Ingo Molnar
2009-05-21  5:39                             ` Zhaolei
2009-06-12  2:36                             ` Lai Jiangshan
2009-06-12  9:51                               ` Lai Jiangshan [this message]
2009-06-17 14:53                                 ` [PATCH RFC] softirq: fix ksoftirq starved Ingo Molnar
2009-06-18  3:19                                   ` Lai Jiangshan
2009-06-18  8:22                                     ` Peter Zijlstra
2009-06-20 15:48                                     ` Ingo Molnar
2009-07-03  9:35                             ` [PATCH v4] ftrace: add a tracepoint for __raise_softirq_irqoff() Lai Jiangshan
2009-07-03  9:44                               ` Ingo Molnar
2009-07-09 12:58                                 ` Lai Jiangshan
2009-05-14  2:44       ` [PATCH v3] " Lai Jiangshan
2009-05-14  3:50         ` Mathieu Desnoyers
2009-05-14  6:06           ` Lai Jiangshan
2009-05-14  8:05             ` Xiao Guangrong
2009-05-14 12:36             ` Mathieu Desnoyers
2009-05-14 13:25               ` Steven Rostedt
2009-05-06 13:49 ` Jason Baron
2009-05-07  1:16   ` Xiao Guangrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A32251C.8060001@cn.fujitsu.com \
    --to=laijs@cn.fujitsu.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=xiaoguangrong@cn.fujitsu.com \
    --cc=zhaolei@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox