public inbox for linux-kernel@vger.kernel.org
From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <rostedt@goodmis.org>, <fweisbec@gmail.com>, <mingo@redhat.com>,
	<acme@ghostprotocols.net>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/8] sched/core: Skip wakeup when task is already running.
Date: Tue, 22 Apr 2014 20:56:11 +0900
Message-ID: <535658DB.2090801@cn.fujitsu.com>
In-Reply-To: <534E59FC.2090001@cn.fujitsu.com>

[-- Attachment #1: Type: text/plain, Size: 4338 bytes --]

On 04/16/2014 07:22 PM, Dongsheng Yang wrote:
> On 04/15/2014 10:53 PM, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2014 at 09:32:53PM +0900, Dongsheng Yang wrote:
>>
>> How can you get there with ->state == RUNNING? try_to_wake_up*() bail
>> when !(->state & state).
> Yes, try_to_wake_up() did this check. But other callers would miss it.
>
> With the following code, I can get an actual message when a running
> task is woken up:
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9f63275..1369cae 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1418,8 +1418,10 @@ static void ttwu_activate(struct rq *rq, struct task_stru
>  static void
>  ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
>  {
> -       if (p->state == TASK_RUNNING)
> +       if (p->state == TASK_RUNNING) {
> +               printk("Wakeup a running task.");
>                 return;
> +       }
>
>         check_preempt_curr(rq, p, wake_flags);
>         trace_sched_wakeup(p, true);
>
>
> # grep "Wakeup" /var/log/messages
> Apr 15 20:16:21 localhost kernel: [    5.436505] Wakeup a running task.
> Apr 15 20:16:21 localhost kernel: [    7.776042] Wakeup a running task.
> Apr 15 20:16:21 localhost kernel: [    9.324274] Wakeup a running task.

Hi Peter, after some more investigation I think I found the problem:
some other task sets p->state to TASK_RUNNING without holding p->pi_lock.

As the attached graph shows, if some other task sets p->state to
TASK_RUNNING after the check if (!(p->state & state)), then
try_to_wake_up() wastes time waking up a task that is already running.
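
To make the ordering concrete, here is a rough user-space sketch that
replays the interleaving sequentially. It is a toy, not kernel code:
struct sim_task, waker_check_passes(), sleeper_set_state() and
replay_race() are made-up names, the state values are simplified, and the
"lock" is only implied by the comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's task states (assumption). */
#define TASK_RUNNING       0L
#define TASK_INTERRUPTIBLE 1L

/* Hypothetical toy task; the real task_struct is far larger. */
struct sim_task {
	long state;
};

/* CPU1: the check try_to_wake_up() performs while holding p->pi_lock. */
static bool waker_check_passes(const struct sim_task *p, long mask)
{
	return (p->state & mask) != 0;
}

/* CPU2: a plain store to ->state; it takes no lock, so it can land anywhere. */
static void sleeper_set_state(struct sim_task *p, long state)
{
	p->state = state;
}

/*
 * Replay the interleaving from the graph in program order.
 * Returns true when the waker passed its check but the task is
 * already TASK_RUNNING by the time the wakeup continues.
 */
static bool replay_race(void)
{
	struct sim_task p = { .state = TASK_RUNNING };
	bool passed;

	sleeper_set_state(&p, TASK_INTERRUPTIBLE);            /* CPU2 */
	assert(p.state == TASK_INTERRUPTIBLE);

	passed = waker_check_passes(&p, TASK_INTERRUPTIBLE);  /* CPU1, pi_lock held */

	sleeper_set_state(&p, TASK_RUNNING);                  /* CPU2, no pi_lock */

	/* CPU1 goes on to cpu = task_cpu(p) with a stale view of ->state. */
	return passed && p.state == TASK_RUNNING;
}
```

Calling replay_race() returns true for this ordering, i.e. the check
under pi_lock does not by itself guarantee that the task is still
sleeping afterwards.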

If this analysis is right, I think there are two ways to solve the
problem:
     * Skip the wakeup in ttwu_do_wakeup() when p->state is TASK_RUNNING,
       as my patch does.
     * Add locking wherever p->state is set. That is a lot of work, and I
       am afraid it would hurt kernel performance.
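
For reference, the first option boils down to something like the toy
below. This is only a user-space sketch under my own assumptions:
struct toy_task, its wakeups counter and toy_ttwu_do_wakeup() are made-up
names, and the counter merely stands in for the preemption check and the
tracepoint that the real function runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's task states (assumption). */
#define TASK_RUNNING       0L
#define TASK_INTERRUPTIBLE 1L

/* Hypothetical toy task. */
struct toy_task {
	long state;
	int  wakeups;	/* stands in for check_preempt_curr() + trace_sched_wakeup() */
};

/*
 * Toy version of the guarded wakeup: if the task is already running,
 * return early and do none of the wakeup work.
 * Returns true only when a wakeup was actually performed.
 */
static bool toy_ttwu_do_wakeup(struct toy_task *p)
{
	assert(p != NULL);

	if (p->state == TASK_RUNNING)
		return false;		/* nothing to wake, skip the work */

	p->wakeups++;			/* the "real" wakeup work */
	p->state = TASK_RUNNING;
	return true;
}
```

With a task in TASK_INTERRUPTIBLE the first call does the work and
returns true; a second call on the same task returns false and leaves
the counter untouched.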


Here is the backtrace I captured when it happened:

(gdb) bt
#0  try_to_wake_up (p=0xffff88027e651930, state=1, wake_flags=0)
     at kernel/sched/core.c:1605
#1  0xffffffff81099532 in default_wake_function (curr=<value optimized out>,
     mode=<value optimized out>, wake_flags=<value optimized out>,
     key=<value optimized out>) at kernel/sched/core.c:2853
#2  0xffffffff810aa489 in __wake_up_common (q=0xffff88027f03f210, mode=1,
     nr_exclusive=1, wake_flags=0, key=0x4) at kernel/sched/wait.c:75
#3  0xffffffff810aa838 in __wake_up (q=0xffff88027f03f210, mode=1,
     nr_exclusive=1, key=0x4) at kernel/sched/wait.c:97
#4  0xffffffff813cd0a4 in n_tty_check_unthrottle (tty=0xffff88027f03ec00,
     file=0xffff880278ab1b00,
     buf=0x7fff0fcf9720 "\r\nyum install -y 
./a/alsa-plugins-pulseaudio-1.0.27-1.fc19.x86_64.rpml -y 
./a/alsa-lib-1.0.27.1-2.fc19.x86_64.rpm.noarch.rpm\r\nyum install -y 
./a/albatross-xfwm4-theme-1.2-5.fc19.noarch.rpm\r\nyum instal"...,
     nr=16315) at drivers/tty/n_tty.c:280
#5  n_tty_read (tty=0xffff88027f03ec00, file=0xffff880278ab1b00,
     buf=0x7fff0fcf9720 "\r\nyum install -y 
./a/alsa-plugins-pulseaudio-1.0.27-1.fc19.x86_64.rpml -y 
./a/alsa-lib-1.0.27.1-2.fc19.x86_64.rpm.noarch.rpm\r\nyum install -y 
./a/albatross-xfwm4-theme-1.2-5.fc19.noarch.rpm\r\nyum instal"...,
     nr=16315) at drivers/tty/n_tty.c:2259
#6  0xffffffff813c5667 in tty_read (file=0xffff880278ab1b00,
     buf=0x7fff0fcf9720 "\r\nyum install -y 
./a/alsa-plugins-pulseaudio-1.0.27-1.fc19.x86_64.rpml -y 
./a/alsa-lib-1.0.27.1-2.fc19.x86_64.rpm.noarch.rpm\r\nyum in---Type 
<return> to continue, or q <return> to quit---q
Quit
(gdb) p p->state   ------> Currently, p->state is TASK_RUNNING.
$1 = 0
(gdb) l
1600
1601        success = 1; /* we're going to change ->state */
1602        cpu = task_cpu(p);
1603
1604        if (p->state == TASK_RUNNING) {
1605            printk("Wake up a running task.");
1606        }
1607        if (p->on_rq && ttwu_remote(p, wake_flags))
1608            goto stat;
1609
(gdb)
>
> So I think some caller of ttwu_do_wakeup() is attempting to wake
> up a running task.


[-- Attachment #2: graph --]
[-- Type: text/plain, Size: 279 bytes --]

			CPU1							CPU2
	raw_spin_lock_irqsave(&p->pi_lock, flags);		set_task_state(tsk, TASK_INTERRUPTIBLE);
	if (!(p->state & state)) ---> TASK_INTERRUPTIBLE			 |
                goto out;							 |
	cpu = task_cpu(p);					set_task_state(tsk, TASK_RUNNING);
			... ---> TASK_RUNNING
