* possible migration bug with hotplug cpu
@ 2009-07-08 15:48 Lucas De Marchi
From: Lucas De Marchi @ 2009-07-08 15:48 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
I was doing some analysis of the number of migrations in my application, and
I think there's a bug in this accounting or, even worse, in the migration
mechanism when used together with cpu hotplug.

I turned off all CPUs except one using the hotplug mechanism, after which I
launched my application, which has 8 threads. Before they finish, they print
the file /proc/<tid>/sched. I have only 1 online CPU, yet there are ~200
migrations per thread. The function set_task_cpu is responsible for updating
the migrations counter and is called from 9 other functions. With some tests I
discovered that 95% of these migrations come from try_to_wake_up and the other
5% from pull_task and __migrate_task.
Looking at try_to_wake_up:
...
cpu = task_cpu(p);
orig_cpu = cpu;
this_cpu = smp_processor_id();
#ifdef CONFIG_SMP
if (unlikely(task_running(rq, p)))
goto out_activate;
cpu = p->sched_class->select_task_rq(p, sync); //<<<<===
if (cpu != orig_cpu) { //<<<<===
set_task_cpu(p, cpu);
...
p->sched_class->select_task_rq(p, sync) is returning a cpu different from
task_cpu(p) even though I have only 1 online CPU. In my tests this behavior is
the same for rt and normal tasks. For RT, the only possible culprit could be
find_lowest_rq, but I'm still trying to find out why. Since you have more
experience with this code, I'd appreciate it if you could give it a look.
Is there any obscure reason why this behavior could be right?
Lucas De Marchi
* Re: possible migration bug with hotplug cpu
@ 2009-07-08 15:55 ` Peter Zijlstra
From: Peter Zijlstra @ 2009-07-08 15:55 UTC (permalink / raw)
To: Lucas De Marchi; +Cc: Ingo Molnar, linux-kernel
On Wed, 2009-07-08 at 17:48 +0200, Lucas De Marchi wrote:
> I was doing some analysis of the number of migrations in my application, and
> I think there's a bug in this accounting or, even worse, in the migration
> mechanism when used together with cpu hotplug.
>
> I turned off all CPUs except one using the hotplug mechanism, after which I
> launched my application, which has 8 threads. Before they finish, they print
> the file /proc/<tid>/sched. I have only 1 online CPU, yet there are ~200
> migrations per thread. The function set_task_cpu is responsible for updating
> the migrations counter and is called from 9 other functions. With some tests I
> discovered that 95% of these migrations come from try_to_wake_up and the other
> 5% from pull_task and __migrate_task.
>
> Looking at try_to_wake_up:
>
> ....
> cpu = task_cpu(p);
> orig_cpu = cpu;
> this_cpu = smp_processor_id();
>
> #ifdef CONFIG_SMP
> if (unlikely(task_running(rq, p)))
> goto out_activate;
>
> cpu = p->sched_class->select_task_rq(p, sync); //<<<<===
> if (cpu != orig_cpu) { //<<<<===
> set_task_cpu(p, cpu);
> ....
>
> p->sched_class->select_task_rq(p, sync) is returning a cpu different from
> task_cpu(p) even though I have only 1 online CPU. In my tests this behavior is
> the same for rt and normal tasks. For RT, the only possible culprit could be
> find_lowest_rq, but I'm still trying to find out why. Since you have more
> experience with this code, I'd appreciate it if you could give it a look.
>
> Is there any obscure reason why this behavior could be right?
If the task last ran on a now-unplugged cpu this would be correct; is
this indeed what happens?
* Re: possible migration bug with hotplug cpu
@ 2009-07-08 16:05 ` Lucas De Marchi
From: Lucas De Marchi @ 2009-07-08 16:05 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel
No, because the tasks are executed only after the CPUs are taken offline.
Lucas De Marchi
On Wed, Jul 8, 2009 at 17:55, Peter Zijlstra <peterz@infradead.org> wrote:
>
> [... full quote of the original report trimmed ...]
>
> If the task last ran on a now unplugged cpu this would be correct, is
> this indeed what happens?
* Re: possible migration bug with hotplug cpu
@ 2009-07-08 16:39 ` Lucas De Marchi
From: Lucas De Marchi @ 2009-07-08 16:39 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel
Below is an excerpt of /proc/<pid>/sched for an RT task, running with only one
processor online:
se.wait_count : 1466
sched_info.bkl_count : 0
se.nr_migrations : 289 <<=========
se.nr_migrations_cold : 0
se.nr_failed_migrations_affine : 0
se.nr_failed_migrations_running : 7
se.nr_failed_migrations_hot : 3
se.nr_forced_migrations : 1
se.nr_forced2_migrations : 86
se.nr_wakeups : 151347
se.nr_wakeups_sync : 298
se.nr_wakeups_migrate : 265
se.nr_wakeups_local : 150516
se.nr_wakeups_remote : 831
se.nr_wakeups_affine : 253
se.nr_wakeups_affine_attempts : 1092
se.nr_wakeups_passive : 8
se.nr_wakeups_idle : 0
avg_atom : 0.002887
avg_per_cpu : 1.498609
nr_switches : 150001
nr_voluntary_switches : 150001
nr_involuntary_switches : 0
se.load.weight : 177522
policy : 1
prio : 89 <<=========
clock-delta : 84
At http://pastebin.com/pastebin.php?dl=m7c226875 there's the
/proc/sched_debug output from before and after running the test.
Lucas De Marchi
On Wed, Jul 8, 2009 at 18:05, Lucas De Marchi <lucas.de.marchi@gmail.com> wrote:
>
> No, because the tasks are executed only after the CPUs are taken offline.
>
> Lucas De Marchi
>
> [... earlier quotes trimmed ...]
* Re: possible migration bug with hotplug cpu
@ 2009-07-09 11:57 ` Lucas De Marchi
From: Lucas De Marchi @ 2009-07-09 11:57 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel
So, I found the problem.
These fields are currently not initialized upon fork. I noticed this when I
updated to 2.6.31-rc2: commit 6c594c21fcb02c662f11c97be4d7d2b73060a205 was
merged by Ingo (not yet present in 2.6.30), but it only initializes
nr_migrations. Why aren't the other fields initialized to 0? Even when there
are more processors, these fields may be wrong if not zeroed when a new task
is started. Below is a quick way to fix it. This fixed the counters for me.
What do you think of creating a struct sched_statistics embedded in
sched_entity so we could memset it to zero all at once? All fields in the
CONFIG_SCHEDSTATS section should be initialized, right?
Lucas De Marchi
diff --git a/kernel/sched.c b/kernel/sched.c
index fd3ac58..b8d75cf 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2572,15 +2572,37 @@ static void __sched_fork(struct task_struct *p)
p->se.avg_wakeup = sysctl_sched_wakeup_granularity;
#ifdef CONFIG_SCHEDSTATS
- p->se.wait_start = 0;
- p->se.sum_sleep_runtime = 0;
- p->se.sleep_start = 0;
- p->se.block_start = 0;
- p->se.sleep_max = 0;
- p->se.block_max = 0;
- p->se.exec_max = 0;
- p->se.slice_max = 0;
- p->se.wait_max = 0;
+ p->se.wait_start = 0;
+ p->se.wait_max = 0;
+ p->se.wait_count = 0;
+ p->se.wait_sum = 0;
+
+ p->se.sleep_start = 0;
+ p->se.sleep_max = 0;
+ p->se.sum_sleep_runtime = 0;
+
+ p->se.block_start = 0;
+ p->se.block_max = 0;
+ p->se.exec_max = 0;
+ p->se.slice_max = 0;
+
+ p->se.nr_migrations_cold = 0;
+ p->se.nr_failed_migrations_affine = 0;
+ p->se.nr_failed_migrations_running = 0;
+ p->se.nr_failed_migrations_hot = 0;
+ p->se.nr_forced_migrations = 0;
+ p->se.nr_forced2_migrations = 0;
+
+ p->se.nr_wakeups = 0;
+ p->se.nr_wakeups_sync = 0;
+ p->se.nr_wakeups_migrate = 0;
+ p->se.nr_wakeups_local = 0;
+ p->se.nr_wakeups_remote = 0;
+ p->se.nr_wakeups_affine = 0;
+ p->se.nr_wakeups_affine_attempts = 0;
+ p->se.nr_wakeups_passive = 0;
+ p->se.nr_wakeups_idle = 0;
+
#endif
INIT_LIST_HEAD(&p->rt.run_list);
* Re: possible migration bug with hotplug cpu
@ 2009-07-09 12:24 ` Peter Zijlstra
From: Peter Zijlstra @ 2009-07-09 12:24 UTC (permalink / raw)
To: Lucas De Marchi; +Cc: Ingo Molnar, linux-kernel
On Thu, 2009-07-09 at 13:57 +0200, Lucas De Marchi wrote:
> So, I found the problem.
>
> These fields are currently not initialized upon fork. I noticed this when I
> updated to 2.6.31-rc2: commit 6c594c21fcb02c662f11c97be4d7d2b73060a205 was
> merged by Ingo (not yet present in 2.6.30), but it only initializes
> nr_migrations. Why aren't the other fields initialized to 0? Even when there
> are more processors, these fields may be wrong if not zeroed when a new task
> is started. Below is a quick way to fix it. This fixed the counters for me.
Ah, awesome. I hadn't had time to look in detail yet. Thanks!
> What do you think of creating a struct sched_statistics embedded in
> sched_entity so we could memset it to zero all at once? All fields in the
> CONFIG_SCHEDSTATS section should be initialized, right?
Sounds like a sane plan.
One of the things I have wanted to do for a long time is to rename
sched_entity to sched_fair_entity, and do something like:
struct sched_entity {
struct sched_common_entity common;
union {
struct sched_fair_entity fair;
struct sched_rt_entity rt;
};
};
I imagine we can add struct sched_statistics to the end of this as well.
The reason I haven't gotten around to doing this is that it takes a bit of
time to refactor the code and find all the common bits. So it's been on
my todo list like forever.
Can I persuade you to look at this? :-)
I'll queue the below, can I add:
Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
?
* Re: possible migration bug with hotplug cpu
@ 2009-07-09 12:55 ` Lucas De Marchi
From: Lucas De Marchi @ 2009-07-09 12:55 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel
On Thu, Jul 9, 2009 at 14:24, Peter Zijlstra <peterz@infradead.org> wrote:
> One of the things I have wanted to do for a long time is to rename
> sched_entity to sched_fair_entity, and do something like:
>
> struct sched_entity {
> struct sched_common_entity common;
>
> union {
> struct sched_fair_entity fair;
> struct sched_rt_entity rt;
> };
> };
>
> I imagine we can add struct sched_statistics to the end of this as well.
>
> The reason I haven't gotten around to doing this is that it takes a bit of
> time to refactor the code and find all the common bits. So it's been on
> my todo list like forever.
>
> Can I persuade you to look at this? :-)
It looks like a very good idea. As you said, it will take some time to refactor
the code. I can look at it, but only at the end of next week.
> I'll queue the below, can I add:
>
> Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
>
> ?
Sure. I'm not very used to the process yet... do I re-send the email,
or do you add it yourself?
thanks
Lucas De Marchi
* Re: possible migration bug with hotplug cpu
@ 2009-07-09 13:15 ` Peter Zijlstra
From: Peter Zijlstra @ 2009-07-09 13:15 UTC (permalink / raw)
To: Lucas De Marchi; +Cc: Ingo Molnar, linux-kernel
On Thu, 2009-07-09 at 14:55 +0200, Lucas De Marchi wrote:
> On Thu, Jul 9, 2009 at 14:24, Peter Zijlstra <peterz@infradead.org> wrote:
> > [... sched_entity refactoring proposal quoted earlier trimmed ...]
>
> It looks like a very good idea. As you said, it will take some time to refactor
> the code. I can look at it, but only at the end of next week.
Sure, no hurry; it's been like this forever, so a few more
weeks won't harm anybody. Thanks!
> > I'll queue the below, can I add:
> >
> > Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
> >
> > ?
>
> Sure. I'm not very used to the process yet... do I re-send the email,
> or do you add it yourself?
I've added it; just remember to include it in future patches :-)
* [tip:sched/urgent] sched: Reset sched stats on fork()
@ 2009-07-10 10:41 ` tip-bot for Lucas De Marchi
From: tip-bot for Lucas De Marchi @ 2009-07-10 10:41 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, a.p.zijlstra, lucas.de.marchi, tglx,
mingo
Commit-ID: 7793527b90d9418211f4fe8464cc1dcb1631ea1b
Gitweb: http://git.kernel.org/tip/7793527b90d9418211f4fe8464cc1dcb1631ea1b
Author: Lucas De Marchi <lucas.de.marchi@gmail.com>
AuthorDate: Thu, 9 Jul 2009 13:57:20 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 10 Jul 2009 10:43:29 +0200
sched: Reset sched stats on fork()
The sched_stat fields are currently not reset upon fork.
Ingo's recent commit 6c594c21fcb02c662f11c97be4d7d2b73060a205
did reset nr_migrations, but it didn't reset any of the
others.
This patch resets all sched_stat fields on fork.
Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <193b0f820907090457s7a3662f4gcdecdc22fcae857b@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/sched.c | 40 +++++++++++++++++++++++++++++++---------
1 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index a17f3d9..c4549bd 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2572,15 +2572,37 @@ static void __sched_fork(struct task_struct *p)
p->se.avg_wakeup = sysctl_sched_wakeup_granularity;
#ifdef CONFIG_SCHEDSTATS
- p->se.wait_start = 0;
- p->se.sum_sleep_runtime = 0;
- p->se.sleep_start = 0;
- p->se.block_start = 0;
- p->se.sleep_max = 0;
- p->se.block_max = 0;
- p->se.exec_max = 0;
- p->se.slice_max = 0;
- p->se.wait_max = 0;
+ p->se.wait_start = 0;
+ p->se.wait_max = 0;
+ p->se.wait_count = 0;
+ p->se.wait_sum = 0;
+
+ p->se.sleep_start = 0;
+ p->se.sleep_max = 0;
+ p->se.sum_sleep_runtime = 0;
+
+ p->se.block_start = 0;
+ p->se.block_max = 0;
+ p->se.exec_max = 0;
+ p->se.slice_max = 0;
+
+ p->se.nr_migrations_cold = 0;
+ p->se.nr_failed_migrations_affine = 0;
+ p->se.nr_failed_migrations_running = 0;
+ p->se.nr_failed_migrations_hot = 0;
+ p->se.nr_forced_migrations = 0;
+ p->se.nr_forced2_migrations = 0;
+
+ p->se.nr_wakeups = 0;
+ p->se.nr_wakeups_sync = 0;
+ p->se.nr_wakeups_migrate = 0;
+ p->se.nr_wakeups_local = 0;
+ p->se.nr_wakeups_remote = 0;
+ p->se.nr_wakeups_affine = 0;
+ p->se.nr_wakeups_affine_attempts = 0;
+ p->se.nr_wakeups_passive = 0;
+ p->se.nr_wakeups_idle = 0;
+
#endif
INIT_LIST_HEAD(&p->rt.run_list);
Thread overview: 9 messages
2009-07-08 15:48 possible migration bug with hotplug cpu Lucas De Marchi
2009-07-08 15:55 ` Peter Zijlstra
2009-07-08 16:05 ` Lucas De Marchi
2009-07-08 16:39 ` Lucas De Marchi
2009-07-09 11:57 ` Lucas De Marchi
2009-07-09 12:24 ` Peter Zijlstra
2009-07-09 12:55 ` Lucas De Marchi
2009-07-09 13:15 ` Peter Zijlstra
2009-07-10 10:41 ` [tip:sched/urgent] sched: Reset sched stats on fork() tip-bot for Lucas De Marchi