* Regression: turbostat stops working after suspend/resume cycle
@ 2015-05-17 15:48 Gabriele Mazzotta
2015-05-18 6:45 ` Ingo Molnar
2015-05-18 6:48 ` Regression: turbostat stops working after suspend/resume cycle\ Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Gabriele Mazzotta @ 2015-05-17 15:48 UTC (permalink / raw)
To: juri.lelli, mingo, peterz; +Cc: linux-kernel, len.brown, andrey.semin
Hi,
I've recently noticed that if I suspend and resume my laptop, I can no
longer execute turbostat. This is what I get when I try to start it:
# turbostat
Could not migrate to CPU 1
turbostat: re-initialized with num_cpus 4
Could not migrate to CPU 1
Since everything works as expected with v4.0, I ran a bisection and
found that commit 3c18d447b3b36a8d ("sched/core: Check for available
DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
I don't know if there's something else affected by that change, but
I can consistently reproduce the bug with turbostat.
I can provide more info if needed.
Regards,
Gabriele
^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Regression: turbostat stops working after suspend/resume cycle
2015-05-17 15:48 Regression: turbostat stops working after suspend/resume cycle Gabriele Mazzotta
@ 2015-05-18 6:45 ` Ingo Molnar
2015-05-18 7:12 ` Gabriele Mazzotta
2015-05-18 6:48 ` Regression: turbostat stops working after suspend/resume cycle\ Peter Zijlstra
1 sibling, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2015-05-18 6:45 UTC (permalink / raw)
To: Gabriele Mazzotta
Cc: juri.lelli, mingo, peterz, linux-kernel, len.brown, andrey.semin

* Gabriele Mazzotta <gabriele.mzt@gmail.com> wrote:

> Hi,
>
> I've recently noticed that if I suspend and resume my laptop, I can no
> longer execute turbostat. This is what I get when I try to start it:
> # turbostat
> Could not migrate to CPU 1
> turbostat: re-initialized with num_cpus 4
> Could not migrate to CPU 1
>
> Since everything works as expected with v4.0, I ran a bisection and
> found that commit 3c18d447b3b36a8d ("sched/core: Check for available
> DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
>
> I don't know if there's something else affected by that change, but
> I can consistently reproduce the bug with turbostat.
>
> I can provide more info if needed.

Does this commit:

  533445c6e533 sched/core: Fix regression in cpuset_cpu_inactive() for suspend

which is already in Linus's tree, and which should be part of -rc4,
fix it? Also attached below.

Thanks,

	Ingo

====================>
>From 533445c6e53368569e50ab3fb712230c03d523f3 Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov@osandov.com>
Date: Mon, 4 May 2015 03:09:36 -0700
Subject: [PATCH] sched/core: Fix regression in cpuset_cpu_inactive() for suspend

Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in
cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that
caused a regression in setting a CPU inactive during suspend. I ran into
this when a program was failing pthread_setaffinity_np() with EINVAL after
a suspend+wake up.

A simple reproducer:

	$ ./a.out
	sched_setaffinity: Success
	$ systemctl suspend
	$ ./a.out
	sched_setaffinity: Invalid argument

... where ./a.out is:

	#define _GNU_SOURCE
	#include <errno.h>
	#include <sched.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		long num_cores;
		cpu_set_t cpu_set;
		int ret;

		num_cores = sysconf(_SC_NPROCESSORS_ONLN);
		CPU_ZERO(&cpu_set);
		CPU_SET(num_cores - 1, &cpu_set);
		errno = 0;
		ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
		perror("sched_setaffinity");
		return ret ? EXIT_FAILURE : EXIT_SUCCESS;
	}

The mistake is that suspend is handled in the action ==
CPU_DOWN_PREPARE_FROZEN case of the switch statement in
cpuset_cpu_inactive().

However, the commit in question masked out CPU_TASKS_FROZEN
from the action, making this case dead.

The fix is straightforward.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()")
Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osandov@osandov.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 34db9bf892a3..57bd333bc4ab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6999,27 +6999,23 @@ static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action,
 	unsigned long flags;
 	long cpu = (long)hcpu;
 	struct dl_bw *dl_b;
+	bool overflow;
+	int cpus;

-	switch (action & ~CPU_TASKS_FROZEN) {
+	switch (action) {
 	case CPU_DOWN_PREPARE:
-		/* explicitly allow suspend */
-		if (!(action & CPU_TASKS_FROZEN)) {
-			bool overflow;
-			int cpus;
-
-			rcu_read_lock_sched();
-			dl_b = dl_bw_of(cpu);
+		rcu_read_lock_sched();
+		dl_b = dl_bw_of(cpu);

-			raw_spin_lock_irqsave(&dl_b->lock, flags);
-			cpus = dl_bw_cpus(cpu);
-			overflow = __dl_overflow(dl_b, cpus, 0, 0);
-			raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+		raw_spin_lock_irqsave(&dl_b->lock, flags);
+		cpus = dl_bw_cpus(cpu);
+		overflow = __dl_overflow(dl_b, cpus, 0, 0);
+		raw_spin_unlock_irqrestore(&dl_b->lock, flags);

-			rcu_read_unlock_sched();
+		rcu_read_unlock_sched();

-			if (overflow)
-				return notifier_from_errno(-EBUSY);
-		}
+		if (overflow)
+			return notifier_from_errno(-EBUSY);
 		cpuset_update_active_cpus(false);
 		break;
 	case CPU_DOWN_PREPARE_FROZEN:

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Regression: turbostat stops working after suspend/resume cycle
2015-05-18 6:45 ` Ingo Molnar
@ 2015-05-18 7:12 ` Gabriele Mazzotta
0 siblings, 0 replies; 5+ messages in thread
From: Gabriele Mazzotta @ 2015-05-18 7:12 UTC (permalink / raw)
To: Ingo Molnar
Cc: juri.lelli, mingo, peterz, linux-kernel, len.brown, andrey.semin

On Monday 18 May 2015 08:45:52 Ingo Molnar wrote:
>
> * Gabriele Mazzotta <gabriele.mzt@gmail.com> wrote:
>
> > Hi,
> >
> > I've recently noticed that if I suspend and resume my laptop, I can no
> > longer execute turbostat. This is what I get when I try to start it:
> > # turbostat
> > Could not migrate to CPU 1
> > turbostat: re-initialized with num_cpus 4
> > Could not migrate to CPU 1
> >
> > Since everything works as expected with v4.0, I ran a bisection and
> > found that commit 3c18d447b3b36a8d ("sched/core: Check for available
> > DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
> >
> > I don't know if there's something else affected by that change, but
> > I can consistently reproduce the bug with turbostat.
> >
> > I can provide more info if needed.
>
> Does this commit:
>
>   533445c6e533 sched/core: Fix regression in cpuset_cpu_inactive() for suspend
>
> which is already in Linus's tree, and which should be part of -rc4,
> fix it? Also attached below.

Yes, this fixes the problem, thanks.
Sorry for not noticing it.

Gabriele

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Regression: turbostat stops working after suspend/resume cycle\
2015-05-17 15:48 Regression: turbostat stops working after suspend/resume cycle Gabriele Mazzotta
2015-05-18 6:45 ` Ingo Molnar
@ 2015-05-18 6:48 ` Peter Zijlstra
2015-05-18 7:19 ` Gabriele Mazzotta
1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2015-05-18 6:48 UTC (permalink / raw)
To: Gabriele Mazzotta
Cc: juri.lelli, mingo, linux-kernel, len.brown, andrey.semin

On Sun, May 17, 2015 at 05:48:44PM +0200, Gabriele Mazzotta wrote:
> Hi,
>
> I've recently noticed that if I suspend and resume my laptop, I can no
> longer execute turbostat. This is what I get when I try to start it:
> # turbostat
> Could not migrate to CPU 1
> turbostat: re-initialized with num_cpus 4
> Could not migrate to CPU 1
>
> Since everything works as expected with v4.0, I ran a bisection and
> found that commit 3c18d447b3b36a8d ("sched/core: Check for available
> DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
>
> I don't know if there's something else affected by that change, but
> I can consistently reproduce the bug with turbostat.

This should be fixed by the below commit which is already in Linus'
tree.

---
commit 533445c6e53368569e50ab3fb712230c03d523f3
Author: Omar Sandoval <osandov@osandov.com>
Date:   Mon May 4 03:09:36 2015 -0700

    sched/core: Fix regression in cpuset_cpu_inactive() for suspend

    Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in
    cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that
    caused a regression in setting a CPU inactive during suspend. I ran into
    this when a program was failing pthread_setaffinity_np() with EINVAL after
    a suspend+wake up.

    A simple reproducer:

    	$ ./a.out
    	sched_setaffinity: Success
    	$ systemctl suspend
    	$ ./a.out
    	sched_setaffinity: Invalid argument

    ... where ./a.out is:

    	#define _GNU_SOURCE
    	#include <errno.h>
    	#include <sched.h>
    	#include <stdio.h>
    	#include <stdlib.h>
    	#include <string.h>
    	#include <unistd.h>

    	int main(void)
    	{
    		long num_cores;
    		cpu_set_t cpu_set;
    		int ret;

    		num_cores = sysconf(_SC_NPROCESSORS_ONLN);
    		CPU_ZERO(&cpu_set);
    		CPU_SET(num_cores - 1, &cpu_set);
    		errno = 0;
    		ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
    		perror("sched_setaffinity");
    		return ret ? EXIT_FAILURE : EXIT_SUCCESS;
    	}

    The mistake is that suspend is handled in the action ==
    CPU_DOWN_PREPARE_FROZEN case of the switch statement in
    cpuset_cpu_inactive().

    However, the commit in question masked out CPU_TASKS_FROZEN
    from the action, making this case dead.

    The fix is straightforward.

    Signed-off-by: Omar Sandoval <osandov@osandov.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Juri Lelli <juri.lelli@arm.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()")
    Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osandov@osandov.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 34db9bf892a3..57bd333bc4ab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6999,27 +6999,23 @@ static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action,
 	unsigned long flags;
 	long cpu = (long)hcpu;
 	struct dl_bw *dl_b;
+	bool overflow;
+	int cpus;

-	switch (action & ~CPU_TASKS_FROZEN) {
+	switch (action) {
 	case CPU_DOWN_PREPARE:
-		/* explicitly allow suspend */
-		if (!(action & CPU_TASKS_FROZEN)) {
-			bool overflow;
-			int cpus;
-
-			rcu_read_lock_sched();
-			dl_b = dl_bw_of(cpu);
+		rcu_read_lock_sched();
+		dl_b = dl_bw_of(cpu);

-			raw_spin_lock_irqsave(&dl_b->lock, flags);
-			cpus = dl_bw_cpus(cpu);
-			overflow = __dl_overflow(dl_b, cpus, 0, 0);
-			raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+		raw_spin_lock_irqsave(&dl_b->lock, flags);
+		cpus = dl_bw_cpus(cpu);
+		overflow = __dl_overflow(dl_b, cpus, 0, 0);
+		raw_spin_unlock_irqrestore(&dl_b->lock, flags);

-			rcu_read_unlock_sched();
+		rcu_read_unlock_sched();

-			if (overflow)
-				return notifier_from_errno(-EBUSY);
-		}
+		if (overflow)
+			return notifier_from_errno(-EBUSY);
 		cpuset_update_active_cpus(false);
 		break;
 	case CPU_DOWN_PREPARE_FROZEN:

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Regression: turbostat stops working after suspend/resume cycle\
2015-05-18 6:48 ` Regression: turbostat stops working after suspend/resume cycle\ Peter Zijlstra
@ 2015-05-18 7:19 ` Gabriele Mazzotta
0 siblings, 0 replies; 5+ messages in thread
From: Gabriele Mazzotta @ 2015-05-18 7:19 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: juri.lelli, mingo, linux-kernel, len.brown, andrey.semin

On Monday 18 May 2015 08:48:04 Peter Zijlstra wrote:
> On Sun, May 17, 2015 at 05:48:44PM +0200, Gabriele Mazzotta wrote:
> > Hi,
> >
> > I've recently noticed that if I suspend and resume my laptop, I can no
> > longer execute turbostat. This is what I get when I try to start it:
> > # turbostat
> > Could not migrate to CPU 1
> > turbostat: re-initialized with num_cpus 4
> > Could not migrate to CPU 1
> >
> > Since everything works as expected with v4.0, I ran a bisection and
> > found that commit 3c18d447b3b36a8d ("sched/core: Check for available
> > DL bandwidth in cpuset_cpu_inactive()") is the cause of the regression.
> >
> > I don't know if there's something else affected by that change, but
> > I can consistently reproduce the bug with turbostat.
>
>
> This should be fixed by the below commit which is already in Linus'
> tree.

Thank you for the quick reply.

As I replied to Ingo's mail, which arrived just a bit earlier than
yours, yes, the commit here below fixes the problem.

Thanks,
Gabriele

> ---
> commit 533445c6e53368569e50ab3fb712230c03d523f3
> Author: Omar Sandoval <osandov@osandov.com>
> Date:   Mon May 4 03:09:36 2015 -0700
>
>     sched/core: Fix regression in cpuset_cpu_inactive() for suspend
>
>     Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in
>     cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that
>     caused a regression in setting a CPU inactive during suspend. I ran into
>     this when a program was failing pthread_setaffinity_np() with EINVAL after
>     a suspend+wake up.
>
>     A simple reproducer:
>
>     	$ ./a.out
>     	sched_setaffinity: Success
>     	$ systemctl suspend
>     	$ ./a.out
>     	sched_setaffinity: Invalid argument
>
>     ... where ./a.out is:
>
>     	#define _GNU_SOURCE
>     	#include <errno.h>
>     	#include <sched.h>
>     	#include <stdio.h>
>     	#include <stdlib.h>
>     	#include <string.h>
>     	#include <unistd.h>
>
>     	int main(void)
>     	{
>     		long num_cores;
>     		cpu_set_t cpu_set;
>     		int ret;
>
>     		num_cores = sysconf(_SC_NPROCESSORS_ONLN);
>     		CPU_ZERO(&cpu_set);
>     		CPU_SET(num_cores - 1, &cpu_set);
>     		errno = 0;
>     		ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
>     		perror("sched_setaffinity");
>     		return ret ? EXIT_FAILURE : EXIT_SUCCESS;
>     	}
>
>     The mistake is that suspend is handled in the action ==
>     CPU_DOWN_PREPARE_FROZEN case of the switch statement in
>     cpuset_cpu_inactive().
>
>     However, the commit in question masked out CPU_TASKS_FROZEN
>     from the action, making this case dead.
>
>     The fix is straightforward.
>
>     Signed-off-by: Omar Sandoval <osandov@osandov.com>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>     Cc: Borislav Petkov <bp@alien8.de>
>     Cc: H. Peter Anvin <hpa@zytor.com>
>     Cc: Juri Lelli <juri.lelli@arm.com>
>     Cc: Thomas Gleixner <tglx@linutronix.de>
>     Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()")
>     Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osandov@osandov.com
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 34db9bf892a3..57bd333bc4ab 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6999,27 +6999,23 @@ static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action,
>  	unsigned long flags;
>  	long cpu = (long)hcpu;
>  	struct dl_bw *dl_b;
> +	bool overflow;
> +	int cpus;
>
> -	switch (action & ~CPU_TASKS_FROZEN) {
> +	switch (action) {
>  	case CPU_DOWN_PREPARE:
> -		/* explicitly allow suspend */
> -		if (!(action & CPU_TASKS_FROZEN)) {
> -			bool overflow;
> -			int cpus;
> -
> -			rcu_read_lock_sched();
> -			dl_b = dl_bw_of(cpu);
> +		rcu_read_lock_sched();
> +		dl_b = dl_bw_of(cpu);
>
> -			raw_spin_lock_irqsave(&dl_b->lock, flags);
> -			cpus = dl_bw_cpus(cpu);
> -			overflow = __dl_overflow(dl_b, cpus, 0, 0);
> -			raw_spin_unlock_irqrestore(&dl_b->lock, flags);
> +		raw_spin_lock_irqsave(&dl_b->lock, flags);
> +		cpus = dl_bw_cpus(cpu);
> +		overflow = __dl_overflow(dl_b, cpus, 0, 0);
> +		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
>
> -			rcu_read_unlock_sched();
> +		rcu_read_unlock_sched();
>
> -			if (overflow)
> -				return notifier_from_errno(-EBUSY);
> -		}
> +		if (overflow)
> +			return notifier_from_errno(-EBUSY);
>  		cpuset_update_active_cpus(false);
>  		break;
>  	case CPU_DOWN_PREPARE_FROZEN:

^ permalink raw reply	[flat|nested] 5+ messages in thread