public inbox for linux-pm@vger.kernel.org
* Re: 2.6.21-rc7-mm2 suspend bug. [kernel/kthread.c]
       [not found] <a8f16e2b0704291251l69c1b02ahb9b22370bc468fa1@mail.gmail.com>
@ 2007-04-29 20:27 ` Rafael J. Wysocki
       [not found] ` <200704292227.45215.rjw@sisk.pl>
  1 sibling, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2007-04-29 20:27 UTC
  To: Dan Kruchinin; +Cc: pm list, linux-kernel, Pavel Machek

Hi,

On Sunday, 29 April 2007 21:51, Dan Kruchinin wrote:
> Hi all.
> 
> There is a problem on my macbook core duo with suspend.
> after suspending when i'm trying to 'wake up' my notebook, it seems
> that it works, but i don't see anything at my monitor. So i have to
> reboot it to continue my work.

What exactly do you do to suspend?

Rafael


> ---
> Apr 29 23:31:16 midgard kernel: [140594.900856] BUG: at kernel/kthread.c:166 kthread_bind()
> Apr 29 23:31:16 midgard kernel: [140594.900870]  [<c0142c9b>] _cpu_down+0x16b/0x250
> Apr 29 23:31:16 midgard kernel: [140594.900893]  [<c0142f80>] disable_nonboot_cpus+0x60/0xf0
> Apr 29 23:31:16 midgard kernel: [140594.900903]  [<c0147efa>] enter_state+0x22a/0x240
> Apr 29 23:31:16 midgard kernel: [140594.900913]  [<c0147fcd>] state_store+0xbd/0xd0
> Apr 29 23:31:16 midgard kernel: [140594.900920]  [<c0147f10>] state_store+0x0/0xd0
> Apr 29 23:31:16 midgard kernel: [140594.900927]  [<c01c1559>] subsys_attr_store+0x29/0x40
> Apr 29 23:31:16 midgard kernel: [140594.900937]  [<c01c1774>] sysfs_write_file+0xd4/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900948]  [<c0180eb6>] vfs_write+0xa6/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900958]  [<c01c16a0>] sysfs_write_file+0x0/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900966]  [<c0181601>] sys_write+0x41/0x70
> Apr 29 23:31:16 midgard kernel: [140594.900974]  [<c018c70b>] sys_dup2+0xeb/0x120
> Apr 29 23:31:16 midgard kernel: [140594.900984]  [<c0104116>] sysenter_past_esp+0x5f/0x85
> Apr 29 23:31:16 midgard kernel: [140594.900999]  =======================
> ---
> 
> dmesg output:
> ----
> ....
> Apr 29 23:31:16 midgard kernel: [140594.788697] Suspending device vtcon0
> Apr 29 23:31:16 midgard kernel: [140594.788700] Suspending device platform
> Apr 29 23:31:16 midgard kernel: [140594.788704] Disabling non-boot CPUs ...
> Apr 29 23:31:16 midgard kernel: [140594.900464] CPU 1 is now offline
> Apr 29 23:31:16 midgard kernel: [140594.900469] SMP alternatives: switching to UP code
> Apr 29 23:31:16 midgard kernel: [140594.900856] BUG: at kernel/kthread.c:166 kthread_bind()
> Apr 29 23:31:16 midgard kernel: [140594.900870]  [<c0142c9b>] _cpu_down+0x16b/0x250
> Apr 29 23:31:16 midgard kernel: [140594.900893]  [<c0142f80>] disable_nonboot_cpus+0x60/0xf0
> Apr 29 23:31:16 midgard kernel: [140594.900903]  [<c0147efa>] enter_state+0x22a/0x240
> Apr 29 23:31:16 midgard kernel: [140594.900913]  [<c0147fcd>] state_store+0xbd/0xd0
> Apr 29 23:31:16 midgard kernel: [140594.900920]  [<c0147f10>] state_store+0x0/0xd0
> Apr 29 23:31:16 midgard kernel: [140594.900927]  [<c01c1559>] subsys_attr_store+0x29/0x40
> Apr 29 23:31:16 midgard kernel: [140594.900937]  [<c01c1774>] sysfs_write_file+0xd4/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900948]  [<c0180eb6>] vfs_write+0xa6/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900958]  [<c01c16a0>] sysfs_write_file+0x0/0x160
> Apr 29 23:31:16 midgard kernel: [140594.900966]  [<c0181601>] sys_write+0x41/0x70
> Apr 29 23:31:16 midgard kernel: [140594.900974]  [<c018c70b>] sys_dup2+0xeb/0x120
> Apr 29 23:31:16 midgard kernel: [140594.900984]  [<c0104116>] sysenter_past_esp+0x5f/0x85
> Apr 29 23:31:16 midgard kernel: [140594.900999]  =======================
> Apr 29 23:31:16 midgard kernel: [140594.902843] CPU1 is down
> Apr 29 23:31:16 midgard kernel: [18014366.415769] Enabling non-boot CPUs ...
> Apr 29 23:31:16 midgard kernel: [18014366.426999] SMP alternatives: switching to SMP code
> Apr 29 23:31:16 midgard kernel: [18014366.427165] Booting processor 1/1 eip 3000
> Apr 29 23:31:16 midgard kernel: [18014366.436913] Initializing CPU#1
> Apr 29 23:31:16 midgard kernel: [18014366.509141] Calibrating delay using timer specific routine.. 3994.69 BogoMIPS (lpj=7989390)
> Apr 29 23:31:16 midgard kernel: [18014366.509152] monitor/mwait feature present.
> Apr 29 23:31:16 midgard kernel: [18014366.509156] CPU: L1 I cache: 32K, L1 D cache: 32K
> Apr 29 23:31:16 midgard kernel: [18014366.509158] CPU: L2 cache: 2048K
> Apr 29 23:31:16 midgard kernel: [18014366.509160] CPU: Physical Processor ID: 0
> Apr 29 23:31:16 midgard kernel: [18014366.509161] CPU: Processor Core ID: 1
> Apr 29 23:31:16 midgard kernel: [18014366.509637] CPU1: Intel Genuine Intel(R) CPU            1500  @ 2.00GHz stepping 08
> Apr 29 23:31:16 midgard kernel: [18014366.509659] checking TSC synchronization [CPU#0 -> CPU#1]:
> Apr 29 23:31:16 midgard kernel: [18014366.529627] Measured 68812018716 cycles TSC warp between CPUs, turning off TSC clock.
> Apr 29 23:31:16 midgard kernel: [18014366.529630] Marking TSC unstable due to: check_tsc_sync_source failed.
> Apr 29 23:31:16 midgard kernel: [18014366.529641] Time: hpet clocksource has been installed.
> Apr 29 23:31:16 midgard kernel: [18014366.530137] speedstep-centrino with X86_SPEEDSTEP_CENTRINO_ACPI config is deprecated.
> Apr 29 23:31:16 midgard kernel: [18014366.530139]  Use X86_ACPI_CPUFREQ (acpi-cpufreq) instead.
> Apr 29 23:31:16 midgard kernel: [18014366.530209] CPU1 is up
> Apr 29 23:31:16 midgard kernel: [18014366.614003] Clocksource tsc unstable (delta = 4125051653052 ns)
> Apr 29 23:31:16 midgard kernel: [18014366.730352] ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
> Apr 29 23:31:16 midgard kernel: [18014366.745126] ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 22
> Apr 29 23:31:16 midgard kernel: [18014366.894946] ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 17
> Apr 29 23:31:16 midgard kernel: [18014366.895025] ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 16 (level, low) -> IRQ 16
> Apr 29 23:31:16 midgard kernel: [18014366.895039] PCI: Enabling device 0000:00:1d.0 (0000 -> 0001)
> Apr 29 23:31:16 midgard kernel: [18014366.895042] ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 21 (level, low) -> IRQ 21
> Apr 29 23:31:16 midgard kernel: [18014366.895104] usb usb1: root hub lost power or was reset
> Apr 29 23:31:16 midgard kernel: [18014366.895141] PCI: Enabling device 0000:00:1d.1 (0000 -> 0001)
> Apr 29 23:31:16 midgard kernel: [18014366.895144] ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
> Apr 29 23:31:16 midgard kernel: [18014366.895204] usb usb2: root hub lost power or was reset
> Apr 29 23:31:16 midgard kernel: [18014366.895237] PCI: Enabling device 0000:00:1d.2 (0000 -> 0001)
> Apr 29 23:31:16 midgard kernel: [18014366.895240] ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
> Apr 29 23:31:16 midgard kernel: [18014366.895299] usb usb3: root hub lost power or was reset
> Apr 29 23:31:16 midgard kernel: [18014366.895328] PCI: Enabling device 0000:00:1d.3 (0000 -> 0001)
> Apr 29 23:31:16 midgard kernel: [18014366.895331] ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 16 (level, low) -> IRQ 16
> Apr 29 23:31:16 midgard kernel: [18014366.895391] usb usb4: root hub lost power or was reset
> Apr 29 23:31:16 midgard kernel: [18014366.909854] PCI: Enabling device 0000:00:1d.7 (0000 -> 0002)
> Apr 29 23:31:16 midgard kernel: [18014366.909857] ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 21 (level, low) -> IRQ 21
> Apr 29 23:31:16 midgard kernel: [18014366.910032] ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
> Apr 29 23:31:16 midgard kernel: [18014366.924871] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
> Apr 29 23:31:16 midgard kernel: [18014366.939965] sky2 eth0: enabling interface
> Apr 29 23:31:16 midgard kernel: [18014366.941763] sky2 eth0: ram buffer 48K
> Apr 29 23:31:16 midgard kernel: [18014367.006976] ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[19]  MMIO=[90000000-900007ff]  Max Packet=[2048]  IR/IT contexts=[8/8]
> ...
> ---
> 
> Thanks.
> Dan Kruchinin.


* Re: 2.6.21-rc7-mm2 suspend bug. [kernel/kthread.c]
       [not found] ` <200704292227.45215.rjw@sisk.pl>
@ 2007-04-30  7:39   ` Andrew Morton
  2007-04-30 10:05     ` Gautham R Shenoy
       [not found]     ` <20070430100535.GA30975@in.ibm.com>
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Morton @ 2007-04-30  7:39 UTC
  To: Rafael J. Wysocki; +Cc: list, Dan Kruchinin, linux-kernel, Pavel Machek, pm

On Sun, 29 Apr 2007 22:27:44 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, 29 April 2007 21:51, Dan Kruchinin wrote:
> > Hi all.
> > 
> > There is a problem on my macbook core duo with suspend.
> > after suspending when i'm trying to 'wake up' my notebook, it seems
> > that it works, but i don't see anything at my monitor. So i have to
> > reboot it to continue my work.
> 
> What exactly do you do to suspend?
> 

This is due to _cpu_down() calling kthread_bind() in state TASK_RUNNING.

So I was sent the below, including worrisome changelog.




From: Gautham R Shenoy <ego@in.ibm.com>

We are anyway kthread_stop()ping other per-cpu kernel threads after
move_task_off_dead_cpu(), so we can do it with the stop_machine_run thread
as well.

I just checked with Vatsa if there was any subtle reason why they
had put in the kthread_bind() in cpu.c. Vatsa cannot seem to recollect
any and I can't see any. So let us just remove the kthread_bind.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/cpu.c |    4 ----
 1 files changed, 4 deletions(-)

diff -puN kernel/cpu.c~remvoe-kthread_bind-call-from-_cpu_down kernel/cpu.c
--- a/kernel/cpu.c~remvoe-kthread_bind-call-from-_cpu_down
+++ a/kernel/cpu.c
@@ -175,10 +175,6 @@ static int _cpu_down(unsigned int cpu)
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);
 
-	/* Move it here so it can run. */
-	kthread_bind(p, get_cpu());
-	put_cpu();
-
 	/* CPU is completely dead: tell everyone.  Too late to complain. */
 	if (raw_notifier_call_chain(&cpu_chain, CPU_DEAD, hcpu) == NOTIFY_BAD)
 		BUG();
_


* Re: Re: 2.6.21-rc7-mm2 suspend bug. [kernel/kthread.c]
  2007-04-30  7:39   ` Andrew Morton
@ 2007-04-30 10:05     ` Gautham R Shenoy
       [not found]     ` <20070430100535.GA30975@in.ibm.com>
  1 sibling, 0 replies; 4+ messages in thread
From: Gautham R Shenoy @ 2007-04-30 10:05 UTC
  To: Andrew Morton; +Cc: Pavel Machek, Dan Kruchinin, linux-kernel, pm, list

On Mon, Apr 30, 2007 at 12:39:46AM -0700, Andrew Morton wrote:
> On Sun, 29 Apr 2007 22:27:44 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Sunday, 29 April 2007 21:51, Dan Kruchinin wrote:
> > > Hi all.
> > > 
> > > There is a problem on my macbook core duo with suspend.
> > > after suspending when i'm trying to 'wake up' my notebook, it seems
> > > that it works, but i don't see anything at my monitor. So i have to
> > > reboot it to continue my work.
> > 
> > What exactly do you do to suspend?
> > 
> 
> This is due to _cpu_down() calling kthread_bind() in state TASK_RUNNING.

The state should be TASK_INTERRUPTIBLE. That's the state the thread
'p' should be in when we do a kthread_bind(p) in _cpu_down().

Are you sure about the TASK_RUNNING part ?

> 
> So I was sent the below, including worrisome changelog.
> 

Ok, it should not be that worrisome!
By the time we would be doing kthread_stop(p) in _cpu_down(), 'p' would have
been moved over to some other online cpu, due to the migrate_dead_tasks() 
called in CPU_DEAD handling of migration_call (kernel/sched.c).

So we are safe. Anyway, I apologise for causing any worry :-)

Thanks and Regards
gautham.
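
The ordering argument above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and none of it is kernel code. It assumes that a thread can only be stopped if it sits on an online CPU, and that the CPU_DEAD step moves tasks off the dead CPU first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-in for a kernel thread. */
struct model_task {
	int cpu;
	bool stopped;
};

/* Hypothetical online map; both CPUs start online. */
static bool model_cpu_online[MODEL_NR_CPUS] = { true, true };

/* Return the first online CPU, or -1 if none is left. */
static int model_any_online_cpu(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_cpu_online[cpu])
			return cpu;
	return -1;
}

/* Models the CPU_DEAD step: move every task off the dead CPU. */
static void model_migrate_dead_tasks(struct model_task *tasks, size_t n,
				     int dead_cpu)
{
	int target = model_any_online_cpu();

	for (size_t i = 0; i < n; i++)
		if (tasks[i].cpu == dead_cpu)
			tasks[i].cpu = target;
}

/*
 * Models stopping a thread: it only succeeds if the thread sits on an
 * online CPU, so that it can run and exit.
 */
static bool model_kthread_stop(struct model_task *t)
{
	if (t->cpu < 0 || !model_cpu_online[t->cpu])
		return false;
	t->stopped = true;
	return true;
}
```

In this model, a task left on the offlined CPU cannot be stopped, but once the migration step has run it can be stopped without any explicit rebind, which is the point made for the removed kthread_bind() call.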
> 
> 
> 
> From: Gautham R Shenoy <ego@in.ibm.com>
> 
> We are anyway kthread_stop()ping other per-cpu kernel threads after
> move_task_off_dead_cpu(), so we can do it with the stop_machine_run thread
> as well.
> 
> I just checked with Vatsa if there was any subtle reason why they
> had put in the kthread_bind() in cpu.c. Vatsa cannot seem to recollect
> any and I can't see any. So let us just remove the kthread_bind.
> 
> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> Cc: Oleg Nesterov <oleg@tv-sign.ru>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  kernel/cpu.c |    4 ----
>  1 files changed, 4 deletions(-)
> 
> diff -puN kernel/cpu.c~remvoe-kthread_bind-call-from-_cpu_down kernel/cpu.c
> --- a/kernel/cpu.c~remvoe-kthread_bind-call-from-_cpu_down
> +++ a/kernel/cpu.c
> @@ -175,10 +175,6 @@ static int _cpu_down(unsigned int cpu)
>  	/* This actually kills the CPU. */
>  	__cpu_die(cpu);
> 
> -	/* Move it here so it can run. */
> -	kthread_bind(p, get_cpu());
> -	put_cpu();
> -
>  	/* CPU is completely dead: tell everyone.  Too late to complain. */
>  	if (raw_notifier_call_chain(&cpu_chain, CPU_DEAD, hcpu) == NOTIFY_BAD)
>  		BUG();
> _
> 

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"


* Re: Re: 2.6.21-rc7-mm2 suspend bug. [kernel/kthread.c]
       [not found]     ` <20070430100535.GA30975@in.ibm.com>
@ 2007-04-30 14:58       ` Rafael J. Wysocki
  0 siblings, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2007-04-30 14:58 UTC
  To: ego; +Cc: Pavel Machek, Dan Kruchinin, linux-kernel, pm, Andrew Morton,
	list

On Monday, 30 April 2007 12:05, Gautham R Shenoy wrote:
> On Mon, Apr 30, 2007 at 12:39:46AM -0700, Andrew Morton wrote:
> > On Sun, 29 Apr 2007 22:27:44 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > On Sunday, 29 April 2007 21:51, Dan Kruchinin wrote:
> > > > Hi all.
> > > > 
> > > > There is a problem on my macbook core duo with suspend.
> > > > after suspending when i'm trying to 'wake up' my notebook, it seems
> > > > that it works, but i don't see anything at my monitor. So i have to
> > > > reboot it to continue my work.
> > > 
> > > What exactly do you do to suspend?
> > > 
> > 
> > This is due to _cpu_down() calling kthread_bind() in state TASK_RUNNING.
> 
> The state should be TASK_INTERRUPTIBLE. That's the state the thread
> 'p' should be in when we do a kthread_bind(p) in _cpu_down().
> 
> Are you sure about the TASK_RUNNING part ?

Well, the WARN_ON() in kernel/kthread.c, line 166, is triggering here, so the
state may be TASK_INTERRUPTIBLE too (should the WARN_ON() trigger in that case?).

Rafael
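
The question above turns on which task states the check in kthread_bind() accepts. As a minimal user-space sketch, assuming (this thread does not confirm it) that the check accepts exactly one expected state and warns on anything else, the behaviour can be modelled like this. The enum, `model_kthread_bind()` and its `expected` parameter are hypothetical names for illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's task states. */
enum model_state {
	MODEL_TASK_RUNNING,
	MODEL_TASK_INTERRUPTIBLE,
	MODEL_TASK_UNINTERRUPTIBLE,
};

struct model_task {
	enum model_state state;
	int cpu;
};

/*
 * Model of a bind that refuses to touch a task unless it is parked in
 * the one state the caller expects. Returns true if the bind happened,
 * false if the (modelled) warning fired and the task was left alone.
 */
static bool model_kthread_bind(struct model_task *t, int cpu,
			       enum model_state expected)
{
	if (t->state != expected) {
		fprintf(stderr, "WARN: bind of task in unexpected state %d\n",
			(int)t->state);
		return false;
	}
	t->cpu = cpu;
	return true;
}
```

Under that assumption, a task in TASK_RUNNING always warns, and a task in TASK_INTERRUPTIBLE warns as well whenever the expected state is a different one, so the warning alone does not tell the two cases apart.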

