linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Suspend to memory is freezing my machine
       [not found]       ` <481DF570.8050700@gmail.com>
@ 2008-05-04 18:01         ` Robert Hancock
  2008-05-04 18:52           ` Jacek Luczak
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Hancock @ 2008-05-04 18:01 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Rafael J. Wysocki, Zdenek Kabelac, Linux Kernel Mailing List, ide

Jacek Luczak wrote:
> Robert Hancock pisze:
>> Jacek Luczak wrote:
>>> Rafael J. Wysocki pisze:
>>>> On Sunday, 4 of May 2008, Zdenek Kabelac wrote:
>>>>> Hello
>>>> Hi,
>>>>
>>>>> With recent 2.6.25 & 2.6.26-rc1 git (around 1 week) I get occasionally
>>>>> complete freeze of my T61 during suspend. (dual core, 2GB).
>>>> How reproducible is this?
>>>>
>>>>> I'm running kernel with no_console_suspend - but all I can see is
>>>>> blinking cursor on an empty screen - thus even when I run kernel with
>>>>> most debug options turned on, I can't pass more details so far.  I run
>>>>> suspend with with SD card in - so maybe some update in the MMC driver
>>>>> might be responsible for this ?
>>>>>
>>>>> Also - I think that option no_console_suspend doens't work correctly -
>>>>> as many times with suspend I do not see any log message on my console
>>>>> screen. However sometimes the log is shown.
>>>> It would be helpful if you could verify if:
>>>>
>>>> (1) The problem occurs without no_console_suspend.
>>>> (2) The problem occurs without the SD card.
>>>>
>>> Hi Rafael,
>>>
>>> same problem here, although I was able to resume system (it's
>>> basically Intel
>>> machine) , but it was unusable - I was able to switch between
>>> terminals and see
>>> output from kernel. So there was:
>>>     - Disabling irq #19;
>>>     - some kind of lock spinning on disk:
>>> IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
>>> Storage
>>> Controller IDE (rev 02)
>>>     but I can't provide more output of that lock now - no sign in logs.
>>>
>>> I've made some successful suspend/resume all without sound card active
>>> without
>>> problem. Those appear with sound card active, but I must take closer
>>> look - will
>>> send info later.
>> Can you post your dmesg and /proc/interrupts output from normal bootup ?
> 
> Sure I can ;)
> 
> 1) /proc/interrupts
> 
>            CPU0       CPU1
>   0:   11846981          0   IO-APIC-edge      timer
>   1:      30098          0   IO-APIC-edge      i8042
>   8:          3          0   IO-APIC-edge      rtc
>   9:         13          0   IO-APIC-fasteoi   acpi
>  12:    1776540          0   IO-APIC-edge      i8042
>  14:         39          0   IO-APIC-edge      ata_piix
>  15:          0          0   IO-APIC-edge      ata_piix
>  16:      54570      44642   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>  19:      98243          0   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb5
>  21:    1650574          0   IO-APIC-fasteoi   HDA Intel
>  23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
> 220:      14263          0   PCI-MSI-edge      iwl3945
> 221:    1166041    1333296   PCI-MSI-edge      eth0
> NMI:          0          0   Non-maskable interrupts
> LOC:    1104887    7534969   Local timer interrupts
> RES:     633378     701351   Rescheduling interrupts
> CAL:         16      28315   function call interrupts
> TLB:       1721       2620   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> SPU:          0          0   Spurious interrupts
> ERR:          0
> MIS:          0
> 
> 2) dmesg can here -> http://212.109.128.251/~difrost/linux-next/dmesg.log
> 3) Kernel:
> Linux difrost 2.6.25-07422-gb66e1f1-dirty #14 SMP Fri May 2 22:04:17 CEST 2008
> i686 i686 i386 GNU/Linux
> It's marked dirty because due to http://lkml.org/lkml/2008/5/2/405 patch applied.
> 
> -Jacek
> 

Well, if IRQ 19 got disabled, that's your SATA controller, so resume 
likely isn't going to work. Could be a libata problem? CCing linux-ide.

BTW, if your BIOS offers an option to enable AHCI on your SATA 
controller then that would be a more optimal configuration (could get 
NCQ support), but that is an aside.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Suspend to memory is freezing my machine
  2008-05-04 18:01         ` Suspend to memory is freezing my machine Robert Hancock
@ 2008-05-04 18:52           ` Jacek Luczak
  2008-05-04 19:44             ` Jacek Luczak
  0 siblings, 1 reply; 5+ messages in thread
From: Jacek Luczak @ 2008-05-04 18:52 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Rafael J. Wysocki, Zdenek Kabelac, Linux Kernel Mailing List, ide

Robert Hancock pisze:
> Jacek Luczak wrote:
>> Robert Hancock pisze:
>>> Jacek Luczak wrote:
>>>> Rafael J. Wysocki pisze:
>>>>> On Sunday, 4 of May 2008, Zdenek Kabelac wrote:
>>>>>> Hello
>>>>> Hi,
>>>>>
>>>>>> With recent 2.6.25 & 2.6.26-rc1 git (around 1 week) I get
>>>>>> occasionally
>>>>>> complete freeze of my T61 during suspend. (dual core, 2GB).
>>>>> How reproducible is this?
>>>>>
>>>>>> I'm running kernel with no_console_suspend - but all I can see is
>>>>>> blinking cursor on an empty screen - thus even when I run kernel with
>>>>>> most debug options turned on, I can't pass more details so far.  I
>>>>>> run
>>>>>> suspend with with SD card in - so maybe some update in the MMC driver
>>>>>> might be responsible for this ?
>>>>>>
>>>>>> Also - I think that option no_console_suspend doens't work
>>>>>> correctly -
>>>>>> as many times with suspend I do not see any log message on my console
>>>>>> screen. However sometimes the log is shown.
>>>>> It would be helpful if you could verify if:
>>>>>
>>>>> (1) The problem occurs without no_console_suspend.
>>>>> (2) The problem occurs without the SD card.
>>>>>
>>>> Hi Rafael,
>>>>
>>>> same problem here, although I was able to resume system (it's
>>>> basically Intel
>>>> machine) , but it was unusable - I was able to switch between
>>>> terminals and see
>>>> output from kernel. So there was:
>>>>     - Disabling irq #19;
>>>>     - some kind of lock spinning on disk:
>>>> IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
>>>> Storage
>>>> Controller IDE (rev 02)
>>>>     but I can't provide more output of that lock now - no sign in logs.
>>>>
>>>> I've made some successful suspend/resume all without sound card active
>>>> without
>>>> problem. Those appear with sound card active, but I must take closer
>>>> look - will
>>>> send info later.
>>> Can you post your dmesg and /proc/interrupts output from normal bootup ?
>>
>> Sure I can ;)
>>
>> 1) /proc/interrupts
>>
>>            CPU0       CPU1
>>   0:   11846981          0   IO-APIC-edge      timer
>>   1:      30098          0   IO-APIC-edge      i8042
>>   8:          3          0   IO-APIC-edge      rtc
>>   9:         13          0   IO-APIC-fasteoi   acpi
>>  12:    1776540          0   IO-APIC-edge      i8042
>>  14:         39          0   IO-APIC-edge      ata_piix
>>  15:          0          0   IO-APIC-edge      ata_piix
>>  16:      54570      44642   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>>  19:      98243          0   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb5
>>  21:    1650574          0   IO-APIC-fasteoi   HDA Intel
>>  23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1,
>> uhci_hcd:usb2
>> 220:      14263          0   PCI-MSI-edge      iwl3945
>> 221:    1166041    1333296   PCI-MSI-edge      eth0
>> NMI:          0          0   Non-maskable interrupts
>> LOC:    1104887    7534969   Local timer interrupts
>> RES:     633378     701351   Rescheduling interrupts
>> CAL:         16      28315   function call interrupts
>> TLB:       1721       2620   TLB shootdowns
>> TRM:          0          0   Thermal event interrupts
>> SPU:          0          0   Spurious interrupts
>> ERR:          0
>> MIS:          0
>>
>> 2) dmesg can here -> http://212.109.128.251/~difrost/linux-next/dmesg.log
>> 3) Kernel:
>> Linux difrost 2.6.25-07422-gb66e1f1-dirty #14 SMP Fri May 2 22:04:17
>> CEST 2008
>> i686 i686 i386 GNU/Linux
>> It's marked dirty because due to http://lkml.org/lkml/2008/5/2/405
>> patch applied.
>>
>> -Jacek
>>
> 
> Well, if IRQ 19 got disabled, that's your SATA controller, so resume
> likely isn't going to work. Could be a libata problem? CCing linux-ide.

Yep, I know, that's why I pointed that out. Irq was disabled somehow in suspend
or resume process.

> BTW, if your BIOS offers an option to enable AHCI on your SATA
> controller then that would be a more optimal configuration (could get
> NCQ support), but that is an aside.

With AHCI I've got pretty bad timings (and I don't really know why!):

[root|20:49|~]$ cat sda_ahci_t

/dev/sda:
 Timing cached reads:   1560 MB in  2.00 seconds = 780.51 MB/sec
 Timing buffered disk reads:  102 MB in  3.02 seconds =  33.74 MB/sec
[root|20:49|~]$ cat sda_piix_t

/dev/sda:
 Timing cached reads:   1544 MB in  2.00 seconds = 772.35 MB/sec
 Timing buffered disk reads:  134 MB in  3.04 seconds =  44.05 MB/sec

-Jacek

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Suspend to memory is freezing my machine
  2008-05-04 18:52           ` Jacek Luczak
@ 2008-05-04 19:44             ` Jacek Luczak
  2008-05-04 21:06               ` Robert Hancock
  0 siblings, 1 reply; 5+ messages in thread
From: Jacek Luczak @ 2008-05-04 19:44 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Rafael J. Wysocki, Zdenek Kabelac, Linux Kernel Mailing List, ide,
	linux-acpi

Jacek Luczak pisze:
> Robert Hancock pisze:
>> Jacek Luczak wrote:
>>> Robert Hancock pisze:
>>>> Jacek Luczak wrote:
>>>>> Rafael J. Wysocki pisze:
>>>>>> On Sunday, 4 of May 2008, Zdenek Kabelac wrote:
>>>>>>> Hello
>>>>>> Hi,
>>>>>>
>>>>>>> With recent 2.6.25 & 2.6.26-rc1 git (around 1 week) I get
>>>>>>> occasionally
>>>>>>> complete freeze of my T61 during suspend. (dual core, 2GB).
>>>>>> How reproducible is this?
>>>>>>
>>>>>>> I'm running kernel with no_console_suspend - but all I can see is
>>>>>>> blinking cursor on an empty screen - thus even when I run kernel with
>>>>>>> most debug options turned on, I can't pass more details so far.  I
>>>>>>> run
>>>>>>> suspend with with SD card in - so maybe some update in the MMC driver
>>>>>>> might be responsible for this ?
>>>>>>>
>>>>>>> Also - I think that option no_console_suspend doens't work
>>>>>>> correctly -
>>>>>>> as many times with suspend I do not see any log message on my console
>>>>>>> screen. However sometimes the log is shown.
>>>>>> It would be helpful if you could verify if:
>>>>>>
>>>>>> (1) The problem occurs without no_console_suspend.
>>>>>> (2) The problem occurs without the SD card.
>>>>>>
>>>>> Hi Rafael,
>>>>>
>>>>> same problem here, although I was able to resume system (it's
>>>>> basically Intel
>>>>> machine) , but it was unusable - I was able to switch between
>>>>> terminals and see
>>>>> output from kernel. So there was:
>>>>>     - Disabling irq #19;
>>>>>     - some kind of lock spinning on disk:
>>>>> IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
>>>>> Storage
>>>>> Controller IDE (rev 02)
>>>>>     but I can't provide more output of that lock now - no sign in logs.
>>>>>
>>>>> I've made some successful suspend/resume all without sound card active
>>>>> without
>>>>> problem. Those appear with sound card active, but I must take closer
>>>>> look - will
>>>>> send info later.
>>>> Can you post your dmesg and /proc/interrupts output from normal bootup ?
>>> Sure I can ;)
>>>
>>> 1) /proc/interrupts
>>>
>>>            CPU0       CPU1
>>>   0:   11846981          0   IO-APIC-edge      timer
>>>   1:      30098          0   IO-APIC-edge      i8042
>>>   8:          3          0   IO-APIC-edge      rtc
>>>   9:         13          0   IO-APIC-fasteoi   acpi
>>>  12:    1776540          0   IO-APIC-edge      i8042
>>>  14:         39          0   IO-APIC-edge      ata_piix
>>>  15:          0          0   IO-APIC-edge      ata_piix
>>>  16:      54570      44642   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>>>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>>>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>>>  19:      98243          0   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb5
>>>  21:    1650574          0   IO-APIC-fasteoi   HDA Intel
>>>  23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1,
>>> uhci_hcd:usb2
>>> 220:      14263          0   PCI-MSI-edge      iwl3945
>>> 221:    1166041    1333296   PCI-MSI-edge      eth0
>>> NMI:          0          0   Non-maskable interrupts
>>> LOC:    1104887    7534969   Local timer interrupts
>>> RES:     633378     701351   Rescheduling interrupts
>>> CAL:         16      28315   function call interrupts
>>> TLB:       1721       2620   TLB shootdowns
>>> TRM:          0          0   Thermal event interrupts
>>> SPU:          0          0   Spurious interrupts
>>> ERR:          0
>>> MIS:          0
>>>
>>> 2) dmesg can here -> http://212.109.128.251/~difrost/linux-next/dmesg.log
>>> 3) Kernel:
>>> Linux difrost 2.6.25-07422-gb66e1f1-dirty #14 SMP Fri May 2 22:04:17
>>> CEST 2008
>>> i686 i686 i386 GNU/Linux
>>> It's marked dirty because due to http://lkml.org/lkml/2008/5/2/405
>>> patch applied.
>>>
>>> -Jacek
>>>
>> Well, if IRQ 19 got disabled, that's your SATA controller, so resume
>> likely isn't going to work. Could be a libata problem? CCing linux-ide.
> 
> Yep, I know, that's why I pointed that out. Irq was disabled somehow in suspend
> or resume process.
> 
>> BTW, if your BIOS offers an option to enable AHCI on your SATA
>> controller then that would be a more optimal configuration (could get
>> NCQ support), but that is an aside.
> 
> With AHCI I've got pretty bad timings (and I don't really know why!):
> 
> [root|20:49|~]$ cat sda_ahci_t
> 
> /dev/sda:
>  Timing cached reads:   1560 MB in  2.00 seconds = 780.51 MB/sec
>  Timing buffered disk reads:  102 MB in  3.02 seconds =  33.74 MB/sec
> [root|20:49|~]$ cat sda_piix_t
> 
> /dev/sda:
>  Timing cached reads:   1544 MB in  2.00 seconds = 772.35 MB/sec
>  Timing buffered disk reads:  134 MB in  3.04 seconds =  44.05 MB/sec

Here's the latest report (all on latest git):
	1) I've switched to AHCI mode and suspend/resume works OK (because SATA
controller irq is not disabled).

	2) now /proc/interrupts look like that:
           CPU0       CPU1
  0:     110708          0   IO-APIC-edge      timer
  1:       4008          0   IO-APIC-edge      i8042
  8:          3          0   IO-APIC-edge      rtc
  9:      15091          0   IO-APIC-fasteoi   acpi
 12:      77467          0   IO-APIC-edge      i8042
 14:         44          0   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   i915@pci:0000:00:02.0
 17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 19:     100001          0   IO-APIC-fasteoi   uhci_hcd:usb5
 21:        282          0   IO-APIC-fasteoi   HDA Intel
 23:          1          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
219:        858          0   PCI-MSI-edge      iwl3945
220:       8616          0   PCI-MSI-edge      eth0
221:       6423          0   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:      15777      64510   Local timer interrupts
RES:       9045      24560   Rescheduling interrupts
CAL:         30      28255   function call interrupts
TLB:        341        145   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0

	3) The IRQ #19 remains disabled after resume and produce:
irq 19: nobody cared (try booting with the "irqpoll" option)
Pid: 13, comm: kacpi_notify Not tainted 2.6.26-rc1-07561-gafa26be-dirty #16
 [<c013ea27>] __report_bad_irq+0x24/0x69
 [<c013ea2e>] __report_bad_irq+0x2b/0x69
 [<c013ec25>] note_interrupt+0x1b9/0x210
 [<c013e36c>] handle_IRQ_event+0x1a/0x3f
 [<c013f195>] handle_fasteoi_irq+0x84/0xa2
 [<c0104fde>] do_IRQ+0x4f/0x65
 [<c01034ff>] common_interrupt+0x23/0x28
 [<c013007b>] timekeeping_resume+0x9b/0x127
 [<c020b090>] acpi_os_read_port+0x29/0x44
 [<c02177c9>] acpi_hw_register_read+0x61/0x119
 [<c020f76e>] acpi_ev_fixed_event_detect+0x2a/0xa0
 [<c021001a>] acpi_ev_sci_xrupt_handler+0x9/0x17
 [<c020b053>] acpi_irq+0xb/0x1f
 [<c013e36c>] handle_IRQ_event+0x1a/0x3f
 [<c013f181>] handle_fasteoi_irq+0x70/0xa2
 [<c0104fde>] do_IRQ+0x4f/0x65
 [<c020b623>] acpi_os_execute_deferred+0x0/0x25
 [<c01034ff>] common_interrupt+0x23/0x28
 [<c020b623>] acpi_os_execute_deferred+0x0/0x25
 [<c020b0b8>] acpi_os_write_port+0xd/0x2c
 [<c020b640>] acpi_os_execute_deferred+0x1d/0x25
 [<c01290fa>] run_workqueue+0x69/0xda
 [<c0129221>] worker_thread+0xb6/0xc2
 [<c012bca6>] autoremove_wake_function+0x0/0x2d
 [<c012916b>] worker_thread+0x0/0xc2
 [<c012ba42>] kthread+0x38/0x5d
 [<c012ba0a>] kthread+0x0/0x5d
 [<c010370f>] kernel_thread_helper+0x7/0x10
 =======================
handlers:
[<c027d100>] (usb_hcd_irq+0x0/0x53)
Disabling IRQ #19

This might happen due to "ACPI: EC: GPE storm detected, disabling EC GPE", but
here it should revert to polling mode (which is done during boot, but not during
resume). I'm not expert here.

Full dmesg here -> http://212.109.128.251/~difrost/linux-next/dmesg_ahci.log

-Jacek

PS: Site note: Why there's such big difference on hdparm timings with PATA and
AHCI mode?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Suspend to memory is freezing my machine
  2008-05-04 19:44             ` Jacek Luczak
@ 2008-05-04 21:06               ` Robert Hancock
  2008-05-07 20:29                 ` Jacek Luczak
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Hancock @ 2008-05-04 21:06 UTC (permalink / raw)
  To: Jacek Luczak
  Cc: Rafael J. Wysocki, Zdenek Kabelac, Linux Kernel Mailing List, ide,
	linux-acpi

Jacek Luczak wrote:
> Jacek Luczak pisze:
>> Robert Hancock pisze:
>>> Jacek Luczak wrote:
>>>> Robert Hancock pisze:
>>>>> Jacek Luczak wrote:
>>>>>> Rafael J. Wysocki pisze:
>>>>>>> On Sunday, 4 of May 2008, Zdenek Kabelac wrote:
>>>>>>>> Hello
>>>>>>> Hi,
>>>>>>>
>>>>>>>> With recent 2.6.25 & 2.6.26-rc1 git (around 1 week) I get
>>>>>>>> occasionally
>>>>>>>> complete freeze of my T61 during suspend. (dual core, 2GB).
>>>>>>> How reproducible is this?
>>>>>>>
>>>>>>>> I'm running kernel with no_console_suspend - but all I can see is
>>>>>>>> blinking cursor on an empty screen - thus even when I run kernel with
>>>>>>>> most debug options turned on, I can't pass more details so far.  I
>>>>>>>> run
>>>>>>>> suspend with with SD card in - so maybe some update in the MMC driver
>>>>>>>> might be responsible for this ?
>>>>>>>>
>>>>>>>> Also - I think that option no_console_suspend doens't work
>>>>>>>> correctly -
>>>>>>>> as many times with suspend I do not see any log message on my console
>>>>>>>> screen. However sometimes the log is shown.
>>>>>>> It would be helpful if you could verify if:
>>>>>>>
>>>>>>> (1) The problem occurs without no_console_suspend.
>>>>>>> (2) The problem occurs without the SD card.
>>>>>>>
>>>>>> Hi Rafael,
>>>>>>
>>>>>> same problem here, although I was able to resume system (it's
>>>>>> basically Intel
>>>>>> machine) , but it was unusable - I was able to switch between
>>>>>> terminals and see
>>>>>> output from kernel. So there was:
>>>>>>     - Disabling irq #19;
>>>>>>     - some kind of lock spinning on disk:
>>>>>> IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
>>>>>> Storage
>>>>>> Controller IDE (rev 02)
>>>>>>     but I can't provide more output of that lock now - no sign in logs.
>>>>>>
>>>>>> I've made some successful suspend/resume all without sound card active
>>>>>> without
>>>>>> problem. Those appear with sound card active, but I must take closer
>>>>>> look - will
>>>>>> send info later.
>>>>> Can you post your dmesg and /proc/interrupts output from normal bootup ?
>>>> Sure I can ;)
>>>>
>>>> 1) /proc/interrupts
>>>>
>>>>            CPU0       CPU1
>>>>   0:   11846981          0   IO-APIC-edge      timer
>>>>   1:      30098          0   IO-APIC-edge      i8042
>>>>   8:          3          0   IO-APIC-edge      rtc
>>>>   9:         13          0   IO-APIC-fasteoi   acpi
>>>>  12:    1776540          0   IO-APIC-edge      i8042
>>>>  14:         39          0   IO-APIC-edge      ata_piix
>>>>  15:          0          0   IO-APIC-edge      ata_piix
>>>>  16:      54570      44642   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>>>>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>>>>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>>>>  19:      98243          0   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb5
>>>>  21:    1650574          0   IO-APIC-fasteoi   HDA Intel
>>>>  23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1,
>>>> uhci_hcd:usb2
>>>> 220:      14263          0   PCI-MSI-edge      iwl3945
>>>> 221:    1166041    1333296   PCI-MSI-edge      eth0
>>>> NMI:          0          0   Non-maskable interrupts
>>>> LOC:    1104887    7534969   Local timer interrupts
>>>> RES:     633378     701351   Rescheduling interrupts
>>>> CAL:         16      28315   function call interrupts
>>>> TLB:       1721       2620   TLB shootdowns
>>>> TRM:          0          0   Thermal event interrupts
>>>> SPU:          0          0   Spurious interrupts
>>>> ERR:          0
>>>> MIS:          0
>>>>
>>>> 2) dmesg can here -> http://212.109.128.251/~difrost/linux-next/dmesg.log
>>>> 3) Kernel:
>>>> Linux difrost 2.6.25-07422-gb66e1f1-dirty #14 SMP Fri May 2 22:04:17
>>>> CEST 2008
>>>> i686 i686 i386 GNU/Linux
>>>> It's marked dirty because due to http://lkml.org/lkml/2008/5/2/405
>>>> patch applied.
>>>>
>>>> -Jacek
>>>>
>>> Well, if IRQ 19 got disabled, that's your SATA controller, so resume
>>> likely isn't going to work. Could be a libata problem? CCing linux-ide.
>> Yep, I know, that's why I pointed that out. Irq was disabled somehow in suspend
>> or resume process.
>>
>>> BTW, if your BIOS offers an option to enable AHCI on your SATA
>>> controller then that would be a more optimal configuration (could get
>>> NCQ support), but that is an aside.
>> With AHCI I've got pretty bad timings (and I don't really know why!):
>>
>> [root|20:49|~]$ cat sda_ahci_t
>>
>> /dev/sda:
>>  Timing cached reads:   1560 MB in  2.00 seconds = 780.51 MB/sec
>>  Timing buffered disk reads:  102 MB in  3.02 seconds =  33.74 MB/sec
>> [root|20:49|~]$ cat sda_piix_t
>>
>> /dev/sda:
>>  Timing cached reads:   1544 MB in  2.00 seconds = 772.35 MB/sec
>>  Timing buffered disk reads:  134 MB in  3.04 seconds =  44.05 MB/sec
> 
> Here's the latest report (all on latest git):
> 	1) I've switched to AHCI mode and suspend/resume works OK (because SATA
> controller irq is not disabled).
> 
> 	2) now /proc/interrupts look like that:
>            CPU0       CPU1
>   0:     110708          0   IO-APIC-edge      timer
>   1:       4008          0   IO-APIC-edge      i8042
>   8:          3          0   IO-APIC-edge      rtc
>   9:      15091          0   IO-APIC-fasteoi   acpi
>  12:      77467          0   IO-APIC-edge      i8042
>  14:         44          0   IO-APIC-edge      ata_piix
>  15:          0          0   IO-APIC-edge      ata_piix
>  16:          0          0   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>  19:     100001          0   IO-APIC-fasteoi   uhci_hcd:usb5
>  21:        282          0   IO-APIC-fasteoi   HDA Intel
>  23:          1          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
> 219:        858          0   PCI-MSI-edge      iwl3945
> 220:       8616          0   PCI-MSI-edge      eth0
> 221:       6423          0   PCI-MSI-edge      ahci
> NMI:          0          0   Non-maskable interrupts
> LOC:      15777      64510   Local timer interrupts
> RES:       9045      24560   Rescheduling interrupts
> CAL:         30      28255   function call interrupts
> TLB:        341        145   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> SPU:          0          0   Spurious interrupts
> ERR:          0
> MIS:          0
> 
> 	3) The IRQ #19 remains disabled after resume and produce:
> irq 19: nobody cared (try booting with the "irqpoll" option)
> Pid: 13, comm: kacpi_notify Not tainted 2.6.26-rc1-07561-gafa26be-dirty #16
>  [<c013ea27>] __report_bad_irq+0x24/0x69
>  [<c013ea2e>] __report_bad_irq+0x2b/0x69
>  [<c013ec25>] note_interrupt+0x1b9/0x210
>  [<c013e36c>] handle_IRQ_event+0x1a/0x3f
>  [<c013f195>] handle_fasteoi_irq+0x84/0xa2
>  [<c0104fde>] do_IRQ+0x4f/0x65
>  [<c01034ff>] common_interrupt+0x23/0x28
>  [<c013007b>] timekeeping_resume+0x9b/0x127
>  [<c020b090>] acpi_os_read_port+0x29/0x44
>  [<c02177c9>] acpi_hw_register_read+0x61/0x119
>  [<c020f76e>] acpi_ev_fixed_event_detect+0x2a/0xa0
>  [<c021001a>] acpi_ev_sci_xrupt_handler+0x9/0x17
>  [<c020b053>] acpi_irq+0xb/0x1f
>  [<c013e36c>] handle_IRQ_event+0x1a/0x3f
>  [<c013f181>] handle_fasteoi_irq+0x70/0xa2
>  [<c0104fde>] do_IRQ+0x4f/0x65
>  [<c020b623>] acpi_os_execute_deferred+0x0/0x25
>  [<c01034ff>] common_interrupt+0x23/0x28
>  [<c020b623>] acpi_os_execute_deferred+0x0/0x25
>  [<c020b0b8>] acpi_os_write_port+0xd/0x2c
>  [<c020b640>] acpi_os_execute_deferred+0x1d/0x25
>  [<c01290fa>] run_workqueue+0x69/0xda
>  [<c0129221>] worker_thread+0xb6/0xc2
>  [<c012bca6>] autoremove_wake_function+0x0/0x2d
>  [<c012916b>] worker_thread+0x0/0xc2
>  [<c012ba42>] kthread+0x38/0x5d
>  [<c012ba0a>] kthread+0x0/0x5d
>  [<c010370f>] kernel_thread_helper+0x7/0x10
>  =======================
> handlers:
> [<c027d100>] (usb_hcd_irq+0x0/0x53)
> Disabling IRQ #19

Hmm, so either it's the SATA controller still generating that IRQ even 
when it's in AHCI mode, or else it's USB that's the real problem..

> 
> This might happen due to "ACPI: EC: GPE storm detected, disabling EC GPE", but
> here it should revert to polling mode (which is done during boot, but not during
> resume). I'm not expert here.

That does seem unusual, but it doesn't seem directly related (ACPI is on 
IRQ9).

> 
> Full dmesg here -> http://212.109.128.251/~difrost/linux-next/dmesg_ahci.log
> 
> -Jacek
> 
> PS: Site note: Why there's such big difference on hdparm timings with PATA and
> AHCI mode?

You can narrow that down by doing this with AHCI in use:

echo 1 > /sys/block/sda/device/queue_depth

which will disable NCQ but keep AHCI. If that brings the performance 
back up, then quite likely your drive's NCQ implementation isn't really 
optimized for sequential reads..

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Suspend to memory is freezing my machine
  2008-05-04 21:06               ` Robert Hancock
@ 2008-05-07 20:29                 ` Jacek Luczak
  0 siblings, 0 replies; 5+ messages in thread
From: Jacek Luczak @ 2008-05-07 20:29 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Rafael J. Wysocki, Zdenek Kabelac, Linux Kernel Mailing List, ide,
	linux-acpi

Robert Hancock pisze:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> With recent 2.6.25 & 2.6.26-rc1 git (around 1 week) I get
>>>>>>>>> occasionally
>>>>>>>>> complete freeze of my T61 during suspend. (dual core, 2GB).
>>>>>>>> How reproducible is this?
>>>>>>>>
>>>>>>>>> I'm running kernel with no_console_suspend - but all I can see is
>>>>>>>>> blinking cursor on an empty screen - thus even when I run
>>>>>>>>> kernel with
>>>>>>>>> most debug options turned on, I can't pass more details so far.  I
>>>>>>>>> run
>>>>>>>>> suspend with with SD card in - so maybe some update in the MMC
>>>>>>>>> driver
>>>>>>>>> might be responsible for this ?
>>>>>>>>>
>>>>>>>>> Also - I think that option no_console_suspend doens't work
>>>>>>>>> correctly -
>>>>>>>>> as many times with suspend I do not see any log message on my
>>>>>>>>> console
>>>>>>>>> screen. However sometimes the log is shown.
>>>>>>>> It would be helpful if you could verify if:
>>>>>>>>
>>>>>>>> (1) The problem occurs without no_console_suspend.
>>>>>>>> (2) The problem occurs without the SD card.
>>>>>>>>
>>>>>>> Hi Rafael,
>>>>>>>
>>>>>>> same problem here, although I was able to resume system (it's
>>>>>>> basically Intel
>>>>>>> machine) , but it was unusable - I was able to switch between
>>>>>>> terminals and see
>>>>>>> output from kernel. So there was:
>>>>>>>     - Disabling irq #19;
>>>>>>>     - some kind of lock spinning on disk:
>>>>>>> IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family)
>>>>>>> Serial ATA
>>>>>>> Storage
>>>>>>> Controller IDE (rev 02)
>>>>>>>     but I can't provide more output of that lock now - no sign in
>>>>>>> logs.
>>>>>>>
>>>>>>> I've made some successful suspend/resume all without sound card
>>>>>>> active
>>>>>>> without
>>>>>>> problem. Those appear with sound card active, but I must take closer
>>>>>>> look - will
>>>>>>> send info later.
>>>>>> Can you post your dmesg and /proc/interrupts output from normal
>>>>>> bootup ?
>>>>> Sure I can ;)
>>>>>
>>>>> 1) /proc/interrupts
>>>>>
>>>>>            CPU0       CPU1
>>>>>   0:   11846981          0   IO-APIC-edge      timer
>>>>>   1:      30098          0   IO-APIC-edge      i8042
>>>>>   8:          3          0   IO-APIC-edge      rtc
>>>>>   9:         13          0   IO-APIC-fasteoi   acpi
>>>>>  12:    1776540          0   IO-APIC-edge      i8042
>>>>>  14:         39          0   IO-APIC-edge      ata_piix
>>>>>  15:          0          0   IO-APIC-edge      ata_piix
>>>>>  16:      54570      44642   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>>>>>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>>>>>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>>>>>  19:      98243          0   IO-APIC-fasteoi   ata_piix, uhci_hcd:usb5
>>>>>  21:    1650574          0   IO-APIC-fasteoi   HDA Intel
>>>>>  23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1,
>>>>> uhci_hcd:usb2
>>>>> 220:      14263          0   PCI-MSI-edge      iwl3945
>>>>> 221:    1166041    1333296   PCI-MSI-edge      eth0
>>>>> NMI:          0          0   Non-maskable interrupts
>>>>> LOC:    1104887    7534969   Local timer interrupts
>>>>> RES:     633378     701351   Rescheduling interrupts
>>>>> CAL:         16      28315   function call interrupts
>>>>> TLB:       1721       2620   TLB shootdowns
>>>>> TRM:          0          0   Thermal event interrupts
>>>>> SPU:          0          0   Spurious interrupts
>>>>> ERR:          0
>>>>> MIS:          0
>>>>>
>>>>> 2) dmesg can here ->
>>>>> http://212.109.128.251/~difrost/linux-next/dmesg.log
>>>>> 3) Kernel:
>>>>> Linux difrost 2.6.25-07422-gb66e1f1-dirty #14 SMP Fri May 2 22:04:17
>>>>> CEST 2008
>>>>> i686 i686 i386 GNU/Linux
>>>>> It's marked dirty because due to http://lkml.org/lkml/2008/5/2/405
>>>>> patch applied.
>>>>>
>>>>> -Jacek
>>>>>
>>>> Well, if IRQ 19 got disabled, that's your SATA controller, so resume
>>>> likely isn't going to work. Could be a libata problem? CCing linux-ide.
>>> Yep, I know, that's why I pointed that out. Irq was disabled somehow
>>> in suspend
>>> or resume process.
>>>
>>>> BTW, if your BIOS offers an option to enable AHCI on your SATA
>>>> controller then that would be a more optimal configuration (could get
>>>> NCQ support), but that is an aside.
>>> With AHCI I've got pretty bad timings (and I don't really know why!):
>>>
>>> [root|20:49|~]$ cat sda_ahci_t
>>>
>>> /dev/sda:
>>>  Timing cached reads:   1560 MB in  2.00 seconds = 780.51 MB/sec
>>>  Timing buffered disk reads:  102 MB in  3.02 seconds =  33.74 MB/sec
>>> [root|20:49|~]$ cat sda_piix_t
>>>
>>> /dev/sda:
>>>  Timing cached reads:   1544 MB in  2.00 seconds = 772.35 MB/sec
>>>  Timing buffered disk reads:  134 MB in  3.04 seconds =  44.05 MB/sec
>>
>> Here's the latest report (all on latest git):
>>     1) I've switched to AHCI mode and suspend/resume works OK (because
>> SATA
>> controller irq is not disabled).
>>
>>     2) now /proc/interrupts look like that:
>>            CPU0       CPU1
>>   0:     110708          0   IO-APIC-edge      timer
>>   1:       4008          0   IO-APIC-edge      i8042
>>   8:          3          0   IO-APIC-edge      rtc
>>   9:      15091          0   IO-APIC-fasteoi   acpi
>>  12:      77467          0   IO-APIC-edge      i8042
>>  14:         44          0   IO-APIC-edge      ata_piix
>>  15:          0          0   IO-APIC-edge      ata_piix
>>  16:          0          0   IO-APIC-fasteoi   i915@pci:0000:00:02.0
>>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
>>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>>  19:     100001          0   IO-APIC-fasteoi   uhci_hcd:usb5
>>  21:        282          0   IO-APIC-fasteoi   HDA Intel
>>  23:          1          0   IO-APIC-fasteoi   ehci_hcd:usb1,
>> uhci_hcd:usb2
>> 219:        858          0   PCI-MSI-edge      iwl3945
>> 220:       8616          0   PCI-MSI-edge      eth0
>> 221:       6423          0   PCI-MSI-edge      ahci
>> NMI:          0          0   Non-maskable interrupts
>> LOC:      15777      64510   Local timer interrupts
>> RES:       9045      24560   Rescheduling interrupts
>> CAL:         30      28255   function call interrupts
>> TLB:        341        145   TLB shootdowns
>> TRM:          0          0   Thermal event interrupts
>> SPU:          0          0   Spurious interrupts
>> ERR:          0
>> MIS:          0
>>
>>     3) The IRQ #19 remains disabled after resume and produce:
>> irq 19: nobody cared (try booting with the "irqpoll" option)
>> Pid: 13, comm: kacpi_notify Not tainted
>> 2.6.26-rc1-07561-gafa26be-dirty #16
>>  [<c013ea27>] __report_bad_irq+0x24/0x69
>>  [<c013ea2e>] __report_bad_irq+0x2b/0x69
>>  [<c013ec25>] note_interrupt+0x1b9/0x210
>>  [<c013e36c>] handle_IRQ_event+0x1a/0x3f
>>  [<c013f195>] handle_fasteoi_irq+0x84/0xa2
>>  [<c0104fde>] do_IRQ+0x4f/0x65
>>  [<c01034ff>] common_interrupt+0x23/0x28
>>  [<c013007b>] timekeeping_resume+0x9b/0x127
>>  [<c020b090>] acpi_os_read_port+0x29/0x44
>>  [<c02177c9>] acpi_hw_register_read+0x61/0x119
>>  [<c020f76e>] acpi_ev_fixed_event_detect+0x2a/0xa0
>>  [<c021001a>] acpi_ev_sci_xrupt_handler+0x9/0x17
>>  [<c020b053>] acpi_irq+0xb/0x1f
>>  [<c013e36c>] handle_IRQ_event+0x1a/0x3f
>>  [<c013f181>] handle_fasteoi_irq+0x70/0xa2
>>  [<c0104fde>] do_IRQ+0x4f/0x65
>>  [<c020b623>] acpi_os_execute_deferred+0x0/0x25
>>  [<c01034ff>] common_interrupt+0x23/0x28
>>  [<c020b623>] acpi_os_execute_deferred+0x0/0x25
>>  [<c020b0b8>] acpi_os_write_port+0xd/0x2c
>>  [<c020b640>] acpi_os_execute_deferred+0x1d/0x25
>>  [<c01290fa>] run_workqueue+0x69/0xda
>>  [<c0129221>] worker_thread+0xb6/0xc2
>>  [<c012bca6>] autoremove_wake_function+0x0/0x2d
>>  [<c012916b>] worker_thread+0x0/0xc2
>>  [<c012ba42>] kthread+0x38/0x5d
>>  [<c012ba0a>] kthread+0x0/0x5d
>>  [<c010370f>] kernel_thread_helper+0x7/0x10
>>  =======================
>> handlers:
>> [<c027d100>] (usb_hcd_irq+0x0/0x53)
>> Disabling IRQ #19
> 
> Hmm, so either it's the SATA controller still generating that IRQ even
> when it's in AHCI mode, or else it's USB that's the real problem..

I thing your are right here. I've made some basic tests. Removed USB support
form kernel and then suspend/resume with new one. There were no disabled irq
printouts as the one above. But then I've started new kernel compilation (back
with USB support and CONFIG_USB_SUSPEND=y) and suddenly Irq 221 was disabled -
the ,,nobody cared'' error was printed out. Kernel was trying to reset it but it
stacked (I will try to catch those messages).

I will make some tests on second system (I wouldn't like to break primary, as
while kernel is trying to bring back disk to live, the xfs errors are printed -
mainly failed writes).

-Jacek

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-05-07 20:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <fa.4esVkXHbwOL+Pzatd+lYSGawfns@ifi.uio.no>
     [not found] ` <fa.zsdi5UBi6URzBaRSfFXIc6F7kHk@ifi.uio.no>
     [not found]   ` <fa.plXIUcQIe19WMbpH4aKYo6JX1RQ@ifi.uio.no>
     [not found]     ` <481DF438.30307@shaw.ca>
     [not found]       ` <481DF570.8050700@gmail.com>
2008-05-04 18:01         ` Suspend to memory is freezing my machine Robert Hancock
2008-05-04 18:52           ` Jacek Luczak
2008-05-04 19:44             ` Jacek Luczak
2008-05-04 21:06               ` Robert Hancock
2008-05-07 20:29                 ` Jacek Luczak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).