linux-ide.vger.kernel.org archive mirror
* RE: 2.6.16-rc5: known regressions
@ 2006-02-27  9:04 Yu, Luming
  2006-03-03  2:59 ` Sanjoy Mahajan
  2006-03-10  5:26 ` 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT] Sanjoy Mahajan
  0 siblings, 2 replies; 22+ messages in thread
From: Yu, Luming @ 2006-02-27  9:04 UTC
  To: linux-kernel, Linus Torvalds, Andrew Morton
  Cc: Tom Seeley, Dave Jones, Jiri Slaby, michael, mchehab,
	v4l-dvb-maintainer, video4linux-list, Brian Marete, Ryan Phillips,
	gregkh, linux-usb-devel, Sanjoy Mahajan, Brown, Len, linux-acpi,
	Mark Lord, Randy Dunlap, jgarzik, linux-ide, Duncan,
	Pavlik Vojtech, linux-input, Meelis Roos

>Subject    : S3 sleep hangs the second time - 600X
>References : http://bugzilla.kernel.org/show_bug.cgi?id=5989
>Submitter  : Sanjoy Mahajan <sanjoy@mrao.cam.ac.uk>
>Handled-By : Luming Yu <luming.yu@intel.com>
>Status     : is being debugged,
>             we might want to change the default back for 2.6.16:
>             http://lkml.org/lkml/2006/2/25/101
>

According to the bug report, the BIOS DSDT was modified.
I don't know how these changes affect the results
of suspend/resume, but it is clear this is NOT the right approach
to fix the problem. Hence, I need a test report with an
unmodified DSDT on the TP 600X, BIOS 1.11.

--Luming

* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-10  6:12 Yu, Luming
  2006-03-10  6:27 ` Sanjoy Mahajan
  0 siblings, 1 reply; 22+ messages in thread
From: Yu, Luming @ 2006-03-10  6:12 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos

>From: "Yu, Luming" <luming.yu@intel.com>
>> I suggest you retest and post dmesg with the UN-modified BIOS.
>
>I'm now running/testing an unmodified DSDT with 2.6.16-rc5.  For a while
>I had no S3 hangs, but I just noticed them again.  The error is the same
>as with the modified DSDT (with slightly different offsets):

I assume you have tested ec_intr=0 and ec_intr=1.

>
>exregion-0185 [36] ex_system_memory_space: system_memory 0 (32 width) Address=0000000023FDFFC0
>exregion-0185 [36] ex_system_memory_space: system_memory 1 (32 width) Address=0000000023FDFFC0
>exregion-0290 [36] ex_system_io_space_han: system_iO 1 (8 width) Address=00000000000000B2
>
>repeated endlessly.

I need a call trace for this.

>
>I think the problem resurfaced once I decided to let my sleep.sh script
>leave the thermal driver loaded before going into S3 (suspecting that
>the bug might come back if I did that).

Clearly, it's thermal related. We need to narrow it down here.

>
>So I suspect that my modified DSDT didn't cause the S3 problems, it
>merely exposed one even in the minimal configuration discussed in the
>#5989 report.

The ground rule is: don't use a modified DSDT.
If you do, the results can't be trusted.

>
>Which makes me wonder about another bug that disappeared when I switched
>to the vanilla DSDT: While printing (via gs+hpijs to an HP photosmart
>2710 via the wireless card), the system makes double-beeps as if it were
>having the AC adapter plugged and unplugged.  These noises happen when
>printing via the wireless card or via USB (to a different HP inkjet),

Interesting. Please open a separate bug for this.

>but not when printing via the parallel port to a Lexmark laserprinter
>(using just gs).  Since I didn't do anything to the battery code in the
>DSDT, I now wonder whether changing the DSDT merely exposed the issue
>but didn't create it.
>
>[From an earlier msg:]
>> I think the truth is, for 5989, we need to fix thermal and processor
>> driver issue.
>
>I agree, although I think the processor driver is not the culprit.  My
>earlier testing (with the modified DSDT) worked fine with the
>processor module loaded, but hung with processor + thermal loaded.
>

OK, we need to start with the thermal driver.

BTW, do you still think this is a regression?

Thanks,
Luming


* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-10  6:46 Yu, Luming
  2006-03-10 13:27 ` Sanjoy Mahajan
  2006-03-10 13:36 ` Sanjoy Mahajan
  0 siblings, 2 replies; 22+ messages in thread
From: Yu, Luming @ 2006-03-10  6:46 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos

>>> exregion-0290 [36] ex_system_io_space_han: system_iO 1 (8 width) Address=00000000000000B2
>>>
>>> repeated endlessly.
>
>> I need calltrace for this 
>
>Looking at /proc/acpi/debug_level, I see several debugging choices
>that might give the calltrace you want.  Let me know which ones are
>essential (I'd turn all of them on; however, I found when trying to
>track this down earlier that the bug would slither away if I had too
>much debugging turned on):

What do you mean by "slither away"?
Does the bug go away?

>
>ACPI_LV_DISPATCH	       0x00000100 [ ]
>ACPI_LV_EXEC		       0x00000200 [ ]
>ACPI_LV_NAMES		       0x00000400 [ ]
>ACPI_LV_FUNCTIONS	       0x00200000 [ ]
>
>By the way, a long standing buglet for me is that 'cat
>/proc/acpi/debug_level' truncates the output to 1024 bytes.  So I have
>to do 'cat /proc/acpi/debug_level | cat' so that the first cat doesn't
>find that its stdout is a tty and try to reduce its buffer size from
>4096 (big enough) to 1024.  A patch is available at
><http://bugzilla.kernel.org/show_bug.cgi?id=5076>

let's start from:

echo -n 0x10 > /proc/acpi/debug_layer
echo -n 0x10 > /proc/acpi/debug_level
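
For reference, the values written to these files are bit masks. The following is a minimal Python sketch, not taken from the thread, that decodes a mask using only the four level names and values quoted above. The helper name `decode_level` is hypothetical, and the real debug_level file lists more bits than are shown in this thread.

```python
# Bit values copied from the /proc/acpi/debug_level excerpt quoted above.
# Assumption: the kernel defines more levels; only these four appear here.
KNOWN_LEVELS = {
    "ACPI_LV_DISPATCH": 0x00000100,
    "ACPI_LV_EXEC": 0x00000200,
    "ACPI_LV_NAMES": 0x00000400,
    "ACPI_LV_FUNCTIONS": 0x00200000,
}


def decode_level(mask):
    """Return (names of known bits set in mask, bits not in KNOWN_LEVELS)."""
    names = [name for name, bit in KNOWN_LEVELS.items() if mask & bit]
    known = 0
    for bit in KNOWN_LEVELS.values():
        known |= bit
    return names, mask & ~known
```

A mask such as 0x10 falls outside the four quoted names, so the sketch reports it as an unknown leftover bit rather than guessing what it means.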

>
>> BTW, do you still think this is a regression?
>
>I'm 95% sure, because booting with ec_intr=0 avoids the problem, so
>the commit that made ec_intr=1 the default almost certainly also makes
>this bug appear.

why NOT 100% sure? :-)

* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-13  2:00 Yu, Luming
  2006-03-13  4:38 ` Sanjoy Mahajan
  0 siblings, 1 reply; 22+ messages in thread
From: Yu, Luming @ 2006-03-13  2:00 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos

>width) Address=0000000023FDFFC0
>exregion-0290 [36] ex_system_io_space_han: system_iO 1 (8 width) Address=00000000000000B2
>exregion-0185 [35] ex_system_memory_space: system_memory 0 (32 width) Address=0000000023FDFFC0
>exregion-0185 [36] ex_system_memory_space: system_memory 0 (32 width) Address=0000000023FDFFC0
>exregion-0185 [36] ex_system_memory_space: system_memory 1 (32 width) Address=0000000023FDFFC0
>exregion-0290 [36] ex_system_io_space_han: system_iO 1 (8 width) Address=00000000000000B2
>
>And then these above four lines (exregion-0185, -0185, -0185, -0290)
>repeat until I reboot.
>
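
The repeating trace quoted above can be reduced to a list of (address space, access flag, width, address) tuples, which makes the cycle easier to compare between runs. This is a minimal Python sketch under stated assumptions: the name `parse_accesses` is hypothetical, the pattern is inferred only from the lines in this thread, and reading the digit after the space name as 0 = read, 1 = write is an assumption about the ACPICA debug format.

```python
import re

# Pattern inferred from the trace lines in this thread (an assumption,
# not a documented format). The number in brackets is left uninterpreted.
LINE_RE = re.compile(
    r"exregion-(?P<src_line>\d+) \[(?P<tag>\d+)\] \w+: "
    r"(?P<space>\w+) (?P<function>\d) \((?P<width>\d+) width\) "
    r"Address=(?P<address>[0-9A-Fa-f]+)"
)


def parse_accesses(text):
    """Extract (space, function, width, address) from quoted, wrapped logs."""
    # Drop mail quote markers, then undo the mailer's line wrapping.
    unquoted = re.sub(r"(?m)^[> ]+", "", text)
    flat = " ".join(unquoted.split())
    return [
        (m["space"], int(m["function"]), int(m["width"]), int(m["address"], 16))
        for m in LINE_RE.finditer(flat)
    ]
```

Applied to the trace above, every I/O-space entry targets port 0xB2, which is conventionally the APM/SMI command port on PC chipsets. Whether this DSDT maps its APMD field to that port is not stated in the thread.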

If I understand correctly, the loop is due to LEqual (S_AH, 0xA6) always
being true.
Did the SMM BIOS code fail to respond, or fail to respond correctly,
to the request made by "Store (0x81, APMD)" because of an issue caused
by the thermal module?
I need the ACPI trace log before _PTS to see which thermal-related
methods got called.

    Method (SMPI, 1, NotSerialized)
    {
        Store (S_AX, Local0)
        Store (0x81, APMD)
        While (LEqual (S_AH, 0xA6))
        {
            Sleep (0x64)
            Store (Local0, S_AX)
            Store (0x81, APMD)
        }
    }
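
To make the failure mode concrete, here is a minimal Python model of the SMPI control flow. It is a sketch, not the BIOS behavior: `FakeFirmware`, its `stuck_for` knob, and the `max_retries` escape are all hypothetical, and the thread does not say what the 0xA6 status means. The model only shows that the loop ends when S_AH changes and otherwise never exits.

```python
class FakeFirmware:
    """Stand-in for the SMM BIOS; 'stuck_for' is a made-up test knob."""

    def __init__(self, stuck_for):
        self.stuck_for = stuck_for  # requests answered with S_AH == 0xA6
        self.s_ah = 0x00
        self.requests = 0

    def store_apmd(self, value):
        # Models Store (0x81, APMD): the request to the SMM BIOS.
        self.requests += 1
        self.s_ah = 0xA6 if self.requests <= self.stuck_for else 0x00


def smpi(fw, max_retries):
    """Mirror SMPI's loop, but give up after max_retries iterations."""
    fw.store_apmd(0x81)
    retries = 0
    while fw.s_ah == 0xA6:
        if retries == max_retries:
            return None  # the ASL original has no such exit
        retries += 1
        # Sleep (0x64) and the S_AX restore are omitted in this model.
        fw.store_apmd(0x81)
    return retries
```

With a firmware that answers 0xA6 forever, only the added `max_retries` check ends the call. The ASL method has no such bound, which is consistent with the trace repeating until reboot.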

* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-13  4:51 Yu, Luming
  2006-03-13  7:28 ` Sanjoy Mahajan
  0 siblings, 1 reply; 22+ messages in thread
From: Yu, Luming @ 2006-03-13  4:51 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos

>> I need the acpi trace log before _PTS to see what kind of thermal
>> related methods got called.
>
>Alas, I've included all the dmesg's.  

I need the full log for the S3 suspend failure, not just snippets.
Please attach it at bugzilla.kernel.org.

The log for a successful S3 suspend cannot help me track this down.


>
>Below is the script that I use to enter S3 sleep.  It gets rid of
>troublesome modules and stops services that don't sleep well.  Then
>(for debugging) it sends the kernel version and boot parameters across
>the serial console (the @@@@ SLEEP line), raises the debug level to
>0x1F, does a sync (in case the sleep hangs, since this is my
>production machine), and then enters mem sleep.
>
>So nothing in it should trigger any thermal methods; except that I
>usually have the THM2 trip point raised to 45C with a polling time of
>100 seconds.  So once in a while a thermal poll will happen while sleep
>is being set up.  I am not sure whether it would be reported in the
>dmesgs if it happened; but the S3 failure happens much more often than
>such a thermal polling would happen, so I doubt the S3 failure
>requires a thermal poll.

Could you try to mute thermal poll?

* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-13  8:35 Yu, Luming
  2006-03-13 15:21 ` Sanjoy Mahajan
  0 siblings, 1 reply; 22+ messages in thread
From: Yu, Luming @ 2006-03-13  8:35 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos


Thanks for your debug information.

>
>> Could you try to mute thermal poll?
>
>Done.  The sleep.sh script now has
>
>echo 0 > /proc/acpi/thermal_zone/THM2/polling_frequency
>echo 0 > /proc/acpi/thermal_zone/THM0/polling_frequency
>sleep 1
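
The same muting step can be applied to every zone rather than two named ones. This is a minimal Python sketch, assuming the /proc/acpi/thermal_zone layout used in this thread, which may not exist on other kernels. The function name `mute_thermal_polling` and its `root` parameter are hypothetical; `root` exists so the function can be tried on a scratch directory.

```python
from pathlib import Path


def mute_thermal_polling(root="/proc/acpi/thermal_zone"):
    """Write 0 to each zone's polling_frequency; return the zones touched."""
    base = Path(root)
    if not base.is_dir():
        return []  # interface absent on this kernel
    muted = []
    for zone in sorted(base.iterdir()):
        knob = zone / "polling_frequency"
        if knob.is_file():
            knob.write_text("0\n")  # same effect as: echo 0 > .../polling_frequency
            muted.append(zone.name)
    return muted
```

Writing to the real /proc files requires root, and the one-second pause that follows in sleep.sh is not reproduced here.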

Hmm, could you file dmesgs with the thermal module loaded and
unloaded?

>
>> I need the full log  for S3 suspend failure not just snippets.
>> Please attach it on bugzilla.kernel.org
>
>Done.

I saw that you used acpi_debug=0xffffffff.

I usually use acpi_debug_layer=0x10 acpi_debug_level=0x10.
Could you try that?


* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-03-14  1:48 Yu, Luming
  0 siblings, 0 replies; 22+ messages in thread
From: Yu, Luming @ 2006-03-14  1:48 UTC
  To: Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos

>> Hmm, could you file dmesgs with thermal module loaded and unloaded?
>
>Filed at bugzilla.

Excellent!

>Let me know if there's a different permutation of debug options that I
>should try.  I wasn't sure whether you meant that I should leave all
>the debug values at 0x10.  Or whether I should still include
>acpi_debug=0xffffffff on top of the other options.

So far, it's OK. I saw the method calls listed below. Could you bisect
to find out which methods or which thermal zone cause the trouble?
To do that, you have to hack thermal.c by commenting out
some of the calls that evaluate the methods below.
I hope it is easy for you!	 :-)

Thanks,
Luming

Execute Method: [\_TZ_.THM0._TMP] (Node c157bf88)
Execute Method: [\_TZ_.THM0._PSV] (Node c157be48)
Execute Method: [\_TZ_.THM0._TC1] (Node c157bdc8)
Execute Method: [\_TZ_.THM0._TC2] (Node c157bd88)
Execute Method: [\_TZ_.THM0._TSP] (Node c157bd48)
Execute Method: [\_TZ_.THM0._AC0] (Node c157bf48)
Execute Method: [\_TZ_.THM0._SCP] (Node c157bec8)
Execute Method: [\_TZ_.THM0._TMP] (Node c157bf88)
ACPI: Thermal Zone [THM0] (47 C)
Execute Method: [\_TZ_.THM2._TMP] (Node c157bb88)
Execute Method: [\_TZ_.THM2._AC0] (Node c157bb48)
Execute Method: [\_TZ_.THM2._SCP] (Node c157bac8)
Execute Method: [\_TZ_.THM2._TMP] (Node c157bb88)
Execute Method: [\_TZ_.PFN0._ON_] (Node c157a2c8)
Execute Method: [\_TZ_.PFN0._STA] (Node c157a308)
ACPI: Thermal Zone [THM2] (40 C)
Execute Method: [\_TZ_.THM6._TMP] (Node c157b948)
Execute Method: [\_TZ_.THM6._AC0] (Node c157b908)
Execute Method: [\_TZ_.THM6._SCP] (Node c157b888)
Execute Method: [\_TZ_.THM6._TMP] (Node c157b948)
ACPI: Thermal Zone [THM6] (30 C)
Execute Method: [\_TZ_.THM7._TMP] (Node c157b708)
Execute Method: [\_TZ_.THM7._AC0] (Node c157b6c8)
Execute Method: [\_TZ_.THM7._SCP] (Node c157b648)
Execute Method: [\_TZ_.THM7._TMP] (Node c157b708)
ACPI: Thermal Zone [THM7] (33 C)
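
To plan the bisection, the log above can be grouped by object and split in half at each step. This is a minimal Python sketch with hypothetical helper names (`methods_by_object`, `split_for_bisection`). It only organizes the candidates and does not touch thermal.c. Note that PFN0 appears under \_TZ_ but is not reported as a thermal zone in this log.

```python
import re

# Matches lines such as: Execute Method: [\_TZ_.THM0._TMP] (Node c157bf88)
METHOD_RE = re.compile(r"Execute Method: \[\\_TZ_\.(\w+)\.(\w+)\]")


def methods_by_object(log_text):
    """Map each object under \\_TZ_ to its distinct methods, in log order."""
    objects = {}
    for obj, method in METHOD_RE.findall(log_text):
        methods = objects.setdefault(obj, [])
        if method not in methods:
            methods.append(method)
    return objects


def split_for_bisection(items):
    """Split candidates into two halves; disable one half per test run."""
    mid = (len(items) + 1) // 2
    return items[:mid], items[mid:]
```

One possible order is to bisect over the objects first (THM0, THM2, PFN0, THM6, THM7) and then over the methods of whichever object still triggers the hang.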


* RE: 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT]
@ 2006-05-23 13:29 Yu, Luming
  0 siblings, 0 replies; 22+ messages in thread
From: Yu, Luming @ 2006-05-23 13:29 UTC
  To: trenn, Sanjoy Mahajan
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Tom Seeley,
	Dave Jones, Jiri Slaby, michael, mchehab, v4l-dvb-maintainer,
	video4linux-list, Brian Marete, Ryan Phillips, gregkh,
	linux-usb-devel, Brown, Len, linux-acpi, Mark Lord, Randy Dunlap,
	jgarzik, linux-ide, Duncan, Pavlik Vojtech, linux-input,
	Meelis Roos, Carl-Daniel Hailfinger


>> exregion-0185 [36] ex_system_memory_space: system_memory 0 (32 width) Address=0000000023FDFFC0
>> exregion-0185 [36] ex_system_memory_space: system_memory 1 (32 width) Address=0000000023FDFFC0
>> exregion-0290 [36] ex_system_io_space_han: system_iO 1 (8 width) Address=00000000000000B2
>> 
>> repeated endlessly.

Hmm, interesting.  This looks like the same error as on the TP 600X.

>
>This sounds like the problem Daniel had on his Samsung P35 recently.
>He could fix it by getting rid of some asus_unhide_smbus stuff or the
>other way around, adding asus_unhide_smbus quirks in the S3 resume code.
>
>This thread was recently posted on lkml:
>Re: [patch] smbus unhiding kills thermal management
>
>Here are some more details, for me that sounds related...:
>https://bugzilla.novell.com/show_bug.cgi?id=173420
>

But this Samsung P35 doesn't have _GLK. So I think the TP 600X has
a different problem from the Samsung P35.

Actually, Sanjoy has a workaround that solves the TP 600X S3 issue.
What we need to do is come up with a clean patch.
It is on the to-do list.

Thanks,
Luming


Thread overview: 22+ messages
2006-02-27  9:04 2.6.16-rc5: known regressions Yu, Luming
2006-03-03  2:59 ` Sanjoy Mahajan
2006-03-03 16:51   ` Matthew Garrett
2006-03-03 21:04     ` Sanjoy Mahajan
2006-03-10  5:26 ` 2.6.16-rc5: known regressions [TP 600X S3, vanilla DSDT] Sanjoy Mahajan
2006-05-19 13:44   ` Thomas Renninger
2006-05-21  0:12     ` Sanjoy Mahajan
2006-05-21  0:40       ` Carl-Daniel Hailfinger
2006-05-22  9:55       ` Pavel Machek
  -- strict thread matches above, loose matches on Subject: below --
2006-03-10  6:12 Yu, Luming
2006-03-10  6:27 ` Sanjoy Mahajan
2006-03-10  6:46 Yu, Luming
2006-03-10 13:27 ` Sanjoy Mahajan
2006-03-10 13:36 ` Sanjoy Mahajan
2006-03-13  2:00 Yu, Luming
2006-03-13  4:38 ` Sanjoy Mahajan
2006-03-13  4:51 Yu, Luming
2006-03-13  7:28 ` Sanjoy Mahajan
2006-03-13  8:35 Yu, Luming
2006-03-13 15:21 ` Sanjoy Mahajan
2006-03-14  1:48 Yu, Luming
2006-05-23 13:29 Yu, Luming
