linux-ide.vger.kernel.org archive mirror
* ata_piix resume from S3 on T43P failed
@ 2006-05-11  8:05 zhao, forrest
  2006-05-11  8:31 ` Tejun Heo
  0 siblings, 1 reply; 17+ messages in thread
From: zhao, forrest @ 2006-05-11  8:05 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

Hi, Tejun

We just tested your git tree on a ThinkPad T43P laptop and found that
after ata_piix resumes from S3, the SATA disk can no longer be read or
written. According to our test results with kernel 2.6.16-rc6 on the
T43P, however, ata_piix resumes from S3 successfully.

We know that this problem may not be related to your patches, but I
think you know the libata development status very well, so maybe you can
give us some clue about what happened to ata_piix between 2.6.16-rc6 and
your git tree.

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  8:05 ata_piix resume from S3 on T43P failed zhao, forrest
@ 2006-05-11  8:31 ` Tejun Heo
  2006-05-11  8:35   ` Tejun Heo
  2006-05-13  4:19   ` Jeff Garzik
  0 siblings, 2 replies; 17+ messages in thread
From: Tejun Heo @ 2006-05-11  8:31 UTC (permalink / raw)
  To: zhao, forrest; +Cc: linux-ide

zhao, forrest wrote:
> Hi, Tejun
> 
> We just tested your git tree on a ThinkPad T43P laptop and found that
> after ata_piix resumes from S3, the SATA disk can no longer be read or
> written. According to our test results with kernel 2.6.16-rc6 on the
> T43P, however, ata_piix resumes from S3 successfully.
> 
> We know that this problem may not be related to your patches, but I
> think you know the libata development status very well, so maybe you can
> give us some clue about what happened to ata_piix between 2.6.16-rc6 and
> your git tree.
> 

Hello, Zhao.

I haven't really followed the AHCI suspend/resume stuff, but AFAICT it
never made it to #upstream.  I don't know whether it was included in
-rc# or not, but it sounds like it was.

One thing to note about suspend/resume is that they should be handled from
EH.  IIRC, they weren't synchronized properly with the rest of libata.
Maybe it can be another ATA_EH action or maybe it needs separate
handling, but at any rate it should be handled as part of EH to be
synchronized properly.  I'm planning to work on suspend/resume once the
currently pending changes settle down.  I thought about including it
in this round, but the changes were HUGE as they were, so I decided to
defer it.

-- 
tejun

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  8:31 ` Tejun Heo
@ 2006-05-11  8:35   ` Tejun Heo
  2006-05-11  9:46     ` zhao, forrest
  2006-05-13  4:19   ` Jeff Garzik
  1 sibling, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2006-05-11  8:35 UTC (permalink / raw)
  To: Tejun Heo; +Cc: zhao, forrest, linux-ide

Tejun Heo wrote:
> zhao, forrest wrote:
>> Hi, Tejun
>>
>> We just tested your git tree on a ThinkPad T43P laptop and found that
>> after ata_piix resumes from S3, the SATA disk can no longer be read or
>> written. According to our test results with kernel 2.6.16-rc6 on the
>> T43P, however, ata_piix resumes from S3 successfully.
>>
>> We know that this problem may not be related to your patches, but I
>> think you know the libata development status very well, so maybe you can
>> give us some clue about what happened to ata_piix between 2.6.16-rc6 and
>> your git tree.
>>
> 
> Hello, Zhao.
> 
> I haven't really followed the AHCI suspend/resume stuff, but AFAICT it
> never made it to #upstream.  I don't know whether it was included in
> -rc# or not, but it sounds like it was.
> 
> One thing to note about suspend/resume is that they should be handled from
> EH.  IIRC, they weren't synchronized properly with the rest of libata.
> Maybe it can be another ATA_EH action or maybe it needs separate
> handling, but at any rate it should be handled as part of EH to be
> synchronized properly.  I'm planning to work on suspend/resume once the
> currently pending changes settle down.  I thought about including it
> in this round, but the changes were HUGE as they were, so I decided to
> defer it.
> 

Oops, you were talking about ata_piix and I answered about ahci.  Sorry
about that.  :(

Can you please post dmesg w/ ATA_DEBUG turned on?  I might have screwed
up while updating suspend/resume functions.

-- 
tejun

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  8:35   ` Tejun Heo
@ 2006-05-11  9:46     ` zhao, forrest
  2006-05-11 10:39       ` Tejun Heo
  2006-05-11 10:55       ` Jens Axboe
  0 siblings, 2 replies; 17+ messages in thread
From: zhao, forrest @ 2006-05-11  9:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

On Thu, 2006-05-11 at 17:35 +0900, Tejun Heo wrote:
> Tejun Heo wrote:
> Oops, you were talking about ata_piix and I answered about ahci.  Sorry
> about that.  :(
> 
> Can you please post dmesg w/ ATA_DEBUG turned on?  I might have screwed
> up while updating suspend/resume functions.
> 

Tejun,

The case is very tricky. After ata_piix resumes from S3, dmesg can't be
run; when I type "dmesg" at the console, bash reports:
-bash: /bin/dmesg: Input/output error

Is there any other way to collect the log?

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  9:46     ` zhao, forrest
@ 2006-05-11 10:39       ` Tejun Heo
  2006-05-12  5:02         ` zhao, forrest
  2006-05-11 10:55       ` Jens Axboe
  1 sibling, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2006-05-11 10:39 UTC (permalink / raw)
  To: zhao, forrest; +Cc: linux-ide

zhao, forrest wrote:
> On Thu, 2006-05-11 at 17:35 +0900, Tejun Heo wrote:
>> Tejun Heo wrote:
>> Oops, you were talking about ata_piix and I answered about ahci.  Sorry
>> about that.  :(
>>
>> Can you please post dmesg w/ ATA_DEBUG turned on?  I might have screwed
>> up while updating suspend/resume functions.
>>
> 
> Tejun,
> 
> The case is very tricky. After ata_piix resumes from S3, dmesg can't be
> run; when I type "dmesg" at the console, bash reports:
> -bash: /bin/dmesg: Input/output error
> 
> Is there any other way to collect the log?
> 

Ahh.... right.  Unfortunately, I don't have much experience with
suspending/resuming and don't really know what to do.  Can you try the
current #upstream and see how that works?  So that we can see whether
the EH changes caused the problem?

-- 
tejun

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  9:46     ` zhao, forrest
  2006-05-11 10:39       ` Tejun Heo
@ 2006-05-11 10:55       ` Jens Axboe
  2006-05-12  5:51         ` zhao, forrest
  1 sibling, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2006-05-11 10:55 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Tejun Heo, linux-ide

On Thu, May 11 2006, zhao, forrest wrote:
> On Thu, 2006-05-11 at 17:35 +0900, Tejun Heo wrote:
> > Tejun Heo wrote:
> > Oops, you were talking about ata_piix and I answered about ahci.  Sorry
> > about that.  :(
> > 
> > Can you please post dmesg w/ ATA_DEBUG turned on?  I might have screwed
> > up while updating suspend/resume functions.
> > 
> 
> Tejun,
> 
> The case is very tricky. After ata_piix resumes from S3, dmesg can't be
> run; when I type "dmesg" at the console, bash reports:
> -bash: /bin/dmesg: Input/output error
> 
> Is there any other way to collect the log?

Run dmesg prior to suspending; the binary will then be cached and can be
run after resume even if the disk doesn't want to talk to you.

-- 
Jens Axboe


* Re: ata_piix resume from S3 on T43P failed
  2006-05-11 10:39       ` Tejun Heo
@ 2006-05-12  5:02         ` zhao, forrest
  0 siblings, 0 replies; 17+ messages in thread
From: zhao, forrest @ 2006-05-12  5:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

On Thu, 2006-05-11 at 19:39 +0900, Tejun Heo wrote:
> Can you try the
> current #upstream and see how that works?  So that we can see whether
> the EH changes caused the problem?
I tried #upstream; its ata_piix can't resume from S3 on the T43P either.
So the code changes between 2.6.16-rc6 and #upstream caused the problem.
I'll post the dmesg output in a following mail.

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11 10:55       ` Jens Axboe
@ 2006-05-12  5:51         ` zhao, forrest
  2006-05-12 10:17           ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: zhao, forrest @ 2006-05-12  5:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, linux-ide

On Thu, 2006-05-11 at 12:55 +0200, Jens Axboe wrote:
> run dmesg prior to suspending, then it'll be cached and can be run after
> resume even if the disk doesn't want to talk to you.
> 

I don't know why the mails with attached files did not reach the mailing
list, so I put the files at
http://www.infradead.org/~forrest/suspend-resume-dmesg/

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-12  5:51         ` zhao, forrest
@ 2006-05-12 10:17           ` Jens Axboe
  2006-05-12 10:56             ` Tejun Heo
  2006-05-16  3:56             ` zhao, forrest
  0 siblings, 2 replies; 17+ messages in thread
From: Jens Axboe @ 2006-05-12 10:17 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Tejun Heo, linux-ide

On Fri, May 12 2006, zhao, forrest wrote:
> On Thu, 2006-05-11 at 12:55 +0200, Jens Axboe wrote:
> > run dmesg prior to suspending, then it'll be cached and can be run after
> > resume even if the disk doesn't want to talk to you.
> > 
> 
> Don't know why the mails with attached files were not sent to mailing
> list, so I put them at
> http://www.infradead.org/~forrest/suspend-resume-dmesg/

The key is the 0xef timeout; after that the device is offlined and you
see a lot of I/O errors as a result.

Try this:

diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index bd14720..f120839 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4288,6 +4288,7 @@ int ata_device_resume(struct ata_port *a
 {
 	if (ap->flags & ATA_FLAG_SUSPENDED) {
 		ap->flags &= ~ATA_FLAG_SUSPENDED;
+		mdelay(2000);
 		ata_set_mode(ap);
 	}
 	if (!ata_dev_present(dev))

-- 
Jens Axboe


* Re: ata_piix resume from S3 on T43P failed
  2006-05-12 10:17           ` Jens Axboe
@ 2006-05-12 10:56             ` Tejun Heo
  2006-05-16  3:56             ` zhao, forrest
  1 sibling, 0 replies; 17+ messages in thread
From: Tejun Heo @ 2006-05-12 10:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: zhao, forrest, linux-ide

Hello, Zhao, Jens.

Jens Axboe wrote:
> On Fri, May 12 2006, zhao, forrest wrote:
>> On Thu, 2006-05-11 at 12:55 +0200, Jens Axboe wrote:
>>> run dmesg prior to suspending, then it'll be cached and can be run after
>>> resume even if the disk doesn't want to talk to you.
>>>
>> Don't know why the mails with attached files were not sent to mailing
>> list, so I put them at
>> http://www.infradead.org/~forrest/suspend-resume-dmesg/
> 
> The key is the 0xef timeout, then the device is offlined and you see a
> lot of io errors due to that.
> 
> Try this:
> 
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index bd14720..f120839 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4288,6 +4288,7 @@ int ata_device_resume(struct ata_port *a
>  {
>  	if (ap->flags & ATA_FLAG_SUSPENDED) {
>  		ap->flags &= ~ATA_FLAG_SUSPENDED;
> +		mdelay(2000);
>  		ata_set_mode(ap);
>  	}
>  	if (!ata_dev_present(dev))
> 

If Jens' suggestion doesn't work, can you post the same log with the 
working kernel?  In #upstream, that part of code hasn't changed much, so 
I'm curious what makes the other kernel work.

-- 
tejun

* Re: ata_piix resume from S3 on T43P failed
  2006-05-11  8:31 ` Tejun Heo
  2006-05-11  8:35   ` Tejun Heo
@ 2006-05-13  4:19   ` Jeff Garzik
  2006-05-16  1:58     ` zhao, forrest
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Garzik @ 2006-05-13  4:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: zhao, forrest, linux-ide, Hannes Reinecke

Tejun Heo wrote:
> zhao, forrest wrote:
>> Hi, Tejun
>>
>> We just tested your git tree on a ThinkPad T43P laptop and found that
>> after ata_piix resumes from S3, the SATA disk can no longer be read or
>> written. According to our test results with kernel 2.6.16-rc6 on the
>> T43P, however, ata_piix resumes from S3 successfully.
>>
>> We know that this problem may not be related to your patches, but I
>> think you know the libata development status very well, so maybe you can
>> give us some clue about what happened to ata_piix between 2.6.16-rc6 and
>> your git tree.
>>
> 
> Hello, Zhao.
> 
> I haven't really followed the AHCI suspend/resume stuff, but AFAICT it
> never made it to #upstream.  I don't know whether it was included in
> -rc# or not, but it sounds like it was.

Hannes @ SuSE posted an AHCI suspend/resume patch that needs to go in.
Unfortunately AHCI has a high rate of churn, so it's still in my
'Pending' folder :/

	Jeff



* Re: ata_piix resume from S3 on T43P failed
  2006-05-13  4:19   ` Jeff Garzik
@ 2006-05-16  1:58     ` zhao, forrest
  0 siblings, 0 replies; 17+ messages in thread
From: zhao, forrest @ 2006-05-16  1:58 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Tejun Heo, linux-ide, Hannes Reinecke

On Sat, 2006-05-13 at 00:19 -0400, Jeff Garzik wrote:
> Hannes @ SuSE posted an AHCI suspend/resume patch that needs to go in.
> Unfortunately AHCI has a high rate of churn, so it's still in my
> 'Pending' folder :/
> 
> 	Jeff

Jeff,

According to our tests, Hannes's patch doesn't work on our NAPA SDV; I
reported the problem on April 27. The mail archive is at:
http://marc.theaimsgroup.com/?l=linux-ide&m=114612987014369&w=2

I also ported a patch from openSUSE to #upstream and posted it on
April 30; the mail archive is at:
http://marc.theaimsgroup.com/?l=linux-ide&m=114637600401411&w=2

Will this ported patch go on your pending list?

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-12 10:17           ` Jens Axboe
  2006-05-12 10:56             ` Tejun Heo
@ 2006-05-16  3:56             ` zhao, forrest
  2006-05-17 11:03               ` Jens Axboe
  1 sibling, 1 reply; 17+ messages in thread
From: zhao, forrest @ 2006-05-16  3:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, linux-ide

On Fri, 2006-05-12 at 12:17 +0200, Jens Axboe wrote:
> The key is the 0xef timeout, then the device is offlined and you see a
> lot of io errors due to that.
> 
> Try this:
> 
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index bd14720..f120839 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4288,6 +4288,7 @@ int ata_device_resume(struct ata_port *a
>  {
>  	if (ap->flags & ATA_FLAG_SUSPENDED) {
>  		ap->flags &= ~ATA_FLAG_SUSPENDED;
> +		mdelay(2000);
>  		ata_set_mode(ap);
>  	}
>  	if (!ata_dev_present(dev))
> 

Jens,

Yes! The patch works. But I'm wondering why the ata_piix driver in kernel
2.6.16-rc6 works without the mdelay(2000) in ata_device_resume().

Thanks,
Forrest

* Re: ata_piix resume from S3 on T43P failed
  2006-05-16  3:56             ` zhao, forrest
@ 2006-05-17 11:03               ` Jens Axboe
  2006-05-17 12:56                 ` Jeff Garzik
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2006-05-17 11:03 UTC (permalink / raw)
  To: zhao, forrest; +Cc: Tejun Heo, linux-ide

On Tue, May 16 2006, zhao, forrest wrote:
> On Fri, 2006-05-12 at 12:17 +0200, Jens Axboe wrote:
> > The key is the 0xef timeout, then the device is offlined and you see a
> > lot of io errors due to that.
> > 
> > Try this:
> > 
> > diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> > index bd14720..f120839 100644
> > --- a/drivers/scsi/libata-core.c
> > +++ b/drivers/scsi/libata-core.c
> > @@ -4288,6 +4288,7 @@ int ata_device_resume(struct ata_port *a
> >  {
> >  	if (ap->flags & ATA_FLAG_SUSPENDED) {
> >  		ap->flags &= ~ATA_FLAG_SUSPENDED;
> > +		mdelay(2000);
> >  		ata_set_mode(ap);
> >  	}
> >  	if (!ata_dev_present(dev))
> > 
> 
> Jens,
> 
> Yes! The patch works. But I'm wondering why ata_piix driver in kernel
> 2.6.16-rc6 works without mdelay(2000); in ata_device_resume()?

I think Hugh traced it down to an unrelated timer change. The above
really wants to wait for BUSY clear; perhaps the best solution would be
to give piix its own ata_piix_device_resume() that first waits
for BUSY clear, then calls ata_device_resume().
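
A minimal userspace sketch of that ordering, assuming a simulated status
register instead of real hardware. Every sim_* name below is a
hypothetical stand-in, not a libata function, and the poll loop only
models the idea of "wait for BSY to clear, then run the generic resume
step"; a real driver would sleep between reads and use a time-based
timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_ATA_BUSY 0x80u /* BSY is bit 7 of the ATA status register */

/* Hypothetical stand-in for a port; the real struct ata_port is far larger. */
struct sim_port {
    unsigned polls_until_ready; /* simulated hardware: BSY clears after this many reads */
    bool suspended;
    bool mode_set;
};

/* Simulated status read: BSY stays set until the countdown reaches zero. */
static uint8_t sim_read_status(struct sim_port *ap)
{
    if (ap->polls_until_ready > 0) {
        ap->polls_until_ready--;
        return SIM_ATA_BUSY;
    }
    return 0x50; /* DRDY and DSC set, BSY clear */
}

/* Poll until BSY clears or max_polls reads have been spent. */
static int sim_wait_bsy_clear(struct sim_port *ap, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (!(sim_read_status(ap) & SIM_ATA_BUSY))
            return 0;
    }
    return -1; /* device never became ready */
}

/* Stand-in for the generic resume step (the place where mode setting would run). */
static int sim_generic_resume(struct sim_port *ap)
{
    assert(ap != NULL);
    if (ap->suspended) {
        ap->suspended = false;
        ap->mode_set = true;
    }
    return 0;
}

/* Driver-specific resume: make sure the hardware is ready first, then go generic. */
static int sim_piix_resume(struct sim_port *ap, unsigned max_polls)
{
    if (sim_wait_bsy_clear(ap, max_polls) != 0)
        return -1;
    return sim_generic_resume(ap);
}
```

Compared with a fixed mdelay(2000), polling returns as soon as the
device reports ready and fails explicitly if it never does.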

-- 
Jens Axboe


* Re: ata_piix resume from S3 on T43P failed
  2006-05-17 11:03               ` Jens Axboe
@ 2006-05-17 12:56                 ` Jeff Garzik
  2006-05-17 13:02                   ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Garzik @ 2006-05-17 12:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide

Jens Axboe wrote:
> I think Hugh traced it down to an unrelated timer change. The above
> really wants to wait for BUSY clear; perhaps the best solution would be
> to give piix its own ata_piix_device_resume() that first waits
> for BUSY clear, then calls ata_device_resume().

Close...  all devices should wait for libata to signal that the bus is 
ready to be talked to.  For some that's waiting for BSY to clear, for 
others that's checking the SATA bus.
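
One way to picture this is a per-driver readiness hook that the resume
path calls before doing anything else. The sketch below is a userspace
model under that assumption; the demo_* names and the wait_ready
callback are hypothetical and do not correspond to an existing libata
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_port;

/* Hypothetical per-driver hook: returns 0 once the hardware can be talked to. */
typedef int (*demo_ready_fn)(struct demo_port *ap);

struct demo_port {
    demo_ready_fn wait_ready;
    bool bsy;       /* simulated BSY bit, for controllers that expose a status register */
    bool link_up;   /* simulated SATA link state, for controllers that check the bus */
    bool suspended;
};

/* Readiness check based on BSY being clear. */
static int demo_ready_bsy(struct demo_port *ap)
{
    return ap->bsy ? -1 : 0;
}

/* Readiness check based on the SATA link being up. */
static int demo_ready_link(struct demo_port *ap)
{
    return ap->link_up ? 0 : -1;
}

/* Generic resume: ask the driver whether the bus is ready, then proceed. */
static int demo_resume(struct demo_port *ap)
{
    assert(ap != NULL && ap->wait_ready != NULL);
    int rc = ap->wait_ready(ap);
    if (rc != 0)
        return rc; /* hardware not ready; leave the port suspended */
    ap->suspended = false;
    return 0;
}
```

The generic code stays identical for every controller; only the hook
differs.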

	Jeff



* Re: ata_piix resume from S3 on T43P failed
  2006-05-17 12:56                 ` Jeff Garzik
@ 2006-05-17 13:02                   ` Jens Axboe
  2006-05-22  7:32                     ` Jeff Garzik
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2006-05-17 13:02 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: zhao, forrest, Tejun Heo, linux-ide

On Wed, May 17 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >I think Hugh traced it down to an unrelated timer change. The above
> >really wants to wait for BUSY clear; perhaps the best solution would be
> >to give piix its own ata_piix_device_resume() that first waits
> >for BUSY clear, then calls ata_device_resume().
> 
> Close...  all devices should wait for libata to signal that the bus is 
> ready to be talked to.  For some that's waiting for BSY to clear, for 
> others that's checking the SATA bus.

Ok, I meant for the ata_piix case. Same should apply to others, right?
Do whatever you need to do to make sure the hardware is ready, then call
the generic ata_device_resume() helper to handle the libata side of
things.

-- 
Jens Axboe


* Re: ata_piix resume from S3 on T43P failed
  2006-05-17 13:02                   ` Jens Axboe
@ 2006-05-22  7:32                     ` Jeff Garzik
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff Garzik @ 2006-05-22  7:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide

Jens Axboe wrote:
> On Wed, May 17 2006, Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> I think Hugh traced it down to an unrelated timer change. The above
>>> really wants to wait for BUSY clear; perhaps the best solution would be
>>> to give piix its own ata_piix_device_resume() that first waits
>>> for BUSY clear, then calls ata_device_resume().
>> Close...  all devices should wait for libata to signal that the bus is 
>> ready to be talked to.  For some that's waiting for BSY to clear, for 
>> others that's checking the SATA bus.
> 
> Ok, I meant for the ata_piix case. Same should apply to others, right?
> Do whatever you need to do to make sure the hardware is ready, then call
> the generic ata_device_resume() helper to handle the libata side of
> things.

Correct.

	Jeff




Thread overview: 17+ messages
2006-05-11  8:05 ata_piix resume from S3 on T43P failed zhao, forrest
2006-05-11  8:31 ` Tejun Heo
2006-05-11  8:35   ` Tejun Heo
2006-05-11  9:46     ` zhao, forrest
2006-05-11 10:39       ` Tejun Heo
2006-05-12  5:02         ` zhao, forrest
2006-05-11 10:55       ` Jens Axboe
2006-05-12  5:51         ` zhao, forrest
2006-05-12 10:17           ` Jens Axboe
2006-05-12 10:56             ` Tejun Heo
2006-05-16  3:56             ` zhao, forrest
2006-05-17 11:03               ` Jens Axboe
2006-05-17 12:56                 ` Jeff Garzik
2006-05-17 13:02                   ` Jens Axboe
2006-05-22  7:32                     ` Jeff Garzik
2006-05-13  4:19   ` Jeff Garzik
2006-05-16  1:58     ` zhao, forrest
