* [PATCH] Add ata_piix's own resume function
@ 2006-05-26 9:04 zhao, forrest
2006-05-26 23:05 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: zhao, forrest @ 2006-05-26 9:04 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, Tejun Heo, linux-ide
For ata_piix resume operation, it first waits for BUSY bit clear,
then calls ata_device_resume().
The patch is against #upstream 957d2df1801865eb1e63864bc63b970aa9c460ba
Thanks,
Forrest
Signed-off-by: Forrest Zhao <forrest.zhao@intel.com>
---
drivers/scsi/ata_piix.c | 19 ++++++++++++++++++-
1 files changed, 18 insertions(+), 1 deletions(-)
55116b42caac4c48b2b85228e5b961d71266de22
diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index 1db007f..405f039 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -90,6 +90,7 @@
#include <linux/delay.h>
#include <linux/device.h>
#include <scsi/scsi_host.h>
+#include <scsi/scsi_device.h>
#include <linux/libata.h>
#define DRV_NAME "ata_piix"
@@ -151,6 +152,7 @@ static int piix_pata_probe_reset(struct
static int piix_sata_probe_reset(struct ata_port *ap, unsigned int *classes);
static void piix_set_piomode (struct ata_port *ap, struct ata_device *adev);
static void piix_set_dmamode (struct ata_port *ap, struct ata_device *adev);
+static int piix_scsi_device_resume(struct scsi_device *sdev);
static unsigned int in_module_init = 1;
@@ -220,7 +222,7 @@ static struct scsi_host_template piix_sh
.dma_boundary = ATA_DMA_BOUNDARY,
.slave_configure = ata_scsi_slave_config,
.bios_param = ata_std_bios_param,
- .resume = ata_scsi_device_resume,
+ .resume = piix_scsi_device_resume,
.suspend = ata_scsi_device_suspend,
};
@@ -710,6 +712,21 @@ static void piix_set_dmamode (struct ata
}
}
+int piix_scsi_device_resume(struct scsi_device *sdev)
+{
+ struct ata_port *ap = ata_shost_to_port(sdev->host);
+ struct ata_device *dev = &ap->device[sdev->id];
+ u8 status;
+
+ status = ata_busy_wait(ap, ATA_BUSY, 200000);
+ if (status & ATA_BUSY) {
+ ata_port_printk(ap, KERN_ERR, "port failed to resume "
+ "in 2 secs)\n");
+ return 1;
+ }
+ return ata_device_resume(dev);
+}
+
#define AHCI_PCI_BAR 5
#define AHCI_GLOBAL_CTL 0x04
#define AHCI_ENABLE (1 << 31)
--
1.2.6
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-26 9:04 [PATCH] Add ata_piix's own resume function zhao, forrest
@ 2006-05-26 23:05 ` Jens Axboe
2006-05-26 23:28 ` Jeff Garzik
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
0 siblings, 2 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-26 23:05 UTC (permalink / raw)
To: zhao, forrest; +Cc: Jeff Garzik, Tejun Heo, linux-ide
On Fri, May 26 2006, zhao, forrest wrote:
> For ata_piix resume operation, it first waits for BUSY bit clear,
> then calls ata_device_resume().
This has the problem that it introduces scsi specific knowledge into
ata_piix, something that is both a violation and a problem because we
are moving libata away from scsi. I think the best way to currently do
this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
busy clear and gets called in ata_device_resume is probably the way to
go.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-26 23:05 ` Jens Axboe
@ 2006-05-26 23:28 ` Jeff Garzik
2006-05-26 23:38 ` Jeff Garzik
2006-05-27 6:21 ` Jens Axboe
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
1 sibling, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-26 23:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Fri, May 26 2006, zhao, forrest wrote:
>> For ata_piix resume operation, it first waits for BUSY bit clear,
>> then calls ata_device_resume().
>
> This has the problem that it introduces scsi specific knowledge into
> ata_piix, something that is both a violation and a problem because we
> are moving libata away from scsi. I think the best way to currently do
> this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
> busy clear and gets called in ata_device_resume is probably the way to
> go.
ata_device_resume() is fine as-is. Modifying it to resurrect the bus is
a gross layering violation. Resume must be done in this order:
controller -> bus -> device
Thus, the bus must be resurrected and brought to a known HSM state (bus
idle), and then ata_device_resume() will work just fine.
The proper solution is to modify the pci_driver::resume code path, to
resurrect not only the HBA but also the ATA bus. Currently we have
ata_pci_device_{suspend,resume}, whose contents is wholly generic to any
random PCI device.
I would suggest creating pata_pci_device_resume(), which calls
ata_pci_device_resume(), and then waits for BSY to clear.
Similarly, create sata_pci_device_resume(), which calls
ata_pci_device_resume() then calls sata_phy_resume(). Additionally, for
select SATA controllers (sata_mv, others?), it may be wise to wait for
BSY to clear after the SATA phy has been awakened.
You are quite right that ata_piix is an inappropriate place to do all this.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-26 23:28 ` Jeff Garzik
@ 2006-05-26 23:38 ` Jeff Garzik
2006-05-26 23:50 ` Jeff Garzik
2006-05-27 6:21 ` Jens Axboe
1 sibling, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-26 23:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Jens Axboe wrote:
>> On Fri, May 26 2006, zhao, forrest wrote:
>>> For ata_piix resume operation, it first waits for BUSY bit clear,
>>> then calls ata_device_resume().
>>
>> This has the problem that it introduces scsi specific knowledge into
>> ata_piix, something that is both a violation and a problem because we
>> are moving libata away from scsi. I think the best way to currently do
>> this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
>> busy clear and gets called in ata_device_resume is probably the way to
>> go.
>
> ata_device_resume() is fine as-is. Modifying it to resurrect the bus is
> a gross layering violation. Resume must be done in this order:
>
> controller -> bus -> device
>
> Thus, the bus must be resurrected and brought to a known HSM state (bus
> idle), and then ata_device_resume() will work just fine.
Additionally, once Tejun's EH stuff is in mainline, the best thing to do
will be to trigger a revalidate. With the EH stuff, we can easily
handle the rare case of waking up and noticing that a different device
has been plugged in, in addition to the common case.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-26 23:38 ` Jeff Garzik
@ 2006-05-26 23:50 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-26 23:50 UTC (permalink / raw)
To: linux-ide, Mark Lord; +Cc: Jens Axboe, zhao, forrest, Tejun Heo
Another note for people working on libata suspend/resume: There are
other details to consider along the controller->bus->device chain to
look into, besides bringing the ATA bus to bus-idle.
For example, suspend on modern controllers should probably quiesce
interrupts, and free_irq(), or at least zero the interrupt mask.
It is entirely possible we might resume to an interrupt storm if we're
not careful.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-26 23:05 ` Jens Axboe
2006-05-26 23:28 ` Jeff Garzik
@ 2006-05-27 3:22 ` Mark Lord
2006-05-27 3:32 ` Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 3:22 UTC (permalink / raw)
To: Jens Axboe
Cc: zhao, forrest, Jeff Garzik, Tejun Heo, linux-ide, Linus Torvalds
Mark Lord wrote:
> Mark Lord wrote:
>> My ata_piix based Notebook (Dell i9300) suspends/resumes perfectly (RAM
>> or disk)
>> with 2.6.16.xx kernels, but fails resume on 2.6.17-rc5-git1 (the first
>> 2.6.17-*
>> I've attempted on this machine).
>>
>> On resume from RAM, after a 30-second-ish timeout, the screen comes on
>> but the hard disk is NOT accessible. "dmesg" in an already-open window
>> shows this (typed in from handwritten notes):
>>
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
> ...
>
> Ahh.. the fix for this was posted earlier today by Forrest Zhao.
..
> Here is a modified version of Forrest's original patch, for 2.6.17-rc5-git1.
> It seems to have fixed the resume issue on my machine here,
> so that things are now working as they were in the unpatched 2.6.16 kernels.
> Jens Axboe wrote:
> This has the problem that it introduces scsi specific knowledge into
> ata_piix, something that is both a violation and a problem because we
> are moving libata away from scsi. I think the best way to currently do
> this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
> busy clear and gets called in ata_device_resume is probably the way to go.
Well, this problem has been with us all for a year now,
and at this point it impacts practically *every* new "centrino"
notebook out there.
We have a very simple workaround (previous post) that addresses it
for 2.6.17, and it's about damn time it got fixed.
If there's a better solution for *2.6.17*, then *please* post it.
Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
@ 2006-05-27 3:32 ` Linus Torvalds
2006-05-27 3:41 ` Jeff Garzik
` (2 more replies)
2006-05-27 3:35 ` Jeff Garzik
2006-05-27 6:20 ` Jens Axboe
2 siblings, 3 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 3:32 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, zhao, forrest, Jeff Garzik, Tejun Heo, linux-ide
On Fri, 26 May 2006, Mark Lord wrote:
>
> Well, this problem has been with us all for a year now,
> and at this point it impacts practically *every* new "centrino"
> notebook out there.
>
> We have a very simple workaround (previous post) that addresses it
> for 2.6.17, and it's about damn time it got fixed.
>
> If there's a better solution for *2.6.17*, then *please* post it.
> Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
I'm definitely in the "at some point, protesting a patch that works
becomes an untenably position to take, no matter _how_ ugly the patch is"
camp.
If the people who complain that it is ugly cannot come up with an
alternate solution that works and isn't ugly, at some point the "ugly"
complaint just becomes totally pointless.
Of course, I'm not on linux-ide, and I didn't see this particular
discussion from the start (or even the alledged simple workaround in the
"previous post"), but can people please fill me in? And if the choice is
not between "ugly" vs "pretty", but between "ugly" vs "nonworking", I
think we know what the answer should be.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
2006-05-27 3:32 ` Linus Torvalds
@ 2006-05-27 3:35 ` Jeff Garzik
2006-05-27 6:20 ` Jens Axboe
2 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 3:35 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, zhao, forrest, Tejun Heo, linux-ide, Linus Torvalds
Mark Lord wrote:
> Mark Lord wrote:
>> Mark Lord wrote:
>>> My ata_piix based Notebook (Dell i9300) suspends/resumes perfectly
>>> (RAM or disk)
>>> with 2.6.16.xx kernels, but fails resume on 2.6.17-rc5-git1 (the
>>> first 2.6.17-*
>>> I've attempted on this machine).
>>>
>>> On resume from RAM, after a 30-second-ish timeout, the screen comes on
>>> but the hard disk is NOT accessible. "dmesg" in an already-open window
>>> shows this (typed in from handwritten notes):
>>>
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> ...
>>
>> Ahh.. the fix for this was posted earlier today by Forrest Zhao.
> ..
>> Here is a modified version of Forrest's original patch, for
>> 2.6.17-rc5-git1.
>> It seems to have fixed the resume issue on my machine here,
>> so that things are now working as they were in the unpatched 2.6.16
>> kernels.
>
>> Jens Axboe wrote:
>> This has the problem that it introduces scsi specific knowledge into
>> ata_piix, something that is both a violation and a problem because we
>> are moving libata away from scsi. I think the best way to currently do
>> this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
>> busy clear and gets called in ata_device_resume is probably the way to
>> go.
>
> Well, this problem has been with us all for a year now,
> and at this point it impacts practically *every* new "centrino"
> notebook out there.
>
> We have a very simple workaround (previous post) that addresses it
> for 2.6.17, and it's about damn time it got fixed.
>
> If there's a better solution for *2.6.17*, then *please* post it.
> Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
Having been around for a year+, there is no reason to rush a really bad
fix into the kernel.
Especially since I told you guys the proper place to put basically the
exact same fix. You probably could have created the proper fix in the
time it took to get annoyed at my email, and type the reply...
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:32 ` Linus Torvalds
@ 2006-05-27 3:41 ` Jeff Garzik
2006-05-27 4:00 ` [PATCH] " Jeff Garzik
2006-05-27 6:29 ` Jens Axboe
2 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 3:41 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Fri, 26 May 2006, Mark Lord wrote:
>> Well, this problem has been with us all for a year now,
>> and at this point it impacts practically *every* new "centrino"
>> notebook out there.
>>
>> We have a very simple workaround (previous post) that addresses it
>> for 2.6.17, and it's about damn time it got fixed.
>>
>> If there's a better solution for *2.6.17*, then *please* post it.
>> Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
>
> I'm definitely in the "at some point, protesting a patch that works
> becomes an untenably position to take, no matter _how_ ugly the patch is"
> camp.
>
> If the people who complain that it is ugly cannot come up with an
> alternate solution that works and isn't ugly, at some point the "ugly"
> complaint just becomes totally pointless.
>
> Of course, I'm not on linux-ide, and I didn't see this particular
> discussion from the start (or even the alledged simple workaround in the
> "previous post"), but can people please fill me in? And if the choice is
> not between "ugly" vs "pretty", but between "ugly" vs "nonworking", I
> think we know what the answer should be.
Mark is just a slacker, like the rest of us ;-)
The solution, described in [1], is basically "move the delay from <here>
to <there>."
The current code does
resume PCI device
kick the ATA device
when it should do
resume PCI device
bring up the ATA bus
kick the ATA device
Regards,
Jeff
[1] http://marc.theaimsgroup.com/?l=linux-ide&m=114868613527204&w=2
^ permalink raw reply [flat|nested] 100+ messages in thread
* [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:32 ` Linus Torvalds
2006-05-27 3:41 ` Jeff Garzik
@ 2006-05-27 4:00 ` Jeff Garzik
2006-05-27 18:23 ` Mark Lord
2006-05-27 6:29 ` Jens Axboe
2 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 4:00 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
[-- Attachment #1: Type: text/plain, Size: 245 bytes --]
Although I didn't even bother to compile-test it, here is the
recommended solution, as described multiple times...
Testers might wanna put msleep(500) in pata_pci_device_resume(), before
the port loop, to let the SATA phy settle...
Jeff
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 1578 bytes --]
diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index 6dc8814..949d496 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -201,7 +201,7 @@ static struct pci_driver piix_pci_driver
.probe = piix_init_one,
.remove = ata_pci_remove_one,
.suspend = ata_pci_device_suspend,
- .resume = ata_pci_device_resume,
+ .resume = pata_pci_device_resume,
};
static struct scsi_host_template piix_sht = {
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..95e0885 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4852,6 +4852,30 @@ int ata_pci_device_resume(struct pci_dev
pci_set_master(pdev);
return 0;
}
+
+int pata_pci_device_resume(struct pci_dev *pdev)
+{
+ struct device *dev = pci_dev_to_dev(pdev);
+ struct ata_host_set *host_set = dev_get_drvdata(dev);
+ struct ata_port *ap;
+ int i, rc = 0;
+ u8 status;
+
+ ata_pci_device_resume(pdev);
+
+ for (i = 0; i < host_set->n_ports; i++) {
+ ap = host_set->ports[i];
+
+ status = ata_busy_wait(ap, ATA_BUSY, 200000);
+ if (status & ATA_BUSY) {
+ printk(KERN_ERR "ata%u: failed to resume\n", ap->id);
+ rc = -EIO;
+ }
+ }
+
+ return rc;
+}
+
#endif /* CONFIG_PCI */
@@ -4970,6 +4994,7 @@ EXPORT_SYMBOL_GPL(ata_pci_init_one);
EXPORT_SYMBOL_GPL(ata_pci_remove_one);
EXPORT_SYMBOL_GPL(ata_pci_device_suspend);
EXPORT_SYMBOL_GPL(ata_pci_device_resume);
+EXPORT_SYMBOL_GPL(pata_pci_device_resume);
EXPORT_SYMBOL_GPL(ata_pci_default_filter);
EXPORT_SYMBOL_GPL(ata_pci_clear_simplex);
#endif /* CONFIG_PCI */
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
2006-05-27 3:32 ` Linus Torvalds
2006-05-27 3:35 ` Jeff Garzik
@ 2006-05-27 6:20 ` Jens Axboe
2 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 6:20 UTC (permalink / raw)
To: Mark Lord
Cc: zhao, forrest, Jeff Garzik, Tejun Heo, linux-ide, Linus Torvalds
On Fri, May 26 2006, Mark Lord wrote:
> Mark Lord wrote:
> >Mark Lord wrote:
> >>My ata_piix based Notebook (Dell i9300) suspends/resumes perfectly (RAM
> >>or disk)
> >>with 2.6.16.xx kernels, but fails resume on 2.6.17-rc5-git1 (the first
> >>2.6.17-*
> >>I've attempted on this machine).
> >>
> >>On resume from RAM, after a 30-second-ish timeout, the screen comes on
> >>but the hard disk is NOT accessible. "dmesg" in an already-open window
> >>shows this (typed in from handwritten notes):
> >>
> >>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>end_request: I/O error, /dev/sda, sector nnnnnnn
> >...
> >
> >Ahh.. the fix for this was posted earlier today by Forrest Zhao.
> ..
> >Here is a modified version of Forrest's original patch, for
> >2.6.17-rc5-git1.
> >It seems to have fixed the resume issue on my machine here,
> >so that things are now working as they were in the unpatched 2.6.16
> >kernels.
>
> >Jens Axboe wrote:
> >This has the problem that it introduces scsi specific knowledge into
> >ata_piix, something that is both a violation and a problem because we
> >are moving libata away from scsi. I think the best way to currently do
> >this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
> >busy clear and gets called in ata_device_resume is probably the way to go.
>
> Well, this problem has been with us all for a year now,
> and at this point it impacts practically *every* new "centrino"
> notebook out there.
>
> We have a very simple workaround (previous post) that addresses it
> for 2.6.17, and it's about damn time it got fixed.
>
> If there's a better solution for *2.6.17*, then *please* post it.
> Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
Don't get me wrong, I could not agree more. I rely on suspend/resume all
the time, and the fact that we didn't get this fully working _in kernel_
years ago is really embarassing. So I fully want the ata_piix busy clear
patch to be in 2.6.17, my objection was merely that shoving scsi
knowledge into ata_piix is not the way to do it.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-26 23:28 ` Jeff Garzik
2006-05-26 23:38 ` Jeff Garzik
@ 2006-05-27 6:21 ` Jens Axboe
2006-05-27 6:31 ` Jeff Garzik
1 sibling, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 6:21 UTC (permalink / raw)
To: Jeff Garzik; +Cc: zhao, forrest, Tejun Heo, linux-ide
On Fri, May 26 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Fri, May 26 2006, zhao, forrest wrote:
> >>For ata_piix resume operation, it first waits for BUSY bit clear,
> >>then calls ata_device_resume().
> >
> >This has the problem that it introduces scsi specific knowledge into
> >ata_piix, something that is both a violation and a problem because we
> >are moving libata away from scsi. I think the best way to currently do
> >this is to introduce a ata_port_ops hook (pre_resume()?) that waits for
> >busy clear and gets called in ata_device_resume is probably the way to
> >go.
>
> ata_device_resume() is fine as-is. Modifying it to resurrect the bus is
> a gross layering violation. Resume must be done in this order:
>
> controller -> bus -> device
>
> Thus, the bus must be resurrected and brought to a known HSM state (bus
> idle), and then ata_device_resume() will work just fine.
>
> The proper solution is to modify the pci_driver::resume code path, to
> resurrect not only the HBA but also the ATA bus. Currently we have
> ata_pci_device_{suspend,resume}, whose contents is wholly generic to any
> random PCI device.
>
> I would suggest creating pata_pci_device_resume(), which calls
> ata_pci_device_resume(), and then waits for BSY to clear.
I thought about that, and I don't agree. Waiting for the BSY to clear is
not a pci property, at best I'd consider that even worse than defining a
scsi resume function in ata_piix.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 3:32 ` Linus Torvalds
2006-05-27 3:41 ` Jeff Garzik
2006-05-27 4:00 ` [PATCH] " Jeff Garzik
@ 2006-05-27 6:29 ` Jens Axboe
2006-05-27 6:36 ` Jeff Garzik
2006-05-27 18:46 ` Mark Lord
2 siblings, 2 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 6:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mark Lord, zhao, forrest, Jeff Garzik, Tejun Heo, linux-ide
On Fri, May 26 2006, Linus Torvalds wrote:
>
>
> On Fri, 26 May 2006, Mark Lord wrote:
> >
> > Well, this problem has been with us all for a year now,
> > and at this point it impacts practically *every* new "centrino"
> > notebook out there.
> >
> > We have a very simple workaround (previous post) that addresses it
> > for 2.6.17, and it's about damn time it got fixed.
> >
> > If there's a better solution for *2.6.17*, then *please* post it.
> > Otherwise, we have a fix. Maybe Linus or Andrew should just apply it?
>
> I'm definitely in the "at some point, protesting a patch that works
> becomes an untenably position to take, no matter _how_ ugly the patch is"
> camp.
>
> If the people who complain that it is ugly cannot come up with an
> alternate solution that works and isn't ugly, at some point the "ugly"
> complaint just becomes totally pointless.
If that wasn't the case, then we wouldn't even have basic sata suspend
support in the Linus kernels right now... So agreed.
> Of course, I'm not on linux-ide, and I didn't see this particular
> discussion from the start (or even the alledged simple workaround in the
> "previous post"), but can people please fill me in? And if the choice is
> not between "ugly" vs "pretty", but between "ugly" vs "nonworking", I
> think we know what the answer should be.
There was no real discussion on this issue yet. I think we all agree
that the functionality of the patch (waiting for BSY clear on resume) is
the right thing to do. This posted patch moved SCSI stuff into ata_piix,
which isn't really very nice. Jeff wants to do it from the pci resume,
which just seems wrong to me since it's a device (disk) condition not a
"host" condition. FWIW, here's what I had in mind as suggested in the
original reply. Not tested at all for functionality, but it compiles.
diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index 6dc8814..103afc3 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -151,6 +151,7 @@ static int piix_pata_probe_reset(struct
static int piix_sata_probe_reset(struct ata_port *ap, unsigned int *classes);
static void piix_set_piomode (struct ata_port *ap, struct ata_device *adev);
static void piix_set_dmamode (struct ata_port *ap, struct ata_device *adev);
+static int piix_port_resume(struct ata_port *ap);
static unsigned int in_module_init = 1;
@@ -250,6 +251,7 @@ static const struct ata_port_operations
.port_start = ata_port_start,
.port_stop = ata_port_stop,
+ .port_resume = piix_port_resume,
.host_stop = ata_host_stop,
};
@@ -738,6 +740,16 @@ static int piix_disable_ahci(struct pci_
return rc;
}
+static int piix_port_resume(struct ata_port *ap)
+{
+ u8 status = ata_busy_wait(ap, ATA_BUSY, 200000);
+
+ if (status & ATA_BUSY)
+ return 1;
+
+ return 0;
+}
+
/**
* piix_check_450nx_errata - Check for problem 450NX setup
* @ata_dev: the PCI device to check
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..ae7fac1 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4298,6 +4298,8 @@ int ata_device_resume(struct ata_port *a
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
ap->flags &= ~ATA_FLAG_SUSPENDED;
+ if (ap->ops->port_resume)
+ ap->ops->port_resume(ap);
ata_set_mode(ap);
}
if (!ata_dev_present(dev))
diff --git a/include/linux/libata.h b/include/linux/libata.h
index b80d2e7..0be5d02 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -461,6 +461,7 @@ struct ata_port_operations {
int (*port_start) (struct ata_port *ap);
void (*port_stop) (struct ata_port *ap);
+ int (*port_resume) (struct ata_port *ap);
void (*host_stop) (struct ata_host_set *host_set);
--
Jens Axboe
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-27 6:21 ` Jens Axboe
@ 2006-05-27 6:31 ` Jeff Garzik
2006-05-27 6:46 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 6:31 UTC (permalink / raw)
To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> I thought about that, and I don't agree. Waiting for the BSY to clear is
> not a pci property, at best I'd consider that even worse than defining a
> scsi resume function in ata_piix.
It has nothing to do with PCI, and everything to do control flow. This
is _how the hardware works_:
First you resume the controller.
Then you resume the bus.
Then you resume the devices on the bus.
The original patch goes BACKWARDS, by trying to resume the bus from the
device resume method. That's just dumb.
In a PCI driver, the resume is _always_ driven by pci_driver::resume.
Control flow starts there. Once you get down to the individual device
level -- with the original patch -- you are attempting to pick up the
pieces that should have _already_ been done by higher layers.
This is how all drivers work. Control flow starts at driver::resume,
which in turn directs resumption of various layers.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 6:29 ` Jens Axboe
@ 2006-05-27 6:36 ` Jeff Garzik
2006-05-27 7:01 ` Jens Axboe
2006-05-27 18:46 ` Mark Lord
1 sibling, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 6:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> There was no real discussion on this issue yet. I think we all agree
> that the functionality of the patch (waiting for BSY clear on resume) is
> the right thing to do. This posted patch moved SCSI stuff into ata_piix,
> which isn't really very nice. Jeff wants to do it from the pci resume,
> which just seems wrong to me since it's a device (disk) condition not a
Wrong. It is a _bus_ condition, not a device condition.
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..ae7fac1 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4298,6 +4298,8 @@ int ata_device_resume(struct ata_port *a
> {
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> + if (ap->ops->port_resume)
> + ap->ops->port_resume(ap);
This is even MORE broken!
A port can have multiple devices hanging off of it. With this
silliness, you will be calling ->port_resume() for both master and slave
devices... or all devices attached to a port multiplier.
Control flow is top-down, not bottom-up.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-27 6:31 ` Jeff Garzik
@ 2006-05-27 6:46 ` Jens Axboe
2006-05-27 6:52 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 6:46 UTC (permalink / raw)
To: Jeff Garzik; +Cc: zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >I thought about that, and I don't agree. Waiting for the BSY to clear is
> >not a pci property, at best I'd consider that even worse than defining a
> >scsi resume function in ata_piix.
>
> It has nothing to do with PCI, and everything to do control flow. This
> is _how the hardware works_:
It does by the very nature of this being invoked by the pci device
resume function...
> First you resume the controller.
> Then you resume the bus.
> Then you resume the devices on the bus.
>
> The original patch goes BACKWARDS, by trying to resume the bus from the
> device resume method. That's just dumb.
Since there's just the one device on the bus in this case, whether its
the device or bus posting BUSY seems pretty irrelevant. If anything, I'd
say that the act of iterating over possible devices hanging of the pci
device and resuming them from the pci handler is definitely the worst
approach. The pci driver resume function should do just one thing --
resume the device itself.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Add ata_piix's own resume function
2006-05-27 6:46 ` Jens Axboe
@ 2006-05-27 6:52 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 6:52 UTC (permalink / raw)
To: Jens Axboe; +Cc: zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> Since there's just the one device on the bus in this case, whether its
Wrong. ata_piix must deal with master/slave, as must libata itself.
> the device or bus posting BUSY seems pretty irrelevant. If anything, I'd
> say that the act of iterating over possible devices hanging of the pci
> device and resuming them from the pci handler is definitely the worst
> approach. The pci driver resume function should do just one thing --
> resume the device itself.
Who resumes the bus? hmmm?
Please, read other drivers. Control flow always starts at
driver::resume, and since the driver knows the hardware best, it _must_
be the one that directs the control flow.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 6:36 ` Jeff Garzik
@ 2006-05-27 7:01 ` Jens Axboe
2006-05-27 7:06 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 7:01 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >There was no real discussion on this issue yet. I think we all agree
> >that the functionality of the patch (waiting for BSY clear on resume) is
> >the right thing to do. This posted patch moved SCSI stuff into ata_piix,
> >which isn't really very nice. Jeff wants to do it from the pci resume,
> >which just seems wrong to me since it's a device (disk) condition not a
>
> Wrong. It is a _bus_ condition, not a device condition.
See my other mail.
> >diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> >index fa476e7..ae7fac1 100644
> >--- a/drivers/scsi/libata-core.c
> >+++ b/drivers/scsi/libata-core.c
> >@@ -4298,6 +4298,8 @@ int ata_device_resume(struct ata_port *a
> > {
> > if (ap->flags & ATA_FLAG_SUSPENDED) {
> > ap->flags &= ~ATA_FLAG_SUSPENDED;
> >+ if (ap->ops->port_resume)
> >+ ap->ops->port_resume(ap);
>
> This is even MORE broken!
>
> A port can have multiple devices hanging off of it. With this
> silliness, you will be calling ->port_resume() for both master and slave
> devices... or all devices attached to a port multiplier.
Worst case is the N-1 invocations basically being noops. Since
2.6.17-rc5 iirc doesn't even support port multipliers, I'd say this is a
pretty weak case.
What I care about is getting libata suspend/resume working, and so do a
lot of people. So far you've done nothing but whine at the posted
patches from the beginning and absolutely zero work on helping us get
there in _your_ driver/sub system. If you think your posted patch is the
best solution, by all means just put in there. Just make sure that we at
least get _something_ in there that does for 2.6.17.
I'll pick this up again tomorrow and actually test the patches so at
least I can say that 2.6.17 will suspend/resume for me.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 7:01 ` Jens Axboe
@ 2006-05-27 7:06 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 7:06 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sat, May 27 2006, Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> There was no real discussion on this issue yet. I think we all agree
>>> that the functionality of the patch (waiting for BSY clear on resume) is
>>> the right thing to do. This posted patch moved SCSI stuff into ata_piix,
>>> which isn't really very nice. Jeff wants to do it from the pci resume,
>>> which just seems wrong to me since it's a device (disk) condition not a
>> Wrong. It is a _bus_ condition, not a device condition.
>
> See my other mail.
>
>>> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
>>> index fa476e7..ae7fac1 100644
>>> --- a/drivers/scsi/libata-core.c
>>> +++ b/drivers/scsi/libata-core.c
>>> @@ -4298,6 +4298,8 @@ int ata_device_resume(struct ata_port *a
>>> {
>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>>> ap->flags &= ~ATA_FLAG_SUSPENDED;
>>> + if (ap->ops->port_resume)
>>> + ap->ops->port_resume(ap);
>> This is even MORE broken!
>>
>> A port can have multiple devices hanging off of it. With this
>> silliness, you will be calling ->port_resume() for both master and slave
>> devices... or all devices attached to a port multiplier.
>
> Worst case is the N-1 invocations basically being noops. Since
> 2.6.17-rc5 iirc doesn't even support port multipliers, I'd say this is a
> pretty weak case.
If N-1 invocations are no-ops, that is a clear sign you got the layering
very wrong. Backwards, in fact.
And if you had to do something other than test for BSY -- say for
example powering the bus on -- then you would be doing N-1 needless
resets and power-ons.
> What I care about is getting libata suspend/resume working, and so do a
> lot of people. So far you've done nothing but whine at the posted
> patches from the beginning and absolutely zero work on helping us get
> there in _your_ driver/sub system. If you think your posted patch is the
> best solution, by all means just put in there. Just make sure that we at
> least get _something_ in there that does for 2.6.17.
After describing what to do innumerable times to people actively working
in the area, I did indeed write the simple and obvious patch. As soon
as positive results appear, its going in. The relevant subject is
"[PATCH] libata resume improvements"
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 4:00 ` [PATCH] " Jeff Garzik
@ 2006-05-27 18:23 ` Mark Lord
2006-05-27 18:47 ` Linus Torvalds
2006-05-27 18:54 ` Jeff Garzik
0 siblings, 2 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 18:23 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
On Saturday 27 May 2006 00:00, Jeff Garzik wrote:
> Although I didn't even bother to compile-test it, here is the
> recommended solution, as described multiple times...
..
That patch has ZERO (positive) effect on my machine.
It still crashes miserably with a zillion "read errors" on resume
(drive not accessible).
The original "bad layering" patch still works perfectly in it's place.
Repeated below for Linus's benefit.
Signed-off-by: Mark Lord <lkml@rtr.ca>
---
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -90,6 +90,7 @@
#include <linux/delay.h>
#include <linux/device.h>
#include <scsi/scsi_host.h>
+#include <scsi/scsi_device.h>
#include <linux/libata.h>
#define DRV_NAME "ata_piix"
@@ -151,6 +152,7 @@ static int piix_pata_probe_reset(struct
static int piix_sata_probe_reset(struct ata_port *ap, unsigned int *classes);
static void piix_set_piomode (struct ata_port *ap, struct ata_device *adev);
static void piix_set_dmamode (struct ata_port *ap, struct ata_device *adev);
+static int piix_scsi_device_resume(struct scsi_device *sdev);
static unsigned int in_module_init = 1;
@@ -220,7 +222,7 @@ static struct scsi_host_template piix_sh
.dma_boundary = ATA_DMA_BOUNDARY,
.slave_configure = ata_scsi_slave_config,
.bios_param = ata_std_bios_param,
- .resume = ata_scsi_device_resume,
+ .resume = piix_scsi_device_resume,
.suspend = ata_scsi_device_suspend,
};
@@ -710,6 +712,21 @@ static void piix_set_dmamode (struct ata
}
}
+int piix_scsi_device_resume(struct scsi_device *sdev)
+{
+ struct ata_port *ap = (struct ata_port *) &sdev->host->hostdata[0];
+ struct ata_device *dev = &ap->device[sdev->id];
+ u8 status;
+
+ status = ata_busy_wait(ap, ATA_BUSY, 200000);
+ if (status & ATA_BUSY) {
+ printk(KERN_ERR "libata port failed to resume "
+ "in 2 secs)\n");
+ return 1;
+ }
+ return ata_device_resume(ap, dev);
+}
+
#define AHCI_PCI_BAR 5
#define AHCI_GLOBAL_CTL 0x04
#define AHCI_ENABLE (1 << 31)
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 6:29 ` Jens Axboe
2006-05-27 6:36 ` Jeff Garzik
@ 2006-05-27 18:46 ` Mark Lord
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 18:46 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, zhao, forrest, Jeff Garzik, Tejun Heo, linux-ide
Jens Axboe wrote:
..
> There was no real discussion on this issue yet. I think we all agree
> that the functionality of the patch (waiting for BSY clear on resume) is
> the right thing to do. This posted patch moved SCSI stuff into ata_piix,
> which isn't really very nice. Jeff wants to do it from the pci resume,
> which just seems wrong to me since it's a device (disk) condition not a
> "host" condition. FWIW, here's what I had in mind as suggested in the
> original reply. Not tested at all for functionality, but it compiles.
>
> diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
> index 6dc8814..103afc3 100644
> --- a/drivers/scsi/ata_piix.c
> +++ b/drivers/scsi/ata_piix.c
> @@ -151,6 +151,7 @@ static int piix_pata_probe_reset(struct
> static int piix_sata_probe_reset(struct ata_port *ap, unsigned int *classes);
...
That patch, like Jeff's, does nothing (good) on my system.
It still pauses for a longish timeout on resume, and then fails
with a zillion "read errors", and leaves the drive inaccessible.
I'm will y'all on the poor layering, but non-functionality is worse.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:23 ` Mark Lord
@ 2006-05-27 18:47 ` Linus Torvalds
2006-05-27 19:01 ` Jeff Garzik
` (2 more replies)
2006-05-27 18:54 ` Jeff Garzik
1 sibling, 3 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 18:47 UTC (permalink / raw)
To: Mark Lord; +Cc: Jeff Garzik, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Mark Lord wrote:
>
> The original "bad layering" patch still works perfectly in it's place.
> Repeated below for Linus's benefit.
Why isn't the right fix the minimal one?
What's the layering violation in just having ATA resume make sure it's not
ATA_BUSY?
Why are you guys fighting over this?
And why the hell is Mark's patch not being accepted if it fixes something,
and the alternate patches do not?
Linus
---
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..0ef4cf4 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
*/
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
+ ata_busy_wait(ap, ATA_BUSY, 200000);
if (ap->flags & ATA_FLAG_SUSPENDED) {
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:23 ` Mark Lord
2006-05-27 18:47 ` Linus Torvalds
@ 2006-05-27 18:54 ` Jeff Garzik
2006-05-27 19:08 ` Mark Lord
1 sibling, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 18:54 UTC (permalink / raw)
To: Mark Lord; +Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
[-- Attachment #1: Type: text/plain, Size: 1039 bytes --]
Mark Lord wrote:
> On Saturday 27 May 2006 00:00, Jeff Garzik wrote:
>> Although I didn't even bother to compile-test it, here is the
>> recommended solution, as described multiple times...
>
> ..
>
> That patch has ZERO (positive) effect on my machine.
> It still crashes miserably with a zillion "read errors" on resume
> (drive not accessible).
Output please? Did you get any failed-to-resume messages?
> The original "bad layering" patch still works perfectly in it's place.
> Repeated below for Linus's benefit.
Actually, unfortunately I don't see that here. On my ICH7 in x86-64
ata_piix combined mode, it hard locks. In x86 mode, it appears to work
(or at least not break anything, since resume works without the patch
too on this box) both in combined mode and native mode.
Since this is claimed to be a regression, what was the last upstream
kernel version that worked for you? 2.6.16.x?
Finally, if you could test the updated patch I posted, and/or the one
that is attached, that would be preferred.
Jeff
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2166 bytes --]
diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index 6dc8814..949d496 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -201,7 +201,7 @@ static struct pci_driver piix_pci_driver
.probe = piix_init_one,
.remove = ata_pci_remove_one,
.suspend = ata_pci_device_suspend,
- .resume = ata_pci_device_resume,
+ .resume = pata_pci_device_resume,
};
static struct scsi_host_template piix_sht = {
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..5b7ab7b 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4852,6 +4852,30 @@ int ata_pci_device_resume(struct pci_dev
pci_set_master(pdev);
return 0;
}
+
+int pata_pci_device_resume(struct pci_dev *pdev)
+{
+ struct device *dev = pci_dev_to_dev(pdev);
+ struct ata_host_set *host_set = dev_get_drvdata(dev);
+ struct ata_port *ap;
+ u8 status;
+ int i;
+
+ ata_pci_device_resume(pdev);
+
+ mdelay(400);
+
+ for (i = 0; i < host_set->n_ports; i++) {
+ ap = host_set->ports[i];
+
+ status = ata_busy_wait(ap, ATA_BUSY, 200000);
+ if (status & ATA_BUSY)
+ printk(KERN_ERR "ata%u: failed to resume\n", ap->id);
+ }
+
+ return 0;
+}
+
#endif /* CONFIG_PCI */
@@ -4970,6 +4994,7 @@ EXPORT_SYMBOL_GPL(ata_pci_init_one);
EXPORT_SYMBOL_GPL(ata_pci_remove_one);
EXPORT_SYMBOL_GPL(ata_pci_device_suspend);
EXPORT_SYMBOL_GPL(ata_pci_device_resume);
+EXPORT_SYMBOL_GPL(pata_pci_device_resume);
EXPORT_SYMBOL_GPL(ata_pci_default_filter);
EXPORT_SYMBOL_GPL(ata_pci_clear_simplex);
#endif /* CONFIG_PCI */
diff --git a/include/linux/libata.h b/include/linux/libata.h
index b80d2e7..cad531a 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -516,6 +516,7 @@ extern int ata_pci_init_one (struct pci_
extern void ata_pci_remove_one (struct pci_dev *pdev);
extern int ata_pci_device_suspend(struct pci_dev *pdev, pm_message_t state);
extern int ata_pci_device_resume(struct pci_dev *pdev);
+extern int pata_pci_device_resume(struct pci_dev *pdev);
extern int ata_pci_clear_simplex(struct pci_dev *pdev);
#endif /* CONFIG_PCI */
extern int ata_device_add(const struct ata_probe_ent *ent);
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:47 ` Linus Torvalds
@ 2006-05-27 19:01 ` Jeff Garzik
2006-05-27 19:06 ` Jeff Garzik
2006-05-27 19:01 ` Mark Lord
2006-05-27 20:45 ` Jens Axboe
2 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 19:01 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
> On Sat, 27 May 2006, Mark Lord wrote:
>> The original "bad layering" patch still works perfectly in it's place.
>> Repeated below for Linus's benefit.
>
> Why isn't the right fix the minimal one?
Your minimal patch is FAR better than the other working patch.
> What's the layering violation in just having ATA resume make sure it's not
> ATA_BUSY?
> Why are you guys fighting over this?
As soon as we start powering on the SATA phy (code exists, 2.6.18), or
dealing with port multipliers (code exists, 2.6.19), the "working patch"
stops working.
So we have to start this all over again.
> And why the hell is Mark's patch not being accepted if it fixes something,
> and the alternate patches do not?
The alternate only appeared 12 hours ago, when people started yelling?
But whatever. Life goes on. I'll fix whatever gets merged.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:47 ` Linus Torvalds
2006-05-27 19:01 ` Jeff Garzik
@ 2006-05-27 19:01 ` Mark Lord
2006-05-27 20:45 ` Jens Axboe
2 siblings, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 19:01 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Garzik, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Mark Lord wrote:
>> The original "bad layering" patch still works perfectly in it's place.
>> Repeated below for Linus's benefit.
>
> Why isn't the right fix the minimal one?
..
> ---
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..0ef4cf4 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
> */
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
Bingo! Linus wins: that one-liner is not only "obviously correct",
but it works, too!
Great!
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 19:01 ` Jeff Garzik
@ 2006-05-27 19:06 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 19:06 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Linus Torvalds wrote:
>> What's the layering violation in just having ATA resume make sure it's
>> not ATA_BUSY?
>
>> Why are you guys fighting over this?
>
> As soon as we start powering on the SATA phy (code exists, 2.6.18), or
> dealing with port multipliers (code exists, 2.6.19), the "working patch"
> stops working.
so basically when the SCSI and libata suspend is used for any controller
other than ata_piix, really :)
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:54 ` Jeff Garzik
@ 2006-05-27 19:08 ` Mark Lord
2006-05-27 19:15 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 19:08 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
>
Jeff Garzik wrote:
>..
> Output please? Did you get any failed-to-resume messages?
Same as in my original posting:
> On resume from RAM, after a 30-second-ish timeout, the screen comes on
> but the hard disk is NOT accessible. "dmesg" in an already-open window
> shows this (typed in from handwritten notes):
>
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
> sd 0:0:0:0: SCSI error: return code = 0x40000
> end_request: I/O error, /dev/sda, sector nnnnnnn
>
> (the nnnnnnn was actually a real number, different on each message).
..
> Since this is claimed to be a regression, what was the last upstream
> kernel version that worked for you? 2.6.16.x?
..
Also from the original posting:
> My ata_piix based Notebook (Dell i9300) suspends/resumes perfectly (RAM or disk)
> with 2.6.16.xx kernels, but fails resume on 2.6.17-rc5-git1 (the first 2.6.17-*
> I've attempted on this machine).
It did NOT work any prior unpatched kernels before 2.6.16.xx.
> Finally, if you could test the updated patch I posted, and/or the one
> that is attached, that would be preferred.
> --------------030508080204070708000504
> Content-Type: text/plain;
> name="patch"
> Content-Transfer-Encoding: base64
> Content-Disposition: inline;
> filename="patch"
>
> ZGlmZiAtLWdpdCBhL2RyaXZlcnMvc2NzaS9hdGFfcGlpeC5jIGIvZHJpdmVycy9zY3NpL2F0
> YV9waWl4LmMKaW5kZXggNmRjODgxNC4uOTQ5ZDQ5NiAxMDA2NDQKLS0tIGEvZHJpdmVycy9z
> Y3NpL2F0YV9waWl4LmMKKysrIGIvZHJpdmVycy9zY3NpL2F0YV9waWl4LmMKQEAgLTIwMSw3
...
Fix your mailer, please. Meanwhile, I'll try to extract the patch
and apply it, but for informational purposes only.
Linus's one-liner has us all beat.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 19:08 ` Mark Lord
@ 2006-05-27 19:15 ` Jeff Garzik
2006-05-27 19:24 ` Mark Lord
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 19:15 UTC (permalink / raw)
To: Mark Lord; +Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Same as in my original posting:
>> On resume from RAM, after a 30-second-ish timeout, the screen comes on
>> but the hard disk is NOT accessible. "dmesg" in an already-open window
>> shows this (typed in from handwritten notes):
>>
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
>> sd 0:0:0:0: SCSI error: return code = 0x40000
>> end_request: I/O error, /dev/sda, sector nnnnnnn
what is in dmesg _before_ that, though?
> Also from the original posting:
>> My ata_piix based Notebook (Dell i9300) suspends/resumes perfectly
>> (RAM or disk)
>> with 2.6.16.xx kernels, but fails resume on 2.6.17-rc5-git1 (the first
>> 2.6.17-*
>> I've attempted on this machine).
>
> It did NOT work any prior unpatched kernels before 2.6.16.xx.
Thanks.
> Fix your mailer, please. Meanwhile, I'll try to extract the patch
> and apply it, but for informational purposes only.
> Linus's one-liner has us all beat.
I resent it inline. Stupid Thunderbird, its never done that before.
Sorry about that.
If mine works, it is preferred, because that's what we need for 2.6.18+.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 19:15 ` Jeff Garzik
@ 2006-05-27 19:24 ` Mark Lord
2006-05-27 20:24 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 19:24 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Mark Lord wrote:
>> Same as in my original posting:
>>> On resume from RAM, after a 30-second-ish timeout, the screen comes on
>>> but the hard disk is NOT accessible. "dmesg" in an already-open window
>>> shows this (typed in from handwritten notes):
>>>
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>>> sd 0:0:0:0: SCSI error: return code = 0x40000
>>> end_request: I/O error, /dev/sda, sector nnnnnnn
>
> what is in dmesg _before_ that, though?
Nothing useful --> only the suspend-time messages from USB and whatnot.
I'll crash it again now just to confirm that, though.
Give me another 10 minutes for the two reboots..
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 19:24 ` Mark Lord
@ 2006-05-27 20:24 ` Jens Axboe
0 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 20:24 UTC (permalink / raw)
To: Mark Lord
Cc: Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> Jeff Garzik wrote:
> >Mark Lord wrote:
> >>Same as in my original posting:
> >>>On resume from RAM, after a 30-second-ish timeout, the screen comes on
> >>>but the hard disk is NOT accessible. "dmesg" in an already-open window
> >>>shows this (typed in from handwritten notes):
> >>>
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >>>sd 0:0:0:0: SCSI error: return code = 0x40000
> >>>end_request: I/O error, /dev/sda, sector nnnnnnn
> >
> >what is in dmesg _before_ that, though?
>
> Nothing useful --> only the suspend-time messages from USB and whatnot.
>
> I'll crash it again now just to confirm that, though.
> Give me another 10 minutes for the two reboots..
I'll go over this thread when I have better connectivity, but I cna say
that the non-working solutions for me all involve the 0xef command first
issued failing. The scsi messages above is just a result of the device
being taken offline after failing, so not very interesting by
themselves.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 18:47 ` Linus Torvalds
2006-05-27 19:01 ` Jeff Garzik
2006-05-27 19:01 ` Mark Lord
@ 2006-05-27 20:45 ` Jens Axboe
2006-05-27 20:58 ` Jeff Garzik
2 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 20:45 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mark Lord, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Saturday 27 May 2006 20:47, Linus Torvalds wrote:
> On Sat, 27 May 2006, Mark Lord wrote:
> > The original "bad layering" patch still works perfectly in it's place.
> > Repeated below for Linus's benefit.
>
> Why isn't the right fix the minimal one?
>
> What's the layering violation in just having ATA resume make sure it's not
> ATA_BUSY?
>
> Why are you guys fighting over this?
>
> And why the hell is Mark's patch not being accepted if it fixes something,
> and the alternate patches do not?
>
> Linus
> ---
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..0ef4cf4 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
> */
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
This is fine with me, Jeff originally complained it was a layering violation.
Unless he really objects, I'd say go for that for 2.6.17 - well actually
moving it inside the ATA_FLAG_SUSPENDED case is clearly better.
I'll test it on my notebook right away.
Jens
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 20:45 ` Jens Axboe
@ 2006-05-27 20:58 ` Jeff Garzik
2006-05-27 21:11 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 20:58 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> This is fine with me, Jeff originally complained it was a layering violation.
> Unless he really objects, I'd say go for that for 2.6.17 - well actually
> moving it inside the ATA_FLAG_SUSPENDED case is clearly better.
It breaks 2.6.1[89] stuff, but whatever. Go ahead and apply Linus's
patch. I'll pick up the pieces post 2.6.17.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 20:58 ` Jeff Garzik
@ 2006-05-27 21:11 ` Jens Axboe
2006-05-27 21:17 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 21:11 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >This is fine with me, Jeff originally complained it was a layering
> >violation. Unless he really objects, I'd say go for that for 2.6.17 - well
> >actually moving it inside the ATA_FLAG_SUSPENDED case is clearly better.
>
> It breaks 2.6.1[89] stuff, but whatever. Go ahead and apply Linus's
> patch. I'll pick up the pieces post 2.6.17.
Linus' patch doesn't work for me, seems I still need a little delay
before waiting for BSY clear. I'm testing a small additional hack to the
pci resume function.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:11 ` Jens Axboe
@ 2006-05-27 21:17 ` Jeff Garzik
2006-05-27 21:20 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:17 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sat, May 27 2006, Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> This is fine with me, Jeff originally complained it was a layering
>>> violation. Unless he really objects, I'd say go for that for 2.6.17 - well
>>> actually moving it inside the ATA_FLAG_SUSPENDED case is clearly better.
>> It breaks 2.6.1[89] stuff, but whatever. Go ahead and apply Linus's
>> patch. I'll pick up the pieces post 2.6.17.
>
> Linus' patch doesn't work for me, seems I still need a little delay
> before waiting for BSY clear. I'm testing a small additional hack to the
> pci resume function.
If it works better for people, just doing an msleep() or mdelay() in
device suspend would be better than polling BSY.
A delay at least works for FIS-based controllers and PMs...
Post 2.6.17, a lot of work needs to go into actually re-initing the
hardware, since the current suspend resumes to silicon defaults rather
than BIOS defaults (and ata_piix doesn't have a controller reset).
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:17 ` Jeff Garzik
@ 2006-05-27 21:20 ` Jens Axboe
2006-05-27 21:23 ` Mark Lord
` (3 more replies)
0 siblings, 4 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 21:20 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sat, May 27 2006, Jeff Garzik wrote:
> >>Jens Axboe wrote:
> >>>This is fine with me, Jeff originally complained it was a layering
> >>>violation. Unless he really objects, I'd say go for that for 2.6.17 -
> >>>well actually moving it inside the ATA_FLAG_SUSPENDED case is clearly
> >>>better.
> >>It breaks 2.6.1[89] stuff, but whatever. Go ahead and apply Linus's
> >>patch. I'll pick up the pieces post 2.6.17.
> >
> >Linus' patch doesn't work for me, seems I still need a little delay
> >before waiting for BSY clear. I'm testing a small additional hack to the
> >pci resume function.
>
> If it works better for people, just doing an msleep() or mdelay() in
> device suspend would be better than polling BSY.
>
> A delay at least works for FIS-based controllers and PMs...
Ok, this one finally works reliably for me. I didn't experiment with
differnet sleep intervals, just picked 500msecs and it worked... The
ata_busy_wait() is _not_ enough for me, what seems to be happening is
that prematurely decides that BSY is cleared while it just hasn't been
asserted yet.
> Post 2.6.17, a lot of work needs to go into actually re-initing the
> hardware, since the current suspend resumes to silicon defaults rather
> than BIOS defaults (and ata_piix doesn't have a controller reset).
Definitely, and we need to cover a lot more controllers than just piix
and ahci. Being the selfish bastard that I am, I didn't care much about
other devices...
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..28878f4 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
+ ata_busy_wait(ap, ATA_BUSY, 200000);
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
}
@@ -4846,6 +4847,7 @@ int ata_pci_device_suspend(struct pci_de
int ata_pci_device_resume(struct pci_dev *pdev)
{
+ msleep(500);
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
pci_enable_device(pdev);
--
Jens Axboe
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:20 ` Jens Axboe
@ 2006-05-27 21:23 ` Mark Lord
2006-05-27 21:25 ` Jens Axboe
2006-05-27 21:30 ` Mark Lord
2006-05-27 21:24 ` Jeff Garzik
` (2 subsequent siblings)
3 siblings, 2 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 21:23 UTC (permalink / raw)
To: Jens Axboe
Cc: Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
..
> ata_busy_wait() is _not_ enough for me, what seems to be happening is
> that prematurely decides that BSY is cleared while it just hasn't been
> asserted yet.
..
Ahh.. at last, something that even makes sense!
This whole thing has seemed very fiddly and timing dependent,
and this discovery (above) explains it a bit.
Rebooting to try it here now..
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:20 ` Jens Axboe
2006-05-27 21:23 ` Mark Lord
@ 2006-05-27 21:24 ` Jeff Garzik
2006-05-27 21:26 ` Jens Axboe
` (2 more replies)
2006-05-27 21:38 ` Linus Torvalds
2006-05-27 21:50 ` Linus Torvalds
3 siblings, 3 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:24 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> Definitely, and we need to cover a lot more controllers than just piix
> and ahci. Being the selfish bastard that I am, I didn't care much about
> other devices...
For 2.6.17 that's practical in any case: ata_piix is the only upstream
driver with a suspend method.
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..28878f4 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
> }
> @@ -4846,6 +4847,7 @@ int ata_pci_device_suspend(struct pci_de
>
> int ata_pci_device_resume(struct pci_dev *pdev)
> {
> + msleep(500);
> pci_set_power_state(pdev, PCI_D0);
> pci_restore_state(pdev);
> pci_enable_device(pdev);
Does it work if you move msleep() below pci_restore_state()?
" " " " below pci_enable_device()?
I think the delay makes more sense after you wake up the controller...
The fact that the above patch works most likely indicates the device,
not the controller, needs the delay.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:23 ` Mark Lord
@ 2006-05-27 21:25 ` Jens Axboe
2006-05-27 21:30 ` Mark Lord
1 sibling, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 21:25 UTC (permalink / raw)
To: Mark Lord
Cc: Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> Jens Axboe wrote:
> ..
> >ata_busy_wait() is _not_ enough for me, what seems to be happening is
> >that prematurely decides that BSY is cleared while it just hasn't been
> >asserted yet.
> ..
>
> Ahh.. at last, something that even makes sense!
>
> This whole thing has seemed very fiddly and timing dependent,
> and this discovery (above) explains it a bit.
Yep, it should also make the 'better' fixes work I think, but I have no
real time to test that right now. I think what exposed the timing
problem is the part that Hugh originally uncovered.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:24 ` Jeff Garzik
@ 2006-05-27 21:26 ` Jens Axboe
2006-05-27 21:31 ` Mark Lord
2006-05-27 21:33 ` Jens Axboe
2 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 21:26 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >Definitely, and we need to cover a lot more controllers than just piix
> >and ahci. Being the selfish bastard that I am, I didn't care much about
> >other devices...
>
> For 2.6.17 that's practical in any case: ata_piix is the only upstream
> driver with a suspend method.
>
>
> >diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> >index fa476e7..28878f4 100644
> >--- a/drivers/scsi/libata-core.c
> >+++ b/drivers/scsi/libata-core.c
> >@@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
> > int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> > {
> > if (ap->flags & ATA_FLAG_SUSPENDED) {
> >+ ata_busy_wait(ap, ATA_BUSY, 200000);
> > ap->flags &= ~ATA_FLAG_SUSPENDED;
> > ata_set_mode(ap);
> > }
> >@@ -4846,6 +4847,7 @@ int ata_pci_device_suspend(struct pci_de
> >
> > int ata_pci_device_resume(struct pci_dev *pdev)
> > {
> >+ msleep(500);
> > pci_set_power_state(pdev, PCI_D0);
> > pci_restore_state(pdev);
> > pci_enable_device(pdev);
>
> Does it work if you move msleep() below pci_restore_state()?
> " " " " below pci_enable_device()?
>
> I think the delay makes more sense after you wake up the controller...
I'd be surprised if it didn't, let me do a reboot with it moved at the
bottom of ata_pci_device_resume(). brb...
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:23 ` Mark Lord
2006-05-27 21:25 ` Jens Axboe
@ 2006-05-27 21:30 ` Mark Lord
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 21:30 UTC (permalink / raw)
To: Mark Lord
Cc: Jens Axboe, Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo,
linux-ide
Mark Lord wrote:
> Jens Axboe wrote:
> ..
>> ata_busy_wait() is _not_ enough for me, what seems to be happening is
>> that prematurely decides that BSY is cleared while it just hasn't been
>> asserted yet.
> ..
> Rebooting to try it here now..
Works here (w/Jens' original placement of the delay)
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:24 ` Jeff Garzik
2006-05-27 21:26 ` Jens Axboe
@ 2006-05-27 21:31 ` Mark Lord
2006-05-27 21:32 ` Jeff Garzik
2006-05-27 21:33 ` Jens Axboe
2 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 21:31 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
..
>
> Does it work if you move msleep() below pci_restore_state()?
> " " " " below pci_enable_device()?
>
> I think the delay makes more sense after you wake up the controller...
>
> The fact that the above patch works most likely indicates the device,
> not the controller, needs the delay.
Another question here: I really would have expected the new EH code
to eventually recover from this, by resetting the chip/drive/whatever.
It doesn't.
??
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:31 ` Mark Lord
@ 2006-05-27 21:32 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:32 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Jeff Garzik wrote:
> ..
>>
>> Does it work if you move msleep() below pci_restore_state()?
>> " " " " below pci_enable_device()?
>>
>> I think the delay makes more sense after you wake up the controller...
>>
>> The fact that the above patch works most likely indicates the device,
>> not the controller, needs the delay.
>
> Another question here: I really would have expected the new EH code
> to eventually recover from this, by resetting the chip/drive/whatever.
>
> It doesn't.
>
> ??
Maybe that's because 2.6.17 doesn't have the new EH code? ;-)
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:24 ` Jeff Garzik
2006-05-27 21:26 ` Jens Axboe
2006-05-27 21:31 ` Mark Lord
@ 2006-05-27 21:33 ` Jens Axboe
2006-05-27 21:34 ` Jeff Garzik
2 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 21:33 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> >+ msleep(500);
> > pci_set_power_state(pdev, PCI_D0);
> > pci_restore_state(pdev);
> > pci_enable_device(pdev);
>
> Does it work if you move msleep() below pci_restore_state()?
> " " " " below pci_enable_device()?
Placement makes no difference here, it works after pci_enable_device()
as well. What matters is the delay coming before the ata_busy_wait().
You could just stuff it in front of that, if you wanted.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:33 ` Jens Axboe
@ 2006-05-27 21:34 ` Jeff Garzik
2006-05-27 21:37 ` Mark Lord
2006-05-27 21:41 ` Tejun Heo
0 siblings, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:34 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sat, May 27 2006, Jeff Garzik wrote:
>>> + msleep(500);
>>> pci_set_power_state(pdev, PCI_D0);
>>> pci_restore_state(pdev);
>>> pci_enable_device(pdev);
>> Does it work if you move msleep() below pci_restore_state()?
>> " " " " below pci_enable_device()?
>
> Placement makes no difference here, it works after pci_enable_device()
> as well. What matters is the delay coming before the ata_busy_wait().
> You could just stuff it in front of that, if you wanted.
That would increase the wait needlessly, since it would be then done
per-device, even though the devices spin up in parallel on ata_piix.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:34 ` Jeff Garzik
@ 2006-05-27 21:37 ` Mark Lord
2006-05-27 21:51 ` Jeff Garzik
2006-05-27 21:41 ` Tejun Heo
1 sibling, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 21:37 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Jens Axboe wrote:
>> On Sat, May 27 2006, Jeff Garzik wrote:
>>>> + msleep(500);
>>>> pci_set_power_state(pdev, PCI_D0);
>>>> pci_restore_state(pdev);
>>>> pci_enable_device(pdev);
>>> Does it work if you move msleep() below pci_restore_state()?
>>> " " " " below pci_enable_device()?
>>
>> Placement makes no difference here, it works after pci_enable_device()
>> as well. What matters is the delay coming before the ata_busy_wait().
>> You could just stuff it in front of that, if you wanted.
>
> That would increase the wait needlessly, since it would be then done
> per-device, even though the devices spin up in parallel on ata_piix.
Okay, so 2/3 of use have working machines again now.
How is your beast doing with this tweak, Jeff?
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:20 ` Jens Axboe
2006-05-27 21:23 ` Mark Lord
2006-05-27 21:24 ` Jeff Garzik
@ 2006-05-27 21:38 ` Linus Torvalds
2006-05-27 21:50 ` Jeff Garzik
2006-05-27 21:50 ` Linus Torvalds
3 siblings, 1 reply; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 21:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jeff Garzik, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Jens Axboe wrote:
> @@ -4846,6 +4847,7 @@ int ata_pci_device_suspend(struct pci_de
>
> int ata_pci_device_resume(struct pci_dev *pdev)
> {
> + msleep(500);
> pci_set_power_state(pdev, PCI_D0);
That's just insane. Why would we need a delay before going to D0?
Especially a long one like half a second?
There's something else going on here, and that delay must be just hiding
the real issue.
Looking at ata_device_resume(), I think the whole thing is pretty broken.
Just as an example, it calls "ata_set_mode()", but that sounds pretty damn
broken, and it should probably do
if (ap->ops->set_mode)
ap->ops->set_mode(ap);
else
ata_set_mode(ap);
like ata_bus_probe() does. Similarly, why shouldn't it do the
probe_reset/phy_reset that is also done in ata_bus_probe()? If it was
required in ata_bus_probe(), it sounds like it would damn well be required
at resume time too - from a hw perspective, there should really be _no_
difference between the initial bootup, and a resume event. The hardware is
in the same state.
(From a _software_ perspective, it's different, of course - one does
discovery, while the other might instead try to see if the state _matches_
what it already knows. But the point is that coming back from power-off
after a resume should really not be any different than coming back from
power-off after a bootup)
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:34 ` Jeff Garzik
2006-05-27 21:37 ` Mark Lord
@ 2006-05-27 21:41 ` Tejun Heo
2006-05-27 21:45 ` Jeff Garzik
1 sibling, 1 reply; 100+ messages in thread
From: Tejun Heo @ 2006-05-27 21:41 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, Mark Lord, zhao, forrest, linux-ide
Jeff Garzik wrote:
> Jens Axboe wrote:
>> On Sat, May 27 2006, Jeff Garzik wrote:
>>>> + msleep(500);
>>>> pci_set_power_state(pdev, PCI_D0);
>>>> pci_restore_state(pdev);
>>>> pci_enable_device(pdev);
>>> Does it work if you move msleep() below pci_restore_state()?
>>> " " " " below pci_enable_device()?
>>
>> Placement makes no difference here, it works after pci_enable_device()
>> as well. What matters is the delay coming before the ata_busy_wait().
>> You could just stuff it in front of that, if you wanted.
>
> That would increase the wait needlessly, since it would be then done
> per-device, even though the devices spin up in parallel on ata_piix.
>
This sounds a lot like hotplug spinup-wait case. Are those harddisks
PATA or SATA? Also, I think this can be handled better by EH. ie.
register it as hotplug event -> wait for spin up if necessary -> reset &
revalidate.
--
tejun
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:41 ` Tejun Heo
@ 2006-05-27 21:45 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:45 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Linus Torvalds, Mark Lord, zhao, forrest, linux-ide
Tejun Heo wrote:
> This sounds a lot like hotplug spinup-wait case. Are those harddisks
> PATA or SATA? Also, I think this can be handled better by EH. ie.
> register it as hotplug event -> wait for spin up if necessary -> reset &
> revalidate.
Ultimately this is all a 2.6.17 temp fix that disappears immediately
with current #upstream work...
So we get to go through all these tests all over again :)
I agree your EH stuff will come in quite handy, but also it should be
noted that my call to ata_bus_probe() is wrong -- you want to IDLE
IMMEDIATE before doing some other stuff.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:38 ` Linus Torvalds
@ 2006-05-27 21:50 ` Jeff Garzik
2006-05-27 21:57 ` Linus Torvalds
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:50 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jens Axboe, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Jens Axboe wrote:
>> @@ -4846,6 +4847,7 @@ int ata_pci_device_suspend(struct pci_de
>>
>> int ata_pci_device_resume(struct pci_dev *pdev)
>> {
>> + msleep(500);
>> pci_set_power_state(pdev, PCI_D0);
>
> That's just insane. Why would we need a delay before going to D0?
> Especially a long one like half a second?
>
> There's something else going on here, and that delay must be just hiding
> the real issue.
It is. But I thought you wanted something that works? :)
Your patch + Jens' patch [with the delay MOVED to the end] would get my
ACK for 2.6.17, and we already have infrastructure queued for 2.6.18 to
do a better job of kicking the controller and bus.
> Looking at ata_device_resume(), I think the whole thing is pretty broken.
>
> Just as an example, it calls "ata_set_mode()", but that sounds pretty damn
> broken, and it should probably do
>
> if (ap->ops->set_mode)
> ap->ops->set_mode(ap);
> else
> ata_set_mode(ap);
Correct.
Furthermore, we probably need to issue IDLE IMMEDIATE command ("wake
up") before SET FEATURES - XFER MODE.
> like ata_bus_probe() does. Similarly, why shouldn't it do the
> probe_reset/phy_reset that is also done in ata_bus_probe()? If it was
It should, most definitely. Otherwise we skip initializing the
controller and phy (a key objection of mine all along).
> required in ata_bus_probe(), it sounds like it would damn well be required
> at resume time too - from a hw perspective, there should really be _no_
> difference between the initial bootup, and a resume event. The hardware is
> in the same state.
Correct, correct, correct :)
> (From a _software_ perspective, it's different, of course - one does
> discovery, while the other might instead try to see if the state _matches_
> what it already knows. But the point is that coming back from power-off
> after a resume should really not be any different than coming back from
> power-off after a bootup)
Key difference: we have no BIOS to give us sane hardware state, so the
controller is in a different state from bootup.
Lacking controller init (you mention this above), we basically have raw
silicon defaults.
We are really debugging from scratch here, and there's a lot of work to do.
Just apply the delay patch for 2.6.17, Mr. Get-It-Working ;-) Else dive
into the big can of worms and give us time to fix it right...
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:20 ` Jens Axboe
` (2 preceding siblings ...)
2006-05-27 21:38 ` Linus Torvalds
@ 2006-05-27 21:50 ` Linus Torvalds
2006-05-27 21:53 ` Jeff Garzik
2006-05-27 22:06 ` Mark Lord
3 siblings, 2 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 21:50 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jeff Garzik, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Jens Axboe wrote:
> > hardware, since the current suspend resumes to silicon defaults rather
> > than BIOS defaults (and ata_piix doesn't have a controller reset).
>
> Definitely, and we need to cover a lot more controllers than just piix
> and ahci. Being the selfish bastard that I am, I didn't care much about
> other devices...
>
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..28878f4 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> + ata_busy_wait(ap, ATA_BUSY, 200000);
Btw, it might not be enough to wait for ATA_BUSY to clear.
I think you should wait for ATA_DRDY to be set too.
What does it say if you change that "ata_busy_wait()" to
u8 status = ata_wait_idle(ap);
if (!ata_ok(status))
printk("ATA status %02x\n", status);
instead? That should print out the status (unless it works, which it
apparently won't do for you without the msleep), and maybe give us some
clues about what the f* is going on with the device status thing.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:37 ` Mark Lord
@ 2006-05-27 21:51 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:51 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> On Sat, May 27 2006, Jeff Garzik wrote:
>>>>> + msleep(500);
>>>>> pci_set_power_state(pdev, PCI_D0);
>>>>> pci_restore_state(pdev);
>>>>> pci_enable_device(pdev);
>>>> Does it work if you move msleep() below pci_restore_state()?
>>>> " " " " below pci_enable_device()?
>>>
>>> Placement makes no difference here, it works after pci_enable_device()
>>> as well. What matters is the delay coming before the ata_busy_wait().
>>> You could just stuff it in front of that, if you wanted.
>>
>> That would increase the wait needlessly, since it would be then done
>> per-device, even though the devices spin up in parallel on ata_piix.
>
> Okay, so 2/3 of use have working machines again now.
>
> How is your beast doing with this tweak, Jeff?
I think my beast needs full controller init, the new patch still
hardlocks. Ignore it for 2.6.17 I think...
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:50 ` Linus Torvalds
@ 2006-05-27 21:53 ` Jeff Garzik
2006-05-27 22:14 ` Linus Torvalds
2006-05-27 22:06 ` Mark Lord
1 sibling, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 21:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jens Axboe, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
> u8 status = ata_wait_idle(ap);
Vastly decreased wait probably won't work, since we had to increase the
wait otherwise.
> if (!ata_ok(status))
Correct (re DRQ etc.).
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:50 ` Jeff Garzik
@ 2006-05-27 21:57 ` Linus Torvalds
2006-05-27 22:11 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 21:57 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Jeff Garzik wrote:
>
> It is. But I thought you wanted something that works? :)
I want something that can be _understood_ to work, and that clearly
doesn't have any downsides.
Waiting for the target to not be BUSY at resume is such an obviously
understandable thing that it's not even funny.
In contrast, having an unconditional half-second delay for each SATA port
is not obvious at all, and has potentially huge downsides for totally
unrelated users.
> Your patch + Jens' patch [with the delay MOVED to the end] would get my ACK
> for 2.6.17, and we already have infrastructure queued for 2.6.18 to do a
> better job of kicking the controller and bus.
No way am I adding a random half-second wait. One _millisecond_ or
something similar clearly won't impact anybody else. But half a second for
a condition that people don't even understand?
> > discovery, while the other might instead try to see if the state _matches_
> > what it already knows. But the point is that coming back from power-off
> > after a resume should really not be any different than coming back from
> > power-off after a bootup)
>
> Key difference: we have no BIOS to give us sane hardware state, so the
> controller is in a different state from bootup.
Note that that is true sometimes on other platforms too, even at bootup.
Maybe we've not had that issue with SATA, but we've definitely had it with
just about any other hw config.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:50 ` Linus Torvalds
2006-05-27 21:53 ` Jeff Garzik
@ 2006-05-27 22:06 ` Mark Lord
2006-05-27 22:11 ` Jens Axboe
2006-05-27 22:18 ` Linus Torvalds
1 sibling, 2 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:06 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Jens Axboe wrote:
>>> hardware, since the current suspend resumes to silicon defaults rather
>>> than BIOS defaults (and ata_piix doesn't have a controller reset).
>> Definitely, and we need to cover a lot more controllers than just piix
>> and ahci. Being the selfish bastard that I am, I didn't care much about
>> other devices...
>>
>> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
>> index fa476e7..28878f4 100644
>> --- a/drivers/scsi/libata-core.c
>> +++ b/drivers/scsi/libata-core.c
>> @@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>> {
>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>> + ata_busy_wait(ap, ATA_BUSY, 200000);
>
> Btw, it might not be enough to wait for ATA_BUSY to clear.
>
> I think you should wait for ATA_DRDY to be set too.
>
> What does it say if you change that "ata_busy_wait()" to
>
> u8 status = ata_wait_idle(ap);
> if (!ata_ok(status))
> printk("ATA status %02x\n", status);
I got rid of the 500msec delay (not needed on my machine anyway),
and then added the above code *in front of* the ata_busy_wait().
Resume worked, with this output:
May 27 18:03:33 localhost kernel: ATA status 80
May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
May 27 18:03:33 localhost kernel: ATA status 00
May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
Note that the second set are for the DVD-RW drive,
which (being ATAPI) doesn't assert ATA_DRDY.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:57 ` Linus Torvalds
@ 2006-05-27 22:11 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 22:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jens Axboe, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Jeff Garzik wrote:
>> It is. But I thought you wanted something that works? :)
>
> I want something that can be _understood_ to work, and that clearly
> doesn't have any downsides.
>
> Waiting for the target to not be BUSY at resume is such an obviously
> understandable thing that it's not even funny.
s/target/bus/, agreed.
Remember there might be two devices attached (master/slave), so for the
master/slave case we will wait for BSY twice, both times with the master
selected.
> In contrast, having an unconditional half-second delay for each SATA port
> is not obvious at all, and has potentially huge downsides for totally
> unrelated users.
Incorrect: half-second for each _controller_.
And the only users affected by the delay are ata_piix users doing resume.
Side note -- do you notice that the suspend is really hardcoded for
ata_piix and Intel laptops at the moment? Note ata_set_mode() versus
->set_mode(), complete lack of controller re-init, etc.
>> Your patch + Jens' patch [with the delay MOVED to the end] would get my ACK
>> for 2.6.17, and we already have infrastructure queued for 2.6.18 to do a
>> better job of kicking the controller and bus.
>
> No way am I adding a random half-second wait. One _millisecond_ or
> something similar clearly won't impact anybody else. But half a second for
> a condition that people don't even understand?
Normally I would agree, but the delay would be immediately removed for
2.6.18. libata-dev.git#upstream has reorganized the probe code to
better support this case as well as hotplug. And in general #upstream
recovers from errors A LOT better.
So AFAICS there are three paths to success:
1) Apply Jens' patch for 2.6.17, then undo it in 2.6.18 when #upstream
is merged.
2) Copy a bunch of code into pci_driver::resume and more into
ata_device_resume(), to do controller and phy init. And then remove the
copied code in 2.6.18 when #upstream is merged.
3) Pull #upstream now, and give us another -rc to fix things.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:06 ` Mark Lord
@ 2006-05-27 22:11 ` Jens Axboe
2006-05-27 22:13 ` Jeff Garzik
` (2 more replies)
2006-05-27 22:18 ` Linus Torvalds
1 sibling, 3 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:11 UTC (permalink / raw)
To: Mark Lord
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> Linus Torvalds wrote:
> >
> >On Sat, 27 May 2006, Jens Axboe wrote:
> >>>hardware, since the current suspend resumes to silicon defaults rather
> >>>than BIOS defaults (and ata_piix doesn't have a controller reset).
> >>Definitely, and we need to cover a lot more controllers than just piix
> >>and ahci. Being the selfish bastard that I am, I didn't care much about
> >>other devices...
> >>
> >>diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> >>index fa476e7..28878f4 100644
> >>--- a/drivers/scsi/libata-core.c
> >>+++ b/drivers/scsi/libata-core.c
> >>@@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
> >> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> >> {
> >> if (ap->flags & ATA_FLAG_SUSPENDED) {
> >>+ ata_busy_wait(ap, ATA_BUSY, 200000);
> >
> >Btw, it might not be enough to wait for ATA_BUSY to clear.
> >
> >I think you should wait for ATA_DRDY to be set too.
> >
> >What does it say if you change that "ata_busy_wait()" to
> >
> > u8 status = ata_wait_idle(ap);
> > if (!ata_ok(status))
> > printk("ATA status %02x\n", status);
>
> I got rid of the 500msec delay (not needed on my machine anyway),
> and then added the above code *in front of* the ata_busy_wait().
>
> Resume worked, with this output:
>
> May 27 18:03:33 localhost kernel: ATA status 80
> May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
> May 27 18:03:33 localhost kernel: ATA status 00
> May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
>
> Note that the second set are for the DVD-RW drive,
> which (being ATAPI) doesn't assert ATA_DRDY.
Works for me too, I changed the delay to match the 200000 above just to
be on the safe side.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:11 ` Jens Axboe
@ 2006-05-27 22:13 ` Jeff Garzik
2006-05-27 22:15 ` Jens Axboe
2006-05-27 22:15 ` Mark Lord
2006-05-27 22:21 ` Linus Torvalds
2 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 22:13 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mark Lord, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sat, May 27 2006, Mark Lord wrote:
>> Linus Torvalds wrote:
>>> On Sat, 27 May 2006, Jens Axboe wrote:
>>>>> hardware, since the current suspend resumes to silicon defaults rather
>>>>> than BIOS defaults (and ata_piix doesn't have a controller reset).
>>>> Definitely, and we need to cover a lot more controllers than just piix
>>>> and ahci. Being the selfish bastard that I am, I didn't care much about
>>>> other devices...
>>>>
>>>> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
>>>> index fa476e7..28878f4 100644
>>>> --- a/drivers/scsi/libata-core.c
>>>> +++ b/drivers/scsi/libata-core.c
>>>> @@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
>>>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>>>> {
>>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>>>> + ata_busy_wait(ap, ATA_BUSY, 200000);
>>> Btw, it might not be enough to wait for ATA_BUSY to clear.
>>>
>>> I think you should wait for ATA_DRDY to be set too.
>>>
>>> What does it say if you change that "ata_busy_wait()" to
>>>
>>> u8 status = ata_wait_idle(ap);
>>> if (!ata_ok(status))
>>> printk("ATA status %02x\n", status);
>> I got rid of the 500msec delay (not needed on my machine anyway),
>> and then added the above code *in front of* the ata_busy_wait().
>>
>> Resume worked, with this output:
>>
>> May 27 18:03:33 localhost kernel: ATA status 80
>> May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
>> May 27 18:03:33 localhost kernel: ATA status 00
>> May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
>>
>> Note that the second set are for the DVD-RW drive,
>> which (being ATAPI) doesn't assert ATA_DRDY.
>
> Works for me too, I changed the delay to match the 200000 above just to
> be on the safe side.
Well a status of 0x80 isn't good, noted above, indicates BSY is still
asserted.....
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 21:53 ` Jeff Garzik
@ 2006-05-27 22:14 ` Linus Torvalds
0 siblings, 0 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 22:14 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Jeff Garzik wrote:
>
> Linus Torvalds wrote:
> > u8 status = ata_wait_idle(ap);
>
> Vastly decreased wait probably won't work, since we had to increase the wait
> otherwise.
>
> > if (!ata_ok(status))
>
> Correct (re DRQ etc.).
I actually just want to see what the bits are here.
In other words, we already have a good solution for Mark's problem. I
don't think the suggested solution on top of that for Jens is any good,
and I'd like to see what bits are set for Jens here.
So I _expect_ (or rather, hope) it to fail for his setup.
Jens doesn't seem to have the ATA_BUSY bit set, because if he did, then
the ata_busy_wait() thing should basically have done the wait for him.
That's why I did the wait-idle() + ata_ok() check. It should all work
beautifully on machines that didn't have trouble before - but it should
hopefully print out something interesting for machines that are potential
trouble-spots.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:11 ` Jens Axboe
2006-05-27 22:13 ` Jeff Garzik
@ 2006-05-27 22:15 ` Mark Lord
2006-05-27 22:17 ` Jens Axboe
2006-05-27 22:21 ` Linus Torvalds
2 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:15 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sat, May 27 2006, Mark Lord wrote:
>
>> Resume worked, with this output:
>>
>> May 27 18:03:33 localhost kernel: ATA status 80
>> May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
>> May 27 18:03:33 localhost kernel: ATA status 00
>> May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
>>
>> Note that the second set are for the DVD-RW drive,
>> which (being ATAPI) doesn't assert ATA_DRDY.
>
> Works for me too, I changed the delay to match the 200000 above just to
> be on the safe side.
But what was the kernel output for the status value?
It's important here --> the printk() probably introduced enough delay
to compensate for the removal of your other delay.
You did remove the other delay, right?
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:13 ` Jeff Garzik
@ 2006-05-27 22:15 ` Jens Axboe
0 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:15 UTC (permalink / raw)
To: Jeff Garzik
Cc: Mark Lord, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sat, May 27 2006, Mark Lord wrote:
> >>Linus Torvalds wrote:
> >>>On Sat, 27 May 2006, Jens Axboe wrote:
> >>>>>hardware, since the current suspend resumes to silicon defaults rather
> >>>>>than BIOS defaults (and ata_piix doesn't have a controller reset).
> >>>>Definitely, and we need to cover a lot more controllers than just piix
> >>>>and ahci. Being the selfish bastard that I am, I didn't care much about
> >>>>other devices...
> >>>>
> >>>>diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> >>>>index fa476e7..28878f4 100644
> >>>>--- a/drivers/scsi/libata-core.c
> >>>>+++ b/drivers/scsi/libata-core.c
> >>>>@@ -4297,6 +4297,7 @@ static int ata_start_drive(struct ata_po
> >>>>int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> >>>>{
> >>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
> >>>>+ ata_busy_wait(ap, ATA_BUSY, 200000);
> >>>Btw, it might not be enough to wait for ATA_BUSY to clear.
> >>>
> >>>I think you should wait for ATA_DRDY to be set too.
> >>>
> >>>What does it say if you change that "ata_busy_wait()" to
> >>>
> >>> u8 status = ata_wait_idle(ap);
> >>> if (!ata_ok(status))
> >>> printk("ATA status %02x\n", status);
> >>I got rid of the 500msec delay (not needed on my machine anyway),
> >>and then added the above code *in front of* the ata_busy_wait().
> >>
> >>Resume worked, with this output:
> >>
> >>May 27 18:03:33 localhost kernel: ATA status 80
> >>May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
> >>May 27 18:03:33 localhost kernel: ATA status 00
> >>May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
> >>
> >>Note that the second set are for the DVD-RW drive,
> >>which (being ATAPI) doesn't assert ATA_DRDY.
> >
> >Works for me too, I changed the delay to match the 200000 above just to
> >be on the safe side.
>
> Well a status of 0x80 isn't good, noted above, indicates BSY is still
> asserted.....
Just commenting on how it worked for me, no issues. And I don't see any
ATA status messages.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:15 ` Mark Lord
@ 2006-05-27 22:17 ` Jens Axboe
0 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:17 UTC (permalink / raw)
To: Mark Lord
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> Jens Axboe wrote:
> >On Sat, May 27 2006, Mark Lord wrote:
> >
> >>Resume worked, with this output:
> >>
> >>May 27 18:03:33 localhost kernel: ATA status 80
> >>May 27 18:03:33 localhost kernel: ata1: dev 0 configured for UDMA/100
> >>May 27 18:03:33 localhost kernel: ATA status 00
> >>May 27 18:03:33 localhost kernel: ata2: dev 0 configured for UDMA/33
> >>
> >>Note that the second set are for the DVD-RW drive,
> >>which (being ATAPI) doesn't assert ATA_DRDY.
> >
> >Works for me too, I changed the delay to match the 200000 above just to
> >be on the safe side.
>
>
> But what was the kernel output for the status value?
None, I got no output. I resumed 4 times here, worked every time and no
ATA status messages.
> It's important here --> the printk() probably introduced enough delay
> to compensate for the removal of your other delay.
Not when there's no printk :)
> You did remove the other delay, right?
of course, in fact it resumes noticably faster.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:06 ` Mark Lord
2006-05-27 22:11 ` Jens Axboe
@ 2006-05-27 22:18 ` Linus Torvalds
2006-05-27 22:23 ` Mark Lord
1 sibling, 1 reply; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 22:18 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Mark Lord wrote:
>
> I got rid of the 500msec delay (not needed on my machine anyway),
> and then added the above code *in front of* the ata_busy_wait().
>
> Resume worked, with this output:
>
> May 27 18:03:33 localhost kernel: ATA status 80
Ok, we already knew you had ATA_BUSY set, so bit 7 being set isn't
surprising at all.
> May 27 18:03:33 localhost kernel: ATA status 00
>
> Note that the second set are for the DVD-RW drive,
> which (being ATAPI) doesn't assert ATA_DRDY.
Neither of them do. 80 is just your old ATA_BUSY (that you needed to wait
a long time for to disappear).
Can you move the two new lines to *after* the ata_busy_wait? I'd like to
see if ATA_DRDY ever comes on (it quite possibly will not, just humor me)
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:11 ` Jens Axboe
2006-05-27 22:13 ` Jeff Garzik
2006-05-27 22:15 ` Mark Lord
@ 2006-05-27 22:21 ` Linus Torvalds
2006-05-27 22:29 ` Mark Lord
2006-05-27 22:35 ` [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Jens Axboe
2 siblings, 2 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-27 22:21 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mark Lord, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sun, 28 May 2006, Jens Axboe wrote:
>
> Works for me too, I changed the delay to match the 200000 above just to
> be on the safe side.
That's insane. That was _not_ what I expected. That wasn't actually
supposed to fix anything at all for somebody who had problems.
Maybe you didn't remove the half-second delay (which is why it still works
for you)? I'd like to see what bits are set when it _isn't_ there, to see
if there's something we can trigger on..
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:18 ` Linus Torvalds
@ 2006-05-27 22:23 ` Mark Lord
2006-05-27 22:43 ` Mark Lord
2006-05-28 0:13 ` Linus Torvalds
0 siblings, 2 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Mark Lord wrote:
>
>> Note that the second set are for the DVD-RW drive,
>> which (being ATAPI) doesn't assert ATA_DRDY.
>
> Neither of them do. 80 is just your old ATA_BUSY (that you needed to wait
> a long time for to disappear).
Yeah, I know that. Just pointing it out because the original
discussion suggested we might want the code to also wait for DRDY,
which would be correct only for ATA disks.
> Can you move the two new lines to *after* the ata_busy_wait? I'd like to
> see if ATA_DRDY ever comes on (it quite possibly will not, just humor me)
Okay. rebooting now..
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:21 ` Linus Torvalds
@ 2006-05-27 22:29 ` Mark Lord
2006-05-27 22:36 ` Jens Axboe
2006-05-27 22:35 ` [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Jens Axboe
1 sibling, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sun, 28 May 2006, Jens Axboe wrote:
>> Works for me too, I changed the delay to match the 200000 above just to
>> be on the safe side.
>
> That's insane. That was _not_ what I expected. That wasn't actually
> supposed to fix anything at all for somebody who had problems.
The newly added ata_wait_idle() call waits on both BUSY and DRQ,
so perhaps the wait on DRQ was what "fixed" it this time for Jens?
-ml
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:21 ` Linus Torvalds
2006-05-27 22:29 ` Mark Lord
@ 2006-05-27 22:35 ` Jens Axboe
2006-05-27 22:52 ` Jeff Garzik
1 sibling, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:35 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mark Lord, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Linus Torvalds wrote:
>
>
> On Sun, 28 May 2006, Jens Axboe wrote:
> >
> > Works for me too, I changed the delay to match the 200000 above just to
> > be on the safe side.
>
> That's insane. That was _not_ what I expected. That wasn't actually
> supposed to fix anything at all for somebody who had problems.
I realize that, but that is what happens. And I don't think it's a
fluke, it's pretty much 100% reliably failed for me in the later 2.6.17
kernels and this one did 5 resume cycles without even complaining.
> Maybe you didn't remove the half-second delay (which is why it still works
> for you)? I'd like to see what bits are set when it _isn't_ there, to see
> if there's something we can trigger on..
The 500msec delay is gone, I can attest to that. This is a straight diff
of what I booted and which apparently works:
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..526d1e1 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4297,6 +4297,13 @@ static int ata_start_drive(struct ata_po
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
+#if 0
+ ata_busy_wait(ap, ATA_BUSY, 200000);
+#else
+ u8 stat = ata_wait_idle(ap);
+ if (!ata_ok(stat))
+ printk("ATA status %02x\n", stat);
+#endif
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
}
@@ -4850,6 +4857,7 @@ int ata_pci_device_resume(struct pci_dev
pci_restore_state(pdev);
pci_enable_device(pdev);
pci_set_master(pdev);
+ //msleep(500);
return 0;
}
#endif /* CONFIG_PCI */
diff --git a/include/linux/libata.h b/include/linux/libata.h
index b80d2e7..80fb1f9 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
static inline u8 ata_wait_idle(struct ata_port *ap)
{
- u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
+ u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
if (status & (ATA_BUSY | ATA_DRQ)) {
unsigned long l = ap->ioaddr.status_addr;
--
Jens Axboe
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:29 ` Mark Lord
@ 2006-05-27 22:36 ` Jens Axboe
2006-05-27 22:48 ` Mark Lord
0 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:36 UTC (permalink / raw)
To: Mark Lord
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> Linus Torvalds wrote:
> >
> >On Sun, 28 May 2006, Jens Axboe wrote:
> >>Works for me too, I changed the delay to match the 200000 above just to
> >>be on the safe side.
> >
> >That's insane. That was _not_ what I expected. That wasn't actually
> >supposed to fix anything at all for somebody who had problems.
>
> The newly added ata_wait_idle() call waits on both BUSY and DRQ,
> so perhaps the wait on DRQ was what "fixed" it this time for Jens?
It must be, unless the changed delay makes difference (unlikely).
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:23 ` Mark Lord
@ 2006-05-27 22:43 ` Mark Lord
2006-05-28 0:13 ` Linus Torvalds
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:43 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Linus Torvalds wrote:
..
>> Can you move the two new lines to *after* the ata_busy_wait? I'd like
>> to see if ATA_DRDY ever comes on (it quite possibly will not, just
>> humor me)
>
> Okay. rebooting now..
..
Okay, I used a patch like this on top of what we had before (warning: broken whitespace):
--- linux/drivers/scsi/libata-core.c.saved 2006-05-27 17:22:42.000000000 -0400
+++ linux/drivers/scsi/libata-core.c 2006-05-27 18:35:02.000000000 -0400
@@ -4297,7 +4297,11 @@
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
+ u8 status1, status2;
+ status1 = ata_chk_status(ap);
ata_busy_wait(ap, ATA_BUSY, 200000);
+ status2 = ata_chk_status(ap);
+ printk("ATA status1=%02x status2=%02x\n", status1, status2);
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
}
---------------------------------------------------------------------------
The output it gave me was:
May 27 18:39:12 localhost kernel: ATA status1=80 status2=50
May 27 18:39:12 localhost kernel: ata1: dev 0 configured for UDMA/100
May 27 18:39:12 localhost kernel: ATA status1=00 status2=00
May 27 18:39:12 localhost kernel: ata2: dev 0 configured for UDMA/33
So my drive, at least, seems to give DRDY very rapidly after de-asserting BUSY.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:36 ` Jens Axboe
@ 2006-05-27 22:48 ` Mark Lord
2006-05-27 22:53 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:48 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Saturday 27 May 2006 18:36, Jens Axboe wrote:
> On Sat, May 27 2006, Mark Lord wrote:
> > Linus Torvalds wrote:
> > >
> > >On Sun, 28 May 2006, Jens Axboe wrote:
> > >>Works for me too, I changed the delay to match the 200000 above just to
> > >>be on the safe side.
> > >
> > >That's insane. That was _not_ what I expected. That wasn't actually
> > >supposed to fix anything at all for somebody who had problems.
> >
> > The newly added ata_wait_idle() call waits on both BUSY and DRQ,
> > so perhaps the wait on DRQ was what "fixed" it this time for Jens?
>
> It must be, unless the changed delay makes difference (unlikely).
>
So a better trial for you (to answer Linus) might be this patch:
---
--- stock-2.6.17-rc4-git1/drivers/scsi/libata-core.c 2006-05-27 16:58:05.000000000 -0400
+++ linux/drivers/scsi/libata-core.c 2006-05-27 18:35:02.000000000 -0400
@@ -4297,6 +4297,11 @@
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
+ u8 status1, status2;
+ status1 = ata_chk_status(ap);
+ ata_busy_wait(ap, ATA_BUSY, 200000);
+ status2 = ata_chk_status(ap);
+ printk("ATA status1=%02x status2=%02x\n", status1, status2);
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
}
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:35 ` [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Jens Axboe
@ 2006-05-27 22:52 ` Jeff Garzik
2006-05-27 22:54 ` Jens Axboe
2006-05-27 22:56 ` Mark Lord
0 siblings, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 22:52 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> index b80d2e7..80fb1f9 100644
> --- a/include/linux/libata.h
> +++ b/include/linux/libata.h
> @@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
>
> static inline u8 ata_wait_idle(struct ata_port *ap)
> {
> - u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
> + u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
Ouch. That will slow down boot for a lot of otherwise-untouched users,
if that makes it in... :(
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:48 ` Mark Lord
@ 2006-05-27 22:53 ` Jens Axboe
2006-05-27 22:55 ` Jeff Garzik
2006-05-27 23:10 ` Mark Lord
0 siblings, 2 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:53 UTC (permalink / raw)
To: Mark Lord
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Mark Lord wrote:
> On Saturday 27 May 2006 18:36, Jens Axboe wrote:
> > On Sat, May 27 2006, Mark Lord wrote:
> > > Linus Torvalds wrote:
> > > >
> > > >On Sun, 28 May 2006, Jens Axboe wrote:
> > > >>Works for me too, I changed the delay to match the 200000 above just to
> > > >>be on the safe side.
> > > >
> > > >That's insane. That was _not_ what I expected. That wasn't actually
> > > >supposed to fix anything at all for somebody who had problems.
> > >
> > > The newly added ata_wait_idle() call waits on both BUSY and DRQ,
> > > so perhaps the wait on DRQ was what "fixed" it this time for Jens?
> >
> > It must be, unless the changed delay makes difference (unlikely).
> >
>
> So a better trial for you (to answer Linus) might be this patch:
>
> ---
> --- stock-2.6.17-rc4-git1/drivers/scsi/libata-core.c 2006-05-27 16:58:05.000000000 -0400
> +++ linux/drivers/scsi/libata-core.c 2006-05-27 18:35:02.000000000 -0400
> @@ -4297,6 +4297,11 @@
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> + u8 status1, status2;
> + status1 = ata_chk_status(ap);
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + status2 = ata_chk_status(ap);
> + printk("ATA status1=%02x status2=%02x\n", status1, status2);
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
> }
Inspired by your last posting, I already booted this:
+ u8 status1, status2, status3;
+ status1 = ata_chk_status(ap);
+ ata_busy_wait(ap, ATA_BUSY, 200000);
+ status2 = ata_chk_status(ap);
+ status3 = ata_wait_idle(ap);
+ printk("s 1/2/3 %x/%x/%x\n", status1, status2, status3);
with the rest being as posted before. It gives me:
s 1/2/3 80/50/50
on resume.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:52 ` Jeff Garzik
@ 2006-05-27 22:54 ` Jens Axboe
2006-05-27 23:06 ` Jens Axboe
2006-05-27 22:56 ` Mark Lord
1 sibling, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 22:54 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sat, May 27 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >index b80d2e7..80fb1f9 100644
> >--- a/include/linux/libata.h
> >+++ b/include/linux/libata.h
> >@@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
> >
> > static inline u8 ata_wait_idle(struct ata_port *ap)
> > {
> >- u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
> >+ u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>
>
> Ouch. That will slow down boot for a lot of otherwise-untouched users,
> if that makes it in... :(
Not proposing it for inclusion, just posted what I had in entirety to
avoid confusion. I made that change to potentially avoid recompiling
again for retesting. I can test without that too, I guess I will now
because we have no clear results so far.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:53 ` Jens Axboe
@ 2006-05-27 22:55 ` Jeff Garzik
2006-05-27 23:10 ` Mark Lord
1 sibling, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 22:55 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mark Lord, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> Inspired by your last posting, I already booted this:
>
> + u8 status1, status2, status3;
> + status1 = ata_chk_status(ap);
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + status2 = ata_chk_status(ap);
> + status3 = ata_wait_idle(ap);
> + printk("s 1/2/3 %x/%x/%x\n", status1, status2, status3);
>
> with the rest being as posted before. It gives me:
>
> s 1/2/3 80/50/50
Yep, that's consistent with ata_piix's PATA emulation during device
discovery.
DRDY on this hardware is 100% emulated...
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:52 ` Jeff Garzik
2006-05-27 22:54 ` Jens Axboe
@ 2006-05-27 22:56 ` Mark Lord
2006-05-27 23:03 ` Jeff Garzik
1 sibling, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 22:56 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Jens Axboe wrote:
>> index b80d2e7..80fb1f9 100644
>> --- a/include/linux/libata.h
>> +++ b/include/linux/libata.h
>> @@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
>>
>> static inline u8 ata_wait_idle(struct ata_port *ap)
>> {
>> - u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
>> + u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>
>
> Ouch. That will slow down boot for a lot of otherwise-untouched users,
Really? If a drive claims to be BUSY, then it shouldn't accept a new command.
But yes, this may be a bit too mainline to risk at this point,
given controller quirks and the like that may be spoofing the status.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:56 ` Mark Lord
@ 2006-05-27 23:03 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-27 23:03 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> index b80d2e7..80fb1f9 100644
>>> --- a/include/linux/libata.h
>>> +++ b/include/linux/libata.h
>>> @@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
>>>
>>> static inline u8 ata_wait_idle(struct ata_port *ap)
>>> {
>>> - u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
>>> + u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>>
>>
>> Ouch. That will slow down boot for a lot of otherwise-untouched users,
>
> Really? If a drive claims to be BUSY, then it shouldn't accept a new
> command.
>
> But yes, this may be a bit too mainline to risk at this point,
> given controller quirks and the like that may be spoofing the status.
All the other patches only affected ata_piix users using suspend, which
is a very small subset. That's why we could be cavalier about the
delay. Plus, I would expect resume to require a higher-than-normal delay.
In contrast, the above patch affects most users, on all platforms.
Cases which would previously error out in 10ms now take much longer to
error out. Slows boot, slows multi-path, and other fun stuff.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:54 ` Jens Axboe
@ 2006-05-27 23:06 ` Jens Axboe
0 siblings, 0 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-27 23:06 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Mark Lord, zhao, forrest, Tejun Heo, linux-ide
On Sun, May 28 2006, Jens Axboe wrote:
> On Sat, May 27 2006, Jeff Garzik wrote:
> > Jens Axboe wrote:
> > >index b80d2e7..80fb1f9 100644
> > >--- a/include/linux/libata.h
> > >+++ b/include/linux/libata.h
> > >@@ -742,7 +742,7 @@ static inline u8 ata_busy_wait(struct at
> > >
> > > static inline u8 ata_wait_idle(struct ata_port *ap)
> > > {
> > >- u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
> > >+ u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
> >
> >
> > Ouch. That will slow down boot for a lot of otherwise-untouched users,
> > if that makes it in... :(
>
> Not proposing it for inclusion, just posted what I had in entirety to
> avoid confusion. I made that change to potentially avoid recompiling
> again for retesting. I can test without that too, I guess I will now
> because we have no clear results so far.
With that hunk restored to normal, no change. Resume still works, status
is still 80/50/50.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:53 ` Jens Axboe
2006-05-27 22:55 ` Jeff Garzik
@ 2006-05-27 23:10 ` Mark Lord
2006-05-28 0:24 ` Linus Torvalds
1 sibling, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-27 23:10 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> + u8 status1, status2, status3;
> + status1 = ata_chk_status(ap);
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + status2 = ata_chk_status(ap);
> + status3 = ata_wait_idle(ap);
> + printk("s 1/2/3 %x/%x/%x\n", status1, status2, status3);
>
> with the rest being as posted before. It gives me:
>
> s 1/2/3 80/50/50 on resume.
...
> With that hunk restored to normal, no change. Resume still works, status
> is still 80/50/50.
So, just the "status1 = ata_chk_status(ap);" seems to be enough
to make it work for you, in place of the earlier 500msec delay we started with?
Curious.
Does it still work when the "status1 = ata_chk_status(ap);" is replaced with udelay(1) ??
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 22:23 ` Mark Lord
2006-05-27 22:43 ` Mark Lord
@ 2006-05-28 0:13 ` Linus Torvalds
1 sibling, 0 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-28 0:13 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Mark Lord wrote:
>
> Yeah, I know that. Just pointing it out because the original
> discussion suggested we might want the code to also wait for DRDY,
> which would be correct only for ATA disks.
Well, that was my half-arsed theory on why it didn't work for Jens (ie he
wasn't busy, but instead would wait for READY).
And judging from Jens' report, that was apparently exactly what was going
on. In some respects, Jens' case is actually the more natural one.
In fact, it seems to be yours too, if I read your next mail correctly. We
should _first_ wait for !BUSY and then the quick RDY wait is actually
good.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-27 23:10 ` Mark Lord
@ 2006-05-28 0:24 ` Linus Torvalds
2006-05-28 0:26 ` Linus Torvalds
` (4 more replies)
0 siblings, 5 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-28 0:24 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Mark Lord wrote:
>
> So, just the "status1 = ata_chk_status(ap);" seems to be enough
> to make it work for you, in place of the earlier 500msec delay we started
> with?
Not possible, since the "ata_busy_wait()" already did at least _one_
udelay(10);
status = ata_chk_status(ap);
as part of the busy wait. So it's not just a single ata_chk_status() that
matters for Jens, it's a couple. And apparently waiting for DRDY actually
works for him.
In your case, it won't assert DRDY, and that's ok, the "ata_wait_idle()"
timeout is very short, so even if it doesn't happen, we don't really care.
So I would suggest that for 2.6.17, we do this trivial patch that works
for both of you, and which does _not_ have any insane half-second
timeouts.
I think it's much easier to explain this one too. "Try a quick wait for
idle" to me sounds logical and fine - even if we don't actually _require_
it.
If I've understood correctly, this patch (with no additional timeouts or
anything else) should work for both Mark and Jens, and not make Jeff's
case noticeably worse either. And I'm certainly fine with it.
Objections? Can everybody confirm that a clean current git tree with just
this doesnt' have any huge downsides?
Linus
---
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index fa476e7..6ccfbc9 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4296,6 +4296,13 @@ static int ata_start_drive(struct ata_po
*/
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
+ /*
+ * Wait for BUSY to go away for up to 2 seconds,
+ * and then try to wait for idle (up to 1 msec)
+ */
+ ata_busy_wait(ap, ATA_BUSY, 200000);
+ ata_wait_idle(ap);
+
if (ap->flags & ATA_FLAG_SUSPENDED) {
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
^ permalink raw reply related [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:24 ` Linus Torvalds
@ 2006-05-28 0:26 ` Linus Torvalds
2006-05-28 0:56 ` Jeff Garzik
2006-05-28 0:35 ` Linus Torvalds
` (3 subsequent siblings)
4 siblings, 1 reply; 100+ messages in thread
From: Linus Torvalds @ 2006-05-28 0:26 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Linus Torvalds wrote:
>
> Objections? Can everybody confirm that a clean current git tree with just
> this doesnt' have any huge downsides?
Perhaps more importantly, can anybody imagine any realistic possible
downsides at all? Both waits have an _eventual_ timeout (although the busy
wait is pretty damn long), so even the most perverse controller should at
worst just wait a fairly long time.
And we're doing nothing but reading the status register, so I don't see
any reasonable breakage.
Hmm?
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:24 ` Linus Torvalds
2006-05-28 0:26 ` Linus Torvalds
@ 2006-05-28 0:35 ` Linus Torvalds
2006-05-28 0:51 ` Mark Lord
` (2 subsequent siblings)
4 siblings, 0 replies; 100+ messages in thread
From: Linus Torvalds @ 2006-05-28 0:35 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Linus Torvalds wrote:
> + /*
> + * Wait for BUSY to go away for up to 2 seconds,
> + * and then try to wait for idle (up to 1 msec)
> + */
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + ata_wait_idle(ap);
The comment is wrong by an order of magnitude. "ata_wait_idle()" will wait
for up to "10" msec, not "1" msec.
I'll fix that and commit, if people are ok with this.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:24 ` Linus Torvalds
2006-05-28 0:26 ` Linus Torvalds
2006-05-28 0:35 ` Linus Torvalds
@ 2006-05-28 0:51 ` Mark Lord
2006-05-28 0:53 ` Jeff Garzik
2006-05-28 1:01 ` Jeff Garzik
4 siblings, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-28 0:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Jeff Garzik, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
> On Sat, 27 May 2006, Mark Lord wrote:
>> So, just the "status1 = ata_chk_status(ap);" seems to be enough
>> to make it work for you, in place of the earlier 500msec delay we started
>> with?
>
> Not possible, since the "ata_busy_wait()" already did at least _one_
>
> udelay(10);
> status = ata_chk_status(ap);
>
> as part of the busy wait. So it's not just a single ata_chk_status() that
> matters for Jens, it's a couple. And apparently waiting for DRDY actually
> works for him.
...
> If I've understood correctly, this patch (with no additional timeouts or
> anything else) should work for both Mark and Jens, and not make Jeff's
> case noticeably worse either. And I'm certainly fine with it.
..
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> + /*
> + * Wait for BUSY to go away for up to 2 seconds,
> + * and then try to wait for idle (up to 1 msec)
> + */
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + ata_wait_idle(ap);
> +
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
..
It works for me. But I really think we need to hear from Jens,
as his machine seems the more finicky. As you pointed out,
his works only with a small delay in front of the previous patches,
and that small delay may still be needed here..
If Zhao is still "on the line", perhaps he could also comment
for his own setup, though the above patch should certainly work
(the original patch from me was really a quick backport of his patch).
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:24 ` Linus Torvalds
` (2 preceding siblings ...)
2006-05-28 0:51 ` Mark Lord
@ 2006-05-28 0:53 ` Jeff Garzik
2006-05-28 0:56 ` Mark Lord
2006-05-28 1:01 ` Linus Torvalds
2006-05-28 1:01 ` Jeff Garzik
4 siblings, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 0:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
> index fa476e7..6ccfbc9 100644
> --- a/drivers/scsi/libata-core.c
> +++ b/drivers/scsi/libata-core.c
> @@ -4296,6 +4296,13 @@ static int ata_start_drive(struct ata_po
> */
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> + /*
> + * Wait for BUSY to go away for up to 2 seconds,
> + * and then try to wait for idle (up to 1 msec)
> + */
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + ata_wait_idle(ap);
I don't object, but ata_wait_idle() is completely pointless.
ata_wait_idle() has nothing to do with DRDY. It waits until both BSY
and DRQ are clear. See:
> u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
Regards,
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:53 ` Jeff Garzik
@ 2006-05-28 0:56 ` Mark Lord
2006-05-28 1:01 ` Linus Torvalds
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-28 0:56 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Linus Torvalds wrote:
>> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
>> index fa476e7..6ccfbc9 100644
>> --- a/drivers/scsi/libata-core.c
>> +++ b/drivers/scsi/libata-core.c
>> @@ -4296,6 +4296,13 @@ static int ata_start_drive(struct ata_po
>> */
>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>> {
>> + /*
>> + * Wait for BUSY to go away for up to 2 seconds,
>> + * and then try to wait for idle (up to 1 msec)
>> + */
>> + ata_busy_wait(ap, ATA_BUSY, 200000);
>> + ata_wait_idle(ap);
>
> I don't object, but ata_wait_idle() is completely pointless.
>
> ata_wait_idle() has nothing to do with DRDY. It waits until both BSY
> and DRQ are clear. See:
>> u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 1000);
Good point. But I think Linus just mucked up his explanation,
as Jens has never used a patch to wait for DRDY, but rather only
to wait for DRQ to clear, same as the latest Linus patch does
(but in a different sequence from Linus's).
Good thing too, as waiting for DRDY would be silly for ATAPI drives.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:26 ` Linus Torvalds
@ 2006-05-28 0:56 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 0:56 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Linus Torvalds wrote:
>> Objections? Can everybody confirm that a clean current git tree with just
>> this doesnt' have any huge downsides?
>
> Perhaps more importantly, can anybody imagine any realistic possible
> downsides at all? Both waits have an _eventual_ timeout (although the busy
> wait is pretty damn long), so even the most perverse controller should at
> worst just wait a fairly long time.
>
> And we're doing nothing but reading the status register, so I don't see
> any reasonable breakage.
Well,
* The second wait is pointless, but harmless
* Modern FIS-based controllers (ahci, sata_sil24, SAS HBAs) don't poll
for BSY, ATA is completely packetized and different. But still
harmless, since only ata_piix suspends. The code must go elsewhere,
when other controllers start suspending.
Thus, no objections for 2.6.17.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:53 ` Jeff Garzik
2006-05-28 0:56 ` Mark Lord
@ 2006-05-28 1:01 ` Linus Torvalds
2006-05-28 1:03 ` Jeff Garzik
1 sibling, 1 reply; 100+ messages in thread
From: Linus Torvalds @ 2006-05-28 1:01 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
On Sat, 27 May 2006, Jeff Garzik wrote:
>
> I don't object, but ata_wait_idle() is completely pointless.
Ahh, you're right. I used it originally because I wanted just the value
for ata_ok(), and the other ata_ok() user got it that way.
Jens - willing to test one more thing? Instead of using ata_wait_idle,
what about just
ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
in case your issue actually comes from DRQ being on for some strange
reason (DMA in progress for some bootup inquiry command by the bios?)
Gaah.
Linus
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 0:24 ` Linus Torvalds
` (3 preceding siblings ...)
2006-05-28 0:53 ` Jeff Garzik
@ 2006-05-28 1:01 ` Jeff Garzik
2006-05-28 15:28 ` [PATCH] 2.6.17-rc5: the latest consensus libata resume fix Mark Lord
4 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 1:01 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> + /*
> + * Wait for BUSY to go away for up to 2 seconds,
> + * and then try to wait for idle (up to 1 msec)
> + */
> + ata_busy_wait(ap, ATA_BUSY, 200000);
> + ata_wait_idle(ap);
> +
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
>
Oh, and as Jens pointed out, the busy-wait should be inside the
ATA_FLAG_SUSPENDED test... (though one wonders when resume would ever
be called, except when that flag is set)
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue
2006-05-28 1:01 ` Linus Torvalds
@ 2006-05-28 1:03 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 1:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Lord, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Linus Torvalds wrote:
>
> On Sat, 27 May 2006, Jeff Garzik wrote:
>> I don't object, but ata_wait_idle() is completely pointless.
>
> Ahh, you're right. I used it originally because I wanted just the value
> for ata_ok(), and the other ata_ok() user got it that way.
>
> Jens - willing to test one more thing? Instead of using ata_wait_idle,
> what about just
>
> ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>
> in case your issue actually comes from DRQ being on for some strange
> reason (DMA in progress for some bootup inquiry command by the bios?)
Was DRQ ever actually asserted for him?
Regardless, the above line is "conservative" and safe and OK.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 1:01 ` Jeff Garzik
@ 2006-05-28 15:28 ` Mark Lord
2006-05-28 17:14 ` Jens Axboe
0 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-28 15:28 UTC (permalink / raw)
To: Jeff Garzik
Cc: Linus Torvalds, Jens Axboe, zhao, forrest, Tejun Heo, linux-ide
Okay, just to sum things up.
The patch below is the current "release candidate" for
improving the libata/ata_piix resume functionality in 2.6.17-rc*.
It forces libata to wait for up to 2 seconds for BUSY|DRQ to clear
on resume before continuing. This is only for 2.6.17 at present.
We are waiting on Jens to test and report back for this specific version.
Signed-off-by: Mark Lord <liml@rtr.ca>
---
--- linux-2.6.17-rc5/drivers/scsi/libata-core.c
+++ linux/drivers/scsi/libata-core.c
@@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
*/
int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
{
if (ap->flags & ATA_FLAG_SUSPENDED) {
+ ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
ap->flags &= ~ATA_FLAG_SUSPENDED;
ata_set_mode(ap);
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 15:28 ` [PATCH] 2.6.17-rc5: the latest consensus libata resume fix Mark Lord
@ 2006-05-28 17:14 ` Jens Axboe
2006-05-28 19:05 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Jens Axboe @ 2006-05-28 17:14 UTC (permalink / raw)
To: Mark Lord
Cc: Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
On Sun, May 28 2006, Mark Lord wrote:
> Okay, just to sum things up.
>
> The patch below is the current "release candidate" for
> improving the libata/ata_piix resume functionality in 2.6.17-rc*.
>
> It forces libata to wait for up to 2 seconds for BUSY|DRQ to clear
> on resume before continuing. This is only for 2.6.17 at present.
>
> We are waiting on Jens to test and report back for this specific version.
>
> Signed-off-by: Mark Lord <liml@rtr.ca>
> ---
> --- linux-2.6.17-rc5/drivers/scsi/libata-core.c
> +++ linux/drivers/scsi/libata-core.c
> @@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
> */
> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> {
> if (ap->flags & ATA_FLAG_SUSPENDED) {
> + ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
> ap->flags &= ~ATA_FLAG_SUSPENDED;
> ata_set_mode(ap);
Sorry for the unresponsiveness, still away and internet connectivity
spotty. Just tested the above, and it works for me! I think Marks
analisys wrt DRQ is completely correct and this validates it.
Acked-by: Jens Axboe <axboe@suse.de>
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 17:14 ` Jens Axboe
@ 2006-05-28 19:05 ` Jeff Garzik
2006-05-28 19:18 ` Mark Lord
0 siblings, 1 reply; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 19:05 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mark Lord, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sun, May 28 2006, Mark Lord wrote:
>> Okay, just to sum things up.
>>
>> The patch below is the current "release candidate" for
>> improving the libata/ata_piix resume functionality in 2.6.17-rc*.
>>
>> It forces libata to wait for up to 2 seconds for BUSY|DRQ to clear
>> on resume before continuing. This is only for 2.6.17 at present.
>>
>> We are waiting on Jens to test and report back for this specific version.
>>
>> Signed-off-by: Mark Lord <liml@rtr.ca>
>> ---
>> --- linux-2.6.17-rc5/drivers/scsi/libata-core.c
>> +++ linux/drivers/scsi/libata-core.c
>> @@ -4296,6 +4296,7 @@ static int ata_start_drive(struct ata_po
>> */
>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>> {
>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>> + ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>> ap->flags &= ~ATA_FLAG_SUSPENDED;
>> ata_set_mode(ap);
>
> Sorry for the unresponsiveness, still away and internet connectivity
> spotty. Just tested the above, and it works for me! I think Marks
> analisys wrt DRQ is completely correct and this validates it.
Does your box work without ATA_DRQ?
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 19:05 ` Jeff Garzik
@ 2006-05-28 19:18 ` Mark Lord
2006-05-28 20:10 ` Jeff Garzik
0 siblings, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-28 19:18 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Jens Axboe wrote:
>> On Sun, May 28 2006, Mark Lord wrote:
>>>
..
>>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>>> {
>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>>> + ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>>> ap->flags &= ~ATA_FLAG_SUSPENDED;
>>> ata_set_mode(ap);
>>
>> Sorry for the unresponsiveness, still away and internet connectivity
>> spotty. Just tested the above, and it works for me! I think Marks
>> analisys wrt DRQ is completely correct and this validates it.
>
> Does your box work without ATA_DRQ?
Without ATA_DRQ, we're back to the original Linus one-liner,
which Jens said did NOT work for him on Saturday.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 19:18 ` Mark Lord
@ 2006-05-28 20:10 ` Jeff Garzik
2006-05-28 20:27 ` Mark Lord
2006-05-28 22:28 ` Jens Axboe
0 siblings, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-28 20:10 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> On Sun, May 28 2006, Mark Lord wrote:
>>>>
> ..
>>>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>>>> {
>>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>>>> + ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>>>> ap->flags &= ~ATA_FLAG_SUSPENDED;
>>>> ata_set_mode(ap);
>>>
>>> Sorry for the unresponsiveness, still away and internet connectivity
>>> spotty. Just tested the above, and it works for me! I think Marks
>>> analisys wrt DRQ is completely correct and this validates it.
>>
>> Does your box work without ATA_DRQ?
>
> Without ATA_DRQ, we're back to the original Linus one-liner,
> which Jens said did NOT work for him on Saturday.
Indeed, but nowhere in the ATA Status printks did I ever see the DRQ bit
asserted. It was all 80/50/50.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 20:10 ` Jeff Garzik
@ 2006-05-28 20:27 ` Mark Lord
2006-05-28 22:28 ` Jens Axboe
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-28 20:27 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
> Mark Lord wrote:
>> Jeff Garzik wrote:
>>> Does your box work without ATA_DRQ?
>>
>> Without ATA_DRQ, we're back to the original Linus one-liner,
>> which Jens said did NOT work for him on Saturday.
>
> Indeed, but nowhere in the ATA Status printks did I ever see the DRQ bit
> asserted. It was all 80/50/50.
So true, so true.
Welcome to my (ex-)world!
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 20:10 ` Jeff Garzik
2006-05-28 20:27 ` Mark Lord
@ 2006-05-28 22:28 ` Jens Axboe
2006-05-29 1:28 ` Jeff Garzik
2006-05-29 2:43 ` Mark Lord
1 sibling, 2 replies; 100+ messages in thread
From: Jens Axboe @ 2006-05-28 22:28 UTC (permalink / raw)
To: Jeff Garzik
Cc: Mark Lord, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
On Sun, May 28 2006, Jeff Garzik wrote:
> Mark Lord wrote:
> >Jeff Garzik wrote:
> >>Jens Axboe wrote:
> >>>On Sun, May 28 2006, Mark Lord wrote:
> >>>>
> >..
> >>>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
> >>>> {
> >>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
> >>>>+ ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
> >>>> ap->flags &= ~ATA_FLAG_SUSPENDED;
> >>>> ata_set_mode(ap);
> >>>
> >>>Sorry for the unresponsiveness, still away and internet connectivity
> >>>spotty. Just tested the above, and it works for me! I think Marks
> >>>analisys wrt DRQ is completely correct and this validates it.
> >>
> >>Does your box work without ATA_DRQ?
> >
> >Without ATA_DRQ, we're back to the original Linus one-liner,
> >which Jens said did NOT work for him on Saturday.
>
> Indeed, but nowhere in the ATA Status printks did I ever see the DRQ bit
> asserted. It was all 80/50/50.
Mark is right, ATA_BUSY alone does _not_ work for me. I agree it's a
little odd based on the printk output, it must be a timing thing.
--
Jens Axboe
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 22:28 ` Jens Axboe
@ 2006-05-29 1:28 ` Jeff Garzik
2006-05-29 2:53 ` Mark Lord
2006-05-29 3:28 ` zhao, forrest
2006-05-29 2:43 ` Mark Lord
1 sibling, 2 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-29 1:28 UTC (permalink / raw)
To: Jens Axboe, Mark Lord; +Cc: Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
> On Sun, May 28 2006, Jeff Garzik wrote:
>> Mark Lord wrote:
>>> Jeff Garzik wrote:
>>>> Jens Axboe wrote:
>>>>> On Sun, May 28 2006, Mark Lord wrote:
>>> ..
>>>>>> int ata_device_resume(struct ata_port *ap, struct ata_device *dev)
>>>>>> {
>>>>>> if (ap->flags & ATA_FLAG_SUSPENDED) {
>>>>>> + ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 200000);
>>>>>> ap->flags &= ~ATA_FLAG_SUSPENDED;
>>>>>> ata_set_mode(ap);
>>>>> Sorry for the unresponsiveness, still away and internet connectivity
>>>>> spotty. Just tested the above, and it works for me! I think Marks
>>>>> analisys wrt DRQ is completely correct and this validates it.
>>>> Does your box work without ATA_DRQ?
>>> Without ATA_DRQ, we're back to the original Linus one-liner,
>>> which Jens said did NOT work for him on Saturday.
>> Indeed, but nowhere in the ATA Status printks did I ever see the DRQ bit
>> asserted. It was all 80/50/50.
>
> Mark is right, ATA_BUSY alone does _not_ work for me. I agree it's a
> little odd based on the printk output, it must be a timing thing.
Its not a timing thing, because it consumes exactly the same amount of
CPU cycles, and exactly the same amount of IO cycles. You're just
testing a different static constant, that's it.
<shrug> It works and it's merged...
If you and Mark would be kind enough to satisfy my paranoia, please test
the current upstream kernel, without any additional patches, and make
sure it works.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-28 22:28 ` Jens Axboe
2006-05-29 1:28 ` Jeff Garzik
@ 2006-05-29 2:43 ` Mark Lord
1 sibling, 0 replies; 100+ messages in thread
From: Mark Lord @ 2006-05-29 2:43 UTC (permalink / raw)
To: Jens Axboe
Cc: Jeff Garzik, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jens Axboe wrote:
>
> Mark is right, ATA_BUSY alone does _not_ work for me. I agree it's a
> little odd based on the printk output, it must be a timing thing.
I think maybe your drive is doing something like this ("B"==BUSY, "D"==DRQ):
BBBDDBBBBBB
Where the first couple of BBB last less than 10usec.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-29 1:28 ` Jeff Garzik
@ 2006-05-29 2:53 ` Mark Lord
2006-05-29 3:18 ` Jeff Garzik
2006-05-29 3:28 ` zhao, forrest
1 sibling, 1 reply; 100+ messages in thread
From: Mark Lord @ 2006-05-29 2:53 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Jeff Garzik wrote:
>
> If you and Mark would be kind enough to satisfy my paranoia, please test
> the current upstream kernel, without any additional patches, and make
> sure it works.
I'm not sure what you mean by "upstream kernel" here,
but the latest 2.6.17-rc5-git* (including our patch) works for me.
Cheers
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-29 2:53 ` Mark Lord
@ 2006-05-29 3:18 ` Jeff Garzik
0 siblings, 0 replies; 100+ messages in thread
From: Jeff Garzik @ 2006-05-29 3:18 UTC (permalink / raw)
To: Mark Lord; +Cc: Jens Axboe, Linus Torvalds, zhao, forrest, Tejun Heo, linux-ide
Mark Lord wrote:
> Jeff Garzik wrote:
>>
>> If you and Mark would be kind enough to satisfy my paranoia, please
>> test the current upstream kernel, without any additional patches, and
>> make sure it works.
>
> I'm not sure what you mean by "upstream kernel" here,
> but the latest 2.6.17-rc5-git* (including our patch) works for me.
Yep, that's what I meant, thanks.
I grant "upstream kernel" and "#upstream" are easily confused :/ even
though I never interchange the two myself.
Jeff
^ permalink raw reply [flat|nested] 100+ messages in thread
* Re: [PATCH] 2.6.17-rc5: the latest consensus libata resume fix
2006-05-29 1:28 ` Jeff Garzik
2006-05-29 2:53 ` Mark Lord
@ 2006-05-29 3:28 ` zhao, forrest
1 sibling, 0 replies; 100+ messages in thread
From: zhao, forrest @ 2006-05-29 3:28 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, Mark Lord, Linus Torvalds, Tejun Heo, linux-ide
On Sun, 2006-05-28 at 21:28 -0400, Jeff Garzik wrote:
> If you and Mark would be kind enough to satisfy my paranoia, please test
> the current upstream kernel, without any additional patches, and make
> sure it works.
>
The libata #upstream works for me, too.
Sorry for the unresponsiveness for the whole weekend, but I have no
development machine in my home. So I can only test the patch when I'm in
office.
Thanks,
Forrest
^ permalink raw reply [flat|nested] 100+ messages in thread
end of thread, other threads:[~2006-05-29 3:40 UTC | newest]
Thread overview: 100+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-26 9:04 [PATCH] Add ata_piix's own resume function zhao, forrest
2006-05-26 23:05 ` Jens Axboe
2006-05-26 23:28 ` Jeff Garzik
2006-05-26 23:38 ` Jeff Garzik
2006-05-26 23:50 ` Jeff Garzik
2006-05-27 6:21 ` Jens Axboe
2006-05-27 6:31 ` Jeff Garzik
2006-05-27 6:46 ` Jens Axboe
2006-05-27 6:52 ` Jeff Garzik
2006-05-27 3:22 ` 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Mark Lord
2006-05-27 3:32 ` Linus Torvalds
2006-05-27 3:41 ` Jeff Garzik
2006-05-27 4:00 ` [PATCH] " Jeff Garzik
2006-05-27 18:23 ` Mark Lord
2006-05-27 18:47 ` Linus Torvalds
2006-05-27 19:01 ` Jeff Garzik
2006-05-27 19:06 ` Jeff Garzik
2006-05-27 19:01 ` Mark Lord
2006-05-27 20:45 ` Jens Axboe
2006-05-27 20:58 ` Jeff Garzik
2006-05-27 21:11 ` Jens Axboe
2006-05-27 21:17 ` Jeff Garzik
2006-05-27 21:20 ` Jens Axboe
2006-05-27 21:23 ` Mark Lord
2006-05-27 21:25 ` Jens Axboe
2006-05-27 21:30 ` Mark Lord
2006-05-27 21:24 ` Jeff Garzik
2006-05-27 21:26 ` Jens Axboe
2006-05-27 21:31 ` Mark Lord
2006-05-27 21:32 ` Jeff Garzik
2006-05-27 21:33 ` Jens Axboe
2006-05-27 21:34 ` Jeff Garzik
2006-05-27 21:37 ` Mark Lord
2006-05-27 21:51 ` Jeff Garzik
2006-05-27 21:41 ` Tejun Heo
2006-05-27 21:45 ` Jeff Garzik
2006-05-27 21:38 ` Linus Torvalds
2006-05-27 21:50 ` Jeff Garzik
2006-05-27 21:57 ` Linus Torvalds
2006-05-27 22:11 ` Jeff Garzik
2006-05-27 21:50 ` Linus Torvalds
2006-05-27 21:53 ` Jeff Garzik
2006-05-27 22:14 ` Linus Torvalds
2006-05-27 22:06 ` Mark Lord
2006-05-27 22:11 ` Jens Axboe
2006-05-27 22:13 ` Jeff Garzik
2006-05-27 22:15 ` Jens Axboe
2006-05-27 22:15 ` Mark Lord
2006-05-27 22:17 ` Jens Axboe
2006-05-27 22:21 ` Linus Torvalds
2006-05-27 22:29 ` Mark Lord
2006-05-27 22:36 ` Jens Axboe
2006-05-27 22:48 ` Mark Lord
2006-05-27 22:53 ` Jens Axboe
2006-05-27 22:55 ` Jeff Garzik
2006-05-27 23:10 ` Mark Lord
2006-05-28 0:24 ` Linus Torvalds
2006-05-28 0:26 ` Linus Torvalds
2006-05-28 0:56 ` Jeff Garzik
2006-05-28 0:35 ` Linus Torvalds
2006-05-28 0:51 ` Mark Lord
2006-05-28 0:53 ` Jeff Garzik
2006-05-28 0:56 ` Mark Lord
2006-05-28 1:01 ` Linus Torvalds
2006-05-28 1:03 ` Jeff Garzik
2006-05-28 1:01 ` Jeff Garzik
2006-05-28 15:28 ` [PATCH] 2.6.17-rc5: the latest consensus libata resume fix Mark Lord
2006-05-28 17:14 ` Jens Axboe
2006-05-28 19:05 ` Jeff Garzik
2006-05-28 19:18 ` Mark Lord
2006-05-28 20:10 ` Jeff Garzik
2006-05-28 20:27 ` Mark Lord
2006-05-28 22:28 ` Jens Axboe
2006-05-29 1:28 ` Jeff Garzik
2006-05-29 2:53 ` Mark Lord
2006-05-29 3:18 ` Jeff Garzik
2006-05-29 3:28 ` zhao, forrest
2006-05-29 2:43 ` Mark Lord
2006-05-27 22:35 ` [PATCH] Re: 2.6.17-rc5-git1: regression: resume from suspend(RAM) fails: libata issue Jens Axboe
2006-05-27 22:52 ` Jeff Garzik
2006-05-27 22:54 ` Jens Axboe
2006-05-27 23:06 ` Jens Axboe
2006-05-27 22:56 ` Mark Lord
2006-05-27 23:03 ` Jeff Garzik
2006-05-27 22:18 ` Linus Torvalds
2006-05-27 22:23 ` Mark Lord
2006-05-27 22:43 ` Mark Lord
2006-05-28 0:13 ` Linus Torvalds
2006-05-27 18:54 ` Jeff Garzik
2006-05-27 19:08 ` Mark Lord
2006-05-27 19:15 ` Jeff Garzik
2006-05-27 19:24 ` Mark Lord
2006-05-27 20:24 ` Jens Axboe
2006-05-27 6:29 ` Jens Axboe
2006-05-27 6:36 ` Jeff Garzik
2006-05-27 7:01 ` Jens Axboe
2006-05-27 7:06 ` Jeff Garzik
2006-05-27 18:46 ` Mark Lord
2006-05-27 3:35 ` Jeff Garzik
2006-05-27 6:20 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).