* questions regarding possible violation of AHCI spec in AHCI driver
@ 2010-12-07 7:43 Jian Peng
2010-12-08 1:54 ` Robert Hancock
0 siblings, 1 reply; 22+ messages in thread
From: Jian Peng @ 2010-12-07 7:43 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org; +Cc: jgarzik@pobox.com
Recently, while bringing up a new AHCI host controller, I found out that current AHCI driver (in 2.6.37-rc3) may violate AHCI spec in function libahci.c: ahci_start_engine().
>From end of section 10.1.2 in AHCI 1.3 spec, it claims
Software shall not set PxCMD.ST to '1' until it is determined that a functional device is present on the port
as determined by PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h.
It seems working well on most controller without this extra checking, but does cause problem in our new core. Since toggling PxCMD.SUD already initiated reset process at early time, and by the time of ahci_start_engine() got called, BSY bit may not be cleared yet, and forcing PxCMD.ST bit to 1 will cause problem for HW in this case.
Since I do not know all history about AHCI driver, I really hope that AHCI driver and libata maintainer can comment on this.
Thanks,
Jian
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-07 7:43 questions regarding possible violation of AHCI spec in AHCI driver Jian Peng @ 2010-12-08 1:54 ` Robert Hancock 2010-12-08 10:07 ` Tejun Heo 0 siblings, 1 reply; 22+ messages in thread From: Robert Hancock @ 2010-12-08 1:54 UTC (permalink / raw) To: Jian Peng; +Cc: linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide, Tejun Heo CCing linux-ide and Tejun. On 12/07/2010 01:43 AM, Jian Peng wrote: > Recently, while bringing up a new AHCI host controller, I found out that current AHCI driver (in 2.6.37-rc3) may violate AHCI spec in function libahci.c: ahci_start_engine(). > > From end of section 10.1.2 in AHCI 1.3 spec, it claims > > Software shall not set PxCMD.ST to '1' until it is determined that a functional device is present on the port > as determined by PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. > > It seems working well on most controller without this extra checking, but does cause problem in our new core. Since toggling PxCMD.SUD already initiated reset process at early time, and by the time of ahci_start_engine() got called, BSY bit may not be cleared yet, and forcing PxCMD.ST bit to 1 will cause problem for HW in this case. > > Since I do not know all history about AHCI driver, I really hope that AHCI driver and libata maintainer can comment on this. Seems like a valid point to me, according to the spec. I'm guessing it just hasn't come up previously with other controllers? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 1:54 ` Robert Hancock @ 2010-12-08 10:07 ` Tejun Heo 2010-12-08 18:14 ` Jian Peng 0 siblings, 1 reply; 22+ messages in thread From: Tejun Heo @ 2010-12-08 10:07 UTC (permalink / raw) To: Robert Hancock Cc: Jian Peng, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, On 12/08/2010 02:54 AM, Robert Hancock wrote: > On 12/07/2010 01:43 AM, Jian Peng wrote: >> Recently, while bringing up a new AHCI host controller, I found out >> that current AHCI driver (in 2.6.37-rc3) may violate AHCI spec in >> function libahci.c: ahci_start_engine(). >> >> From end of section 10.1.2 in AHCI 1.3 spec, it claims >> >> Software shall not set PxCMD.ST to '1' until it is determined that >> a functional device is present on the port as determined by >> PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. >> >> It seems working well on most controller without this extra >> checking, but does cause problem in our new core. Since toggling >> PxCMD.SUD already initiated reset process at early time, and by the >> time of ahci_start_engine() got called, BSY bit may not be cleared >> yet, and forcing PxCMD.ST bit to 1 will cause problem for HW in >> this case. Hmmm... interesting. Yeah, we have never had any problem in that area and would like to avoid changing unless necessary but then again if it's broken, well, we should. What kind of problem is the controller showing? Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 10:07 ` Tejun Heo @ 2010-12-08 18:14 ` Jian Peng 2010-12-08 18:24 ` Tejun Heo 0 siblings, 1 reply; 22+ messages in thread From: Jian Peng @ 2010-12-08 18:14 UTC (permalink / raw) To: Tejun Heo, Robert Hancock Cc: linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hi, Tejun, The problem happened as follow: After power up, inside ahci_init_one(), it will call ahci_power_up() to toggle PxCMD.SUD bit first, then HBA will send COMRESET to device, and device will send first D2H FIS back. Here it will call ahci_start_engine() to turn on PxCMD.ST to process command. In this case, it may run into race condition that transaction triggered by toggling PxCMD.SUD is not completed yet, and that is the reason why extra check is required by spec to guarantee that HBA already received FIS and in sane state. In most HBA, either staggered spin-up feature was not supported, or time required for transaction is less than that between two function calls, it may work. IMHO, this is a clear violation of spec, and not robust against all HBA design. The major concern is that ahci_start_engine() is used widely in EH and it does not return result to reflect whether ST bit was set or not, this may cause trouble in some cases. I am working on verifying those cases with different HBAs now. Thanks, Jian -----Original Message----- From: Tejun Heo [mailto:tj@kernel.org] Sent: Wednesday, December 08, 2010 2:07 AM To: Robert Hancock Cc: Jian Peng; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On 12/08/2010 02:54 AM, Robert Hancock wrote: > On 12/07/2010 01:43 AM, Jian Peng wrote: >> Recently, while bringing up a new AHCI host controller, I found out >> that current AHCI driver (in 2.6.37-rc3) may violate AHCI spec in >> function libahci.c: ahci_start_engine(). >> >> From end of section 10.1.2 in AHCI 1.3 spec, it claims >> >> Software shall not set PxCMD.ST to '1' until it is determined that >> a functional device is present on the port as determined by >> PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. >> >> It seems working well on most controller without this extra >> checking, but does cause problem in our new core. Since toggling >> PxCMD.SUD already initiated reset process at early time, and by the >> time of ahci_start_engine() got called, BSY bit may not be cleared >> yet, and forcing PxCMD.ST bit to 1 will cause problem for HW in >> this case. Hmmm... interesting. Yeah, we have never had any problem in that area and would like to avoid changing unless necessary but then again if it's broken, well, we should. What kind of problem is the controller showing? Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 18:14 ` Jian Peng @ 2010-12-08 18:24 ` Tejun Heo 2010-12-08 18:48 ` Jian Peng 0 siblings, 1 reply; 22+ messages in thread From: Tejun Heo @ 2010-12-08 18:24 UTC (permalink / raw) To: Jian Peng Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, On 12/08/2010 07:14 PM, Jian Peng wrote: > The problem happened as follow: > > After power up, inside ahci_init_one(), it will call ahci_power_up() > to toggle PxCMD.SUD bit first, then HBA will send COMRESET to > device, and device will send first D2H FIS back. Here it will call > ahci_start_engine() to turn on PxCMD.ST to process command. In this > case, it may run into race condition that transaction triggered by > toggling PxCMD.SUD is not completed yet, and that is the reason why > extra check is required by spec to guarantee that HBA already > received FIS and in sane state. > > In most HBA, either staggered spin-up feature was not supported, or > time required for transaction is less than that between two function > calls, it may work. IMHO, this is a clear violation of spec, and not > robust against all HBA design. > > The major concern is that ahci_start_engine() is used widely in EH > and it does not return result to reflect whether ST bit was set or > not, this may cause trouble in some cases. I am working on verifying > those cases with different HBAs now. I see, so it's not that the controller actually failed but you observed the race condition. During EH, ahci_start_engine() is used rather liberally when the driver wants the controller in working condition. The assumption is that even if the driver tries to set ST with the incorrect condition, the controller wouldn't go crazy where it can't be restarted later, which _must_ be true as there's no atomic way to check the condition and set ST. The driver at the same time guarantees that if the whole initialization protocol succeeds, the last ahci_start_engine() is called after hardreset is sucessfully completed and thus all the prerequisites are met. So, yeap, the driver might set ST when the conditions are not met but that can't be avoided completely anyway so the controller shouldn't go completely bonkers for that (it's okay to fail in recoverable way) and the driver will do the last ST setting after all the conditions are met on the success path. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 18:24 ` Tejun Heo @ 2010-12-08 18:48 ` Jian Peng 2010-12-08 18:52 ` Tejun Heo 0 siblings, 1 reply; 22+ messages in thread From: Jian Peng @ 2010-12-08 18:48 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide So it is reasonable to add extra check in ahci_start_engine() without returning status of ST bit. If so, here is my patch --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 +++ libahci.c 2010-12-08 10:45:17.495156944 -0800 @@ -542,6 +542,13 @@ { void __iomem *port_mmio = ahci_port_base(ap); u32 tmp; + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; + + /* avoid race condition per spec (end of section 10.1.2) */ + if (status & (ATA_BUSY | ATA_DRQ) || + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || + (tmp & 0x0f) != 0x03) + return; /* start DMA */ tmp = readl(port_mmio + PORT_CMD); Thanks, Jian ________________________________________ From: Tejun Heo [tj@kernel.org] Sent: Wednesday, December 08, 2010 10:24 AM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On 12/08/2010 07:14 PM, Jian Peng wrote: > The problem happened as follow: > > After power up, inside ahci_init_one(), it will call ahci_power_up() > to toggle PxCMD.SUD bit first, then HBA will send COMRESET to > device, and device will send first D2H FIS back. Here it will call > ahci_start_engine() to turn on PxCMD.ST to process command. In this > case, it may run into race condition that transaction triggered by > toggling PxCMD.SUD is not completed yet, and that is the reason why > extra check is required by spec to guarantee that HBA already > received FIS and in sane state. > > In most HBA, either staggered spin-up feature was not supported, or > time required for transaction is less than that between two function > calls, it may work. IMHO, this is a clear violation of spec, and not > robust against all HBA design. > > The major concern is that ahci_start_engine() is used widely in EH > and it does not return result to reflect whether ST bit was set or > not, this may cause trouble in some cases. I am working on verifying > those cases with different HBAs now. I see, so it's not that the controller actually failed but you observed the race condition. During EH, ahci_start_engine() is used rather liberally when the driver wants the controller in working condition. The assumption is that even if the driver tries to set ST with the incorrect condition, the controller wouldn't go crazy where it can't be restarted later, which _must_ be true as there's no atomic way to check the condition and set ST. The driver at the same time guarantees that if the whole initialization protocol succeeds, the last ahci_start_engine() is called after hardreset is sucessfully completed and thus all the prerequisites are met. So, yeap, the driver might set ST when the conditions are not met but that can't be avoided completely anyway so the controller shouldn't go completely bonkers for that (it's okay to fail in recoverable way) and the driver will do the last ST setting after all the conditions are met on the success path. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 18:48 ` Jian Peng @ 2010-12-08 18:52 ` Tejun Heo 2010-12-08 19:49 ` Jian Peng 0 siblings, 1 reply; 22+ messages in thread From: Tejun Heo @ 2010-12-08 18:52 UTC (permalink / raw) To: Jian Peng Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, On 12/08/2010 07:48 PM, Jian Peng wrote: > So it is reasonable to add extra check in ahci_start_engine() without returning status of ST bit. If so, here is my patch > > --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 > +++ libahci.c 2010-12-08 10:45:17.495156944 -0800 > @@ -542,6 +542,13 @@ > { > void __iomem *port_mmio = ahci_port_base(ap); > u32 tmp; > + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; > + > + /* avoid race condition per spec (end of section 10.1.2) */ > + if (status & (ATA_BUSY | ATA_DRQ) || > + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || > + (tmp & 0x0f) != 0x03) > + return; > > /* start DMA */ > tmp = readl(port_mmio + PORT_CMD); Yes, it is reasonable but I want to see that it actually fixes something. There are just too many controllers which use this path to blindly apply the above change and given my previous explanation even without the above change any ahci controller _should_ work fine. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 18:52 ` Tejun Heo @ 2010-12-08 19:49 ` Jian Peng 2010-12-08 19:55 ` Tejun Heo 0 siblings, 1 reply; 22+ messages in thread From: Jian Peng @ 2010-12-08 19:49 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide I agree. I have AHCI based PCI card using HBA from Marvell, Via and Silicon Image, and am going to test my patch. Before this patch can be applied universally, I like to use it for specific PCI_VENDOR_ID first. Here is my new patch to limit it to Broadcom's AHCI core --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 +++ libahci.c 2010-12-08 11:44:24.394023128 -0800 @@ -44,6 +44,7 @@ #include <scsi/scsi_host.h> #include <scsi/scsi_cmnd.h> #include <linux/libata.h> +#include <linux/pci.h> #include "ahci.h" static int ahci_skip_host_reset; @@ -541,8 +542,20 @@ void ahci_start_engine(struct ata_port *ap) { void __iomem *port_mmio = ahci_port_base(ap); + struct ata_host *host = ap->host; + struct pci_dev *pdev = to_pci_dev(host->dev); u32 tmp; + /* avoid race condition per spec (end of section 10.1.2) */ + if (pdev->vendor == PCI_VENDOR_ID_BROADCOM) { + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; + + if (status & (ATA_BUSY | ATA_DRQ) || + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || + (tmp & 0x0f) != 0x03) + return; + } + /* start DMA */ tmp = readl(port_mmio + PORT_CMD); tmp |= PORT_CMD_START; Thanks, Jian ________________________________________ From: Tejun Heo [tj@kernel.org] Sent: Wednesday, December 08, 2010 10:52 AM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On 12/08/2010 07:48 PM, Jian Peng wrote: > So it is reasonable to add extra check in ahci_start_engine() without returning status of ST bit. If so, here is my patch > > --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 > +++ libahci.c 2010-12-08 10:45:17.495156944 -0800 > @@ -542,6 +542,13 @@ > { > void __iomem *port_mmio = ahci_port_base(ap); > u32 tmp; > + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; > + > + /* avoid race condition per spec (end of section 10.1.2) */ > + if (status & (ATA_BUSY | ATA_DRQ) || > + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || > + (tmp & 0x0f) != 0x03) > + return; > > /* start DMA */ > tmp = readl(port_mmio + PORT_CMD); Yes, it is reasonable but I want to see that it actually fixes something. There are just too many controllers which use this path to blindly apply the above change and given my previous explanation even without the above change any ahci controller _should_ work fine. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 19:49 ` Jian Peng @ 2010-12-08 19:55 ` Tejun Heo 2010-12-08 20:09 ` Jian Peng 0 siblings, 1 reply; 22+ messages in thread From: Tejun Heo @ 2010-12-08 19:55 UTC (permalink / raw) To: Jian Peng Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, On 12/08/2010 08:49 PM, Jian Peng wrote: > I agree. I have AHCI based PCI card using HBA from Marvell, Via and > Silicon Image, and am going to test my patch. Before this patch can > be applied universally, I like to use it for specific PCI_VENDOR_ID > first. Here is my new patch to limit it to Broadcom's AHCI core Hmmm... is the change actually necessary for broadcom controllers? As I wrote before, any ahci controller should just work without the above checks because, > + /* avoid race condition per spec (end of section 10.1.2) */ > + if (pdev->vendor == PCI_VENDOR_ID_BROADCOM) { > + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; > + > + if (status & (ATA_BUSY | ATA_DRQ) || > + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || > + (tmp & 0x0f) != 0x03) > + return; PHY event can occur here which causes the device to send D2H Reg FIS w/ BSY set. 1. So, the controller _MUST NOT_ fail in irrecoverable way even if the driver sets ST while BSY is set. 2. The driver guarantees the final ST setting before entering normal operation is done when the prerequisites are met. If you combine 1 and 2, the current behavior is perfectly fine. Do broadcom controllers actually fail without the above change? Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 19:55 ` Tejun Heo @ 2010-12-08 20:09 ` Jian Peng 2010-12-08 22:54 ` Tejun Heo 0 siblings, 1 reply; 22+ messages in thread From: Jian Peng @ 2010-12-08 20:09 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide The controller may take much longer time to recover in this case, and leads to wrong HW state after stop_engine() inside ahci_hardreset() and cause device type checking failure due to unfinished HW state change and missing D2H FIS after start_engine() again inside ahci_hardreset(). I guess this is the reason why AHCI spec try to emphasize. Yes, without this change, Broadcom controller will fail due to above reason. Thanks, Jian -----Original Message----- From: Tejun Heo [mailto:tj@kernel.org] Sent: Wednesday, December 08, 2010 11:55 AM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On 12/08/2010 08:49 PM, Jian Peng wrote: > I agree. I have AHCI based PCI card using HBA from Marvell, Via and > Silicon Image, and am going to test my patch. Before this patch can > be applied universally, I like to use it for specific PCI_VENDOR_ID > first. Here is my new patch to limit it to Broadcom's AHCI core Hmmm... is the change actually necessary for broadcom controllers? As I wrote before, any ahci controller should just work without the above checks because, > + /* avoid race condition per spec (end of section 10.1.2) */ > + if (pdev->vendor == PCI_VENDOR_ID_BROADCOM) { > + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; > + > + if (status & (ATA_BUSY | ATA_DRQ) || > + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || > + (tmp & 0x0f) != 0x03) > + return; PHY event can occur here which causes the device to send D2H Reg FIS w/ BSY set. 1. So, the controller _MUST NOT_ fail in irrecoverable way even if the driver sets ST while BSY is set. 2. The driver guarantees the final ST setting before entering normal operation is done when the prerequisites are met. If you combine 1 and 2, the current behavior is perfectly fine. Do broadcom controllers actually fail without the above change? Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 20:09 ` Jian Peng @ 2010-12-08 22:54 ` Tejun Heo 2010-12-09 1:30 ` Jian Peng 2011-01-11 18:09 ` Jian Peng 0 siblings, 2 replies; 22+ messages in thread From: Tejun Heo @ 2010-12-08 22:54 UTC (permalink / raw) To: Jian Peng Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, Jian. On 12/08/2010 09:09 PM, Jian Peng wrote: > The controller may take much longer time to recover in this case, > and leads to wrong HW state after stop_engine() inside > ahci_hardreset() and cause device type checking failure due to > unfinished HW state change and missing D2H FIS after start_engine() > again inside ahci_hardreset(). I guess this is the reason why AHCI > spec try to emphasize. I don't necessarily agree there. The requirement is impossible to reliably satisfy to begin with (it's inherently racy) and most specs are filled with "the outcome is undefined" when they don't _need_ to be well defined. The hardware can do "eh.. well, whatever, I don't know" but shouldn't get lost and fail to react to further state-resetting actions. > Yes, without this change, Broadcom controller will fail due to above > reason. Okay, so, the controller goes bonkers if ST is set when prerequisites are not met. You know that the problem can still happen with the proposed change, right? It's much less likely but definitely can and actually is likely to happen once in a blue moon. It isn't too uncommon for link to take some time to stabilize after a PHY event (including hardreset) and some devices will end up sending multiple D2H Reg FISes in the process with conflicting status. Also, note that the delay between the check and ST setting could be substantial especially with parallel probing / booting. I'm not objecting to the change but you guys probably want to fix the controller in following revisions. If we're gonna make the change, I'd like to go with the previous version without the vendor check. What is the timeframe for the controller release? Would it be enough to merge the change during 2.6.38-rc1? After baking it for some time in 2.6.38, we can propagate the change back through -stable. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 22:54 ` Tejun Heo @ 2010-12-09 1:30 ` Jian Peng 2011-01-11 18:09 ` Jian Peng 1 sibling, 0 replies; 22+ messages in thread From: Jian Peng @ 2010-12-09 1:30 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hi, Tejun, I will go over with chip designer on all detail of this race condition again. AFAIK, our controller reacted to ST bit change but lack of full handshaking between SW and HW leads to failure finally. I can definitely help checking all available controllers I can get. Schedule wise, it is not too bad since this AHCI core is part of SOC instead of standalone controller so we have manageable kernel and patches release for our customers. To help AHCI driver to be more compliant with spec, and also fix specific problem in our controller, it requires some actions. I will post my findings on other controllers after testing it. Thanks, Jian -----Original Message----- From: Tejun Heo [mailto:tj@kernel.org] Sent: Wednesday, December 08, 2010 2:54 PM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, Jian. On 12/08/2010 09:09 PM, Jian Peng wrote: > The controller may take much longer time to recover in this case, > and leads to wrong HW state after stop_engine() inside > ahci_hardreset() and cause device type checking failure due to > unfinished HW state change and missing D2H FIS after start_engine() > again inside ahci_hardreset(). I guess this is the reason why AHCI > spec try to emphasize. I don't necessarily agree there. The requirement is impossible to reliably satisfy to begin with (it's inherently racy) and most specs are filled with "the outcome is undefined" when they don't _need_ to be well defined. The hardware can do "eh.. well, whatever, I don't know" but shouldn't get lost and fail to react to further state-resetting actions. > Yes, without this change, Broadcom controller will fail due to above > reason. Okay, so, the controller goes bonkers if ST is set when prerequisites are not met. You know that the problem can still happen with the proposed change, right? It's much less likely but definitely can and actually is likely to happen once in a blue moon. It isn't too uncommon for link to take some time to stabilize after a PHY event (including hardreset) and some devices will end up sending multiple D2H Reg FISes in the process with conflicting status. Also, note that the delay between the check and ST setting could be substantial especially with parallel probing / booting. I'm not objecting to the change but you guys probably want to fix the controller in following revisions. If we're gonna make the change, I'd like to go with the previous version without the vendor check. What is the timeframe for the controller release? Would it be enough to merge the change during 2.6.38-rc1? After baking it for some time in 2.6.38, we can propagate the change back through -stable. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2010-12-08 22:54 ` Tejun Heo 2010-12-09 1:30 ` Jian Peng @ 2011-01-11 18:09 ` Jian Peng 2011-01-11 18:23 ` Tejun Heo 1 sibling, 1 reply; 22+ messages in thread From: Jian Peng @ 2011-01-11 18:09 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hi, Tejun, Happy Holiday! I want to revisit this issue and hopefully get consensus on it asap. Here is the sequence that will cause problem if BSY|DRQ and SSTS.DET was not checked in start_engine: 1. SUD bit was set in ahci_power_up() to start communication between host and device 2. START bit was set in ahci_start_engine() to prepare for data transfer (should check condition and do not set START bit here per spec) By the time, host did not receive first FIS so BSY|DRQ was not cleared 3. inside ahci_hardreset(), call ahci_stop_engine() first, host controller will take time to clean up internal pipeline and transit to idle state 4. toggle SCTL.DET to reset interface, now COMRESET was not sent since internal state machine stuck at #2 and #3 (per spec), BSY|DRQ bit was not cleared 5. call ahci_start_engine() again and try to read ID from device but interface is busy since BSY bit was not cleared, failed here At the end of section 10.1 of AHCI spec (rev 1.3), it states Software shall not set PxCMD.ST to '1' until it is determined that a functional device is present on the port as determined by PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. It is likely used to prevent host controller from jumping into wrong state before first FIS was received. Please review this issue, and let me know how to resolve it by either adopting my previous patch, or creating a new patch. Thanks, Jian Here is my previous patch against 2.6.37-rc3 > > --- libahci.c.orig 2010-12-08 10:42:48.383976763 -0800 > +++ libahci.c 2010-12-08 10:45:17.495156944 -0800 > @@ -542,6 +542,13 @@ > { > void __iomem *port_mmio = ahci_port_base(ap); > u32 tmp; > + u8 status = readl(port_mmio + PORT_TFDATA) & 0xFF; > + > + /* avoid race condition per spec (end of section 10.1.2) */ > + if (status & (ATA_BUSY | ATA_DRQ) || > + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || > + (tmp & 0x0f) != 0x03) > + return; > > /* start DMA */ > tmp = readl(port_mmio + PORT_CMD); -----Original Message----- From: Tejun Heo [mailto:tj@kernel.org] Sent: Wednesday, December 08, 2010 2:54 PM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, Jian. On 12/08/2010 09:09 PM, Jian Peng wrote: > The controller may take much longer time to recover in this case, > and leads to wrong HW state after stop_engine() inside > ahci_hardreset() and cause device type checking failure due to > unfinished HW state change and missing D2H FIS after start_engine() > again inside ahci_hardreset(). I guess this is the reason why AHCI > spec try to emphasize. I don't necessarily agree there. The requirement is impossible to reliably satisfy to begin with (it's inherently racy) and most specs are filled with "the outcome is undefined" when they don't _need_ to be well defined. The hardware can do "eh.. well, whatever, I don't know" but shouldn't get lost and fail to react to further state-resetting actions. > Yes, without this change, Broadcom controller will fail due to above > reason. Okay, so, the controller goes bonkers if ST is set when prerequisites are not met. You know that the problem can still happen with the proposed change, right? It's much less likely but definitely can and actually is likely to happen once in a blue moon. It isn't too uncommon for link to take some time to stabilize after a PHY event (including hardreset) and some devices will end up sending multiple D2H Reg FISes in the process with conflicting status. Also, note that the delay between the check and ST setting could be substantial especially with parallel probing / booting. I'm not objecting to the change but you guys probably want to fix the controller in following revisions. If we're gonna make the change, I'd like to go with the previous version without the vendor check. What is the timeframe for the controller release? Would it be enough to merge the change during 2.6.38-rc1? After baking it for some time in 2.6.38, we can propagate the change back through -stable. Thanks. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-11 18:09 ` Jian Peng @ 2011-01-11 18:23 ` Tejun Heo 2011-01-11 18:55 ` Jian Peng 0 siblings, 1 reply; 22+ messages in thread From: Tejun Heo @ 2011-01-11 18:23 UTC (permalink / raw) To: Jian Peng Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hello, On Tue, Jan 11, 2011 at 10:09:08AM -0800, Jian Peng wrote: > Happy Holiday! I want to revisit this issue and hopefully get > consensus on it asap. Happy new year! > Here is the sequence that will cause problem if BSY|DRQ and SSTS.DET > was not checked in start_engine: > > 1. SUD bit was set in ahci_power_up() to start communication between > host and device > 2. START bit was set in ahci_start_engine() to prepare for data > transfer (should check condition and do not set START bit here > per spec) > By the time, host did not receive first FIS so BSY|DRQ was not > cleared > 3. inside ahci_hardreset(), call ahci_stop_engine() first, host > controller will take time to clean up internal pipeline and > transit to idle state > 4. toggle SCTL.DET to reset interface, now COMRESET was not sent > since internal state machine stuck at #2 and #3 (per> spec), > BSY|DRQ bit was not cleared > 5. call ahci_start_engine() again and try to read ID from device but > interface is busy since BSY bit was not cleared, failed here > > At the end of section 10.1 of AHCI spec (rev 1.3), it states > > Software shall not set PxCMD.ST to '1' until it is determined that a > functional device is present on the port as determined by > PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. > > It is likely used to prevent host controller from jumping into wrong > state before first FIS was received. I still have to say it's pretty silly of the controller to stall inrecoverably. > Please review this issue, and let me know how to resolve it by > either adopting my previous patch, or creating a new patch. You haven't really brought up any new issue so my comment stays the same. >> I'm not objecting to the change but you guys probably want to fix >> the controller in following revisions. If we're gonna make the >> change, I'd like to go with the previous version without the vendor >> check. What is the timeframe for the controller release? Would it >> be enough to merge the change during 2.6.38-rc1? After baking it >> for some time in 2.6.38, we can propagate the change back through >> -stable. So, yeah, let's proceed with the patch. Please make the following changes and resubmit. * Please don't do readl() on the same line as variable declaration and drop the unnecessary & 0xff. * Hmmm... please consider adding a KERN_DEBUG message when ahci_start_port() exits without starting the port. Just in case it causes problems on other controllers. Thank you. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-11 18:23 ` Tejun Heo @ 2011-01-11 18:55 ` Jian Peng 2011-01-12 0:45 ` Harik 2011-01-19 0:51 ` Jeff Garzik 0 siblings, 2 replies; 22+ messages in thread From: Jian Peng @ 2011-01-11 18:55 UTC (permalink / raw) To: Tejun Heo Cc: Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Hi, Tejun, Here is new patch based on your suggestion (against 2.6.37 release) --- libahci.c.orig 2011-01-11 10:46:56.623991326 -0800 +++ libahci.c 2011-01-11 10:52:19.634036938 -0800 @@ -542,6 +542,15 @@ { void __iomem *port_mmio = ahci_port_base(ap); u32 tmp; + u8 status; + + status = readl(port_mmio + PORT_TFDATA); + if (status & (ATA_BUSY | ATA_DRQ) || + ahci_scr_read(&ap->link, SCR_STATUS, &tmp) || + (tmp & 0x0f) != 0x03) { + printk(KERN_DEBUG "START bit was not set in %s\n", __func__); + return; + } /* start DMA */ tmp = readl(port_mmio + PORT_CMD); I will test host controller made by other vendors and report issues soon. Thanks, Jian ________________________________________ From: Tejun Heo [htejun@gmail.com] On Behalf Of Tejun Heo [tj@kernel.org] Sent: Tuesday, January 11, 2011 10:23 AM To: Jian Peng Cc: Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver Hello, On Tue, Jan 11, 2011 at 10:09:08AM -0800, Jian Peng wrote: > Happy Holiday! I want to revisit this issue and hopefully get > consensus on it asap. Happy new year! > Here is the sequence that will cause problem if BSY|DRQ and SSTS.DET > was not checked in start_engine: > > 1. SUD bit was set in ahci_power_up() to start communication between > host and device > 2. START bit was set in ahci_start_engine() to prepare for data > transfer (should check condition and do not set START bit here > per spec) > By the time, host did not receive first FIS so BSY|DRQ was not > cleared > 3. inside ahci_hardreset(), call ahci_stop_engine() first, host > controller will take time to clean up internal pipeline and > transit to idle state > 4. toggle SCTL.DET to reset interface, now COMRESET was not sent > since internal state machine stuck at #2 and #3 (per> spec), > BSY|DRQ bit was not cleared > 5. call ahci_start_engine() again and try to read ID from device but > interface is busy since BSY bit was not cleared, failed here > > At the end of section 10.1 of AHCI spec (rev 1.3), it states > > Software shall not set PxCMD.ST to '1' until it is determined that a > functional device is present on the port as determined by > PxTFD.STS.BSY = '0', PxTFD.STS.DRQ = '0', and PxSSTS.DET = 3h. > > It is likely used to prevent host controller from jumping into wrong > state before first FIS was received. I still have to say it's pretty silly of the controller to stall inrecoverably. > Please review this issue, and let me know how to resolve it by > either adopting my previous patch, or creating a new patch. You haven't really brought up any new issue so my comment stays the same. >> I'm not objecting to the change but you guys probably want to fix >> the controller in following revisions. If we're gonna make the >> change, I'd like to go with the previous version without the vendor >> check. What is the timeframe for the controller release? Would it >> be enough to merge the change during 2.6.38-rc1? After baking it >> for some time in 2.6.38, we can propagate the change back through >> -stable. So, yeah, let's proceed with the patch. Please make the following changes and resubmit. * Please don't do readl() on the same line as variable declaration and drop the unnecessary & 0xff. * Hmmm... please consider adding a KERN_DEBUG message when ahci_start_port() exits without starting the port. Just in case it causes problems on other controllers. Thank you. -- tejun ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-11 18:55 ` Jian Peng @ 2011-01-12 0:45 ` Harik 2011-01-12 0:51 ` Jian Peng 2011-01-19 0:51 ` Jeff Garzik 1 sibling, 1 reply; 22+ messages in thread From: Harik @ 2011-01-12 0:45 UTC (permalink / raw) To: Jian Peng Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide > + (tmp & 0x0f) != 0x03) { > + printk(KERN_DEBUG "START bit was not set in %s\n", __func__); > + return; > + } Are there defines for the mask and result that you could use instead of raw hex? ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-12 0:45 ` Harik @ 2011-01-12 0:51 ` Jian Peng 0 siblings, 0 replies; 22+ messages in thread From: Jian Peng @ 2011-01-12 0:51 UTC (permalink / raw) To: Harik Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, jgarzik@pobox.com, ide Unfortunately, no -----Original Message----- From: Harik [mailto:harik.attar@gmail.com] Sent: Tuesday, January 11, 2011 4:46 PM To: Jian Peng Cc: Tejun Heo; Robert Hancock; linux-kernel@vger.kernel.org; jgarzik@pobox.com; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver > + (tmp & 0x0f) != 0x03) { > + printk(KERN_DEBUG "START bit was not set in %s\n", __func__); > + return; > + } Are there defines for the mask and result that you could use instead of raw hex? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-11 18:55 ` Jian Peng 2011-01-12 0:45 ` Harik @ 2011-01-19 0:51 ` Jeff Garzik 2011-01-19 0:58 ` Jian Peng ` (3 more replies) 1 sibling, 4 replies; 22+ messages in thread From: Jeff Garzik @ 2011-01-19 0:51 UTC (permalink / raw) To: Jian Peng; +Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, ide On 01/11/2011 01:55 PM, Jian Peng wrote: > Here is new patch based on your suggestion (against 2.6.37 release) [...] > I will test host controller made by other vendors and report issues soon. Any updates? Jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-19 0:51 ` Jeff Garzik @ 2011-01-19 0:58 ` Jian Peng 2011-01-19 23:35 ` Jian Peng ` (2 subsequent siblings) 3 siblings, 0 replies; 22+ messages in thread From: Jian Peng @ 2011-01-19 0:58 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, ide Sorry that I switched to other tasks recently, and will report the testing results within this week. Any suggestion on how to test the code path affected by this patch, AFAIK, bootup and hotplug will use this path, What else? Thanks, Jian -----Original Message----- From: Jeff Garzik [mailto:jgpobox@gmail.com] On Behalf Of Jeff Garzik Sent: Tuesday, January 18, 2011 4:51 PM To: Jian Peng Cc: Tejun Heo; Robert Hancock; linux-kernel@vger.kernel.org; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver On 01/11/2011 01:55 PM, Jian Peng wrote: > Here is new patch based on your suggestion (against 2.6.37 release) [...] > I will test host controller made by other vendors and report issues soon. Any updates? Jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-19 0:51 ` Jeff Garzik 2011-01-19 0:58 ` Jian Peng @ 2011-01-19 23:35 ` Jian Peng 2011-01-19 23:37 ` Jian Peng 2011-04-19 21:48 ` Jian Peng 3 siblings, 0 replies; 22+ messages in thread From: Jian Peng @ 2011-01-19 23:35 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, ide Hi, Jeff, I bought a bunch of PCI cards and tested them, only one running AHCI using Marvell 88SE9125 controller chip. It worked well with the patched kernel at bootup, hotplug, and data transfer. I put an order of a JMicron AHCI based PCI card and will test it. Thanks, Jian -----Original Message----- From: Jeff Garzik [mailto:jgpobox@gmail.com] On Behalf Of Jeff Garzik Sent: Tuesday, January 18, 2011 4:51 PM To: Jian Peng Cc: Tejun Heo; Robert Hancock; linux-kernel@vger.kernel.org; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver On 01/11/2011 01:55 PM, Jian Peng wrote: > Here is new patch based on your suggestion (against 2.6.37 release) [...] > I will test host controller made by other vendors and report issues soon. Any updates? Jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-19 0:51 ` Jeff Garzik 2011-01-19 0:58 ` Jian Peng 2011-01-19 23:35 ` Jian Peng @ 2011-01-19 23:37 ` Jian Peng 2011-04-19 21:48 ` Jian Peng 3 siblings, 0 replies; 22+ messages in thread From: Jian Peng @ 2011-01-19 23:37 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, ide Resent it since mailbox crashed last time Hi, Jeff, I bought a bunch of PCI cards and tested them, only one running AHCI using Marvell 88SE9125 controller chip. It worked well with the patched kernel at bootup, hotplug, and data transfer. I put an order of a JMicron AHCI based PCI card and will test it. Thanks, Jian -----Original Message----- From: Jeff Garzik [mailto:jgpobox@gmail.com] On Behalf Of Jeff Garzik Sent: Tuesday, January 18, 2011 4:51 PM To: Jian Peng Cc: Tejun Heo; Robert Hancock; linux-kernel@vger.kernel.org; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver On 01/11/2011 01:55 PM, Jian Peng wrote: > Here is new patch based on your suggestion (against 2.6.37 release) [...] > I will test host controller made by other vendors and report issues soon. Any updates? Jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: questions regarding possible violation of AHCI spec in AHCI driver 2011-01-19 0:51 ` Jeff Garzik ` (2 preceding siblings ...) 2011-01-19 23:37 ` Jian Peng @ 2011-04-19 21:48 ` Jian Peng 3 siblings, 0 replies; 22+ messages in thread From: Jian Peng @ 2011-04-19 21:48 UTC (permalink / raw) To: Jeff Garzik; +Cc: Tejun Heo, Robert Hancock, linux-kernel@vger.kernel.org, ide Hi, Jeff/Tejun, I retested my patch using Marvell AHCI controller card and Intel built-in controller with 2.6.38 kernel, it worked well. I resubmitted the patch against Linus git tree in separate email through git send-email already. Please review it. Thanks, Jian -----Original Message----- From: Jeff Garzik [mailto:jgpobox@gmail.com] On Behalf Of Jeff Garzik Sent: Tuesday, January 18, 2011 4:51 PM To: Jian Peng Cc: Tejun Heo; Robert Hancock; linux-kernel@vger.kernel.org; ide Subject: Re: questions regarding possible violation of AHCI spec in AHCI driver On 01/11/2011 01:55 PM, Jian Peng wrote: > Here is new patch based on your suggestion (against 2.6.37 release) [...] > I will test host controller made by other vendors and report issues soon. Any updates? Jeff ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2011-04-19 21:48 UTC | newest] Thread overview: 22+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-12-07 7:43 questions regarding possible violation of AHCI spec in AHCI driver Jian Peng 2010-12-08 1:54 ` Robert Hancock 2010-12-08 10:07 ` Tejun Heo 2010-12-08 18:14 ` Jian Peng 2010-12-08 18:24 ` Tejun Heo 2010-12-08 18:48 ` Jian Peng 2010-12-08 18:52 ` Tejun Heo 2010-12-08 19:49 ` Jian Peng 2010-12-08 19:55 ` Tejun Heo 2010-12-08 20:09 ` Jian Peng 2010-12-08 22:54 ` Tejun Heo 2010-12-09 1:30 ` Jian Peng 2011-01-11 18:09 ` Jian Peng 2011-01-11 18:23 ` Tejun Heo 2011-01-11 18:55 ` Jian Peng 2011-01-12 0:45 ` Harik 2011-01-12 0:51 ` Jian Peng 2011-01-19 0:51 ` Jeff Garzik 2011-01-19 0:58 ` Jian Peng 2011-01-19 23:35 ` Jian Peng 2011-01-19 23:37 ` Jian Peng 2011-04-19 21:48 ` Jian Peng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox