linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] cxl: Flush PSL cache before resetting the adapter
@ 2016-10-03 19:36 Frederic Barrat
  2016-10-04  3:45 ` Andrew Donnellan
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Frederic Barrat @ 2016-10-03 19:36 UTC (permalink / raw)
  To: imunsie, linuxppc-dev

If the capi link is going down while the PSL owns a dirty cache line,
any access from the host for that data could lead to an Unrecoverable
Error.
So when resetting the capi adapter through sysfs, make sure the PSL
cache is flushed. It won't help if there are any active Process
Elements on the card, as the cache would likely get new dirty cache
lines immediately, but if resetting an idle adapter, it should avoid
any bad surprises from data left over from terminated Process Elements.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
---
 drivers/misc/cxl/cxl.h    |  6 +++++-
 drivers/misc/cxl/native.c | 31 +++++++++++++++++++++++++++++++
 drivers/misc/cxl/pci.c    |  3 +++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 344a0ff..01d372a 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -162,7 +162,10 @@ static const cxl_p2n_reg_t CXL_PSL_WED_An     = {0x0A0};
 #define CXL_PSL_SPAP_V    0x0000000000000001ULL
 
 /****** CXL_PSL_Control ****************************************************/
-#define CXL_PSL_Control_tb 0x0000000000000001ULL
+#define CXL_PSL_Control_tb              (0x1ull << (63-63))
+#define CXL_PSL_Control_Fr              (0x1ull << (63-31))
+#define CXL_PSL_Control_Fs_MASK         (0x3ull << (63-29))
+#define CXL_PSL_Control_Fs_Complete     (0x3ull << (63-29))
 
 /****** CXL_PSL_DLCNTL *****************************************************/
 #define CXL_PSL_DLCNTL_D (0x1ull << (63-28))
@@ -854,6 +857,7 @@ int cxl_register_one_irq(struct cxl *adapter, irq_handler_t handler,
 int cxl_check_error(struct cxl_afu *afu);
 int cxl_afu_slbia(struct cxl_afu *afu);
 int cxl_tlb_slb_invalidate(struct cxl *adapter);
+int cxl_data_cache_flush(struct cxl *adapter);
 int cxl_afu_disable(struct cxl_afu *afu);
 int cxl_psl_purge(struct cxl_afu *afu);
 
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index e606fdc..a217a74 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -290,6 +290,37 @@ int cxl_tlb_slb_invalidate(struct cxl *adapter)
 	return 0;
 }
 
+int cxl_data_cache_flush(struct cxl *adapter)
+{
+	u64 reg;
+	unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);
+
+	pr_devel("Flushing data cache\n");
+
+	reg = cxl_p1_read(adapter, CXL_PSL_Control);
+	reg |= CXL_PSL_Control_Fr;
+	cxl_p1_write(adapter, CXL_PSL_Control, reg);
+
+	reg = cxl_p1_read(adapter, CXL_PSL_Control);
+	while ((reg & CXL_PSL_Control_Fs_MASK) != CXL_PSL_Control_Fs_Complete) {
+		if (time_after_eq(jiffies, timeout)) {
+			dev_warn(&adapter->dev, "WARNING: cache flush timed out!\n");
+			return -EBUSY;
+		}
+
+		if (!cxl_ops->link_ok(adapter, NULL)) {
+			dev_warn(&adapter->dev, "WARNING: link down when flushing cache\n");
+			return -EIO;
+		}
+		cpu_relax();
+		reg = cxl_p1_read(adapter, CXL_PSL_Control);
+	}
+
+	reg &= ~CXL_PSL_Control_Fr;
+	cxl_p1_write(adapter, CXL_PSL_Control, reg);
+	return 0;
+}
+
 static int cxl_write_sstp(struct cxl_afu *afu, u64 sstp0, u64 sstp1)
 {
 	int rc;
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 6f0c4ac..731e2e2 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1239,6 +1239,9 @@ int cxl_pci_reset(struct cxl *adapter)
 
 	dev_info(&dev->dev, "CXL reset\n");
 
+	/* the adapter is about to be reset, so ignore errors */
+	cxl_data_cache_flush(adapter);
+
 	/* pcie_warm_reset requests a fundamental pci reset which includes a
 	 * PERST assert/deassert.  PERST triggers a loading of the image
 	 * if "user" or "factory" is selected in sysfs */
-- 
2.7.4
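
The new CXL_PSL_Control_* macros in the cxl.h hunk use IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register, which is why every shift is written as (63 - n). A minimal userspace sketch of that arithmetic follows. The helper names (ibm_bit, psl_control_*, psl_flush_complete) are made up for illustration and are not part of the driver, and reading Fr as the flush request bit and Fs as the flush status field is inferred from how the patch uses them.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit
 * register and bit 63 the least significant, hence the (63 - n) shifts. */
static inline uint64_t ibm_bit(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* Same values as the patch's CXL_PSL_Control_* macros (hypothetical names). */
static inline uint64_t psl_control_tb(void)      { return ibm_bit(63); }
static inline uint64_t psl_control_fr(void)      { return ibm_bit(31); }
static inline uint64_t psl_control_fs_mask(void) { return ibm_bit(28) | ibm_bit(29); }

/* The patch treats the flush as complete only when both Fs bits are set. */
static inline int psl_flush_complete(uint64_t reg)
{
	return (reg & psl_control_fs_mask()) == psl_control_fs_mask();
}
```

With this numbering, CXL_PSL_Control_tb keeps its old value of 0x1, so the rewritten define in the patch changes notation only.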

* Re: [PATCH] cxl: Flush PSL cache before resetting the adapter
  2016-10-03 19:36 [PATCH] cxl: Flush PSL cache before resetting the adapter Frederic Barrat
@ 2016-10-04  3:45 ` Andrew Donnellan
  2016-10-04  4:31 ` Ian Munsie
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Andrew Donnellan @ 2016-10-04  3:45 UTC (permalink / raw)
  To: Frederic Barrat, imunsie, linuxppc-dev

On 04/10/16 06:36, Frederic Barrat wrote:
> If the capi link is going down while the PSL owns a dirty cache line,
> any access from the host for that data could lead to an Unrecoverable

IIRC, s/Unrecoverable/Uncorrectable/

> Error.
> So when resetting the capi adapter through sysfs, make sure the PSL
> cache is flushed. It won't help if there are any active Process
> Elements on the card, as the cache would likely get new dirty cache
> lines immediately, but if resetting an idle adapter, it should avoid
> any bad surprises from data left over from terminated Process Elements.
>
> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>

Otherwise looks good to me.

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

* Re: [PATCH] cxl: Flush PSL cache before resetting the adapter
  2016-10-03 19:36 [PATCH] cxl: Flush PSL cache before resetting the adapter Frederic Barrat
  2016-10-04  3:45 ` Andrew Donnellan
@ 2016-10-04  4:31 ` Ian Munsie
  2016-10-04  5:49 ` Vaibhav Jain
  2016-10-05  2:36 ` Michael Ellerman
  3 siblings, 0 replies; 6+ messages in thread
From: Ian Munsie @ 2016-10-04  4:31 UTC (permalink / raw)
  To: Frederic Barrat; +Cc: linuxppc-dev

Acked-by: Ian Munsie <imunsie@au1.ibm.com>

* Re: [PATCH] cxl: Flush PSL cache before resetting the adapter
  2016-10-03 19:36 [PATCH] cxl: Flush PSL cache before resetting the adapter Frederic Barrat
  2016-10-04  3:45 ` Andrew Donnellan
  2016-10-04  4:31 ` Ian Munsie
@ 2016-10-04  5:49 ` Vaibhav Jain
  2016-10-04  8:35   ` Frederic Barrat
  2016-10-05  2:36 ` Michael Ellerman
  3 siblings, 1 reply; 6+ messages in thread
From: Vaibhav Jain @ 2016-10-04  5:49 UTC (permalink / raw)
  To: Frederic Barrat, imunsie, linuxppc-dev

Hi Fred,

Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes:

>
> +	/* the adapter is about to be reset, so ignore errors */
> +	cxl_data_cache_flush(adapter);
> +
It would be a good idea to return an error and not let the reset proceed
if cxl_data_cache_flush() returns EBUSY, as continuing may cause the
UE error.

~ Vaibhav
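
The suggestion above could look roughly like the sketch below. This is a userspace model only: reset_checked() and its parameters are hypothetical and do not correspond to any function in the cxl driver. It shows the flush result being checked, and the reset skipped, when the flush reports -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reset path: flush_rc is the value the cache flush
 * returned, and *reset_done records whether the reset was issued. */
static int reset_checked(int flush_rc, int *reset_done)
{
	assert(reset_done != NULL);
	*reset_done = 0;
	if (flush_rc == -EBUSY)
		return flush_rc;	/* abort: the flush did not complete */
	*reset_done = 1;		/* any other outcome: go ahead and reset */
	return 0;
}
```

The patch as posted takes the opposite approach and ignores the return value, on the grounds that the adapter is about to be reset anyway.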

* Re: [PATCH] cxl: Flush PSL cache before resetting the adapter
  2016-10-04  5:49 ` Vaibhav Jain
@ 2016-10-04  8:35   ` Frederic Barrat
  0 siblings, 0 replies; 6+ messages in thread
From: Frederic Barrat @ 2016-10-04  8:35 UTC (permalink / raw)
  To: Vaibhav Jain, imunsie, linuxppc-dev

Hi Vaibhav,


On 04/10/2016 at 07:49, Vaibhav Jain wrote:
> Hi Fred,
>
> Frederic Barrat <fbarrat@linux.vnet.ibm.com> writes:
>
>>
>> +	/* the adapter is about to be reset, so ignore errors */
>> +	cxl_data_cache_flush(adapter);
>> +
> It would be a good idea to return an error and not let the reset proceed
> if cxl_data_cache_flush() returns EBUSY, as continuing may cause the
> UE error.

I'm going to change cxl_data_cache_flush() to return ETIMEDOUT instead of
EBUSY, as EBUSY is misleading. With the current patch, EBUSY does not mean
that there are active contexts running on the card. It is returned when
the hardware/PSL doesn't reply to the flush request within 5 seconds.
That's not supposed to happen and would indicate an issue with the
hardware/PSL, in which case the adapter is close to useless, so we might
as well try resetting it.
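
To make the distinction between the error codes concrete, here is a small userspace model of the polling loop. Everything in it is an assumption made for illustration: struct fake_psl, fake_read_control() and flush_poll() do not exist in the driver, and a poll budget stands in for the jiffies deadline. It returns -ETIMEDOUT when the status field never reads as complete and -EIO when the link drops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FS_MASK     (UINT64_C(0x3) << (63 - 29))
#define FS_COMPLETE (UINT64_C(0x3) << (63 - 29))

/* Hypothetical stand-in for the adapter: the status field reads as
 * complete after polls_until_done reads; the link can be forced down. */
struct fake_psl {
	unsigned int polls_until_done;
	int link_ok;
};

static uint64_t fake_read_control(struct fake_psl *psl)
{
	if (psl->polls_until_done == 0)
		return FS_COMPLETE;
	psl->polls_until_done--;
	return 0;
}

/* Poll until the flush completes; max_polls replaces the jiffies deadline. */
static int flush_poll(struct fake_psl *psl, unsigned int max_polls)
{
	unsigned int polls = 0;

	assert(psl != NULL);
	while ((fake_read_control(psl) & FS_MASK) != FS_COMPLETE) {
		if (++polls >= max_polls)
			return -ETIMEDOUT;	/* hardware never answered */
		if (!psl->link_ok)
			return -EIO;		/* link went down mid-flush */
	}
	return 0;
}
```

Neither error says anything about active contexts; both describe the hardware failing to answer the flush request.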

On a related note, we've talked with the folks from cxlflash, and we'll 
test a separate (complementary) patch to deny resetting the adapter if 
there are any active contexts, since, as you say, the likelihood of 
hitting a UE would be pretty high.
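
The idea behind that complementary patch could be modelled as below. It is only a sketch under stated assumptions: the patch itself is not part of this thread, struct fake_adapter and fake_adapter_reset() are hypothetical, and the real driver tracks contexts differently. Here -EBUSY carries its usual meaning, that the adapter is still in use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical adapter state, for illustration only. */
struct fake_adapter {
	unsigned int active_contexts;
	unsigned int resets;
};

/* Refuse to reset while any context is still active. */
static int fake_adapter_reset(struct fake_adapter *adapter)
{
	assert(adapter != NULL);
	if (adapter->active_contexts > 0)
		return -EBUSY;
	adapter->resets++;
	return 0;
}
```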

   Fred

* Re: cxl: Flush PSL cache before resetting the adapter
  2016-10-03 19:36 [PATCH] cxl: Flush PSL cache before resetting the adapter Frederic Barrat
                   ` (2 preceding siblings ...)
  2016-10-04  5:49 ` Vaibhav Jain
@ 2016-10-05  2:36 ` Michael Ellerman
  3 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2016-10-05  2:36 UTC (permalink / raw)
  To: Frederic Barrat, imunsie, linuxppc-dev

On Mon, 2016-10-03 at 19:36:02 UTC, Frederic Barrat wrote:
> If the capi link is going down while the PSL owns a dirty cache line,
> any access from the host for that data could lead to an Unrecoverable
> Error.
> So when resetting the capi adapter through sysfs, make sure the PSL
> cache is flushed. It won't help if there are any active Process
> Elements on the card, as the cache would likely get new dirty cache
> lines immediately, but if resetting an idle adapter, it should avoid
> any bad surprises from data left over from terminated Process Elements.
> 
> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Acked-by: Ian Munsie <imunsie@au1.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/aaa2245ed836824f21f8e42e0ab63b

cheers
