public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
@ 2024-02-18  5:02 ni.liqiang
  2024-02-19  5:44 ` Daniel Mentz
  0 siblings, 1 reply; 6+ messages in thread
From: ni.liqiang @ 2024-02-18  5:02 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel
  Cc: ni.liqiang, jin . qi, linux-arm-kernel, iommu, linux-kernel

In the system reboot test, I encountered an issue:
After the OS started, the base address of CMDQ failed to be written
successfully and remained at the default value of 0.

Through timing analysis of CMN, it was found that although
the write request for the CMDQ base precedes the write request
for CMDQEN, the write response for the CMDQ base might be later
than that for CMDQEN.

Upon reviewing the SMMU Architecture Specification,
I found the following explanation:
The registers must be initialized in this order:
1. Write SMMU_CMDQ_BASE to set the queue base and size.
2. Write initial values to SMMU_CMDQ_CONS and SMMU_CMDQ_PROD.
3. Enable the queue with an Update of the respective SMMU_CR0.CMDQEN to 1.

If there are no memory barriers, how can we ensure this order?
Therefore, I believe that adding a memory barrier before enabling CMDQ
is necessary to ensure that the base address of CMDQ is correctly written.

The base addresses of EVENTQ and PRIQ would also be subject
to the same situation.

Could you please review if this modification seems reasonable? Thank you.

Signed-off-by: ni.liqiang <niliqiang.io@gmail.com>
Reviewed-by: jin.qi <jin.qi@zte.com.cn>
Tested-by: ni.liqiang <niliqiang.io@gmail.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0ffb1cf17e0b..ac854c46fdf3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3324,6 +3324,11 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD);
 	writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS);
 
+	/* Ensure that SMMU_CMDQ_BASE is written completely
+	 * when SMMU_CR0.CMDQEN == 0.
+	 */
+	__iomb();
+
 	enables = CR0_CMDQEN;
 	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
 				      ARM_SMMU_CR0ACK);
@@ -3350,6 +3355,11 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	writel_relaxed(smmu->evtq.q.llq.prod, smmu->page1 + ARM_SMMU_EVTQ_PROD);
 	writel_relaxed(smmu->evtq.q.llq.cons, smmu->page1 + ARM_SMMU_EVTQ_CONS);
 
+	/* Ensure that SMMU_EVENTQ_BASE is written completely
+	 * when SMMU_CR0.EVENTQEN == 0.
+	 */
+	__iomb();
+
 	enables |= CR0_EVTQEN;
 	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
 				      ARM_SMMU_CR0ACK);
@@ -3367,6 +3377,11 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 		writel_relaxed(smmu->priq.q.llq.cons,
 			       smmu->page1 + ARM_SMMU_PRIQ_CONS);
 
+		/* Ensure that SMMU_PRIQ_BASE is written completely
+		 * when SMMU_CR0.PRIQEN == 0.
+		 */
+		__iomb();
+
 		enables |= CR0_PRIQEN;
 		ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
 					      ARM_SMMU_CR0ACK);
-- 
2.34.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
  2024-02-18  5:02 [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization ni.liqiang
@ 2024-02-19  5:44 ` Daniel Mentz
  2024-02-19  9:17   ` Will Deacon
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Mentz @ 2024-02-19  5:44 UTC (permalink / raw)
  To: ni.liqiang
  Cc: Will Deacon, Robin Murphy, Joerg Roedel, jin . qi,
	linux-arm-kernel, iommu, linux-kernel

On Sat, Feb 17, 2024 at 9:02 PM ni.liqiang <niliqiang.io@gmail.com> wrote:
> If there are no memory barriers, how can we ensure this order?

 The SMMU registers are accessed using Device-nGnRE attributes. It is
my understanding that, for Device-nGnRE, the Arm architecture requires
that writes to the same peripheral arrive at the endpoint in program
order.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
  2024-02-19  5:44 ` Daniel Mentz
@ 2024-02-19  9:17   ` Will Deacon
  2024-02-21 15:26     ` ni.liqiang
  0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2024-02-19  9:17 UTC (permalink / raw)
  To: Daniel Mentz
  Cc: ni.liqiang, Robin Murphy, Joerg Roedel, jin . qi,
	linux-arm-kernel, iommu, linux-kernel

On Sun, Feb 18, 2024 at 09:44:47PM -0800, Daniel Mentz wrote:
> On Sat, Feb 17, 2024 at 9:02 PM ni.liqiang <niliqiang.io@gmail.com> wrote:
> > If there are no memory barriers, how can we ensure this order?
> 
>  The SMMU registers are accessed using Device-nGnRE attributes. It is
> my understanding that, for Device-nGnRE, the Arm architecture requires
> that writes to the same peripheral arrive at the endpoint in program
> order.

Yup, that's correct. The "nR" part means "non-Reordering", so something
else is going on here.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
  2024-02-19  9:17   ` Will Deacon
@ 2024-02-21 15:26     ` ni.liqiang
  2024-02-21 16:08       ` Will Deacon
  0 siblings, 1 reply; 6+ messages in thread
From: ni.liqiang @ 2024-02-21 15:26 UTC (permalink / raw)
  To: will, danielmentz
  Cc: iommu, jin.qi, joro, linux-arm-kernel, linux-kernel, robin.murphy,
	ni.liqiang

>>  The SMMU registers are accessed using Device-nGnRE attributes. It is
>> my understanding that, for Device-nGnRE, the Arm architecture requires
>> that writes to the same peripheral arrive at the endpoint in program
>> order.
>
> Yup, that's correct. The "nR" part means "non-Reordering", so something
> else is going on here.

Yes, the SMMU registers are accessed using Device-nGnRE attributes. 

One additional point to note is: in cases where there is a failure writing to the CMDQ base address register, the testing environment was a multi-die, multi-socket server. This issue has not been observed on a single-die server. I apologize for omitting this information in my initial patch submission. 

Over the past few days, I have referenced the kernel source code and ported the SMMU register initialization process. Through multiple stress tests, I have attempted to reproduce the CMDQ base address register write failure issue. The summarized results of my experiments are as follows:
1. When testing with one CPU core bound using taskset, the initialization process was executed 300,000 times without encountering the CMDQ base address register write failure issue. However, when not binding CPU using taskset, the issue was reproduced around 1,000 iterations into the test.
2. Without CPU binding, I inserted a memory barrier between accesses to the CMDQ_BASE register and CMDQEN register, similar to the modification made in the patch. After executing the initialization process 300,000 times, the CMDQ base address register write failure issue did not occur.

Based on these observations and joint analysis with CMN colleagues, we speculate that in the SMMU register initialization process, if the CPU core changes, and these CPUs are located on different dies, the underlying 4 CCG ports are utilized to perform die-to-die accesses. However, in our current strategy, these 4 CCG ports cannot guarantee ordering, resulting in the completion of CMDQEN writing before the completion of CMDQ base address writing.

From the analysis above, it seems that modifying the die-to-die access strategy to achieve ordering of Device-nGnRE memory might be a better solution compared to adding a memory barrier?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
  2024-02-21 15:26     ` ni.liqiang
@ 2024-02-21 16:08       ` Will Deacon
  2024-02-23 16:20         ` ni.liqiang
  0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2024-02-21 16:08 UTC (permalink / raw)
  To: ni.liqiang
  Cc: danielmentz, iommu, jin.qi, joro, linux-arm-kernel, linux-kernel,
	robin.murphy, ni.liqiang

On Wed, Feb 21, 2024 at 11:26:29PM +0800, ni.liqiang wrote:
> >>  The SMMU registers are accessed using Device-nGnRE attributes. It is
> >> my understanding that, for Device-nGnRE, the Arm architecture requires
> >> that writes to the same peripheral arrive at the endpoint in program
> >> order.
> >
> > Yup, that's correct. The "nR" part means "non-Reordering", so something
> > else is going on here.
> 
> Yes, the SMMU registers are accessed using Device-nGnRE attributes. 
> 
> One additional point to note is: in cases where there is a failure writing
> to the CMDQ base address register, the testing environment was a
> multi-die, multi-socket server. This issue has not been observed on a
> single-die server. I apologize for omitting this information in my initial
> patch submission. 

Uh-oh, smells like a hardware issue ;p
I wonder if Device-nGnRnE behaves any differently?

> Over the past few days, I have referenced the kernel source code and
> ported the SMMU register initialization process. Through multiple stress
> tests, I have attempted to reproduce the CMDQ base address register write
> failure issue. The summarized results of my experiments are as follows:
> 1. When testing with one CPU core bound using taskset, the initialization
> process was executed 300,000 times without encountering the CMDQ base
> address register write failure issue. However, when not binding CPU using
> taskset, the issue was reproduced around 1,000 iterations into the test.
> 2. Without CPU binding, I inserted a memory barrier between accesses to
> the CMDQ_BASE register and CMDQEN register, similar to the modification
> made in the patch. After executing the initialization process 300,000
> times, the CMDQ base address register write failure issue did not occur.
> 
> Based on these observations and joint analysis with CMN colleagues, we
> speculate that in the SMMU register initialization process, if the CPU
> core changes, and these CPUs are located on different dies, the underlying
> 4 CCG ports are utilized to perform die-to-die accesses. However, in our
> current strategy, these 4 CCG ports cannot guarantee ordering, resulting
> in the completion of CMDQEN writing before the completion of CMDQ base
> address writing.

(Disclaimer: I don't know what a CCG port is)

Hmmm. The part that doesn't make sense to me here is that migrating between
CPUs implies context-switching, and we have a DSB on that path in
__switch_to(). So why would adding barriers to the driver help? Maybe it
just changes the timing?

> From the analysis above, it seems that modifying the die-to-die access
> strategy to achieve ordering of Device-nGnRE memory might be a better
> solution compared to adding a memory barrier?

I'm not sure what you're proposing, but I don't think Linux should be
changed to accomodate this.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization.
  2024-02-21 16:08       ` Will Deacon
@ 2024-02-23 16:20         ` ni.liqiang
  0 siblings, 0 replies; 6+ messages in thread
From: ni.liqiang @ 2024-02-23 16:20 UTC (permalink / raw)
  To: will, danielmentz
  Cc: iommu, jin.qi, joro, linux-arm-kernel, linux-kernel, ni.liqiang,
	niliqiang.io, robin.murphy

> (Disclaimer: I don't know what a CCG port is)

CCG(CXL Gateway) is a part of the CMN700 Coherent Mesh Network. It plays
a crucial role in facilitating cross-die access in a multi-chip system.

> Hmmm. The part that doesn't make sense to me here is that migrating between
> CPUs implies context-switching, and we have a DSB on that path in
> __switch_to(). So why would adding barriers to the driver help? Maybe it
> just changes the timing?

This is very likely. Through our experiments, adding a delay before CMDQEN
does not reproduce the failure of writing to the CMDQ base register.

> I'm not sure what you're proposing, but I don't think Linux should be
> changed to accomodate this.

I am very grateful for the responses from both of you experts. We will
continue to check the current hardware configuration of the system
and attempt to fix this issue.
Once again, I express my thanks. Thank you.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-02-23 16:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-18  5:02 [PATCH] drivers/iommu: Ensure that the queue base address is successfully written during SMMU initialization ni.liqiang
2024-02-19  5:44 ` Daniel Mentz
2024-02-19  9:17   ` Will Deacon
2024-02-21 15:26     ` ni.liqiang
2024-02-21 16:08       ` Will Deacon
2024-02-23 16:20         ` ni.liqiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox