linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
@ 2025-06-27 11:10 Yuanfang Zhang
  2025-06-27 11:23 ` James Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Yuanfang Zhang @ 2025-06-27 11:10 UTC (permalink / raw)
  To: Suzuki K Poulose, Mike Leach, James Clark, Alexander Shishkin
  Cc: kernel, coresight, linux-arm-kernel, linux-kernel, Yuanfang Zhang

The current implementation uses a fixed timeout via
coresight_timeout(), which may be insufficient when multiple
sources are enabled or under heavy load, leading to TMC
readiness or flush completion timeout.

This patch introduces a configurable timeout mechanism for
flush and tmcready.

Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
---
 drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -8,6 +8,7 @@
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/types.h>
+#include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/idr.h>
 #include <linux/io.h>
@@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
 DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
 DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
 
+static u32 tmc_timeout;
+
+static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
+{
+	int i;
+
+	for (i = tmc_timeout; i > 0; i--) {
+		if (i - 1)
+			udelay(1);
+	}
+}
+
+static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
+{
+	return coresight_timeout_action(csa, offset, pos, val,
+			tmc_extend_timeout);
+}
+
 int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
 {
 	struct coresight_device *csdev = drvdata->csdev;
 	struct csdev_access *csa = &csdev->access;
 
 	/* Ensure formatter, unformatter and hardware fifo are empty */
-	if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
+	if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
 		dev_err(&csdev->dev,
 			"timeout while waiting for TMC to be Ready\n");
 		return -EBUSY;
@@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
 	ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
 	writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
 	/* Ensure flush completes */
-	if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
+	if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
 		dev_err(&csdev->dev,
 		"timeout while waiting for completion of Manual Flush\n");
 	}
@@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
 
 static DEVICE_ATTR_RW(stop_on_flush);
 
+static ssize_t timeout_cfg_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
+}
+
+static ssize_t timeout_cfg_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t size)
+{
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val))
+		return -EINVAL;
+	tmc_timeout = val;
+
+	return size;
+}
+static DEVICE_ATTR_RW(timeout_cfg);
 
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
 	&dev_attr_buffer_size.attr,
 	&dev_attr_stop_on_flush.attr,
+	&dev_attr_timeout_cfg.attr,
 	NULL,
 };
 

---
base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
change-id: 20250627-flush_timeout-a598b4c0ce7b

Best regards,
-- 
Yuanfang Zhang <quic_yuanfang@quicinc.com>


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-27 11:10 [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready Yuanfang Zhang
@ 2025-06-27 11:23 ` James Clark
  2025-06-27 14:17   ` Leo Yan
  2025-06-30 10:03   ` Yuanfang Zhang
  0 siblings, 2 replies; 8+ messages in thread
From: James Clark @ 2025-06-27 11:23 UTC (permalink / raw)
  To: Yuanfang Zhang, Suzuki K Poulose, Mike Leach, Leo Yan
  Cc: kernel, coresight, linux-arm-kernel, linux-kernel,
	Alexander Shishkin



On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
> The current implementation uses a fixed timeout via
> coresight_timeout(), which may be insufficient when multiple
> sources are enabled or under heavy load, leading to TMC
> readiness or flush completion timeout.
> 
> This patch introduces a configurable timeout mechanism for
> flush and tmcready.
> 

What kind of values are you using? Is there a reason to not increase the 
global one?

I don't think it's important what value we choose because it's only to 
stop hangs and make it terminate eventually. As far as I can see it 
wouldn't matter if we set it to a huge value like 1 second. That would 
only cause a big delay when something has actually gone wrong. Under 
normal circumstances the timeout won't be hit so it doesn't really need 
to be configurable.

> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>   1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> @@ -8,6 +8,7 @@
>   #include <linux/kernel.h>
>   #include <linux/init.h>
>   #include <linux/types.h>
> +#include <linux/delay.h>
>   #include <linux/device.h>
>   #include <linux/idr.h>
>   #include <linux/io.h>
> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>   DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>   DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>   
> +static u32 tmc_timeout;
> +
> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
> +{
> +	int i;
> +
> +	for (i = tmc_timeout; i > 0; i--) {
> +		if (i - 1)

I didn't get what the if is for here? Removing it does basically the 
same thing, but if you do want to keep it maybe if (i > 1) is more 
explanatory.

> +			udelay(1);

Can you not do udelay(tmc_timeout)?

> +	}
> +}
> +
> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
> +{
> +	return coresight_timeout_action(csa, offset, pos, val,
> +			tmc_extend_timeout);
> +}
> +
>   int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>   {
>   	struct coresight_device *csdev = drvdata->csdev;
>   	struct csdev_access *csa = &csdev->access;
>   
>   	/* Ensure formatter, unformatter and hardware fifo are empty */
> -	if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
> +	if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>   		dev_err(&csdev->dev,
>   			"timeout while waiting for TMC to be Ready\n");
>   		return -EBUSY;
> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>   	ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>   	writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>   	/* Ensure flush completes */
> -	if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
> +	if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>   		dev_err(&csdev->dev,
>   		"timeout while waiting for completion of Manual Flush\n");
>   	}
> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>   
>   static DEVICE_ATTR_RW(stop_on_flush);
>   
> +static ssize_t timeout_cfg_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
> +}
> +
> +static ssize_t timeout_cfg_store(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf, size_t size)
> +{
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val))
> +		return -EINVAL;
> +	tmc_timeout = val;
> +
> +	return size;
> +}
> +static DEVICE_ATTR_RW(timeout_cfg);
>   

Seeing as the existing timeout is global for all devices, if we do want 
a configurable one shouldn't we make the global one configurable rather 
than per-device? That seems too fine grained to me.

>   static struct attribute *coresight_tmc_attrs[] = {
>   	&dev_attr_trigger_cntr.attr,
>   	&dev_attr_buffer_size.attr,
>   	&dev_attr_stop_on_flush.attr,
> +	&dev_attr_timeout_cfg.attr,
>   	NULL,
>   };
>   
> 
> ---
> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
> change-id: 20250627-flush_timeout-a598b4c0ce7b
> 
> Best regards,


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-27 11:23 ` James Clark
@ 2025-06-27 14:17   ` Leo Yan
  2025-06-30 10:40     ` Yuanfang Zhang
  2025-06-30 10:03   ` Yuanfang Zhang
  1 sibling, 1 reply; 8+ messages in thread
From: Leo Yan @ 2025-06-27 14:17 UTC (permalink / raw)
  To: James Clark
  Cc: Yuanfang Zhang, Suzuki K Poulose, Mike Leach, kernel, coresight,
	linux-arm-kernel, linux-kernel, Alexander Shishkin

On Fri, Jun 27, 2025 at 12:23:29PM +0100, James Clark wrote:
> 
> 
> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
> > The current implementation uses a fixed timeout via
> > coresight_timeout(), which may be insufficient when multiple
> > sources are enabled or under heavy load, leading to TMC
> > readiness or flush completion timeout.
> > 
> > This patch introduces a configurable timeout mechanism for
> > flush and tmcready.
> > 
> 
> What kind of values are you using? Is there a reason to not increase the
> global one?

IIUC, this patch is related to patch [1].

It seems to me that both patches aim to address the long latency when
flushing the TMC, but take different approaches. In the earlier patch,
both Mike and I questioned how the timeout occurred in hardware, but
no information provided about the cause.

I would suggest that we first make clear if this is a hardware quirk or
a common issue in Arm CoreSight.

Thanks,
Leo

[1] https://lore.kernel.org/linux-arm-kernel/CAJ9a7Vgre_3mkXB_xeVx5N9BqPTa2Ai4_8E+daDZ-yAUv44A9g@mail.gmail.com/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-27 11:23 ` James Clark
  2025-06-27 14:17   ` Leo Yan
@ 2025-06-30 10:03   ` Yuanfang Zhang
  2025-06-30 13:46     ` James Clark
  1 sibling, 1 reply; 8+ messages in thread
From: Yuanfang Zhang @ 2025-06-30 10:03 UTC (permalink / raw)
  To: James Clark, Suzuki K Poulose, Mike Leach, Leo Yan
  Cc: kernel, coresight, linux-arm-kernel, linux-kernel,
	Alexander Shishkin



On 6/27/2025 7:23 PM, James Clark wrote:
> 
> 
> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>> The current implementation uses a fixed timeout via
>> coresight_timeout(), which may be insufficient when multiple
>> sources are enabled or under heavy load, leading to TMC
>> readiness or flush completion timeout.
>>
>> This patch introduces a configurable timeout mechanism for
>> flush and tmcready.
>>
> 
> What kind of values are you using? Is there a reason to not increase the global one?
1000, Because only TMC FLUSH will face timeout situations.
> 
> I don't think it's important what value we choose because it's only to stop hangs and make it terminate eventually. As far as I can see it wouldn't matter if we set it to a huge value like 1 second. That would only cause a big delay when something has actually gone wrong. Under normal circumstances the timeout won't be hit so it doesn't really need to be configurable.

But in some cases, TMC doesn't hang up, it just requires a longer waiting time.
> 
>> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/kernel.h>
>>   #include <linux/init.h>
>>   #include <linux/types.h>
>> +#include <linux/delay.h>
>>   #include <linux/device.h>
>>   #include <linux/idr.h>
>>   #include <linux/io.h>
>> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>>   DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>>   DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>>   +static u32 tmc_timeout;
>> +
>> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
>> +{
>> +    int i;
>> +
>> +    for (i = tmc_timeout; i > 0; i--) {
>> +        if (i - 1)
> 
> I didn't get what the if is for here? Removing it does basically the same thing, but if you do want to keep it maybe if (i > 1) is more explanatory.
sure.
> 
>> +            udelay(1);
> 
> Can you not do udelay(tmc_timeout)?
sure.
> 
>> +    }
>> +}
>> +
>> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
>> +{
>> +    return coresight_timeout_action(csa, offset, pos, val,
>> +            tmc_extend_timeout);
>> +}
>> +
>>   int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>>   {
>>       struct coresight_device *csdev = drvdata->csdev;
>>       struct csdev_access *csa = &csdev->access;
>>         /* Ensure formatter, unformatter and hardware fifo are empty */
>> -    if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>> +    if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>           dev_err(&csdev->dev,
>>               "timeout while waiting for TMC to be Ready\n");
>>           return -EBUSY;
>> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>>       ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>>       writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>>       /* Ensure flush completes */
>> -    if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>> +    if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>           dev_err(&csdev->dev,
>>           "timeout while waiting for completion of Manual Flush\n");
>>       }
>> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>>     static DEVICE_ATTR_RW(stop_on_flush);
>>   +static ssize_t timeout_cfg_show(struct device *dev,
>> +                struct device_attribute *attr, char *buf)
>> +{
>> +    return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
>> +}
>> +
>> +static ssize_t timeout_cfg_store(struct device *dev,
>> +                 struct device_attribute *attr,
>> +                 const char *buf, size_t size)
>> +{
>> +    unsigned long val;
>> +
>> +    if (kstrtoul(buf, 0, &val))
>> +        return -EINVAL;
>> +    tmc_timeout = val;
>> +
>> +    return size;
>> +}
>> +static DEVICE_ATTR_RW(timeout_cfg);
>>   
> 
> Seeing as the existing timeout is global for all devices, if we do want a configurable one shouldn't we make the global one configurable rather than per-device? That seems too fine grained to me.
sure.
> 
>>   static struct attribute *coresight_tmc_attrs[] = {
>>       &dev_attr_trigger_cntr.attr,
>>       &dev_attr_buffer_size.attr,
>>       &dev_attr_stop_on_flush.attr,
>> +    &dev_attr_timeout_cfg.attr,
>>       NULL,
>>   };
>>  
>> ---
>> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
>> change-id: 20250627-flush_timeout-a598b4c0ce7b
>>
>> Best regards,
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-27 14:17   ` Leo Yan
@ 2025-06-30 10:40     ` Yuanfang Zhang
  2025-06-30 11:08       ` Leo Yan
  0 siblings, 1 reply; 8+ messages in thread
From: Yuanfang Zhang @ 2025-06-30 10:40 UTC (permalink / raw)
  To: Leo Yan, James Clark
  Cc: Suzuki K Poulose, Mike Leach, kernel, coresight, linux-arm-kernel,
	linux-kernel, Alexander Shishkin



On 6/27/2025 10:17 PM, Leo Yan wrote:
> On Fri, Jun 27, 2025 at 12:23:29PM +0100, James Clark wrote:
>>
>>
>> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>>> The current implementation uses a fixed timeout via
>>> coresight_timeout(), which may be insufficient when multiple
>>> sources are enabled or under heavy load, leading to TMC
>>> readiness or flush completion timeout.
>>>
>>> This patch introduces a configurable timeout mechanism for
>>> flush and tmcready.
>>>
>>
>> What kind of values are you using? Is there a reason to not increase the
>> global one?
> 
> IIUC, this patch is related to patch [1].
> 
> It seems to me that both patches aim to address the long latency when
> flushing the TMC, but take different approaches. In the earlier patch,
> both Mike and I questioned how the timeout occurred in hardware, but
> no information provided about the cause.
> 
> I would suggest that we first make clear if this is a hardware quirk or
> a common issue in Arm CoreSight.
> 
> Thanks,
> Leo
> 
sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.
> [1] https://lore.kernel.org/linux-arm-kernel/CAJ9a7Vgre_3mkXB_xeVx5N9BqPTa2Ai4_8E+daDZ-yAUv44A9g@mail.gmail.com/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-30 10:40     ` Yuanfang Zhang
@ 2025-06-30 11:08       ` Leo Yan
  2025-07-03  8:23         ` Yuanfang Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Leo Yan @ 2025-06-30 11:08 UTC (permalink / raw)
  To: Yuanfang Zhang
  Cc: James Clark, Suzuki K Poulose, Mike Leach, kernel, coresight,
	linux-arm-kernel, linux-kernel, Alexander Shishkin

On Mon, Jun 30, 2025 at 06:40:53PM +0800, Yuanfang Zhang wrote:

[...]

> >>> The current implementation uses a fixed timeout via
> >>> coresight_timeout(), which may be insufficient when multiple
> >>> sources are enabled or under heavy load, leading to TMC
> >>> readiness or flush completion timeout.
> >
> > I would suggest that we first make clear if this is a hardware quirk or
> > a common issue in Arm CoreSight.

> sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.

As the commit log states, "sources are enabled or under heavy load,
leading to TMC readiness or flush completion timeout." I would like to
confirm how this situation can happen.

When disabling a CoreSight path, the driver follows a sequential
order: the source device is disabled first, followed by flushing and
disabling the TMC. We expect that there should be no contention
between source devices and the sink when disabling the path. For a
subsystem ETM, I expect we should also follow this sequence.

Leo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-30 10:03   ` Yuanfang Zhang
@ 2025-06-30 13:46     ` James Clark
  0 siblings, 0 replies; 8+ messages in thread
From: James Clark @ 2025-06-30 13:46 UTC (permalink / raw)
  To: Yuanfang Zhang, Suzuki K Poulose, Mike Leach, Leo Yan
  Cc: kernel, coresight, linux-arm-kernel, linux-kernel,
	Alexander Shishkin



On 30/06/2025 11:03 am, Yuanfang Zhang wrote:
> 
> 
> On 6/27/2025 7:23 PM, James Clark wrote:
>>
>>
>> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>>> The current implementation uses a fixed timeout via
>>> coresight_timeout(), which may be insufficient when multiple
>>> sources are enabled or under heavy load, leading to TMC
>>> readiness or flush completion timeout.
>>>
>>> This patch introduces a configurable timeout mechanism for
>>> flush and tmcready.
>>>
>>
>> What kind of values are you using? Is there a reason to not increase the global one?
> 1000, Because only TMC FLUSH will face timeout situations.

How long was the flush taking exactly? You should be able to log the 
time it took to get past the flush. Because if it's 101us then we can 
increase the global timeout to 150us or 200us without too much thought.

I don't think we can justify why 100us was picked over any other value 
anyway. And we've seen a couple of timeouts in our own CI.

>>
>> I don't think it's important what value we choose because it's only to stop hangs and make it terminate eventually. As far as I can see it wouldn't matter if we set it to a huge value like 1 second. That would only cause a big delay when something has actually gone wrong. Under normal circumstances the timeout won't be hit so it doesn't really need to be configurable.
> 
> But in some cases, TMC doesn't hang up, it just requires a longer waiting time.
>>
>>> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
>>> ---
>>>    drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>>>    1 file changed, 41 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> @@ -8,6 +8,7 @@
>>>    #include <linux/kernel.h>
>>>    #include <linux/init.h>
>>>    #include <linux/types.h>
>>> +#include <linux/delay.h>
>>>    #include <linux/device.h>
>>>    #include <linux/idr.h>
>>>    #include <linux/io.h>
>>> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>>>    DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>>>    DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>>>    +static u32 tmc_timeout;
>>> +
>>> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
>>> +{
>>> +    int i;
>>> +
>>> +    for (i = tmc_timeout; i > 0; i--) {
>>> +        if (i - 1)
>>
>> I didn't get what the if is for here? Removing it does basically the same thing, but if you do want to keep it maybe if (i > 1) is more explanatory.
> sure.
>>
>>> +            udelay(1);
>>
>> Can you not do udelay(tmc_timeout)?
> sure.
>>
>>> +    }
>>> +}
>>> +
>>> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
>>> +{
>>> +    return coresight_timeout_action(csa, offset, pos, val,
>>> +            tmc_extend_timeout);
>>> +}
>>> +
>>>    int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>>>    {
>>>        struct coresight_device *csdev = drvdata->csdev;
>>>        struct csdev_access *csa = &csdev->access;
>>>          /* Ensure formatter, unformatter and hardware fifo are empty */
>>> -    if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>> +    if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>>            dev_err(&csdev->dev,
>>>                "timeout while waiting for TMC to be Ready\n");
>>>            return -EBUSY;
>>> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>>>        ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>>>        writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>>>        /* Ensure flush completes */
>>> -    if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>> +    if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>>            dev_err(&csdev->dev,
>>>            "timeout while waiting for completion of Manual Flush\n");
>>>        }
>>> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>>>      static DEVICE_ATTR_RW(stop_on_flush);
>>>    +static ssize_t timeout_cfg_show(struct device *dev,
>>> +                struct device_attribute *attr, char *buf)
>>> +{
>>> +    return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
>>> +}
>>> +
>>> +static ssize_t timeout_cfg_store(struct device *dev,
>>> +                 struct device_attribute *attr,
>>> +                 const char *buf, size_t size)
>>> +{
>>> +    unsigned long val;
>>> +
>>> +    if (kstrtoul(buf, 0, &val))
>>> +        return -EINVAL;
>>> +    tmc_timeout = val;
>>> +
>>> +    return size;
>>> +}
>>> +static DEVICE_ATTR_RW(timeout_cfg);
>>>    
>>
>> Seeing as the existing timeout is global for all devices, if we do want a configurable one shouldn't we make the global one configurable rather than per-device? That seems too fine grained to me.
> sure.
>>
>>>    static struct attribute *coresight_tmc_attrs[] = {
>>>        &dev_attr_trigger_cntr.attr,
>>>        &dev_attr_buffer_size.attr,
>>>        &dev_attr_stop_on_flush.attr,
>>> +    &dev_attr_timeout_cfg.attr,
>>>        NULL,
>>>    };
>>>   
>>> ---
>>> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
>>> change-id: 20250627-flush_timeout-a598b4c0ce7b
>>>
>>> Best regards,
>>
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
  2025-06-30 11:08       ` Leo Yan
@ 2025-07-03  8:23         ` Yuanfang Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Yuanfang Zhang @ 2025-07-03  8:23 UTC (permalink / raw)
  To: Leo Yan
  Cc: James Clark, Suzuki K Poulose, Mike Leach, kernel, coresight,
	linux-arm-kernel, linux-kernel, Alexander Shishkin



On 6/30/2025 7:08 PM, Leo Yan wrote:
> On Mon, Jun 30, 2025 at 06:40:53PM +0800, Yuanfang Zhang wrote:
> 
> [...]
> 
>>>>> The current implementation uses a fixed timeout via
>>>>> coresight_timeout(), which may be insufficient when multiple
>>>>> sources are enabled or under heavy load, leading to TMC
>>>>> readiness or flush completion timeout.
>>>
>>> I would suggest that we first make clear if this is a hardware quirk or
>>> a common issue in Arm CoreSight.
> 
>> sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.
> 
> As the commit log states, "sources are enabled or under heavy load,
> leading to TMC readiness or flush completion timeout." I would like to
> confirm how this situation can happen.
> 
Enable all etms, then cat TMC without disable source.
> When disabling a CoreSight path, the driver follows a sequential
> order: the source device is disabled first, followed by flushing and
> disabling the TMC. We expect that there should be no contention
> between source devices and the sink when disabling the path. For a
> subsystem ETM, I expect we should also follow this sequence.
> 
> Leo
this issue is a different case: running cat tmc directly did not disable source.



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-07-03  8:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-27 11:10 [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready Yuanfang Zhang
2025-06-27 11:23 ` James Clark
2025-06-27 14:17   ` Leo Yan
2025-06-30 10:40     ` Yuanfang Zhang
2025-06-30 11:08       ` Leo Yan
2025-07-03  8:23         ` Yuanfang Zhang
2025-06-30 10:03   ` Yuanfang Zhang
2025-06-30 13:46     ` James Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).