* Re: [BUG] CoreSight: WARN_ON in coresight_disclaim_device_unlocked due to register reset on CPU power-cycle
       [not found] <CAGo=-X1K1qZ_p9X0yeKy8Wm4QMDs2+4VE08LUNKOCrA15KFLTA@mail.gmail.com>
From: Suzuki K Poulose @ 2025-06-20 8:44 UTC
To: Keita Morisaki, linux-kernel, alexander.shishkin
Cc: Yi-ming Tseng, Eric Chan, Leo Yan, James Clark, coresight@lists.linaro.org, Mike Leach

Cc: coresight lists, Leo, James, Mike L

Hello!

Thanks for the report! In the future, please use
scripts/get_maintainer.pl for the clear list of people/lists
for reporting issues.

Response inline, below.

On 20/06/2025 08:21, Keita Morisaki wrote:
> Hello folks,
>
> I am writing to report a WARN_ON message I'm encountering in the
> CoreSight driver on a multi-core ARM system running a 6.12-based kernel.
> The warning appears consistently when disabling an Embedded Trace
> Extension (ETE) source after it has been active. The issue is not
> reproducible when CPUidle is disabled.
>
> The problem occurs because the driver assumes the CoreSight claim
> register is persistent, but it could be reset by the CPUidle power
> management flow. Section B2.3.2 of the Arm CoreSight Architecture
> Specification v3.0[1] indicates that the claim register must reset at
> “reset”. A CPU power-up from an idle state can trigger a Cold reset,
> which might explain this behavior.
>
> My ftrace analysis confirms this. I traced the only two functions that
> modify the claim state: coresight_set_claim_tags (which sets the claim)
> and coresight_clear_claim_tags (which is the only part of the kernel
> that writes to CLAIMCLR). The trace shows the claim being set, followed
> by a CPUidle transition, but no subsequent call to
> coresight_clear_claim_tags.
>
> Here are the steps to reproduce the issue:
>
> > modprobe coresight_etm4x
> > # Enable any relevant sink
> > echo 1 > /sys/bus/coresight/devices/ete0/enable_source
> > echo 0 > /sys/bus/coresight/devices/ete0/enable_source
>
> Here is a relevant snippet from the ftrace log that illustrates the
> sequence:
>
> > # tracer: function_graph
> > #
> > # CPU  DURATION                  FUNCTION CALLS
> > # |     |   |                     |   |   |   |
> >  0)               |  coresight_claim_device_unlocked [coresight]() {
> >  0)   3.750 us    |    coresight_set_claim_tags [coresight](); // Claim is set here
> >  0) + 20.260 us   |  }
> >  0)               |  /* psci_domain_idle_enter: cpu_id=0 state={Our PSCI parameter value} */ // CPU goes idle
> >  0)               |  /* psci_domain_idle_exit: cpu_id=0 state={Our PSCI parameter value} */ // CPU wakes up, causing Cold reset
> > ...
> >  0) @ 309346.3 us |  coresight_disclaim_device_unlocked [coresight](); // Triggers WARN_ON
>
> The following WARN_ON [2] is printed because the CLAIMCLR register has
> already been reset at the time coresight_disclaim_device_unlocked is
> called, contrary to the driver's expectation.
>

We have the ETM driver performing the save/restore of ETM context during
a CPUidle. This is only done when the ETM/ETE is described to be losing
context over PM operation. If this is not done (via DT), the driver
doesn't do anything. This could be problematic. Could you try adding:

  "arm,coresight-loses-context-with-cpu"

property to the ETE nodes and see if it makes a difference?

Kind regards
Suzuki

[0] https://elixir.bootlin.com/linux/v6.12/source/Documentation/devicetree/bindings/arm/arm,coresight-etm.yaml#L79

> > [  416.354181][    C0] WARNING: CPU: 0 PID: 0 at drivers/hwtracing/coresight/coresight-core.c:187 coresight_disclaim_device_unlocked+0x84/0x9c [coresight]
> > [  416.535454][    C0] Call trace:
> > [  416.538606][    C0]  coresight_disclaim_device_unlocked+0x84/0x9c [coresight]
> > [  416.549359][    C0]  etm4_disable_hw+0x2d8/0x374 [coresight_etm4x]
> > [  416.623310][    C0]  do_idle+0x1d4/0x264
>
> (Note on tracing: To get this detailed trace, I made two modifications
> to the kernel. First, since the trace_psci_domain_idle_enter/exit events
> are not available in kernel 6.12, I cherry-picked the upstream patch
> 7b7644831e72 [3] to add them. Second, to specifically trace the claim
> functions, I temporarily replaced their inline compiler hints with
> noinline.)
>
> Given the evidence, it appears the driver's assumption that the claim
> register is persistent across CPU power states is incorrect and may need
> to be addressed.
>
> Could you please provide your guidance on this?
>
> Thank you for your time and assistance.
>
> [1] https://developer.arm.com/documentation/ihi0029/latest/
> [2] https://elixir.bootlin.com/linux/v6.12/source/drivers/hwtracing/coresight/coresight-core.c#L187
> [3] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b7644831e7276f52a233ec685d13c965fff09d9
>
> Best regards,
> Keita

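The sequence described in the report (claim, CPU power-cycle, disclaim) can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel code: `struct model_dev`, the `model_*` functions and the choice of bit 1 as the self-hosted claim tag are all hypothetical names introduced for illustration, and the warning counter merely stands in for the WARN_ON path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit 1 is assumed to mark a self-hosted claim. */
#define CLAIM_SELF_HOSTED (1u << 1)

struct model_dev {
    uint32_t claim_tags; /* stands in for the claim tag register state */
    unsigned warnings;   /* counts would-be WARN_ON hits */
};

/* Claim the device; fails if the self-hosted tag is already set. */
static bool model_claim(struct model_dev *d)
{
    if (d->claim_tags & CLAIM_SELF_HOSTED)
        return false;
    d->claim_tags |= CLAIM_SELF_HOSTED;
    return true;
}

/* CPU power-cycle with no save/restore: the claim state resets to zero. */
static void model_power_cycle(struct model_dev *d)
{
    d->claim_tags = 0;
}

/* Disclaim: clear the tag if it is still set, otherwise record a warning. */
static void model_disclaim(struct model_dev *d)
{
    if (d->claim_tags & CLAIM_SELF_HOSTED)
        d->claim_tags &= ~CLAIM_SELF_HOSTED;
    else
        d->warnings++;
}

/* Run claim -> (optional power-cycle) -> disclaim; return the warning count. */
static unsigned model_sequence(bool power_cycled)
{
    struct model_dev d = { 0, 0 };
    bool claimed = model_claim(&d);

    assert(claimed);
    (void)claimed;
    if (power_cycled)
        model_power_cycle(&d);
    model_disclaim(&d);
    return d.warnings;
}
```

Calling `model_sequence(false)` takes the normal path with no warning, while `model_sequence(true)` loses the tag in between and hits the warning branch, which mirrors why the issue disappears when CPUidle is disabled.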
* Re: [BUG] CoreSight: WARN_ON in coresight_disclaim_device_unlocked due to register reset on CPU power-cycle
From: Keita Morisaki @ 2025-06-20 10:19 UTC
To: suzuki.poulose
Cc: alexander.shishkin, coresight, ericchancf, james.clark, keyz, leo.yan, linux-kernel, mike.leach, yimingtseng

Hi,

(Resending the same message in plain text (no HTML). The previous
message was rejected by the mailing list because it contained HTML.)

Thank you so much for the quick response. Really appreciate it.

> Thanks for the report ! In the future, please use
> scripts/get_maintainer.pl for the clear list of people/list
> for reporting issues.

I will do that!

> We have the ETM driver performing the save/restore of ETM context during
> a CPUidle. This is only done when the ETM/ETE is described to be losing
> context over PM operation. If this is not done (via DT), the driver
> doesn't do anything. This could be problematic. Could you try adding:
>
> "arm,coresight-loses-context-with-cpu"
>
> property to the ETE nodes and see if it makes a difference ?

Noted. We will try this and get back to you.

Best,
Keita

* Re: [BUG] CoreSight: WARN_ON in coresight_disclaim_device_unlocked due to register reset on CPU power-cycle
From: Keita Morisaki @ 2025-06-23 8:59 UTC
To: suzuki.poulose
Cc: alexander.shishkin, coresight, ericchancf, james.clark, keyz, leo.yan, linux-kernel, mike.leach, yimingtseng

> We have the ETM driver performing the save/restore of ETM context during
> a CPUidle. This is only done when the ETM/ETE is described to be losing
> context over PM operation. If this is not done (via DT), the driver
> doesn't do anything. This could be problematic. Could you try adding:
>
> "arm,coresight-loses-context-with-cpu"
>
> property to the ETE nodes and see if it makes a difference ?

I tried this in our environment, and it worked well. The
"arm,coresight-loses-context-with-cpu" property was what we needed.
Thank you so much again for the swift response with the useful information!

Best,
Keita

* Re: [BUG] CoreSight: WARN_ON in coresight_disclaim_device_unlocked due to register reset on CPU power-cycle
From: James Clark @ 2025-06-23 12:05 UTC
To: Keita Morisaki, suzuki.poulose, Leo Yan
Cc: alexander.shishkin, coresight, ericchancf, linux-kernel, mike.leach, yimingtseng

On 23/06/2025 9:59 am, Keita Morisaki wrote:
>> We have the ETM driver performing the save/restore of ETM context during
>> a CPUidle. This is only done when the ETM/ETE is described to be losing
>> context over PM operation. If this is not done (via DT), the driver
>> doesn't do anything. This could be problematic. Could you try adding:
>>
>> "arm,coresight-loses-context-with-cpu"
>>
>> property to the ETE nodes and see if it makes a difference ?
>
> I tried this in our environment, and this worked well. The
> "arm,coresight-loses-context-with-cpu" property was what we needed.
> Thank you so much again for the swift response with the useful information!
>
> Best,
> Keita

Hi Keita,

Thanks for the report. We discussed internally and decided that it would be
better for the driver to always save the context by default, because this
mistake is easy to make. Saving when it doesn't need to be saved doesn't do
any harm, but not saving when it should be causes quite bad bugs.

So "arm,coresight-loses-context-with-cpu" will be ignored in the future and
we'll add a new flag like "arm,coresight-save-context" if anyone wants the
optimization of not saving.

Thanks
James

* Re: [BUG] CoreSight: WARN_ON in coresight_disclaim_device_unlocked due to register reset on CPU power-cycle
From: Leo Yan @ 2025-06-24 13:02 UTC
To: James Clark
Cc: Keita Morisaki, suzuki.poulose, alexander.shishkin, coresight, ericchancf, linux-kernel, mike.leach, yimingtseng

On Mon, Jun 23, 2025 at 01:05:13PM +0100, James Clark wrote:

[...]

> Hi Keita,
>
> Thanks for the report. We discussed internally and decided that it would be
> better for the driver to always save the context by default, because this
> mistake is easy to make. Saving when it doesn't need to be saved doesn't do
> any harm, but not saving when it should be causes quite bad bugs.
>
> So "arm,coresight-loses-context-with-cpu" will be ignored in the future and
> we'll add a new flag like "arm,coresight-save-context" if anyone wants the
> optimization of not saving.

I'm a bit concerned that we might provide information that has not yet
been finalized.

Before landing any changes in the mainline kernel, at this stage, I'd
recommend using the option "coresight_etm4x.pm_save_enable=2" in the
Linux kernel command line. This provides a reliable configuration for
production environment, as it ensures consistency between the current
mainline kernel and any future versions.

Setting coresight_etm4x.pm_save_enable=2 overrides any setting in the
device tree binding and always enables context save and restore for the
ETM / ETE.

If coresight_etm4x.pm_save_enable is set to 1, the ETM driver will
never perform context save and restore. Setting it to 0 (the default
value) allows the device tree or ACPI to determine whether context save
and restore should be performed.

If you are trying to upstream the DT binding for ETE, you need to omit
the property "arm,coresight-loses-context-with-cpu" since it is not
defined in the ETE device tree YAML schema now. As James mentioned, we
need to consolidate this part.

Thanks,
Leo

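The three pm_save_enable values described above amount to a small decision table, which can be sketched in user-space C as follows. This is an illustration under assumptions rather than the driver's implementation: the enum names and `should_save_context()` are hypothetical, `fw_loses_context` stands for whatever the device tree or ACPI reports, and values outside 0..2 are not covered by the thread, so the sketch simply rejects them.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as described in the thread; the enum names are illustrative only. */
enum pm_save_mode {
    PM_SAVE_FIRMWARE = 0, /* default: device tree or ACPI decides */
    PM_SAVE_NEVER    = 1, /* never save/restore ETM/ETE context */
    PM_SAVE_ALWAYS   = 2, /* always save/restore, overriding firmware */
};

/*
 * Decide whether context should be saved across a CPU power-down.
 * fw_loses_context models the firmware description, for example the
 * presence of "arm,coresight-loses-context-with-cpu" on the node.
 */
static bool should_save_context(int pm_save_enable, bool fw_loses_context)
{
    switch (pm_save_enable) {
    case PM_SAVE_ALWAYS:
        return true;
    case PM_SAVE_NEVER:
        return false;
    case PM_SAVE_FIRMWARE:
        return fw_loses_context;
    default:
        /* Assumption: other values are treated as a caller error here. */
        assert(0 && "unexpected pm_save_enable value");
        return false;
    }
}
```

With the default of 0 and no property on the ETE node, `should_save_context(0, false)` is false, which corresponds to the configuration in which the WARN_ON was reported; passing 2 returns true regardless of what the firmware describes.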