From: Mark Rutland <mark.rutland@arm.com>
To: Yogesh Lal <quic_ylal@quicinc.com>
Cc: maz@kernel.org, daniel.lezcano@linaro.org, tglx@linutronix.de,
linux-arm-kernel@lists.infradead.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>
Subject: Re: ERRATUM_858921 is broken on 5.15 kernel
Date: Thu, 5 Jan 2023 14:12:11 +0000
Message-ID: <Y7bar/zQ4khMDyiv@FVFF77S0Q05N>
In-Reply-To: <ca4679a0-7f29-65f4-54b9-c575248192f1@quicinc.com>

On Thu, Jan 05, 2023 at 07:03:48PM +0530, Yogesh Lal wrote:
> Hi,
>
> We are observing an issue on A73 cores where ERRATUM_858921 is broken.

Do you *only* see this issue on v5.15.y, or is mainline (e.g. v6.2-rc2) also
broken?

I don't see any fix that fits your exact description below, but I do see that
we've made a bunch of changes in this area since.

>
> On the 5.15 kernel, arch_timer_enable_workaround() is set up by reading
> arm64_858921_read_cntpct_el0 and arm64_858921_read_cntvct_el0 during timer
> registration, using the following path:
>
> arch_timer_enable_workaround->atomic_set(&timer_unstable_counter_workaround_in_use, 1);
>
> [code snap]
> static
> void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa,
>                                   bool local)
> {
>         int i;
>
>         if (local) {
>                 __this_cpu_write(timer_unstable_counter_workaround, wa);
>         } else {
>                 for_each_possible_cpu(i)
>                         per_cpu(timer_unstable_counter_workaround, i) = wa;
>         }
>
>         if (wa->read_cntvct_el0 || wa->read_cntpct_el0)
>                 atomic_set(&timer_unstable_counter_workaround_in_use, 1);
>
>
> and based on the above workaround enablement, the appropriate function to
> read the counter is used.
>
> static void __init arch_counter_register(unsigned type)
> {
>         u64 start_count;
>
>         /* Register the CP15 based counter if we have one */
>         if (type & ARCH_TIMER_TYPE_CP15) {
>                 u64 (*rd)(void);
>
>                 if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
>                     arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
>                         if (arch_timer_counter_has_wa())
>                                 rd = arch_counter_get_cntvct_stable;
>                         else
>                                 rd = arch_counter_get_cntvct;
>                 } else {
>                         if (arch_timer_counter_has_wa())
>                                 rd = arch_counter_get_cntpct_stable;
>                         else
>                                 rd = arch_counter_get_cntpct;
>                 }
> [snap]
>         /* 56 bits minimum, so we assume worst case rollover */
>         sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
>
>
> As our boot cores are not impacted by the erratum, sched_clock_register() will
> register the !arch_timer_counter_has_wa() callback.

It would be helpful to mention this fact (that the system is big.LITTLE, and
the boot cores are not Cortex-A73) earlier in the report.

> Now, when an erratum-impacted core boots up, sched_clock_register() has
> already registered the !arch_timer_counter_has_wa() path.
> As sched_clock_register() is not per-CPU based, arch_timer_read_counter will
> always point to the !arch_timer_counter_has_wa() function.

Hmm... yes, AFAICT this cannot work unless the affected CPUs are up before we
probe, and it doesn't make much sense for arch_counter_register() to look at
arch_timer_counter_has_wa() since it can be called before all CPUs are up.

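For illustration only, the ordering problem can be modelled in a small
userspace sketch. Everything here is a hypothetical stand-in, not kernel code:
workaround_in_use models timer_unstable_counter_workaround_in_use,
read_counter models arch_timer_read_counter, and the two reader functions
model the plain and _stable accessors. The point is only that the flag is
sampled once at registration, so a CPU that sets it later has no effect on the
already-chosen reader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel symbols discussed in this thread. */
typedef uint64_t (*counter_read_fn)(void);

static bool workaround_in_use;        /* models timer_unstable_counter_workaround_in_use */
static counter_read_fn read_counter;  /* models arch_timer_read_counter */

static uint64_t read_plain(void)  { return 1; }  /* models arch_counter_get_cntvct */
static uint64_t read_stable(void) { return 2; }  /* models arch_counter_get_cntvct_stable */

/* Models arch_counter_register(): the flag is sampled once, at probe time. */
static void counter_register(void)
{
        read_counter = workaround_in_use ? read_stable : read_plain;
}

/* Models an erratum-affected CPU coming up and enabling its workaround. */
static void affected_cpu_online(void)
{
        workaround_in_use = true;
}

/*
 * Runs one boot ordering and reports whether the stable (workaround)
 * reader ended up registered once the affected CPU is online.
 */
static bool stable_reader_registered(bool affected_cpu_up_before_probe)
{
        workaround_in_use = false;
        read_counter = NULL;

        if (affected_cpu_up_before_probe)
                affected_cpu_online();

        counter_register();

        if (!affected_cpu_up_before_probe)
                affected_cpu_online();  /* too late: reader already chosen */

        assert(read_counter != NULL);
        return read_counter == read_stable;
}
```

In this model, stable_reader_registered(true) selects the stable reader, while
stable_reader_registered(false) leaves the plain reader in place even though
an affected CPU is now online, which matches the big.LITTLE case described in
the report.
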
> Looks like this bug is a side effect of the following patch:
>
> commit 0ea415390cd345b7d09e8c9ebd4b68adfe873043
> Author: Marc Zyngier <marc.zyngier@arm.com>
> Date: Mon Apr 8 16:49:07 2019 +0100
>
> clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable
> counters
>
> Instead of always going via arch_counter_get_cntvct_stable to access the
> counter workaround, let's have arch_timer_read_counter point to the
> right method.
>
> For that, we need to track whether any CPU in the system has a
> workaround for the counter. This is done by having an atomic variable
> tracking this.
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
>

Yeah, that does look to be broken, but I think there are further issues anyway
(e.g. late onlining).

AFAICT we need to detect this *stupidly early* in the CPU bringup path in order
to handle this safely, which is quite painful.

What a great.

Thanks,
Mark.