* [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
From: Ilkka Koskinen @ 2024-02-05 19:46 UTC
To: Robin Murphy, Will Deacon, Mark Rutland
Cc: linux-arm-kernel, linux-kernel, Ilkka Koskinen

AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
report incorrect child count. The failing crosspoints report 8 children
while they only have two.

When the driver tries to access the inexistent child nodes, it believes it
has reached an invalid node type and probing fails. The workaround is to
ignore those incorrect child nodes and continue normally.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
---
 drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index c584165b13ba..97fed8ec3693 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
 	}
 }
 
+static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
+					     struct arm_cmn_node *dn,
+					     u16 child_count, int child)
+{
+	/*
+	 * The bug occurs only when a crosspoint reports 8 children
+	 * while it only has two HN-P child nodes.
+	 */
+	dn -= 2;
+
+	if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
+	    child == 2 && dn->type == CMN_TYPE_HNP)
+		return true;
+
+	return false;
+}
+
 static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 {
 	void __iomem *cfg_region;
@@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 
 		for (j = 0; j < child_count; j++) {
 			reg = readq_relaxed(xp_region + child_poff + j * 8);
+			if (reg == 0)
+				if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
+					/*
+					 * We know there are only two real children and the rest 6
+					 * are inexistent. Thus, we can skip the rest of the loop
+					 */
+					break;
+
 			/*
 			 * Don't even try to touch anything external, since in general
 			 * we haven't a clue how to power up arbitrary CHI requesters.
-- 
2.40.1
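
The check proposed in this patch can be modelled outside the kernel. The following is only a user-space sketch: `sketch_model`, `sketch_node_type`, `sketch_node` and `sketch_is_ampereonex_bug` are hypothetical stand-ins invented for illustration, not the driver's real types or functions. It assumes, as the patch does, that `dn` points at the next free node slot, so the first child recorded for the current crosspoint sits two entries back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's model and node-type enums. */
enum sketch_model { SKETCH_CMN600, SKETCH_CMN650, SKETCH_OTHER_MODEL };
enum sketch_node_type { SKETCH_TYPE_INVALID, SKETCH_TYPE_HNP, SKETCH_TYPE_OTHER };

struct sketch_node {
	enum sketch_node_type type;
};

/*
 * Returns true when a zero child pointer matches the erratum signature
 * described in the patch: a CMN-650 mesh, a crosspoint claiming 8
 * children, the loop sitting at child index 2, and an HN-P node
 * recorded two slots before the next free slot.
 */
static bool sketch_is_ampereonex_bug(enum sketch_model model,
				     const struct sketch_node *dn,
				     uint16_t child_count, int child)
{
	/* Check the cheap conditions first so dn[-2] is only read at index 2. */
	if (model != SKETCH_CMN650 || child_count != 8 || child != 2)
		return false;

	return dn[-2].type == SKETCH_TYPE_HNP;
}
```

In this sketch the look-back is only performed once the index is known to be 2. That ordering is a choice of the sketch; the patch itself decrements `dn` before testing the other conditions.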
* Re: [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
From: Robin Murphy @ 2024-02-06 10:00 UTC
To: Ilkka Koskinen, Will Deacon, Mark Rutland
Cc: linux-arm-kernel, linux-kernel

On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
> report incorrect child count. The failing crosspoints report 8 children
> while they only have two.

Ooh, fun :)

> When the driver tries to access the inexistent child nodes, it believes it
> has reached an invalid node type and probing fails. The workaround is to
> ignore those incorrect child nodes and continue normally.
>
> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> ---
>  drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..97fed8ec3693 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
>  	}
>  }
>  
> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
> +					     struct arm_cmn_node *dn,
> +					     u16 child_count, int child)
> +{
> +	/*
> +	 * The bug occurs only when a crosspoint reports 8 children
> +	 * while it only has two HN-P child nodes.
> +	 */
> +	dn -= 2;
> +
> +	if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
> +	    child == 2 && dn->type == CMN_TYPE_HNP)
> +		return true;
> +
> +	return false;
> +}
> +
>  static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>  {
>  	void __iomem *cfg_region;
> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>  
>  		for (j = 0; j < child_count; j++) {
>  			reg = readq_relaxed(xp_region + child_poff + j * 8);
> +			if (reg == 0)
> +				if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
> +					/*
> +					 * We know there are only two real children and the rest 6
> +					 * are inexistent. Thus, we can skip the rest of the loop
> +					 */
> +					break;
> +

TBH I don't see much harm in taking an even simpler approach, so I'd be
inclined to not bother being all that specific beyond documenting it,
something like the below:

Cheers,
Robin.

----->8-----

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index c584165b13ba..7e3aa7e2345f 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 				dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
 				continue;
 			}
+			/*
+			 * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
+			 * child count larger than the number of valid child pointers.
+			 * A child offset of 0 can only occur on CMN-600; otherwise it
+			 * would imply the root node being its own grandchild, which
+			 * we can safely dismiss in general.
+			 */
+			if (reg == 0 && cmn->part != PART_CMN600) {
+				dev_dbg(cmn->dev, "bogus child pointer?\n");
+				continue;
+			}
 
 			arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
 
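
The simpler approach suggested in this reply treats any zero child pointer on a non-CMN-600 part as bogus and skips it, without matching a specific child count or node type. A minimal user-space sketch of that filtering follows; `sketch_part` and `sketch_count_valid_children` are hypothetical names invented for illustration, and the array of child pointers stands in for the values the driver reads with `readq_relaxed()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's part identifier. */
enum sketch_part { SKETCH_PART_CMN600, SKETCH_PART_OTHER };

/*
 * Counts the child pointers that would be passed on to node
 * initialisation. A zero pointer is skipped unless the part is
 * CMN-600, the only case where the reply says offset 0 can occur.
 */
static size_t sketch_count_valid_children(enum sketch_part part,
					  const uint64_t *child_ptrs,
					  size_t child_count)
{
	size_t valid = 0;

	for (size_t j = 0; j < child_count; j++) {
		uint64_t reg = child_ptrs[j];

		/* Bogus entry: skip it but keep scanning the rest. */
		if (reg == 0 && part != SKETCH_PART_CMN600)
			continue;

		valid++;
	}

	return valid;
}
```

Using `continue` rather than `break` means a valid pointer that follows a zero entry would still be picked up, at the cost of reading every reported slot.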
* Re: [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
From: Ilkka Koskinen @ 2024-02-06 21:04 UTC
To: Robin Murphy
Cc: Ilkka Koskinen, Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Tue, 6 Feb 2024, Robin Murphy wrote:
> On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
>> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
>> report incorrect child count. The failing crosspoints report 8 children
>> while they only have two.
>
> Ooh, fun :)
>
>> When the driver tries to access the inexistent child nodes, it believes it
>> has reached an invalid node type and probing fails. The workaround is to
>> ignore those incorrect child nodes and continue normally.
>>
>> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
>> ---
>>  drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
>>  1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
>> index c584165b13ba..97fed8ec3693 100644
>> --- a/drivers/perf/arm-cmn.c
>> +++ b/drivers/perf/arm-cmn.c
>> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum
>> cmn_node_type type)
>>    }
>>  }
>> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
>> +                                             struct arm_cmn_node *dn,
>> +                                             u16 child_count, int child)
>> +{
>> +  /*
>> +   * The bug occurs only when a crosspoint reports 8 children
>> +   * while it only has two HN-P child nodes.
>> +   */
>> +  dn -= 2;
>> +
>> +  if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
>> +      child == 2 && dn->type == CMN_TYPE_HNP)
>> +          return true;
>> +
>> +  return false;
>> +}
>> +
>>  static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>>  {
>>    void __iomem *cfg_region;
>> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn,
>> unsigned int rgn_offset)
>>            for (j = 0; j < child_count; j++) {
>>                    reg = readq_relaxed(xp_region + child_poff + j * 8);
>> +                  if (reg == 0)
>> +                          if (arm_cmn_is_ampereonex_bug(cmn, dn,
>> child_count, j))
>> +                                  /*
>> +                                   * We know there are only two real
>> children and the rest 6
>> +                                   * are inexistent. Thus, we can skip
>> the rest of the loop
>> +                                   */
>> +                                  break;
>> +
>
> TBH I don't see much harm in taking an even simpler approach, so I'd be
> inclined to not bother being all that specific beyond documenting it,
> something like the below:

Sounds good to me.

>
> Cheers,
> Robin.
>
> ----->8-----
>
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..7e3aa7e2345f 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn,
> unsigned int rgn_offset)
>                            dev_dbg(cmn->dev, "ignoring external node
> %llx\n", reg);
>                            continue;
>                    }
> +                  /*
> +                   * AmpereOneX erratum AC04_MESH_1 makes some XPs
> report a bogus
> +                   * child count larger than the number of valid child
> pointers.
> +                   * A child offset of 0 can only occur on CMN-600;
> otherwise it
> +                   * would imply the root node being its own
> grandchild, which
> +                   * we can safely dismiss in general.
> +                   */
> +                  if (reg == 0 && cmn->part != PART_CMN600) {
> +                          dev_dbg(cmn->dev, "bogus child pointer?\n");
> +                          continue;
> +                  }
>                    arm_cmn_init_node_info(cmn, reg &
> CMN_CHILD_NODE_ADDR, dn);
>

Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>

Cheers, Ilkka
* Re: [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
From: Will Deacon @ 2024-02-09 17:02 UTC
To: Ilkka Koskinen
Cc: Robin Murphy, Mark Rutland, linux-arm-kernel, linux-kernel

On Tue, Feb 06, 2024 at 01:04:27PM -0800, Ilkka Koskinen wrote:
> On Tue, 6 Feb 2024, Robin Murphy wrote:
> > On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
> > diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> > index c584165b13ba..7e3aa7e2345f 100644
> > --- a/drivers/perf/arm-cmn.c
> > +++ b/drivers/perf/arm-cmn.c
> > @@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn,
> > unsigned int rgn_offset)
> >  				dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
> >  				continue;
> >  			}
> > +			/*
> > +			 * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
> > +			 * child count larger than the number of valid child pointers.
> > +			 * A child offset of 0 can only occur on CMN-600; otherwise it
> > +			 * would imply the root node being its own grandchild, which
> > +			 * we can safely dismiss in general.
> > +			 */
> > +			if (reg == 0 && cmn->part != PART_CMN600) {
> > +				dev_dbg(cmn->dev, "bogus child pointer?\n");
> > +				continue;
> > +			}
> >  			arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
> >
>
> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>

Mind sending that out as a proper patch that I can pick up, please?

Cheers,

Will