From: Jakub Kicinski <kuba@kernel.org>
To: mmyangfl@gmail.com
Cc: Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, andrew@lunn.ch, olteanv@gmail.com,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next] net: dsa: yt921x: Add limited ACL flow statistics support
Date: Sat, 13 Jun 2026 17:45:30 -0700 [thread overview]
Message-ID: <20260614004530.402503-1-kuba@kernel.org> (raw)
In-Reply-To: <20260610202508.845328-1-mmyangfl@gmail.com>
This is an AI-generated review of your patch. The human sending this
email says "I thought this is a false positive and the register is
Clear-on-Read but you seem to be writing 0 to it when allocating."
---
net: dsa: yt921x: Add limited ACL flow statistics support
This commit adds .cls_flower_stats() support to the yt921x DSA driver,
backed by a pool of 64 hardware flow counters configured in byte mode.
A counter is allocated at rule-install time and read on demand via
yt921x_acl_stat().
> diff --git a/drivers/net/dsa/yt921x.c b/drivers/net/dsa/yt921x.c
> index 159b16606f6c..7919afcb8136 100644
> --- a/drivers/net/dsa/yt921x.c
> +++ b/drivers/net/dsa/yt921x.c
> @@ -373,6 +373,11 @@ yt921x_regs_clear_bits(struct yt921x_priv *priv, u32 reg, const u32 *masks,
> return yt921x_regs_write(priv, reg, vs, num_regs);
> }
>
> +static int yt921x_reg64_read(struct yt921x_priv *priv, u32 reg, u32 *vals)
> +{
> + return yt921x_regs_read(priv, reg, vals, 2);
> +}
> +
[Low]
yt921x_reg64_read() reads the 64-bit FLOWSTATn_STAT register as two
independent 32-bit transactions, one for each word, and combines them
in yt921x_acl_stat() with:
*statp = ((u64)vals[1] << 32) | vals[0];
The hardware counter is live and can increment between the two reads.
If the low word wraps from ~0xffffffff to a small value while the high
word increments between the two transactions, the recombined value
becomes (new_high << 32) | old_low, producing a spurious ~4 GiB jump.
Does the chip provide a latch/snapshot register, or a documented
read-low/high/low retry protocol, that should be used here instead of
two independent reads?
> static int
> yt921x_reg64_write(struct yt921x_priv *priv, u32 reg, const u32 *vals)
> {
> @@ -2224,6 +2229,40 @@ yt921x_acl_reserve(struct yt921x_priv *priv, unsigned int entscnt,
> return UINT_MAX;
> }
>
> +static int
> +yt921x_acl_stat(struct yt921x_priv *priv, enum tc_setup_type type,
> + unsigned long tag, u64 *statp)
> +{
> + const struct yt921x_acl_rule *aclrule;
> + const struct yt921x_acl_blk *aclblk;
> + unsigned int statid;
> + unsigned int binid;
> + unsigned int blkid;
> + unsigned int entid;
> + u32 vals[2];
> + int res;
> +
> + entid = yt921x_acl_find(priv, type, tag);
> + if (entid == UINT_MAX)
> + return -ENOENT;
> +
> + blkid = entid / YT921X_ACL_ENT_PER_BLK;
> + binid = entid % YT921X_ACL_ENT_PER_BLK;
> + aclblk = priv->acl_blks[blkid];
> + aclrule = aclblk->rules[binid];
> +
> + if (!(aclrule->action[0] & YT921X_ACL_ACTa_FLOWSTAT_EN))
> + return -EOPNOTSUPP;
> +
> + statid = FIELD_GET(YT921X_ACL_ACTa_FLOWSTAT_ID_M, aclrule->action[0]);
> + res = yt921x_reg64_read(priv, YT921X_FLOWSTATn_STAT(statid), vals);
> + if (res)
> + return res;
> +
> + *statp = ((u64)vals[1] << 32) | vals[0];
> + return 0;
> +}
> +
[High]
The absolute hardware counter is written directly into cls->stats.bytes,
and the flow_offload framework treats this value as a per-poll delta.
Following the call chain tcf_exts_hw_stats_update() ->
tcf_action_stats_update() -> a->ops->stats_update() -> _bstats_update(),
the framework performs:
u64_stats_add(&bstats->bytes, bytes);
So if the hardware counter currently shows X bytes, every subsequent
poll of an idle counter adds another X to the kernel running total. After
N polls of an unchanged counter the user-visible "tc -s flower" value
will read N*X instead of X.
Two existing drivers handle this correctly:
- drivers/net/dsa/sja1105/sja1105_vl.c:sja1105_vl_stats() caches a
"last reported" value and reports (current - last).
- drivers/net/ethernet/mscc/ocelot_vcap.c:ocelot_vcap_filter_stats_update()
clears the hardware counter after each read.
Should yt921x_acl_stat() do something similar, either by tracking the
previously reported value per-rule, or by clearing FLOWSTATn_STAT after
the read?
[Medium]
cls->stats.pkts and cls->stats.lastused are never populated. The
hardware is hardwired to byte mode (YT921X_FLOWSTAT_CTRL_TYPE_FLOW with
PKT_MODE cleared), so no packet count is available, but the driver still
sets:
cls->stats.used_hw_stats = FLOW_ACTION_HW_STATS_IMMEDIATE;
cls->stats.used_hw_stats_valid = true;
Combined with the additive _bstats_update() semantics above, "tc -s
flower" will always show 0 packets even under traffic. Similarly
flow_stats->lastused stays 0, and any rule-aging logic that consults
lastused (it is updated via max_t(u64, ...)) will see the rule as never
used. For comparison, sja1105_vl_stats() passes
(jiffies - rule->vl.stats.lastused) as a delta.
Should the commit message and/or used_hw_stats_valid acknowledge that
pkts and lastused are not being reported?
> static int
> yt921x_acl_commit(struct yt921x_priv *priv, unsigned int entid, u8 entsmask)
> {
> @@ -2336,6 +2375,10 @@ yt921x_acl_del(struct yt921x_priv *priv, enum tc_setup_type type,
> clear_bit(FIELD_GET(YT921X_ACL_ACTa_METER_ID_M,
> aclrule->action[0]),
> priv->meters_map);
> + if (aclrule->action[0] & YT921X_ACL_ACTa_FLOWSTAT_EN)
> + clear_bit(FIELD_GET(YT921X_ACL_ACTa_FLOWSTAT_ID_M,
> + aclrule->action[0]),
> + priv->flowstats_map);
[Low]
On rule delete the priv->flowstats_map bit is cleared, but
YT921X_FLOWSTATn_CTRL(statid) is not written to clear FLOWSTAT_CTRL_EN.
The counter remains physically enabled with TYPE_FLOW until something
else reprograms it.
Today this is benign because acl_commit() zeros the action register so
no rule references the counter, and reuse rewrites CTRL. However, the
header file enumerates other types (TYPE_CPU_CODE, TYPE_DROP_CODE,
TYPE_PORT) for the same slot pool. If a future caller picks one of
these types without first disabling the slot, could it observe stale
counts from the previous flow user?
> priv->acl_masks[blkid] &= ~aclrule->mask;
> kvfree(aclrule);
> if (!priv->acl_masks[blkid]) {
> @@ -2355,11 +2398,13 @@ yt921x_acl_add(struct yt921x_priv *priv,
> struct yt921x_acl_blk *aclblk;
> bool use_trap = false;
> unsigned int meterid;
> + unsigned int statid;
> unsigned long mask;
> unsigned int binid;
> unsigned int blkid;
> unsigned int entid;
> unsigned int o;
> + u32 ctrl;
> int res;
>
> /* Allocate resources */
> @@ -2386,6 +2431,22 @@ yt921x_acl_add(struct yt921x_priv *priv,
> }
> }
>
> + statid = find_first_zero_bit(priv->flowstats_map, YT921X_FLOWSTAT_NUM);
> + if (statid < YT921X_FLOWSTAT_NUM) {
> + u32 zeros[2] = {};
> +
> + ctrl = YT921X_FLOWSTAT_CTRL_EN | YT921X_FLOWSTAT_CTRL_TYPE_FLOW;
> + res = yt921x_reg_write(priv, YT921X_FLOWSTATn_CTRL(statid),
> + ctrl);
> + if (res)
> + return res;
> +
> + res = yt921x_reg64_write(priv, YT921X_FLOWSTATn_STAT(statid),
> + zeros);
> + if (res)
> + return res;
> + }
> +
[Medium]
This block runs unconditionally for every ACL rule, regardless of
whether the user requested statistics. When all 64 counters are in use
(statid >= YT921X_FLOWSTAT_NUM), the rule is still installed but
silently without stats; the user only learns by querying and getting
-EOPNOTSUPP from yt921x_acl_stat().
There is no NL_SET_ERR_MSG_MOD() at install time and no kernel log
message. The same function explicitly emits
NL_SET_ERR_MSG_MOD(extack, "No more meters available");
for the meter pool right above this hunk.
Should the flow-stat path either be gated on whether the rule actually
needs stats (so unrelated rules do not consume counters), or surface a
diagnostic via extack when the pool is exhausted? Otherwise an old set
of un-queried rules can starve newer rules that do need stats.
[Low]
After the YT921X_FLOWSTATn_CTRL write succeeds, the bare "return res;"
on a failing YT921X_FLOWSTATn_STAT write leaves the counter enabled in
hardware while priv->flowstats_map remains clear. Likewise, if any
later step (kvmemdup, yt921x_acl_commit) fails, the err: label does not
unwind the CTRL write or the meter configuration done in the prior
hunk.
The same asymmetric rollback pre-exists for meters and is implicitly
healed by the next allocation reprogramming the slot, but the new
flow-stat path's bare returns bypass the err: label entirely. Should
this teardown path be made symmetric?
> /* Prepare acl block ctrlblk */
> blkid = entid / YT921X_ACL_ENT_PER_BLK;
> binid = entid % YT921X_ACL_ENT_PER_BLK;
> @@ -2426,6 +2487,9 @@ yt921x_acl_add(struct yt921x_priv *priv,
> aclrule->action[0] |= YT921X_ACL_ACTa_METER_ID(meterid);
> else
> aclrule->action[0] &= ~YT921X_ACL_ACTa_METER_EN;
> + if (statid < YT921X_FLOWSTAT_NUM)
> + aclrule->action[0] |= YT921X_ACL_ACTa_FLOWSTAT_EN |
> + YT921X_ACL_ACTa_FLOWSTAT_ID(statid);
>
> /* Write rules */
> aclblk->rules[binid] = aclrule;
[ ... ]
--
pw-bot: cr
prev parent reply other threads:[~2026-06-14 0:45 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-10 20:25 [PATCH net-next] net: dsa: yt921x: Add limited ACL flow statistics support David Yang
2026-06-14 0:45 ` Jakub Kicinski [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260614004530.402503-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mmyangfl@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox