From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 959601DE8BE; Wed, 27 May 2026 01:55:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779846931; cv=none; b=S5/fPCwJPmPZPZBcoyH+P69ndKW7MnelkHsGR2u63wdIx462N9D4WETlSvjjyoHnDnKpj0pUAeURtQtXeaFyZjxu2ecvhH+0weWZxtikWAV0pTHnpQP4LmA1KP5Y+3ll5t/7g5kxbYViyWnJPN6/jo7chOUpVieIG+lk6PEkpWk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779846931; c=relaxed/simple; bh=eeHIn//hSymJGANkZOXokx3u7vbI4MfPEkxeu3jOq94=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KDCr+/WD4YrbopHHw4VVGz9KvJ8uMUrSCwyJdCyzZQ1g/oKJ3kcWbXC33rWLmARnBNykdA4fcu4sTiVROedOIA8Nv/TtEcuzzl9INShijjHV6kdWyX5sPl/b+JPkxEl38025YvJgWviUzjrPzoACf8fHW8jtE5bq9Cb1MiS+XfU= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=A+rd2iGK; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="A+rd2iGK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB1E31F000E9; Wed, 27 May 2026 01:55:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1779846929; bh=Nas175rcdEQu2O5gpSAlT+fPAtMYmxGk8DidoV8D5Fg=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=A+rd2iGKDK75E7NM5XvZAXKSfAUJolUDB6Sa09vCLMqfifVOrFuvaqLbMsnIuJ4zi fkxcyAyXUM2HH7RDdBDqJ0a+nz/0Hax/DqkzUfa+A3y/zhaQLqMvLoSj+pns0ahyZN JW7rs35M04caKepkuGxhcnHSBBgn5GQKrsKOCdNmZjQorkWjO2oVROVsq5CN1ztjse xEN97eAKN9Rkb33JzUQ0tnQfFSuChXX3hxqwCLvVn41bdcR+1D1rH6iZUdp+s98+sG JZ190HPpOVNNF8Mz87uVsRlzwHGQQSz+oVP/N3NSZohfx6OFCDfMie1SdMfCKC37T7 4NthhEV0TMg3w== From: Jakub Kicinski To: mmyangfl@gmail.com Cc: Jakub Kicinski , netdev@vger.kernel.org, hongson.hn@gmail.com, andrew@lunn.ch, olteanv@gmail.com, davem@davemloft.net, edumazet@google.com, pabeni@redhat.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH net-next v3] net: dsa: yt921x: Add ACL support Date: Tue, 26 May 2026 18:55:27 -0700 Message-ID: <20260527015527.3178315-1-kuba@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260523020704.503966-1-mmyangfl@gmail.com> References: <20260523020704.503966-1-mmyangfl@gmail.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. --- net: dsa: yt921x: Add ACL support This adds tc-flower offload support for the YT921x switch family, mapping incoming-direction classifiers and a small set of actions onto the chip's ACL engine. A few questions about the implementation are inline. > diff --git a/drivers/net/dsa/yt921x.c b/drivers/net/dsa/yt921x.c > index a652599a4561..f645879817db 100644 > --- a/drivers/net/dsa/yt921x.c > +++ b/drivers/net/dsa/yt921x.c [ ... ] > @@ -1457,6 +1467,955 @@ yt921x_dsa_port_setup_tc(struct dsa_switch *ds, int port, > } > } > > +/* ACL: 48 blocks * 8 entries > + * > + * One rule can span multiple entries, but within a block. > + */ [ ... ] > + want_dport = false; > + want_sport = false; > + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_PORTS)) { > + struct flow_match_ports match; > + > + entry = yt921x_acl_entries_find(entries, &size, > + YT921X_ACL_TYPE_L4); > + if (!entry) > + goto err; > + > + flow_rule_match_ports(rule, &match); > + > + want_dport = !!match.mask->dst; > + want_sport = !!match.mask->src; [Low] Could the L4 entry slot be allocated only when at least one mask is non-zero? The IPV4_ADDRS, IPV6_ADDRS and ETH_ADDRS branches above all compute want_dst/want_src first and only call yt921x_acl_entries_find() when a mask is set, but here the L4 entry is reserved as soon as the dissector key is present. A rule that exposes the PORTS key with both masks zero will consume one of the eight per-block entry slots with an all-zero key/mask, and force an L4 type match the user did not request. > + > + entry->key[0] |= (ntohs(match.key->dst) << 16) | > + ntohs(match.key->src); > + entry->mask[0] |= (ntohs(match.mask->dst) << 16) | > + ntohs(match.mask->src); > + } > + > + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_PORTS_RANGE)) { > + struct flow_match_ports_range match; > + > + entry = yt921x_acl_entries_find(entries, &size, > + YT921X_ACL_TYPE_L4); > + if (!entry) > + goto err; > + > + flow_rule_match_ports_range(rule, &match); > + > + if ((want_dport && !!match.mask->tp.dst) || > + (want_sport && !!match.mask->tp.src)) { > + NL_SET_ERR_MSG_MOD(extack, > + "Ports and ports range are mutually exclusive"); > + return -EINVAL; > + } > + > + entry->key[0] |= (ntohs(match.key->tp_min.dst) << 16) | > + ntohs(match.key->tp_min.src); > + if (match.mask->tp.dst) > + entry->key[1] |= YT921X_ACL_KEYb_L4_DPORT_RANGE_EN; > + if (match.mask->tp.src) > + entry->key[1] |= YT921X_ACL_KEYb_L4_SPORT_RANGE_EN; > + entry->mask[0] |= (ntohs(match.mask->tp_max.dst) << 16) | > + ntohs(match.mask->tp_max.src); > + } [High] Is the source for the upper bound of the port range correct here? Looking at fl_set_key_port_range() in net/sched/cls_flower.c, the high port number is stored in key->tp_max while mask->tp_max is filled with 0xff bytes when the corresponding TCA attribute is present: fl_set_key_val(tb, &key->tp_max.dst, TCA_FLOWER_KEY_PORT_DST_MAX, &mask->tp_max.dst, TCA_FLOWER_UNSPEC, sizeof(key->tp_max.dst)); So mask->tp_max.{dst,src} is 0xffff whenever the attribute exists, not the high port number. Other drivers that consume PORTS_RANGE (for example mlxsw_sp_flower_parse_ports_range() in drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c) read match.key->tp_max.{src,dst} for the upper bound. Given the comment "Range: turn KEY:MASK into MIN:MAX" in yt921x.h, would a tc-flower rule like dst_port 1000-2000 not end up programmed as min=1000, max=0xffff in hardware? [ ... ] > + /* Misc only */ > + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IP)) { > + struct flow_match_ip match; > + > + flow_rule_match_ip(rule, &match); > + if (match.mask->ttl) > + return -EOPNOTSUPP; [Low] Should this rejection set an extack message, similar to the "Unsupported keys used" / "Action not supported" / "Rule too complex" paths in the same function? > + > + if (match.mask->tos) { [ ... ] > +static int > +yt921x_acl_rule_ext_parse_flow_action(struct yt921x_acl_rule_ext *ruleext, > + const struct flow_action *flow_action, > + struct netlink_ext_ack *extack, > + struct yt921x_priv *priv, int port) > +{ > + const struct flow_action_entry *act; > + u32 *action = ruleext->r.action; > + int res; > + int i; > + > + memset(action, 0, 3 * sizeof(*action)); > + flow_action_for_each(i, act, flow_action) > + switch (act->id) { [ ... ] > + case FLOW_ACTION_ACCEPT: > + action[2] |= YT921X_ACL_ACTc_REDIR_EN | > + YT921X_ACL_ACTc_REDIR_FWD; > + break; > + case FLOW_ACTION_DROP: > + action[2] |= YT921X_ACL_ACTc_REDIR_EN | > + YT921X_ACL_ACTc_REDIR_STEER; > + break; > + case FLOW_ACTION_REDIRECT: { > + struct dsa_port *to_dp; > + > + to_dp = dsa_port_from_netdev(act->dev); > + if (IS_ERR(to_dp)) { > + NL_SET_ERR_MSG_MOD(extack, > + "Destination not a switch port"); > + return -EOPNOTSUPP; > + } > + > + action[2] |= YT921X_ACL_ACTc_REDIR_EN | > + YT921X_ACL_ACTc_REDIR_STEER | > + YT921X_ACL_ACTc_REDIR_DPORTn(to_dp->index); > + break; > + } [Medium] Should this also verify that the redirect target lives on the same switch (to_dp->ds == ds)? dsa_port_from_netdev() only checks dsa_user_dev_check(); in a multi- switch DSA tree the returned port can be on a different chip with an index >= 11. YT921X_ACL_ACTc_REDIR_DPORTn(port) is BIT(port + 10), and YT921X_ACL_ACTc_REDIR_DPORTS_M is GENMASK(20, 10), so an index of 11 sets bit 21, which is the low bit of YT921X_ACL_ACTc_REDIR_M (GENMASK(22, 21)). That promotes the encoded REDIR_STEER (value 2) into REDIR_TRAP (value 3), and higher indices spill further into the CTAG and STAG fields. [Low] With multiple terminal verdicts in a single action list, does the bitwise OR into action[2] produce a deterministic result? For example, REDIR_STEER (value 2) OR'd with REDIR_TRAP (value 3) in the YT921X_ACL_ACTc_REDIR_M field results in REDIR_TRAP, silently turning a drop into a CPU trap. Should contradictory verdicts be rejected with an extack message instead? > + case FLOW_ACTION_TRAP: > + action[2] |= YT921X_ACL_ACTc_REDIR_EN | > + YT921X_ACL_ACTc_REDIR_TRAP; > + break; [ ... ] > +static int > +yt921x_acl_rule_ext_parse_flow(struct yt921x_acl_rule_ext *ruleext, int port, > + const struct flow_cls_offload *cls, bool ingress, > + struct yt921x_priv *priv) > +{ > + const struct flow_rule *rule = flow_cls_offload_flow_rule(cls); > + struct netlink_ext_ack *extack = cls->common.extack; > + int res; > + > + if (!ingress || cls->common.chain_index) > + return -EOPNOTSUPP; [Low] Could this rejection also use NL_SET_ERR_MSG_MOD()? Other failure paths in this file populate extack ("Unsupported keys used", "Rule too complex", etc.); a user passing an egress filter or a non-zero chain index will currently see a bare EOPNOTSUPP without any diagnostic. > + > + res = yt921x_acl_rule_ext_parse_flow_action(ruleext, &rule->action, > + extack, priv, port); [ ... ] > +static unsigned int > +yt921x_acl_reserve(struct yt921x_priv *priv, unsigned int entscnt, > + struct netlink_ext_ack *extack) > +{ > + int candidates[YT921X_ACL_ENT_PER_BLK]; > + unsigned int acl_used_cnt = 0; > + > + if (WARN_ON(entscnt > YT921X_ACL_ENT_PER_BLK)) > + return UINT_MAX; > + > + for (unsigned int i = 0; i < ARRAY_SIZE(candidates); i++) > + candidates[i] = -1; > + for (unsigned int i = 0; i < YT921X_ACL_BLK_NUM; i++) { > + unsigned int blk_used_cnt = hweight8(priv->acl_masks[i]); > + > + candidates[blk_used_cnt] = i; > + acl_used_cnt += blk_used_cnt; > + } [Critical] Can this overrun candidates[] when a block is fully populated? YT921X_ACL_ENT_PER_BLK is 8 and priv->acl_masks[i] is a u8, so hweight8(priv->acl_masks[i]) ranges over [0..8]. As soon as any block has all 8 bins in use (acl_masks[i] == 0xff, reachable by adding rules since priv->acl_masks[blkid] |= aclrule->mask accumulates and a single rule can have entscnt up to 8), the next call writes: candidates[8] = i; which is one past the end of the eight-element stack array. Would it work to size the array as YT921X_ACL_ENT_PER_BLK + 1, or to skip fully-saturated blocks here? > + > + if (acl_used_cnt >= YT921X_ACL_NUM) { > + NL_SET_ERR_MSG_MOD(extack, "ACL entry limit reached"); > + return UINT_MAX; > + } [ ... ] > +static int > +yt921x_acl_commit(struct yt921x_priv *priv, unsigned int entid, u8 entsmask) > +{ > + const struct yt921x_acl_rule *aclrule; > + const struct yt921x_acl_blk *aclblk; > + unsigned int blkid; > + unsigned int binid; > + unsigned long mask; > + u32 zeros[3] = {}; > + unsigned int i; > + unsigned int o; > + u32 ctrl; > + int res; > + > + blkid = entid / YT921X_ACL_ENT_PER_BLK; > + binid = entid % YT921X_ACL_ENT_PER_BLK; > + aclblk = priv->acl_blks[blkid]; > + aclrule = aclblk->rules[binid]; > + > + /* Open the block */ > + ctrl = YT921X_ACL_BLK_CMD_MODIFY | YT921X_ACL_BLK_CMD_BLKID(blkid); > + res = yt921x_reg_write(priv, YT921X_ACL_BLK_CMD, ctrl); > + if (res) > + return res; > + > + /* Write keys and masks */ [Medium] If any of the register writes between the "Open the block" and "Close the block" sequence fails, the function returns immediately with the block still latched in MODIFY mode and no compensating close is performed. res = yt921x_reg_write(priv, YT921X_ACL_BLK_CMD, ctrl); if (res) return res; ... res = yt921x_reg64_write(priv, YT921X_ACLn_KEYm(blkid, o), ...); if (res) return res; /* block still in MODIFY */ ... Would a goto-style unwind that always issues the close-block command and then returns the original error be safer here? > + ctrl = 0; > + for (unsigned int i = 0; i < YT921X_ACL_ENT_PER_BLK; i++) > + ctrl |= YT921X_ACL_BLK_KEEP_KEEPn(i); [ ... ] > + res = yt921x_reg_write(priv, YT921X_ACLn_ENTRY(blkid), ctrl); > + if (res) > + return res; > + > + /* Close the block */ > + ctrl = YT921X_ACL_BLK_CMD_BLKID(blkid); > + res = yt921x_reg_write(priv, YT921X_ACL_BLK_CMD, ctrl); > + if (res) > + return res; > + > + /* Write actions */ > + res = yt921x_reg96_write(priv, YT921X_ACLn_ACT(entid), > + aclrule ? aclrule->action : zeros); > + if (res) > + return res; > + > + return 0; > +} [Medium] Is the activation order between the keys/EN and the action register intentional? The new keys, masks, and the per-block YT921X_ACLn_ENTRY register (holding the EN bits) are written inside the MODIFY-bracketed sequence; the block is then closed (which is what activates the new keys/EN in the data path); only after the close is YT921X_ACLn_ACT(entid) written. Between the close and the ACT write, packets matching the freshly programmed keys would consult the prior contents of the action slot. For a freshly allocated entid, that slot is zero, so REDIR_EN is clear and the rule's intent (DROP/REDIRECT/TRAP) is briefly not enforced. Is there a way to write the action before the EN bit becomes visible to the data path? [ ... ] > +static int > +yt921x_acl_add(struct yt921x_priv *priv, > + const struct yt921x_acl_rule_ext *ruleext, > + struct netlink_ext_ack *extack) > +{ > + bool want_meter = ruleext->r.action[0] & YT921X_ACL_ACTa_METER_EN; [ ... ] > + if (want_meter) { > + meterid = find_first_zero_bit(priv->meters_map, > + YT921X_METER_NUM); > + if (meterid >= YT921X_METER_NUM) { > + NL_SET_ERR_MSG_MOD(extack, > + "No more meters available"); > + return -EOPNOTSUPP; > + } > + > + res = yt921x_meter_config(priv, meterid, &ruleext->marker); > + if (res) > + return res; > + } > + > + /* Prepare acl block ctrlblk */ > + blkid = entid / YT921X_ACL_ENT_PER_BLK; > + binid = entid % YT921X_ACL_ENT_PER_BLK; > + aclblk = priv->acl_blks[blkid]; > + if (!aclblk) { > + aclblk = devm_kzalloc(dev, sizeof(*aclblk), GFP_KERNEL); > + if (!aclblk) > + return -ENOMEM; > + priv->acl_blks[blkid] = aclblk; > + } > + > + /* Prepare acl rule ctrlblk */ > + aclrule = devm_kmemdup(dev, &ruleext->r, > + offsetof(struct yt921x_acl_rule, > + entries[entscnt]), > + GFP_KERNEL); > + if (!aclrule) { > + res = -ENOMEM; > + goto err; > + } [Medium] What happens to a meter that has been programmed in hardware when one of these later steps fails? yt921x_meter_config() writes YT921X_METERn_CTRL with METER_EN set before any of the kernel-side bookkeeping has succeeded. If the following devm_kzalloc(aclblk), devm_kmemdup(aclrule), or yt921x_acl_commit() call fails, the function returns or jumps to err: without: - reverting the meter hardware programming, and - reserving the meter id in priv->meters_map. The slot stays active in hardware with the user's policer config, and only set_bit(meterid, priv->meters_map) on the success path tracks it. A subsequent successful add will pick the same id from find_first_zero_bit() and reprogram it, but in the meantime the device polices traffic through that meter. [ ... ] > @@ -3766,6 +4747,9 @@ static const struct dsa_switch_ops yt921x_dsa_switch_ops = { > .port_policer_del = yt921x_dsa_port_policer_del, > .port_policer_add = yt921x_dsa_port_policer_add, > .port_setup_tc = yt921x_dsa_port_setup_tc, > + /* acl */ > + .cls_flower_del = yt921x_dsa_cls_flower_del, > + .cls_flower_add = yt921x_dsa_cls_flower_add, [Low] Is there a reason cls_flower_stats is not implemented? The action register definitions YT921X_ACL_ACTa_FLOWSTAT_EN and YT921X_ACL_ACTa_FLOWSTAT_ID_M suggest the hardware can report per-rule match counters, but the bits are never set and no stats callback is hooked up, so tc -s filter show will not show match counts for offloaded rules. > /* hsr */ > .port_hsr_leave = dsa_port_simple_hsr_leave, > .port_hsr_join = dsa_port_simple_hsr_join, -- pw-bot: cr