From: Jakub Kicinski <kuba@kernel.org>
To: linusw@kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
woojung.huh@microchip.com, UNGLinuxDriver@microchip.com,
andrew@lunn.ch, olteanv@gmail.com, davem@davemloft.net,
edumazet@google.com, pabeni@redhat.com, robh@kernel.org,
krzk+dt@kernel.org, conor+dt@kernel.org, marex@denx.de,
horms@kernel.org, linux@armlinux.org.uk, netdev@vger.kernel.org,
Woojung.Huh@microchip.com, devicetree@vger.kernel.org,
nb@tipi-net.de
Subject: Re: [PATCH net-next v6 3/5] net: dsa: tag_ks8995: Add the KS8995 tag handling
Date: Wed, 10 Jun 2026 08:39:52 -0700 [thread overview]
Message-ID: <20260610153952.1685895-1-kuba@kernel.org> (raw)
In-Reply-To: <20260608-ks8995-to-ksz8-v6-3-d91eb43edcc5@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: tag_ks8995: Add the KS8995 tag handling
Adds a new DSA tagger for the Micrel/Microchip KS8995XA switch using a
modified 0x8100 ethertype where the lower nybble encodes a port bitmap.
The patch wires the new tagger into the DSA core, Kconfig, Makefile and
MAINTAINERS.
A few questions inline below.
> diff --git a/net/dsa/tag_ks8995.c b/net/dsa/tag_ks8995.c
> new file mode 100644
> index 000000000000..5bdb18271582
> --- /dev/null
> +++ b/net/dsa/tag_ks8995.c
> @@ -0,0 +1,132 @@
[ ... ]
> +static struct sk_buff *ks8995_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> + struct vlan_ethhdr *hdr = vlan_eth_hdr(skb);
> + bool have_hwaccel_tag = false;
> + u16 tci = 0, portmask;
> +
> + /* Prepare the special KS8995 tags */
> + portmask = dsa_xmit_port_mask(skb, dev);
> +
> + if (skb_vlan_tag_present(skb) && skb->vlan_proto == htons(ETH_P_8021Q)) {
> + tci = skb_vlan_tag_get(skb);
> + __vlan_hwaccel_clear_tag(skb);
> + have_hwaccel_tag = true;
> + }
[Low]
What is intended to happen if the skb arrives with both a hwaccel VLAN
tag and an in-head 0x8100 tag, or with a hwaccel tag whose vlan_proto
is something other than 0x8100 (for example 0x88a8 service tag)?
In those cases this branch leaves the hwaccel tag alone or only consumes
it conditionally, and the second branch below either inserts a fresh
in-head tag while keeping a pre-existing in-head 0x8100, or modifies the
in-head 0x8100 while a non-8021Q hwaccel tag is still attached. Should
this path either reject such frames or normalise them first?
> +
> + if (have_hwaccel_tag || hdr->h_vlan_proto != htons(ETH_P_8021Q)) {
> + skb = vlan_insert_tag(skb, KS8995M_STPID(portmask), tci);
> + if (!skb)
> + return NULL;
[High]
Is returning NULL here safe?
include/linux/if_vlan.h documents that vlan_insert_tag() calls
dev_kfree_skb_any(skb) on failure (e.g. when skb_cow_head() fails under
memory pressure) and returns NULL. The caller in net/dsa/user.c does:
nskb = p->xmit(skb, dev);
if (!nskb) {
kfree_skb(skb);
return NETDEV_TX_OK;
}
so on the NULL return dsa_user_xmit() runs kfree_skb() on its local
pointer that vlan_insert_tag() has already freed. Doesn't that produce a
use-after-free on skb->users plus a double free?
Should this path swallow the free done by vlan_insert_tag() (e.g. by
returning a sentinel handled by the caller) so dsa_user_xmit() does not
re-free the skb?
> + hdr = vlan_eth_hdr(skb);
> + netdev_dbg(dev, "%s: inserted VLAN TAG %04x TCI %04x\n",
> + __func__, hdr->h_vlan_proto, hdr->h_vlan_TCI);
> + } else {
> + /* VLAN tag already exists in skb head, modify it in place */
> + hdr = vlan_eth_hdr(skb);
> + hdr->h_vlan_proto = KS8995M_STPID(portmask);
> + netdev_dbg(dev, "%s: modified VLAN TAG %04x\n",
> + __func__, hdr->h_vlan_proto);
> + }
> +
> + return skb;
> +}
> +
> +static struct sk_buff *ks8995_rcv(struct sk_buff *skb, struct net_device *dev)
> +{
> + int portmask;
> + u16 etype;
> +
> + /* We are expecting all received packets to have a mangled VLAN
> + * TPID, so drop anything else. Because of the non-standard TPID,
> + * don't even bother looking for a tag in the hwaccel area.
> + *
> + * We have to inspect the ethertype directly because skb->protocol
> + * will contain garbage.
> + */
> + etype = ntohs(*(__be16 *)dsa_etype_header_pos_rx(skb));
> + if ((etype & KS8995M_STPID_STD) != ETH_P_8021Q) {
> + netdev_dbg(dev, "%s: dropped ethertype 0x%04x\n",
> + __func__, etype);
> + return NULL;
> + }
> + netdev_dbg(dev, "%s: received ethertype %04x\n",
> + __func__, etype);
> +
> + /* Move the custom DSA+VLAN tag into the hwaccel area and strip
> + * it from the skb head
> + */
> + skb = skb_vlan_untag(skb);
> + if (!skb) {
> + netdev_err(dev, "%s: unable to untag skb\n", __func__);
> + return NULL;
> + }
[High]
Is the NULL return here compatible with the cpu_dp->rcv contract?
skb_vlan_untag() can take its err_free path (e.g. when the
pskb_may_pull(VLAN_HLEN + sizeof(unsigned short)) inside it fails, or
skb_reorder_vlan_header() fails) which calls kfree_skb(skb) and returns
NULL.
The caller in net/dsa/tag.c is:
skb = skb_unshare(skb, GFP_ATOMIC);
...
nskb = cpu_dp->rcv(skb, dev);
...
if (!nskb) {
kfree_skb(skb);
return 0;
}
so when ks8995_rcv() returns NULL after skb_vlan_untag() has already
freed its argument, dsa_switch_rcv() ends up calling kfree_skb() on a
freed skb. Wouldn't that be a use-after-free on skb->users plus a
double free?
Other cpu_dp->rcv handlers appear to assume that returning NULL means
the skb was not freed. Should this handler do something equivalent so
the contract still holds?
> +
> + portmask = FIELD_GET(KS8995M_STPID_PORTMASK, etype);
> + netdev_dbg(dev, "%s: etype %04x portmask %04x (%d)\n",
> + __func__, etype, portmask, ilog2(portmask));
> + skb->dev = dsa_conduit_find_user(dev, 0, ilog2(portmask));
> + if (!skb->dev)
> + return NULL;
[Low, Low]
A literal 0x8100 frame from the wire passes the
(etype & GENMASK(15,4)) != ETH_P_8021Q gate above, in which case
FIELD_GET(GENMASK(3,0), 0x8100) is 0 and ilog2(0) is computed (and also
passed as a netdev_dbg() argument). The frame is only dropped because
dsa_conduit_find_user() happens not to match port -1.
Would an explicit check be clearer:
portmask = FIELD_GET(KS8995M_STPID_PORTMASK, etype);
if (!portmask)
return NULL;
The commit message says "a bit in the last nybble is set for each
target port", suggesting bitmap semantics, but ilog2(portmask) on
ingress only handles a single bit and silently picks the highest set
bit if more than one is set. Would an is_power_of_2(portmask) sanity
check be appropriate here?
> +
> + /* Preserve the VLAN tag if it contains a non-zero VID which is not
> + * identical to 0x001, or PCP, and restore its TPID to the standard
> + * value.
> + *
> + * If this is just an ordinary inbound package the datasheet claims
> + * it will "replace null VID with ingress port VID", which means
> + * VID set to 1: 0x8101 0001 for port 0 or 0x8102 0001 for port 1.
> + * So in the DSA driver we will set the default port VID to 0 so
> + * we can properly detect non-VLAN frames.
> + */
> + if (!skb->vlan_tci) {
> + netdev_dbg(dev, "%s: clear VLAN tag from frame\n", __func__);
> + __vlan_hwaccel_clear_tag(skb);
> + } else {
> + skb->vlan_proto = htons(ETH_P_8021Q);
> + netdev_dbg(dev, "%s: vlan_tci = 0x%04x VLAN frame\n",
> + __func__, skb->vlan_tci);
> + }
[Low]
The comment says
Preserve the VLAN tag if it contains a non-zero VID which is not
identical to 0x001, or PCP
but the code only tests:
if (!skb->vlan_tci)
That matches VID == 0 and PCP == 0 and DEI == 0 only, so frames with
VID == 1 (the chip's "replace null VID with ingress port VID" default)
fall through into the "preserve as a VLAN frame" branch instead of
being stripped. Should the comment and the code be reconciled, or
should the test also strip VID == 1 frames so the behaviour does not
depend on the switch driver also forcing the default port VID to 0?
> +
> + dsa_default_offload_fwd_mark(skb);
> +
> + return skb;
> +}
--
pw-bot: cr
next prev parent reply other threads:[~2026-06-10 15:39 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-07 22:41 [PATCH net-next v6 0/5] net: dsa: microchip: Add support for KSZ8995XA/KS8995XA Linus Walleij
2026-06-07 22:41 ` [PATCH net-next v6 1/5] net: dsa: microchip: Add fallback Micrel compatibles Linus Walleij
2026-06-07 22:41 ` [PATCH net-next v6 2/5] dt-bindings: net: dsa: microchip: Add KSZ8995XA Linus Walleij
2026-06-07 22:41 ` [PATCH net-next v6 3/5] net: dsa: tag_ks8995: Add the KS8995 tag handling Linus Walleij
2026-06-10 15:39 ` Jakub Kicinski [this message]
2026-06-07 22:41 ` [PATCH net-next v6 4/5] net: dsa: microchip: Support Microchip KSZ8995XA / KS8995XA Linus Walleij
2026-06-10 15:39 ` Jakub Kicinski
2026-06-07 22:41 ` [PATCH net-next v6 5/5] net: dsa: ks8995: Delete surplus driver Linus Walleij
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260610153952.1685895-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=UNGLinuxDriver@microchip.com \
--cc=andrew@lunn.ch \
--cc=conor+dt@kernel.org \
--cc=davem@davemloft.net \
--cc=devicetree@vger.kernel.org \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=linusw@kernel.org \
--cc=linux@armlinux.org.uk \
--cc=marex@denx.de \
--cc=nb@tipi-net.de \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
--cc=robh@kernel.org \
--cc=woojung.huh@microchip.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox