BPF List
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Bobby Eshleman" <bobbyeshleman@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH net-next 07/11] net: devmem: support TX over NETMEM_TX_NO_DMA devices
Date: Wed, 29 Apr 2026 22:43:29 +0000	[thread overview]
Message-ID: <20260429224330.C1917C19425@smtp.kernel.org> (raw)
In-Reply-To: <20260428-tcp-dm-netkit-v1-7-719280eba4d2@meta.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Calling `netdev_queue_get_dma_dev()` on an unlocked physical device triggers a lockdep assertion.
- [High] Unrecognized unreadable SKBs bypass device capability checks in `validate_xmit_unreadable_skb()`.
- [High] Dangling `binding->dev` and `binding->vdev` pointers in TX bindings cause ABA vulnerabilities and bypass DMA isolation.
- [Medium] Missing rollback on `genlmsg_reply()` failure creates zombie bindings in the netlink TX bind path.
--

commit ad5b7f6dddf55058541369471398808c30d9cb9e
Author: Bobby Eshleman <bobbyeshleman@meta.com>
Subject: net: devmem: support TX over NETMEM_TX_NO_DMA devices

This patch adds support for devmem TX over netkit virtual devices by finding
the underlying DMA-capable physical device. It updates
validate_xmit_unreadable_skb() and netdev_nl_bind_tx_doit() to handle the
dual-device validation and binding setup.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 06c195906231..f6575cf48287 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3990,22 +3990,32 @@ static struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
>  static struct sk_buff *validate_xmit_unreadable_skb(struct sk_buff *skb,
>  						    struct net_device *dev)
>  {
> +	struct net_devmem_dmabuf_binding *binding;
>  	struct skb_shared_info *shinfo;
>  	struct net_iov *niov;
>  
>  	if (likely(skb_frags_readable(skb)))
>  		goto out;
>  
> -	if (!dev->netmem_tx)
> -		goto out_free;
> -
>  	shinfo = skb_shinfo(skb);
> +	if (shinfo->nr_frags == 0)
> +		goto out;
> +
> +	niov = netmem_to_net_iov(skb_frag_netmem(&shinfo->frags[0]));
> +	if (!net_is_devmem_iov(niov))
> +		goto out;

Will this allow unreadable fragments that are not devmem iovs (such as those
potentially introduced by io_uring) to bypass device capability checks?

Previously, if dev->netmem_tx was NETMEM_TX_NONE, any unreadable fragment
would trigger goto out_free. Now, if it's unreadable but not a devmem iov,
it jumps to out and returns the SKB, successfully bypassing the
dev->netmem_tx check below and potentially failing open on legacy devices.

> -	if (shinfo->nr_frags > 0) {
> -		niov = netmem_to_net_iov(skb_frag_netmem(&shinfo->frags[0]));
> -		if (net_is_devmem_iov(niov) &&
> -		    READ_ONCE(net_devmem_iov_binding(niov)->dev) != dev)
> +	binding = net_devmem_iov_binding(niov);
> +
> +	switch (dev->netmem_tx) {
> +	case NETMEM_TX_DMA:
> +		if (READ_ONCE(binding->dev) != dev)
>  			goto out_free;
> +		break;
> +	case NETMEM_TX_NO_DMA:
> +		break;
> +	default: /* NETMEM_TX_NONE */
> +		goto out_free;
>  	}
>  
>  out:
> diff --git a/net/core/devmem.c b/net/core/devmem.c
> index cde4c89bc146..644c286b778f 100644
> --- a/net/core/devmem.c
> +++ b/net/core/devmem.c
[ ... ]
> @@ -212,6 +212,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
>  	}
>  
>  	binding->dev = dev;
> +	binding->vdev = vdev;

Are dev_hold() references needed here for the TX bindings?

Without a reference count, and since TX bindings do not have an RX queue
memory provider to provide an unregister hook, these pointers could become
dangling if the devices are destroyed.

If a new net_device is later allocated at the same address, couldn't this
lead to an ABA problem in net_devmem_get_binding() and allow users to
transmit packets mapped for the old device's DMA domain?

>  	xa_init_flags(&binding->bound_rxqs, XA_FLAGS_ALLOC);
>  
>  	err = percpu_ref_init(&binding->ref,
> diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
> index b8f6076d8007..bc6057aee98e 100644
> --- a/net/core/netdev-genl.c
> +++ b/net/core/netdev-genl.c
[ ... ]
> @@ -1164,16 +1197,26 @@ int netdev_nl_bind_tx_doit(struct sk_buff *skb, struct genl_info *info)
>  		goto err_unlock_netdev;
>  	}
>  
> -	if (!netdev->netmem_tx) {
> +	if (netdev->netmem_tx == NETMEM_TX_NONE) {
>  		err = -EOPNOTSUPP;
>  		NL_SET_ERR_MSG(info->extack,
>  			       "Driver does not support netmem TX");
>  		goto err_unlock_netdev;
>  	}
>  
> -	dma_dev = netdev_queue_get_dma_dev(netdev, 0, NETDEV_QUEUE_TYPE_TX);
> -	binding = net_devmem_bind_dmabuf(netdev, dma_dev, DMA_TO_DEVICE,
> -					 dmabuf_fd, priv, info->extack);
> +	bind_dev = netdev_find_netmem_tx_dev(netdev);
> +	if (!bind_dev) {
> +		err = -EOPNOTSUPP;
> +		NL_SET_ERR_MSG(info->extack,
> +			       "No DMA-capable device found for netmem TX");
> +		goto err_unlock_netdev;
> +	}
> +
> +	dma_dev = netdev_queue_get_dma_dev(bind_dev, 0, NETDEV_QUEUE_TYPE_TX);

Does this cause a lockdep assertion if bind_dev != netdev?

In this function, netdev is locked via netdev_get_by_index_lock(), but
bind_dev is found via netdev_find_netmem_tx_dev(). When bind_dev is not
the same as netdev (e.g., netkit), it appears bind_dev is unlocked.

Calling netdev_queue_get_dma_dev() on bind_dev will trigger the
netdev_ops_assert_locked(dev) assertion inside that function and cause a
splat.

> +	binding = net_devmem_bind_dmabuf(bind_dev,
> +					 bind_dev != netdev ? netdev : NULL,
> +					 dma_dev, DMA_TO_DEVICE, dmabuf_fd,
> +					 priv, info->extack);
>  	if (IS_ERR(binding)) {
>  		err = PTR_ERR(binding);
>  		goto err_unlock_netdev;

Is there a missing rollback if genlmsg_reply() fails at the end of this
function?

After the binding is successfully created here and attached to the socket,
the locks are released and the function returns the result of
genlmsg_reply().

If sending the netlink response fails (e.g., due to memory exhaustion), it
returns an error code, but unlike the RX path, there is no err_unbind
rollback. Does this leave a zombie binding active in the kernel that
userspace has no dmabuf_id for?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260428-tcp-dm-netkit-v1-0-719280eba4d2@meta.com?part=7

  reply	other threads:[~2026-04-29 22:43 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28 22:41 [PATCH net-next 00/11] net: devmem: support devmem with netkit devices Bobby Eshleman
2026-04-28 22:41 ` [PATCH net-next 01/11] net: add netmem_tx modes that indicate dma capability Bobby Eshleman
2026-04-28 22:41 ` [PATCH net-next 02/11] net: bnxt: convert netmem_tx from bool to NETMEM_TX_DMA enum Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 03/11] gve: " Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 04/11] net/mlx5e: " Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 05/11] eth: fbnic: " Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 06/11] netkit: set NETMEM_TX_NO_DMA for unreadable skb passthrough Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 07/11] net: devmem: support TX over NETMEM_TX_NO_DMA devices Bobby Eshleman
2026-04-29 22:43   ` sashiko-bot [this message]
2026-05-01  0:57   ` Jakub Kicinski
2026-05-01  1:07     ` Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 08/11] selftests: drv-net: ncdevmem: add -n flag to skip NIC configuration Bobby Eshleman
2026-04-28 22:42 ` [PATCH net-next 09/11] selftests: drv-net: refactor devmem command builders into lib module Bobby Eshleman
2026-04-29 22:43   ` sashiko-bot
2026-04-28 22:42 ` [PATCH net-next 10/11] selftests: drv-net: add primary_rx_redirect support to NetDrvContEnv Bobby Eshleman
2026-04-29 22:43   ` sashiko-bot
2026-04-28 22:42 ` [PATCH net-next 11/11] selftests: drv-net: add netkit devmem tests Bobby Eshleman
2026-04-29 22:43   ` sashiko-bot
2026-04-29 12:08 ` [PATCH net-next 00/11] net: devmem: support devmem with netkit devices Daniel Borkmann
2026-04-29 15:18   ` Bobby Eshleman
2026-04-29 15:33     ` Daniel Borkmann
2026-05-01  0:59 ` Jakub Kicinski
2026-05-01  1:04   ` Bobby Eshleman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260429224330.C1917C19425@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bobbyeshleman@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox