linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device
@ 2025-07-02 17:24 Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 17:24 UTC (permalink / raw)
  To: almasrymina, asml.silence, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Andrew Lunn,
	Jens Axboe, Saeed Mahameed, Tariq Toukan, Leon Romanovsky
  Cc: Dragos Tatulea, cratiu, netdev, linux-kernel, io-uring,
	linux-rdma

The io_uring and devmem code is assuming that the parent device of the
netdev is a DMA capable device. This is not always the case.

Some devices do have a DMA capable device that can be used, but not as
parent: mlx5 SFs have an auxdev as parent, but they do have an
associated PCI device.

Also, if DMA is not supported the operation should be blocked. Otherwise
the mapping will return success with 0 mapped entries and the caller
will consider the mapping as succesful.

This RFC is supposed to start the discussion on the best way to:
- Block the binding operation early if not supported.
- Allow devices that support this usecase but don't have a
  parent device as a PCI device.

Dragos Tatulea (4):
  net: Allow non parent devices to be used for ZC DMA
  io_uring/zcrx: Use the new netdev_get_dma_dev() API
  net: devmem: Use the new netdev_get_dma_dev() API
  net/mlx5e: Enable HDS zerocopy flows for SFs

 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |  3 +++
 include/linux/netdevice.h                         | 13 +++++++++++++
 io_uring/zcrx.c                                   |  2 +-
 net/core/devmem.c                                 | 10 +++++++++-
 4 files changed, 26 insertions(+), 2 deletions(-)

-- 
2.50.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 17:24 [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device Dragos Tatulea
@ 2025-07-02 17:24 ` Dragos Tatulea
  2025-07-02 18:32   ` Jakub Kicinski
  2025-07-08 11:06   ` Pavel Begunkov
  2025-07-02 17:24 ` [RFC net-next 2/4] io_uring/zcrx: Use the new netdev_get_dma_dev() API Dragos Tatulea
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 17:24 UTC (permalink / raw)
  To: almasrymina, asml.silence, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Dragos Tatulea, Saeed Mahameed, tariqt, cratiu, netdev,
	linux-kernel

For zerocopy (io_uring, devmem), there is an assumption that the
parent device can do DMA. However that is not always the case:
for example mlx5 SF devices have an auxiliary device as a parent.

This patch introduces the possibility for the driver to specify
another DMA device to be used via the new dma_dev field. The field
should be set before register_netdev().

A new helper function is added to get the DMA device or return NULL.
The callers can check for NULL and fail early if the device is
not capable of DMA.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
 include/linux/netdevice.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5847c20994d3..83faa2314c30 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2550,6 +2550,9 @@ struct net_device {
 
 	struct hwtstamp_provider __rcu	*hwprov;
 
+	/* To be set by devices that can do DMA but not via parent. */
+	struct device		*dma_dev;
+
 	u8			priv[] ____cacheline_aligned
 				       __counted_by(priv_len);
 } ____cacheline_aligned;
@@ -5560,4 +5563,14 @@ extern struct net_device *blackhole_netdev;
 		atomic_long_add((VAL), &(DEV)->stats.__##FIELD)
 #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD)
 
+static inline struct device *netdev_get_dma_dev(const struct net_device *dev)
+{
+	struct device *dma_dev = dev->dma_dev ? dev->dma_dev : dev->dev.parent;
+
+	if (!dma_dev->dma_mask)
+		dma_dev = NULL;
+
+	return dma_dev;
+}
+
 #endif	/* _LINUX_NETDEVICE_H */
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC net-next 2/4] io_uring/zcrx: Use the new netdev_get_dma_dev() API
  2025-07-02 17:24 [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
@ 2025-07-02 17:24 ` Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 3/4] net: devmem: " Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 4/4] net/mlx5e: Enable HDS zerocopy flows for SFs Dragos Tatulea
  3 siblings, 0 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 17:24 UTC (permalink / raw)
  To: almasrymina, asml.silence, Jens Axboe
  Cc: Dragos Tatulea, Saeed Mahameed, tariqt, cratiu, io-uring,
	linux-kernel

Using the new DMA dev helper API, there will be an early failure if the
device does not support DMA.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
 io_uring/zcrx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 797247a34cb7..93462e5b2207 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -584,7 +584,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 		goto err;
 	}
 
-	ifq->dev = ifq->netdev->dev.parent;
+	ifq->dev = netdev_get_dma_dev(ifq->netdev);
 	if (!ifq->dev) {
 		ret = -EOPNOTSUPP;
 		goto err;
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC net-next 3/4] net: devmem: Use the new netdev_get_dma_dev() API
  2025-07-02 17:24 [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 2/4] io_uring/zcrx: Use the new netdev_get_dma_dev() API Dragos Tatulea
@ 2025-07-02 17:24 ` Dragos Tatulea
  2025-07-02 17:24 ` [RFC net-next 4/4] net/mlx5e: Enable HDS zerocopy flows for SFs Dragos Tatulea
  3 siblings, 0 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 17:24 UTC (permalink / raw)
  To: almasrymina, asml.silence, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Dragos Tatulea, Saeed Mahameed, tariqt, cratiu, netdev,
	linux-kernel

Using the new DMA dev helper API, there will be an error on
buffer binding if the device does not support DMA.

Previously this went through and was returning success event if the
mappings were not done. Only a warning was printed.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
 net/core/devmem.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index b3a62ca0df65..c6354b47257f 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -183,6 +183,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 {
 	struct net_devmem_dmabuf_binding *binding;
 	static u32 id_alloc_next;
+	struct device *dma_dev;
 	struct scatterlist *sg;
 	struct dma_buf *dmabuf;
 	unsigned int sg_idx, i;
@@ -193,6 +194,13 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 	if (IS_ERR(dmabuf))
 		return ERR_CAST(dmabuf);
 
+	dma_dev = netdev_get_dma_dev(dev);
+	if (!dma_dev) {
+		err = -EOPNOTSUPP;
+		NL_SET_ERR_MSG(extack, "Parent device can't do dma");
+		goto err_put_dmabuf;
+	}
+
 	binding = kzalloc_node(sizeof(*binding), GFP_KERNEL,
 			       dev_to_node(&dev->dev));
 	if (!binding) {
@@ -209,7 +217,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 
 	binding->dmabuf = dmabuf;
 
-	binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent);
+	binding->attachment = dma_buf_attach(binding->dmabuf, dma_dev);
 	if (IS_ERR(binding->attachment)) {
 		err = PTR_ERR(binding->attachment);
 		NL_SET_ERR_MSG(extack, "Failed to bind dmabuf to device");
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC net-next 4/4] net/mlx5e: Enable HDS zerocopy flows for SFs
  2025-07-02 17:24 [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device Dragos Tatulea
                   ` (2 preceding siblings ...)
  2025-07-02 17:24 ` [RFC net-next 3/4] net: devmem: " Dragos Tatulea
@ 2025-07-02 17:24 ` Dragos Tatulea
  3 siblings, 0 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 17:24 UTC (permalink / raw)
  To: almasrymina, asml.silence, Saeed Mahameed, Tariq Toukan,
	Leon Romanovsky, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Dragos Tatulea, cratiu, netdev, linux-rdma, linux-kernel

An SF has an auxiliary device as a parent. This type of device can't be
used for zerocopy DMA mapping operations. A PCI device is required.

Use the new netdev dma_dev functionality to expose the actual PCI device
to be used for DMA. Always set it to keep things generic.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e8e5b347f9b2..c4e45205fba4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5841,6 +5841,9 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
 	/* update XDP supported features */
 	mlx5e_set_xdp_feature(netdev);
 
+	/* Set pci device for dma. Useful for SFs. */
+	netdev->dma_dev = &mdev->pdev->dev;
+
 	if (take_rtnl)
 		rtnl_unlock();
 
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
@ 2025-07-02 18:32   ` Jakub Kicinski
  2025-07-02 20:01     ` Dragos Tatulea
  2025-07-08 11:06   ` Pavel Begunkov
  1 sibling, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2025-07-02 18:32 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: almasrymina, asml.silence, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed, tariqt,
	cratiu, netdev, linux-kernel

On Wed, 2 Jul 2025 20:24:23 +0300 Dragos Tatulea wrote:
> For zerocopy (io_uring, devmem), there is an assumption that the
> parent device can do DMA. However that is not always the case:
> for example mlx5 SF devices have an auxiliary device as a parent.

Noob question -- I thought that the point of SFs was that you can pass
them thru to a VM. How do they not have DMA support? Is it added on
demand by the mediated driver or some such?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 18:32   ` Jakub Kicinski
@ 2025-07-02 20:01     ` Dragos Tatulea
  2025-07-02 20:53       ` Jakub Kicinski
  0 siblings, 1 reply; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-02 20:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: almasrymina, asml.silence, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed, tariqt,
	cratiu, netdev, linux-kernel

On Wed, Jul 02, 2025 at 11:32:08AM -0700, Jakub Kicinski wrote:
> On Wed, 2 Jul 2025 20:24:23 +0300 Dragos Tatulea wrote:
> > For zerocopy (io_uring, devmem), there is an assumption that the
> > parent device can do DMA. However that is not always the case:
> > for example mlx5 SF devices have an auxiliary device as a parent.
> 
> Noob question -- I thought that the point of SFs was that you can pass
> them thru to a VM. How do they not have DMA support? Is it added on
> demand by the mediated driver or some such?
They do have DMA support. Maybe didn't state it properly in the commit
message. It is just that the the parent device
(sf_netdev->dev.parent.device) is not a DMA device. The grandparent
device is a DMA device though (PCI dev of parent PFs). But I wanted to
keep it generic. Maybe it doesn't need to be so generic?

Regarding SFs and VM passtrhough: my understanding is that SFs are more
for passing them to a container.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 20:01     ` Dragos Tatulea
@ 2025-07-02 20:53       ` Jakub Kicinski
  2025-07-03 11:58         ` Parav Pandit
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2025-07-02 20:53 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: almasrymina, asml.silence, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed, tariqt,
	cratiu, netdev, linux-kernel

On Wed, 2 Jul 2025 20:01:48 +0000 Dragos Tatulea wrote:
> On Wed, Jul 02, 2025 at 11:32:08AM -0700, Jakub Kicinski wrote:
> > On Wed, 2 Jul 2025 20:24:23 +0300 Dragos Tatulea wrote:  
> > > For zerocopy (io_uring, devmem), there is an assumption that the
> > > parent device can do DMA. However that is not always the case:
> > > for example mlx5 SF devices have an auxiliary device as a parent.  
> > 
> > Noob question -- I thought that the point of SFs was that you can pass
> > them thru to a VM. How do they not have DMA support? Is it added on
> > demand by the mediated driver or some such?  
> They do have DMA support. Maybe didn't state it properly in the commit
> message. It is just that the the parent device
> (sf_netdev->dev.parent.device) is not a DMA device. The grandparent
> device is a DMA device though (PCI dev of parent PFs). But I wanted to
> keep it generic. Maybe it doesn't need to be so generic?
> 
> Regarding SFs and VM passtrhough: my understanding is that SFs are more
> for passing them to a container.

Mm. We had macvlan offload for over a decade, there's no need for
a fake struct device, auxbus and all them layers to delegate a
"subdevice" to a container in netdev world.
In my head subfunctions are a way of configuring a PCIe PASID ergo
they _only_ make sense in context of DMA.
Maybe someone with closer understanding can chime in. If the kind
of subfunctions you describe are expected, and there's a generic 
way of recognizing them -- automatically going to parent of parent
would indeed be cleaner and less error prone, as you suggest.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 20:53       ` Jakub Kicinski
@ 2025-07-03 11:58         ` Parav Pandit
  2025-07-04 13:11           ` Dragos Tatulea
  2025-07-10 23:58           ` Jakub Kicinski
  0 siblings, 2 replies; 24+ messages in thread
From: Parav Pandit @ 2025-07-03 11:58 UTC (permalink / raw)
  To: Jakub Kicinski, Dragos Tatulea
  Cc: almasrymina@google.com, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org


> From: Jakub Kicinski <kuba@kernel.org>
> Sent: 03 July 2025 02:23 AM
> 
> On Wed, 2 Jul 2025 20:01:48 +0000 Dragos Tatulea wrote:
> > On Wed, Jul 02, 2025 at 11:32:08AM -0700, Jakub Kicinski wrote:
> > > On Wed, 2 Jul 2025 20:24:23 +0300 Dragos Tatulea wrote:
> > > > For zerocopy (io_uring, devmem), there is an assumption that the
> > > > parent device can do DMA. However that is not always the case:
> > > > for example mlx5 SF devices have an auxiliary device as a parent.
> > >
> > > Noob question -- I thought that the point of SFs was that you can
> > > pass them thru to a VM. How do they not have DMA support? Is it
> > > added on demand by the mediated driver or some such?
> > They do have DMA support. Maybe didn't state it properly in the commit
> > message. It is just that the the parent device
> > (sf_netdev->dev.parent.device) is not a DMA device. The grandparent
> > device is a DMA device though (PCI dev of parent PFs). But I wanted to
> > keep it generic. Maybe it doesn't need to be so generic?
> >
> > Regarding SFs and VM passtrhough: my understanding is that SFs are
> > more for passing them to a container.
> 
> Mm. We had macvlan offload for over a decade, there's no need for a fake
> struct device, auxbus and all them layers to delegate a "subdevice" to a
> container in netdev world.

SFs are full PCI devices except having unique PCI BDF as they utilize the parent PCI
Device's BDF (RID).
Presently, SFs are used with and without containers when users need
hw based netdevs.
Some CSPs use them as hot-plug devices from the DPU side too.

Unlike macvlan,
SF netdevs have dedicated hw queues, switchdev representors,
mtu, qdiscs, QoS rate limiters.
vdpa of SFs is prominent use too to offload virtio queues.
And some are using SFs rdma devices too.

SFs are the pre-SIOV_R2 devices and hence reliance of auxiliary bus
and utilizing core driver infrastructure sort of aligns to the kernel core.
If I recollect correctly, the Intel ICE SFs are exactly similar.

> In my head subfunctions are a way of configuring a PCIe PASID ergo they
> _only_ make sense in context of DMA.
SF DMA is on the parent PCI device.

SIOV_R2 will have its own PCI RID which is ratified or getting ratified.
When its done, SF (as SIOV_R2 device) instantiation can be extended
with its own PCI RID. At that point they can be mapped to a VM.

> Maybe someone with closer understanding can chime in. If the kind of
> subfunctions you describe are expected, and there's a generic way of
> recognizing them -- automatically going to parent of parent would indeed be
> cleaner and less error prone, as you suggest.

I am not sure when the parent of parent assumption would fail, but can be
a good start.

If netdev 8 bytes extension to store dma_dev is concern,
probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
So that there is no guess work in devmem layer.

That said, my understanding of devmem is limited, so I could be mistaken here.

In the long term, the devmem infrastructure likely needs to be
modernized to support queue-level DMA mapping.
This is useful because drivers like mlx5 already support
socket-direct netdev that span across two PCI devices.

Currently, devmem is limited to a single PCI device per netdev.
While the buffer pool could be per device, the actual DMA
mapping might need to be deferred until buffer posting
time to support such multi-device scenarios.

In an offline discussion, Dragos mentioned that io_uring already
operates at the queue level, may be some ideas can be picked up
from io_uring?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-03 11:58         ` Parav Pandit
@ 2025-07-04 13:11           ` Dragos Tatulea
  2025-07-07 18:44             ` Mina Almasry
  2025-07-08 11:08             ` Pavel Begunkov
  2025-07-10 23:58           ` Jakub Kicinski
  1 sibling, 2 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-04 13:11 UTC (permalink / raw)
  To: Parav Pandit, Jakub Kicinski
  Cc: almasrymina@google.com, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> 
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: 03 July 2025 02:23 AM
> > 
[...]
> > Maybe someone with closer understanding can chime in. If the kind of
> > subfunctions you describe are expected, and there's a generic way of
> > recognizing them -- automatically going to parent of parent would indeed be
> > cleaner and less error prone, as you suggest.
> 
> I am not sure when the parent of parent assumption would fail, but can be
> a good start.
> 
> If netdev 8 bytes extension to store dma_dev is concern,
> probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
> So that there is no guess work in devmem layer.
> 
> That said, my understanding of devmem is limited, so I could be mistaken here.
> 
> In the long term, the devmem infrastructure likely needs to be
> modernized to support queue-level DMA mapping.
> This is useful because drivers like mlx5 already support
> socket-direct netdev that span across two PCI devices.
> 
> Currently, devmem is limited to a single PCI device per netdev.
> While the buffer pool could be per device, the actual DMA
> mapping might need to be deferred until buffer posting
> time to support such multi-device scenarios.
> 
> In an offline discussion, Dragos mentioned that io_uring already
> operates at the queue level, may be some ideas can be picked up
> from io_uring?
The problem for devmem is that the device based API is already set in
stone so not sure how we can change this. Maybe Mina can chime in.

To sum the conversation up, there are 2 imperfect and overlapping
solutions:

1) For the common case of having a single PCI device per netdev, going one
   parent up if the parent device is not DMA capable would be a good
   starting point.

2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
   as it provides the right PF device for the given queue. io_uring
   could use this but devmem can't. Devmem could use 1. but the
   driver has to detect and block the multi PF case.

I think we need both. Either that or a netdev op with an optional queue
parameter. Any thoughts?

[0] https://docs.kernel.org/networking/multi-pf-netdev.html

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-04 13:11           ` Dragos Tatulea
@ 2025-07-07 18:44             ` Mina Almasry
  2025-07-07 21:35               ` Dragos Tatulea
  2025-07-08 11:08             ` Pavel Begunkov
  1 sibling, 1 reply; 24+ messages in thread
From: Mina Almasry @ 2025-07-07 18:44 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Parav Pandit, Jakub Kicinski, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, Jul 4, 2025 at 6:11 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> >
> > > From: Jakub Kicinski <kuba@kernel.org>
> > > Sent: 03 July 2025 02:23 AM
> > >
> [...]
> > > Maybe someone with closer understanding can chime in. If the kind of
> > > subfunctions you describe are expected, and there's a generic way of
> > > recognizing them -- automatically going to parent of parent would indeed be
> > > cleaner and less error prone, as you suggest.
> >
> > I am not sure when the parent of parent assumption would fail, but can be
> > a good start.
> >
> > If netdev 8 bytes extension to store dma_dev is concern,
> > probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
> > So that there is no guess work in devmem layer.
> >
> > That said, my understanding of devmem is limited, so I could be mistaken here.
> >
> > In the long term, the devmem infrastructure likely needs to be
> > modernized to support queue-level DMA mapping.
> > This is useful because drivers like mlx5 already support
> > socket-direct netdev that span across two PCI devices.
> >
> > Currently, devmem is limited to a single PCI device per netdev.
> > While the buffer pool could be per device, the actual DMA
> > mapping might need to be deferred until buffer posting
> > time to support such multi-device scenarios.
> >
> > In an offline discussion, Dragos mentioned that io_uring already
> > operates at the queue level, may be some ideas can be picked up
> > from io_uring?
> The problem for devmem is that the device based API is already set in
> stone so not sure how we can change this. Maybe Mina can chime in.
>

I think what's being discussed here is pretty straight forward and
doesn't need UAPI changes, right? Or were you referring to another
API?

> To sum the conversation up, there are 2 imperfect and overlapping
> solutions:
>
> 1) For the common case of having a single PCI device per netdev, going one
>    parent up if the parent device is not DMA capable would be a good
>    starting point.
>
> 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
>    as it provides the right PF device for the given queue.

Agreed these are the 2 options.

> io_uring
>    could use this but devmem can't. Devmem could use 1. but the
>    driver has to detect and block the multi PF case.
>

Why? AFAICT both io_uring and devmem are in the exact same boat right
now, and your patchset seems to show that? Both use dev->dev.parent as
the mapping device, and AFAIU you want to use dev->dev.parent.parent
or something like that?

Also AFAIU the driver won't need to block the multi PF case, it's
actually core that would need to handle that. For example, if devmem
wants to bind a dmabuf to 4 queues, but queues 0 & 1 use 1 dma device,
but queues 2 & 3 use another dma-device, then core doesn't know what
to do, because it can't map the dmabuf to both devices at once. The
restriction would be at bind time that all the queues being bound to
have the same dma device. Core would need to check that and return an
error if the devices diverge. I imagine all of this is the same for
io_uring, unless I'm missing something.

> I think we need both. Either that or a netdev op with an optional queue
> parameter. Any thoughts?
>

At the moment, from your description of the problem, I would lean to
going with Jakub's approach and handling the common case via #1. If
more use cases that require a very custom dma device to be passed we
can always move to #2 later, but FWIW I don't see a reason to come up
with a super future proof complicated solution right now, but I'm
happy to hear disagreements.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-07 18:44             ` Mina Almasry
@ 2025-07-07 21:35               ` Dragos Tatulea
  2025-07-07 21:55                 ` Mina Almasry
  0 siblings, 1 reply; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-07 21:35 UTC (permalink / raw)
  To: Mina Almasry
  Cc: Parav Pandit, Jakub Kicinski, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Mon, Jul 07, 2025 at 11:44:19AM -0700, Mina Almasry wrote:
> On Fri, Jul 4, 2025 at 6:11 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> >
> > On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> > >
> > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > Sent: 03 July 2025 02:23 AM
> > > >
> > [...]
> > > > Maybe someone with closer understanding can chime in. If the kind of
> > > > subfunctions you describe are expected, and there's a generic way of
> > > > recognizing them -- automatically going to parent of parent would indeed be
> > > > cleaner and less error prone, as you suggest.
> > >
> > > I am not sure when the parent of parent assumption would fail, but can be
> > > a good start.
> > >
> > > If netdev 8 bytes extension to store dma_dev is concern,
> > > probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
> > > So that there is no guess work in devmem layer.
> > >
> > > That said, my understanding of devmem is limited, so I could be mistaken here.
> > >
> > > In the long term, the devmem infrastructure likely needs to be
> > > modernized to support queue-level DMA mapping.
> > > This is useful because drivers like mlx5 already support
> > > socket-direct netdev that span across two PCI devices.
> > >
> > > Currently, devmem is limited to a single PCI device per netdev.
> > > While the buffer pool could be per device, the actual DMA
> > > mapping might need to be deferred until buffer posting
> > > time to support such multi-device scenarios.
> > >
> > > In an offline discussion, Dragos mentioned that io_uring already
> > > operates at the queue level, may be some ideas can be picked up
> > > from io_uring?
> > The problem for devmem is that the device based API is already set in
> > stone so not sure how we can change this. Maybe Mina can chime in.
> >
> 
> I think what's being discussed here is pretty straight forward and
> doesn't need UAPI changes, right? Or were you referring to another
> API?
>
I was referring to the fact that devmem takes one big buffer, maps it
for a single device (in net_devmem_bind_dmabuf()) and then assigns it to
queues in net_devmem_bind_dmabuf_to_queue(). As the single buffer is
part of the API, I don't see how the mapping could be done in a per
queue way.

> > To sum the conversation up, there are 2 imperfect and overlapping
> > solutions:
> >
> > 1) For the common case of having a single PCI device per netdev, going one
> >    parent up if the parent device is not DMA capable would be a good
> >    starting point.
> >
> > 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
> >    as it provides the right PF device for the given queue.
> 
> Agreed these are the 2 options.
> 
> > io_uring
> >    could use this but devmem can't. Devmem could use 1. but the
> >    driver has to detect and block the multi PF case.
> >
> 
> Why? AFAICT both io_uring and devmem are in the exact same boat right
> now, and your patchset seems to show that? Both use dev->dev.parent as
> the mapping device, and AFAIU you want to use dev->dev.parent.parent
> or something like that?
> 
Right. My patches show that. But the issue raised by Parav is different:
different queues can belong to different DMA devices from different
PFs in the case of Multi PF netdev.

io_uring can do it because it maps individual buffers to individual
queues. So it would be trivial to get the DMA device of each queue through
a new queue op.

> Also AFAIU the driver won't need to block the multi PF case, it's
> actually core that would need to handle that. For example, if devmem
> wants to bind a dmabuf to 4 queues, but queues 0 & 1 use 1 dma device,
> but queues 2 & 3 use another dma-device, then core doesn't know what
> to do, because it can't map the dmabuf to both devices at once. The
> restriction would be at bind time that all the queues being bound to
> have the same dma device. Core would need to check that and return an
> error if the devices diverge. I imagine all of this is the same for
> io_uring, unless I'm missing something.
>
Agreed. Currently I didn't see an API for Multi PF netdev to expose
this information so my thinking defaulted to "let's block it from the
driver side".

> > I think we need both. Either that or a netdev op with an optional queue
> > parameter. Any thoughts?
> >
> 
> At the moment, from your description of the problem, I would lean to
> going with Jakub's approach and handling the common case via #1. If
> more use cases that require a very custom dma device to be passed we
> can always move to #2 later, but FWIW I don't see a reason to come up
> with a super future proof complicated solution right now, but I'm
> happy to hear disagreements.
But we also don't want to start off on the left foot when we know of
both issues right now. And I think we can wrap it up nicely in a single
function similary to how the current patch does it.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-07 21:35               ` Dragos Tatulea
@ 2025-07-07 21:55                 ` Mina Almasry
  2025-07-08  8:52                   ` Parav Pandit
                                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Mina Almasry @ 2025-07-07 21:55 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Parav Pandit, Jakub Kicinski, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Mon, Jul 7, 2025 at 2:35 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> On Mon, Jul 07, 2025 at 11:44:19AM -0700, Mina Almasry wrote:
> > On Fri, Jul 4, 2025 at 6:11 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > >
> > > On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> > > >
> > > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > > Sent: 03 July 2025 02:23 AM
> > > > >
> > > [...]
> > > > > Maybe someone with closer understanding can chime in. If the kind of
> > > > > subfunctions you describe are expected, and there's a generic way of
> > > > > recognizing them -- automatically going to parent of parent would indeed be
> > > > > cleaner and less error prone, as you suggest.
> > > >
> > > > I am not sure when the parent of parent assumption would fail, but can be
> > > > a good start.
> > > >
> > > > If netdev 8 bytes extension to store dma_dev is concern,
> > > > probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
> > > > So that there is no guess work in devmem layer.
> > > >
> > > > That said, my understanding of devmem is limited, so I could be mistaken here.
> > > >
> > > > In the long term, the devmem infrastructure likely needs to be
> > > > modernized to support queue-level DMA mapping.
> > > > This is useful because drivers like mlx5 already support
> > > > socket-direct netdev that span across two PCI devices.
> > > >
> > > > Currently, devmem is limited to a single PCI device per netdev.
> > > > While the buffer pool could be per device, the actual DMA
> > > > mapping might need to be deferred until buffer posting
> > > > time to support such multi-device scenarios.
> > > >
> > > > In an offline discussion, Dragos mentioned that io_uring already
> > > > operates at the queue level, may be some ideas can be picked up
> > > > from io_uring?
> > > The problem for devmem is that the device based API is already set in
> > > stone so not sure how we can change this. Maybe Mina can chime in.
> > >
> >
> > I think what's being discussed here is pretty straight forward and
> > doesn't need UAPI changes, right? Or were you referring to another
> > API?
> >
> I was referring to the fact that devmem takes one big buffer, maps it
> for a single device (in net_devmem_bind_dmabuf()) and then assigns it to
> queues in net_devmem_bind_dmabuf_to_queue(). As the single buffer is
> part of the API, I don't see how the mapping could be done in a per
> queue way.
>

Oh, I see. devmem does support mapping a single buffer to multiple
queues in a single netlink API call, but there is nothing stopping the
user from mapping N buffers to N queues in N netlink API calls.

> > > To sum the conversation up, there are 2 imperfect and overlapping
> > > solutions:
> > >
> > > 1) For the common case of having a single PCI device per netdev, going one
> > >    parent up if the parent device is not DMA capable would be a good
> > >    starting point.
> > >
> > > 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
> > >    as it provides the right PF device for the given queue.
> >
> > Agreed these are the 2 options.
> >
> > > io_uring
> > >    could use this but devmem can't. Devmem could use 1. but the
> > >    driver has to detect and block the multi PF case.
> > >
> >
> > Why? AFAICT both io_uring and devmem are in the exact same boat right
> > now, and your patchset seems to show that? Both use dev->dev.parent as
> > the mapping device, and AFAIU you want to use dev->dev.parent.parent
> > or something like that?
> >
> Right. My patches show that. But the issue raised by Parav is different:
> different queues can belong to different DMA devices from different
> PFs in the case of Multi PF netdev.
>
> io_uring can do it because it maps individual buffers to individual
> queues. So it would be trivial to get the DMA device of each queue through
> a new queue op.
>

Right, devmem doesn't stop you from mapping individual buffers to
individual queues. It just also supports mapping the same buffer to
multiple queues. AFAIR, io_uring also supports mapping a single buffer
to multiple queues, but I could easily be very wrong about that. It's
just a vague recollection from reviewing the iozcrx.c implementation a
while back.

In your case, I think, if the user is trying to map a single buffer to
multiple queues, and those queues have different dma-devices, then you
have to error out. I don't see how to sanely handle that without
adding a lot of code. The user would have to fall back onto mapping a
single buffer to a single queue (or multiple queues that share the
same dma-device).

> > Also AFAIU the driver won't need to block the multi PF case, it's
> > actually core that would need to handle that. For example, if devmem
> > wants to bind a dmabuf to 4 queues, but queues 0 & 1 use 1 dma device,
> > but queues 2 & 3 use another dma-device, then core doesn't know what
> > to do, because it can't map the dmabuf to both devices at once. The
> > restriction would be at bind time that all the queues being bound to
> > have the same dma device. Core would need to check that and return an
> > error if the devices diverge. I imagine all of this is the same for
> > io_uring, unless I'm missing something.
> >
> Agreed. Currently I didn't see an API for Multi PF netdev to expose
> this information so my thinking defaulted to "let's block it from the
> driver side".
>

Agreed.

> > > I think we need both. Either that or a netdev op with an optional queue
> > > parameter. Any thoughts?
> > >
> >
> > At the moment, from your description of the problem, I would lean to
> > going with Jakub's approach and handling the common case via #1. If
> > more use cases that require a very custom dma device to be passed we
> > can always move to #2 later, but FWIW I don't see a reason to come up
> > with a super future proof complicated solution right now, but I'm
> > happy to hear disagreements.
> But we also don't want to start off on the left foot when we know of
> both issues right now. And I think we can wrap it up nicely in a single
> function similary to how the current patch does it.
>

FWIW I don't have a strong preference. I'm fine with the simple
solution for now and I'm fine with the slightly more complicated
future proof solution.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-07 21:55                 ` Mina Almasry
@ 2025-07-08  8:52                   ` Parav Pandit
  2025-07-08 10:47                   ` Pavel Begunkov
  2025-07-08 14:23                   ` Dragos Tatulea
  2 siblings, 0 replies; 24+ messages in thread
From: Parav Pandit @ 2025-07-08  8:52 UTC (permalink / raw)
  To: Mina Almasry, Dragos Tatulea
  Cc: Jakub Kicinski, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org


> From: Mina Almasry <almasrymina@google.com>
> Sent: 08 July 2025 03:25 AM
> 
> On Mon, Jul 7, 2025 at 2:35 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> >
> > On Mon, Jul 07, 2025 at 11:44:19AM -0700, Mina Almasry wrote:
> > > On Fri, Jul 4, 2025 at 6:11 AM Dragos Tatulea <dtatulea@nvidia.com>
> wrote:
> > > >
> > > > On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> > > > >
> > > > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > > > Sent: 03 July 2025 02:23 AM
> > > > > >
> > > > [...]
> > > > > > Maybe someone with closer understanding can chime in. If the
> > > > > > kind of subfunctions you describe are expected, and there's a
> > > > > > generic way of recognizing them -- automatically going to
> > > > > > parent of parent would indeed be cleaner and less error prone, as you
> suggest.
> > > > >
> > > > > I am not sure when the parent of parent assumption would fail,
> > > > > but can be a good start.
> > > > >
> > > > > If netdev 8 bytes extension to store dma_dev is concern,
> > > > > probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer
> parent->parent?
> > > > > So that there is no guess work in devmem layer.
> > > > >
> > > > > That said, my understanding of devmem is limited, so I could be
> mistaken here.
> > > > >
> > > > > In the long term, the devmem infrastructure likely needs to be
> > > > > modernized to support queue-level DMA mapping.
> > > > > This is useful because drivers like mlx5 already support
> > > > > socket-direct netdev that span across two PCI devices.
> > > > >
> > > > > Currently, devmem is limited to a single PCI device per netdev.
> > > > > While the buffer pool could be per device, the actual DMA
> > > > > mapping might need to be deferred until buffer posting time to
> > > > > support such multi-device scenarios.
> > > > >
> > > > > In an offline discussion, Dragos mentioned that io_uring already
> > > > > operates at the queue level, may be some ideas can be picked up
> > > > > from io_uring?
> > > > The problem for devmem is that the device based API is already set
> > > > in stone so not sure how we can change this. Maybe Mina can chime in.
> > > >
> > >
> > > I think what's being discussed here is pretty straight forward and
> > > doesn't need UAPI changes, right? Or were you referring to another
> > > API?
> > >
> > I was referring to the fact that devmem takes one big buffer, maps it
> > for a single device (in net_devmem_bind_dmabuf()) and then assigns it
> > to queues in net_devmem_bind_dmabuf_to_queue(). As the single buffer
> > is part of the API, I don't see how the mapping could be done in a per
> > queue way.
> >
> 
> Oh, I see. devmem does support mapping a single buffer to multiple queues in a
> single netlink API call, but there is nothing stopping the user from mapping N
> buffers to N queues in N netlink API calls.
> 
> > > > To sum the conversation up, there are 2 imperfect and overlapping
> > > > solutions:
> > > >
> > > > 1) For the common case of having a single PCI device per netdev, going
> one
> > > >    parent up if the parent device is not DMA capable would be a good
> > > >    starting point.
> > > >
> > > > 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
> > > >    as it provides the right PF device for the given queue.
> > >
> > > Agreed these are the 2 options.
> > >
> > > > io_uring
> > > >    could use this but devmem can't. Devmem could use 1. but the
> > > >    driver has to detect and block the multi PF case.
> > > >
> > >
> > > Why? AFAICT both io_uring and devmem are in the exact same boat
> > > right now, and your patchset seems to show that? Both use
> > > dev->dev.parent as the mapping device, and AFAIU you want to use
> > > dev->dev.parent.parent or something like that?
> > >
> > Right. My patches show that. But the issue raised by Parav is different:
> > different queues can belong to different DMA devices from different
> > PFs in the case of Multi PF netdev.
> >
> > io_uring can do it because it maps individual buffers to individual
> > queues. So it would be trivial to get the DMA device of each queue
> > through a new queue op.
> >
> 
> Right, devmem doesn't stop you from mapping individual buffers to individual
> queues. It just also supports mapping the same buffer to multiple queues.
> AFAIR, io_uring also supports mapping a single buffer to multiple queues, but I
> could easily be very wrong about that. It's just a vague recollection from
> reviewing the iozcrx.c implementation a while back.
> 
> In your case, I think, if the user is trying to map a single buffer to multiple
> queues, and those queues have different dma-devices, then you have to error
> out. I don't see how to sanely handle that without adding a lot of code. The user
> would have to fall back onto mapping a single buffer to a single queue (or
> multiple queues that share the same dma-device).
> 
> > > Also AFAIU the driver won't need to block the multi PF case, it's
> > > actually core that would need to handle that. For example, if devmem
> > > wants to bind a dmabuf to 4 queues, but queues 0 & 1 use 1 dma
> > > device, but queues 2 & 3 use another dma-device, then core doesn't
> > > know what to do, because it can't map the dmabuf to both devices at
> > > once. The restriction would be at bind time that all the queues
> > > being bound to have the same dma device. Core would need to check
> > > that and return an error if the devices diverge. I imagine all of
> > > this is the same for io_uring, unless I'm missing something.
> > >
> > Agreed. Currently I didn't see an API for Multi PF netdev to expose
> > this information so my thinking defaulted to "let's block it from the
> > driver side".
> >
> 
> Agreed.
> 
> > > > I think we need both. Either that or a netdev op with an optional
> > > > queue parameter. Any thoughts?
> > > >
> > >
> > > At the moment, from your description of the problem, I would lean to
> > > going with Jakub's approach and handling the common case via #1. If
> > > more use cases that require a very custom dma device to be passed we
> > > can always move to #2 later, but FWIW I don't see a reason to come
> > > up with a super future proof complicated solution right now, but I'm
> > > happy to hear disagreements.
> > But we also don't want to start off on the left foot when we know of
> > both issues right now. And I think we can wrap it up nicely in a
> > single function similary to how the current patch does it.
> >
> 
> FWIW I don't have a strong preference. I'm fine with the simple solution for now
> and I'm fine with the slightly more complicated future proof solution.
> 
Looks good to me as well.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-07 21:55                 ` Mina Almasry
  2025-07-08  8:52                   ` Parav Pandit
@ 2025-07-08 10:47                   ` Pavel Begunkov
  2025-07-08 14:23                   ` Dragos Tatulea
  2 siblings, 0 replies; 24+ messages in thread
From: Pavel Begunkov @ 2025-07-08 10:47 UTC (permalink / raw)
  To: Mina Almasry, Dragos Tatulea
  Cc: Parav Pandit, Jakub Kicinski, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed,
	Tariq Toukan, Cosmin Ratiu, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 7/7/25 22:55, Mina Almasry wrote:
> On Mon, Jul 7, 2025 at 2:35 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
...>> Right. My patches show that. But the issue raised by Parav is different:
>> different queues can belong to different DMA devices from different
>> PFs in the case of Multi PF netdev.
>>
>> io_uring can do it because it maps individual buffers to individual
>> queues. So it would be trivial to get the DMA device of each queue through
>> a new queue op.
>>
> 
> Right, devmem doesn't stop you from mapping individual buffers to
> individual queues. It just also supports mapping the same buffer to
> multiple queues. AFAIR, io_uring also supports mapping a single buffer
> to multiple queues, but I could easily be very wrong about that. It's

It doesn't, but it could benefit from sharing depending on userspace,
so it might eventually come to the same problem.

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
  2025-07-02 18:32   ` Jakub Kicinski
@ 2025-07-08 11:06   ` Pavel Begunkov
  2025-07-08 14:10     ` Mina Almasry
  1 sibling, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2025-07-08 11:06 UTC (permalink / raw)
  To: Dragos Tatulea, almasrymina, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Saeed Mahameed, tariqt, cratiu, netdev, linux-kernel

On 7/2/25 18:24, Dragos Tatulea wrote:
> For zerocopy (io_uring, devmem), there is an assumption that the
> parent device can do DMA. However that is not always the case:
> for example mlx5 SF devices have an auxiliary device as a parent.
> 
> This patch introduces the possibility for the driver to specify
> another DMA device to be used via the new dma_dev field. The field
> should be set before register_netdev().
> 
> A new helper function is added to get the DMA device or return NULL.
> The callers can check for NULL and fail early if the device is
> not capable of DMA.
> 
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> ---
>   include/linux/netdevice.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 5847c20994d3..83faa2314c30 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2550,6 +2550,9 @@ struct net_device {
>   
>   	struct hwtstamp_provider __rcu	*hwprov;
>   
> +	/* To be set by devices that can do DMA but not via parent. */
> +	struct device		*dma_dev;
> +
>   	u8			priv[] ____cacheline_aligned
>   				       __counted_by(priv_len);
>   } ____cacheline_aligned;
> @@ -5560,4 +5563,14 @@ extern struct net_device *blackhole_netdev;
>   		atomic_long_add((VAL), &(DEV)->stats.__##FIELD)
>   #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD)
>   
> +static inline struct device *netdev_get_dma_dev(const struct net_device *dev)
> +{
> +	struct device *dma_dev = dev->dma_dev ? dev->dma_dev : dev->dev.parent;
> +
> +	if (!dma_dev->dma_mask)

dev->dev.parent is NULL for veth and I assume other virtual devices as well.

Mina, can you verify that devmem checks that? Seems like veth is rejected
by netdev_need_ops_lock() in netdev_nl_bind_rx_doit(), but IIRC per netdev
locking came after devmem got merged, and there are other virt devices that
might already be converted.

> +		dma_dev = NULL;
> +
> +	return dma_dev;
> +}
> +
>   #endif	/* _LINUX_NETDEVICE_H */

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-04 13:11           ` Dragos Tatulea
  2025-07-07 18:44             ` Mina Almasry
@ 2025-07-08 11:08             ` Pavel Begunkov
  2025-07-08 14:26               ` Dragos Tatulea
  1 sibling, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2025-07-08 11:08 UTC (permalink / raw)
  To: Dragos Tatulea, Parav Pandit, Jakub Kicinski
  Cc: almasrymina@google.com, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed,
	Tariq Toukan, Cosmin Ratiu, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 7/4/25 14:11, Dragos Tatulea wrote:
> On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
>>
>>> From: Jakub Kicinski <kuba@kernel.org>
>>> Sent: 03 July 2025 02:23 AM
...>> In an offline discussion, Dragos mentioned that io_uring already
>> operates at the queue level, may be some ideas can be picked up
>> from io_uring?
> The problem for devmem is that the device based API is already set in
> stone so not sure how we can change this. Maybe Mina can chime in.
> 
> To sum the conversation up, there are 2 imperfect and overlapping
> solutions:
> 
> 1) For the common case of having a single PCI device per netdev, going one
>     parent up if the parent device is not DMA capable would be a good
>     starting point.
> 
> 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
>     as it provides the right PF device for the given queue. io_uring
>     could use this but devmem can't. Devmem could use 1. but the
>     driver has to detect and block the multi PF case.
> 
> I think we need both. Either that or a netdev op with an optional queue
> parameter. Any thoughts?

No objection from zcrx for either approach, but it sounds like a good
idea to have something simple for 1) sooner than later, and perhaps
marked as a fix.

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-08 11:06   ` Pavel Begunkov
@ 2025-07-08 14:10     ` Mina Almasry
  2025-07-08 15:25       ` Pavel Begunkov
  0 siblings, 1 reply; 24+ messages in thread
From: Mina Almasry @ 2025-07-08 14:10 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: Dragos Tatulea, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Saeed Mahameed, tariqt,
	cratiu, netdev, linux-kernel

On Tue, Jul 8, 2025 at 4:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 7/2/25 18:24, Dragos Tatulea wrote:
> > For zerocopy (io_uring, devmem), there is an assumption that the
> > parent device can do DMA. However that is not always the case:
> > for example mlx5 SF devices have an auxiliary device as a parent.
> >
> > This patch introduces the possibility for the driver to specify
> > another DMA device to be used via the new dma_dev field. The field
> > should be set before register_netdev().
> >
> > A new helper function is added to get the DMA device or return NULL.
> > The callers can check for NULL and fail early if the device is
> > not capable of DMA.
> >
> > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> > ---
> >   include/linux/netdevice.h | 13 +++++++++++++
> >   1 file changed, 13 insertions(+)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 5847c20994d3..83faa2314c30 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2550,6 +2550,9 @@ struct net_device {
> >
> >       struct hwtstamp_provider __rcu  *hwprov;
> >
> > +     /* To be set by devices that can do DMA but not via parent. */
> > +     struct device           *dma_dev;
> > +
> >       u8                      priv[] ____cacheline_aligned
> >                                      __counted_by(priv_len);
> >   } ____cacheline_aligned;
> > @@ -5560,4 +5563,14 @@ extern struct net_device *blackhole_netdev;
> >               atomic_long_add((VAL), &(DEV)->stats.__##FIELD)
> >   #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD)
> >
> > +static inline struct device *netdev_get_dma_dev(const struct net_device *dev)
> > +{
> > +     struct device *dma_dev = dev->dma_dev ? dev->dma_dev : dev->dev.parent;
> > +
> > +     if (!dma_dev->dma_mask)
>
> dev->dev.parent is NULL for veth and I assume other virtual devices as well.
>
> Mina, can you verify that devmem checks that? Seems like veth is rejected
> by netdev_need_ops_lock() in netdev_nl_bind_rx_doit(), but IIRC per netdev
> locking came after devmem got merged, and there are other virt devices that
> might already be converted.
>

We never attempt devmem binding on any devices that don't support the
queue API, even before the per netdev locking was merged (there was an
explicit ops check).

even then, dev->dev.parent == NULL isn't disasterous, as far as I
could surmise from a quick look. Seems to be only used with
dma_buf_attach which NULL checks it.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-07 21:55                 ` Mina Almasry
  2025-07-08  8:52                   ` Parav Pandit
  2025-07-08 10:47                   ` Pavel Begunkov
@ 2025-07-08 14:23                   ` Dragos Tatulea
  2 siblings, 0 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-08 14:23 UTC (permalink / raw)
  To: Mina Almasry
  Cc: Parav Pandit, Jakub Kicinski, asml.silence@gmail.com, Andrew Lunn,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Mon, Jul 07, 2025 at 02:55:11PM -0700, Mina Almasry wrote:
> On Mon, Jul 7, 2025 at 2:35 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> >
> > On Mon, Jul 07, 2025 at 11:44:19AM -0700, Mina Almasry wrote:
> > > On Fri, Jul 4, 2025 at 6:11 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> > > >
> > > > On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> > > > >
> > > > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > > > Sent: 03 July 2025 02:23 AM
> > > > > >
> > > > [...]
> > > > > > Maybe someone with closer understanding can chime in. If the kind of
> > > > > > subfunctions you describe are expected, and there's a generic way of
> > > > > > recognizing them -- automatically going to parent of parent would indeed be
> > > > > > cleaner and less error prone, as you suggest.
> > > > >
> > > > > I am not sure when the parent of parent assumption would fail, but can be
> > > > > a good start.
> > > > >
> > > > > If netdev 8 bytes extension to store dma_dev is concern,
> > > > > probably a netdev IFF_DMA_DEV_PARENT can be elegant to refer parent->parent?
> > > > > So that there is no guess work in devmem layer.
> > > > >
> > > > > That said, my understanding of devmem is limited, so I could be mistaken here.
> > > > >
> > > > > In the long term, the devmem infrastructure likely needs to be
> > > > > modernized to support queue-level DMA mapping.
> > > > > This is useful because drivers like mlx5 already support
> > > > > socket-direct netdev that span across two PCI devices.
> > > > >
> > > > > Currently, devmem is limited to a single PCI device per netdev.
> > > > > While the buffer pool could be per device, the actual DMA
> > > > > mapping might need to be deferred until buffer posting
> > > > > time to support such multi-device scenarios.
> > > > >
> > > > > In an offline discussion, Dragos mentioned that io_uring already
> > > > > operates at the queue level, may be some ideas can be picked up
> > > > > from io_uring?
> > > > The problem for devmem is that the device based API is already set in
> > > > stone so not sure how we can change this. Maybe Mina can chime in.
> > > >
> > >
> > > I think what's being discussed here is pretty straight forward and
> > > doesn't need UAPI changes, right? Or were you referring to another
> > > API?
> > >
> > I was referring to the fact that devmem takes one big buffer, maps it
> > for a single device (in net_devmem_bind_dmabuf()) and then assigns it to
> > queues in net_devmem_bind_dmabuf_to_queue(). As the single buffer is
> > part of the API, I don't see how the mapping could be done in a per
> > queue way.
> >
> 
> Oh, I see. devmem does support mapping a single buffer to multiple
> queues in a single netlink API call, but there is nothing stopping the
> user from mapping N buffers to N queues in N netlink API calls.
Oh, yes, of course. Why didn't I think of that...

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-08 11:08             ` Pavel Begunkov
@ 2025-07-08 14:26               ` Dragos Tatulea
  0 siblings, 0 replies; 24+ messages in thread
From: Dragos Tatulea @ 2025-07-08 14:26 UTC (permalink / raw)
  To: Pavel Begunkov, Parav Pandit, Jakub Kicinski
  Cc: almasrymina@google.com, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, Saeed Mahameed,
	Tariq Toukan, Cosmin Ratiu, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, Jul 08, 2025 at 12:08:48PM +0100, Pavel Begunkov wrote:
> On 7/4/25 14:11, Dragos Tatulea wrote:
> > On Thu, Jul 03, 2025 at 01:58:50PM +0200, Parav Pandit wrote:
> > > 
> > > > From: Jakub Kicinski <kuba@kernel.org>
> > > > Sent: 03 July 2025 02:23 AM
> ...>> In an offline discussion, Dragos mentioned that io_uring already
> > > operates at the queue level, may be some ideas can be picked up
> > > from io_uring?
> > The problem for devmem is that the device based API is already set in
> > stone so not sure how we can change this. Maybe Mina can chime in.
> > 
> > To sum the conversation up, there are 2 imperfect and overlapping
> > solutions:
> > 
> > 1) For the common case of having a single PCI device per netdev, going one
> >     parent up if the parent device is not DMA capable would be a good
> >     starting point.
> > 
> > 2) For multi-PF netdev [0], a per-queue get_dma_dev() op would be ideal
> >     as it provides the right PF device for the given queue. io_uring
> >     could use this but devmem can't. Devmem could use 1. but the
> >     driver has to detect and block the multi PF case.
> > 
> > I think we need both. Either that or a netdev op with an optional queue
> > parameter. Any thoughts?
> 
> No objection from zcrx for either approach, but it sounds like a good
> idea to have something simple for 1) sooner than later, and perhaps
> marked as a fix.
>
Sounds good. This is light enough to be a single patch.

Will tackle multi-PF netdev in a subsequent series.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-08 14:10     ` Mina Almasry
@ 2025-07-08 15:25       ` Pavel Begunkov
  0 siblings, 0 replies; 24+ messages in thread
From: Pavel Begunkov @ 2025-07-08 15:25 UTC (permalink / raw)
  To: Mina Almasry
  Cc: Dragos Tatulea, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Saeed Mahameed, tariqt,
	cratiu, netdev, linux-kernel

On 7/8/25 15:10, Mina Almasry wrote:
> On Tue, Jul 8, 2025 at 4:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 7/2/25 18:24, Dragos Tatulea wrote:
>>> For zerocopy (io_uring, devmem), there is an assumption that the
>>> parent device can do DMA. However that is not always the case:
>>> for example mlx5 SF devices have an auxiliary device as a parent.
>>>
>>> This patch introduces the possibility for the driver to specify
>>> another DMA device to be used via the new dma_dev field. The field
>>> should be set before register_netdev().
>>>
>>> A new helper function is added to get the DMA device or return NULL.
>>> The callers can check for NULL and fail early if the device is
>>> not capable of DMA.
>>>
>>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
>>> ---
>>>    include/linux/netdevice.h | 13 +++++++++++++
>>>    1 file changed, 13 insertions(+)
>>>
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 5847c20994d3..83faa2314c30 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2550,6 +2550,9 @@ struct net_device {
>>>
>>>        struct hwtstamp_provider __rcu  *hwprov;
>>>
>>> +     /* To be set by devices that can do DMA but not via parent. */
>>> +     struct device           *dma_dev;
>>> +
>>>        u8                      priv[] ____cacheline_aligned
>>>                                       __counted_by(priv_len);
>>>    } ____cacheline_aligned;
>>> @@ -5560,4 +5563,14 @@ extern struct net_device *blackhole_netdev;
>>>                atomic_long_add((VAL), &(DEV)->stats.__##FIELD)
>>>    #define DEV_STATS_READ(DEV, FIELD) atomic_long_read(&(DEV)->stats.__##FIELD)
>>>
>>> +static inline struct device *netdev_get_dma_dev(const struct net_device *dev)
>>> +{
>>> +     struct device *dma_dev = dev->dma_dev ? dev->dma_dev : dev->dev.parent;
>>> +
>>> +     if (!dma_dev->dma_mask)
>>
>> dev->dev.parent is NULL for veth and I assume other virtual devices as well.
>>
>> Mina, can you verify that devmem checks that? Seems like veth is rejected
>> by netdev_need_ops_lock() in netdev_nl_bind_rx_doit(), but IIRC per netdev
>> locking came after devmem got merged, and there are other virt devices that
>> might already be converted.
>>
> 
> We never attempt devmem binding on any devices that don't support the
> queue API, even before the per netdev locking was merged (there was an
> explicit ops check).

great!

io_uring doesn't look at ->queue_mgmt_ops, so the helper from this
patch needs to handle it one way or another.

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-03 11:58         ` Parav Pandit
  2025-07-04 13:11           ` Dragos Tatulea
@ 2025-07-10 23:58           ` Jakub Kicinski
  2025-07-11  2:52             ` Parav Pandit
  1 sibling, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2025-07-10 23:58 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Dragos Tatulea, almasrymina@google.com, asml.silence@gmail.com,
	Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Thu, 3 Jul 2025 11:58:50 +0000 Parav Pandit wrote:
> > In my head subfunctions are a way of configuring a PCIe PASID ergo they
> > _only_ make sense in context of DMA.  
> SF DMA is on the parent PCI device.
> 
> SIOV_R2 will have its own PCI RID which is ratified or getting ratified.
> When its done, SF (as SIOV_R2 device) instantiation can be extended
> with its own PCI RID. At that point they can be mapped to a VM.

AFAIU every PCIe transaction for a queue with a PASID assigned
should have a PASID prefix. Why is a different RID necessary?
CPUs can't select IOMMU context based on RID+PASID?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-10 23:58           ` Jakub Kicinski
@ 2025-07-11  2:52             ` Parav Pandit
  2025-07-11 13:51               ` Jakub Kicinski
  0 siblings, 1 reply; 24+ messages in thread
From: Parav Pandit @ 2025-07-11  2:52 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dragos Tatulea, almasrymina@google.com, asml.silence@gmail.com,
	Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org



> From: Jakub Kicinski <kuba@kernel.org>
> Sent: 11 July 2025 05:29 AM
> Subject: Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC
> DMA
> 
> On Thu, 3 Jul 2025 11:58:50 +0000 Parav Pandit wrote:
> > > In my head subfunctions are a way of configuring a PCIe PASID ergo
> > > they _only_ make sense in context of DMA.
> > SF DMA is on the parent PCI device.
> >
> > SIOV_R2 will have its own PCI RID which is ratified or getting ratified.
> > When its done, SF (as SIOV_R2 device) instantiation can be extended
> > with its own PCI RID. At that point they can be mapped to a VM.
> 
> AFAIU every PCIe transaction for a queue with a PASID assigned should have a
> PASID prefix. Why is a different RID necessary?
> CPUs can't select IOMMU context based on RID+PASID?
It can, however,
PASID is meant to be used for process isolation and not expected to be abused for identify the device.
Doing so, would also prohibits using PASID inside the VM. It requires another complex vPASID to pPASID translation.

Tagging MSI-X interrupts with PASID is another challenge.
For CC defining isolation boundary with RID+PASID was yet another hack.

There were other issues in splitting PASID for device scaling vs process scaling for dual use.

So it was concluded to opt to avoid that abuse and use the standard RID construct for device identification.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA
  2025-07-11  2:52             ` Parav Pandit
@ 2025-07-11 13:51               ` Jakub Kicinski
  0 siblings, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2025-07-11 13:51 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Dragos Tatulea, almasrymina@google.com, asml.silence@gmail.com,
	Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Saeed Mahameed, Tariq Toukan, Cosmin Ratiu,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, 11 Jul 2025 02:52:23 +0000 Parav Pandit wrote:
> > On Thu, 3 Jul 2025 11:58:50 +0000 Parav Pandit wrote:  
> > > > In my head subfunctions are a way of configuring a PCIe PASID ergo
> > > > they _only_ make sense in context of DMA.  
> > > SF DMA is on the parent PCI device.
> > >
> > > SIOV_R2 will have its own PCI RID which is ratified or getting ratified.
> > > When its done, SF (as SIOV_R2 device) instantiation can be extended
> > > with its own PCI RID. At that point they can be mapped to a VM.  
> > 
> > AFAIU every PCIe transaction for a queue with a PASID assigned should have a
> > PASID prefix. Why is a different RID necessary?
> > CPUs can't select IOMMU context based on RID+PASID?  
> It can, however,
> PASID is meant to be used for process isolation and not expected to
> be abused for identify the device. Doing so, would also prohibits
> using PASID inside the VM. It requires another complex vPASID to
> pPASID translation.
> 
> Tagging MSI-X interrupts with PASID is another challenge.
> For CC defining isolation boundary with RID+PASID was yet another
> hack.
> 
> There were other issues in splitting PASID for device scaling vs
> process scaling for dual use.
> 
> So it was concluded to opt to avoid that abuse and use the standard
> RID construct for device identification.

I see, that explains it. Thanks Parav!

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-07-11 13:51 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-02 17:24 [RFC net-next 0/4] devmem/io_uring: Allow devices without parent PCI device Dragos Tatulea
2025-07-02 17:24 ` [RFC net-next 1/4] net: Allow non parent devices to be used for ZC DMA Dragos Tatulea
2025-07-02 18:32   ` Jakub Kicinski
2025-07-02 20:01     ` Dragos Tatulea
2025-07-02 20:53       ` Jakub Kicinski
2025-07-03 11:58         ` Parav Pandit
2025-07-04 13:11           ` Dragos Tatulea
2025-07-07 18:44             ` Mina Almasry
2025-07-07 21:35               ` Dragos Tatulea
2025-07-07 21:55                 ` Mina Almasry
2025-07-08  8:52                   ` Parav Pandit
2025-07-08 10:47                   ` Pavel Begunkov
2025-07-08 14:23                   ` Dragos Tatulea
2025-07-08 11:08             ` Pavel Begunkov
2025-07-08 14:26               ` Dragos Tatulea
2025-07-10 23:58           ` Jakub Kicinski
2025-07-11  2:52             ` Parav Pandit
2025-07-11 13:51               ` Jakub Kicinski
2025-07-08 11:06   ` Pavel Begunkov
2025-07-08 14:10     ` Mina Almasry
2025-07-08 15:25       ` Pavel Begunkov
2025-07-02 17:24 ` [RFC net-next 2/4] io_uring/zcrx: Use the new netdev_get_dma_dev() API Dragos Tatulea
2025-07-02 17:24 ` [RFC net-next 3/4] net: devmem: " Dragos Tatulea
2025-07-02 17:24 ` [RFC net-next 4/4] net/mlx5e: Enable HDS zerocopy flows for SFs Dragos Tatulea

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).