* [PATCH v3 0/4] net: ntb_netdev: Add Multi-queue support
@ 2026-03-05 15:56 Koichiro Den
From: Koichiro Den @ 2026-03-05 15:56 UTC (permalink / raw)
To: Jon Mason, Dave Jiang, Allen Hubbe, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: ntb, netdev, linux-kernel
Hi,
ntb_netdev currently hard-codes a single NTB transport queue pair, which
means the datapath effectively runs as a single-queue netdev regardless
of available CPUs / parallel flows.
The longer-term motivation here is throughput scale-out: allow
ntb_netdev to grow beyond the single-QP bottleneck and make it possible
to spread TX/RX work across multiple queue pairs as link speeds and core
counts keep increasing.
Multi-queue also unlocks the standard networking knobs on top of it. In
particular, once the device exposes multiple TX queues, qdisc/tc can
steer flows/traffic classes into different queues (via
skb->queue_mapping), enabling per-flow/per-class scheduling and QoS in a
familiar way.
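As a rough illustration of how a flow ends up on one particular TX queue,
the userspace sketch below models hash-based queue selection. The helper
names (model_reciprocal_scale, model_pick_tx_queue) are made up for this
example. The scaling step uses the same arithmetic as the kernel's
reciprocal_scale(), but the real selection path also considers XPS and
traffic-class offsets, which are ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Map a 32-bit value into [0, range) without a division; this is the
 * same arithmetic as the kernel's reciprocal_scale().
 */
uint32_t model_reciprocal_scale(uint32_t val, uint32_t range)
{
	return (uint32_t)(((uint64_t)val * range) >> 32);
}

/* Hypothetical helper: choose a TX queue index from a flow hash. */
uint16_t model_pick_tx_queue(uint32_t flow_hash, uint16_t real_num_tx_queues)
{
	assert(real_num_tx_queues > 0);
	return (uint16_t)model_reciprocal_scale(flow_hash, real_num_tx_queues);
}
```

With a single queue every hash maps to index 0. With two queues, a given
flow hash always maps to the same index, so only parallel flows can
occupy both queues.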
Usage
=====
1. Ensure the NTB device you want to use has multiple Memory Windows.
2. modprobe ntb_transport on both sides, if it's not built-in.
3. modprobe ntb_netdev on both sides, if it's not built-in.
4. Use ethtool -L to configure the desired number of queues.
   The default number of real (combined) queues is 1.
   e.g. ethtool -L eth0 combined 2 # to increase
        ethtool -L eth0 combined 1 # to reduce back to 1
Note:
* If the NTB device has only a single Memory Window, ethtool -L eth0
  combined N (N > 1) fails with:
  "netlink error: No space left on device".
* ethtool -L can be executed while the net_device is up.
Compatibility
=============
The default remains a single queue, so behavior is unchanged unless
the user explicitly increases the number of queues.
Kernel base
===========
ntb-next (latest as of 2026-03-06):
commit 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link
disable path")
Testing / Results
=================
Environment / command line:
- 2x R-Car S4 Spider boards
"Kernel base" (see above) + this series
TCP:
[RC] $ sudo iperf3 -s
[EP] $ sudo iperf3 -Z -c ${SERVER_IP} -l 65480 -w 512M -P 4
UDP:
[RC] $ sudo iperf3 -s
[EP] $ sudo iperf3 -ub0 -c ${SERVER_IP} -l 65480 -w 512M -P 4
Without this series:
TCP / UDP : 589 Mbps / 580 Mbps
With this series (default single queue):
TCP / UDP : 583 Mbps / 583 Mbps
With this series + `ethtool -L eth0 combined 2`:
TCP / UDP : 576 Mbps / 584 Mbps
With this series + `ethtool -L eth0 combined 2` + [1], where flows are
properly distributed across queues:
TCP / UDP : 1.13 Gbps / 1.16 Gbps (re-measured with v3)
The 575~590 Mbps variation is run-to-run variance, i.e. no measurable
regression or improvement is observed with a single queue. The key
point is that throughput scales from ~580 Mbps to ~1.15 Gbps (roughly
2x) once flows are distributed across multiple queues.
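For reference, the scaling factor implied by the figures above can be
checked with a few lines of C. This is only a worked-arithmetic sketch;
scaling_factor() is a name chosen for this example, and the inputs are
the measured values quoted in the results (583 Mbps single queue,
1.13 Gbps TCP and 1.16 Gbps UDP with two queues plus [1]).

```c
#include <assert.h>

/* Ratio of multi-queue to single-queue throughput.
 * Both arguments must be in the same unit (Mbps here).
 */
double scaling_factor(double multi_mbps, double single_mbps)
{
	assert(single_mbps > 0.0);
	return multi_mbps / single_mbps;
}
```

scaling_factor(1130.0, 583.0) and scaling_factor(1160.0, 583.0) both come
out between 1.9 and 2.0, i.e. close to linear scaling for two queues.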
Note: On R-Car S4 Spider, only BAR2 is usable for ntb_transport MW.
For testing, BAR2 was expanded from 1 MiB to 2 MiB and split into two
Memory Windows. A follow-up series is planned to add split BAR support
for vNTB. On platforms where multiple BARs can be used for the
datapath, this series should allow >=2 queues without additional
changes.
[1] [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@valinux.co.jp/
(subject was accidentally incorrect in the original posting)
Changelog
=========
Changes in v3:
- Address Jakub's feedback: drop redundant defensive checks, use the
local qp argument where applicable, switch queue-array allocation to
kzalloc_objs(), remove the ntb_netdev_handlers forward declaration
and split ntb_set_channels().
- Drop ntb_netdev_sync_subqueues(), which did more than necessary, and
open-code only the required operations at its former call sites.
Changes in v2:
- Drop the ntb_num_queues module parameter and implement ethtool
.set_channels().
v1 patches 2-3 are dropped; v2 patches 2-3 are preparatory changes
for the new Patch 4, which implements .set_channels().
- Drop unrelated changes from Patch 1 to keep it focused and easier to
review.
Best regards,
Koichiro
Koichiro Den (4):
net: ntb_netdev: Introduce per-queue context
net: ntb_netdev: Gate subqueue stop/wake by transport link
net: ntb_netdev: Factor out multi-queue helpers
net: ntb_netdev: Support ethtool channels for multi-queue
drivers/net/ntb_netdev.c | 493 +++++++++++++++++++++++++++++++--------
1 file changed, 390 insertions(+), 103 deletions(-)
--
2.51.0
* [PATCH v3 1/4] net: ntb_netdev: Introduce per-queue context
From: Koichiro Den @ 2026-03-05 15:56 UTC (permalink / raw)
To: Jon Mason, Dave Jiang, Allen Hubbe, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: ntb, netdev, linux-kernel
Prepare ntb_netdev for multi-queue operation by moving queue-pair state
out of struct ntb_netdev.
Introduce struct ntb_netdev_queue to carry the ntb_transport_qp pointer,
the per-QP TX timer and queue id. Pass this object as the callback
context and convert the RX/TX handlers and link event path accordingly.
The probe path allocates a fixed upper bound for netdev queues while
instantiating only a single ntb_transport queue pair, preserving the
previous behavior. Also store client_dev for future queue pair
creation/removal via the ntb_transport API.
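The shape of the per-queue context can be sketched in plain userspace C
as follows. This is a simplified model, not the driver code: model_dev,
model_queue, model_dev_init() and model_rx_handler() are hypothetical
names, and rx_packets is a counter added only for the demonstration. The
point is that the opaque callback argument now identifies one queue, and
the device is reached through the back-pointer.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_QUEUES 64	/* mirrors NTB_NETDEV_MAX_QUEUES */

struct model_dev;

/* Per-queue context handed to the transport as callback data. */
struct model_queue {
	struct model_dev *parent;	/* like ntb_netdev_queue::ntdev */
	uint16_t qid;			/* index into parent->queues[] */
	unsigned long rx_packets;	/* demo-only counter */
};

struct model_dev {
	unsigned int num_queues;
	struct model_queue queues[MODEL_MAX_QUEUES];
};

/* Set up the first n queue contexts with back-pointer and id. */
void model_dev_init(struct model_dev *dev, unsigned int n)
{
	unsigned int q;

	assert(n <= MODEL_MAX_QUEUES);
	dev->num_queues = n;
	for (q = 0; q < n; q++) {
		dev->queues[q].parent = dev;
		dev->queues[q].qid = (uint16_t)q;
		dev->queues[q].rx_packets = 0;
	}
}

/* Callback as a transport would invoke it: only an opaque pointer. */
uint16_t model_rx_handler(void *qp_data)
{
	struct model_queue *q = qp_data;
	struct model_dev *dev = q->parent;

	assert(q->qid < dev->num_queues);
	q->rx_packets++;
	return q->qid;	/* the id an skb would be tagged with */
}
```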
Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
Changes in v3:
- Drop redundant defensive checks in start_xmit().
- Use kzalloc_objs() for the queue array.
- Use the local qp argument where applicable.
- Make variable declarations follow reverse Christmas tree style.
- Remove double blank line.
drivers/net/ntb_netdev.c | 277 ++++++++++++++++++++++++++-------------
1 file changed, 188 insertions(+), 89 deletions(-)
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
index fbeae05817e9..4b65e938d549 100644
--- a/drivers/net/ntb_netdev.c
+++ b/drivers/net/ntb_netdev.c
@@ -53,6 +53,7 @@
#include <linux/pci.h>
#include <linux/ntb.h>
#include <linux/ntb_transport.h>
+#include <linux/slab.h>
#define NTB_NETDEV_VER "0.7"
@@ -70,11 +71,24 @@ static unsigned int tx_start = 10;
/* Number of descriptors still available before stop upper layer tx */
static unsigned int tx_stop = 5;
-struct ntb_netdev {
- struct pci_dev *pdev;
- struct net_device *ndev;
+#define NTB_NETDEV_MAX_QUEUES 64
+#define NTB_NETDEV_DEFAULT_QUEUES 1
+
+struct ntb_netdev;
+
+struct ntb_netdev_queue {
+ struct ntb_netdev *ntdev;
struct ntb_transport_qp *qp;
struct timer_list tx_timer;
+ u16 qid;
+};
+
+struct ntb_netdev {
+ struct pci_dev *pdev;
+ struct device *client_dev;
+ struct net_device *ndev;
+ unsigned int num_queues;
+ struct ntb_netdev_queue *queues;
};
#define NTB_TX_TIMEOUT_MS 1000
@@ -82,14 +96,17 @@ struct ntb_netdev {
static void ntb_netdev_event_handler(void *data, int link_is_up)
{
- struct net_device *ndev = data;
- struct ntb_netdev *dev = netdev_priv(ndev);
+ struct ntb_netdev_queue *q = data;
+ struct ntb_netdev *dev = q->ntdev;
+ struct net_device *ndev;
- netdev_dbg(ndev, "Event %x, Link %x\n", link_is_up,
- ntb_transport_link_query(dev->qp));
+ ndev = dev->ndev;
+
+ netdev_dbg(ndev, "Event %x, Link %x, qp %u\n", link_is_up,
+ ntb_transport_link_query(q->qp), q->qid);
if (link_is_up) {
- if (ntb_transport_link_query(dev->qp))
+ if (ntb_transport_link_query(q->qp))
netif_carrier_on(ndev);
} else {
netif_carrier_off(ndev);
@@ -99,10 +116,13 @@ static void ntb_netdev_event_handler(void *data, int link_is_up)
static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
void *data, int len)
{
- struct net_device *ndev = qp_data;
+ struct ntb_netdev_queue *q = qp_data;
+ struct ntb_netdev *dev = q->ntdev;
+ struct net_device *ndev;
struct sk_buff *skb;
int rc;
+ ndev = dev->ndev;
skb = data;
if (!skb)
return;
@@ -118,6 +138,7 @@ static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
skb_put(skb, len);
skb->protocol = eth_type_trans(skb, ndev);
skb->ip_summed = CHECKSUM_NONE;
+ skb_record_rx_queue(skb, q->qid);
if (netif_rx(skb) == NET_RX_DROP) {
ndev->stats.rx_errors++;
@@ -144,42 +165,43 @@ static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
}
static int __ntb_netdev_maybe_stop_tx(struct net_device *netdev,
- struct ntb_transport_qp *qp, int size)
+ struct ntb_netdev_queue *q, int size)
{
- struct ntb_netdev *dev = netdev_priv(netdev);
+ netif_stop_subqueue(netdev, q->qid);
- netif_stop_queue(netdev);
/* Make sure to see the latest value of ntb_transport_tx_free_entry()
* since the queue was last started.
*/
smp_mb();
- if (likely(ntb_transport_tx_free_entry(qp) < size)) {
- mod_timer(&dev->tx_timer, jiffies + usecs_to_jiffies(tx_time));
+ if (likely(ntb_transport_tx_free_entry(q->qp) < size)) {
+ mod_timer(&q->tx_timer, jiffies + usecs_to_jiffies(tx_time));
return -EBUSY;
}
- netif_start_queue(netdev);
+ netif_start_subqueue(netdev, q->qid);
return 0;
}
static int ntb_netdev_maybe_stop_tx(struct net_device *ndev,
- struct ntb_transport_qp *qp, int size)
+ struct ntb_netdev_queue *q, int size)
{
- if (netif_queue_stopped(ndev) ||
- (ntb_transport_tx_free_entry(qp) >= size))
+ if (__netif_subqueue_stopped(ndev, q->qid) ||
+ (ntb_transport_tx_free_entry(q->qp) >= size))
return 0;
- return __ntb_netdev_maybe_stop_tx(ndev, qp, size);
+ return __ntb_netdev_maybe_stop_tx(ndev, q, size);
}
static void ntb_netdev_tx_handler(struct ntb_transport_qp *qp, void *qp_data,
void *data, int len)
{
- struct net_device *ndev = qp_data;
+ struct ntb_netdev_queue *q = qp_data;
+ struct ntb_netdev *dev = q->ntdev;
+ struct net_device *ndev;
struct sk_buff *skb;
- struct ntb_netdev *dev = netdev_priv(ndev);
+ ndev = dev->ndev;
skb = data;
if (!skb || !ndev)
return;
@@ -194,13 +216,13 @@ static void ntb_netdev_tx_handler(struct ntb_transport_qp *qp, void *qp_data,
dev_kfree_skb_any(skb);
- if (ntb_transport_tx_free_entry(dev->qp) >= tx_start) {
+ if (ntb_transport_tx_free_entry(qp) >= tx_start) {
/* Make sure anybody stopping the queue after this sees the new
* value of ntb_transport_tx_free_entry()
*/
smp_mb();
- if (netif_queue_stopped(ndev))
- netif_wake_queue(ndev);
+ if (__netif_subqueue_stopped(ndev, q->qid))
+ netif_wake_subqueue(ndev, q->qid);
}
}
@@ -208,16 +230,20 @@ static netdev_tx_t ntb_netdev_start_xmit(struct sk_buff *skb,
struct net_device *ndev)
{
struct ntb_netdev *dev = netdev_priv(ndev);
+ u16 qid = skb_get_queue_mapping(skb);
+ struct ntb_netdev_queue *q;
int rc;
- ntb_netdev_maybe_stop_tx(ndev, dev->qp, tx_stop);
+ q = &dev->queues[qid];
- rc = ntb_transport_tx_enqueue(dev->qp, skb, skb->data, skb->len);
+ ntb_netdev_maybe_stop_tx(ndev, q, tx_stop);
+
+ rc = ntb_transport_tx_enqueue(q->qp, skb, skb->data, skb->len);
if (rc)
goto err;
/* check for next submit */
- ntb_netdev_maybe_stop_tx(ndev, dev->qp, tx_stop);
+ ntb_netdev_maybe_stop_tx(ndev, q, tx_stop);
return NETDEV_TX_OK;
@@ -229,80 +255,104 @@ static netdev_tx_t ntb_netdev_start_xmit(struct sk_buff *skb,
static void ntb_netdev_tx_timer(struct timer_list *t)
{
- struct ntb_netdev *dev = timer_container_of(dev, t, tx_timer);
- struct net_device *ndev = dev->ndev;
+ struct ntb_netdev_queue *q = timer_container_of(q, t, tx_timer);
+ struct ntb_netdev *dev = q->ntdev;
+ struct net_device *ndev;
- if (ntb_transport_tx_free_entry(dev->qp) < tx_stop) {
- mod_timer(&dev->tx_timer, jiffies + usecs_to_jiffies(tx_time));
+ ndev = dev->ndev;
+
+ if (ntb_transport_tx_free_entry(q->qp) < tx_stop) {
+ mod_timer(&q->tx_timer, jiffies + usecs_to_jiffies(tx_time));
} else {
/* Make sure anybody stopping the queue after this sees the new
* value of ntb_transport_tx_free_entry()
*/
smp_mb();
- if (netif_queue_stopped(ndev))
- netif_wake_queue(ndev);
+ if (__netif_subqueue_stopped(ndev, q->qid))
+ netif_wake_subqueue(ndev, q->qid);
}
}
static int ntb_netdev_open(struct net_device *ndev)
{
struct ntb_netdev *dev = netdev_priv(ndev);
+ struct ntb_netdev_queue *queue;
struct sk_buff *skb;
- int rc, i, len;
+ int rc = 0, i, len;
+ unsigned int q;
- /* Add some empty rx bufs */
- for (i = 0; i < NTB_RXQ_SIZE; i++) {
- skb = netdev_alloc_skb(ndev, ndev->mtu + ETH_HLEN);
- if (!skb) {
- rc = -ENOMEM;
- goto err;
- }
+ /* Add some empty rx bufs for each queue */
+ for (q = 0; q < dev->num_queues; q++) {
+ queue = &dev->queues[q];
+
+ for (i = 0; i < NTB_RXQ_SIZE; i++) {
+ skb = netdev_alloc_skb(ndev, ndev->mtu + ETH_HLEN);
+ if (!skb) {
+ rc = -ENOMEM;
+ goto err;
+ }
- rc = ntb_transport_rx_enqueue(dev->qp, skb, skb->data,
- ndev->mtu + ETH_HLEN);
- if (rc) {
- dev_kfree_skb(skb);
- goto err;
+ rc = ntb_transport_rx_enqueue(queue->qp, skb, skb->data,
+ ndev->mtu + ETH_HLEN);
+ if (rc) {
+ dev_kfree_skb(skb);
+ goto err;
+ }
}
+
+ timer_setup(&queue->tx_timer, ntb_netdev_tx_timer, 0);
}
- timer_setup(&dev->tx_timer, ntb_netdev_tx_timer, 0);
-
netif_carrier_off(ndev);
- ntb_transport_link_up(dev->qp);
+
+ for (q = 0; q < dev->num_queues; q++)
+ ntb_transport_link_up(dev->queues[q].qp);
+
netif_start_queue(ndev);
return 0;
err:
- while ((skb = ntb_transport_rx_remove(dev->qp, &len)))
- dev_kfree_skb(skb);
+ for (q = 0; q < dev->num_queues; q++) {
+ queue = &dev->queues[q];
+
+ while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
+ dev_kfree_skb(skb);
+ }
return rc;
}
static int ntb_netdev_close(struct net_device *ndev)
{
struct ntb_netdev *dev = netdev_priv(ndev);
+ struct ntb_netdev_queue *queue;
struct sk_buff *skb;
+ unsigned int q;
int len;
- ntb_transport_link_down(dev->qp);
- while ((skb = ntb_transport_rx_remove(dev->qp, &len)))
- dev_kfree_skb(skb);
+ for (q = 0; q < dev->num_queues; q++) {
+ queue = &dev->queues[q];
- timer_delete_sync(&dev->tx_timer);
+ ntb_transport_link_down(queue->qp);
+ while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
+ dev_kfree_skb(skb);
+
+ timer_delete_sync(&queue->tx_timer);
+ }
return 0;
}
static int ntb_netdev_change_mtu(struct net_device *ndev, int new_mtu)
{
struct ntb_netdev *dev = netdev_priv(ndev);
+ struct ntb_netdev_queue *queue;
struct sk_buff *skb;
- int len, rc;
+ unsigned int q, i;
+ int len, rc = 0;
- if (new_mtu > ntb_transport_max_size(dev->qp) - ETH_HLEN)
+ if (new_mtu > ntb_transport_max_size(dev->queues[0].qp) - ETH_HLEN)
return -EINVAL;
if (!netif_running(ndev)) {
@@ -311,41 +361,54 @@ static int ntb_netdev_change_mtu(struct net_device *ndev, int new_mtu)
}
/* Bring down the link and dispose of posted rx entries */
- ntb_transport_link_down(dev->qp);
+ for (q = 0; q < dev->num_queues; q++)
+ ntb_transport_link_down(dev->queues[q].qp);
if (ndev->mtu < new_mtu) {
- int i;
+ for (q = 0; q < dev->num_queues; q++) {
+ queue = &dev->queues[q];
- for (i = 0; (skb = ntb_transport_rx_remove(dev->qp, &len)); i++)
- dev_kfree_skb(skb);
-
- for (; i; i--) {
- skb = netdev_alloc_skb(ndev, new_mtu + ETH_HLEN);
- if (!skb) {
- rc = -ENOMEM;
- goto err;
- }
-
- rc = ntb_transport_rx_enqueue(dev->qp, skb, skb->data,
- new_mtu + ETH_HLEN);
- if (rc) {
+ for (i = 0;
+ (skb = ntb_transport_rx_remove(queue->qp, &len));
+ i++)
dev_kfree_skb(skb);
- goto err;
+
+ for (; i; i--) {
+ skb = netdev_alloc_skb(ndev,
+ new_mtu + ETH_HLEN);
+ if (!skb) {
+ rc = -ENOMEM;
+ goto err;
+ }
+
+ rc = ntb_transport_rx_enqueue(queue->qp, skb,
+ skb->data,
+ new_mtu +
+ ETH_HLEN);
+ if (rc) {
+ dev_kfree_skb(skb);
+ goto err;
+ }
}
}
}
WRITE_ONCE(ndev->mtu, new_mtu);
- ntb_transport_link_up(dev->qp);
+ for (q = 0; q < dev->num_queues; q++)
+ ntb_transport_link_up(dev->queues[q].qp);
return 0;
err:
- ntb_transport_link_down(dev->qp);
+ for (q = 0; q < dev->num_queues; q++) {
+ struct ntb_netdev_queue *queue = &dev->queues[q];
- while ((skb = ntb_transport_rx_remove(dev->qp, &len)))
- dev_kfree_skb(skb);
+ ntb_transport_link_down(queue->qp);
+
+ while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
+ dev_kfree_skb(skb);
+ }
netdev_err(ndev, "Error changing MTU, device inoperable\n");
return rc;
@@ -404,6 +467,7 @@ static int ntb_netdev_probe(struct device *client_dev)
struct net_device *ndev;
struct pci_dev *pdev;
struct ntb_netdev *dev;
+ unsigned int q;
int rc;
ntb = dev_ntb(client_dev->parent);
@@ -411,7 +475,7 @@ static int ntb_netdev_probe(struct device *client_dev)
if (!pdev)
return -ENODEV;
- ndev = alloc_etherdev(sizeof(*dev));
+ ndev = alloc_etherdev_mq(sizeof(*dev), NTB_NETDEV_MAX_QUEUES);
if (!ndev)
return -ENOMEM;
@@ -420,6 +484,16 @@ static int ntb_netdev_probe(struct device *client_dev)
dev = netdev_priv(ndev);
dev->ndev = ndev;
dev->pdev = pdev;
+ dev->client_dev = client_dev;
+ dev->num_queues = 0;
+
+ dev->queues = kzalloc_objs(*dev->queues, NTB_NETDEV_MAX_QUEUES,
+ GFP_KERNEL);
+ if (!dev->queues) {
+ rc = -ENOMEM;
+ goto err_free_netdev;
+ }
+
ndev->features = NETIF_F_HIGHDMA;
ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
@@ -436,26 +510,47 @@ static int ntb_netdev_probe(struct device *client_dev)
ndev->min_mtu = 0;
ndev->max_mtu = ETH_MAX_MTU;
- dev->qp = ntb_transport_create_queue(ndev, client_dev,
- &ntb_netdev_handlers);
- if (!dev->qp) {
+ for (q = 0; q < NTB_NETDEV_DEFAULT_QUEUES; q++) {
+ struct ntb_netdev_queue *queue = &dev->queues[q];
+
+ queue->ntdev = dev;
+ queue->qid = q;
+ queue->qp = ntb_transport_create_queue(queue, client_dev,
+ &ntb_netdev_handlers);
+ if (!queue->qp)
+ break;
+
+ dev->num_queues++;
+ }
+
+ if (!dev->num_queues) {
rc = -EIO;
- goto err;
+ goto err_free_queues;
}
- ndev->mtu = ntb_transport_max_size(dev->qp) - ETH_HLEN;
+ rc = netif_set_real_num_queues(ndev, dev->num_queues, dev->num_queues);
+ if (rc)
+ goto err_free_qps;
+
+ ndev->mtu = ntb_transport_max_size(dev->queues[0].qp) - ETH_HLEN;
rc = register_netdev(ndev);
if (rc)
- goto err1;
+ goto err_free_qps;
dev_set_drvdata(client_dev, ndev);
- dev_info(&pdev->dev, "%s created\n", ndev->name);
+ dev_info(&pdev->dev, "%s created with %u queue pairs\n",
+ ndev->name, dev->num_queues);
return 0;
-err1:
- ntb_transport_free_queue(dev->qp);
-err:
+err_free_qps:
+ for (q = 0; q < dev->num_queues; q++)
+ ntb_transport_free_queue(dev->queues[q].qp);
+
+err_free_queues:
+ kfree(dev->queues);
+
+err_free_netdev:
free_netdev(ndev);
return rc;
}
@@ -464,9 +559,13 @@ static void ntb_netdev_remove(struct device *client_dev)
{
struct net_device *ndev = dev_get_drvdata(client_dev);
struct ntb_netdev *dev = netdev_priv(ndev);
+ unsigned int q;
unregister_netdev(ndev);
- ntb_transport_free_queue(dev->qp);
+ for (q = 0; q < dev->num_queues; q++)
+ ntb_transport_free_queue(dev->queues[q].qp);
+
+ kfree(dev->queues);
free_netdev(ndev);
}
--
2.51.0
* [PATCH v3 2/4] net: ntb_netdev: Gate subqueue stop/wake by transport link
From: Koichiro Den @ 2026-03-05 15:56 UTC (permalink / raw)
To: Jon Mason, Dave Jiang, Allen Hubbe, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: ntb, netdev, linux-kernel
When ntb_netdev is extended to multiple ntb_transport queue pairs, the
netdev carrier can be up as long as at least one QP link is up. In that
setup, a given QP may be link-down while the carrier remains on.
Make the link event handler start/stop the corresponding netdev TX
subqueue and drive carrier state based on whether any QP link is up.
Also guard subqueue wake/start points in the TX completion and timer
paths so a subqueue is not restarted while its QP link is down.
Stop all queues in ndo_open() and let the link event handler wake each
subqueue once ntb_transport link negotiation succeeds.
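The intended state machine can be modelled in userspace C roughly as
below. All names (model_netdev, model_open(), model_link_event(),
model_tx_complete()) are hypothetical, and the model assumes the
interface is running, so the netif_running() check is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_QUEUES 4

struct model_netdev {
	unsigned int num_queues;
	bool qp_link_up[MODEL_MAX_QUEUES];	/* per-QP transport link */
	bool subq_stopped[MODEL_MAX_QUEUES];	/* per-subqueue TX stop flag */
	bool carrier;				/* device-wide carrier */
};

/* Open: carrier off and every subqueue stopped until link events arrive. */
void model_open(struct model_netdev *nd, unsigned int n)
{
	unsigned int i;

	assert(n <= MODEL_MAX_QUEUES);
	nd->num_queues = n;
	nd->carrier = false;
	for (i = 0; i < n; i++) {
		nd->qp_link_up[i] = false;
		nd->subq_stopped[i] = true;
	}
}

/* Link event for one QP: gate its subqueue, then recompute carrier. */
void model_link_event(struct model_netdev *nd, unsigned int qid, bool up)
{
	unsigned int i;
	bool any_up = false;

	assert(qid < nd->num_queues);
	nd->qp_link_up[qid] = up;
	nd->subq_stopped[qid] = !up;

	for (i = 0; i < nd->num_queues; i++)
		any_up = any_up || nd->qp_link_up[i];
	nd->carrier = any_up;
}

/* TX completion: wake only if descriptors are free and the link is up. */
void model_tx_complete(struct model_netdev *nd, unsigned int qid,
		       bool enough_free)
{
	assert(qid < nd->num_queues);
	if (enough_free && nd->subq_stopped[qid] && nd->qp_link_up[qid])
		nd->subq_stopped[qid] = false;
}
```

In this model a completion on a QP whose link is down leaves the
subqueue stopped, even while the carrier is on because of another QP.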
Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
Changes in v3:
- Adjusted context due to changes in Patch 1.
- No functional changes intended.
drivers/net/ntb_netdev.c | 42 ++++++++++++++++++++++++++++++----------
1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
index 4b65e938d549..e4c1422d1d7a 100644
--- a/drivers/net/ntb_netdev.c
+++ b/drivers/net/ntb_netdev.c
@@ -99,18 +99,32 @@ static void ntb_netdev_event_handler(void *data, int link_is_up)
struct ntb_netdev_queue *q = data;
struct ntb_netdev *dev = q->ntdev;
struct net_device *ndev;
+ bool any_up = false;
+ unsigned int i;
ndev = dev->ndev;
netdev_dbg(ndev, "Event %x, Link %x, qp %u\n", link_is_up,
ntb_transport_link_query(q->qp), q->qid);
- if (link_is_up) {
- if (ntb_transport_link_query(q->qp))
- netif_carrier_on(ndev);
- } else {
+ if (netif_running(ndev)) {
+ if (link_is_up)
+ netif_wake_subqueue(ndev, q->qid);
+ else
+ netif_stop_subqueue(ndev, q->qid);
+ }
+
+ for (i = 0; i < dev->num_queues; i++) {
+ if (ntb_transport_link_query(dev->queues[i].qp)) {
+ any_up = true;
+ break;
+ }
+ }
+
+ if (any_up)
+ netif_carrier_on(ndev);
+ else
netif_carrier_off(ndev);
- }
}
static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
@@ -179,7 +193,10 @@ static int __ntb_netdev_maybe_stop_tx(struct net_device *netdev,
return -EBUSY;
}
- netif_start_subqueue(netdev, q->qid);
+ /* The subqueue must be kept stopped if the link is down */
+ if (ntb_transport_link_query(q->qp))
+ netif_start_subqueue(netdev, q->qid);
+
return 0;
}
@@ -221,7 +238,8 @@ static void ntb_netdev_tx_handler(struct ntb_transport_qp *qp, void *qp_data,
* value of ntb_transport_tx_free_entry()
*/
smp_mb();
- if (__netif_subqueue_stopped(ndev, q->qid))
+ if (__netif_subqueue_stopped(ndev, q->qid) &&
+ ntb_transport_link_query(q->qp))
netif_wake_subqueue(ndev, q->qid);
}
}
@@ -268,7 +286,10 @@ static void ntb_netdev_tx_timer(struct timer_list *t)
* value of ntb_transport_tx_free_entry()
*/
smp_mb();
- if (__netif_subqueue_stopped(ndev, q->qid))
+
+ /* The subqueue must be kept stopped if the link is down */
+ if (__netif_subqueue_stopped(ndev, q->qid) &&
+ ntb_transport_link_query(q->qp))
netif_wake_subqueue(ndev, q->qid);
}
}
@@ -304,12 +325,11 @@ static int ntb_netdev_open(struct net_device *ndev)
}
netif_carrier_off(ndev);
+ netif_tx_stop_all_queues(ndev);
for (q = 0; q < dev->num_queues; q++)
ntb_transport_link_up(dev->queues[q].qp);
- netif_start_queue(ndev);
-
return 0;
err:
@@ -330,6 +350,8 @@ static int ntb_netdev_close(struct net_device *ndev)
unsigned int q;
int len;
+ netif_tx_stop_all_queues(ndev);
+ netif_carrier_off(ndev);
for (q = 0; q < dev->num_queues; q++) {
queue = &dev->queues[q];
--
2.51.0
* [PATCH v3 3/4] net: ntb_netdev: Factor out multi-queue helpers
From: Koichiro Den @ 2026-03-05 15:56 UTC (permalink / raw)
To: Jon Mason, Dave Jiang, Allen Hubbe, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: ntb, netdev, linux-kernel
Implementing .set_channels would otherwise duplicate the same multi-queue
operations at multiple call sites. Factor out the following helpers:
- ntb_netdev_update_carrier(): carrier is switched on when at least
one QP link is up
- ntb_netdev_queue_rx_drain(): drain and free all queued RX packets
for one QP
- ntb_netdev_queue_rx_fill(): prefill RX ring for one QP
No functional change.
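The fill/drain pairing can be pictured with the following userspace
sketch, which uses malloc() buffers in place of skbs and a fixed array in
place of the transport's RX list. The names model_rx_ring,
model_rx_fill() and model_rx_drain() are hypothetical. As in the driver,
a failed fill leaves cleanup to the caller, who is expected to drain.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MODEL_RXQ_SIZE 100	/* mirrors NTB_RXQ_SIZE */

struct model_rx_ring {
	void *bufs[MODEL_RXQ_SIZE];
	unsigned int count;
};

/* Post buffers until the ring holds MODEL_RXQ_SIZE entries. */
int model_rx_fill(struct model_rx_ring *ring, size_t buf_len)
{
	assert(buf_len > 0);
	while (ring->count < MODEL_RXQ_SIZE) {
		void *buf = malloc(buf_len);

		if (!buf)
			return -ENOMEM;	/* caller drains what was posted */
		ring->bufs[ring->count++] = buf;
	}
	return 0;
}

/* Remove and free every posted buffer. */
void model_rx_drain(struct model_rx_ring *ring)
{
	while (ring->count)
		free(ring->bufs[--ring->count]);
}
```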
Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
Changes in v3:
- Adjusted context due to changes in Patch 1.
- No functional change intended.
drivers/net/ntb_netdev.c | 101 +++++++++++++++++++++++----------------
1 file changed, 61 insertions(+), 40 deletions(-)
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
index e4c1422d1d7a..ac39652b0488 100644
--- a/drivers/net/ntb_netdev.c
+++ b/drivers/net/ntb_netdev.c
@@ -94,26 +94,14 @@ struct ntb_netdev {
#define NTB_TX_TIMEOUT_MS 1000
#define NTB_RXQ_SIZE 100
-static void ntb_netdev_event_handler(void *data, int link_is_up)
+static void ntb_netdev_update_carrier(struct ntb_netdev *dev)
{
- struct ntb_netdev_queue *q = data;
- struct ntb_netdev *dev = q->ntdev;
struct net_device *ndev;
bool any_up = false;
unsigned int i;
ndev = dev->ndev;
- netdev_dbg(ndev, "Event %x, Link %x, qp %u\n", link_is_up,
- ntb_transport_link_query(q->qp), q->qid);
-
- if (netif_running(ndev)) {
- if (link_is_up)
- netif_wake_subqueue(ndev, q->qid);
- else
- netif_stop_subqueue(ndev, q->qid);
- }
-
for (i = 0; i < dev->num_queues; i++) {
if (ntb_transport_link_query(dev->queues[i].qp)) {
any_up = true;
@@ -127,6 +115,58 @@ static void ntb_netdev_event_handler(void *data, int link_is_up)
netif_carrier_off(ndev);
}
+static void ntb_netdev_queue_rx_drain(struct ntb_netdev_queue *queue)
+{
+ struct sk_buff *skb;
+ int len;
+
+ while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
+ dev_kfree_skb(skb);
+}
+
+static int ntb_netdev_queue_rx_fill(struct net_device *ndev,
+ struct ntb_netdev_queue *queue)
+{
+ struct sk_buff *skb;
+ int rc, i;
+
+ for (i = 0; i < NTB_RXQ_SIZE; i++) {
+ skb = netdev_alloc_skb(ndev, ndev->mtu + ETH_HLEN);
+ if (!skb)
+ return -ENOMEM;
+
+ rc = ntb_transport_rx_enqueue(queue->qp, skb, skb->data,
+ ndev->mtu + ETH_HLEN);
+ if (rc) {
+ dev_kfree_skb(skb);
+ return rc;
+ }
+ }
+
+ return 0;
+}
+
+static void ntb_netdev_event_handler(void *data, int link_is_up)
+{
+ struct ntb_netdev_queue *q = data;
+ struct ntb_netdev *dev = q->ntdev;
+ struct net_device *ndev;
+
+ ndev = dev->ndev;
+
+ netdev_dbg(ndev, "Event %x, Link %x, qp %u\n", link_is_up,
+ ntb_transport_link_query(q->qp), q->qid);
+
+ if (netif_running(ndev)) {
+ if (link_is_up)
+ netif_wake_subqueue(ndev, q->qid);
+ else
+ netif_stop_subqueue(ndev, q->qid);
+ }
+
+ ntb_netdev_update_carrier(dev);
+}
+
static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
void *data, int len)
{
@@ -298,28 +338,16 @@ static int ntb_netdev_open(struct net_device *ndev)
{
struct ntb_netdev *dev = netdev_priv(ndev);
struct ntb_netdev_queue *queue;
- struct sk_buff *skb;
- int rc = 0, i, len;
unsigned int q;
+ int rc = 0;
/* Add some empty rx bufs for each queue */
for (q = 0; q < dev->num_queues; q++) {
queue = &dev->queues[q];
- for (i = 0; i < NTB_RXQ_SIZE; i++) {
- skb = netdev_alloc_skb(ndev, ndev->mtu + ETH_HLEN);
- if (!skb) {
- rc = -ENOMEM;
- goto err;
- }
-
- rc = ntb_transport_rx_enqueue(queue->qp, skb, skb->data,
- ndev->mtu + ETH_HLEN);
- if (rc) {
- dev_kfree_skb(skb);
- goto err;
- }
- }
+ rc = ntb_netdev_queue_rx_fill(ndev, queue);
+ if (rc)
+ goto err;
timer_setup(&queue->tx_timer, ntb_netdev_tx_timer, 0);
}
@@ -335,9 +363,7 @@ static int ntb_netdev_open(struct net_device *ndev)
err:
for (q = 0; q < dev->num_queues; q++) {
queue = &dev->queues[q];
-
- while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
- dev_kfree_skb(skb);
+ ntb_netdev_queue_rx_drain(queue);
}
return rc;
}
@@ -346,9 +372,7 @@ static int ntb_netdev_close(struct net_device *ndev)
{
struct ntb_netdev *dev = netdev_priv(ndev);
struct ntb_netdev_queue *queue;
- struct sk_buff *skb;
unsigned int q;
- int len;
netif_tx_stop_all_queues(ndev);
netif_carrier_off(ndev);
@@ -357,12 +381,10 @@ static int ntb_netdev_close(struct net_device *ndev)
queue = &dev->queues[q];
ntb_transport_link_down(queue->qp);
-
- while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
- dev_kfree_skb(skb);
-
+ ntb_netdev_queue_rx_drain(queue);
timer_delete_sync(&queue->tx_timer);
}
+
return 0;
}
@@ -428,8 +450,7 @@ static int ntb_netdev_change_mtu(struct net_device *ndev, int new_mtu)
ntb_transport_link_down(queue->qp);
- while ((skb = ntb_transport_rx_remove(queue->qp, &len)))
- dev_kfree_skb(skb);
+ ntb_netdev_queue_rx_drain(queue);
}
netdev_err(ndev, "Error changing MTU, device inoperable\n");
--
2.51.0
* [PATCH v3 4/4] net: ntb_netdev: Support ethtool channels for multi-queue
From: Koichiro Den @ 2026-03-05 15:56 UTC (permalink / raw)
To: Jon Mason, Dave Jiang, Allen Hubbe, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: ntb, netdev, linux-kernel
Support dynamic queue pair addition/removal via ethtool channels.
Use the combined channel count to control the number of netdev TX/RX
queues, each corresponding to a ntb_transport queue pair.
When the number of queues is reduced, tear down and free the removed
ntb_transport queue pairs (not just deactivate them) so other
ntb_transport clients can reuse the freed resources.
When the number of queues is increased, create additional queue pairs up
to NTB_NETDEV_MAX_QUEUES (=64). The effective limit is determined by the
underlying ntb_transport implementation and NTB hardware resources (the
number of MWs), so set_channels may return -ENOSPC if no more QPs can be
allocated.
Keep the default at one queue pair to preserve the previous behavior.
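The grow-with-rollback and shrink-and-free behaviour described above can
be sketched in userspace C as follows. This is a model under stated
assumptions, not the driver code: model_transport stands in for the
shared pool of transport QPs, and model_create_qp(), model_free_qp() and
model_set_channels() are hypothetical names. RX ring, timer and link
handling are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_QUEUES 64	/* mirrors NTB_NETDEV_MAX_QUEUES */

struct model_transport {
	unsigned int free_qps;	/* QPs still available to any client */
};

struct model_dev {
	struct model_transport *tp;
	unsigned int num_queues;
	bool has_qp[MODEL_MAX_QUEUES];
};

bool model_create_qp(struct model_transport *tp)
{
	if (!tp->free_qps)
		return false;
	tp->free_qps--;
	return true;
}

void model_free_qp(struct model_transport *tp)
{
	tp->free_qps++;
}

int model_set_channels(struct model_dev *dev, unsigned int new_count)
{
	unsigned int old = dev->num_queues;
	unsigned int created = old;
	unsigned int q;

	assert(new_count >= 1 && new_count <= MODEL_MAX_QUEUES);

	for (q = old; q < new_count; q++) {	/* grow */
		if (!model_create_qp(dev->tp))
			goto rollback;
		dev->has_qp[q] = true;
		created++;
	}
	for (q = new_count; q < old; q++) {	/* shrink: really free */
		model_free_qp(dev->tp);
		dev->has_qp[q] = false;
	}
	dev->num_queues = new_count;
	return 0;

rollback:
	/* Undo only the QPs created by this call; count is unchanged. */
	while (created-- > old) {
		model_free_qp(dev->tp);
		dev->has_qp[created] = false;
	}
	return -ENOSPC;
}
```

A failed grow returns every QP it created to the pool and leaves the
queue count unchanged, so a later, smaller request can still succeed.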
Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
Changes in v3:
- Remove redundant checks already handled by ethtool core.
- Split ntb_set_channels() into helper functions.
- Drop ntb_netdev_sync_subqueues(), which did more than necessary at
some call sites. Adjust the original call sites to perform only the
required operations.
drivers/net/ntb_netdev.c | 157 +++++++++++++++++++++++++++++++++++++--
1 file changed, 151 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
index ac39652b0488..6397d898c991 100644
--- a/drivers/net/ntb_netdev.c
+++ b/drivers/net/ntb_netdev.c
@@ -284,6 +284,12 @@ static void ntb_netdev_tx_handler(struct ntb_transport_qp *qp, void *qp_data,
}
}
+static const struct ntb_queue_handlers ntb_netdev_handlers = {
+ .tx_handler = ntb_netdev_tx_handler,
+ .rx_handler = ntb_netdev_rx_handler,
+ .event_handler = ntb_netdev_event_handler,
+};
+
static netdev_tx_t ntb_netdev_start_xmit(struct sk_buff *skb,
struct net_device *ndev)
{
@@ -492,16 +498,155 @@ static int ntb_get_link_ksettings(struct net_device *dev,
return 0;
}
+static void ntb_get_channels(struct net_device *ndev,
+ struct ethtool_channels *channels)
+{
+ struct ntb_netdev *dev = netdev_priv(ndev);
+
+ channels->combined_count = dev->num_queues;
+ channels->max_combined = ndev->num_tx_queues;
+}
+
+static int ntb_inc_channels(struct net_device *ndev,
+			    unsigned int old, unsigned int new)
+{
+	struct ntb_netdev *dev = netdev_priv(ndev);
+	bool running = netif_running(ndev);
+	struct ntb_netdev_queue *queue;
+	unsigned int q, created;
+	int rc;
+
+	created = old;
+	for (q = old; q < new; q++) {
+		queue = &dev->queues[q];
+
+		queue->ntdev = dev;
+		queue->qid = q;
+		queue->qp = ntb_transport_create_queue(queue, dev->client_dev,
+						       &ntb_netdev_handlers);
+		if (!queue->qp) {
+			rc = -ENOSPC;
+			goto err_new;
+		}
+		created++;
+
+		if (!running)
+			continue;
+
+		timer_setup(&queue->tx_timer, ntb_netdev_tx_timer, 0);
+
+		rc = ntb_netdev_queue_rx_fill(ndev, queue);
+		if (rc)
+			goto err_new;
+
+		/*
+		 * Carrier may already be on due to other QPs. Keep the new
+		 * subqueue stopped until we get a Link Up event for this QP.
+		 */
+		netif_stop_subqueue(ndev, q);
+	}
+
+	rc = netif_set_real_num_queues(ndev, new, new);
+	if (rc)
+		goto err_new;
+
+	dev->num_queues = new;
+
+	if (running)
+		for (q = old; q < new; q++)
+			ntb_transport_link_up(dev->queues[q].qp);
+
+	return 0;
+
+err_new:
+	if (running) {
+		unsigned int rollback = created;
+
+		while (rollback-- > old) {
+			queue = &dev->queues[rollback];
+			ntb_transport_link_down(queue->qp);
+			ntb_netdev_queue_rx_drain(queue);
+			timer_delete_sync(&queue->tx_timer);
+		}
+	}
+	while (created-- > old) {
+		queue = &dev->queues[created];
+		ntb_transport_free_queue(queue->qp);
+		queue->qp = NULL;
+	}
+	return rc;
+}
+
+static int ntb_dec_channels(struct net_device *ndev,
+			    unsigned int old, unsigned int new)
+{
+	struct ntb_netdev *dev = netdev_priv(ndev);
+	bool running = netif_running(ndev);
+	struct ntb_netdev_queue *queue;
+	unsigned int q;
+	int rc;
+
+	if (running)
+		for (q = new; q < old; q++)
+			netif_stop_subqueue(ndev, q);
+
+	rc = netif_set_real_num_queues(ndev, new, new);
+	if (rc)
+		goto err;
+
+	/* Publish new queue count before invalidating QP pointers */
+	dev->num_queues = new;
+
+	for (q = new; q < old; q++) {
+		queue = &dev->queues[q];
+
+		if (running) {
+			ntb_transport_link_down(queue->qp);
+			ntb_netdev_queue_rx_drain(queue);
+			timer_delete_sync(&queue->tx_timer);
+		}
+
+		ntb_transport_free_queue(queue->qp);
+		queue->qp = NULL;
+	}
+
+	/*
+	 * It might be the case that the removed queues are the only queues that
+	 * were up, so see if the global carrier needs to change.
+	 */
+	ntb_netdev_update_carrier(dev);
+	return 0;
+
+err:
+	if (running) {
+		for (q = new; q < old; q++)
+			netif_wake_subqueue(ndev, q);
+	}
+	return rc;
+}
+
+static int ntb_set_channels(struct net_device *ndev,
+			    struct ethtool_channels *channels)
+{
+	struct ntb_netdev *dev = netdev_priv(ndev);
+	unsigned int new = channels->combined_count;
+	unsigned int old = dev->num_queues;
+
+	if (new == old)
+		return 0;
+
+	if (new < old)
+		return ntb_dec_channels(ndev, old, new);
+	else
+		return ntb_inc_channels(ndev, old, new);
+}
+
 static const struct ethtool_ops ntb_ethtool_ops = {
 	.get_drvinfo = ntb_get_drvinfo,
 	.get_link = ethtool_op_get_link,
 	.get_link_ksettings = ntb_get_link_ksettings,
-};
-
-static const struct ntb_queue_handlers ntb_netdev_handlers = {
-	.tx_handler = ntb_netdev_tx_handler,
-	.rx_handler = ntb_netdev_rx_handler,
-	.event_handler = ntb_netdev_event_handler,
+	.get_channels = ntb_get_channels,
+	.set_channels = ntb_set_channels,
 };
 static int ntb_netdev_probe(struct device *client_dev)
--
2.51.0
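For readers following the resize logic, here is a minimal user-space sketch of the grow-with-rollback and shrink ordering used by ntb_inc_channels() and ntb_dec_channels() above. It is not driver code: struct qp_pool, struct mq_dev, MAX_QUEUES and the pool_*/mq_* helpers are hypothetical stand-ins for the NTB transport and the netdev private data, and the -EINVAL bound check is the sketch's own (in the patch, the ethtool core is relied upon for range checks). Only the control flow is modelled: create QPs one by one, undo exactly the ones created on failure, and publish the new count before releasing QPs on shrink.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_QUEUES 8	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for the transport: a fixed number of QP slots. */
struct qp_pool {
	unsigned int capacity;	/* models the limit imposed by the MW count */
	unsigned int in_use;
};

/* Hypothetical stand-in for the per-device state. */
struct mq_dev {
	struct qp_pool *pool;
	bool has_qp[MAX_QUEUES];
	unsigned int num_queues;
};

/* Returns false when no slot is left, like a NULL QP in the driver. */
static bool pool_create_qp(struct qp_pool *pool)
{
	if (pool->in_use >= pool->capacity)
		return false;
	pool->in_use++;
	return true;
}

static void pool_free_qp(struct qp_pool *pool)
{
	pool->in_use--;
}

/* Grow from old_cnt to new_cnt; on failure, free only what was created. */
static int mq_inc(struct mq_dev *dev, unsigned int old_cnt,
		  unsigned int new_cnt)
{
	unsigned int q, created = old_cnt;

	for (q = old_cnt; q < new_cnt; q++) {
		dev->has_qp[q] = pool_create_qp(dev->pool);
		if (!dev->has_qp[q])
			goto err;
		created++;
	}
	dev->num_queues = new_cnt;
	return 0;

err:
	/* Reverse walk over [old_cnt, created), mirroring err_new above. */
	while (created-- > old_cnt) {
		pool_free_qp(dev->pool);
		dev->has_qp[created] = false;
	}
	return -ENOSPC;
}

/* Shrink: publish the smaller count first, then release the QPs. */
static int mq_dec(struct mq_dev *dev, unsigned int old_cnt,
		  unsigned int new_cnt)
{
	unsigned int q;

	dev->num_queues = new_cnt;
	for (q = new_cnt; q < old_cnt; q++) {
		pool_free_qp(dev->pool);
		dev->has_qp[q] = false;
	}
	return 0;
}

int mq_set_channels(struct mq_dev *dev, unsigned int new_cnt)
{
	unsigned int old_cnt = dev->num_queues;

	if (new_cnt > MAX_QUEUES)
		return -EINVAL;	/* sketch-only check */
	if (new_cnt == old_cnt)
		return 0;
	if (new_cnt < old_cnt)
		return mq_dec(dev, old_cnt, new_cnt);
	return mq_inc(dev, old_cnt, new_cnt);
}
```

With a pool of three slots and one active queue, a request for four queues fails after two successful creations; the sketch then returns -ENOSPC with the queue count and pool usage back at their previous values, which is the same observable behaviour the commit message describes for set_channels.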
* Re: [PATCH v3 0/4] net: ntb_netdev: Add Multi-queue support
2026-03-05 15:56 [PATCH v3 0/4] net: ntb_netdev: Add Multi-queue support Koichiro Den
` (3 preceding siblings ...)
2026-03-05 15:56 ` [PATCH v3 4/4] net: ntb_netdev: Support ethtool channels for multi-queue Koichiro Den
@ 2026-03-07 3:20 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-07 3:20 UTC (permalink / raw)
To: Koichiro Den
Cc: jdmason, dave.jiang, allenbh, andrew+netdev, davem, edumazet,
kuba, pabeni, ntb, netdev, linux-kernel
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 6 Mar 2026 00:56:35 +0900 you wrote:
> Hi,
>
> ntb_netdev currently hard-codes a single NTB transport queue pair, which
> means the datapath effectively runs as a single-queue netdev regardless
> of available CPUs / parallel flows.
>
> The longer-term motivation here is throughput scale-out: allow
> ntb_netdev to grow beyond the single-QP bottleneck and make it possible
> to spread TX/RX work across multiple queue pairs as link speeds and core
> counts keep increasing.
>
> [...]
Here is the summary with links:
- [v3,1/4] net: ntb_netdev: Introduce per-queue context
https://git.kernel.org/netdev/net-next/c/ee970634c777
- [v3,2/4] net: ntb_netdev: Gate subqueue stop/wake by transport link
https://git.kernel.org/netdev/net-next/c/304132b7a5e6
- [v3,3/4] net: ntb_netdev: Factor out multi-queue helpers
https://git.kernel.org/netdev/net-next/c/b83bf617dc84
- [v3,4/4] net: ntb_netdev: Support ethtool channels for multi-queue
https://git.kernel.org/netdev/net-next/c/24d9e73c7e00
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html