* [net-next/RFC PATCH v1 0/4] Introduce napi queues support
@ 2023-06-01 17:42 Amritha Nambiar
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues Amritha Nambiar
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Amritha Nambiar @ 2023-06-01 17:42 UTC (permalink / raw)
  To: netdev, kuba, davem; +Cc: sridhar.samudrala, amritha.nambiar

Introduce support for associating napi instances with
the corresponding RX and TX queue set. Add the capability
to export napi information supported by the device.
Extend the netdev_genl generic netlink family for netdev
with napi data. The napi fields exposed are:
- napi id
- queue/queue-set (both RX and TX) associated with each
  napi instance

Additional napi fields such as PID association for napi
thread etc. can be supported in a follow-on patch set.

This series only supports 'get' ability for retrieving
napi fields (specifically, napi ids and queue[s]). The 'set'
ability for setting queue[s] associated with a napi instance
via netdev-genl will be submitted as a separate patch series.

Previous discussion at:
https://lore.kernel.org/netdev/c8476530638a5f4381d64db0e024ed49c2db3b02.camel@gmail.com/T/#m00999652a8b4731fbdb7bf698d2e3666c65a60e7

$ ./cli.py --spec netdev.yaml  --do dev-get --json='{"ifindex": 12}'
[{'ifindex': 12,
  'xdp-features': {'xsk-zerocopy', 'basic', 'rx-sg', 'redirect'}},
 {'napi-info': [{'napi-id': 600, 'rx-queues': [7], 'tx-queues': [7]},
                {'napi-id': 599, 'rx-queues': [6], 'tx-queues': [6]},
                {'napi-id': 598, 'rx-queues': [5], 'tx-queues': [5]},
                {'napi-id': 597, 'rx-queues': [4], 'tx-queues': [4]},
                {'napi-id': 596, 'rx-queues': [3], 'tx-queues': [3]},
                {'napi-id': 595, 'rx-queues': [2], 'tx-queues': [2]},
                {'napi-id': 594, 'rx-queues': [1], 'tx-queues': [1]},
                {'napi-id': 593, 'rx-queues': [0], 'tx-queues': [0]}]}]
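
All devices can be walked with the dump variant of the same command
(a sketch, assuming the standard ynl --dump flag):

$ ./cli.py --spec netdev.yaml --dump dev-get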

---

Amritha Nambiar (4):
      net: Introduce new napi fields for rx/tx queues
      net: Add support for associating napi with queue[s]
      netdev-genl: Introduce netdev dump ctx
      netdev-genl: Add support for exposing napi info from netdev


 Documentation/netlink/specs/netdev.yaml   |   39 ++++
 drivers/net/ethernet/intel/ice/ice_lib.c  |   57 ++++++
 drivers/net/ethernet/intel/ice/ice_lib.h  |    4 
 drivers/net/ethernet/intel/ice/ice_main.c |    4 
 include/linux/netdevice.h                 |   18 ++
 include/uapi/linux/netdev.h               |    4 
 net/core/dev.c                            |   55 ++++++
 net/core/netdev-genl.c                    |  261 ++++++++++++++++++++++++-----
 tools/include/uapi/linux/netdev.h         |    4 
 9 files changed, 402 insertions(+), 44 deletions(-)

--


* [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-06-01 17:42 [net-next/RFC PATCH v1 0/4] Introduce napi queues support Amritha Nambiar
@ 2023-06-01 17:42 ` Amritha Nambiar
  2023-06-03  6:06   ` Jakub Kicinski
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s] Amritha Nambiar
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Amritha Nambiar @ 2023-06-01 17:42 UTC (permalink / raw)
  To: netdev, kuba, davem; +Cc: sridhar.samudrala, amritha.nambiar

Introduce new napi fields 'napi_rxq_list' and 'napi_txq_list'
to track the rx and tx queue set associated with the napi, and
initialize them. Handle their removal as well.

This enables a mapping of each napi instance with the
queue/queue-set on the corresponding irq line.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 include/linux/netdevice.h |    7 +++++++
 net/core/dev.c            |   21 +++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 08fbd4622ccf..49f64401af7c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -342,6 +342,11 @@ struct gro_list {
  */
 #define GRO_HASH_BUCKETS	8
 
+struct napi_queue {
+	struct list_head	q_list;
+	u16			queue_index;
+};
+
 /*
  * Structure for NAPI scheduling similar to tasklet but with weighting
  */
@@ -376,6 +381,8 @@ struct napi_struct {
 	/* control-path-only fields follow */
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
+	struct list_head	napi_rxq_list;
+	struct list_head	napi_txq_list;
 };
 
 enum {
diff --git a/net/core/dev.c b/net/core/dev.c
index 3393c2f3dbe8..9ee8eb3ef223 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6401,6 +6401,9 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
 	 */
 	if (dev->threaded && napi_kthread_create(napi))
 		dev->threaded = 0;
+
+	INIT_LIST_HEAD(&napi->napi_rxq_list);
+	INIT_LIST_HEAD(&napi->napi_txq_list);
 }
 EXPORT_SYMBOL(netif_napi_add_weight);
 
@@ -6462,6 +6465,23 @@ static void flush_gro_hash(struct napi_struct *napi)
 	}
 }
 
+static void __napi_del_queue(struct napi_queue *napi_queue)
+{
+	list_del_rcu(&napi_queue->q_list);
+	kfree(napi_queue);
+}
+
+static void napi_del_queues(struct napi_struct *napi)
+{
+	struct napi_queue *napi_queue, *n;
+
+	list_for_each_entry_safe(napi_queue, n, &napi->napi_rxq_list, q_list)
+		__napi_del_queue(napi_queue);
+
+	list_for_each_entry_safe(napi_queue, n, &napi->napi_txq_list, q_list)
+		__napi_del_queue(napi_queue);
+}
+
 /* Must be called in process context */
 void __netif_napi_del(struct napi_struct *napi)
 {
@@ -6479,6 +6499,7 @@ void __netif_napi_del(struct napi_struct *napi)
 		kthread_stop(napi->thread);
 		napi->thread = NULL;
 	}
+	napi_del_queues(napi);
 }
 EXPORT_SYMBOL(__netif_napi_del);
 



* [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s]
  2023-06-01 17:42 [net-next/RFC PATCH v1 0/4] Introduce napi queues support Amritha Nambiar
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues Amritha Nambiar
@ 2023-06-01 17:42 ` Amritha Nambiar
  2023-06-02 15:42   ` Simon Horman
  2023-06-03  6:31   ` Paolo Abeni
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 3/4] netdev-genl: Introduce netdev dump ctx Amritha Nambiar
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Amritha Nambiar @ 2023-06-01 17:42 UTC (permalink / raw)
  To: netdev, kuba, davem; +Cc: sridhar.samudrala, amritha.nambiar

After the napi context is initialized, map the napi instance
with the queue/queue-set on the corresponding irq line.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_lib.c  |   57 +++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_lib.h  |    4 ++
 drivers/net/ethernet/intel/ice/ice_main.c |    4 ++
 include/linux/netdevice.h                 |   11 ++++++
 net/core/dev.c                            |   34 +++++++++++++++++
 5 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 5ddb95d1073a..58f68363119f 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2478,6 +2478,12 @@ ice_vsi_cfg_def(struct ice_vsi *vsi, struct ice_vsi_cfg_params *params)
 			goto unroll_vector_base;
 
 		ice_vsi_map_rings_to_vectors(vsi);
+
+		/* Associate q_vector rings to napi */
+		ret = ice_vsi_add_napi_queues(vsi);
+		if (ret)
+			goto unroll_vector_base;
+
 		vsi->stat_offsets_loaded = false;
 
 		if (ice_is_xdp_ena_vsi(vsi)) {
@@ -2957,6 +2963,57 @@ void ice_vsi_dis_irq(struct ice_vsi *vsi)
 		synchronize_irq(vsi->q_vectors[i]->irq.virq);
 }
 
+/**
+ * ice_q_vector_add_napi_queues - Add queue[s] associated with the napi
+ * @q_vector: q_vector pointer
+ *
+ * Associate the q_vector napi with all the queue[s] on the vector
+ * Returns 0 on success or < 0 on error
+ */
+int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector)
+{
+	struct ice_rx_ring *rx_ring;
+	struct ice_tx_ring *tx_ring;
+	int ret;
+
+	ice_for_each_rx_ring(rx_ring, q_vector->rx) {
+		ret = netif_napi_add_queue(&q_vector->napi, rx_ring->q_index,
+					   NAPI_RX_CONTAINER);
+		if (ret)
+			return ret;
+	}
+	ice_for_each_tx_ring(tx_ring, q_vector->tx) {
+		ret = netif_napi_add_queue(&q_vector->napi, tx_ring->q_index,
+					   NAPI_TX_CONTAINER);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
+/**
+ * ice_vsi_add_napi_queues
+ * @vsi: VSI pointer
+ *
+ * Associate queue[s] with napi for all vectors
+ * Returns 0 on success or < 0 on error
+ */
+int ice_vsi_add_napi_queues(struct ice_vsi *vsi)
+{
+	int i, ret = 0;
+
+	if (!vsi->netdev)
+		return ret;
+
+	ice_for_each_q_vector(vsi, i) {
+		ret = ice_q_vector_add_napi_queues(vsi->q_vectors[i]);
+		if (ret)
+			return ret;
+	}
+	return ret;
+}
+
 /**
  * ice_napi_del - Remove NAPI handler for the VSI
  * @vsi: VSI for which NAPI handler is to be removed
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
index e985766e6bb5..623b5f738a5c 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_lib.h
@@ -93,6 +93,10 @@ void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc);
 struct ice_vsi *
 ice_vsi_setup(struct ice_pf *pf, struct ice_vsi_cfg_params *params);
 
+int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector);
+
+int ice_vsi_add_napi_queues(struct ice_vsi *vsi);
+
 void ice_napi_del(struct ice_vsi *vsi);
 
 int ice_vsi_release(struct ice_vsi *vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 62e91512aeab..c66ff1473aeb 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3348,9 +3348,11 @@ static void ice_napi_add(struct ice_vsi *vsi)
 	if (!vsi->netdev)
 		return;
 
-	ice_for_each_q_vector(vsi, v_idx)
+	ice_for_each_q_vector(vsi, v_idx) {
 		netif_napi_add(vsi->netdev, &vsi->q_vectors[v_idx]->napi,
 			       ice_napi_poll);
+		ice_q_vector_add_napi_queues(vsi->q_vectors[v_idx]);
+	}
 }
 
 /**
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 49f64401af7c..a562db712c6e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -342,6 +342,14 @@ struct gro_list {
  */
 #define GRO_HASH_BUCKETS	8
 
+/*
+ * napi queue container type
+ */
+enum napi_container_type {
+	NAPI_RX_CONTAINER,
+	NAPI_TX_CONTAINER,
+};
+
 struct napi_queue {
 	struct list_head	q_list;
 	u16			queue_index;
@@ -2622,6 +2630,9 @@ static inline void *netdev_priv(const struct net_device *dev)
  */
 #define SET_NETDEV_DEVTYPE(net, devtype)	((net)->dev.type = (devtype))
 
+int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
+			 enum napi_container_type);
+
 /* Default NAPI poll() weight
  * Device drivers are strongly advised to not use bigger value
  */
diff --git a/net/core/dev.c b/net/core/dev.c
index 9ee8eb3ef223..ba712119ec85 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6366,6 +6366,40 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
 }
 EXPORT_SYMBOL(dev_set_threaded);
 
+/**
+ * netif_napi_add_queue - Associate queue with the napi
+ * @napi: NAPI context
+ * @queue_index: Index of queue
+ * @napi_container_type: queue type as RX or TX
+ *
+ * Add queue with its corresponding napi context
+ */
+int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
+			 enum napi_container_type type)
+{
+	struct napi_queue *napi_queue;
+
+	napi_queue = kzalloc(sizeof(*napi_queue), GFP_KERNEL);
+	if (!napi_queue)
+		return -ENOMEM;
+
+	napi_queue->queue_index = queue_index;
+
+	switch (type) {
+	case NAPI_RX_CONTAINER:
+		list_add_rcu(&napi_queue->q_list, &napi->napi_rxq_list);
+		break;
+	case NAPI_TX_CONTAINER:
+		list_add_rcu(&napi_queue->q_list, &napi->napi_txq_list);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(netif_napi_add_queue);
+
 void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
 			   int (*poll)(struct napi_struct *, int), int weight)
 {



* [net-next/RFC PATCH v1 3/4] netdev-genl: Introduce netdev dump ctx
  2023-06-01 17:42 [net-next/RFC PATCH v1 0/4] Introduce napi queues support Amritha Nambiar
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues Amritha Nambiar
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s] Amritha Nambiar
@ 2023-06-01 17:42 ` Amritha Nambiar
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev Amritha Nambiar
  2023-06-03  6:00 ` [net-next/RFC PATCH v1 0/4] Introduce napi queues support Jakub Kicinski
  4 siblings, 0 replies; 29+ messages in thread
From: Amritha Nambiar @ 2023-06-01 17:42 UTC (permalink / raw)
  To: netdev, kuba, davem; +Cc: sridhar.samudrala, amritha.nambiar

A comment in the definition of struct netlink_callback states
"args is deprecated. Cast a struct over ctx instead for proper
type safety." Introduce netdev_nl_dump_ctx structure and replace
'args' with netdev_nl_dump_ctx fields.


Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 net/core/netdev-genl.c |   22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index a4270fafdf11..8d6a840821c7 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -8,6 +8,19 @@
 
 #include "netdev-genl-gen.h"
 
+struct netdev_nl_dump_ctx {
+	int dev_entry_hash;
+	int dev_entry_idx;
+};
+
+static inline struct netdev_nl_dump_ctx *
+netdev_dump_ctx(struct netlink_callback *cb)
+{
+	NL_ASSERT_DUMP_CTX_FITS(struct netdev_nl_dump_ctx);
+
+	return (struct netdev_nl_dump_ctx *)cb->ctx;
+}
+
 static int
 netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp,
 		   u32 portid, u32 seq, int flags, u32 cmd)
@@ -91,14 +104,15 @@ int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
 
 int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 {
+	struct netdev_nl_dump_ctx *ctx = netdev_dump_ctx(cb);
 	struct net *net = sock_net(skb->sk);
 	struct net_device *netdev;
 	int idx = 0, s_idx;
 	int h, s_h;
 	int err;
 
-	s_h = cb->args[0];
-	s_idx = cb->args[1];
+	s_h = ctx->dev_entry_hash;
+	s_idx = ctx->dev_entry_idx;
 
 	rtnl_lock();
 
@@ -126,8 +140,8 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	if (err != -EMSGSIZE)
 		return err;
 
-	cb->args[1] = idx;
-	cb->args[0] = h;
+	ctx->dev_entry_idx = idx;
+	ctx->dev_entry_hash = h;
 	cb->seq = net->dev_base_seq;
 
 	return skb->len;



* [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-01 17:42 [net-next/RFC PATCH v1 0/4] Introduce napi queues support Amritha Nambiar
                   ` (2 preceding siblings ...)
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 3/4] netdev-genl: Introduce netdev dump ctx Amritha Nambiar
@ 2023-06-01 17:42 ` Amritha Nambiar
  2023-06-02 15:47   ` Simon Horman
  2023-06-03  6:17   ` Jakub Kicinski
  2023-06-03  6:00 ` [net-next/RFC PATCH v1 0/4] Introduce napi queues support Jakub Kicinski
  4 siblings, 2 replies; 29+ messages in thread
From: Amritha Nambiar @ 2023-06-01 17:42 UTC (permalink / raw)
  To: netdev, kuba, davem; +Cc: sridhar.samudrala, amritha.nambiar

Add support in ynl/netdev.yaml for napi related information. The
netdev structure tracks all the napi instances and napi fields.
The napi instances and associated queue[s] can be retrieved this way.

Refactored netdev-genl to support exposing napi<->queue[s] mapping
that is retained in a netdev.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 Documentation/netlink/specs/netdev.yaml |   39 +++++
 include/uapi/linux/netdev.h             |    4 +
 net/core/netdev-genl.c                  |  239 ++++++++++++++++++++++++++-----
 tools/include/uapi/linux/netdev.h       |    4 +
 4 files changed, 247 insertions(+), 39 deletions(-)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index b99e7ffef7a1..8d0edb529563 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -62,6 +62,44 @@ attribute-sets:
         type: u64
         enum: xdp-act
         enum-as-flags: true
+      -
+        name: napi-info
+        doc: napi information such as napi-id, napi queues etc.
+        type: nest
+        multi-attr: true
+        nested-attributes: dev-napi-info
+      -
+        name: napi-id
+        doc: napi id
+        type: u32
+      -
+        name: rx-queues
+        doc: list of rx queues associated with a napi
+        type: u16
+        multi-attr: true
+      -
+        name: tx-queues
+        doc: list of tx queues associated with a napi
+        type: u16
+        multi-attr: true
+  -
+    name: dev-napi-info
+    subset-of: dev
+    attributes:
+      -
+        name: napi-id
+        doc: napi id
+        type: u32
+      -
+        name: rx-queues
+        doc: list of rx queues associated with a napi
+        type: u16
+        multi-attr: true
+      -
+        name: tx-queues
+        doc: list of tx queues associated with a napi
+        type: u16
+        multi-attr: true
 
 operations:
   list:
@@ -77,6 +115,7 @@ operations:
           attributes:
             - ifindex
             - xdp-features
+            - napi-info
       dump:
         reply: *dev-all
     -
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 639524b59930..16538fb1406a 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -41,6 +41,10 @@ enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_NAPI_INFO,
+	NETDEV_A_DEV_NAPI_ID,
+	NETDEV_A_DEV_RX_QUEUES,
+	NETDEV_A_DEV_TX_QUEUES,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 8d6a840821c7..fdaa67f53b22 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -11,6 +11,7 @@
 struct netdev_nl_dump_ctx {
 	int dev_entry_hash;
 	int dev_entry_idx;
+	int napi_idx;
 };
 
 static inline struct netdev_nl_dump_ctx *
@@ -21,54 +22,187 @@ netdev_dump_ctx(struct netlink_callback *cb)
 	return (struct netdev_nl_dump_ctx *)cb->ctx;
 }
 
+enum netdev_nl_type {
+	NETDEV_NL_DO,
+	NETDEV_NL_NOTIFY,
+};
+
+static int netdev_nl_send_func(struct net_device *netdev, struct sk_buff *skb,
+			       u32 portid, enum netdev_nl_type type)
+{
+	switch (type) {
+	case NETDEV_NL_DO:
+		return genlmsg_unicast(dev_net(netdev), skb, portid);
+	case NETDEV_NL_NOTIFY:
+		return genlmsg_multicast_netns(&netdev_nl_family,
+					       dev_net(netdev), skb, 0,
+					       NETDEV_NLGRP_MGMT, GFP_KERNEL);
+	default:
+		return -EINVAL;
+	}
+}
+
 static int
-netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp,
-		   u32 portid, u32 seq, int flags, u32 cmd)
+netdev_nl_dev_napi_fill_one(struct sk_buff *msg, struct napi_struct *napi)
 {
-	void *hdr;
+	struct nlattr *napi_info;
+	struct napi_queue *q, *n;
 
-	hdr = genlmsg_put(rsp, portid, seq, &netdev_nl_family, flags, cmd);
-	if (!hdr)
+	napi_info = nla_nest_start(msg, NETDEV_A_DEV_NAPI_INFO);
+	if (!napi_info)
 		return -EMSGSIZE;
 
+	if (nla_put_u32(msg, NETDEV_A_DEV_NAPI_ID, napi->napi_id))
+		goto nla_put_failure;
+
+	list_for_each_entry_safe(q, n, &napi->napi_rxq_list, q_list) {
+		if (nla_put_u16(msg, NETDEV_A_DEV_RX_QUEUES, q->queue_index))
+			goto nla_put_failure;
+	}
+
+	list_for_each_entry_safe(q, n, &napi->napi_txq_list, q_list) {
+		if (nla_put_u16(msg, NETDEV_A_DEV_TX_QUEUES, q->queue_index))
+			goto nla_put_failure;
+	}
+	nla_nest_end(msg, napi_info);
+	return 0;
+nla_put_failure:
+	nla_nest_cancel(msg, napi_info);
+	return -EMSGSIZE;
+}
+
+static int
+netdev_nl_dev_napi_fill(struct net_device *netdev, struct sk_buff *msg, int *start)
+{
+	struct napi_struct *napi, *n;
+	int i = 0;
+
+	list_for_each_entry_safe(napi, n, &netdev->napi_list, dev_list) {
+		if (i < *start) {
+			i++;
+			continue;
+		}
+		if (netdev_nl_dev_napi_fill_one(msg, napi))
+			return -EMSGSIZE;
+		*start = ++i;
+	}
+	return 0;
+}
+
+static int
+netdev_nl_dev_napi_prepare_fill(struct net_device *netdev,
+				struct sk_buff **pskb, u32 portid, u32 seq,
+				int flags, u32 cmd, enum netdev_nl_type type)
+{
+	struct nlmsghdr *nlh;
+	struct sk_buff *skb = *pskb;
+	bool last = false;
+	int index = 0;
+	void *hdr;
+	int err;
+
+	while (!last) {
+		int tmp_index = index;
+
+		skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+		if (!skb)
+			return -ENOMEM;
+
+		hdr = genlmsg_put(skb, portid, seq, &netdev_nl_family,
+				  flags | NLM_F_MULTI, cmd);
+		if (!hdr) {
+			err = -EMSGSIZE;
+			goto nla_put_failure;
+		}
+		err = netdev_nl_dev_napi_fill(netdev, skb, &index);
+		if (!err)
+			last = true;
+		else if (err != -EMSGSIZE || tmp_index == index)
+			goto nla_put_failure;
+
+		genlmsg_end(skb, hdr);
+		err = netdev_nl_send_func(netdev, skb, portid, type);
+		if (err)
+			return err;
+	}
+
+	skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+	nlh = nlmsg_put(skb, portid, seq, NLMSG_DONE, 0, flags | NLM_F_MULTI);
+	if (!nlh) {
+		err = -EMSGSIZE;
+		goto nla_put_failure;
+	}
+
+	return netdev_nl_send_func(netdev, skb, portid, type);
+
+nla_put_failure:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int
+netdev_nl_dev_info_fill(struct net_device *netdev, struct sk_buff *rsp)
+{
 	if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) ||
 	    nla_put_u64_64bit(rsp, NETDEV_A_DEV_XDP_FEATURES,
-			      netdev->xdp_features, NETDEV_A_DEV_PAD)) {
-		genlmsg_cancel(rsp, hdr);
-		return -EINVAL;
+			      netdev->xdp_features, NETDEV_A_DEV_PAD))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int
+netdev_nl_dev_fill(struct net_device *netdev, u32 portid, u32 seq, int flags,
+		   u32 cmd, enum netdev_nl_type type)
+{
+	struct sk_buff *skb;
+	void *hdr;
+	int err;
+
+	skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(skb, portid, seq, &netdev_nl_family, flags, cmd);
+	if (!hdr) {
+		err = -EMSGSIZE;
+		goto err_free_msg;
+	}
+	err = netdev_nl_dev_info_fill(netdev, skb);
+	if (err) {
+		genlmsg_cancel(skb, hdr);
+		goto err_free_msg;
 	}
 
-	genlmsg_end(rsp, hdr);
+	genlmsg_end(skb, hdr);
 
-	return 0;
+	err = netdev_nl_send_func(netdev, skb, portid, type);
+	if (err)
+		return err;
+
+	return netdev_nl_dev_napi_prepare_fill(netdev, &skb, portid, seq, flags,
+					       cmd, type);
+
+err_free_msg:
+	nlmsg_free(skb);
+	return err;
 }
 
 static void
 netdev_genl_dev_notify(struct net_device *netdev, int cmd)
 {
-	struct sk_buff *ntf;
-
 	if (!genl_has_listeners(&netdev_nl_family, dev_net(netdev),
 				NETDEV_NLGRP_MGMT))
 		return;
 
-	ntf = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
-	if (!ntf)
-		return;
-
-	if (netdev_nl_dev_fill(netdev, ntf, 0, 0, 0, cmd)) {
-		nlmsg_free(ntf);
-		return;
-	}
+	netdev_nl_dev_fill(netdev, 0, 0, 0, cmd, NETDEV_NL_NOTIFY);
 
-	genlmsg_multicast_netns(&netdev_nl_family, dev_net(netdev), ntf,
-				0, NETDEV_NLGRP_MGMT, GFP_KERNEL);
 }
 
 int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
 {
 	struct net_device *netdev;
-	struct sk_buff *rsp;
 	u32 ifindex;
 	int err;
 
@@ -77,29 +211,53 @@ int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
 
 	ifindex = nla_get_u32(info->attrs[NETDEV_A_DEV_IFINDEX]);
 
-	rsp = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
-	if (!rsp)
-		return -ENOMEM;
-
 	rtnl_lock();
 
 	netdev = __dev_get_by_index(genl_info_net(info), ifindex);
 	if (netdev)
-		err = netdev_nl_dev_fill(netdev, rsp, info->snd_portid,
-					 info->snd_seq, 0, info->genlhdr->cmd);
+		err = netdev_nl_dev_fill(netdev, info->snd_portid,
+					 info->snd_seq, 0, info->genlhdr->cmd,
+					 NETDEV_NL_DO);
 	else
 		err = -ENODEV;
 
 	rtnl_unlock();
-
 	if (err)
-		goto err_free_msg;
+		return err;
 
-	return genlmsg_reply(rsp, info);
+	return 0;
+}
+
+static int
+netdev_nl_dev_dump_entry(struct net_device *netdev, struct sk_buff *rsp,
+			 struct netlink_callback *cb, int *start)
+{
+	int index = *start;
+	int tmp_index = index;
+	void *hdr;
+	int err;
+
+	hdr = genlmsg_put(rsp, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+			  &netdev_nl_family, NLM_F_MULTI, NETDEV_CMD_DEV_GET);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (netdev_nl_dev_info_fill(netdev, rsp))
+		goto nla_put_failure;
+
+	err = netdev_nl_dev_napi_fill(netdev, rsp, &index);
+	if (err) {
+		if (err != -EMSGSIZE || tmp_index == index)
+			goto nla_put_failure;
+	}
+	*start = index;
+	genlmsg_end(rsp, hdr);
 
-err_free_msg:
-	nlmsg_free(rsp);
 	return err;
+
+nla_put_failure:
+	genlmsg_cancel(rsp, hdr);
+	return -EINVAL;
 }
 
 int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
@@ -107,12 +265,13 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	struct netdev_nl_dump_ctx *ctx = netdev_dump_ctx(cb);
 	struct net *net = sock_net(skb->sk);
 	struct net_device *netdev;
-	int idx = 0, s_idx;
+	int idx = 0, s_idx, n_idx;
 	int h, s_h;
 	int err;
 
 	s_h = ctx->dev_entry_hash;
 	s_idx = ctx->dev_entry_idx;
+	n_idx = ctx->napi_idx;
 
 	rtnl_lock();
 
@@ -124,10 +283,10 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 		hlist_for_each_entry(netdev, head, index_hlist) {
 			if (idx < s_idx)
 				goto cont;
-			err = netdev_nl_dev_fill(netdev, skb,
-						 NETLINK_CB(cb->skb).portid,
-						 cb->nlh->nlmsg_seq, 0,
-						 NETDEV_CMD_DEV_GET);
+			err = netdev_nl_dev_dump_entry(netdev, skb, cb, &n_idx);
+			if (err == -EMSGSIZE)
+				goto out;
+			n_idx = 0;
 			if (err < 0)
 				break;
 cont:
@@ -135,6 +294,7 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 		}
 	}
 
+out:
 	rtnl_unlock();
 
 	if (err != -EMSGSIZE)
@@ -142,6 +302,7 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 
 	ctx->dev_entry_idx = idx;
 	ctx->dev_entry_hash = h;
+	ctx->napi_idx = n_idx;
 	cb->seq = net->dev_base_seq;
 
 	return skb->len;
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 639524b59930..16538fb1406a 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -41,6 +41,10 @@ enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_NAPI_INFO,
+	NETDEV_A_DEV_NAPI_ID,
+	NETDEV_A_DEV_RX_QUEUES,
+	NETDEV_A_DEV_TX_QUEUES,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)



* Re: [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s]
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s] Amritha Nambiar
@ 2023-06-02 15:42   ` Simon Horman
  2023-07-12 19:53     ` Nambiar, Amritha
  2023-06-03  6:31   ` Paolo Abeni
  1 sibling, 1 reply; 29+ messages in thread
From: Simon Horman @ 2023-06-02 15:42 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: netdev, kuba, davem, sridhar.samudrala

On Thu, Jun 01, 2023 at 10:42:30AM -0700, Amritha Nambiar wrote:
> After the napi context is initialized, map the napi instance
> with the queue/queue-set on the corresponding irq line.
> 
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>

Hi Amritha,

some minor feedback from my side.

...

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9ee8eb3ef223..ba712119ec85 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6366,6 +6366,40 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
>  }
>  EXPORT_SYMBOL(dev_set_threaded);
>  
> +/**
> + * netif_napi_add_queue - Associate queue with the napi
> + * @napi: NAPI context
> + * @queue_index: Index of queue
> + * @napi_container_type: queue type as RX or TX

s/@napi_container_type:/@type:/

> + *
> + * Add queue with its corresponding napi context
> + */
> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
> +			 enum napi_container_type type)
> +{
> +	struct napi_queue *napi_queue;
> +
> +	napi_queue = kzalloc(sizeof(*napi_queue), GFP_KERNEL);
> +	if (!napi_queue)
> +		return -ENOMEM;
> +
> +	napi_queue->queue_index = queue_index;
> +
> +	switch (type) {
> +	case NAPI_RX_CONTAINER:
> +		list_add_rcu(&napi_queue->q_list, &napi->napi_rxq_list);
> +		break;
> +	case NAPI_TX_CONTAINER:
> +		list_add_rcu(&napi_queue->q_list, &napi->napi_txq_list);
> +		break;
> +	default:

Perhaps napi_queue is leaked here.
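
E.g., something like this (untested sketch) would avoid the leak:

	default:
		kfree(napi_queue);	/* free the entry allocated above */
		return -EINVAL;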

> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(netif_napi_add_queue);
> +
>  void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
>  			   int (*poll)(struct napi_struct *, int), int weight)
>  {
> 
> 


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev Amritha Nambiar
@ 2023-06-02 15:47   ` Simon Horman
  2023-06-03  6:08     ` Jakub Kicinski
  2023-07-12 19:54     ` Nambiar, Amritha
  2023-06-03  6:17   ` Jakub Kicinski
  1 sibling, 2 replies; 29+ messages in thread
From: Simon Horman @ 2023-06-02 15:47 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: netdev, kuba, davem, sridhar.samudrala

On Thu, Jun 01, 2023 at 10:42:41AM -0700, Amritha Nambiar wrote:
> Add support in ynl/netdev.yaml for napi related information. The
> netdev structure tracks all the napi instances and napi fields.
> The napi instances and associated queue[s] can be retrieved this way.
> 
> Refactored netdev-genl to support exposing napi<->queue[s] mapping
> that is retained in a netdev.

Hi Amritha,

This feels like it should be two patches to me.
Though it is not something I feel strongly about.

> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>

...

> +static int
> +netdev_nl_dev_napi_prepare_fill(struct net_device *netdev,
> +				struct sk_buff **pskb, u32 portid, u32 seq,
> +				int flags, u32 cmd, enum netdev_nl_type type)
> +{
> +	struct nlmsghdr *nlh;
> +	struct sk_buff *skb = *pskb;
> +	bool last = false;
> +	int index = 0;
> +	void *hdr;
> +	int err;
> +

nit: please use reverse xmas tree - longest line to shortest - for
     local variable declarations in (new) Networking code.
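
     i.e., for the declarations quoted above, something like:

	struct sk_buff *skb = *pskb;
	struct nlmsghdr *nlh;
	bool last = false;
	int index = 0;
	void *hdr;
	int err;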

...


* Re: [net-next/RFC PATCH v1 0/4] Introduce napi queues support
  2023-06-01 17:42 [net-next/RFC PATCH v1 0/4] Introduce napi queues support Amritha Nambiar
                   ` (3 preceding siblings ...)
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev Amritha Nambiar
@ 2023-06-03  6:00 ` Jakub Kicinski
  2023-07-12 19:52   ` Nambiar, Amritha
  4 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-06-03  6:00 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: netdev, davem, sridhar.samudrala

On Thu, 01 Jun 2023 10:42:20 -0700 Amritha Nambiar wrote:
> Introduce support for associating napi instances with
> the corresponding RX and TX queue set. Add the capability
> to export napi information supported by the device.
> Extend the netdev_genl generic netlink family for netdev
> with napi data. The napi fields exposed are:
> - napi id
> - queue/queue-set (both RX and TX) associated with each
>   napi instance

Would you mind throwing in the IRQ vector number already?
Should be pretty easy to find the IRQ from NAPI, and it'd
make this code immediately very useful for IRQ pinning.
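
A sketch of what I have in mind (NETDEV_A_DEV_NAPI_IRQ and napi->irq
are hypothetical here; the driver would record the irq on the napi at
registration time):

	/* hypothetical attribute and napi field, for illustration only */
	if (nla_put_u32(msg, NETDEV_A_DEV_NAPI_IRQ, napi->irq))
		goto nla_put_failure;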

> Additional napi fields such as PID association for napi
> thread etc. can be supported in a follow-on patch set.
> 
> This series only supports 'get' ability for retrieving
> napi fields (specifically, napi ids and queue[s]). The 'set'
> ability for setting queue[s] associated with a napi instance
> via netdev-genl will be submitted as a separate patch series.


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues Amritha Nambiar
@ 2023-06-03  6:06   ` Jakub Kicinski
  2023-07-12 20:09     ` Nambiar, Amritha
  0 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-06-03  6:06 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: netdev, davem, sridhar.samudrala

On Thu, 01 Jun 2023 10:42:25 -0700 Amritha Nambiar wrote:
> Introduce new napi fields 'napi_rxq_list' and 'napi_txq_list'
> to track the rx and tx queue set associated with the napi, and
> initialize them. Handle their removal as well.
> 
> This enables a mapping of each napi instance with the
> queue/queue-set on the corresponding irq line.

Wouldn't it be easier to store the NAPI instance pointer in the queue?
That way we don't have to allocate memory.
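
Roughly (hypothetical sketch, not what this series does):

	/* e.g. grow struct netdev_rx_queue with a back-pointer ... */
	struct napi_struct	*napi;

	/* ... then the driver sets it per queue, no allocation needed */
	dev->_rx[rx_ring->q_index].napi = &q_vector->napi;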


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-02 15:47   ` Simon Horman
@ 2023-06-03  6:08     ` Jakub Kicinski
  2023-07-12 20:05       ` Nambiar, Amritha
  2023-07-12 19:54     ` Nambiar, Amritha
  1 sibling, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-06-03  6:08 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: Simon Horman, netdev, davem, sridhar.samudrala

On Fri, 2 Jun 2023 17:47:17 +0200 Simon Horman wrote:
> This feels like it should be two patches to me.
> Though it is not something I feel strongly about.

+1 I'd put the YAML and what's generated from it in one patch, 
and the hand-written code in another.


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev Amritha Nambiar
  2023-06-02 15:47   ` Simon Horman
@ 2023-06-03  6:17   ` Jakub Kicinski
  2023-07-12 20:10     ` Nambiar, Amritha
  1 sibling, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-06-03  6:17 UTC (permalink / raw)
  To: Amritha Nambiar; +Cc: netdev, davem, sridhar.samudrala

On Thu, 01 Jun 2023 10:42:41 -0700 Amritha Nambiar wrote:
> Add support in ynl/netdev.yaml for napi related information. The
> netdev structure tracks all the napi instances and napi fields.
> The napi instances and associated queue[s] can be retrieved this way.
> 
> Refactored netdev-genl to support exposing napi<->queue[s] mapping
> that is retained in a netdev.
> 
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> ---
>  Documentation/netlink/specs/netdev.yaml |   39 +++++
>  include/uapi/linux/netdev.h             |    4 +
>  net/core/netdev-genl.c                  |  239 ++++++++++++++++++++++++++-----
>  tools/include/uapi/linux/netdev.h       |    4 +
>  4 files changed, 247 insertions(+), 39 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
> index b99e7ffef7a1..8d0edb529563 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -62,6 +62,44 @@ attribute-sets:
>          type: u64
>          enum: xdp-act
>          enum-as-flags: true
> +      -
> +        name: napi-info
> +        doc: napi information such as napi-id, napi queues etc.
> +        type: nest
> +        multi-attr: true

Let's make a new attr space for the napi info command.
We don't reuse much of the attributes, and as the commands
grow, stuffing all attrs into one space makes finding stuff
harder.

> +        nested-attributes: dev-napi-info

And what's inside this nest should also be a separate attr space.


> +      -
> +        name: napi-id
> +        doc: napi id
> +        type: u32
> +      -
> +        name: rx-queues
> +        doc: list of rx queues associated with a napi
> +        type: u16

Make it u32; at the uAPI level we're tied to the width of fields, and
u16 ends up being the same size as u32 "on the wire" due to padding.

> +        multi-attr: true
> +      -
> +        name: tx-queues
> +        doc: list of tx queues associated with a napi
> +        type: u16
> +        multi-attr: true


> +  -
> +    name: dev-napi-info
> +    subset-of: dev

Yeah, this shouldn't be a subset just a full-on separate attr space.
The handshake family may be a good example to look at, it's the biggest
so far written with the new rules in mind. 

> +    attributes:
> +      -
> +        name: napi-id
> +        doc: napi id
> +        type: u32
> +      -
> +        name: rx-queues
> +        doc: list of rx queues associated with a napi
> +        type: u16
> +        multi-attr: true
> +      -
> +        name: tx-queues
> +        doc: list of tx queues associated with a napi
> +        type: u16
> +        multi-attr: true
>  
>  operations:
>    list:
> @@ -77,6 +115,7 @@ operations:
>            attributes:
>              - ifindex
>              - xdp-features
> +            - napi-info

Aaah, separate command, please. Let's not stuff all the information
into a single command like we did for rtnl.
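
Putting the above together, roughly (hypothetical names, loosely
following the handshake family):

  attribute-sets:
    -
      name: napi
      attributes:
        -
          name: id
          doc: napi id
          type: u32
        -
          name: rx-queues
          doc: rx queues driven by this napi instance
          type: u32
          multi-attr: true
        -
          name: tx-queues
          doc: tx queues driven by this napi instance
          type: u32
          multi-attr: true

  operations:
    list:
      -
        name: napi-get
        doc: Get napi instances and the queues they service.
        attribute-set: napi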

>        dump:
>          reply: *dev-all
>      -


* Re: [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s]
  2023-06-01 17:42 ` [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s] Amritha Nambiar
  2023-06-02 15:42   ` Simon Horman
@ 2023-06-03  6:31   ` Paolo Abeni
  2023-07-12 19:56     ` Nambiar, Amritha
  1 sibling, 1 reply; 29+ messages in thread
From: Paolo Abeni @ 2023-06-03  6:31 UTC (permalink / raw)
  To: Amritha Nambiar, netdev, kuba, davem; +Cc: sridhar.samudrala

On Thu, 2023-06-01 at 10:42 -0700, Amritha Nambiar wrote:
> After the napi context is initialized, map the napi instance
> with the queue/queue-set on the corresponding irq line.
> 
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_lib.c  |   57 +++++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_lib.h  |    4 ++
>  drivers/net/ethernet/intel/ice/ice_main.c |    4 ++
>  include/linux/netdevice.h                 |   11 ++++++
>  net/core/dev.c                            |   34 +++++++++++++++++
>  5 files changed, 109 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> index 5ddb95d1073a..58f68363119f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> @@ -2478,6 +2478,12 @@ ice_vsi_cfg_def(struct ice_vsi *vsi, struct ice_vsi_cfg_params *params)
>  			goto unroll_vector_base;
>  
>  		ice_vsi_map_rings_to_vectors(vsi);
> +
> +		/* Associate q_vector rings to napi */
> +		ret = ice_vsi_add_napi_queues(vsi);
> +		if (ret)
> +			goto unroll_vector_base;
> +
>  		vsi->stat_offsets_loaded = false;
>  
>  		if (ice_is_xdp_ena_vsi(vsi)) {
> @@ -2957,6 +2963,57 @@ void ice_vsi_dis_irq(struct ice_vsi *vsi)
>  		synchronize_irq(vsi->q_vectors[i]->irq.virq);
>  }
>  
> +/**
> + * ice_q_vector_add_napi_queues - Add queue[s] associated with the napi
> + * @q_vector: q_vector pointer
> + *
> + * Associate the q_vector napi with all the queue[s] on the vector
> + * Returns 0 on success or < 0 on error
> + */
> +int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector)
> +{
> +	struct ice_rx_ring *rx_ring;
> +	struct ice_tx_ring *tx_ring;
> +	int ret;
> +
> +	ice_for_each_rx_ring(rx_ring, q_vector->rx) {
> +		ret = netif_napi_add_queue(&q_vector->napi, rx_ring->q_index,
> +					   NAPI_RX_CONTAINER);
> +		if (ret)
> +			return ret;
> +	}
> +	ice_for_each_tx_ring(tx_ring, q_vector->tx) {
> +		ret = netif_napi_add_queue(&q_vector->napi, tx_ring->q_index,
> +					   NAPI_TX_CONTAINER);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * ice_vsi_add_napi_queues
> + * @vsi: VSI pointer
> + *
> + * Associate queue[s] with napi for all vectors
> + * Returns 0 on success or < 0 on error
> + */
> +int ice_vsi_add_napi_queues(struct ice_vsi *vsi)
> +{
> +	int i, ret = 0;
> +
> +	if (!vsi->netdev)
> +		return ret;
> +
> +	ice_for_each_q_vector(vsi, i) {
> +		ret = ice_q_vector_add_napi_queues(vsi->q_vectors[i]);
> +		if (ret)
> +			return ret;
> +	}
> +	return ret;
> +}
> +
>  /**
>   * ice_napi_del - Remove NAPI handler for the VSI
>   * @vsi: VSI for which NAPI handler is to be removed
> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
> index e985766e6bb5..623b5f738a5c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.h
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.h
> @@ -93,6 +93,10 @@ void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc);
>  struct ice_vsi *
>  ice_vsi_setup(struct ice_pf *pf, struct ice_vsi_cfg_params *params);
>  
> +int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector);
> +
> +int ice_vsi_add_napi_queues(struct ice_vsi *vsi);
> +
>  void ice_napi_del(struct ice_vsi *vsi);
>  
>  int ice_vsi_release(struct ice_vsi *vsi);
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 62e91512aeab..c66ff1473aeb 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -3348,9 +3348,11 @@ static void ice_napi_add(struct ice_vsi *vsi)
>  	if (!vsi->netdev)
>  		return;
>  
> -	ice_for_each_q_vector(vsi, v_idx)
> +	ice_for_each_q_vector(vsi, v_idx) {
>  		netif_napi_add(vsi->netdev, &vsi->q_vectors[v_idx]->napi,
>  			       ice_napi_poll);
> +		ice_q_vector_add_napi_queues(vsi->q_vectors[v_idx]);
> +	}
>  }
>  
>  /**
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 49f64401af7c..a562db712c6e 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -342,6 +342,14 @@ struct gro_list {
>   */
>  #define GRO_HASH_BUCKETS	8
>  
> +/*
> + * napi queue container type
> + */
> +enum napi_container_type {
> +	NAPI_RX_CONTAINER,
> +	NAPI_TX_CONTAINER,
> +};
> +
>  struct napi_queue {
>  	struct list_head	q_list;
>  	u16			queue_index;
> @@ -2622,6 +2630,9 @@ static inline void *netdev_priv(const struct net_device *dev)
>   */
>  #define SET_NETDEV_DEVTYPE(net, devtype)	((net)->dev.type = (devtype))
>  
> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
> +			 enum napi_container_type);
> +
>  /* Default NAPI poll() weight
>   * Device drivers are strongly advised to not use bigger value
>   */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9ee8eb3ef223..ba712119ec85 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6366,6 +6366,40 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
>  }
>  EXPORT_SYMBOL(dev_set_threaded);
>  
> +/**
> + * netif_napi_add_queue - Associate queue with the napi
> + * @napi: NAPI context
> + * @queue_index: Index of queue
> + * @napi_container_type: queue type as RX or TX
> + *
> + * Add queue with its corresponding napi context
> + */
> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
> +			 enum napi_container_type type)
> +{
> +	struct napi_queue *napi_queue;
> +
> +	napi_queue = kzalloc(sizeof(*napi_queue), GFP_KERNEL);
> +	if (!napi_queue)
> +		return -ENOMEM;
> +
> +	napi_queue->queue_index = queue_index;
> +
> +	switch (type) {
> +	case NAPI_RX_CONTAINER:
> +		list_add_rcu(&napi_queue->q_list, &napi->napi_rxq_list);
> +		break;
> +	case NAPI_TX_CONTAINER:
> +		list_add_rcu(&napi_queue->q_list, &napi->napi_txq_list);
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(netif_napi_add_queue);
> +
>  void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
>  			   int (*poll)(struct napi_struct *, int), int weight)
>  {
> 

I think these last 2 chunks are a better fit for the previous patch, so
that this one contains only driver-related changes.

Also, it looks like the napi-queue APIs are going to grow a bit. Perhaps
it would be useful to move all that new code into a separate file? dev.c
is already pretty big.

Thanks!

Paolo



* Re: [net-next/RFC PATCH v1 0/4] Introduce napi queues support
  2023-06-03  6:00 ` [net-next/RFC PATCH v1 0/4] Introduce napi queues support Jakub Kicinski
@ 2023-07-12 19:52   ` Nambiar, Amritha
  0 siblings, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 19:52 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 6/2/2023 11:00 PM, Jakub Kicinski wrote:
> On Thu, 01 Jun 2023 10:42:20 -0700 Amritha Nambiar wrote:
>> Introduce support for associating napi instances with
>> the corresponding RX and TX queue set. Add the capability
>> to export napi information supported by the device.
>> Extend the netdev_genl generic netlink family for netdev
>> with napi data. The napi fields exposed are:
>> - napi id
>> - queue/queue-set (both RX and TX) associated with each
>>    napi instance
> 
> Would you mind throwing in the IRQ vector number already?
> Should be pretty easy to find the IRQ from NAPI, and it'd
> make this code immediately very useful for IRQ pinning.
> 

Sorry for the delayed response; I was on vacation.
Sure, I can add the IRQ vector number as well in the next version of
the series.

>> Additional napi fields such as PID association for napi
>> thread etc. can be supported in a follow-on patch set.
>>
>> This series only supports 'get' ability for retrieving
>> napi fields (specifically, napi ids and queue[s]). The 'set'
>> ability for setting queue[s] associated with a napi instance
>> via netdev-genl will be submitted as a separate patch series.


* Re: [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s]
  2023-06-02 15:42   ` Simon Horman
@ 2023-07-12 19:53     ` Nambiar, Amritha
  0 siblings, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 19:53 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, kuba, davem, sridhar.samudrala

On 6/2/2023 8:42 AM, Simon Horman wrote:
> On Thu, Jun 01, 2023 at 10:42:30AM -0700, Amritha Nambiar wrote:
>> After the napi context is initialized, map the napi instance
>> with the queue/queue-set on the corresponding irq line.
>>
>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> 
> Hi Amritha,
> 
> some minor feedback from my side.
> 
> ...
> 
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9ee8eb3ef223..ba712119ec85 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -6366,6 +6366,40 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
>>   }
>>   EXPORT_SYMBOL(dev_set_threaded);
>>   
>> +/**
>> + * netif_napi_add_queue - Associate queue with the napi
>> + * @napi: NAPI context
>> + * @queue_index: Index of queue
>> + * @napi_container_type: queue type as RX or TX
> 
> s/@napi_container_type:/@type:/
> 

Will fix.

>> + *
>> + * Add queue with its corresponding napi context
>> + */
>> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
>> +			 enum napi_container_type type)
>> +{
>> +	struct napi_queue *napi_queue;
>> +
>> +	napi_queue = kzalloc(sizeof(*napi_queue), GFP_KERNEL);
>> +	if (!napi_queue)
>> +		return -ENOMEM;
>> +
>> +	napi_queue->queue_index = queue_index;
>> +
>> +	switch (type) {
>> +	case NAPI_RX_CONTAINER:
>> +		list_add_rcu(&napi_queue->q_list, &napi->napi_rxq_list);
>> +		break;
>> +	case NAPI_TX_CONTAINER:
>> +		list_add_rcu(&napi_queue->q_list, &napi->napi_txq_list);
>> +		break;
>> +	default:
> 
> Perhaps napi_queue is leaked here.
> 

My bad. Will fix in the next version.

>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(netif_napi_add_queue);
>> +
>>   void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
>>   			   int (*poll)(struct napi_struct *, int), int weight)
>>   {
>>
>>


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-02 15:47   ` Simon Horman
  2023-06-03  6:08     ` Jakub Kicinski
@ 2023-07-12 19:54     ` Nambiar, Amritha
  1 sibling, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 19:54 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, kuba, davem, sridhar.samudrala

On 6/2/2023 8:47 AM, Simon Horman wrote:
> On Thu, Jun 01, 2023 at 10:42:41AM -0700, Amritha Nambiar wrote:
>> Add support in ynl/netdev.yaml for napi related information. The
>> netdev structure tracks all the napi instances and napi fields.
>> The napi instances and associated queue[s] can be retrieved this way.
>>
>> Refactored netdev-genl to support exposing napi<->queue[s] mapping
>> that is retained in a netdev.
> 
> Hi Amritha,
> 
> This feels like it should be two patches to me.
> Though it is not something I feel strongly about.
> 

Thanks for pointing that out, I'll split this patch into two.

>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> 
> ...
> 
>> +static int
>> +netdev_nl_dev_napi_prepare_fill(struct net_device *netdev,
>> +				struct sk_buff **pskb, u32 portid, u32 seq,
>> +				int flags, u32 cmd, enum netdev_nl_type type)
>> +{
>> +	struct nlmsghdr *nlh;
>> +	struct sk_buff *skb = *pskb;
>> +	bool last = false;
>> +	int index = 0;
>> +	void *hdr;
>> +	int err;
>> +
> 
> nit: please use reverse xmas tree - longest line to shortest - for
>       local variable declarations in (new) Networking code.
> 
> ...
Will fix in next version.



* Re: [net-next/RFC PATCH v1 2/4] net: Add support for associating napi with queue[s]
  2023-06-03  6:31   ` Paolo Abeni
@ 2023-07-12 19:56     ` Nambiar, Amritha
  0 siblings, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 19:56 UTC (permalink / raw)
  To: Paolo Abeni, netdev, kuba, davem; +Cc: sridhar.samudrala

On 6/2/2023 11:31 PM, Paolo Abeni wrote:
> On Thu, 2023-06-01 at 10:42 -0700, Amritha Nambiar wrote:
>> After the napi context is initialized, map the napi instance
>> with the queue/queue-set on the corresponding irq line.
>>
>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>> ---
>>   drivers/net/ethernet/intel/ice/ice_lib.c  |   57 +++++++++++++++++++++++++++++
>>   drivers/net/ethernet/intel/ice/ice_lib.h  |    4 ++
>>   drivers/net/ethernet/intel/ice/ice_main.c |    4 ++
>>   include/linux/netdevice.h                 |   11 ++++++
>>   net/core/dev.c                            |   34 +++++++++++++++++
>>   5 files changed, 109 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
>> index 5ddb95d1073a..58f68363119f 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
>> @@ -2478,6 +2478,12 @@ ice_vsi_cfg_def(struct ice_vsi *vsi, struct ice_vsi_cfg_params *params)
>>   			goto unroll_vector_base;
>>   
>>   		ice_vsi_map_rings_to_vectors(vsi);
>> +
>> +		/* Associate q_vector rings to napi */
>> +		ret = ice_vsi_add_napi_queues(vsi);
>> +		if (ret)
>> +			goto unroll_vector_base;
>> +
>>   		vsi->stat_offsets_loaded = false;
>>   
>>   		if (ice_is_xdp_ena_vsi(vsi)) {
>> @@ -2957,6 +2963,57 @@ void ice_vsi_dis_irq(struct ice_vsi *vsi)
>>   		synchronize_irq(vsi->q_vectors[i]->irq.virq);
>>   }
>>   
>> +/**
>> + * ice_q_vector_add_napi_queues - Add queue[s] associated with the napi
>> + * @q_vector: q_vector pointer
>> + *
>> + * Associate the q_vector napi with all the queue[s] on the vector
>> + * Returns 0 on success or < 0 on error
>> + */
>> +int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector)
>> +{
>> +	struct ice_rx_ring *rx_ring;
>> +	struct ice_tx_ring *tx_ring;
>> +	int ret;
>> +
>> +	ice_for_each_rx_ring(rx_ring, q_vector->rx) {
>> +		ret = netif_napi_add_queue(&q_vector->napi, rx_ring->q_index,
>> +					   NAPI_RX_CONTAINER);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +	ice_for_each_tx_ring(tx_ring, q_vector->tx) {
>> +		ret = netif_napi_add_queue(&q_vector->napi, tx_ring->q_index,
>> +					   NAPI_TX_CONTAINER);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * ice_vsi_add_napi_queues
>> + * @vsi: VSI pointer
>> + *
>> + * Associate queue[s] with napi for all vectors
>> + * Returns 0 on success or < 0 on error
>> + */
>> +int ice_vsi_add_napi_queues(struct ice_vsi *vsi)
>> +{
>> +	int i, ret = 0;
>> +
>> +	if (!vsi->netdev)
>> +		return ret;
>> +
>> +	ice_for_each_q_vector(vsi, i) {
>> +		ret = ice_q_vector_add_napi_queues(vsi->q_vectors[i]);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +	return ret;
>> +}
>> +
>>   /**
>>    * ice_napi_del - Remove NAPI handler for the VSI
>>    * @vsi: VSI for which NAPI handler is to be removed
>> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
>> index e985766e6bb5..623b5f738a5c 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_lib.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_lib.h
>> @@ -93,6 +93,10 @@ void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc);
>>   struct ice_vsi *
>>   ice_vsi_setup(struct ice_pf *pf, struct ice_vsi_cfg_params *params);
>>   
>> +int ice_q_vector_add_napi_queues(struct ice_q_vector *q_vector);
>> +
>> +int ice_vsi_add_napi_queues(struct ice_vsi *vsi);
>> +
>>   void ice_napi_del(struct ice_vsi *vsi);
>>   
>>   int ice_vsi_release(struct ice_vsi *vsi);
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> index 62e91512aeab..c66ff1473aeb 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -3348,9 +3348,11 @@ static void ice_napi_add(struct ice_vsi *vsi)
>>   	if (!vsi->netdev)
>>   		return;
>>   
>> -	ice_for_each_q_vector(vsi, v_idx)
>> +	ice_for_each_q_vector(vsi, v_idx) {
>>   		netif_napi_add(vsi->netdev, &vsi->q_vectors[v_idx]->napi,
>>   			       ice_napi_poll);
>> +		ice_q_vector_add_napi_queues(vsi->q_vectors[v_idx]);
>> +	}
>>   }
>>   
>>   /**
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 49f64401af7c..a562db712c6e 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -342,6 +342,14 @@ struct gro_list {
>>    */
>>   #define GRO_HASH_BUCKETS	8
>>   
>> +/*
>> + * napi queue container type
>> + */
>> +enum napi_container_type {
>> +	NAPI_RX_CONTAINER,
>> +	NAPI_TX_CONTAINER,
>> +};
>> +
>>   struct napi_queue {
>>   	struct list_head	q_list;
>>   	u16			queue_index;
>> @@ -2622,6 +2630,9 @@ static inline void *netdev_priv(const struct net_device *dev)
>>    */
>>   #define SET_NETDEV_DEVTYPE(net, devtype)	((net)->dev.type = (devtype))
>>   
>> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
>> +			 enum napi_container_type);
>> +
>>   /* Default NAPI poll() weight
>>    * Device drivers are strongly advised to not use bigger value
>>    */
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9ee8eb3ef223..ba712119ec85 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -6366,6 +6366,40 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
>>   }
>>   EXPORT_SYMBOL(dev_set_threaded);
>>   
>> +/**
>> + * netif_napi_add_queue - Associate queue with the napi
>> + * @napi: NAPI context
>> + * @queue_index: Index of queue
>> + * @napi_container_type: queue type as RX or TX
>> + *
>> + * Add queue with its corresponding napi context
>> + */
>> +int netif_napi_add_queue(struct napi_struct *napi, u16 queue_index,
>> +			 enum napi_container_type type)
>> +{
>> +	struct napi_queue *napi_queue;
>> +
>> +	napi_queue = kzalloc(sizeof(*napi_queue), GFP_KERNEL);
>> +	if (!napi_queue)
>> +		return -ENOMEM;
>> +
>> +	napi_queue->queue_index = queue_index;
>> +
>> +	switch (type) {
>> +	case NAPI_RX_CONTAINER:
>> +		list_add_rcu(&napi_queue->q_list, &napi->napi_rxq_list);
>> +		break;
>> +	case NAPI_TX_CONTAINER:
>> +		list_add_rcu(&napi_queue->q_list, &napi->napi_txq_list);
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(netif_napi_add_queue);
>> +
>>   void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
>>   			   int (*poll)(struct napi_struct *, int), int weight)
>>   {
>>
> 
> I think these last two chunks are a better fit for the previous patch,
> so that this one contains only driver-related changes.
> 

If the later chunks are moved to the previous patch, wouldn't git 
bisect and build throw warnings, as the kernel API (netif_napi_add_queue) 
would be defined but not used? The function netif_napi_add_queue is 
invoked only by the driver code. Hence, I moved the kernel function 
definition into this patch, which has all the driver code.

> Also it looks like the napi-queue APIs are going to grow a bit. Perhaps
> it would be useful move all that new code in a separated file? dev.c is
> already pretty big.
> 

Are you suggesting moving just the napi-queue related new code into a 
separate file, but keeping all the other napi code in dev.c? In that 
case, the new file would currently contain only two function 
definitions, for netif_napi_add_queue and its delete counterpart (if 
the napi-queue specific APIs do not grow). I agree there may be more 
generic napi APIs in the future (not just queue-info related), but 
wouldn't it look better if all the napi code were in a new file?

> Thanks!
> 
> Paolo
> 


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-03  6:08     ` Jakub Kicinski
@ 2023-07-12 20:05       ` Nambiar, Amritha
  0 siblings, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 20:05 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Simon Horman, netdev, davem, sridhar.samudrala

On 6/2/2023 11:08 PM, Jakub Kicinski wrote:
> On Fri, 2 Jun 2023 17:47:17 +0200 Simon Horman wrote:
>> This feels like it should be two patches to me.
>> Though it is not something I feel strongly about.
> 
> +1 I'd put the YAML and what's generated from it in one patch,
> and the hand-written code in another.

Will fix in the next version.



* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-06-03  6:06   ` Jakub Kicinski
@ 2023-07-12 20:09     ` Nambiar, Amritha
  2023-07-12 21:14       ` Jakub Kicinski
  0 siblings, 1 reply; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 20:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 6/2/2023 11:06 PM, Jakub Kicinski wrote:
> On Thu, 01 Jun 2023 10:42:25 -0700 Amritha Nambiar wrote:
>> Introduce new napi fields 'napi_rxq_list' and 'napi_txq_list'
>> for rx and tx queue set associated with the napi and
>> initialize them. Handle their removal as well.
>>
>> This enables a mapping of each napi instance with the
>> queue/queue-set on the corresponding irq line.
> 
> Wouldn't it be easier to store the NAPI instance pointer in the queue?
> That way we don't have to allocate memory.
> 

Could you please elaborate on this so I have more clarity? Are you 
suggesting that there's a way to avoid maintaining the list of queues 
in the napi struct?

The idea was for netdev-genl to extract information out of 
netdev->napi_list->napi. For tracking queues, we build a linked list of 
queues for each napi and store it in the napi_struct. This would also 
enable updating the napi<->queue[s] association later (with the 'set' 
command): removing the queue[s] from the napi instance they are 
currently associated with and mapping them to a new napi instance is 
simply a matter of deleting from one list and adding to the other.
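
For illustration, a minimal sketch of what that rebind could look like 
under this list-based design (hypothetical helper, not part of this 
series; a real version would also need RCU grace-period handling 
between unlink and reuse):

	/* move a queue node from its current napi's rx list to another */
	static void napi_queue_move(struct napi_queue *q,
				    struct napi_struct *to_napi)
	{
		list_del_rcu(&q->q_list);	/* drop from the old napi */
		list_add_rcu(&q->q_list, &to_napi->napi_rxq_list);
	}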


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-06-03  6:17   ` Jakub Kicinski
@ 2023-07-12 20:10     ` Nambiar, Amritha
  2023-07-12 21:19       ` Jakub Kicinski
  0 siblings, 1 reply; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 20:10 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 6/2/2023 11:17 PM, Jakub Kicinski wrote:
> On Thu, 01 Jun 2023 10:42:41 -0700 Amritha Nambiar wrote:
>> Add support in ynl/netdev.yaml for napi related information. The
>> netdev structure tracks all the napi instances and napi fields.
>> The napi instances and associated queue[s] can be retrieved this way.
>>
>> Refactored netdev-genl to support exposing napi<->queue[s] mapping
>> that is retained in a netdev.
>>
>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>> ---
>>   Documentation/netlink/specs/netdev.yaml |   39 +++++
>>   include/uapi/linux/netdev.h             |    4 +
>>   net/core/netdev-genl.c                  |  239 ++++++++++++++++++++++++++-----
>>   tools/include/uapi/linux/netdev.h       |    4 +
>>   4 files changed, 247 insertions(+), 39 deletions(-)
>>
>> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
>> index b99e7ffef7a1..8d0edb529563 100644
>> --- a/Documentation/netlink/specs/netdev.yaml
>> +++ b/Documentation/netlink/specs/netdev.yaml
>> @@ -62,6 +62,44 @@ attribute-sets:
>>           type: u64
>>           enum: xdp-act
>>           enum-as-flags: true
>> +      -
>> +        name: napi-info
>> +        doc: napi information such as napi-id, napi queues etc.
>> +        type: nest
>> +        multi-attr: true
> 
> Let's make a new attr space for the napi info command.
> We don't reuse many of the attributes, and as the commands
> grow, stuffing all attrs into one space makes finding stuff
> harder.
> 

Agree. I should not have overloaded the 'dev' command to begin with.

>> +        nested-attributes: dev-napi-info
> 
> And what's inside this nest should also be a separate attr space.
> 

So, I think we could have two new commands for napi data. Would this 
be acceptable: a 'napi-queue-get' command for napi-queue specific 
information (the set of TX and RX queues, IRQ number, etc.), and a 
'napi-info-get' command for other information, such as the PID of the 
napi thread, CPU, etc.?

Example:
  $ ./cli.py --spec netdev.yaml --do napi-queue-get --json='{"ifindex": 12}'

[{'napi-info': [{'napi-id': 600, 'rx-queues': [7], 'tx-queues': [7], 'irq': 298},
                {'napi-id': 599, 'rx-queues': [6], 'tx-queues': [6], 'irq': 297},
                {'napi-id': 598, 'rx-queues': [5], 'tx-queues': [5], 'irq': 296},
                {'napi-id': 597, 'rx-queues': [4], 'tx-queues': [4], 'irq': 295},
                {'napi-id': 596, 'rx-queues': [3], 'tx-queues': [3], 'irq': 294},
                {'napi-id': 595, 'rx-queues': [2], 'tx-queues': [2], 'irq': 293},
                {'napi-id': 594, 'rx-queues': [1], 'tx-queues': [1], 'irq': 292},
                {'napi-id': 593, 'rx-queues': [0], 'tx-queues': [0], 'irq': 291}]}]

$ ./cli.py --spec netdev.yaml  --do napi-info-get --json='{"ifindex": 12}'

[{'napi-info': [{'napi-id': 600, 'pid': 68114},
                 {'napi-id': 599, 'pid': 68113},
                 {'napi-id': 598, 'pid': 68112},
                 {'napi-id': 597, 'pid': 68111},
                 {'napi-id': 596, 'pid': 68110},
                 {'napi-id': 595, 'pid': 68109},
                 {'napi-id': 594, 'pid': 68108},
                 {'napi-id': 593, 'pid': 68107}]}]
> 
>> +      -
>> +        name: napi-id
>> +        doc: napi id
>> +        type: u32
>> +      -
>> +        name: rx-queues
>> +        doc: list of rx queues associated with a napi
>> +        type: u16
> 
> Make it u32; at the uAPI level we're tied to the width of fields, and
> u16 ends up being the same size as u32 "on the wire" due to padding.
> 

Makes sense. Will fix.

>> +        multi-attr: true
>> +      -
>> +        name: tx-queues
>> +        doc: list of tx queues associated with a napi
>> +        type: u16
>> +        multi-attr: true
> 
> 
>> +  -
>> +    name: dev-napi-info
>> +    subset-of: dev
> 
> Yeah, this shouldn't be a subset, just a full-on separate attr space.
> The handshake family may be a good example to look at; it's the biggest
> one so far written with the new rules in mind.
> 

Okay.

>> +    attributes:
>> +      -
>> +        name: napi-id
>> +        doc: napi id
>> +        type: u32
>> +      -
>> +        name: rx-queues
>> +        doc: list of rx queues associated with a napi
>> +        type: u16
>> +        multi-attr: true
>> +      -
>> +        name: tx-queues
>> +        doc: list of tx queues associated with a napi
>> +        type: u16
>> +        multi-attr: true
>>   
>>   operations:
>>     list:
>> @@ -77,6 +115,7 @@ operations:
>>             attributes:
>>               - ifindex
>>               - xdp-features
>> +            - napi-info
> 
> Aaah, separate command, please. Let's not stuff all the information
> into a single command like we did for rtnl.
> 

Okay.

>>         dump:
>>           reply: *dev-all
>>       -


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-12 20:09     ` Nambiar, Amritha
@ 2023-07-12 21:14       ` Jakub Kicinski
  2023-07-12 23:11         ` Nambiar, Amritha
  0 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-07-12 21:14 UTC (permalink / raw)
  To: Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On Wed, 12 Jul 2023 13:09:35 -0700 Nambiar, Amritha wrote:
> On 6/2/2023 11:06 PM, Jakub Kicinski wrote:
> > On Thu, 01 Jun 2023 10:42:25 -0700 Amritha Nambiar wrote:  
> >> Introduce new napi fields 'napi_rxq_list' and 'napi_txq_list'
> >> for rx and tx queue set associated with the napi and
> >> initialize them. Handle their removal as well.
> >>
> >> This enables a mapping of each napi instance with the
> >> queue/queue-set on the corresponding irq line.  
> > 
> > Wouldn't it be easier to store the NAPI instance pointer in the queue?
> > That way we don't have to allocate memory.
> >   
> 
> Could you please elaborate on this so I have more clarity?

First off, let's acknowledge the fact you're asking me for
clarifications ~40 days after I sent the feedback.

Pause for self-reflection.

Okay.

> Are you suggesting that there's a way to avoid maintaining the list
> of queues in the napi struct?

Yes, why not add the napi pointer to struct netdev_queue and
netdev_rx_queue, specifically?

> The idea was for netdev-genl to extract information out of 
> netdev->napi_list->napi. For tracking queues, we build a linked list
> of queues for each napi and store it in the napi_struct. This would
> also enable updating the napi<->queue[s] association later (with the
> 'set' command): removing the queue[s] from the napi instance they are
> currently associated with and mapping them to a new napi instance is
> simply a matter of deleting from one list and adding to the other.

Right, my point is that each queue can only be serviced by a single
NAPI at a time, so we have a 1:N relationship. It's easier to store
the state on the side that's the N, rather than 1.

You can add a list_head to the queue structs, if you prefer to be able 
to walk the queues of a NAPI more efficiently (that said, the head for 
the list is in the "control path only" section of napi_struct, so...
I think you don't?)
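
Something along these lines, i.e. (field placement and naming are 
illustrative here, not an actual patch):

	/* back-pointer from the queue to the NAPI servicing it */
	struct netdev_rx_queue {
		/* ...existing fields... */
		struct napi_struct	*napi;	/* 1:N - one napi, many queues */
	};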


* Re: [net-next/RFC PATCH v1 4/4] netdev-genl: Add support for exposing napi info from netdev
  2023-07-12 20:10     ` Nambiar, Amritha
@ 2023-07-12 21:19       ` Jakub Kicinski
  0 siblings, 0 replies; 29+ messages in thread
From: Jakub Kicinski @ 2023-07-12 21:19 UTC (permalink / raw)
  To: Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On Wed, 12 Jul 2023 13:10:45 -0700 Nambiar, Amritha wrote:
> So, I think we could have two new commands for napi data. Would this 
> be acceptable: a 'napi-queue-get' command for napi-queue specific 
> information (the set of TX and RX queues, IRQ number, etc.), and a 
> 'napi-info-get' command for other information, such as the PID of the 
> napi thread, CPU, etc.?
> 
> Example:
>   $ ./cli.py --spec netdev.yaml --do napi-queue-get --json='{"ifindex": 12}'
> 
> [{'napi-info': [{'napi-id': 600, 'rx-queues': [7], 'tx-queues': [7], 'irq': 298},

I think the commands make sense. You should echo back the ifindex, 
though; it threw me off initially that there's only one attribute 
('napi-info') in the reply, in which case the nest would have been 
pointless.


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-12 21:14       ` Jakub Kicinski
@ 2023-07-12 23:11         ` Nambiar, Amritha
  2023-07-12 23:53           ` Jakub Kicinski
  0 siblings, 1 reply; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-12 23:11 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 7/12/2023 2:14 PM, Jakub Kicinski wrote:
> On Wed, 12 Jul 2023 13:09:35 -0700 Nambiar, Amritha wrote:
>> On 6/2/2023 11:06 PM, Jakub Kicinski wrote:
>>> On Thu, 01 Jun 2023 10:42:25 -0700 Amritha Nambiar wrote:
>>>> Introduce new napi fields 'napi_rxq_list' and 'napi_txq_list'
>>>> for rx and tx queue set associated with the napi and
>>>> initialize them. Handle their removal as well.
>>>>
>>>> This enables a mapping of each napi instance with the
>>>> queue/queue-set on the corresponding irq line.
>>>
>>> Wouldn't it be easier to store the NAPI instance pointer in the queue?
>>> That way we don't have to allocate memory.
>>>    
>>
>> Could you please elaborate on this so I have more clarity?
> 
> First off, let's acknowledge the fact you're asking me for
> clarifications ~40 days after I sent the feedback.
> 
Sorry about that, my vacation is to blame.

> Pause for self-reflection.
> 
> Okay.
> 
>> Are you suggesting that there's a way to avoid maintaining the list
>> of queues in the napi struct?
> 
> Yes, why not add the napi pointer to struct netdev_queue and
> netdev_rx_queue, specifically?
> 

Yes, this would achieve the queue<->napi binding for each queue. But 
when there are multiple queues for a NAPI, I would need to walk a list 
of queues for the NAPI.

>> The idea was for netdev-genl to extract information out of
>> netdev->napi_list->napi. For tracking queues, we build a linked list
>> of queues for the napi and store it in the napi_struct. This would
>> also enable updating the napi<->queue[s] association (later with the
>> 'set' command), i.e. remove the queue[s] from the existing napi
>> instance that it is currently associated with and map with the new
>> napi instance, by simply deleting from one list and adding to the new
>> list.
> 
> Right, my point is that each queue can only be serviced by a single
> NAPI at a time, so we have a 1:N relationship. It's easier to store
> the state on the side that's the N, rather than 1.
> 
> You can add a list_head to the queue structs, if you prefer to be able 
> to walk the queues of a NAPI more efficiently (that said, the head for 
> the list is in the "control path only" section of napi_struct, so...
> I think you don't?)

The napi pointer in the queue structs would give the napi<->queue 
mapping, but I would still need to walk the queues of a NAPI (when 
there are multiple queues for the NAPI), for example:
'napi-id': 600, 'rx-queues': [7,6,5], 'tx-queues': [7,6,5]

In that case, I would keep a list of netdev queue structs within the 
napi_struct (instead of the list of queue indices that I currently 
have) to avoid memory allocation.
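
Roughly like this (names are illustrative, not what I posted):

	struct netdev_queue {
		/* ...existing fields... */
		struct napi_struct	*napi;		/* napi servicing this queue */
		struct list_head	napi_list;	/* linked into napi->napi_txq_list */
	};

	/* walking the tx queues of a napi for the dump: */
	struct netdev_queue *txq;

	list_for_each_entry(txq, &napi->napi_txq_list, napi_list)
		; /* report txq's index in the netlink reply */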

Does this sound right?



* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-12 23:11         ` Nambiar, Amritha
@ 2023-07-12 23:53           ` Jakub Kicinski
  2023-07-28 21:59             ` Jakub Kicinski
  0 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-07-12 23:53 UTC (permalink / raw)
  To: Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On Wed, 12 Jul 2023 16:11:55 -0700 Nambiar, Amritha wrote:
> >> The idea was for netdev-genl to extract information out of
> >> netdev->napi_list->napi. For tracking queues, we build a linked list
> >> of queues for each napi and store it in the napi_struct. This would
> >> also enable updating the napi<->queue[s] association later (with the
> >> 'set' command): removing the queue[s] from the napi instance they are
> >> currently associated with and mapping them to a new napi instance is
> >> simply a matter of deleting from one list and adding to the other.
> > 
> > Right, my point is that each queue can only be serviced by a single
> > NAPI at a time, so we have a 1:N relationship. It's easier to store
> > the state on the side that's the N, rather than 1.
> > 
> > You can add a list_head to the queue structs, if you prefer to be able
> > to walk the queues of a NAPI more efficiently (that said, the head for
> > the list is in the "control path only" section of napi_struct, so...
> > I think you don't?)
> 
> The napi pointer in the queue structs would give the napi<->queue 
> mapping, but I would still need to walk the queues of a NAPI (when 
> there are multiple queues for the NAPI), for example:
> 'napi-id': 600, 'rx-queues': [7,6,5], 'tx-queues': [7,6,5]
> 
> In that case, I would keep a list of netdev queue structs within the 
> napi_struct (instead of the list of queue indices that I currently 
> have) to avoid memory allocation.
> 
> Does this sound right?

yes, I think that's fine.

If we store the NAPI pointer in the queue struct, we can still generate
the same dump with the time complexity of #napis * (#max_rx + #max_tx).
Which I don't think is too bad. Up to you.
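
I.e., roughly, assuming only the queue->napi back-pointer suggested 
above (nothing here is from the posted series):

	struct napi_struct *napi;
	unsigned int i;

	list_for_each_entry(napi, &dev->napi_list, dev_list) {
		for (i = 0; i < dev->real_num_rx_queues; i++)
			if (__netif_get_rx_queue(dev, i)->napi == napi)
				; /* emit i as an rx queue of napi->napi_id */
		/* similar scan over dev->real_num_tx_queues for tx */
	}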


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-12 23:53           ` Jakub Kicinski
@ 2023-07-28 21:59             ` Jakub Kicinski
  2023-07-28 22:37               ` Nambiar, Amritha
  0 siblings, 1 reply; 29+ messages in thread
From: Jakub Kicinski @ 2023-07-28 21:59 UTC (permalink / raw)
  To: Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On Wed, 12 Jul 2023 16:53:26 -0700 Jakub Kicinski wrote:
> > The napi pointer in the queue structs would give the napi<->queue 
> > mapping, but I would still need to walk the queues of a NAPI (when 
> > there are multiple queues for the NAPI), for example:
> > 'napi-id': 600, 'rx-queues': [7,6,5], 'tx-queues': [7,6,5]
> > 
> > In that case, I would keep a list of netdev queue structs within the 
> > napi_struct (instead of the list of queue indices that I currently 
> > have) to avoid memory allocation.
> > 
> > Does this sound right?  
> 
> yes, I think that's fine.
> 
> If we store the NAPI pointer in the queue struct, we can still generate
> the same dump with the time complexity of #napis * (#max_rx + #max_tx).
> Which I don't think is too bad. Up to you.

The more I think about it the more I feel like we should dump queues
and NAPIs separately. And the queue can list the NAPI id of the NAPI
instance which services it.

Are you actively working on this or should I take a stab?


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-28 21:59             ` Jakub Kicinski
@ 2023-07-28 22:37               ` Nambiar, Amritha
  2023-07-28 23:09                 ` Jakub Kicinski
  0 siblings, 1 reply; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-28 22:37 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 7/28/2023 2:59 PM, Jakub Kicinski wrote:
> On Wed, 12 Jul 2023 16:53:26 -0700 Jakub Kicinski wrote:
>>> The napi pointer in the queue structs would give the napi<->queue
>>> mapping, but I would still need to walk the queues of a NAPI (when
>>> there are multiple queues for the NAPI), for example:
>>> 'napi-id': 600, 'rx-queues': [7,6,5], 'tx-queues': [7,6,5]
>>>
>>> In that case, I would keep a list of netdev queue structs within the
>>> napi_struct (instead of the list of queue indices that I currently
>>> have) to avoid memory allocation.
>>>
>>> Does this sound right?
>>
>> yes, I think that's fine.
>>
>> If we store the NAPI pointer in the queue struct, we can still generate
>> the same dump with the time complexity of #napis * (#max_rx + #max_tx).
>> Which I don't think is too bad. Up to you.
> 
> The more I think about it the more I feel like we should dump queues
> and NAPIs separately. And the queue can list the NAPI id of the NAPI
> instance which services it.
> 
> Are you actively working on this or should I take a stab?

Hi Jakub, I have the next version of the patches ready (I'll send it 
in a bit). Please take a look at it and let me know your thoughts, and 
we can proceed from there.

About dumping queues and NAPIs separately, are you thinking about having 
both per-NAPI and per-queue instances, or do you think only one will 
suffice? The plan was to follow this work with a 'set-napi' series, 
something like
set-napi <napi_id> queues <q_id1, q_id2, ...>
to configure the queue[s] that are to be serviced by the napi instance.

In this case, dumping the NAPIs would be beneficial especially when 
there are multiple queues on the NAPI.

WRT per-queue, is there a set of parameters that needs to be exposed 
besides what's already handled by ethtool... Also, to configure a queue 
on a NAPI with set-queue <qid> <napi_id>, the existing NAPIs would have 
to be looked up from the dumped queue parameters.

-Amritha


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-28 22:37               ` Nambiar, Amritha
@ 2023-07-28 23:09                 ` Jakub Kicinski
  2023-07-31 23:48                   ` Nambiar, Amritha
  2023-08-02  0:26                   ` David Ahern
  0 siblings, 2 replies; 29+ messages in thread
From: Jakub Kicinski @ 2023-07-28 23:09 UTC (permalink / raw)
  To: Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On Fri, 28 Jul 2023 15:37:14 -0700 Nambiar, Amritha wrote:
> Hi Jakub, I have the next version of the patches ready (I'll send it 
> in a bit). Please take a look at it and let me know your thoughts, and 
> we can proceed from there.

Great, looking forward.

>> About dumping queues and NAPIs separately, are you thinking about having 
>> both per-NAPI and per-queue instances, or do you think only one will 
>> suffice? The plan was to follow this work with a 'set-napi' series, 
>> something like
>> set-napi <napi_id> queues <q_id1, q_id2, ...>
>> to configure the queue[s] that are to be serviced by the napi instance.
> 
> In this case, dumping the NAPIs would be beneficial especially when 
> there are multiple queues on the NAPI.
> 
>> WRT per-queue, is there a set of parameters that needs to be exposed 
>> besides what's already handled by ethtool...

Not much at this point, maybe memory model. Maybe stats if we want to
put stats in the same command. But the fact that sysfs has a bunch of
per queue attributes makes me think that sooner or later we'll want
queue as a full object in netlink. And starting out that way makes 
the whole API cleaner, at least in my opinion.

If we have another object which wants to refer to queues (e.g. page
pool) it's easier to express the topology when it's clear what is an
object and what's just an attribute.

> Also, to configure a queue 
> on a NAPI with set-queue <qid> <napi_id>, the existing NAPIs would have 
> to be looked up from the dumped queue parameters.

The look up should not be much of a problem.

And don't you think that:

  set-queue queue 1 napi-id 101
  set-queue queue 2 napi-id 101

is more natural than:

  set-napi napi-id 101 queues [1, 2]

Especially in the presence of conflicts. If the user tries:

  set-napi napi-id 101 queues [1, 2]
  set-napi napi-id 102 queues [1, 2]

Do both napis now serve those queues? May seem obvious to us, but
"philosophically" why does setting an attribute of object 102 change
attributes of object 101?

If we ever gain the ability to create queues it will be:

  create-queue napi-id xyz

which also matches set-queue more nicely than napi base API.


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-28 23:09                 ` Jakub Kicinski
@ 2023-07-31 23:48                   ` Nambiar, Amritha
  2023-08-02  0:26                   ` David Ahern
  1 sibling, 0 replies; 29+ messages in thread
From: Nambiar, Amritha @ 2023-07-31 23:48 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem, sridhar.samudrala

On 7/28/2023 4:09 PM, Jakub Kicinski wrote:
> On Fri, 28 Jul 2023 15:37:14 -0700 Nambiar, Amritha wrote:
>> Hi Jakub, I have the next version of patches ready (I'll send that in a
>> bit). I suggest if you could take a look at it and let me know your
>> thoughts and then we can proceed from there.
> 
> Great, looking forward.
> 
>> About dumping queues and NAPIs separately, are you thinking about having
>> both per-NAPI and per-queue instances, or do you think only one will
>> suffice. The plan was to follow this work with a 'set-napi' series,
>> something like,
>> set-napi <napi_id> queues <q_id1, q_id2, ...>
>> to configure the queue[s] that are to be serviced by the napi instance.
>>
>> In this case, dumping the NAPIs would be beneficial especially when
>> there are multiple queues on the NAPI.
>>
>> WRT per-queue, are there a set of parameters that needs to exposed
>> besides what's already handled by ethtool...
> 
> Not much at this point, maybe memory model. Maybe stats if we want to
> put stats in the same command. But the fact that sysfs has a bunch of
> per queue attributes makes me think that sooner or later we'll want
> queue as a full object in netlink. And starting out that way makes
> the whole API cleaner, at least in my opinion.
> 
> If we have another object which wants to refer to queues (e.g. page
> pool) it's easier to express the topology when it's clear what is an
> object and what's just an attribute.
> 
>> Also, to configure a queue
>> on a NAPI, set-queue <qid> <napi_id>, the existing NAPIs would have to
>> be looked up from the queue parameters dumped.
> 
> The look up should not be much of a problem.
> 
> And don't you think that:
> 
>    set-queue queue 1 napi-id 101
>    set-queue queue 2 napi-id 101
> 
> is more natural than:
> 
>    set-napi napi-id 101 queues [1, 2]
> 
> Especially in the presence of conflicts. If the user tries:
> 
>    set-napi napi-id 101 queues [1, 2]
>    set-napi napi-id 102 queues [1, 2]
> 
> Do both napis now serve those queues? May seem obvious to us, but
> "philosophically" why does setting an attribute of object 102 change
> attributes of object 101?
> 

Right, I see the point. In the presence of conflicts, when the 
napi<->queue[s] mappings are updated, set-napi will impact other 
NAPI IDs, while set-queue limits the change to just the queue that 
is requested.

In both cases, the underlying work remains the same:
1. Remove the queue from the napi instance it is currently associated 
with.
2. The driver updates the queue[s]<->vector mapping and associates the 
queue with the new napi instance.
3. Report the impacted napi/queue back to the stack.

The 'napi-get' command would list all the napis and the updated
queue[s] list.

Now, in use cases where a single poller is set to service multiple 
queues (say 8), set-napi can do this with a single command, while 
set-queue will result in 8 different requests to the driver. This is 
the trade-off I see if we go with set-queue.
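
For example, the per-queue rebind that each request reduces to might 
look something like this (hypothetical helper, not in this series; 
'napi' here is the queue back-pointer discussed above):

	/* rebind one rx queue to a new napi; called once per set-queue */
	static int netdev_rebind_rx_queue(struct net_device *dev, u16 q_idx,
					  struct napi_struct *new_napi)
	{
		struct netdev_rx_queue *rxq = __netif_get_rx_queue(dev, q_idx);

		/* the driver remaps the queue<->vector association here */
		rxq->napi = new_napi;	/* recorded for later napi-get dumps */
		return 0;
	}

A set-napi over 8 queues would then loop this once per queue inside 
the kernel, while set-queue pays one netlink round-trip per queue.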

> If we ever gain the ability to create queues it will be:
> 
>    create-queue napi-id xyz
> 
> which also matches set-queue more nicely than napi base API.


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-07-28 23:09                 ` Jakub Kicinski
  2023-07-31 23:48                   ` Nambiar, Amritha
@ 2023-08-02  0:26                   ` David Ahern
  2023-08-02  0:50                     ` Jakub Kicinski
  1 sibling, 1 reply; 29+ messages in thread
From: David Ahern @ 2023-08-02  0:26 UTC (permalink / raw)
  To: Jakub Kicinski, Nambiar, Amritha; +Cc: netdev, davem, sridhar.samudrala

On 7/28/23 5:09 PM, Jakub Kicinski wrote:
> On Fri, 28 Jul 2023 15:37:14 -0700 Nambiar, Amritha wrote:
>> Hi Jakub, I have the next version of the patches ready (I'll send it 
>> in a bit). Please take a look at it and let me know your thoughts, and 
>> we can proceed from there.
> 
> Great, looking forward.
> 
>> About dumping queues and NAPIs separately, are you thinking about having 
>> both per-NAPI and per-queue instances, or do you think only one will 
>> suffice? The plan was to follow this work with a 'set-napi' series, 
>> something like
>> set-napi <napi_id> queues <q_id1, q_id2, ...>
>> to configure the queue[s] that are to be serviced by the napi instance.
>>
>> In this case, dumping the NAPIs would be beneficial especially when 
>> there are multiple queues on the NAPI.
>>
>> WRT per-queue, is there a set of parameters that needs to be exposed 
>> besides what's already handled by ethtool...
> 
> Not much at this point, maybe memory model. Maybe stats if we want to
> put stats in the same command. But the fact that sysfs has a bunch of
> per queue attributes makes me think that sooner or later we'll want
> queue as a full object in netlink. And starting out that way makes 
> the whole API cleaner, at least in my opinion.
> 
> If we have another object which wants to refer to queues (e.g. page
> pool) it's easier to express the topology when it's clear what is an
> object and what's just an attribute.
> 
>> Also, to configure a queue 
>> on a NAPI with set-queue <qid> <napi_id>, the existing NAPIs would have 
>> to be looked up from the dumped queue parameters.
> 
> The look up should not be much of a problem.
> 
> And don't you think that:
> 
>   set-queue queue 1 napi-id 101
>   set-queue queue 2 napi-id 101
> 
> is more natural than:
> 
>   set-napi napi-id 101 queues [1, 2]
> 
> Especially in the presence of conflicts. If the user tries:
> 
>   set-napi napi-id 101 queues [1, 2]
>   set-napi napi-id 102 queues [1, 2]
> 
> Do both napis now serve those queues? May seem obvious to us, but
> "philosophically" why does setting an attribute of object 102 change
> attributes of object 101?
> 
> If we ever gain the ability to create queues it will be:
> 
>   create-queue napi-id xyz
> 
> which also matches set-queue more nicely than napi base API.
> 

I take it you have this path in mind as a means of creating
"specialized" queues (e.g., io_uring and Rx ZC). Any slides or notes on
the bigger picture?


* Re: [net-next/RFC PATCH v1 1/4] net: Introduce new napi fields for rx/tx queues
  2023-08-02  0:26                   ` David Ahern
@ 2023-08-02  0:50                     ` Jakub Kicinski
  0 siblings, 0 replies; 29+ messages in thread
From: Jakub Kicinski @ 2023-08-02  0:50 UTC (permalink / raw)
  To: David Ahern; +Cc: Nambiar, Amritha, netdev, davem, sridhar.samudrala

On Tue, 1 Aug 2023 18:26:26 -0600 David Ahern wrote:
> I take it you have this path in mind as a means of creating
> "specialized" queues (e.g., io_uring and Rx ZC).

TBH I've been thinking more about the huge page stuff and RSS context
handling. Rx ZC should be a subset of what's needed for those cases.

> Any slides or notes on the bigger picture?

I don't have it well figured out, yet. The user space API is pretty
easy, but shaping it in a way that makes driver's life manageable is
more challenging.

