From: Maryam Tahhan <mtahhan@redhat.com>
To: ferruh.yigit@amd.com, stephen@networkplumber.org,
lihuisong@huawei.com, fengchengwen@huawei.com,
liuyonglong@huawei.com, david.marchand@redhat.com,
shibin.koikkara.reeny@intel.com, ciara.loftus@intel.com
Cc: dev@dpdk.org, Maryam Tahhan <mtahhan@redhat.com>
Subject: [v10 3/3] net/af_xdp: support AF_XDP DP pinned maps
Date: Thu, 29 Feb 2024 08:01:24 -0500
Message-ID: <20240229130212.343036-4-mtahhan@redhat.com>
In-Reply-To: <20240229130212.343036-1-mtahhan@redhat.com>
Enable the AF_XDP PMD to retrieve the xskmap
from a pinned eBPF map. This map is expected
to be pinned by an external entity like the
AF_XDP Device Plugin. This enables unprivileged
pods to create and use AF_XDP sockets.
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
---
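Illustrative note, not part of the patch: when ``dp_path`` is omitted, the
PMD derives a default under /tmp/afxdp_dp/<interface name>/. The sketch
below shows that derivation in isolation. The three defines mirror the
values in rte_eth_af_xdp.c as shown in this patch; ``default_dp_path`` is a
hypothetical helper name used only here, since the patch does the
pinned-map case with an inline snprintf() in rte_pmd_af_xdp_probe(). The
truncation check is an addition for the sketch and is an assumption, not
driver behaviour.

```c
#include <stddef.h>
#include <stdio.h>

/* Values mirror the defines in rte_eth_af_xdp.c as shown in this patch. */
#define DP_BASE_PATH "/tmp/afxdp_dp"
#define DP_UDS_SOCK  "afxdp.sock"
#define DP_XSK_MAP   "xsks_map"

/*
 * Hypothetical helper (not in the patch): build the default dp_path for an
 * interface. With use_pinned_map set the path points at the pinned xskmap,
 * otherwise at the Device Plugin's Unix domain socket.
 * Returns 0 on success, -1 on formatting error or truncation.
 */
static int
default_dp_path(char *buf, size_t len, const char *if_name, int use_pinned_map)
{
	const char *leaf = use_pinned_map ? DP_XSK_MAP : DP_UDS_SOCK;
	int n = snprintf(buf, len, "%s/%s/%s", DP_BASE_PATH, if_name, leaf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For an interface named eth0 this yields /tmp/afxdp_dp/eth0/xsks_map with
use_pinned_map and /tmp/afxdp_dp/eth0/afxdp.sock with use_cni, which matches
the paths used in the documentation examples in this patch.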
doc/guides/howto/af_xdp_dp.rst | 35 ++++++++--
doc/guides/nics/af_xdp.rst | 34 ++++++++--
doc/guides/rel_notes/release_24_03.rst | 10 +++
drivers/net/af_xdp/rte_eth_af_xdp.c | 93 ++++++++++++++++++++------
4 files changed, 141 insertions(+), 31 deletions(-)
diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
index ec348c3b82..9aa9f7d8d4 100644
--- a/doc/guides/howto/af_xdp_dp.rst
+++ b/doc/guides/howto/af_xdp_dp.rst
@@ -52,10 +52,21 @@ should be used when creating the socket
to instruct libbpf not to load the default libbpf program on the netdev.
Instead the loading is handled by the AF_XDP Device Plugin.
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-AF_XDP Device Plugin. If this argument is not passed alongside the ``use_cni``
-argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``use_pinned_map`` tells the AF_XDP PMD to retrieve the
+XSKMAP fd from a pinned eBPF map. This map is expected to be pinned by an
+external entity like the AF_XDP Device Plugin. This enables unprivileged pods
+to create and use AF_XDP sockets. When this flag is set, the
+``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used by the AF_XDP PMD when
+creating the AF_XDP socket.
+
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin, or
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD sets it internally to the `AF_XDP Device Plugin for Kubernetes`_ default.
.. note::
@@ -312,8 +323,18 @@ Run dpdk-testpmd with the AF_XDP Device Plugin + CNI
--no-mlockall --in-memory \
-- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
+ Or
+
+ .. code-block:: console
+
+ kubectl exec -i <Pod name> --container <containers name> -- \
+ /<Path>/dpdk-testpmd -l 0,1 --no-pci \
+ --vdev=net_af_xdp0,use_pinned_map=1,iface=<interface name>,dp_path="/tmp/afxdp_dp/<interface name>/xsks_map" \
+ --no-mlockall --in-memory \
+ -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
+
.. note::
- If the ``dp_path`` parameter isn't explicitly set (like the example above)
- the AF_XDP PMD will set the parameter value to
- ``/tmp/afxdp_dp/<<interface name>>/afxdp.sock``.
+ If the ``dp_path`` parameter isn't explicitly set with ``use_cni`` or ``use_pinned_map``
+ the AF_XDP PMD will set the parameter values to the `AF_XDP Device Plugin for Kubernetes`_
+ defaults.
diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index 7f8651beda..940bbf60f2 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -171,13 +171,35 @@ enable the `AF_XDP Device Plugin for Kubernetes`_ with a DPDK application/pod.
so enabling and disabling of the promiscuous mode through the DPDK application
is also not supported.
+use_pinned_map
+~~~~~~~~~~~~~~
+
+The EAL vdev argument ``use_pinned_map`` is used to indicate that the user wishes to
+load a pinned xskmap mounted by `AF_XDP Device Plugin for Kubernetes`_ in the DPDK
+application/pod.
+
+.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
+
+.. code-block:: console
+
+ --vdev=net_af_xdp0,use_pinned_map=1
+
+.. note::
+
+ This feature can also be used with any external entity that can pin an eBPF map, not just
+ the `AF_XDP Device Plugin for Kubernetes`_.
+
dp_path
~~~~~~~
-The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` argument
-to explicitly tell the AF_XDP PMD where to find the UDS to interact with the
-`AF_XDP Device Plugin for Kubernetes`_. If this argument is not passed
-alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
+The EAL vdev argument ``dp_path`` is used alongside the ``use_cni`` or ``use_pinned_map``
+arguments to explicitly tell the AF_XDP PMD where to find either:
+
+1. The UDS to interact with the AF_XDP Device Plugin, or
+2. The pinned xskmap to use when creating AF_XDP sockets.
+
+If this argument is not passed alongside the ``use_cni`` or ``use_pinned_map`` arguments then
+the AF_XDP PMD sets it internally to the `AF_XDP Device Plugin for Kubernetes`_ default.
.. _AF_XDP Device Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
@@ -185,6 +207,10 @@ alongside the ``use_cni`` argument then the AF_XDP PMD configures it internally.
--vdev=net_af_xdp0,use_cni=1,dp_path="/tmp/afxdp_dp/<<interface name>>/afxdp.sock"
+.. code-block:: console
+
+ --vdev=net_af_xdp0,use_pinned_map=1,dp_path="/tmp/afxdp_dp/<<interface name>>/xsks_map"
+
Limitations
-----------
diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index b2b1f2566f..95d9a0f842 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -146,6 +146,16 @@ New Features
compatibility for any applications already using the ``use_cni`` vdev
argument with the AF_XDP Device Plugin.
+* **Integrated AF_XDP PMD with AF_XDP Device Plugin eBPF map pinning support**.
+
+ The EAL vdev argument for the AF_XDP PMD ``use_pinned_map`` was added
+ to allow Kubernetes Pods to use AF_XDP with DPDK, and run with limited
+ privileges, without having to do a full handshake over a Unix Domain
+ Socket with the Device Plugin. This flag indicates that the AF_XDP PMD
+ will be used in unprivileged mode and will obtain the XSKMAP FD by calling
+ ``bpf_obj_get()`` for an xskmap pinned (by the AF_XDP DP) inside the
+ container.
+
Removed Items
-------------
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 3fb0c6a3b9..f13bdb9017 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -85,6 +85,7 @@ RTE_LOG_REGISTER_DEFAULT(af_xdp_logtype, NOTICE);
#define DP_BASE_PATH "/tmp/afxdp_dp"
#define DP_UDS_SOCK "afxdp.sock"
+#define DP_XSK_MAP "xsks_map"
#define MAX_LONG_OPT_SZ 64
#define UDS_MAX_FD_NUM 2
#define UDS_MAX_CMD_LEN 64
@@ -172,6 +173,7 @@ struct pmd_internals {
bool custom_prog_configured;
bool force_copy;
bool use_cni;
+ bool use_pinned_map;
char dp_path[PATH_MAX];
struct bpf_map *map;
@@ -193,6 +195,7 @@ struct pmd_process_private {
#define ETH_AF_XDP_BUDGET_ARG "busy_budget"
#define ETH_AF_XDP_FORCE_COPY_ARG "force_copy"
#define ETH_AF_XDP_USE_CNI_ARG "use_cni"
+#define ETH_AF_XDP_USE_PINNED_MAP_ARG "use_pinned_map"
#define ETH_AF_XDP_DP_PATH_ARG "dp_path"
static const char * const valid_arguments[] = {
@@ -204,6 +207,7 @@ static const char * const valid_arguments[] = {
ETH_AF_XDP_BUDGET_ARG,
ETH_AF_XDP_FORCE_COPY_ARG,
ETH_AF_XDP_USE_CNI_ARG,
+ ETH_AF_XDP_USE_PINNED_MAP_ARG,
ETH_AF_XDP_DP_PATH_ARG,
NULL
};
@@ -1258,6 +1262,21 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
}
#endif
+static int
+get_pinned_map(const char *dp_path, int *map_fd)
+{
+ *map_fd = bpf_obj_get(dp_path);
+ if (*map_fd < 0) {
+ AF_XDP_LOG(ERR, "Failed to find xsks_map in %s\n", dp_path);
+ return -1;
+ }
+
+ AF_XDP_LOG(INFO, "Successfully retrieved map %s with fd %d\n",
+ dp_path, *map_fd);
+
+ return 0;
+}
+
static int
load_custom_xdp_prog(const char *prog_path, int if_index, struct bpf_map **map)
{
@@ -1644,7 +1663,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
#endif
/* Disable libbpf from loading XDP program */
- if (internals->use_cni)
+ if (internals->use_cni || internals->use_pinned_map)
cfg.libbpf_flags |= XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
if (strnlen(internals->prog_path, PATH_MAX)) {
@@ -1698,14 +1717,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
}
}
- if (internals->use_cni) {
+ if (internals->use_cni || internals->use_pinned_map) {
int err, map_fd;
- /* get socket fd from AF_XDP Device Plugin */
- map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
- if (map_fd < 0) {
- AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
- goto out_xsk;
+ if (internals->use_cni) {
+ /* get socket fd from AF_XDP Device Plugin */
+ map_fd = uds_get_xskmap_fd(internals->if_name, internals->dp_path);
+ if (map_fd < 0) {
+ AF_XDP_LOG(ERR, "Failed to receive xskmap fd from AF_XDP Device Plugin\n");
+ goto out_xsk;
+ }
+ } else {
+ /* get xskmap fd from the pinned eBPF map */
+ err = get_pinned_map(internals->dp_path, &map_fd);
+ if (err < 0 || map_fd < 0) {
+ AF_XDP_LOG(ERR, "Failed to retrieve pinned map fd\n");
+ goto out_xsk;
+ }
}
err = xsk_socket__update_xskmap(rxq->xsk, map_fd);
@@ -2027,7 +2055,7 @@ static int
parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
int *queue_cnt, int *shared_umem, char *prog_path,
int *busy_budget, int *force_copy, int *use_cni,
- char *dp_path)
+ int *use_pinned_map, char *dp_path)
{
int ret;
@@ -2073,6 +2101,11 @@ parse_parameters(struct rte_kvargs *kvlist, char *if_name, int *start_queue,
if (ret < 0)
goto free_kvlist;
+ ret = rte_kvargs_process(kvlist, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+ &parse_integer_arg, use_pinned_map);
+ if (ret < 0)
+ goto free_kvlist;
+
ret = rte_kvargs_process(kvlist, ETH_AF_XDP_DP_PATH_ARG,
&parse_prog_arg, dp_path);
if (ret < 0)
@@ -2117,7 +2150,7 @@ static struct rte_eth_dev *
init_internals(struct rte_vdev_device *dev, const char *if_name,
int start_queue_idx, int queue_cnt, int shared_umem,
const char *prog_path, int busy_budget, int force_copy,
- int use_cni, const char *dp_path)
+ int use_cni, int use_pinned_map, const char *dp_path)
{
const char *name = rte_vdev_device_name(dev);
const unsigned int numa_node = dev->device.numa_node;
@@ -2147,6 +2180,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
internals->shared_umem = shared_umem;
internals->force_copy = force_copy;
internals->use_cni = use_cni;
+ internals->use_pinned_map = use_pinned_map;
strlcpy(internals->dp_path, dp_path, PATH_MAX);
if (xdp_get_channels_info(if_name, &internals->max_queue_cnt,
@@ -2206,7 +2240,7 @@ init_internals(struct rte_vdev_device *dev, const char *if_name,
eth_dev->data->dev_link = pmd_link;
eth_dev->data->mac_addrs = &internals->eth_addr;
eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
- if (!internals->use_cni)
+ if (!internals->use_cni && !internals->use_pinned_map)
eth_dev->dev_ops = &ops;
else
eth_dev->dev_ops = &ops_afxdp_dp;
@@ -2338,6 +2372,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
int busy_budget = -1, ret;
int force_copy = 0;
int use_cni = 0;
+ int use_pinned_map = 0;
char dp_path[PATH_MAX] = {'\0'};
struct rte_eth_dev *eth_dev = NULL;
const char *name = rte_vdev_device_name(dev);
@@ -2381,20 +2416,29 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
if (parse_parameters(kvlist, if_name, &xsk_start_queue_idx,
&xsk_queue_cnt, &shared_umem, prog_path,
- &busy_budget, &force_copy, &use_cni, dp_path) < 0) {
+ &busy_budget, &force_copy, &use_cni, &use_pinned_map,
+ dp_path) < 0) {
AF_XDP_LOG(ERR, "Invalid kvargs value\n");
return -EINVAL;
}
- if (use_cni && busy_budget > 0) {
+ if (use_cni && use_pinned_map) {
AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
- ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_BUDGET_ARG);
+ ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG);
return -EINVAL;
}
- if (use_cni && strnlen(prog_path, PATH_MAX)) {
- AF_XDP_LOG(ERR, "When '%s' parameter is used, '%s' parameter is not valid\n",
- ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_PROG_ARG);
+ if ((use_cni || use_pinned_map) && busy_budget > 0) {
+ AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+ ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+ ETH_AF_XDP_BUDGET_ARG);
+ return -EINVAL;
+ }
+
+ if ((use_cni || use_pinned_map) && strnlen(prog_path, PATH_MAX)) {
+ AF_XDP_LOG(ERR, "When '%s' or '%s' parameter is used, '%s' parameter is not valid\n",
+ ETH_AF_XDP_USE_CNI_ARG, ETH_AF_XDP_USE_PINNED_MAP_ARG,
+ ETH_AF_XDP_PROG_ARG);
return -EINVAL;
}
@@ -2404,9 +2448,16 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
ETH_AF_XDP_DP_PATH_ARG, dp_path);
}
- if (!use_cni && strnlen(dp_path, PATH_MAX)) {
- AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' was not enabled\n",
- ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG);
+ if (use_pinned_map && !strnlen(dp_path, PATH_MAX)) {
+ snprintf(dp_path, sizeof(dp_path), "%s/%s/%s", DP_BASE_PATH, if_name, DP_XSK_MAP);
+ AF_XDP_LOG(INFO, "'%s' parameter not provided, setting value to '%s'\n",
+ ETH_AF_XDP_DP_PATH_ARG, dp_path);
+ }
+
+ if ((!use_cni && !use_pinned_map) && strnlen(dp_path, PATH_MAX)) {
+ AF_XDP_LOG(ERR, "'%s' parameter is set, but '%s' or '%s' were not enabled\n",
+ ETH_AF_XDP_DP_PATH_ARG, ETH_AF_XDP_USE_CNI_ARG,
+ ETH_AF_XDP_USE_PINNED_MAP_ARG);
return -EINVAL;
}
@@ -2433,7 +2484,8 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
eth_dev = init_internals(dev, if_name, xsk_start_queue_idx,
xsk_queue_cnt, shared_umem, prog_path,
- busy_budget, force_copy, use_cni, dp_path);
+ busy_budget, force_copy, use_cni, use_pinned_map,
+ dp_path);
if (eth_dev == NULL) {
AF_XDP_LOG(ERR, "Failed to init internals\n");
return -1;
@@ -2495,4 +2547,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
"busy_budget=<int> "
"force_copy=<int> "
"use_cni=<int> "
+ "use_pinned_map=<int> "
"dp_path=<string> ");
--
2.41.0
Thread overview: 5+ messages
2024-02-29 13:01 [v10 0/3] net/af_xdp: fix multi interface support for K8s Maryam Tahhan
2024-02-29 13:01 ` [v10 1/3] docs: AF_XDP Device Plugin Maryam Tahhan
2024-02-29 13:01 ` [v10 2/3] net/af_xdp: fix multi interface support for K8s Maryam Tahhan
2024-02-29 13:06 ` Maryam Tahhan
2024-02-29 13:01 ` Maryam Tahhan [this message]