From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Björn Töpel" <bjorn.topel@gmail.com>, bpf <bpf@vger.kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
Netdev <netdev@vger.kernel.org>,
"Karlsson, Magnus" <magnus.karlsson@intel.com>
Subject: Re: [PATCH bpf] xsk: mark napi_id on sendmsg()
Date: Thu, 30 Jun 2022 13:53:11 +0200 [thread overview]
Message-ID: <Yr2Op9m1xt5gW7Pw@boxer> (raw)
In-Reply-To: <20220629091707.20d66524@kernel.org>
On Wed, Jun 29, 2022 at 05:17:07PM +0100, Jakub Kicinski wrote:
> On Wed, 29 Jun 2022 09:16:29 -0700 Jakub Kicinski wrote:
> > > Would it make sense to introduce napi_id to xsk_buff_pool then?
> > > xp_set_rxq_info() could be setting it. We are sure that napi_id is the
> > > same for whole pool (each xdp_buff_xsk's rxq info).
> >
> > Would it be possible to move the marking to when the queue is getting
> > bound instead of the recv/send paths?
>
> I mean when socket is getting bound.
So Bjorn said that it was the design choice to follow the standard
sockets' approach. I'm including a dirty diff for a discussion which
allows me to get napi_id at bind() time. But, this works for ice as this
driver during the XDP prog/XSK pool load only disables the NAPI, so we are
sure that napi_id stays the same. That might not be the case for other
AF_XDP ZC enabled drivers though, they might delete the NAPI and this
approach wouldn't work...or am I missing something?
I'd prefer the diff below though as it simplifies the data path, but I
can't say if it's safe to do so. We would have to be sure about drivers
keeping their NAPI struct. This would also allow us to drop napi_id from
xdp_rxq_info.
Thoughts?
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 49ba8bfdbf04..3d084558628e 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -312,6 +312,7 @@ ice_xsk_pool_enable(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
return err;
set_bit(qid, vsi->af_xdp_zc_qps);
+ xsk_pool_set_napi_id(pool, vsi->rx_rings[qid]->q_vector->napi.napi_id);
return 0;
}
@@ -348,7 +349,6 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
pool_failure = pool_present ? ice_xsk_pool_enable(vsi, pool, qid) :
ice_xsk_pool_disable(vsi, qid);
-
xsk_pool_if_up:
if (if_running) {
ret = ice_qp_ena(vsi, qid);
@@ -358,6 +358,7 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
netdev_err(vsi->netdev, "ice_qp_ena error = %d\n", ret);
}
+
failure:
if (pool_failure) {
netdev_err(vsi->netdev, "Could not %sable buffer pool, error = %d\n",
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 4aa031849668..1a20320cd556 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -44,6 +44,14 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
xp_set_rxq_info(pool, rxq);
}
+static inline void xsk_pool_set_napi_id(struct xsk_buff_pool *pool,
+ unsigned int napi_id)
+{
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ xp_set_napi_id(pool, napi_id);
+#endif
+}
+
static inline void xsk_pool_dma_unmap(struct xsk_buff_pool *pool,
unsigned long attrs)
{
@@ -198,6 +206,11 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
{
}
+static inline void xsk_pool_set_napi_id(struct xsk_buff_pool *pool,
+ unsigned int napi_id)
+{
+}
+
static inline void xsk_pool_dma_unmap(struct xsk_buff_pool *pool,
unsigned long attrs)
{
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 647722e847b4..60775a8d1bcb 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -70,6 +70,7 @@ struct xsk_buff_pool {
u32 chunk_size;
u32 chunk_shift;
u32 frame_len;
+ unsigned int napi_id;
u8 cached_need_wakeup;
bool uses_need_wakeup;
bool dma_need_sync;
@@ -125,6 +126,7 @@ static inline void xp_init_xskb_dma(struct xdp_buff_xsk *xskb, struct xsk_buff_p
/* AF_XDP ZC drivers, via xdp_sock_buff.h */
void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq);
+void xp_set_napi_id(struct xsk_buff_pool *pool, unsigned int napi_id);
int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
unsigned long attrs, struct page **pages, u32 nr_pages);
void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs);
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index eafd512d38b1..18ac3f32a48d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -222,7 +222,6 @@ static int xsk_rcv_check(struct xdp_sock *xs, struct xdp_buff *xdp)
if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
return -EINVAL;
- sk_mark_napi_id_once_xdp(&xs->sk, xdp);
return 0;
}
@@ -637,11 +636,8 @@ static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len
if (unlikely(need_wait))
return -EOPNOTSUPP;
- if (sk_can_busy_loop(sk)) {
- if (xs->zc)
- __sk_mark_napi_id_once(sk, xs->pool->heads[0].xdp.rxq->napi_id);
+ if (sk_can_busy_loop(sk))
sk_busy_loop(sk, 1); /* only support non-blocking sockets */
- }
if (xs->zc && xsk_no_wakeup(sk))
return 0;
@@ -1015,6 +1011,8 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
xs->dev = dev;
xs->zc = xs->umem->zc;
xs->queue_id = qid;
+ if (xs->zc)
+ xs->sk.sk_napi_id = xs->pool->napi_id;
xp_add_xsk(xs->pool, xs);
out_unlock:
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 87bdd71c7bb6..5ec6443be3fc 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -121,6 +121,14 @@ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq)
}
EXPORT_SYMBOL(xp_set_rxq_info);
+#ifdef CONFIG_NET_RX_BUSY_POLL
+void xp_set_napi_id(struct xsk_buff_pool *pool, unsigned int napi_id)
+{
+ pool->napi_id = napi_id;
+}
+EXPORT_SYMBOL(xp_set_napi_id);
+#endif
+
static void xp_disable_drv_zc(struct xsk_buff_pool *pool)
{
struct netdev_bpf bpf;
next prev parent reply other threads:[~2022-06-30 11:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-29 10:57 [PATCH bpf] xsk: mark napi_id on sendmsg() Maciej Fijalkowski
2022-06-29 12:45 ` Björn Töpel
2022-06-29 12:53 ` Maciej Fijalkowski
2022-06-29 13:18 ` Magnus Karlsson
2022-06-29 16:16 ` Jakub Kicinski
2022-06-29 16:17 ` Jakub Kicinski
2022-06-30 11:53 ` Maciej Fijalkowski [this message]
2022-06-30 15:52 ` Jakub Kicinski
2022-06-29 13:39 ` Björn Töpel
2022-06-29 14:28 ` Maciej Fijalkowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yr2Op9m1xt5gW7Pw@boxer \
--to=maciej.fijalkowski@intel.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bjorn.topel@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kuba@kernel.org \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.