* [PATCH net-next v4] xsk: support use vaddr as ring
From: Xuan Zhuo @ 2023-02-16 8:30 UTC
To: netdev
Cc: Björn Töpel, Magnus Karlsson, Maciej Fijalkowski,
Jonathan Lemon, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, bpf
When we try to start AF_XDP on machines that have been running for a long
time, memory fragmentation can leave no sufficiently large block of
physically contiguous memory, and the start fails.

If the queue has 8 * 1024 entries, then desc[] takes
8 * 1024 * 8 bytes = 16 pages, and adding the size of struct xdp_ring
pushes the total slightly past 16 pages, i.e. 17 pages. That requires an
order-5 allocation. With many queues, such allocations are hard to
satisfy on these long-running machines.

This also wastes 15 pages: an order-5 block is 32 pages, but we only use
17 of them.

This patch replaces __get_free_pages() with vmalloc (vmalloc_user()) for
the ring allocation to solve both problems.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c       |  9 ++-------
 net/xdp/xsk_queue.c | 11 +++++------
 net/xdp/xsk_queue.h |  1 +
 3 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 9f0561b67c12..0a047a09a10f 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1295,8 +1295,6 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 	unsigned long size = vma->vm_end - vma->vm_start;
 	struct xdp_sock *xs = xdp_sk(sock->sk);
 	struct xsk_queue *q = NULL;
-	unsigned long pfn;
-	struct page *qpg;

 	if (READ_ONCE(xs->state) != XSK_READY)
 		return -EBUSY;
@@ -1319,13 +1317,10 @@ static int xsk_mmap(struct file *file, struct socket *sock,

 	/* Matches the smp_wmb() in xsk_init_queue */
 	smp_rmb();
-	qpg = virt_to_head_page(q->ring);
-	if (size > page_size(qpg))
+	if (size > q->ring_vmalloc_size)
 		return -EINVAL;

-	pfn = virt_to_phys(q->ring) >> PAGE_SHIFT;
-	return remap_pfn_range(vma, vma->vm_start, pfn,
-			       size, vma->vm_page_prot);
+	return remap_vmalloc_range(vma, q->ring, 0);
 }

 static int xsk_notifier(struct notifier_block *this,
diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c
index 6cf9586e5027..f8905400ee07 100644
--- a/net/xdp/xsk_queue.c
+++ b/net/xdp/xsk_queue.c
@@ -6,6 +6,7 @@
 #include <linux/log2.h>
 #include <linux/slab.h>
 #include <linux/overflow.h>
+#include <linux/vmalloc.h>
 #include <net/xdp_sock_drv.h>

 #include "xsk_queue.h"
@@ -23,7 +24,6 @@ static size_t xskq_get_ring_size(struct xsk_queue *q, bool umem_queue)
 struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
 {
 	struct xsk_queue *q;
-	gfp_t gfp_flags;
 	size_t size;

 	q = kzalloc(sizeof(*q), GFP_KERNEL);
@@ -33,17 +33,16 @@ struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
 	q->nentries = nentries;
 	q->ring_mask = nentries - 1;

-	gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
-		    __GFP_COMP | __GFP_NORETRY;
 	size = xskq_get_ring_size(q, umem_queue);
+	size = PAGE_ALIGN(size);

-	q->ring = (struct xdp_ring *)__get_free_pages(gfp_flags,
-						      get_order(size));
+	q->ring = vmalloc_user(size);
 	if (!q->ring) {
 		kfree(q);
 		return NULL;
 	}

+	q->ring_vmalloc_size = size;
 	return q;
 }

@@ -52,6 +51,6 @@ void xskq_destroy(struct xsk_queue *q)
 	if (!q)
 		return;

-	page_frag_free(q->ring);
+	vfree(q->ring);
 	kfree(q);
 }
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index c6fb6b763658..bfb2a7e50c26 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -45,6 +45,7 @@ struct xsk_queue {
 	struct xdp_ring *ring;
 	u64 invalid_descs;
 	u64 queue_empty_descs;
+	size_t ring_vmalloc_size;
 };

 /* The structure of the shared state of the rings are a simple
--
2.32.0.3.g01195cf9f
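
The page arithmetic in the commit message can be checked with a small userspace sketch. It is not kernel code: it assumes 4 KiB pages, uses a hypothetical 256-byte stand-in for the struct xdp_ring header (any header size up to one page gives the same page counts), and approximates get_order() with a simple loop. Under the usual convention that an order-n block is 2^n pages, the 32-page block described in the commit message corresponds to order 5.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, matching the arithmetic in the commit message. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the struct xdp_ring header size. The commit
 * message only says the header pushes the total past 16 pages. */
#define SKETCH_RING_HDR_SIZE 256UL

/* Bytes for a ring of `nentries` descriptors of `desc_size` bytes each. */
static unsigned long ring_bytes(unsigned long nentries, unsigned long desc_size)
{
	return SKETCH_RING_HDR_SIZE + nentries * desc_size;
}

/* Pages needed to hold `bytes`, rounded up. This is the page count that a
 * PAGE_ALIGN()ed vmalloc allocation would use. */
static unsigned long pages_needed(unsigned long bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages cover `pages`; a userspace
 * approximation of what get_order() yields for the buddy allocator. */
static unsigned int order_for_pages(unsigned long pages)
{
	unsigned int order = 0;

	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Re-derive the numbers quoted in the commit message. */
static void check_commit_message_numbers(void)
{
	unsigned long pages = pages_needed(ring_bytes(8 * 1024, 8));
	unsigned int order = order_for_pages(pages);

	assert(pages == 17);                   /* 16 pages of desc[] + header */
	assert((1UL << order) == 32);          /* block size in pages */
	assert((1UL << order) - pages == 15);  /* pages wasted by rounding up */
}
```

With the page-aligned vmalloc allocation the ring occupies exactly pages_needed() pages, so the rounding loss to the next power of two disappears and no physically contiguous block is required.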
* Re: [PATCH net-next v4] xsk: support use vaddr as ring
From: Alexander Lobakin @ 2023-02-16 13:04 UTC
To: Xuan Zhuo
Cc: netdev, Björn Töpel, Magnus Karlsson,
Maciej Fijalkowski, Jonathan Lemon, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, bpf
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date: Thu, 16 Feb 2023 16:30:47 +0800
> When we try to start AF_XDP on some machines with long running time, due
> to the machine's memory fragmentation problem, there is no sufficient
> contiguous physical memory that will cause the start failure.
>
> If the size of the queue is 8 * 1024, then the size of the desc[] is
> 8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is
> 16page+. This is necessary to apply for a 4-order memory. If there are a
> lot of queues, it is difficult to these machine with long running time.
>
> Here, that we actually waste 15 pages. 4-Order memory is 32 pages, but
> we only use 17 pages.
>
> This patch replaces __get_free_pages() by vmalloc() to allocate memory
> to solve these problems.
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
[...]
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index c6fb6b763658..bfb2a7e50c26 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -45,6 +45,7 @@ struct xsk_queue {
> struct xdp_ring *ring;
> u64 invalid_descs;
> u64 queue_empty_descs;
> + size_t ring_vmalloc_size;
The name looks a bit long to me, but that might be just personal
preference. The code itself now looks good to me.
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> };
>
> /* The structure of the shared state of the rings are a simple
Next time pls make sure you added all of the reviewers to the Cc list
when sending a new revision. I noticed you posted v4 only by monitoring
the ML.
Thanks,
Olek
* Re: [PATCH net-next v4] xsk: support use vaddr as ring
From: Xuan Zhuo @ 2023-02-17 8:59 UTC
To: Alexander Lobakin
Cc: netdev, Björn Töpel, Magnus Karlsson,
Maciej Fijalkowski, Jonathan Lemon, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, bpf
On Thu, 16 Feb 2023 14:04:47 +0100, Alexander Lobakin <aleksander.lobakin@intel.com> wrote:
> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Date: Thu, 16 Feb 2023 16:30:47 +0800
>
> > When we try to start AF_XDP on some machines with long running time, due
> > to the machine's memory fragmentation problem, there is no sufficient
> > contiguous physical memory that will cause the start failure.
> >
> > If the size of the queue is 8 * 1024, then the size of the desc[] is
> > 8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is
> > 16page+. This is necessary to apply for a 4-order memory. If there are a
> > lot of queues, it is difficult to these machine with long running time.
> >
> > Here, that we actually waste 15 pages. 4-Order memory is 32 pages, but
> > we only use 17 pages.
> >
> > This patch replaces __get_free_pages() by vmalloc() to allocate memory
> > to solve these problems.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
>
> [...]
>
> > diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> > index c6fb6b763658..bfb2a7e50c26 100644
> > --- a/net/xdp/xsk_queue.h
> > +++ b/net/xdp/xsk_queue.h
> > @@ -45,6 +45,7 @@ struct xsk_queue {
> > struct xdp_ring *ring;
> > u64 invalid_descs;
> > u64 queue_empty_descs;
> > + size_t ring_vmalloc_size;
>
> The name looks a bit long to me, but that might be just personal
> preference. The code itself now looks good to me.
>
> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> > };
> >
> > /* The structure of the shared state of the rings are a simple
>
> Next time pls make sure you added all of the reviewers to the Cc list
> when sending a new revision. I noticed you posted v4 only by monitoring
> the ML.
Oh, sorry. I always thought you were in the list. I did not notice this
situation.
I will pay attention next time. Thank you for your reply.
Thanks.
>
> Thanks,
> Olek
* Re: [PATCH net-next v4] xsk: support use vaddr as ring
From: patchwork-bot+netdevbpf @ 2023-02-20 8:30 UTC
To: Xuan Zhuo
Cc: netdev, bjorn, magnus.karlsson, maciej.fijalkowski,
jonathan.lemon, davem, edumazet, kuba, pabeni, ast, daniel, hawk,
john.fastabend, bpf
Hello:
This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:
On Thu, 16 Feb 2023 16:30:47 +0800 you wrote:
> When we try to start AF_XDP on some machines with long running time, due
> to the machine's memory fragmentation problem, there is no sufficient
> contiguous physical memory that will cause the start failure.
>
> If the size of the queue is 8 * 1024, then the size of the desc[] is
> 8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is
> 16page+. This is necessary to apply for a 4-order memory. If there are a
> lot of queues, it is difficult to these machine with long running time.
>
> [...]
Here is the summary with links:
- [net-next,v4] xsk: support use vaddr as ring
https://git.kernel.org/netdev/net-next/c/9f78bf330a66
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html