* Shared Umem between processes
@ 2020-03-11 15:58 Gaul, Maximilian
  2020-03-12  7:55 ` Björn Töpel
  0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-11 15:58 UTC (permalink / raw)
  To: bpf@vger.kernel.org

Hello everyone,


I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.


Some background first: my program is largely based on https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf


I am trying to build an application that processes multiple UDP multicast streams in parallel (each with up to several tens of thousands of packets per second).


My first solution was to steer each multicast stream to a separate RX queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as many user-space processes (each with a separate AF_XDP socket bound to one of the RX queues) as there are streams to process.


But because this solution is limited by the number of RX queues the NIC has, and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.



As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?

I ran into the problem that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content of the original socket / umem into shared memory. I am not sure what information the sub-process (the one using the umem from another process) needs, so I figured the simplest solution would be to copy the whole umem struct.



So I went with the "quick fix" of moving the definition of `struct xsk_umem` into `xsk.h` and copying the umem information from the original process into shared memory. This process then calls `fork()`, spawning a sub-process. The sub-process reads the previously written umem information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c), which eventually calls `xsk_socket__create` in `xsk.c`. That function then checks `umem->refcount` and sets the flags for shared umem accordingly.



After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). After that call I return early from `xsk_configure_socket` in the sub-process, because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:




static struct xsk_socket_info *xsk_configure_socket(struct config *cfg,
                                                    struct xsk_umem_info *umem)
{
    struct xsk_socket_config xsk_cfg;
    struct xsk_socket_info *xsk_info;
    uint32_t idx;
    uint32_t prog_id = 0;
    int i;
    int ret;

    xsk_info = calloc(1, sizeof(*xsk_info));
    if (!xsk_info)
        return NULL;

    xsk_info->umem = umem;
    xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
    xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
    xsk_cfg.libbpf_flags = 0;
    xsk_cfg.xdp_flags = cfg->xdp_flags;
    xsk_cfg.bind_flags = cfg->xsk_bind_flags;
    ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue,
                             umem->umem, &xsk_info->rx, &xsk_info->tx,
                             &xsk_cfg);
    if (ret) {
        fprintf(stderr, "FAIL 1\n");
        goto error_exit;
    }

    ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
    if (ret) {
        fprintf(stderr, "FAIL 2\n");
        goto error_exit;
    }

    /* Initialize umem frame allocation */
    for (i = 0; i < NUM_FRAMES; i++)
        xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;

    xsk_info->umem_frame_free = NUM_FRAMES;

    if (cfg->use_shrd_umem) {
        return xsk_info;
    }
    ...
}
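
For reference, the frame bookkeeping that the "Initialize umem frame allocation" loop sets up can be sketched as a small standalone stack allocator. This is a minimal sketch, not the tutorial's exact code: `struct frame_pool`, the `frame_pool_*` functions, and the values chosen for `NUM_FRAMES` and `FRAME_SIZE` are assumptions made for illustration. It also hints at a possible pitfall with a shared umem: if two sockets each initialise their own pool over the same umem, both would presumably hand out the same frame addresses.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FRAMES 4096                 /* assumed value, for illustration */
#define FRAME_SIZE 4096                 /* assumed value, for illustration */
#define INVALID_UMEM_FRAME UINT64_MAX

/* Hypothetical stand-in for the umem_frame_addr / umem_frame_free fields. */
struct frame_pool {
    uint64_t addr[NUM_FRAMES];          /* stack of free frame offsets */
    uint32_t free_count;                /* number of valid entries in addr[] */
};

/* Mark every frame of the umem as free (offset i * FRAME_SIZE). */
void frame_pool_init(struct frame_pool *p)
{
    for (uint32_t i = 0; i < NUM_FRAMES; i++)
        p->addr[i] = (uint64_t)i * FRAME_SIZE;
    p->free_count = NUM_FRAMES;
}

/* Pop one free frame offset, or INVALID_UMEM_FRAME if none is left. */
uint64_t frame_pool_alloc(struct frame_pool *p)
{
    if (p->free_count == 0)
        return INVALID_UMEM_FRAME;
    uint64_t frame = p->addr[--p->free_count];
    p->addr[p->free_count] = INVALID_UMEM_FRAME;
    return frame;
}

/* Push a frame offset back onto the free stack. */
void frame_pool_free(struct frame_pool *p, uint64_t frame)
{
    assert(p->free_count < NUM_FRAMES);
    p->addr[p->free_count++] = frame;
}
```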

Somehow this doesn't work: my sub-process dies in `xsk_configure_socket`, and I am not able to debug it properly with GDB. Another point I don't understand is this statement:

"However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you."

from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag

I didn't know that libbpf loads an XDP program. Why would it do that? I am using my own AF_XDP program, which filters for UDP packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the AF_XDP socket fd is not put into the kernel `xsks-map`, which basically means that I don't receive any packets.

As you probably already noticed, I am overwhelmed by the concept of shared umem, and I have to say there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail on a Linux mailing list from Nov. 2019 stating that this feature is now implemented.

Can you please help?

Best regards

Max


* Re: Shared Umem between processes
  2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
@ 2020-03-12  7:55 ` Björn Töpel
  2020-03-12  8:20   ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12  7:55 UTC (permalink / raw)
  To: Gaul, Maximilian, Xdp; +Cc: bpf@vger.kernel.org

On Wed, 11 Mar 2020 at 16:59, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> Hello everyone,
>

Hi! I'm moving this to the XDP newbies list, which is a better
place for this kind of discussion!

>
> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>
>
> Just a few information at the start of this e-mail: My program is largely based on: https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>
>
> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>
>
> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are streams to process.
>
>
> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>

Let's start with defining what shared-umem is: The idea is to share
the same umem, fill ring, and completion ring for multiple
sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
hardware ring. It's a mechanism to load-balance a HW queue over
multiple sockets.

If I'm reading you correctly, you'd like a solution:

           hw_q0,
xsk_q0_0, xsk_q0_1, xsk_q0_2,

instead of:

hw_q0,    hw_q1,    hw_q2,
xsk_q0_0, xsk_q1_0, xsk_q2_0,

In the first case you'll need to mux the flows in the XDP program
using an XSKMAP.

Is this what you're trying to do?
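
As an illustration of the muxing step, the decision such an XDP program has to make can be modelled in plain C. This is only a user-space sketch: the stream table, the group addresses and ports in it, and `select_xsk_slot` are hypothetical. A real XDP program would parse the packet headers itself and pass the resulting index to `bpf_redirect_map()` on a `BPF_MAP_TYPE_XSKMAP` rather than returning it.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per multicast stream; the index in the table is the XSKMAP slot. */
struct stream {
    uint32_t group;     /* IPv4 multicast group, host byte order */
    uint16_t port;      /* UDP destination port, host byte order */
};

/* Hypothetical streams, one per socket xsk_q0_0 .. xsk_q0_2. */
static const struct stream streams[] = {
    { 0xEF010101u, 5000 },  /* 239.1.1.1:5000 -> slot 0 */
    { 0xEF010102u, 5000 },  /* 239.1.1.2:5000 -> slot 1 */
    { 0xEF010103u, 5000 },  /* 239.1.1.3:5000 -> slot 2 */
};

/*
 * Return the XSKMAP slot for a UDP destination, or -1 if the packet belongs
 * to no known stream (an XDP program would typically return XDP_PASS then).
 */
int select_xsk_slot(uint32_t dst_ip, uint16_t dst_port)
{
    for (size_t i = 0; i < sizeof(streams) / sizeof(streams[0]); i++) {
        if (streams[i].group == dst_ip && streams[i].port == dst_port)
            return (int)i;
    }
    return -1;
}
```

All three sockets would still sit on the same hardware queue; only the slot chosen per packet differs.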

>
>
> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>

Yes, that is correct, and for a reason! :-) Note that if you'd like to
do a multi-*process* setup with shared umem, you need to either have a
control process that manages the fill/completion rings and
synchronizes between the processes, OR re-mmap the fill/completion
rings from the socket owning the umem in multiple processes *and*
synchronize the access to them. Neither is pleasant.
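
To give a feel for what "synchronize the access" means across processes, here is a minimal sketch that guards a counter in shared memory with a spinlock. It is an assumption-laden model, not AF_XDP code: `struct shared_ring_state` and its `producer` field merely stand in for a fill ring's producer index, and `run_shared_reserve` is a hypothetical helper. A real setup would more likely use a process-shared mutex or a dedicated control process.

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in MAP_SHARED memory so that every process sees the same state. */
struct shared_ring_state {
    atomic_flag lock;       /* spinlock protecting producer */
    uint32_t producer;      /* stand-in for a fill ring producer index */
};

static void ring_lock(struct shared_ring_state *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                   /* busy-wait until the holder clears the flag */
}

static void ring_unlock(struct shared_ring_state *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Fork nprocs children that each "reserve" per_proc slots under the lock.
 * Returns the final producer value, or -1 on mmap/fork failure.
 */
long run_shared_reserve(int nprocs, int per_proc)
{
    struct shared_ring_state *s = mmap(NULL, sizeof(*s),
                                       PROT_READ | PROT_WRITE,
                                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    int failed = 0;

    if (s == MAP_FAILED)
        return -1;
    atomic_flag_clear(&s->lock);
    s->producer = 0;

    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {     /* child: update the shared index under the lock */
            for (int j = 0; j < per_proc; j++) {
                ring_lock(s);
                s->producer++;
                ring_unlock(s);
            }
            _exit(0);
        }
    }

    while (wait(NULL) > 0)
        ;                   /* reap all children */

    long result = failed ? -1 : (long)s->producer;
    munmap(s, sizeof(*s));
    return result;
}
```

Without the lock, concurrent increments from several processes could be lost, which is the kind of corruption an unsynchronized shared fill or completion ring would be exposed to.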

Honestly, not a setup I'd recommend.

> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>

Just for completeness, to set up shared umem:

1. create socket 0 and register the umem with it.
2. mmap the fr/cr using socket 0
3. create socket 1, 2, n and refer to socket 0 for the umem.
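
At the raw socket level, step 3 comes down to how the additional sockets are bound. The following sketch fills in the bind address using the kernel's `struct sockaddr_xdp` from `<linux/if_xdp.h>`; the helper name `fill_shared_umem_addr` and its parameters are hypothetical, and creating the sockets, registering the umem, and mapping the rings are left out.

```c
#include <linux/if_xdp.h>
#include <string.h>
#include <sys/socket.h>

#ifndef AF_XDP
#define AF_XDP 44           /* fallback for older libc headers */
#endif

/*
 * Prepare the address that socket 1, 2, ..., n would pass to bind() in order
 * to reuse the umem registered on socket 0 (umem_owner_fd). The caller would
 * then do: bind(fd_n, (struct sockaddr *)sxdp, sizeof(*sxdp));
 */
void fill_shared_umem_addr(struct sockaddr_xdp *sxdp, unsigned int ifindex,
                           unsigned int queue_id, int umem_owner_fd)
{
    memset(sxdp, 0, sizeof(*sxdp));
    sxdp->sxdp_family = AF_XDP;
    sxdp->sxdp_ifindex = ifindex;       /* same netdev as socket 0 */
    sxdp->sxdp_queue_id = queue_id;     /* same hardware queue as socket 0 */
    sxdp->sxdp_flags = XDP_SHARED_UMEM; /* refer to another socket's umem */
    sxdp->sxdp_shared_umem_fd = (__u32)umem_owner_fd;
}
```

The only thing the additional socket needs from socket 0 here is its file descriptor, which has to be valid in the process that calls `bind()`.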

So, in a multiprocess solution step 3 would be done in separate
processes, and step 2 depending on your application. You'd need to
pass socket 0 to the other processes *and* share the umem memory from
the process where socket 0 was created. This is pretty much a threaded
solution, given all the shared state.
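
One generic way to hand a socket to an unrelated process is to send its file descriptor over a Unix domain socket with `SCM_RIGHTS`. The sketch below shows that mechanism only; `send_fd` and `recv_fd` are hypothetical helper names, and nothing here is specific to AF_XDP. A child created with `fork()` already inherits the parent's open descriptors, so this would only be needed between processes that are not related that way.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send file descriptor fd over the Unix domain socket chan. 0 on success. */
int send_fd(int chan, int fd)
{
    char byte = 0;                      /* at least one data byte is needed */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;           /* forces correct alignment */
    } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor from chan. Returns the new fd, or -1. */
int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open socket as in the sender, but the umem memory itself still has to be mapped into the receiving process separately.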

I advise against taking this path.

>
>
> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared umem accordingly.
>
>
>
> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>
>
>
>
> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>
> struct xsk_socket_config xsk_cfg;
> struct xsk_socket_info *xsk_info;
> uint32_t idx;
> uint32_t prog_id = 0;
> int i;
> int ret;
>
> xsk_info = calloc(1, sizeof(*xsk_info));
> if (!xsk_info)
> return NULL;
>
> xsk_info->umem = umem;
> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> xsk_cfg.libbpf_flags = 0;
> xsk_cfg.xdp_flags = cfg->xdp_flags;
> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>
> if (ret) {
> fprintf(stderr, "FAIL 1\n");
> goto error_exit;
> }
>
> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
> if (ret) {
> fprintf(stderr, "FAIL 2\n");
> goto error_exit;
> }
>
> /* Initialize umem frame allocation */
> for (i = 0; i < NUM_FRAMES; i++)
> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>
> xsk_info->umem_frame_free = NUM_FRAMES;
>
> if(cfg->use_shrd_umem) {
> return xsk_info;
> }
>         ...
> }
>
> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>
> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>
> from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>
> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>
> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>
> Can you please help?
>

XDP sockets always use an XDP program; it's just that a default one is
provided if the user doesn't explicitly add one. Have a look at
tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
explicitly have a program that muxes over the sockets. A naïve variant
can be found in samples/bpf/xdpsock_kern.c


Cheers,
Björn

> Best regards
>
> Max


* AW: Shared Umem between processes
  2020-03-12  7:55 ` Björn Töpel
@ 2020-03-12  8:20   ` Gaul, Maximilian
  2020-03-12  8:38     ` Björn Töpel
  0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12  8:20 UTC (permalink / raw)
  To: Björn Töpel, Xdp; +Cc: bpf@vger.kernel.org

I don't know if this reply works but I will try.

On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
>>
>> Hello everyone,
>>
>
> Hi! I'm moving this to the XDP newbies list, which is a more proper
> place for these kind of discussions!
>
Sure, no problem. Thank you.
>>
>> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>>
>>
>> Just a few information at the start of this e-mail: My program is largely based on:  https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>>
>>
>> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>>
>>
>> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are streams  to process.
>>
>>
>> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>>

> Let's start with defining what shared-umem is: The idea is to share
> the same umem, fill ring, and completion ring for multiple
> sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
> hardware ring. It's a mechanism to load-balance a HW queue over
> multiple sockets.
> 
> If I'm reading you correctly, you'd like a solution:
> 
>            hw_q0,
> xsk_q0_0, xsk_q0_1, xsk_q0_2,
> 
> instead of:
> 
> hw_q0,    hw_q1,    hw_q2,
> xsk_q0_0, xsk_q1_0, xsk_q2_0,
>
> In the first case you'll need to mux the flows in the XDP program
> using an XSKMAP.
> 
> Is this what you're trying to do?
>
Yes, it is. But I had the problem that I couldn't create multiple sockets (no sharing, each with its own umem and rx/tx queues) tied to the same RX queue. Maybe I did something wrong. Is this possible at all?
>>
>>
>> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>>
>
> Yes, that is correct, and for a reason! :-) Note that if you'd like to
> do a multi-*process* setup with shared umem, you: need to have a
> control process that manages the fill/completion rings, and
> synchronize between the processes, OR re-mmap the fill/completetion
> ring from the socket owning the umem in multiple processes *and*
> synchronize the access to them. Neither is pleasant.
> 
> Honestly, not a setup I'd recommend.
>
This indeed sounds very unpleasant. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets to the sockets via an XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX queue.
>> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another  process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>>
>
> Just for completeness; To setup shared umem:
> 
> 1. create socket 0 and register the umem to this.
> 2. mmap the fr/cr using socket 0
> 3. create socket 1, 2, n and refer to socket 0 for the umem.
>
> So, in a multiprocess solution step 3 would be done in separate
> processes, and step 2 depending on your application. You'd need to
> pass socket 0 to the other processes *and* share the umem memory from
> the process where socket 0 was created. This is pretty much a threaded
> solution, given all the shared state.
>
> I advice not taking this path.
>
I am not entirely sure what you mean by *passing socket 0*. Is this just the fd of the socket? What about the `struct xsk_umem`? Do I need that? I guess so, because `xsk_socket__create()` has a parameter of type `struct xsk_umem`.
>>
>>
>> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared  umem accordingly.
>>
>>
>>
>> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because  I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>>
>>
>>
>>
>> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>>
>> struct xsk_socket_config xsk_cfg;
>> struct xsk_socket_info *xsk_info;
>> uint32_t idx;
>> uint32_t prog_id = 0;
>> int i;
>> int ret;
>>
>> xsk_info = calloc(1, sizeof(*xsk_info));
>> if (!xsk_info)
>> return NULL;
>>
>> xsk_info->umem = umem;
>> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
>> xsk_cfg.libbpf_flags = 0;
>> xsk_cfg.xdp_flags = cfg->xdp_flags;
>> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
>> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>>
>> if (ret) {
>> fprintf(stderr, "FAIL 1\n");
>> goto error_exit;
>> }
>>
>> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
>> if (ret) {
>> fprintf(stderr, "FAIL 2\n");
>> goto error_exit;
>> }
>>
>> /* Initialize umem frame allocation */
>> for (i = 0; i < NUM_FRAMES; i++)
>> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>>
>> xsk_info->umem_frame_free = NUM_FRAMES;
>>
>> if(cfg->use_shrd_umem) {
>> return xsk_info;
>> }
>>         ...
>> }
>>
>> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>>
>> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>>
>> from  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>>
>> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>>
>> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>>
>> Can you please help?
>>
>
> XDP sockets always use an XDP program, it just that a default one is
> provided if the use doesn't explicitly add one. Have a look at
> tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
> explicitly have a program that muxes over the sockets. A naïve variant
> can be found in samples/bpf/xdpsock_kern.c
> 
> 
> Cheers,
> Björn
> 
>> Best regards
>>
>> Max
    


* Re: Shared Umem between processes
  2020-03-12  8:20   ` AW: " Gaul, Maximilian
@ 2020-03-12  8:38     ` Björn Töpel
  2020-03-12  8:49       ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12  8:38 UTC (permalink / raw)
  To: Gaul, Maximilian; +Cc: Xdp

On Thu, 12 Mar 2020 at 09:20, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> I don't know if this reply works but I will try.
>

It worked! :-)

> On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
> >>
> >> Hello everyone,
> >>
> >
> > Hi! I'm moving this to the XDP newbies list, which is a more proper
> > place for these kind of discussions!
> >
> Sure, no problem. Thank you.
> >>
> >> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
> >>
> >>
> >> Just a few information at the start of this e-mail: My program is largely based on:  https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
> >>
> >>
> >> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
> >>
> >>
> >> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are streams  to process.
> >>
> >>
> >> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
> >>
>
> > Let's start with defining what shared-umem is: The idea is to share
> > the same umem, fill ring, and completion ring for multiple
> > sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
> > hardware ring. It's a mechanism to load-balance a HW queue over
> > multiple sockets.
> >
> > If I'm reading you correctly, you'd like a solution:
> >
> >            hw_q0,
> > xsk_q0_0, xsk_q0_1, xsk_q0_2,
> >
> > instead of:
> >
> > hw_q0,    hw_q1,    hw_q2,
> > xsk_q0_0, xsk_q1_0, xsk_q2_0,
> >
> > In the first case you'll need to mux the flows in the XDP program
> > using an XSKMAP.
> >
> > Is this what you're trying to do?
> >
> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?

No; one socket, one umem, one queue. Unless you're using shared umem,
then multiple sockets, one umem, one queue.

> >>
> >>
> >> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
> >>
> >
> > Yes, that is correct, and for a reason! :-) Note that if you'd like to
> > do a multi-*process* setup with shared umem, you: need to have a
> > control process that manages the fill/completion rings, and
> > synchronize between the processes, OR re-mmap the fill/completetion
> > ring from the socket owning the umem in multiple processes *and*
> > synchronize the access to them. Neither is pleasant.
> >
> > Honestly, not a setup I'd recommend.
> >
> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX-Queue.

I would probably go for the first option, without shared umem, but
that's really up to you! If you're going for the shared umem, I'd do
it in a single process.

> >> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another  process) needs so I figured the simplest solution would be to just copy the whole umem struct.
> >>
> >
> > Just for completeness; To setup shared umem:
> >
> > 1. create socket 0 and register the umem to this.
> > 2. mmap the fr/cr using socket 0
> > 3. create socket 1, 2, n and refer to socket 0 for the umem.
> >
> > So, in a multiprocess solution step 3 would be done in separate
> > processes, and step 2 depending on your application. You'd need to
> > pass socket 0 to the other processes *and* share the umem memory from
> > the process where socket 0 was created. This is pretty much a threaded
> > solution, given all the shared state.
> >
> > I advice not taking this path.
> >
> I am not entirely sure what you mean with *passing socket 0* is this just the fd of the socket? What's about the `struct xsk_umem`? Do I need that? I guess so because `xsk_socket__create()` has a parameter `struct xsk_umem`.
> >>
> >>
> >> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared  umem accordingly.
> >>
> >>
> >>
> >> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because  I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
> >>
> >>
> >>
> >>
> >> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
> >>
> >> struct xsk_socket_config xsk_cfg;
> >> struct xsk_socket_info *xsk_info;
> >> uint32_t idx;
> >> uint32_t prog_id = 0;
> >> int i;
> >> int ret;
> >>
> >> xsk_info = calloc(1, sizeof(*xsk_info));
> >> if (!xsk_info)
> >> return NULL;
> >>
> >> xsk_info->umem = umem;
> >> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> >> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> >> xsk_cfg.libbpf_flags = 0;
> >> xsk_cfg.xdp_flags = cfg->xdp_flags;
> >> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
> >> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
> >>
> >> if (ret) {
> >> fprintf(stderr, "FAIL 1\n");
> >> goto error_exit;
> >> }
> >>
> >> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
> >> if (ret) {
> >> fprintf(stderr, "FAIL 2\n");
> >> goto error_exit;
> >> }
> >>
> >> /* Initialize umem frame allocation */
> >> for (i = 0; i < NUM_FRAMES; i++)
> >> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
> >>
> >> xsk_info->umem_frame_free = NUM_FRAMES;
> >>
> >> if(cfg->use_shrd_umem) {
> >> return xsk_info;
> >> }
> >>         ...
> >> }
> >>
> >> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
> >>
> >> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
> >>
> >> from  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
> >>
> >> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
> >>
> >> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in  https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
> >>
> >> Can you please help?
> >>
> >
> > XDP sockets always use an XDP program, it just that a default one is
> > provided if the use doesn't explicitly add one. Have a look at
> > tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
> > explicitly have a program that muxes over the sockets. A naïve variant
> > can be found in samples/bpf/xdpsock_kern.c
> >
> >
> > Cheers,
> > Björn
> >
> >> Best regards
> >>
> >> Max
>


* AW: Shared Umem between processes
  2020-03-12  8:38     ` Björn Töpel
@ 2020-03-12  8:49       ` Gaul, Maximilian
  2020-03-12  9:10         ` Björn Töpel
  0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12  8:49 UTC (permalink / raw)
  To: Björn Töpel; +Cc: Xdp

Björn Töpel <bjorn.topel@gmail.com> wrote:

>On Thu, 12 Mar 2020 at 09:20, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>>
>> I don't know if this reply works but I will try.
>>
>
>It worked! :-)
>
>> On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
>> >>
>> >> Hello everyone,
>> >>
>> >
>> > Hi! I'm moving this to the XDP newbies list, which is a more proper
>> > place for these kind of discussions!
>> >
>> Sure, no problem. Thank you.
>> >>
>> >> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>> >>
>> >>
>> >> Just a few information at the start of this e-mail: My program is largely based on:   https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>> >>
>> >>
>> >> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>> >>
>> >>
>> >> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are  streams  to process.
>> >>
>> >>
>> >> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>> >>
>>
>> > Let's start with defining what shared-umem is: The idea is to share
>> > the same umem, fill ring, and completion ring for multiple
>> > sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
>> > hardware ring. It's a mechanism to load-balance a HW queue over
>> > multiple sockets.
>> >
>> > If I'm reading you correctly, you'd like a solution:
>> >
>> >            hw_q0,
>> > xsk_q0_0, xsk_q0_1, xsk_q0_2,
>> >
>> > instead of:
>> >
>> > hw_q0,    hw_q1,    hw_q2,
>> > xsk_q0_0, xsk_q1_0, xsk_q2_0,
>> >
>> > In the first case you'll need to mux the flows in the XDP program
>> > using an XSKMAP.
>> >
>> > Is this what you're trying to do?
>> >
>> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?
>
>No; one socket, one umem, one queue. Unless you're using shared umem,
>then multiple sockets, one umem, one queue.
>
>> >>
>> >>
>> >> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>> >>
>> >
>> > Yes, that is correct, and for a reason! :-) Note that if you'd like to
>> > do a multi-*process* setup with shared umem, you: need to have a
>> > control process that manages the fill/completion rings, and
>> > synchronize between the processes, OR re-mmap the fill/completetion
>> > ring from the socket owning the umem in multiple processes *and*
>> > synchronize the access to them. Neither is pleasant.
>> >
>> > Honestly, not a setup I'd recommend.
>> >
>> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to  create multiple sockets for the same RX-Queue.
>
>I would probably go for the first option, without shared umem, but
>that's really up to you! If you're going for the shared umem, I'd do
>it single process.
>

I am sorry, but I am confused: you just said *No; one socket, one umem, one queue.* How would I be able to follow your rough sketch of

                    hw_q0
xsk_q0_0, xsk_q0_1, xsk_q0_2

I don't have deep knowledge of XDP and the pipeline, so maybe there is something I am missing. I am sorry.

>> >> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another   process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>> >>
>> >
>> > Just for completeness; To setup shared umem:
>> >
>> > 1. create socket 0 and register the umem to this.
>> > 2. mmap the fr/cr using socket 0
>> > 3. create socket 1, 2, n and refer to socket 0 for the umem.
>> >
>> > So, in a multiprocess solution step 3 would be done in separate
>> > processes, and step 2 depending on your application. You'd need to
>> > pass socket 0 to the other processes *and* share the umem memory from
>> > the process where socket 0 was created. This is pretty much a threaded
>> > solution, given all the shared state.
>> >
>> > I advice not taking this path.
>> >
>> I am not entirely sure what you mean with *passing socket 0* is this just the fd of the socket? What's about the `struct xsk_umem`? Do I need that? I guess so because `xsk_socket__create()` has a parameter `struct xsk_umem`.
>> >>
>> >>
>> >> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process  then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags  for shared  umem accordingly.
>> >>
>> >>
>> >>
>> >> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because   I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>> >>
>> >>
>> >>
>> >>
>> >> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>> >>
>> >> struct xsk_socket_config xsk_cfg;
>> >> struct xsk_socket_info *xsk_info;
>> >> uint32_t idx;
>> >> uint32_t prog_id = 0;
>> >> int i;
>> >> int ret;
>> >>
>> >> xsk_info = calloc(1, sizeof(*xsk_info));
>> >> if (!xsk_info)
>> >> return NULL;
>> >>
>> >> xsk_info->umem = umem;
>> >> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>> >> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
>> >> xsk_cfg.libbpf_flags = 0;
>> >> xsk_cfg.xdp_flags = cfg->xdp_flags;
>> >> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
>> >> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>> >>
>> >> if (ret) {
>> >> fprintf(stderr, "FAIL 1\n");
>> >> goto error_exit;
>> >> }
>> >>
>> >> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
>> >> if (ret) {
>> >> fprintf(stderr, "FAIL 2\n");
>> >> goto error_exit;
>> >> }
>> >>
>> >> /* Initialize umem frame allocation */
>> >> for (i = 0; i < NUM_FRAMES; i++)
>> >> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>> >>
>> >> xsk_info->umem_frame_free = NUM_FRAMES;
>> >>
>> >> if(cfg->use_shrd_umem) {
>> >> return xsk_info;
>> >> }
>> >>         ...
>> >> }
>> >>
>> >> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>> >>
>> >> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>> >>
>> >> from   https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>> >>
>> >> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd  is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>> >>
>> >> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in   https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>> >>
>> >> Can you please help?
>> >>
>> >
>> > XDP sockets always use an XDP program, it just that a default one is
>> > provided if the use doesn't explicitly add one. Have a look at
>> > tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
>> > explicitly have a program that muxes over the sockets. A naïve variant
>> > can be found in samples/bpf/xdpsock_kern.c
>> >
>> >
>> > Cheers,
>> > Björn
>> >
>> >> Best regards
>> >>
>> >> Max
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Shared Umem between processes
  2020-03-12  8:49       ` AW: " Gaul, Maximilian
@ 2020-03-12  9:10         ` Björn Töpel
  2020-03-12  9:17           ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12  9:10 UTC (permalink / raw)
  To: Gaul, Maximilian; +Cc: Xdp

On Thu, 12 Mar 2020 at 09:49, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> >On Thu, 12 Mar 2020 at 09:20, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
> >>
> >> I don't know if this reply works but I will try.
> >>
> >
> >It worked! :-)
> >
> >> On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
> >> >>
> >> >> Hello everyone,
> >> >>
> >> >
> >> > Hi! I'm moving this to the XDP newbies list, which is a more proper
> >> > place for these kind of discussions!
> >> >
> >> Sure, no problem. Thank you.
> >> >>
> >> >> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
> >> >>
> >> >>
> >> >> Just a few information at the start of this e-mail: My program is largely based on:   https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
> >> >>
> >> >>
> >> >> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
> >> >>
> >> >>
> >> >> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there are  streams  to process.
> >> >>
> >> >>
> >> >> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
> >> >>
> >>
> >> > Let's start with defining what shared-umem is: The idea is to share
> >> > the same umem, fill ring, and completion ring for multiple
> >> > sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
> >> > hardware ring. It's a mechanism to load-balance a HW queue over
> >> > multiple sockets.
> >> >
> >> > If I'm reading you correctly, you'd like a solution:
> >> >
> >> >            hw_q0,
> >> > xsk_q0_0, xsk_q0_1, xsk_q0_2,
> >> >
> >> > instead of:
> >> >
> >> > hw_q0,    hw_q1,    hw_q2,
> >> > xsk_q0_0, xsk_q1_0, xsk_q2_0,
> >> >
> >> > In the first case you'll need to mux the flows in the XDP program
> >> > using an XSKMAP.
> >> >
> >> > Is this what you're trying to do?
> >> >
> >> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?
> >
> >No; one socket, one umem, one queue. Unless you're using shared umem,
> >then multiple sockets, one umem, one queue.
> >
> >> >>
> >> >>
> >> >> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
> >> >>
> >> >
> >> > Yes, that is correct, and for a reason! :-) Note that if you'd like to
> >> > do a multi-*process* setup with shared umem, you: need to have a
> >> > control process that manages the fill/completion rings, and
> >> > synchronize between the processes, OR re-mmap the fill/completetion
> >> > ring from the socket owning the umem in multiple processes *and*
> >> > synchronize the access to them. Neither is pleasant.
> >> >
> >> > Honestly, not a setup I'd recommend.
> >> >
> >> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to  create multiple sockets for the same RX-Queue.
> >
> >I would probably go for the first option, without shared umem, but
> >that's really up to you! If you're going for the shared umem, I'd do
> >it single process.
> >
>
> I am sorry but I am confused, you just said *No; one socket, one umem, one queue.*. How would I be able to follow your rough sketch of
>
>                     hw_q0
> xsk_q0_0, xsk_q0_1, xsk_q0_2
>
> I don't have deep knowledge about XDP and the pipeline, maybe there is something I am missing. I am sorry.
>

No worries! :-)

Above you wrote "I couldn't create multiple sockets (no sharing,
everyone with its own umem and rx/tx queues) tied to the same
RX-Queue. Maybe I did something wrong."  You can *only* tie multiple
sockets to one queue by using shared umem. What you described, "(no
sharing, everyone with its own umem and rx/tx queues) tied to the same
RX-Queue", is exactly the combination that does not work.
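
As a rough illustration of what "using shared umem" means below the
libbpf layer: the bind address of a socket that borrows another socket's
umem differs from a regular one only in the flag and the owner fd. The
sketch assumes Linux UAPI headers that provide `struct sockaddr_xdp` and
`XDP_SHARED_UMEM` in `<linux/if_xdp.h>`; the helper name
`xsk_fill_bind_addr` is hypothetical and is not part of libbpf.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* Linux address family number, for older libc headers */
#endif

/*
 * Hypothetical helper: fill in the bind address for one AF_XDP socket.
 *   owner_fd <  0: the socket registered its own umem (regular bind).
 *   owner_fd >= 0: the socket reuses the umem registered on owner_fd,
 *                  so the shared-umem flag and the owner fd are set.
 * The caller is expected to pass the same ifindex and queue id that the
 * owning socket was bound to.
 */
static void xsk_fill_bind_addr(struct sockaddr_xdp *sxdp, unsigned int ifindex,
			       unsigned int queue_id, int owner_fd)
{
	assert(sxdp != NULL);

	memset(sxdp, 0, sizeof(*sxdp));
	sxdp->sxdp_family = AF_XDP;
	sxdp->sxdp_ifindex = ifindex;
	sxdp->sxdp_queue_id = queue_id;

	if (owner_fd >= 0) {
		sxdp->sxdp_flags = XDP_SHARED_UMEM;
		sxdp->sxdp_shared_umem_fd = (__u32)owner_fd;
	}
}
```

The filled-in address would then be passed to bind() on the new socket.
When going through libbpf, xsk_socket__create takes care of this step,
so the helper is only meant to show where the two cases differ.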

If you'd like to go for the setup above, you can do this with libbpf
today (have a look at the sample, where opt_num_xsks > 1). That will
however be a single process solution.

Clearer?


Björn


> >> >> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another   process) needs so I figured the simplest solution would be to just copy the whole umem struct.
> >> >>
> >> >
> >> > Just for completeness; To setup shared umem:
> >> >
> >> > 1. create socket 0 and register the umem to this.
> >> > 2. mmap the fr/cr using socket 0
> >> > 3. create socket 1, 2, n and refer to socket 0 for the umem.
> >> >
> >> > So, in a multiprocess solution step 3 would be done in separate
> >> > processes, and step 2 depending on your application. You'd need to
> >> > pass socket 0 to the other processes *and* share the umem memory from
> >> > the process where socket 0 was created. This is pretty much a threaded
> >> > solution, given all the shared state.
> >> >
> >> > I advice not taking this path.
> >> >
> >> I am not entirely sure what you mean with *passing socket 0* is this just the fd of the socket? What's about the `struct xsk_umem`? Do I need that? I guess so because `xsk_socket__create()` has a parameter `struct xsk_umem`.
> >> >>
> >> >>
> >> >> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process  then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags  for shared  umem accordingly.
> >> >>
> >> >>
> >> >>
> >> >> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because   I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
> >> >>
> >> >> struct xsk_socket_config xsk_cfg;
> >> >> struct xsk_socket_info *xsk_info;
> >> >> uint32_t idx;
> >> >> uint32_t prog_id = 0;
> >> >> int i;
> >> >> int ret;
> >> >>
> >> >> xsk_info = calloc(1, sizeof(*xsk_info));
> >> >> if (!xsk_info)
> >> >> return NULL;
> >> >>
> >> >> xsk_info->umem = umem;
> >> >> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
> >> >> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
> >> >> xsk_cfg.libbpf_flags = 0;
> >> >> xsk_cfg.xdp_flags = cfg->xdp_flags;
> >> >> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
> >> >> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
> >> >>
> >> >> if (ret) {
> >> >> fprintf(stderr, "FAIL 1\n");
> >> >> goto error_exit;
> >> >> }
> >> >>
> >> >> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
> >> >> if (ret) {
> >> >> fprintf(stderr, "FAIL 2\n");
> >> >> goto error_exit;
> >> >> }
> >> >>
> >> >> /* Initialize umem frame allocation */
> >> >> for (i = 0; i < NUM_FRAMES; i++)
> >> >> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
> >> >>
> >> >> xsk_info->umem_frame_free = NUM_FRAMES;
> >> >>
> >> >> if(cfg->use_shrd_umem) {
> >> >> return xsk_info;
> >> >> }
> >> >>         ...
> >> >> }
> >> >>
> >> >> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
> >> >>
> >> >> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
> >> >>
> >> >> from   https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
> >> >>
> >> >> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd  is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
> >> >>
> >> >> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in   https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
> >> >>
> >> >> Can you please help?
> >> >>
> >> >
> >> > XDP sockets always use an XDP program, it just that a default one is
> >> > provided if the use doesn't explicitly add one. Have a look at
> >> > tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
> >> > explicitly have a program that muxes over the sockets. A naïve variant
> >> > can be found in samples/bpf/xdpsock_kern.c
> >> >
> >> >
> >> > Cheers,
> >> > Björn
> >> >
> >> >> Best regards
> >> >>
> >> >> Max
> >>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* AW: Shared Umem between processes
  2020-03-12  9:10         ` Björn Töpel
@ 2020-03-12  9:17           ` Gaul, Maximilian
  2020-03-12  9:40             ` Björn Töpel
  0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12  9:17 UTC (permalink / raw)
  To: Björn Töpel; +Cc: Xdp


On Thu, 12 Mar 2020 at 10:10, Björn Töpel <bjorn.topel@gmail.com> wrote:
    
>On Thu, 12 Mar 2020 at 09:49, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>>
>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>
>> >On Thu, 12 Mar 2020 at 09:20, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>> >>
>> >> I don't know if this reply works but I will try.
>> >>
>> >
>> >It worked! :-)
>> >
>> >> On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
>> >> >>
>> >> >> Hello everyone,
>> >> >>
>> >> >
>> >> > Hi! I'm moving this to the XDP newbies list, which is a more proper
>> >> > place for these kind of discussions!
>> >> >
>> >> Sure, no problem. Thank you.
>> >> >>
>> >> >> I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
>> >> >>
>> >> >>
>> >> >> Just a few information at the start of this e-mail: My program is largely based on:    https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
>> >> >>
>> >> >>
>> >> >> I am currently trying to build an application that enables me to process multiple udp-multicast streams at once in parallel (each with up to several ten-thousands of packets per second).
>> >> >>
>> >> >>
>> >> >> My first solution was to steer each multicast-stream on a separate RX-Queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as much user-space processes (each with a separate AF-XDP socket connected to one of the RX-Queues) as there  are  streams  to process.
>> >> >>
>> >> >>
>> >> >> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>> >> >>
>> >>
>> >> > Let's start with defining what shared-umem is: The idea is to share
>> >> > the same umem, fill ring, and completion ring for multiple
>> >> > sockets. The sockets sharing that umem/fr/cr are tied (bound) to one
>> >> > hardware ring. It's a mechanism to load-balance a HW queue over
>> >> > multiple sockets.
>> >> >
>> >> > If I'm reading you correctly, you'd like a solution:
>> >> >
>> >> >            hw_q0,
>> >> > xsk_q0_0, xsk_q0_1, xsk_q0_2,
>> >> >
>> >> > instead of:
>> >> >
>> >> > hw_q0,    hw_q1,    hw_q2,
>> >> > xsk_q0_0, xsk_q1_0, xsk_q2_0,
>> >> >
>> >> > In the first case you'll need to mux the flows in the XDP program
>> >> > using an XSKMAP.
>> >> >
>> >> > Is this what you're trying to do?
>> >> >
>> >> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?
>> >
>> >No; one socket, one umem, one queue. Unless you're using shared umem,
>> >then multiple sockets, one umem, one queue.
>> >
>> >> >>
>> >> >>
>> >> >> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>> >> >>
>> >> >
>> >> > Yes, that is correct, and for a reason! :-) Note that if you'd like to
>> >> > do a multi-*process* setup with shared umem, you: need to have a
>> >> > control process that manages the fill/completion rings, and
>> >> > synchronize between the processes, OR re-mmap the fill/completetion
>> >> > ring from the socket owning the umem in multiple processes *and*
>> >> > synchronize the access to them. Neither is pleasant.
>> >> >
>> >> > Honestly, not a setup I'd recommend.
>> >> >
>> >> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able  to  create multiple sockets for the same RX-Queue.
>> >
>> >I would probably go for the first option, without shared umem, but
>> >that's really up to you! If you're going for the shared umem, I'd do
>> >it single process.
>> >
>>
>> I am sorry but I am confused, you just said *No; one socket, one umem, one queue.*. How would I be able to follow your rough sketch of
>>
>>                     hw_q0
>> xsk_q0_0, xsk_q0_1, xsk_q0_2
>>
>> I don't have deep knowledge about XDP and the pipeline, maybe there is something I am missing. I am sorry.
>>
>
>No worries! :-)
>
>Above you wrote "I couldn't create multiple sockets (no sharing,
>everyone with its own umem and rx/tx queues) tied to the same
>RX-Queue. Maybe I did something wrong."  You can *only* tie multiple
>sockets to one queue by using shared umem. You said that "everyone
>with its own umem and rx/tx queues) tied to the same RX-Queue".
>
>If you'd like to go for the setup above, you can do this with libbpf
>today (have a look at the sample, where opt_num_xsks > 1). That will
>however be a single process solution.
>
>Clearer?
>
>
>Björn
>
>

Thank you so much Björn!

Just to wrap things up:

- If I want to distribute packet processing from a single RX-Queue to multiple sockets, I have to use shared umem, because otherwise it is not possible to bind multiple af-xdp sockets to the same RX-Queue.
- Furthermore, you would recommend going with a single-process / multi-threaded solution when using shared umem.

Is this correct?

Max

>> >> >> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from  another   process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>> >> >>
>> >> >
>> >> > Just for completeness; To setup shared umem:
>> >> >
>> >> > 1. create socket 0 and register the umem to this.
>> >> > 2. mmap the fr/cr using socket 0
>> >> > 3. create socket 1, 2, n and refer to socket 0 for the umem.
>> >> >
>> >> > So, in a multiprocess solution step 3 would be done in separate
>> >> > processes, and step 2 depending on your application. You'd need to
>> >> > pass socket 0 to the other processes *and* share the umem memory from
>> >> > the process where socket 0 was created. This is pretty much a threaded
>> >> > solution, given all the shared state.
>> >> >
>> >> > I advice not taking this path.
>> >> >
>> >> I am not entirely sure what you mean with *passing socket 0* is this just the fd of the socket? What's about the `struct xsk_umem`? Do I need that? I guess so because `xsk_socket__create()` has a parameter `struct xsk_umem`.
>> >> >>
>> >> >>
>> >> >> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process. This sub-process   then  reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags   for shared  umem accordingly.
>> >> >>
>> >> >>
>> >> >>
>> >> >> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process  because   I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that:
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {
>> >> >>
>> >> >> struct xsk_socket_config xsk_cfg;
>> >> >> struct xsk_socket_info *xsk_info;
>> >> >> uint32_t idx;
>> >> >> uint32_t prog_id = 0;
>> >> >> int i;
>> >> >> int ret;
>> >> >>
>> >> >> xsk_info = calloc(1, sizeof(*xsk_info));
>> >> >> if (!xsk_info)
>> >> >> return NULL;
>> >> >>
>> >> >> xsk_info->umem = umem;
>> >> >> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
>> >> >> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
>> >> >> xsk_cfg.libbpf_flags = 0;
>> >> >> xsk_cfg.xdp_flags = cfg->xdp_flags;
>> >> >> xsk_cfg.bind_flags = cfg->xsk_bind_flags;
>> >> >> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);
>> >> >>
>> >> >> if (ret) {
>> >> >> fprintf(stderr, "FAIL 1\n");
>> >> >> goto error_exit;
>> >> >> }
>> >> >>
>> >> >> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
>> >> >> if (ret) {
>> >> >> fprintf(stderr, "FAIL 2\n");
>> >> >> goto error_exit;
>> >> >> }
>> >> >>
>> >> >> /* Initialize umem frame allocation */
>> >> >> for (i = 0; i < NUM_FRAMES; i++)
>> >> >> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;
>> >> >>
>> >> >> xsk_info->umem_frame_free = NUM_FRAMES;
>> >> >>
>> >> >> if(cfg->use_shrd_umem) {
>> >> >> return xsk_info;
>> >> >> }
>> >> >>         ...
>> >> >> }
>> >> >>
>> >> >> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
>> >> >>
>> >> >> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
>> >> >>
>> >> >> from    https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
>> >> >>
>> >> >> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket  fd  is  not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>> >> >>
>> >> >> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in    https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented.
>> >> >>
>> >> >> Can you please help?
>> >> >>
>> >> >
>> >> > XDP sockets always use an XDP program, it just that a default one is
>> >> > provided if the use doesn't explicitly add one. Have a look at
>> >> > tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to
>> >> > explicitly have a program that muxes over the sockets. A naïve variant
>> >> > can be found in samples/bpf/xdpsock_kern.c
>> >> >
>> >> >
>> >> > Cheers,
>> >> > Björn
>> >> >
>> >> >> Best regards
>> >> >>
>> >> >> Max
>> >>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Shared Umem between processes
  2020-03-12  9:17           ` AW: " Gaul, Maximilian
@ 2020-03-12  9:40             ` Björn Töpel
  2020-03-12 11:00               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12  9:40 UTC (permalink / raw)
  To: Gaul, Maximilian; +Cc: Xdp

On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
[...]
>
> Thank you so much Björn!
>
> just to wrap things up:
>
> - if I want to distribute packet processing from a single RX-Queue to multiple sockets I have to use shared umem because it is not possible to bind multiple af-xdp sockets onto the same RX-Queue

Correct! And you need a tailored XDP program that spreads over the
shared umem sockets!

> - furthermore, you would recommend to go with a single process / multiple threads solution in case of shared umem
>

Yes, and while it is possible to do multi-process, the solution would be
complex with little gain (IMO).
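
Even in the single-process variant, the sockets share one umem, one fill
ring and one completion ring, so the threads have to serialize access to
that shared state. A minimal sketch of the idea, modelled on the
umem_frame_addr / umem_frame_free bookkeeping in the tutorial code quoted
in this thread: a frame pool guarded by a C11 spinlock. All names and
sizes here are hypothetical, and a real application would guard its fill
and completion ring operations in the same way (or use a mutex).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_FRAMES     8u          /* hypothetical; stands in for NUM_FRAMES */
#define POOL_FRAME_SIZE 2048u       /* hypothetical; stands in for FRAME_SIZE */
#define POOL_INVALID    UINT64_MAX  /* returned when no frame is free */

/* Free-frame stack shared by all threads that use the same umem. */
struct frame_pool {
	atomic_flag lock;
	uint32_t free_count;
	uint64_t addr[POOL_FRAMES];
};

static void pool_lock(struct frame_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		; /* spin until the current holder releases the lock */
}

static void pool_unlock(struct frame_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Every frame starts out free; addresses are offsets into the umem. */
static void pool_init(struct frame_pool *p)
{
	atomic_flag_clear(&p->lock);
	for (uint32_t i = 0; i < POOL_FRAMES; i++)
		p->addr[i] = (uint64_t)i * POOL_FRAME_SIZE;
	p->free_count = POOL_FRAMES;
}

/* Take one free frame, or POOL_INVALID if the pool is empty. */
static uint64_t pool_alloc(struct frame_pool *p)
{
	uint64_t frame = POOL_INVALID;

	pool_lock(p);
	if (p->free_count > 0)
		frame = p->addr[--p->free_count];
	pool_unlock(p);
	return frame;
}

/* Return a frame to the pool once its packet has been processed. */
static void pool_free(struct frame_pool *p, uint64_t frame)
{
	pool_lock(p);
	assert(p->free_count < POOL_FRAMES);
	p->addr[p->free_count++] = frame;
	pool_unlock(p);
}
```

Each RX thread would take frames from the pool before refilling the fill
ring and give them back after processing, which keeps two sockets from
handing the same umem frame to the kernel twice.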

And please reach out if you have any more questions!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Shared Umem between processes
  2020-03-12  9:40             ` Björn Töpel
@ 2020-03-12 11:00               ` Toke Høiland-Jørgensen
  2020-03-12 11:29                 ` Björn Töpel
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-03-12 11:00 UTC (permalink / raw)
  To: Björn Töpel, Gaul, Maximilian; +Cc: Xdp

Björn Töpel <bjorn.topel@gmail.com> writes:

> On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>>
> [...]
>>
>> Thank you so much Björn!
>>
>> just to wrap things up:
>>
>> - if I want to distribute packet processing from a single RX-Queue to
>> multiple sockets I have to use shared umem because it is not possible
>> to bind multiple af-xdp sockets onto the same RX-Queue
>
> Correct! And you need a tailored XDP program that spreads over the
> shared umem sockets!

Could we lift this restriction? Not with zero-copy, obviously, but if
there's a copy involved it seems it should be possible to support
several sockets on the same RXQ? That would make it possible to use XDP
as a per-CPU load balancer for a single RXQ, like we can do with cpumap
for packets hitting the stack today?

-Toke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Shared Umem between processes
  2020-03-12 11:00               ` Toke Høiland-Jørgensen
@ 2020-03-12 11:29                 ` Björn Töpel
  2020-03-12 15:48                   ` AW: " Gaul, Maximilian
  0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 11:29 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: Gaul, Maximilian, Xdp

On Thu, 12 Mar 2020 at 12:01, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Björn Töpel <bjorn.topel@gmail.com> writes:
>
> > On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
> >>
> > [...]
> >>
> >> Thank you so much Björn!
> >>
> >> just to wrap things up:
> >>
> >> - if I want to distribute packet processing from a single RX-Queue to
> >> multiple sockets I have to use shared umem because it is not possible
> >> to bind multiple af-xdp sockets onto the same RX-Queue
> >
> > Correct! And you need a tailored XDP program that spreads over the
> > shared umem sockets!
>
> Could we lift this restriction? Not with zero-copy, obviously, but if
> there's a copy involved it seems it should be possible to support
> several sockets on the same RXQ? That would make it possible to use XDP
> as a per-CPU load balancer for a single RXQ, like we can do with cpumap
> for packets hitting the stack today?
>

Yes! It's on the (never ending) TODO list. Never heard that before, right? :-(


Cheers,
Björn

> -Toke
>

* AW: Shared Umem between processes
  2020-03-12 11:29                 ` Björn Töpel
@ 2020-03-12 15:48                   ` Gaul, Maximilian
  2020-03-15 15:29                     ` Gaul, Maximilian
  0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12 15:48 UTC (permalink / raw)
  To: Björn Töpel, Toke Høiland-Jørgensen; +Cc: Xdp

I am not sure whether it is appropriate to ask a follow-up question on the same topic in the same thread. If it is not, I will start a new one, but I thought it would fit "my story on how to get shared umem working".

I followed https://github.com/xdp-project/bpf-next/blob/master/samples/bpf/xdpsock_user.c on how to implement shared umem and I think I got it working to some extent.

There are some problems though:

I am processing two multicast streams arriving on the same RX-Queue on two sockets (using shared umem). Each socket runs in its own thread (same process).
What I noticed: everything seems to work fine for about one minute (although I notice some packet loss at 530,000 pps), but after that the data rate drops to half, and after one more minute to a quarter.

My first thought was that the umem frames reserved by calling `xsk_ring_prod__reserve` are not freed properly (similar to a memory leak). To test this, I decreased the size of the umem to a tenth, expecting the packet rate to drop even sooner, and indeed it did!

Basically, I create a new thread for each socket and pass it the corresponding `xsk_socket_info` struct. Each thread then calls `nanosleep` for 2.5 ms in a while-loop and processes every frame that has arrived:

static void* rx_and_process(void *a) {

	struct pthread_arg *arg = (struct pthread_arg*)a;
	struct config *cfg = arg->cfg;
	struct pckt_idntfy_stats *pckt = arg->pckt_idntfy;
	struct xsk_socket_info *xsk_socket = arg->xsk_socket;

	struct timespec spec;
	spec.tv_sec = 0;
	spec.tv_nsec = 2500 * 1000;

	struct timespec remaining;

	while(!global_exit) {
		if(nanosleep(&spec, &remaining) < 0) {
			nanosleep(&spec, &remaining);
		}
		handle_receive_packets(xsk_socket, fds);
	}

	return NULL;
}

`pckt_idntfy_stats` contains information about where the statistics about this multicast-stream should be placed in shared memory.
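
In the loop above, an interrupted `nanosleep` is followed by one more sleep of the full interval, so the actual polling period can vary. A minimal sketch of a helper that resumes with the remaining time instead; `sleep_full_interval` is a hypothetical name, and only the standard `nanosleep` call is assumed:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Sleep for the whole interval, resuming with the remaining time if a
 * signal interrupts the sleep. Returns 0 on success, -1 on any error
 * other than EINTR (errno is left as set by nanosleep).
 */
int sleep_full_interval(const struct timespec *interval)
{
	struct timespec req, rem;

	assert(interval != NULL);
	req = *interval;
	while (nanosleep(&req, &rem) < 0) {
		if (errno != EINTR)
			return -1;
		req = rem;	/* continue with what is left */
	}
	return 0;
}
```

Called with `{ .tv_sec = 0, .tv_nsec = 2500 * 1000 }`, it would replace the two `nanosleep` calls in the loop.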

Processing then happens like this:

static void handle_receive_packets(struct xsk_socket_info *xsk_socket, struct pollfd *fds) {
	unsigned int rcvd, i;
	uint32_t idx_rx = 0, idx_fq = 0;
	int ret;

	rcvd = xsk_ring_cons__peek(&xsk_socket->rx, INT32_MAX, &idx_rx);
	if (!rcvd) {
		/* no packets received, go to sleep */
		return;
	}

	ret = xsk_ring_prod__reserve(&xsk_socket->umem->fq, rcvd, &idx_fq);
	if (ret < 0) {
		fprintf(stderr, "Error: %s\n", strerror(-ret));
		return;
	} else if(ret == 0) {
		printf("NO SPACE LEFT!\n");
		return;
	} else if(ret != rcvd) {
		printf("RET != RCVD\n");
		return;
	}

	for (i = 0; i < rcvd; i++) {
		uint64_t addr = xsk_ring_cons__rx_desc(&xsk_socket->rx, idx_rx)->addr;
		uint32_t len = xsk_ring_cons__rx_desc(&xsk_socket->rx, idx_rx++)->len;
		uint64_t orig = xsk_umem__extract_addr(addr);

		addr = xsk_umem__add_offset_to_addr(addr);

		process_packet(xsk_socket, addr, len);
		*xsk_ring_prod__fill_addr(&xsk_socket->umem->fq, idx_fq++) = orig;
		
		xsk_socket->stats.rx_bytes += len;
	}

	xsk_ring_prod__submit(&xsk_socket->umem->fq, rcvd);
	xsk_ring_cons__release(&xsk_socket->rx, rcvd);
	xsk_socket->stats.rx_packets += rcvd;
}

I am sorry to post all this code here but maybe it helps?

This is how I configured the umem (basically a 1:1 copy from `xdpsock_user.c`):

static struct xsk_umem_info *configure_xsk_umem(void *buffer, uint64_t size) {

	struct xsk_umem_info *umem;
	struct xsk_umem_config cfg = {
		.fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
		.frame_size = FRAME_SIZE,
		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
		.flags = 0
	};
	int ret;

	umem = calloc(1, sizeof(*umem));
	if (!umem) {
		fprintf(stderr, "Error while allocating umem: %s\n", strerror(errno));
		exit(1);
	}

	ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq, &cfg);
	if (ret) {
		fprintf(stderr, "`xsk_umem__create` returned: %s\n", strerror(-ret));
		exit(1);
	}

	umem->buffer = buffer;
	return umem;
}

and after that I call:

static void xsk_populate_fill_ring(struct xsk_umem_info *umem) {
	int ret, i;
	uint32_t idx;

	ret = xsk_ring_prod__reserve(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS) {
		fprintf(stderr, "Failed to reserve prod ring: %s\n", strerror(errno));
		exit(1);
	}
	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++) {
		*xsk_ring_prod__fill_addr(&umem->fq, idx++) = i * FRAME_SIZE;
	}
	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
}
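
The function above hands frame addresses to the fill ring as `i * FRAME_SIZE`. To check whether frames go missing over time, the same addresses could be tracked in a small user-space pool with a free counter. This is only a sketch: all names are hypothetical, and the frame count and frame size are assumed values, not taken from the code above.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_NUM_FRAMES    4096u      /* assumption */
#define POOL_FRAME_SIZE    2048u      /* assumption */
#define POOL_INVALID_FRAME UINT64_MAX

struct frame_pool {
	uint64_t addrs[POOL_NUM_FRAMES]; /* stack of free frame offsets */
	uint32_t free_count;
};

/* Every frame starts out free; offsets are multiples of the frame size. */
void frame_pool_init(struct frame_pool *p)
{
	uint32_t i;

	for (i = 0; i < POOL_NUM_FRAMES; i++)
		p->addrs[i] = (uint64_t)i * POOL_FRAME_SIZE;
	p->free_count = POOL_NUM_FRAMES;
}

/* Take one free frame, or POOL_INVALID_FRAME if none is left. */
uint64_t frame_pool_alloc(struct frame_pool *p)
{
	if (p->free_count == 0)
		return POOL_INVALID_FRAME;
	return p->addrs[--p->free_count];
}

/* Give a frame back; the address must be a valid frame offset. */
void frame_pool_free(struct frame_pool *p, uint64_t addr)
{
	assert(p->free_count < POOL_NUM_FRAMES);
	assert(addr % POOL_FRAME_SIZE == 0 &&
	       addr / POOL_FRAME_SIZE < POOL_NUM_FRAMES);
	p->addrs[p->free_count++] = addr;
}
```

If `free_count` plus the number of frames currently owned by the rings stops adding up to the total, frames are leaking somewhere between receive and refill.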

And sockets are created this way:

static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, struct config *cfg, bool rx, bool tx) {
	struct xsk_socket_config xsk_socket_cfg;
	struct xsk_socket_info *xsk;
	struct xsk_ring_cons *rxr;
	struct xsk_ring_prod *txr;
	int ret;

	xsk = calloc(1, sizeof(*xsk));
	if (!xsk) {
		fprintf(stderr, "xsk `calloc` failed: %s\n", strerror(errno));
		exit(1);
	}

	xsk->umem = umem;
	xsk_socket_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
	xsk_socket_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
	if (cfg->ip_addrs_len > 1) {
		xsk_socket_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
	} else {
		xsk_socket_cfg.libbpf_flags = 0;
	}
	xsk_socket_cfg.xdp_flags = cfg->xdp_flags;
	xsk_socket_cfg.bind_flags = cfg->xsk_bind_flags;

	rxr = rx ? &xsk->rx : NULL;
	txr = tx ? &xsk->tx : NULL;
	ret = xsk_socket__create(&xsk->xsk, cfg->ifname_buf, cfg->xsk_if_queue, umem->umem, rxr, txr, &xsk_socket_cfg);
	if (ret) {
		fprintf(stderr, "`xsk_socket__create` returned error: %s\n", strerror(-ret));
		exit(-ret);
	}

	return xsk;
}

As far as I can see from `xdpsock_user.c`, sockets that use shared umem do not require any special handling. What am I missing?

Best regards

Max

* AW: Shared Umem between processes
  2020-03-12 15:48                   ` AW: " Gaul, Maximilian
@ 2020-03-15 15:29                     ` Gaul, Maximilian
  0 siblings, 0 replies; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-15 15:29 UTC (permalink / raw)
  To: Xdp

I figured out that the problem was that I did not synchronize access to the umem between the AF-XDP sockets.
I now do that in a round-robin fashion, as described in `xdpsock_user.c`.
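
Another way to serialize access, sketched here as an alternative to servicing the sockets in turn, is a mutex around the shared fill queue. The toy queue below stands in for `umem->fq`; in the real code the lock would wrap the `xsk_ring_prod__reserve` / `xsk_ring_prod__fill_addr` / `xsk_ring_prod__submit` sequence. All names in the sketch are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define TOY_FQ_SIZE 8u  /* power of two, so masking wraps the index */

struct toy_fill_queue {
	pthread_mutex_t lock;
	uint64_t slots[TOY_FQ_SIZE];
	uint32_t prod;   /* next slot user space writes */
	uint32_t cons;   /* advanced by the "kernel" as it takes frames */
};

void toy_fq_init(struct toy_fill_queue *fq)
{
	pthread_mutex_init(&fq->lock, NULL);
	fq->prod = 0;
	fq->cons = 0;
}

/*
 * Return n frames to the shared queue. All-or-nothing: either every
 * address is queued and n is returned, or nothing is queued and 0 is
 * returned so that the caller can retry later.
 */
uint32_t toy_fq_return_frames(struct toy_fill_queue *fq,
			      const uint64_t *addrs, uint32_t n)
{
	uint32_t i, queued = 0;

	assert(fq != NULL && addrs != NULL);
	pthread_mutex_lock(&fq->lock);
	if (TOY_FQ_SIZE - (fq->prod - fq->cons) >= n) {
		for (i = 0; i < n; i++)
			fq->slots[(fq->prod + i) & (TOY_FQ_SIZE - 1)] = addrs[i];
		fq->prod += n;
		queued = n;
	}
	pthread_mutex_unlock(&fq->lock);
	return queued;
}
```

Each receive thread would call the locked helper after processing its batch, so that no two threads ever reserve and submit on the shared ring at the same time.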

> On Thu, 12 Mar 2020 at 16:48, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
> [...]

Thread overview: 12+ messages
2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
2020-03-12  7:55 ` Björn Töpel
2020-03-12  8:20   ` AW: " Gaul, Maximilian
2020-03-12  8:38     ` Björn Töpel
2020-03-12  8:49       ` AW: " Gaul, Maximilian
2020-03-12  9:10         ` Björn Töpel
2020-03-12  9:17           ` AW: " Gaul, Maximilian
2020-03-12  9:40             ` Björn Töpel
2020-03-12 11:00               ` Toke Høiland-Jørgensen
2020-03-12 11:29                 ` Björn Töpel
2020-03-12 15:48                   ` AW: " Gaul, Maximilian
2020-03-15 15:29                     ` Gaul, Maximilian
