* Shared Umem between processes
@ 2020-03-11 15:58 Gaul, Maximilian
2020-03-12 7:55 ` Björn Töpel
0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-11 15:58 UTC (permalink / raw)
To: bpf@vger.kernel.org
Hello everyone,
I am not sure if this is the correct address for my question / problem but I was forwarded to this e-mail from the libbpf github-issue section, so this is my excuse.
Some background first: my program is largely based on https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP and I am using libbpf: https://github.com/libbpf/libbpf
I am currently trying to build an application that processes multiple UDP multicast streams in parallel (each with up to several tens of thousands of packets per second).
My first solution was to steer each multicast stream to a separate RX queue on my NIC via `ethtool -N <if> flow-type udp4 ...` and to spawn as many user-space processes (each with a separate AF_XDP socket bound to one of the RX queues) as there are streams to process.
But because this solution is limited by the number of RX queues the NIC has, and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
I ran into the problem that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the contents of the original socket / umem into shared memory. I am not sure what information the sub-process (the one using the umem from another process) needs, so I figured the simplest solution would be to copy the whole umem struct.
So I went with the "quick fix" of moving the definition of `struct xsk_umem` into `xsk.h` and copying the umem information from the original process into shared memory. This process then calls `fork()`, spawning a sub-process. The sub-process reads the previously written umem information from shared memory and passes it to `xsk_configure_socket` (af_xdp_user.c), which eventually calls `xsk_socket__create` in `xsk.c`. That function checks `umem->refcount` and sets the flags for shared umem accordingly.
After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). After that call I return from `xsk_configure_socket` in the sub-process, because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` afterwards:
static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) {

    struct xsk_socket_config xsk_cfg;
    struct xsk_socket_info *xsk_info;
    uint32_t idx;
    uint32_t prog_id = 0;
    int i;
    int ret;

    xsk_info = calloc(1, sizeof(*xsk_info));
    if (!xsk_info)
        return NULL;

    xsk_info->umem = umem;
    xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
    xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
    xsk_cfg.libbpf_flags = 0;
    xsk_cfg.xdp_flags = cfg->xdp_flags;
    xsk_cfg.bind_flags = cfg->xsk_bind_flags;
    ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue,
                             umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg);

    if (ret) {
        fprintf(stderr, "FAIL 1\n");
        goto error_exit;
    }

    ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags);
    if (ret) {
        fprintf(stderr, "FAIL 2\n");
        goto error_exit;
    }

    /* Initialize umem frame allocation */
    for (i = 0; i < NUM_FRAMES; i++)
        xsk_info->umem_frame_addr[i] = i * FRAME_SIZE;

    xsk_info->umem_frame_free = NUM_FRAMES;

    if (cfg->use_shrd_umem) {
        return xsk_info;
    }
    ...
}
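A side note on the frame allocation loop above: every socket that runs it ends up believing it owns all NUM_FRAMES frames of the umem. Whether that is related to the crash is not clear from the snippet, but with a shared umem it seems worth avoiding. A minimal sketch of one possible approach, assuming a fixed split of the frame range per socket; `struct frame_pool`, `frame_pool_init`, `sock_idx` and `num_socks` are hypothetical names for illustration and are not part of libbpf or the tutorial:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048
#define INVALID_UMEM_FRAME UINT64_MAX

/* Hypothetical per-socket allocator: a stack of free frame addresses. */
struct frame_pool {
    uint64_t addr[NUM_FRAMES];
    uint32_t free;
};

/* Give socket `sock_idx` (out of `num_socks`) a disjoint slice of the umem,
 * so two sockets never hand out the same frame address. */
static void frame_pool_init(struct frame_pool *p, uint32_t sock_idx,
                            uint32_t num_socks)
{
    assert(num_socks > 0 && sock_idx < num_socks);

    uint32_t per_sock = NUM_FRAMES / num_socks;
    uint32_t first = sock_idx * per_sock;

    for (uint32_t i = 0; i < per_sock; i++)
        p->addr[i] = (uint64_t)(first + i) * FRAME_SIZE;
    p->free = per_sock;
}

/* Pop a free frame address, or INVALID_UMEM_FRAME if the slice is exhausted. */
static uint64_t frame_pool_alloc(struct frame_pool *p)
{
    if (p->free == 0)
        return INVALID_UMEM_FRAME;
    return p->addr[--p->free];
}

/* Push a frame address back onto this socket's free stack. */
static void frame_pool_free(struct frame_pool *p, uint64_t frame)
{
    assert(p->free < NUM_FRAMES);
    p->addr[p->free++] = frame;
}
```

With two sockets, `frame_pool_init(&a, 0, 2)` and `frame_pool_init(&b, 1, 2)` give each side half of the umem. The static split is only the simplest option; it wastes frames when the streams are uneven.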
Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement:
However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you.
from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag
I didn't know that libbpf loads an XDP program. Why would it do that? I am using my own AF_XDP program, which filters for UDP packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the AF_XDP socket fd is not put into the kernel `xsks-map`, which basically means that I don't receive any packets.
As you have probably noticed, I am overwhelmed by the concept of shared umem, and I have to say there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail on a Linux mailing list from Nov. 2019 stating that this feature is now implemented.
Can you please help?
Best regards
Max
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Shared Umem between processes
2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
@ 2020-03-12 7:55 ` Björn Töpel
2020-03-12 8:20 ` AW: " Gaul, Maximilian
0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 7:55 UTC (permalink / raw)
To: Gaul, Maximilian, Xdp; +Cc: bpf@vger.kernel.org
On Wed, 11 Mar 2020 at 16:59, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> Hello everyone,
>
Hi! I'm moving this to the XDP newbies list, which is a more proper place for this kind of discussion!
> But because this solution is limited to the amount of RX-Queues the NIC has and I wanted to build something hardware-independent, I looked around a bit and found a feature called `XDP_SHARED_UMEM`.
>
Let's start with defining what shared-umem is: The idea is to share the same umem, fill ring, and completion ring for multiple sockets. The sockets sharing that umem/fr/cr are tied (bound) to one hardware ring. It's a mechanism to load-balance a HW queue over multiple sockets.
If I'm reading you correctly, you'd like a solution:

  hw_q0,
  xsk_q0_0, xsk_q0_1, xsk_q0_2,

instead of:

  hw_q0,    hw_q1,    hw_q2,
  xsk_q0_0, xsk_q1_0, xsk_q2_0,

In the first case you'll need to mux the flows in the XDP program using an XSKMAP.
Is this what you're trying to do?
> As far as I understand (please correct me if I am wrong), at the moment libbpf only supports shared umem between threads of a process but not between processes - right?
>
Yes, that is correct, and for a reason! :-) Note that if you'd like to do a multi-*process* setup with shared umem, you need to have a control process that manages the fill/completion rings, and synchronize between the processes, OR re-mmap the fill/completion ring from the socket owning the umem in multiple processes *and* synchronize the access to them. Neither is pleasant.
Honestly, not a setup I'd recommend.
> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>
Just for completeness; to set up shared umem:
1. create socket 0 and register the umem to this.
2. mmap the fr/cr using socket 0
3. create socket 1, 2, n and refer to socket 0 for the umem.
So, in a multiprocess solution step 3 would be done in separate processes, and step 2 depending on your application. You'd need to pass socket 0 to the other processes *and* share the umem memory from the process where socket 0 was created. This is pretty much a threaded solution, given all the shared state.
I advise not taking this path.
> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is not put into the kernel `xsks-map` which basically means that I don't receive any packets.
>
> Can you please help?
>
XDP sockets always use an XDP program; it's just that a default one is provided if the user doesn't explicitly add one. Have a look at tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to explicitly have a program that muxes over the sockets. A naïve variant can be found in samples/bpf/xdpsock_kern.c
Cheers,
Björn
> Best regards
>
> Max
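The per-packet "mux over the sockets" decision described in the reply above can be modelled in plain C. This is a minimal user-space sketch, assuming one multicast group per socket slot; `struct flow_entry`, `pick_xsk_slot` and `MAX_SOCKS` are hypothetical names, not taken from the kernel samples. In a real XDP program the chosen slot would be used as the key for a redirect into the XSKMAP (for example via the `bpf_redirect_map` helper), which is not shown here because it needs the BPF toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SOCKS 4

/* Hypothetical flow table entry: one UDP multicast stream per socket slot. */
struct flow_entry {
    uint32_t group_addr; /* IPv4 destination address, host byte order */
    uint16_t udp_port;   /* UDP destination port, host byte order */
};

/* Return the XSKMAP slot for a packet, or -1 if no socket wants it
 * (a real program would then typically return XDP_PASS). */
static int pick_xsk_slot(const struct flow_entry *table, size_t n,
                         uint32_t dst_addr, uint16_t dst_port)
{
    assert(n <= MAX_SOCKS);

    for (size_t i = 0; i < n; i++) {
        if (table[i].group_addr == dst_addr && table[i].udp_port == dst_port)
            return (int)i;
    }
    return -1;
}
```

A table lookup keeps each stream on one socket. Distributing packets round-robin over the slots would balance load instead, but it would spread a single stream across several consumers.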
* AW: Shared Umem between processes
2020-03-12 7:55 ` Björn Töpel
@ 2020-03-12 8:20 ` Gaul, Maximilian
2020-03-12 8:38 ` Björn Töpel
0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12 8:20 UTC (permalink / raw)
To: Björn Töpel, Xdp; +Cc: bpf@vger.kernel.org
I don't know if this reply works but I will try.
On Thu, 12 Mar 2020 at 08:55, Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> Hi! I'm moving this to the XDP newbies list, which is a more proper
> place for these kind of discussions!
>
Sure, no problem. Thank you.
> If I'm reading you correctly, you'd like a solution:
>
>   hw_q0,
>   xsk_q0_0, xsk_q0_1, xsk_q0_2,
>
> instead of:
>
>   hw_q0,    hw_q1,    hw_q2,
>   xsk_q0_0, xsk_q1_0, xsk_q2_0,
>
> In the first case you'll need to mux the flows in the XDP program
> using an XSKMAP.
>
> Is this what you're trying to do?
>
Yes, it is. But I had the problem that I couldn't create multiple sockets (no sharing, each with its own umem and rx/tx queues) tied to the same RX queue. Maybe I did something wrong. Is this possible at all?
> Yes, that is correct, and for a reason! :-) Note that if you'd like to
> do a multi-*process* setup with shared umem, you: need to have a
> control process that manages the fill/completion rings, and
> synchronize between the processes, OR re-mmap the fill/completion
> ring from the socket owning the umem in multiple processes *and*
> synchronize the access to them. Neither is pleasant.
>
> Honestly, not a setup I'd recommend.
>
This indeed sounds very unpleasant. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets to the sockets via an XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX queue.
> So, in a multiprocess solution step 3 would be done in separate
> processes, and step 2 depending on your application. You'd need to
> pass socket 0 to the other processes *and* share the umem memory from
> the process where socket 0 was created. This is pretty much a threaded
> solution, given all the shared state.
>
> I advise not taking this path.
>
I am not entirely sure what you mean by *passing socket 0*. Is this just the fd of the socket? What about the `struct xsk_umem`? Do I need that? I guess so, because `xsk_socket__create()` has a parameter `struct xsk_umem`.
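On the question of what "passing socket 0" could look like at the fd level: a forked child simply inherits the parent's descriptors, while unrelated processes can hand a descriptor over with `SCM_RIGHTS` on an AF_UNIX socket. The following is a minimal sketch of that standard mechanism only. The helper names `send_fd` and `recv_fd` are made up for illustration, and the sketch says nothing about how libbpf's private `struct xsk_umem` would be reconstructed on the receiving side, nor does it share the umem memory itself.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int chan, int fd)
{
    char dummy = 'x'; /* at least one byte of payload must accompany the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr hdr; /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new fd or -1 on failure. */
static int recv_fd(int chan)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file description as the sender's, so it keeps working after the sender closes its own copy.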
* Re: Shared Umem between processes
2020-03-12 8:20 ` AW: " Gaul, Maximilian
@ 2020-03-12 8:38 ` Björn Töpel
2020-03-12 8:49 ` AW: " Gaul, Maximilian
0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 8:38 UTC (permalink / raw)
To: Gaul, Maximilian; +Cc: Xdp
On Thu, 12 Mar 2020 at 09:20, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
> I don't know if this reply works but I will try.
>
It worked! :-)
> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?
No; one socket, one umem, one queue. Unless you're using shared umem, then multiple sockets, one umem, one queue.
> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX-Queue.
I would probably go for the first option, without shared umem, but that's really up to you! If you're going for the shared umem, I'd do it single process.
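Even in the single-process variant recommended above, the sockets share one fill ring and one completion ring, so threads that touch those rings have to take turns. The sketch below uses a toy ring, not libbpf's `xsk_ring_prod`, to show the idea of serializing producers with a lock. `struct toy_fill_ring` and its functions are hypothetical, and the spinlock built on C11 `atomic_flag` is just one possible choice of lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8 /* must be a power of two */

/* Toy model of a fill ring shared by several threads. */
struct toy_fill_ring {
    atomic_flag lock;
    uint32_t prod; /* free-running producer index */
    uint32_t cons; /* free-running consumer index */
    uint64_t desc[RING_SIZE];
};

static void ring_init(struct toy_fill_ring *r)
{
    assert((RING_SIZE & (RING_SIZE - 1)) == 0);
    atomic_flag_clear(&r->lock);
    r->prod = 0;
    r->cons = 0;
}

static void ring_lock(struct toy_fill_ring *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ; /* spin until the current holder releases the lock */
}

static void ring_unlock(struct toy_fill_ring *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Queue one frame address; returns 0 on success, -1 if the ring is full. */
static int ring_push_locked(struct toy_fill_ring *r, uint64_t frame_addr)
{
    int ret = -1;

    ring_lock(r);
    if (r->prod - r->cons < RING_SIZE) {
        r->desc[r->prod & (RING_SIZE - 1)] = frame_addr;
        r->prod++;
        ret = 0;
    }
    ring_unlock(r);
    return ret;
}

/* Take one frame address (the kernel's role for a real fill ring);
 * returns 0 on success, -1 if the ring is empty. */
static int ring_pop_locked(struct toy_fill_ring *r, uint64_t *frame_addr)
{
    int ret = -1;

    ring_lock(r);
    if (r->prod != r->cons) {
        *frame_addr = r->desc[r->cons & (RING_SIZE - 1)];
        r->cons++;
        ret = 0;
    }
    ring_unlock(r);
    return ret;
}
```

With real AF_XDP rings the consumer of the fill ring is the kernel, so only the user-space producers need such a lock. The same reasoning applies in mirror image to the completion ring.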
* AW: Shared Umem between processes
2020-03-12 8:38 ` Björn Töpel
@ 2020-03-12 8:49 ` Gaul, Maximilian
2020-03-12 9:10 ` Björn Töpel
0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12 8:49 UTC (permalink / raw)
To: Björn Töpel; +Cc: Xdp
Björn Töpel <bjorn.topel@gmail.com> wrote:
>> Yes it is. But I had the problem that I couldn't create multiple sockets (no sharing, everyone with its own umem and rx/tx queues) tied to the same RX-Queue. Maybe I did something wrong. But is this possible?
>
>No; one socket, one umem, one queue. Unless you're using shared umem,
>then multiple sockets, one umem, one queue.
>
>> This indeed sounds very unpleasent. So instead, if I understand correctly, you would go with the version above (the XDP program distributing the packets on the sockets via a XSKMAP). Is there something I have to watch out for? As I said, I wasn't able to create multiple sockets for the same RX-Queue.
>
>I would probably go for the first option, without shared umem, but
>that's really up to you! If you're going for the shared umem, I'd do
>it single process.
>
I am sorry, but I am confused. You just said *No; one socket, one umem, one queue.* How would I then be able to follow your rough sketch of

  hw_q0
  xsk_q0_0, xsk_q0_1, xsk_q0_2

I don't have deep knowledge of XDP and the pipeline, so maybe there is something I am missing. I am sorry.
>> >> I ran unto the problem, that `struct xsk_umem` is hidden in `xsk.c`. This prevents me from copying the content from the original socket / umem into shared memory. I am not sure, what information the sub-process (the one which is using the umem from another process) needs so I figured the simplest solution would be to just copy the whole umem struct.
>> >>
>> >
>> > Just for completeness; To setup shared umem:
>> >
>> > 1. create socket 0 and register the umem to this.
>> > 2. mmap the fr/cr using socket 0
>> > 3. create socket 1, 2, n and refer to socket 0 for the umem.
>> >
>> > So, in a multiprocess solution step 3 would be done in separate
>> > processes, and step 2 depending on your application. You'd need to
>> > pass socket 0 to the other processes *and* share the umem memory from
>> > the process where socket 0 was created. This is pretty much a threaded
>> > solution, given all the shared state.
>> >
>> > I advise not taking this path.
>> >
>> I am not entirely sure what you mean with *passing socket 0* is this just the fd of the socket? What's about the `struct xsk_umem`? Do I need that? I guess so because `xsk_socket__create()` has a parameter `struct xsk_umem`.
>> >>
>> >> So I went with the "quick-fix" to just move the definition of `struct xsk_umem` into `xsk.h` and to copy the umem-information from the original process into a shared memory. This process then calls `fork()` thus spawning a sub-process.
This sub-process then reads the previously written umem-information from shared memory and passes it into `xsk_configure_socket` (af_xdp_user.c) which then eventually calls `xsk_socket__create` in `xsk.c`. This function then checks for `umem->refcount` and sets the flags for shared umem accordingly. >> >> >> >> >> >> >> >> After returning from `xsk_socket__create` (we are still in `xsk_configure_socket` in af_xdp_user.c), `bpf_get_link_xdp_id` is called (I don't know if that's necessary). But after that call I exit the function `xsk_socket__create` in the sub-process because I figured it is probably bad to configure the umem a second time by calling `xsk_ring_prod__reserve` after that: >> >> >> >> >> >> >> >> >> >> static struct xsk_socket_info *xsk_configure_socket(struct config *cfg, struct xsk_umem_info *umem) { >> >> >> >> struct xsk_socket_config xsk_cfg; >> >> struct xsk_socket_info *xsk_info; >> >> uint32_t idx; >> >> uint32_t prog_id = 0; >> >> int i; >> >> int ret; >> >> >> >> xsk_info = calloc(1, sizeof(*xsk_info)); >> >> if (!xsk_info) >> >> return NULL; >> >> >> >> xsk_info->umem = umem; >> >> xsk_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS; >> >> xsk_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS; >> >> xsk_cfg.libbpf_flags = 0; >> >> xsk_cfg.xdp_flags = cfg->xdp_flags; >> >> xsk_cfg.bind_flags = cfg->xsk_bind_flags; >> >> ret = xsk_socket__create(&xsk_info->xsk, cfg->ifname, cfg->xsk_if_queue, umem->umem, &xsk_info->rx, &xsk_info->tx, &xsk_cfg); >> >> >> >> if (ret) { >> >> fprintf(stderr, "FAIL 1\n"); >> >> goto error_exit; >> >> } >> >> >> >> ret = bpf_get_link_xdp_id(cfg->ifindex, &prog_id, cfg->xdp_flags); >> >> if (ret) { >> >> fprintf(stderr, "FAIL 2\n"); >> >> goto error_exit; >> >> } >> >> >> >> /* Initialize umem frame allocation */ >> >> for (i = 0; i < NUM_FRAMES; i++) >> >> xsk_info->umem_frame_addr[i] = i * FRAME_SIZE; >> >> >> >> xsk_info->umem_frame_free = NUM_FRAMES; >> >> >> >> if(cfg->use_shrd_umem) { >> >> return xsk_info; >> >> 
} >> >> ... >> >> } >> >> >> >> Somehow what I am doing doesn't work because my sub-process dies in `xsk_configure_socket`. I am not able to debug it properly with GDB though. Another point I don't understand is the statement: >> >> >> >> However, note that you need to supply the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the xsk_socket__create calls and load your own XDP program as there is no built in one in libbpf that will route the traffic for you. >> >> >> >> from https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag >> >> >> >> I didn't know that libbpf loads a XDP-program? Why would it do that? I am using my own af-xdp program which filters for udp-packets. If I set `xsk_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;` in `xsk_configure_socket`, the af-xdp-socket fd is not put into the kernel `xsks-map` which basically means that I don't receive any packets. >> >> >> >> As you probably already noticed, I am overstrained with the concept of Shared Umem and I have to say, there is no documentation about it besides the two sentences in https://www.kernel.org/doc/html/latest/networking/af_xdp.html#xdp-shared-umem-bind-flag and a mail in a linux mailbox from Nov. 2019 stating that this feature is now implemented. >> >> >> >> Can you please help? >> >> >> > >> > XDP sockets always use an XDP program, it just that a default one is >> > provided if the use doesn't explicitly add one. Have a look at >> > tools/lib/bpf/xsk.c:xsk_load_xdp_prog. So, for shared umem you need to >> > explicitly have a program that muxes over the sockets. A naïve variant >> > can be found in samples/bpf/xdpsock_kern.c >> > >> > >> > Cheers, >> > Björn >> > >> >> Best regards >> >> >> >> Max >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Shared Umem between processes
2020-03-12 8:49 ` AW: " Gaul, Maximilian
@ 2020-03-12 9:10 ` Björn Töpel
2020-03-12 9:17 ` AW: " Gaul, Maximilian
0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 9:10 UTC (permalink / raw)
To: Gaul, Maximilian; +Cc: Xdp

On Thu, 12 Mar 2020 at 09:49, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
[...]
>
> >I would probably go for the first option, without shared umem, but
> >that's really up to you! If you're going for the shared umem, I'd do
> >it single process.
> >
>
> I am sorry but I am confused, you just said *No; one socket, one umem, one queue.*. How would I be able to follow your rough sketch of
>
> hw_q0
> xsk_q0_0, xsk_q0_1, xsk_q0_2
>
> I don't have deep knowledge about XDP and the pipeline, maybe there is something I am missing. I am sorry.
>

No worries! :-)

Above you wrote "I couldn't create multiple sockets (no sharing,
everyone with its own umem and rx/tx queues) tied to the same
RX-Queue. Maybe I did something wrong." You can *only* tie multiple
sockets to one queue by using shared umem. You said that "everyone
with its own umem and rx/tx queues) tied to the same RX-Queue".

If you'd like to go for the setup above, you can do this with libbpf
today (have a look at the sample, where opt_num_xsks > 1). That will
however be a single process solution.

Clearer?


Björn

[...]
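
Editorial sketch: in a single-process setup, all sockets that share the umem also share one fill ring and one completion ring. If those sockets are driven from several threads, access to the shared rings has to be serialised, in the same spirit as the "synchronize the access" advice quoted earlier in the thread. The following is only a toy model of that idea, assuming a pthread mutex is acceptable on the hot path. `toy_fill_ring`, `toy_fill_ring_init` and `toy_fill_ring_put` are hypothetical names and are not part of libbpf.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* power of two; tiny on purpose, hypothetical value */

/* Toy stand-in for a shared fill ring; NOT libbpf's struct xsk_ring_prod. */
struct toy_fill_ring {
    pthread_mutex_t lock;          /* serialises all producers */
    uint32_t producer;             /* next slot to write */
    uint32_t consumer;             /* advanced by the "kernel" side */
    uint64_t addrs[TOY_RING_SIZE]; /* frame addresses handed back */
};

static void toy_fill_ring_init(struct toy_fill_ring *fq)
{
    assert(fq != NULL);
    pthread_mutex_init(&fq->lock, NULL);
    fq->producer = 0;
    fq->consumer = 0;
}

/* Hand `n` frame addresses back to the ring while holding the lock.
 * All-or-nothing: returns n on success, 0 if there is not enough room. */
static size_t toy_fill_ring_put(struct toy_fill_ring *fq,
                                const uint64_t *frames, size_t n)
{
    size_t done = 0;

    pthread_mutex_lock(&fq->lock);
    if (TOY_RING_SIZE - (fq->producer - fq->consumer) >= n) {
        for (size_t i = 0; i < n; i++)
            fq->addrs[(fq->producer + i) & (TOY_RING_SIZE - 1)] = frames[i];
        fq->producer += (uint32_t)n; /* publish after the slots are written */
        done = n;
    }
    pthread_mutex_unlock(&fq->lock);
    return done;
}
```

Each RX thread would call `toy_fill_ring_put()` instead of touching the ring indices directly. With the real rings, the same lock would have to cover the whole reserve, fill and submit sequence.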
* AW: Shared Umem between processes
2020-03-12 9:10 ` Björn Töpel
@ 2020-03-12 9:17 ` Gaul, Maximilian
2020-03-12 9:40 ` Björn Töpel
0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12 9:17 UTC (permalink / raw)
To: Björn Töpel; +Cc: Xdp

On Thu, 12 Mar 2020 at 10:10, Björn Töpel <bjorn.topel@gmail.com> wrote:

>On Thu, 12 Mar 2020 at 09:49, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>>
[...]
>>
>
>No worries! :-)
>
>Above you wrote "I couldn't create multiple sockets (no sharing,
>everyone with its own umem and rx/tx queues) tied to the same
>RX-Queue. Maybe I did something wrong." You can *only* tie multiple
>sockets to one queue by using shared umem. You said that "everyone
>with its own umem and rx/tx queues) tied to the same RX-Queue".
>
>If you'd like to go for the setup above, you can do this with libbpf
>today (have a look at the sample, where opt_num_xsks > 1). That will
>however be a single process solution.
>
>Clearer?
>
>
>Björn
>

Thank you so much, Björn!

Just to wrap things up:

- If I want to distribute packet processing from a single RX-Queue to multiple sockets, I have to use shared umem, because it is not possible to bind multiple af-xdp sockets onto the same RX-Queue.
- Furthermore, you would recommend a single-process / multi-threaded solution in the case of shared umem.

Is this correct?

Max

[...]
* Re: Shared Umem between processes
2020-03-12 9:17 ` AW: " Gaul, Maximilian
@ 2020-03-12 9:40 ` Björn Töpel
2020-03-12 11:00 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 9:40 UTC (permalink / raw)
To: Gaul, Maximilian; +Cc: Xdp

On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>
[...]
>
> Thank you so much Björn!
>
> just to wrap things up:
>
> - if I want to distribute packet processing from a single RX-Queue to multiple sockets I have to use shared umem because it is not possible to bind multiple af-xdp sockets onto the same RX-Queue

Correct! And you need a tailored XDP program that spreads over the
shared umem sockets!

> - furthermore, you would recommend to go with a single process / multiple threads solution in case of shared umem
>

Yes, and while possible to do multi-process, the solution would be
complex with little gain (IMO).

And please reach out if you have any more questions!
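
Editorial sketch: the "tailored XDP program" mentioned above has to decide, per packet, which of the sockets sharing the umem should receive it. For the multicast use case in this thread, one plausible rule is to pick the XSKMAP slot from the destination group address. The code below is a plain userspace model of that decision only, so it can be compiled and tested without BPF tooling. In an actual XDP program the chosen slot would be passed to `bpf_redirect_map()` on the XSKMAP. The table contents, the group addresses and the name `pick_xsk_slot` are made up for illustration, and VLAN tags and IP options are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14u
#define IPV4_MIN_LEN   20u
#define ETHERTYPE_IPV4 0x0800u
#define PROTO_UDP      17u

/* Hypothetical table: multicast group (host byte order) -> XSKMAP slot. */
struct group_slot {
    uint32_t group;
    int slot;
};

static const struct group_slot slots[] = {
    { 0xEF010101u, 0 }, /* 239.1.1.1 -> socket 0 (made-up address) */
    { 0xEF010102u, 1 }, /* 239.1.1.2 -> socket 1 (made-up address) */
};

/* Returns the XSKMAP slot for this frame, or -1 if the frame should be
 * left to the normal network stack (the XDP_PASS case). */
static int pick_xsk_slot(const uint8_t *pkt, size_t len)
{
    const uint8_t *ip;
    uint32_t dst;

    assert(pkt != NULL);

    /* Bounds check before any header access, as the BPF verifier demands. */
    if (len < ETH_HDR_LEN + IPV4_MIN_LEN)
        return -1;
    if ((uint16_t)((pkt[12] << 8) | pkt[13]) != ETHERTYPE_IPV4)
        return -1;

    ip = pkt + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4 || ip[9] != PROTO_UDP)
        return -1;

    /* Destination address lives at bytes 16..19 of the IPv4 header. */
    dst = ((uint32_t)ip[16] << 24) | ((uint32_t)ip[17] << 16) |
          ((uint32_t)ip[18] << 8) | (uint32_t)ip[19];

    for (size_t i = 0; i < sizeof(slots) / sizeof(slots[0]); i++)
        if (slots[i].group == dst)
            return slots[i].slot;
    return -1;
}
```

A round-robin counter would work as well if any socket may handle any packet. Keying on the group address keeps each stream on one socket, which appears to be what the per-stream processing in this thread needs.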
* Re: Shared Umem between processes
2020-03-12 9:40 ` Björn Töpel
@ 2020-03-12 11:00 ` Toke Høiland-Jørgensen
2020-03-12 11:29 ` Björn Töpel
0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-03-12 11:00 UTC (permalink / raw)
To: Björn Töpel, Gaul, Maximilian; +Cc: Xdp

Björn Töpel <bjorn.topel@gmail.com> writes:

> On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>>
> [...]
>>
>> Thank you so much Björn!
>>
>> just to wrap things up:
>>
>> - if I want to distribute packet processing from a single RX-Queue to
>> multiple sockets I have to use shared umem because it is not possible
>> to bind multiple af-xdp sockets onto the same RX-Queue
>
> Correct! And you need a tailored XDP program that spreads over the
> shared umem sockets!

Could we lift this restriction? Not with zero-copy, obviously, but if
there's a copy involved it seems it should be possible to support
several sockets on the same RXQ? That would make it possible to use XDP
as a per-CPU load balancer for a single RXQ, like we can do with cpumap
for packets hitting the stack today?

-Toke
* Re: Shared Umem between processes
2020-03-12 11:00 ` Toke Høiland-Jørgensen
@ 2020-03-12 11:29 ` Björn Töpel
2020-03-12 15:48 ` AW: " Gaul, Maximilian
0 siblings, 1 reply; 12+ messages in thread
From: Björn Töpel @ 2020-03-12 11:29 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: Gaul, Maximilian, Xdp

On Thu, 12 Mar 2020 at 12:01, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Björn Töpel <bjorn.topel@gmail.com> writes:
>
> > On Thu, 12 Mar 2020 at 10:17, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
> >>
> > [...]
> >>
> >> Thank you so much Björn!
> >>
> >> just to wrap things up:
> >>
> >> - if I want to distribute packet processing from a single RX-Queue to
> >> multiple sockets I have to use shared umem because it is not possible
> >> to bind multiple af-xdp sockets onto the same RX-Queue
> >
> > Correct! And you need a tailored XDP program that spreads over the
> > shared umem sockets!
>
> Could we lift this restriction? Not with zero-copy, obviously, but if
> there's a copy involved it seems it should be possible to support
> several sockets on the same RXQ? That would make it possible to use XDP
> as a per-CPU load balancer for a single RXQ, like we can do with cpumap
> for packets hitting the stack today?
>

Yes! It's on the (never ending) TODO list. Never heard that before,
right? :-(

Cheers,
Björn

> -Toke
>
* AW: Shared Umem between processes
2020-03-12 11:29 ` Björn Töpel
@ 2020-03-12 15:48 ` Gaul, Maximilian
2020-03-15 15:29 ` Gaul, Maximilian
0 siblings, 1 reply; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-12 15:48 UTC (permalink / raw)
To: Björn Töpel, Toke Høiland-Jørgensen; +Cc: Xdp

I don't know if it is appropriate to ask a new question on the same topic in the same thread. If it is not, I will start a new one, but I thought it would fit "my story on how to get shared umem working".

I followed https://github.com/xdp-project/bpf-next/blob/master/samples/bpf/xdpsock_user.c on how to implement shared umem, and I think I got it working to some extent. There are some problems, though:

I am processing two multicast streams arriving on the same RX-Queue on two sockets (using shared umem). Each socket is running in its own thread (same process).

What I noticed: everything seems to work fine for about 1 min or so (even though I am noticing some packet loss at 530,000 pps), but after that the data rate drops to only half, and after one more minute to a fourth.

My first thought was that the umem frames reserved by calling `xsk_ring_prod__reserve` are not freed properly (similar to a memory leak). Because of that, I decreased the size of the umem to a tenth, hoping to see a decrease in packet rate even sooner, and indeed I did!

Basically, what I do is create a new thread for each socket and pass the `xsk_socket_info` struct accordingly.
I then call `nanosleep` for 2.5ms in a while-loop and process every frame that arrived:

static void* rx_and_process(void *a) {

    struct pthread_arg *arg = (struct pthread_arg*)a;
    struct config *cfg = arg->cfg;
    struct pckt_idntfy_stats *pckt = arg->pckt_idntfy;
    struct xsk_socket_info *xsk_socket = arg->xsk_socket;

    struct timespec spec;
    spec.tv_sec = 0;
    spec.tv_nsec = 2500 * 1000;

    struct timespec remaining;

    while(!global_exit) {
        if(nanosleep(&spec, &remaining) < 0) {
            nanosleep(&spec, &remaining);
        }
        handle_receive_packets(xsk_socket, fds);
    }

    return NULL;
}

`pckt_idntfy_stats` contains information about where the statistics about this multicast-stream should be placed in shared memory.

Processing then happens like this:

static void handle_receive_packets(struct xsk_socket_info *xsk_socket, struct pollfd *fds) {
    unsigned int rcvd, i;
    uint32_t idx_rx = 0, idx_fq = 0;
    int ret;

    rcvd = xsk_ring_cons__peek(&xsk_socket->rx, INT32_MAX, &idx_rx);
    if (!rcvd) {
        /* no packets received, go to sleep */
        return;
    }

    ret = xsk_ring_prod__reserve(&xsk_socket->umem->fq, rcvd, &idx_fq);
    if (ret < 0) {
        fprintf(stderr, "Error: %s\n", strerror(-ret));
        return;
    } else if(ret == 0) {
        printf("NO SPACE LEFT!\n");
        return;
    } else if(ret != rcvd) {
        printf("RET != RCVD\n");
        return;
    }

    for (i = 0; i < rcvd; i++) {
        uint64_t addr = xsk_ring_cons__rx_desc(&xsk_socket->rx, idx_rx)->addr;
        uint32_t len = xsk_ring_cons__rx_desc(&xsk_socket->rx, idx_rx++)->len;
        uint64_t orig = xsk_umem__extract_addr(addr);

        addr = xsk_umem__add_offset_to_addr(addr);

        process_packet(xsk_socket, addr, len);
        *xsk_ring_prod__fill_addr(&xsk_socket->umem->fq, idx_fq++) = orig;

        xsk_socket->stats.rx_bytes += len;
    }

    xsk_ring_prod__submit(&xsk_socket->umem->fq, rcvd);
    xsk_ring_cons__release(&xsk_socket->rx, rcvd);
    xsk_socket->stats.rx_packets += rcvd;
}

I am sorry to post all this code here but maybe it helps?
This is how I configured the umem (basically a 1:1 copy from `xdpsock_user.c`):

static struct xsk_umem_info *configure_xsk_umem(void *buffer, uint64_t size) {

    struct xsk_umem_info *umem;
    struct xsk_umem_config cfg = {
        .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
        .frame_size = FRAME_SIZE,
        .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
        .flags = 0
    };
    int ret;

    umem = calloc(1, sizeof(*umem));
    if (!umem) {
        fprintf(stderr, "Error while allocating umem: %s\n", strerror(errno));
        exit(1);
    }

    ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq, &cfg);
    if (ret) {
        fprintf(stderr, "`xsk_umem__create` returned: %s\n", strerror(-ret));
        exit(1);
    }

    umem->buffer = buffer;
    return umem;
}

and after that I call:

static void xsk_populate_fill_ring(struct xsk_umem_info *umem) {
    int ret, i;
    uint32_t idx;

    ret = xsk_ring_prod__reserve(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
    if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS) {
        fprintf(stderr, "Failed to reserve prod ring: %s\n", strerror(errno));
        exit(1);
    }
    for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++) {
        *xsk_ring_prod__fill_addr(&umem->fq, idx++) = i * FRAME_SIZE;
    }
    xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
}

And sockets are created this way:

static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, struct config *cfg, bool rx, bool tx) {
    struct xsk_socket_config xsk_socket_cfg;
    struct xsk_socket_info *xsk;
    struct xsk_ring_cons *rxr;
    struct xsk_ring_prod *txr;
    int ret;

    xsk = calloc(1, sizeof(*xsk));
    if (!xsk) {
        fprintf(stderr, "xsk `calloc` failed: %s\n", strerror(errno));
        exit(1);
    }

    xsk->umem = umem;
    xsk_socket_cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
    xsk_socket_cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
    if (cfg->ip_addrs_len > 1) {
        xsk_socket_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
    } else {
        xsk_socket_cfg.libbpf_flags = 0;
    }
    xsk_socket_cfg.xdp_flags = cfg->xdp_flags;
    xsk_socket_cfg.bind_flags = cfg->xsk_bind_flags;

    rxr = rx ? &xsk->rx : NULL;
    txr = tx ? &xsk->tx : NULL;
    ret = xsk_socket__create(&xsk->xsk, cfg->ifname_buf, cfg->xsk_if_queue, umem->umem, rxr, txr, &xsk_socket_cfg);
    if (ret) {
        fprintf(stderr, "`xsk_socket__create` returned error: %s\n", strerror(errno));
        exit(-ret);
    }

    return xsk;
}

As far as I've seen from `xdpsock_user.c`, there is no special handling required by the sockets which are using shared umem? What am I missing?

Best regards

Max

^ permalink raw reply [flat|nested] 12+ messages in thread
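One detail worth checking when the umem is shrunk as in the experiment above: `xsk_populate_fill_ring` always posts `XSK_RING_PROD__DEFAULT_NUM_DESCS` addresses at `i * FRAME_SIZE`, independent of the buffer size. The sketch below only shows the arithmetic for keeping the posted addresses inside the umem. Whether this applied to the setup in the thread is not stated. `TOY_FRAME_SIZE`, `TOY_FILL_DESCS` and the helper names are hypothetical, and the value 2048 is assumed to match libbpf's default descriptor count.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FRAME_SIZE 4096u /* assumed frame size in bytes */
#define TOY_FILL_DESCS 2048u /* assumed default fill-ring descriptor count */

/* Number of whole frames that fit into a umem of the given size. */
static uint32_t frames_in_umem(uint64_t umem_size)
{
	return (uint32_t)(umem_size / TOY_FRAME_SIZE);
}

/* How many addresses can safely be posted to the fill ring at start-up:
 * never more than the ring holds, never more than the umem contains. */
static uint32_t initial_fill_count(uint64_t umem_size)
{
	uint32_t frames = frames_in_umem(umem_size);

	assert(frames > 0);
	return frames < TOY_FILL_DESCS ? frames : TOY_FILL_DESCS;
}

/* Byte offset of frame i inside the umem, as used when filling the ring. */
static uint64_t frame_addr(uint32_t i)
{
	return (uint64_t)i * TOY_FRAME_SIZE;
}
```

With a umem of 4096 frames the full default count can be posted. With a tenth of that (409 frames) only 409 addresses are valid, so a populate loop would have to stop there.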
* AW: Shared Umem between processes
2020-03-12 15:48 ` AW: " Gaul, Maximilian
@ 2020-03-15 15:29 ` Gaul, Maximilian
0 siblings, 0 replies; 12+ messages in thread
From: Gaul, Maximilian @ 2020-03-15 15:29 UTC (permalink / raw)
To: Xdp

I figured out that this was because I didn't synchronize the access to the umem between the AF-XDP sockets. I am doing that now in a Round-Robin like fashion as described in `xdpsock_user.c`.

>On Thu, 12 Mar 2020 at 16:48, Gaul, Maximilian <maximilian.gaul@hm.edu> wrote:
>[...]

^ permalink raw reply [flat|nested] 12+ messages in thread
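The round-robin approach mentioned above can be pictured as one thread visiting every socket in turn, so that the shared fill ring only ever has a single producer. The following is a self-contained toy model of that idea, not libbpf code: `toy_ring`, `toy_service_socket` and `toy_round_robin` are hypothetical stand-ins for the RX rings, the shared fill ring and the servicing loop.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 64u /* power of two, as with the real rings */
#define TOY_MASK (TOY_RING_SIZE - 1u)

/* Toy ring with free-running producer/consumer indices, masked on access. */
struct toy_ring {
	uint64_t addr[TOY_RING_SIZE];
	uint32_t prod;
	uint32_t cons;
};

static uint32_t toy_used(const struct toy_ring *r)
{
	return r->prod - r->cons;
}

/* Push one frame address. Returns 1 on success, 0 if the ring is full. */
static int toy_push(struct toy_ring *r, uint64_t a)
{
	if (toy_used(r) == TOY_RING_SIZE)
		return 0;
	r->addr[r->prod++ & TOY_MASK] = a;
	return 1;
}

/* Pop one frame address. Returns 1 on success, 0 if the ring is empty. */
static int toy_pop(struct toy_ring *r, uint64_t *a)
{
	if (toy_used(r) == 0)
		return 0;
	*a = r->addr[r->cons++ & TOY_MASK];
	return 1;
}

/* Drain up to `budget` frames from one socket's RX ring and hand each
 * frame back to the shared fill ring. Space in the fill ring is checked
 * before a frame is popped, so no frame is dropped on the floor. */
static uint32_t toy_service_socket(struct toy_ring *rx, struct toy_ring *fq,
				   uint32_t budget)
{
	uint32_t n = 0;
	uint64_t a;

	while (n < budget && toy_used(fq) < TOY_RING_SIZE && toy_pop(rx, &a)) {
		/* packet processing would happen here */
		toy_push(fq, a);
		n++;
	}
	return n;
}

/* One thread visits all sockets in turn until a full pass recycles
 * nothing. Only this loop ever writes to the shared fill ring. */
static uint32_t toy_round_robin(struct toy_ring *rx, uint32_t n_socks,
				struct toy_ring *fq, uint32_t budget)
{
	uint32_t total = 0, got;

	do {
		got = 0;
		for (uint32_t i = 0; i < n_socks; i++)
			got += toy_service_socket(&rx[i], fq, budget);
		total += got;
	} while (got > 0);
	return total;
}
```

The design choice is to avoid locking altogether by keeping a single producer on the shared ring. If each socket has to keep its own thread, access to the shared fill ring would need some other form of serialization instead, for example a mutex around the reserve/submit pair.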
end of thread, other threads:[~2020-03-15 15:29 UTC | newest]

Thread overview: 12+ messages
2020-03-11 15:58 Shared Umem between processes Gaul, Maximilian
2020-03-12  7:55 ` Björn Töpel
2020-03-12  8:20   ` AW: " Gaul, Maximilian
2020-03-12  8:38     ` Björn Töpel
2020-03-12  8:49       ` AW: " Gaul, Maximilian
2020-03-12  9:10         ` Björn Töpel
2020-03-12  9:17           ` AW: " Gaul, Maximilian
2020-03-12  9:40             ` Björn Töpel
2020-03-12 11:00               ` Toke Høiland-Jørgensen
2020-03-12 11:29                 ` Björn Töpel
2020-03-12 15:48                   ` AW: " Gaul, Maximilian
2020-03-15 15:29                     ` Gaul, Maximilian