* xdp/xsk.c: missing read memory barrier in xsk_poll()
@ 2023-11-24 7:00 Yewon Choi
2023-11-24 13:50 ` Magnus Karlsson
0 siblings, 1 reply; 3+ messages in thread
From: Yewon Choi @ 2023-11-24 7:00 UTC (permalink / raw)
To: Björn Töpel, Magnus Karlsson, Maciej Fijalkowski,
Jonathan Lemon, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, netdev, bpf, linux-kernel
Cc: threeearcat
Hello,
We found some possibility of missing read memory barrier in xsk_poll(),
so we would like to ask to check it.
commit e6762c8b adds two smp_rmb() in xsk_mmap(), which are paired with
smp_wmb() in XDP_UMEM_REG and xsk_init_queue each. The latter one is
added in order to prevent reordering between reading of q and reading
of q->ring.
One example in simplified code is:
xsk_mmap():
	if (offset == XDP_PGOFF_RX_RING) {
		q = READ_ONCE(xs->rx);
	}
	...
	if (!q)
		return -EINVAL;

	/* Matches the smp_wmb() in xsk_init_queue */
	smp_rmb();
	...
	return remap_vmalloc_range(vma, q->ring, 0);
Also, the similar logic exists in xsk_poll() without smp_rmb().
xsk_poll():
	...
	if (xs->rx && !xskq_prod_is_empty(xs->rx))
		mask |= EPOLLIN | EPOLLRDNORM;
	if (xs->tx && xsk_tx_writeable(xs))
		mask |= EPOLLOUT | EPOLLWRNORM;

xskq_prod_is_empty():
	return READ_ONCE(q->ring->consumer) && ...
To be consistent, I think that smp_rmb() is needed between
xs->rx and !xskq_prod_is_empty() and the same applies for xs->tx.
Could you check this please?
If a patch is needed, we will send them.
Best Regards,
Yewon Choi
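
The barrier pairing described in this message can be modelled in user space. The following is a minimal sketch, assuming C11 fences as rough stand-ins for smp_wmb() and smp_rmb() (the C11 fences are at least as strong). All names starting with demo_ are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct xsk_queue and its ring. */
struct demo_ring {
	unsigned int producer;
	unsigned int consumer;
};

struct demo_queue {
	struct demo_ring *ring;
};

static struct demo_ring ring_storage;
static struct demo_queue queue_storage;

/* Plays the role of xs->rx: NULL until the queue is published. */
static _Atomic(struct demo_queue *) demo_rx;

/* Writer side: initialise the queue completely, then publish it. */
void demo_publish(unsigned int producer)
{
	ring_storage.producer = producer;
	ring_storage.consumer = 0;
	queue_storage.ring = &ring_storage;

	/* Stand-in for smp_wmb(): the initialisation is ordered before the publish. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&demo_rx, &queue_storage, memory_order_relaxed);
}

/* Reader side: load the pointer, fence, and only then dereference it. */
bool demo_try_read(unsigned int *out)
{
	struct demo_queue *q;

	assert(out != NULL);

	/* Stand-in for READ_ONCE(xs->rx). */
	q = atomic_load_explicit(&demo_rx, memory_order_relaxed);
	if (!q)
		return false;

	/* Stand-in for smp_rmb(): the reads below are ordered after the load of q. */
	atomic_thread_fence(memory_order_acquire);
	*out = q->ring->producer;
	return true;
}
```

In this model demo_publish() is meant to be called once by a writer thread while readers call demo_try_read(). Without the acquire fence, a reader on a weakly ordered CPU could in principle observe the pointer before the ring contents it points to, which is the reordering the smp_rmb() in xsk_mmap() guards against.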
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: xdp/xsk.c: missing read memory barrier in xsk_poll()
2023-11-24 7:00 xdp/xsk.c: missing read memory barrier in xsk_poll() Yewon Choi
@ 2023-11-24 13:50 ` Magnus Karlsson
2023-11-24 15:16 ` Yewon Choi
0 siblings, 1 reply; 3+ messages in thread
From: Magnus Karlsson @ 2023-11-24 13:50 UTC (permalink / raw)
To: Yewon Choi
Cc: Björn Töpel, Magnus Karlsson, Maciej Fijalkowski, Jonathan Lemon,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend, netdev, bpf, linux-kernel, threeearcat
On Fri, 24 Nov 2023 at 08:00, Yewon Choi <woni9911@gmail.com> wrote:
>
> Hello,
>
> We found some possibility of missing read memory barrier in xsk_poll(),
> so we would like to ask to check it.
>
> commit e6762c8b adds two smp_rmb() in xsk_mmap(), which are paired with
> smp_wmb() in XDP_UMEM_REG and xsk_init_queue each. The latter one is
> added in order to prevent reordering between reading of q and reading
> of q->ring.
> One example in simplified code is:
>
> xsk_mmap():
> 	if (offset == XDP_PGOFF_RX_RING) {
> 		q = READ_ONCE(xs->rx);
> 	}
> 	...
> 	if (!q)
> 		return -EINVAL;
>
> 	/* Matches the smp_wmb() in xsk_init_queue */
> 	smp_rmb();
> 	...
> 	return remap_vmalloc_range(vma, q->ring, 0);
>
> Also, the similar logic exists in xsk_poll() without smp_rmb().
>
> xsk_poll():
> 	...
> 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
> 		mask |= EPOLLIN | EPOLLRDNORM;
> 	if (xs->tx && xsk_tx_writeable(xs))
> 		mask |= EPOLLOUT | EPOLLWRNORM;
>
> xskq_prod_is_empty():
> 	return READ_ONCE(q->ring->consumer) && ...
>
> To be consistent, I think that smp_rmb() is needed between
> xs->rx and !xskq_prod_is_empty() and the same applies for xs->tx.
>
> Could you check this please?
> If a patch is needed, we will send them.

Yes, you are correct that the current code would need an smp_rmb().
However, an unbound socket should never be allowed to enter the
xsk_poll() code in the first place since it is pointless to poll a
socket that has not been bound. This error was introduced in the
commit below:

commit 1596dae2f17ec5c6e8c8f0e3fec78c5ae55c1e0b
Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date:   Wed Feb 15 15:33:09 2023 +0100

    xsk: check IFF_UP earlier in Tx path

When an AF_XDP socket has been bound, it is guaranteed to have been
set up in the correct way and a memory barrier has already been
executed in the xsk_bind call. It would be great if you could submit a
patch, but I suggest that you do something like this instead of
introducing an smp_rmb():

	if (xsk_check_common(xs))
		goto out;
	:
	:

	if (xs->rx && !xskq_prod_is_empty(xs->rx))
		mask |= EPOLLIN | EPOLLRDNORM;
	if (xs->tx && xsk_tx_writeable(xs))
		mask |= EPOLLOUT | EPOLLWRNORM;

out:
	rcu_read_unlock();
	return mask;

Thank you for spotting this!

/Magnus

>
> Best Regards,
> Yewon Choi
>
* Re: xdp/xsk.c: missing read memory barrier in xsk_poll()
2023-11-24 13:50 ` Magnus Karlsson
@ 2023-11-24 15:16 ` Yewon Choi
0 siblings, 0 replies; 3+ messages in thread
From: Yewon Choi @ 2023-11-24 15:16 UTC (permalink / raw)
To: Magnus Karlsson
Cc: Björn Töpel, Maciej Fijalkowski, Jonathan Lemon, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev, bpf,
linux-kernel, threeearcat
On Fri, Nov 24, 2023 at 02:50:04PM +0100, Magnus Karlsson wrote:
> On Fri, 24 Nov 2023 at 08:00, Yewon Choi <woni9911@gmail.com> wrote:
> >
> > Hello,
> >
> > We found some possibility of missing read memory barrier in xsk_poll(),
> > so we would like to ask to check it.
> >
> > commit e6762c8b adds two smp_rmb() in xsk_mmap(), which are paired with
> > smp_wmb() in XDP_UMEM_REG and xsk_init_queue each. The latter one is
> > added in order to prevent reordering between reading of q and reading
> > of q->ring.
> > One example in simplified code is:
> >
> > xsk_mmap():
> > 	if (offset == XDP_PGOFF_RX_RING) {
> > 		q = READ_ONCE(xs->rx);
> > 	}
> > 	...
> > 	if (!q)
> > 		return -EINVAL;
> >
> > 	/* Matches the smp_wmb() in xsk_init_queue */
> > 	smp_rmb();
> > 	...
> > 	return remap_vmalloc_range(vma, q->ring, 0);
> >
> > Also, the similar logic exists in xsk_poll() without smp_rmb().
> >
> > xsk_poll():
> > 	...
> > 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
> > 		mask |= EPOLLIN | EPOLLRDNORM;
> > 	if (xs->tx && xsk_tx_writeable(xs))
> > 		mask |= EPOLLOUT | EPOLLWRNORM;
> >
> > xskq_prod_is_empty():
> > 	return READ_ONCE(q->ring->consumer) && ...
> >
> > To be consistent, I think that smp_rmb() is needed between
> > xs->rx and !xskq_prod_is_empty() and the same applies for xs->tx.
> >
> > Could you check this please?
> > If a patch is needed, we will send them.
>
> Yes, you are correct that the current code would need an smp_rmb().
> However, an unbound socket should never be allowed to enter the
> xsk_poll() code in the first place since it is pointless to poll a
> socket that has not been bound. This error was introduced in the
> commit below:
>
> commit 1596dae2f17ec5c6e8c8f0e3fec78c5ae55c1e0b
> Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date:   Wed Feb 15 15:33:09 2023 +0100
>
>     xsk: check IFF_UP earlier in Tx path
>
> When an AF_XDP socket has been bound, it is guaranteed to have been
> set up in the correct way and a memory barrier has already been
> executed in the xsk_bind call. It would be great if you could submit a
> patch, but I suggest that you do something like this instead of
> introducing an smp_rmb():
>
> 	if (xsk_check_common(xs))
> 		goto out;
> 	:
> 	:
>
> 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
> 		mask |= EPOLLIN | EPOLLRDNORM;
> 	if (xs->tx && xsk_tx_writeable(xs))
> 		mask |= EPOLLOUT | EPOLLWRNORM;
>
> out:
> 	rcu_read_unlock();
> 	return mask;
>

I did not fully grasp those semantics, thank you for pointing them out.
As you suggested, it seems that the part right below skip_tx should
also be skipped. Additionally, I think read ordering will be guaranteed
by the smp_rmb() in xsk_check_common().
I'll write a patch after double-checking this, in case I am mistaken.

Thank you for your reply.

> Thank you for spotting this!
>
> /Magnus
>

Best Regards,
Yewon Choi
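
The approach agreed on in this thread (reject unbound sockets before touching the queues, and rely on the barrier inside the bound-state check) can be modelled in user space. This is a minimal sketch under stated assumptions: C11 fences stand in for smp_wmb()/smp_rmb(), every demo_ and DEMO_ name is hypothetical, and the Tx writability rule is a simplification, not the kernel's xsk_tx_writeable() logic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket states and poll flags for this model only. */
enum demo_state { DEMO_READY, DEMO_BOUND };

#define DEMO_POLLIN	0x1u
#define DEMO_POLLOUT	0x4u
#define DEMO_RING_SIZE	8u

struct demo_ring {
	unsigned int producer;
	unsigned int consumer;
};

struct demo_queue {
	struct demo_ring ring;
};

struct demo_sock {
	_Atomic int state;
	struct demo_queue *rx;
	struct demo_queue *tx;
};

/* Models the end of bind: set up the queues, then publish the state. */
void demo_bind(struct demo_sock *s, struct demo_queue *rx, struct demo_queue *tx)
{
	assert(s != NULL);
	s->rx = rx;
	s->tx = tx;
	/* Stand-in for smp_wmb(): queue setup is ordered before the state store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->state, DEMO_BOUND, memory_order_relaxed);
}

/* Models the bound-state check, including its read barrier. */
bool demo_is_bound(struct demo_sock *s)
{
	if (atomic_load_explicit(&s->state, memory_order_relaxed) != DEMO_BOUND)
		return false;
	/* Stand-in for smp_rmb(): later queue reads are ordered after the state load. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Models the poll path: bail out early, so no extra barrier is needed below. */
unsigned int demo_poll(struct demo_sock *s)
{
	unsigned int mask = 0;

	if (!demo_is_bound(s))
		goto out;

	/* Rx is readable when the ring is not empty. */
	if (s->rx && s->rx->ring.producer != s->rx->ring.consumer)
		mask |= DEMO_POLLIN;
	/* Simplified assumption: Tx is writable while the ring is not full. */
	if (s->tx && s->tx->ring.producer - s->tx->ring.consumer < DEMO_RING_SIZE)
		mask |= DEMO_POLLOUT;
out:
	return mask;
}
```

The design point is that the single state check doubles as the ordering point: once the bound state has been observed and the fence executed, the reads of rx and tx need no barrier of their own.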