From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>,
Cong Wang <cong.wang@bytedance.com>,
Jiang Wang <jiang.wang@bytedance.com>,
Krasnov Arseniy <oxffffaa@gmail.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5] virtio/vsock: replace virtio_vsock_pkt with sk_buff
Date: Mon, 21 Nov 2022 12:01:12 +0000
Message-ID: <Y3toiPtBgOcrb8TL@bullseye>
In-Reply-To: <863a58452b4a4c0d63a41b0f78b59d32919067fa.camel@redhat.com>

On Tue, Dec 06, 2022 at 11:20:21AM +0100, Paolo Abeni wrote:
> Hello,
>
> On Fri, 2022-12-02 at 09:35 -0800, Bobby Eshleman wrote:
> [...]
> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> > index 35d7eedb5e8e..6c0b2d4da3fe 100644
> > --- a/include/linux/virtio_vsock.h
> > +++ b/include/linux/virtio_vsock.h
> > @@ -3,10 +3,129 @@
> > #define _LINUX_VIRTIO_VSOCK_H
> >
> > #include <uapi/linux/virtio_vsock.h>
> > +#include <linux/bits.h>
> > #include <linux/socket.h>
> > #include <net/sock.h>
> > #include <net/af_vsock.h>
> >
> > +#define VIRTIO_VSOCK_SKB_HEADROOM (sizeof(struct virtio_vsock_hdr))
> > +
> > +enum virtio_vsock_skb_flags {
> > + VIRTIO_VSOCK_SKB_FLAGS_REPLY = BIT(0),
> > + VIRTIO_VSOCK_SKB_FLAGS_TAP_DELIVERED = BIT(1),
> > +};
> > +
> > +static inline struct virtio_vsock_hdr *virtio_vsock_hdr(struct sk_buff *skb)
> > +{
> > + return (struct virtio_vsock_hdr *)skb->head;
> > +}
> > +
> > +static inline bool virtio_vsock_skb_reply(struct sk_buff *skb)
> > +{
> > + return skb->_skb_refdst & VIRTIO_VSOCK_SKB_FLAGS_REPLY;
> > +}
>
> I'm sorry for the late feedback. The above is extremely risky: if the
> skb will land later into the networking stack, we could experience the
> most difficult to track bugs.
>
> You should use the skb control buffer instead (skb->cb), with the
> additional benefit you could use e.g. bool - the compiler could emit
> better code to manipulate such fields - and you will not need to clear
> the field before release nor enqueue.
>
> [...]
>

Hey Paolo, thank you for the review. For my own learning, this would
presumably happen when the skb is dropped? And I assume we don't see
this in sockmap because the field is always cleared before the skb
leaves sockmap's hands? I sanity checked this patch with an out-of-tree
patch I have that uses the networking stack, but I suspect I didn't see
issues because my test harness didn't induce dropping...

I originally avoided skb->cb because the reply flag is set at allocation
and would potentially be clobbered by a pass through the networking
stack. The reply flag would be used after a pass through the networking
stack (e.g., during transmission at the device level and when sockets
close while skbs are still queued for xmit).

I suppose using skb->cb would look something like this:

- use skb_clone() for reply skbs
- set reply on the cloned sk_buff's skb->cb
- keep a hashmap mapping the original skb to the cloned skb (is there a
  better way?)
- when deciding whether to apply reply logic, if skb_cloned() is true,
  look up the clone in the hashmap

Is there a better/simpler way to maintain skb->cb?

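
For reference, the plain skb->cb layout being suggested might look
roughly like the userspace sketch below. It only models the 48-byte
control buffer; struct mock_sk_buff, struct vsock_skb_cb and the
vsock_skb_*() helpers are hypothetical names for illustration, not
code from the patch. It does not address the concern above about cb
being clobbered on a pass through the networking stack, and it leaves
out the clone/hashmap idea entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for struct sk_buff. Only the control buffer is
 * modelled; the real structure is in include/linux/skbuff.h.
 */
struct mock_sk_buff {
	_Alignas(8) char cb[48];
};

/* Hypothetical private state kept in skb->cb instead of flag bits. */
struct vsock_skb_cb {
	bool reply;
	bool tap_delivered;
};

/* The private struct must fit in the control buffer. */
static_assert(sizeof(struct vsock_skb_cb) <=
	      sizeof(((struct mock_sk_buff *)0)->cb),
	      "vsock_skb_cb does not fit in skb->cb");

/* Overlay the private struct on cb. The kernel is built with
 * -fno-strict-aliasing, which is what makes this cast idiomatic there.
 */
static inline struct vsock_skb_cb *vsock_skb_cb(struct mock_sk_buff *skb)
{
	return (struct vsock_skb_cb *)skb->cb;
}

/* Zero the control buffer, as would be done once at allocation. */
static inline void vsock_skb_init(struct mock_sk_buff *skb)
{
	memset(skb->cb, 0, sizeof(skb->cb));
}

static inline bool vsock_skb_reply(struct mock_sk_buff *skb)
{
	return vsock_skb_cb(skb)->reply;
}

static inline void vsock_skb_set_reply(struct mock_sk_buff *skb)
{
	vsock_skb_cb(skb)->reply = true;
}
```

With plain bool fields there are no bit masks to apply, and nothing
aliases _skb_refdst, so nothing needs clearing before the skb is freed
or enqueued.
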
> > @@ -352,37 +360,38 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> > size_t len)
> > {
> > struct virtio_vsock_sock *vvs = vsk->trans;
> > - struct virtio_vsock_pkt *pkt;
> > size_t bytes, total = 0;
> > - u32 free_space;
> > + struct sk_buff *skb;
> > int err = -EFAULT;
> > + u32 free_space;
> >
> > spin_lock_bh(&vvs->rx_lock);
> > - while (total < len && !list_empty(&vvs->rx_queue)) {
> > - pkt = list_first_entry(&vvs->rx_queue,
> > - struct virtio_vsock_pkt, list);
> > + while (total < len && !skb_queue_empty_lockless(&vvs->rx_queue)) {
> > + skb = __skb_dequeue(&vvs->rx_queue);
>
> Here the locking schema is confusing. It looks like vvs->rx_queue is
> under vvs->rx_lock protection, so the above should be skb_queue_empty()
> instead of the lockless variant.
>
> [...]
>
> > @@ -858,16 +873,11 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t,
> > static void virtio_transport_remove_sock(struct vsock_sock *vsk)
> > {
> > struct virtio_vsock_sock *vvs = vsk->trans;
> > - struct virtio_vsock_pkt *pkt, *tmp;
> >
> > /* We don't need to take rx_lock, as the socket is closing and we are
> > * removing it.
> > */
> > - list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
> > - list_del(&pkt->list);
> > - virtio_transport_free_pkt(pkt);
> > - }
> > -
> > + virtio_vsock_skb_queue_purge(&vvs->rx_queue);
>
> Still assuming rx_queue is under the rx_lock, given you don't need the
> locking here as per the above comment, you should use the lockless
> purge variant.
>

Good catch, thanks!

Thanks again,
Bobby