From: Stefan Hajnoczi <stefanha@redhat.com>
To: Bobby Eshleman <bobbyeshleman@gmail.com>
Cc: Cong Wang <cong.wang@bytedance.com>,
Bobby Eshleman <bobby.eshleman@bytedance.com>,
kvm@vger.kernel.org, netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Cong Wang <xiyou.wangcong@gmail.com>
Subject: Re: [Patch net] vsock: improve tap delivery accuracy
Date: Wed, 3 May 2023 09:39:13 -0400
Message-ID: <20230503133913.GF757667@fedora>
In-Reply-To: <ZDt+PDtKlxrwUPnc@bullseye>
On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > >
> > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > we should not deliver the copy to tap device in this case. So we
> > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > possible failures.
> > >
> > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> > > net/vmw_vsock/virtio_transport.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > index e95df847176b..055678628c07 100644
> > > --- a/net/vmw_vsock/virtio_transport.c
> > > +++ b/net/vmw_vsock/virtio_transport.c
> > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > if (!skb)
> > > break;
> > >
> > > - virtio_transport_deliver_tap_pkt(skb);
> > > - reply = virtio_vsock_skb_reply(skb);
> > > -
> > > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > sgs[out_sg++] = &hdr;
> > > if (skb->len > 0) {
> > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > break;
> > > }
> > >
> > > + virtio_transport_deliver_tap_pkt(skb);
> > > + reply = virtio_vsock_skb_reply(skb);
> >
> > I don't remember the reason for the ordering, but I'm pretty sure it was
> > deliberate. Probably because the payload buffers could be freed as soon
> > as virtqueue_add_sgs() is called.
> >
> > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > to monitor packets after they have been sent.
> >
> > Stefan
>
> Hey Stefan,
>
> Unfortunately, skbuff doesn't change that behavior.
>
> If I understand correctly, the problem flow you are describing
> would be something like this:
>
> Thread 0                                    Thread 1
> guest:virtqueue_add_sgs()[@send_pkt_work]
>
>                                             host:vhost_vq_get_desc()[@handle_tx_kick]
>                                             host:vhost_add_used()
>                                             host:vhost_signal()
>                                             guest:virtqueue_get_buf()[@tx_work]
>                                             guest:consume_skb()
>
> guest:deliver_tap_pkt()[@send_pkt_work]
>   ^ use-after-free
>
> Which I guess is possible because the receiver can consume the new
> scatterlist during the processing kicked off for a previous batch?
> (doesn't have to wait for the subsequent kick)

Yes, drivers must assume that the device can complete a request before
virtqueue_add_sgs() returns. For example, the device is allowed to poll
the virtqueue memory and may see the new descriptors immediately.

I haven't audited the current vsock code path to determine whether it's
possible to reach consume_skb() before deliver_tap_pkt() returns, so I
can't say whether it's safe or not.

Stefan
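
The ordering concern discussed above can be illustrated with a small
user-space sketch. This is not the kernel code: struct pkt, pkt_put(),
deliver_tap(), mock_queue_add(), send_tap_first() and send_tap_last() are
hypothetical stand-ins for the skb, consume_skb(),
virtio_transport_deliver_tap_pkt() and virtqueue_add_sgs(). The mock
device is assumed to complete every request immediately, which is the
worst case a driver has to allow for. A "freed" flag replaces a real
free() so the effect can be observed without undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an skb: a reference-counted packet. */
struct pkt {
	int refs;
	bool freed;	/* set instead of calling free(), so it can be inspected */
};

/* Drop one reference; mark the packet released when the last one goes. */
static void pkt_put(struct pkt *p)
{
	assert(p->refs > 0);
	if (--p->refs == 0)
		p->freed = true;
}

/* Mock tap: counts deliveries of live packets and of released ones. */
static int tap_ok, tap_uaf;

static void deliver_tap(const struct pkt *p)
{
	if (p->freed)
		tap_uaf++;	/* would be a use-after-free with real memory */
	else
		tap_ok++;
}

/*
 * Mock queue add. On success the (assumed) device completes the request
 * at once, so the packet's only reference is dropped before we return.
 */
static int mock_queue_add(struct pkt *p, bool fail)
{
	if (fail)
		return -1;	/* packet stays owned by the sender */
	pkt_put(p);
	return 0;
}

/* Existing ordering: tap first, then queue. */
static int send_tap_first(struct pkt *p, bool fail)
{
	deliver_tap(p);		/* packet is still owned by the sender */
	return mock_queue_add(p, fail);
}

/* Ordering from the patch: queue first, tap only on success. */
static int send_tap_last(struct pkt *p, bool fail)
{
	int ret = mock_queue_add(p, fail);

	if (ret < 0)
		return ret;	/* no tap copy for a failed add */
	deliver_tap(p);		/* packet may already have been released */
	return 0;
}
```

In this sketch, calling send_tap_first() on a packet with refs == 1
increments tap_ok, while send_tap_last() on a fresh packet increments
tap_uaf, because the mock device released the packet before the tap saw
it. Whether the real vsock path can hit that window is exactly the
question left open above.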