* [PATCH] vhost/net: Set num_buffers for virtio 1.0
@ 2024-09-15 1:35 Akihiko Odaki
2024-11-06 8:54 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Akihiko Odaki @ 2024-09-15 1:35 UTC (permalink / raw)
To: Michael S. Tsirkin, Jason Wang, Eugenio Pérez
Cc: kvm, virtualization, netdev, linux-kernel, Akihiko Odaki
The specification says the device MUST set num_buffers to 1 if
VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
drivers/vhost/net.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f16279351db5..d4d97fa9cc8f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net)
size_t vhost_hlen, sock_hlen;
size_t vhost_len, sock_len;
bool busyloop_intr = false;
+ bool set_num_buffers;
struct socket *sock;
struct iov_iter fixup;
__virtio16 num_buffers;
@@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net)
vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
vq->log : NULL;
mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
+ set_num_buffers = mergeable ||
+ vhost_has_feature(vq, VIRTIO_F_VERSION_1);
do {
sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
@@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net)
/* TODO: Should check and handle checksum. */
num_buffers = cpu_to_vhost16(vq, headcount);
- if (likely(mergeable) &&
+ if (likely(set_num_buffers) &&
copy_to_iter(&num_buffers, sizeof num_buffers,
&fixup) != sizeof num_buffers) {
vq_err(vq, "Failed num_buffers write");
---
base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
change-id: 20240908-v1-90fc83ff8b09
Best regards,
--
Akihiko Odaki <akihiko.odaki@daynix.com>
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] vhost/net: Set num_buffers for virtio 1.0
2024-09-15 1:35 [PATCH] vhost/net: Set num_buffers for virtio 1.0 Akihiko Odaki
@ 2024-11-06 8:54 ` Michael S. Tsirkin
2024-11-11 1:27 ` Jason Wang
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2024-11-06 8:54 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Jason Wang, Eugenio Pérez, kvm, virtualization, netdev,
linux-kernel
On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> The specification says the device MUST set num_buffers to 1 if
> VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
>
> Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
True, this is out of spec. But, qemu is also out of spec :(
Given how many years this was out there, I wonder whether
we should just fix the spec, instead of changing now.
Jason, what's your take?
> ---
> drivers/vhost/net.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index f16279351db5..d4d97fa9cc8f 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net)
> size_t vhost_hlen, sock_hlen;
> size_t vhost_len, sock_len;
> bool busyloop_intr = false;
> + bool set_num_buffers;
> struct socket *sock;
> struct iov_iter fixup;
> __virtio16 num_buffers;
> @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net)
> vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
> vq->log : NULL;
> mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
> + set_num_buffers = mergeable ||
> + vhost_has_feature(vq, VIRTIO_F_VERSION_1);
>
> do {
> sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
> @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net)
> /* TODO: Should check and handle checksum. */
>
> num_buffers = cpu_to_vhost16(vq, headcount);
> - if (likely(mergeable) &&
> + if (likely(set_num_buffers) &&
> copy_to_iter(&num_buffers, sizeof num_buffers,
> &fixup) != sizeof num_buffers) {
> vq_err(vq, "Failed num_buffers write");
>
> ---
> base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
> change-id: 20240908-v1-90fc83ff8b09
>
> Best regards,
> --
> Akihiko Odaki <akihiko.odaki@daynix.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] vhost/net: Set num_buffers for virtio 1.0
2024-11-06 8:54 ` Michael S. Tsirkin
@ 2024-11-11 1:27 ` Jason Wang
2024-12-26 11:54 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Jason Wang @ 2024-11-11 1:27 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Akihiko Odaki, Eugenio Pérez, kvm, virtualization, netdev,
linux-kernel
On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > The specification says the device MUST set num_buffers to 1 if
> > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> >
> > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>
> True, this is out of spec. But, qemu is also out of spec :(
>
> Given how many years this was out there, I wonder whether
> we should just fix the spec, instead of changing now.
>
> Jason, what's your take?
Fixing the spec (if you mean release the requirement) seems to be less risky.
Thanks
>
>
> > ---
> > drivers/vhost/net.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index f16279351db5..d4d97fa9cc8f 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net)
> > size_t vhost_hlen, sock_hlen;
> > size_t vhost_len, sock_len;
> > bool busyloop_intr = false;
> > + bool set_num_buffers;
> > struct socket *sock;
> > struct iov_iter fixup;
> > __virtio16 num_buffers;
> > @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net)
> > vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
> > vq->log : NULL;
> > mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
> > + set_num_buffers = mergeable ||
> > + vhost_has_feature(vq, VIRTIO_F_VERSION_1);
> >
> > do {
> > sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
> > @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net)
> > /* TODO: Should check and handle checksum. */
> >
> > num_buffers = cpu_to_vhost16(vq, headcount);
> > - if (likely(mergeable) &&
> > + if (likely(set_num_buffers) &&
> > copy_to_iter(&num_buffers, sizeof num_buffers,
> > &fixup) != sizeof num_buffers) {
> > vq_err(vq, "Failed num_buffers write");
> >
> > ---
> > base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
> > change-id: 20240908-v1-90fc83ff8b09
> >
> > Best regards,
> > --
> > Akihiko Odaki <akihiko.odaki@daynix.com>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] vhost/net: Set num_buffers for virtio 1.0
2024-11-11 1:27 ` Jason Wang
@ 2024-12-26 11:54 ` Michael S. Tsirkin
[not found] ` <CACGkMEug-83KTBQjJBEKuYsVY86-mCSMpuGgj-BfcL=m2VFfvA@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2024-12-26 11:54 UTC (permalink / raw)
To: Jason Wang
Cc: Akihiko Odaki, Eugenio Pérez, kvm, virtualization, netdev,
linux-kernel
On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
> On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > > The specification says the device MUST set num_buffers to 1 if
> > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> > >
> > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >
> > True, this is out of spec. But, qemu is also out of spec :(
> >
> > Given how many years this was out there, I wonder whether
> > we should just fix the spec, instead of changing now.
> >
> > Jason, what's your take?
>
> Fixing the spec (if you mean release the requirement) seems to be less risky.
>
> Thanks
I looked at the latest spec patch.
Issue is, if we relax the requirement in the spec,
it just might break some drivers.
Something I did not realize at the time.
Also, vhost just leaves it uninitialized so there really is no chance
some driver using vhost looks at it and assumes 0.
There is another thing out of spec with vhost at the moment:
it is actually leaving this field in the buffer
uninitialized. Which is out of spec, length supplied by device
must be initialized by device.
We generally just ask everyone to follow spec. So now I'm inclined to fix
it, and make a corresponding qemu change.
Now, about how to fix it - besides a risk to non-VM workloads, I dislike
doing an extra copy to user into buffer. So maybe we should add an ioctl
to teach tun to set num bufs to 1.
This way userspace has control.
Hmm?
> >
> >
> > > ---
> > > drivers/vhost/net.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > index f16279351db5..d4d97fa9cc8f 100644
> > > --- a/drivers/vhost/net.c
> > > +++ b/drivers/vhost/net.c
> > > @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net)
> > > size_t vhost_hlen, sock_hlen;
> > > size_t vhost_len, sock_len;
> > > bool busyloop_intr = false;
> > > + bool set_num_buffers;
> > > struct socket *sock;
> > > struct iov_iter fixup;
> > > __virtio16 num_buffers;
> > > @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net)
> > > vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
> > > vq->log : NULL;
> > > mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
> > > + set_num_buffers = mergeable ||
> > > + vhost_has_feature(vq, VIRTIO_F_VERSION_1);
> > >
> > > do {
> > > sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
> > > @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net)
> > > /* TODO: Should check and handle checksum. */
> > >
> > > num_buffers = cpu_to_vhost16(vq, headcount);
> > > - if (likely(mergeable) &&
> > > + if (likely(set_num_buffers) &&
> > > copy_to_iter(&num_buffers, sizeof num_buffers,
> > > &fixup) != sizeof num_buffers) {
> > > vq_err(vq, "Failed num_buffers write");
> > >
> > > ---
> > > base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
> > > change-id: 20240908-v1-90fc83ff8b09
> > >
> > > Best regards,
> > > --
> > > Akihiko Odaki <akihiko.odaki@daynix.com>
> >
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] vhost/net: Set num_buffers for virtio 1.0
[not found] ` <CACGkMEug-83KTBQjJBEKuYsVY86-mCSMpuGgj-BfcL=m2VFfvA@mail.gmail.com>
@ 2024-12-27 4:34 ` Akihiko Odaki
2024-12-27 13:44 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Akihiko Odaki @ 2024-12-27 4:34 UTC (permalink / raw)
To: Jason Wang, Michael S. Tsirkin
Cc: Eugenio Pérez, kvm, virtualization, netdev, linux-kernel
On 2024/12/27 10:29, Jason Wang wrote:
>
>
> On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com
> <mailto:mst@redhat.com>> wrote:
>
> On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
> > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com
> <mailto:mst@redhat.com>> wrote:
> > >
> > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > > > The specification says the device MUST set num_buffers to 1 if
> > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> > > >
> > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com
> <mailto:akihiko.odaki@daynix.com>>
> > >
> > > True, this is out of spec. But, qemu is also out of spec :(
> > >
> > > Given how many years this was out there, I wonder whether
> > > we should just fix the spec, instead of changing now.
> > >
> > > Jason, what's your take?
> >
> > Fixing the spec (if you mean release the requirement) seems to be
> less risky.
> >
> > Thanks
>
> I looked at the latest spec patch.
> Issue is, if we relax the requirement in the spec,
> it just might break some drivers.
>
> Something I did not realize at the time.
>
> Also, vhost just leaves it uninitialized so there really is no chance
> some driver using vhost looks at it and assumes 0.
> >
> So it also has no chance to assume it for anything specific value.
Theoretically, there could be a driver written according to the
specification and tested with other device implementations that set
num_buffers to one.
Practically, I will be surprised if there is such a driver in reality.
But I also see few reasons to relax the device requirement now; if we
used to say it should be set to one and there is no better alternative
value, why don't stick to one?
I sent v2 for the virtio-spec change that retains the device requirement
so please tell me what you think about it:
https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u
>
>
> There is another thing out of spec with vhost at the moment:
> it is actually leaving this field in the buffer
> uninitialized. Which is out of spec, length supplied by device
> must be initialized by device.
>
>
> What do you mean by "length" here?
>
>
>
> We generally just ask everyone to follow spec.
>
>
> Spec can't cover all the behaviour, so there would be some leftovers.
>
> So now I'm inclined to fix
> it, and make a corresponding qemu change.
>
>
> Now, about how to fix it - besides a risk to non-VM workloads, I dislike
> doing an extra copy to user into buffer. So maybe we should add an ioctl
> to teach tun to set num bufs to 1.
> This way userspace has control.
>
>
> I'm not sure I will get here. TUN has no knowledge of the mergeable
> buffers if I understand it correctly.
I rather want QEMU and other vhost_net users automatically fixed instead
of opting-in the fix.
The extra copy overhead can be almost eliminated if we initialize the
field in TUN/TAP; they already writes other part of the header so we can
simply add two bytes there. But I wonder if it's worthwhile.
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] vhost/net: Set num_buffers for virtio 1.0
2024-12-27 4:34 ` Akihiko Odaki
@ 2024-12-27 13:44 ` Michael S. Tsirkin
0 siblings, 0 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2024-12-27 13:44 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Jason Wang, Eugenio Pérez, kvm, virtualization, netdev,
linux-kernel
On Fri, Dec 27, 2024 at 01:34:10PM +0900, Akihiko Odaki wrote:
> On 2024/12/27 10:29, Jason Wang wrote:
> >
> >
> > On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com
> > <mailto:mst@redhat.com>> wrote:
> >
> > On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
> > > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com
> > <mailto:mst@redhat.com>> wrote:
> > > >
> > > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > > > > The specification says the device MUST set num_buffers to 1 if
> > > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> > > > >
> > > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com
> > <mailto:akihiko.odaki@daynix.com>>
> > > >
> > > > True, this is out of spec. But, qemu is also out of spec :(
> > > >
> > > > Given how many years this was out there, I wonder whether
> > > > we should just fix the spec, instead of changing now.
> > > >
> > > > Jason, what's your take?
> > >
> > > Fixing the spec (if you mean release the requirement) seems to be
> > less risky.
> > >
> > > Thanks
> >
> > I looked at the latest spec patch.
> > Issue is, if we relax the requirement in the spec,
> > it just might break some drivers.
> >
> > Something I did not realize at the time.
> >
> > Also, vhost just leaves it uninitialized so there really is no chance
> > some driver using vhost looks at it and assumes 0.
> > >
> > So it also has no chance to assume it for anything specific value.
>
> Theoretically, there could be a driver written according to the
> specification and tested with other device implementations that set
> num_buffers to one.
>
> Practically, I will be surprised if there is such a driver in reality.
>
> But I also see few reasons to relax the device requirement now; if we used
> to say it should be set to one and there is no better alternative value, why
> don't stick to one?
>
> I sent v2 for the virtio-spec change that retains the device requirement so
> please tell me what you think about it:
> https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u
>
> >
> >
> > There is another thing out of spec with vhost at the moment:
> > it is actually leaving this field in the buffer
> > uninitialized. Which is out of spec, length supplied by device
> > must be initialized by device.
> >
> >
> > What do you mean by "length" here?
> >
> >
> >
> > We generally just ask everyone to follow spec.
> >
> >
> > Spec can't cover all the behaviour, so there would be some leftovers.
> >
> > So now I'm inclined to fix
> > it, and make a corresponding qemu change.
> >
> >
> > Now, about how to fix it - besides a risk to non-VM workloads, I dislike
> > doing an extra copy to user into buffer. So maybe we should add an ioctl
> > to teach tun to set num bufs to 1.
> > This way userspace has control.
> >
> >
> > I'm not sure I will get here. TUN has no knowledge of the mergeable
> > buffers if I understand it correctly.
>
> I rather want QEMU and other vhost_net users automatically fixed instead of
> opting-in the fix.
qemu can be automatic. kernel I am not sure.
> The extra copy overhead can be almost eliminated if we initialize the field
> in TUN/TAP; they already writes other part of the header so we can simply
> add two bytes there. But I wonder if it's worthwhile.
Try?
> Regards,
> Akihiko Odaki
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-12-27 13:44 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-15 1:35 [PATCH] vhost/net: Set num_buffers for virtio 1.0 Akihiko Odaki
2024-11-06 8:54 ` Michael S. Tsirkin
2024-11-11 1:27 ` Jason Wang
2024-12-26 11:54 ` Michael S. Tsirkin
[not found] ` <CACGkMEug-83KTBQjJBEKuYsVY86-mCSMpuGgj-BfcL=m2VFfvA@mail.gmail.com>
2024-12-27 4:34 ` Akihiko Odaki
2024-12-27 13:44 ` Michael S. Tsirkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).