* Re: [virtio-dev] [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets [not found] ` <20220901021038.84751-3-gavinl@nvidia.com> @ 2022-09-07 2:17 ` Jason Wang 2022-09-07 5:31 ` Michael S. Tsirkin 2022-09-22 9:35 ` Michael S. Tsirkin 2 siblings, 0 replies; 29+ messages in thread From: Jason Wang @ 2022-09-07 2:17 UTC (permalink / raw) To: Gavin Li, stephen, davem, jesse.brandeburg, alexander.h.duyck, kuba, sridhar.samudrala, loseweigh, netdev, virtualization, virtio-dev, mst Cc: gavi 在 2022/9/1 10:10, Gavin Li 写道: > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > packets even when GUEST_* offloads are not present on the device. > However, if guest GSO is not supported, it would be sufficient to > allocate segments to cover just up the MTU size and no further. > Allocating the maximum amount of segments results in a large waste of > buffer space in the queue, which limits the number of packets that can > be buffered and can result in reduced performance. > > Therefore, if guest GSO is not supported, use the MTU to calculate the > optimal amount of segments required. > > When guest offload is enabled at runtime, RQ already has packets of bytes > less than 64K. So when packet of 64KB arrives, all the packets of such > size will be dropped. and RQ is now not usable. > > So this means that during set_guest_offloads() phase, RQs have to be > destroyed and recreated, which requires almost driver reload. > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it should > always treat them as GSO enabled. > > Accordingly, for now the assumption is that if guest GSO has been > negotiated then it has been enabled, even if it's actually been disabled > at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. Nit: Actually, it's not the assumption but the behavior of the codes itself. Since we don't try to change guest offloading in probe so it's ok to check GSO via negotiated features? Thanks > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA for > 1 VQ, queue size 1024, before and after the change, with the iperf > server running over the virtio-net interface. > > MTU(Bytes)/Bandwidth (Gbit/s) > Before After > 1500 22.5 22.4 > 9000 12.8 25.9 > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > Reviewed-by: Parav Pandit <parav@nvidia.com> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > --- > changelog: > v4->v5 > - Addressed comments from Michael S. Tsirkin > - Improve commit message > v3->v4 > - Addressed comments from Si-Wei > - Rename big_packets_sg_num with big_packets_num_skbfrags > v2->v3 > - Addressed comments from Si-Wei > - Simplify the condition check to enable the optimization > v1->v2 > - Addressed comments from Jason, Michael, Si-Wei. > - Remove the flag of guest GSO support, set sg_num for big packets and > use it directly > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > - Replace the round up algorithm with DIV_ROUND_UP > --- > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > 1 file changed, 24 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index f831a0290998..dbffd5f56fb8 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -225,6 +225,9 @@ struct virtnet_info { > /* I like... big packets and I cannot lie! 
*/ > bool big_packets; > > + /* number of sg entries allocated for big packets */ > + unsigned int big_packets_num_skbfrags; > + > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > bool mergeable_rx_bufs; > > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > char *p; > int i, err, offset; > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > first = get_a_page(rq, gfp); > if (!first) { > if (list) > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > > /* chain first in list head */ > first->private = (unsigned long)list; > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > + err = virtqueue_add_inbuf(rq->vq, rq->sg, vi->big_packets_num_skbfrags + 2, > first, gfp); > if (err < 0) > give_pages(rq, first); > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const struct virtnet_info *vi) > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); > } > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, const int mtu) > +{ > + bool guest_gso = virtnet_check_guest_gso(vi); > + > + /* If device can receive ANY guest GSO packets, regardless of mtu, > + * allocate packets of maximum size, otherwise limit it to only > + * mtu size worth only. > + */ > + if (mtu > ETH_DATA_LEN || guest_gso) { > + vi->big_packets = true; > + vi->big_packets_num_skbfrags = guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > + } > +} > + > static int virtnet_probe(struct virtio_device *vdev) > { > int i, err = -ENOMEM; > struct net_device *dev; > struct virtnet_info *vi; > u16 max_queue_pairs; > - int mtu; > + int mtu = 0; > > /* Find if host supports multiqueue/rss virtio_net device */ > max_queue_pairs = 1; > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device *vdev) > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > spin_lock_init(&vi->refill_lock); > > - /* If we can receive ANY GSO packets, we must allocate large ones. */ > - if (virtnet_check_guest_gso(vi)) > - vi->big_packets = true; > - > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > vi->mergeable_rx_bufs = true; > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device *vdev) > > dev->mtu = mtu; > dev->max_mtu = mtu; > - > - /* TODO: size buffers correctly in this case. */ > - if (dev->mtu > ETH_DATA_LEN) > - vi->big_packets = true; > } > > + virtnet_set_big_packets_fields(vi, mtu); > + > if (vi->any_header_sg) > dev->needed_headroom = vi->hdr_len; > _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
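For readers skimming the diff above, here is a minimal user-space sketch of the sizing rule the patch introduces. It is not kernel code; the constants assume 4 KiB pages and the common MAX_SKB_FRAGS value of 17, so exact numbers may differ on other configurations.

    #include <stdio.h>

    #define PAGE_SIZE     4096   /* assumption: 4 KiB pages */
    #define ETH_DATA_LEN  1500
    #define MAX_SKB_FRAGS 17     /* assumption: typical value with 4 KiB pages */
    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    /* Mirrors virtnet_set_big_packets_fields(): keep the worst-case sizing
     * only when guest GSO is negotiated, otherwise size each big-packet
     * receive buffer from the MTU.
     */
    static unsigned int big_packet_frags(int mtu, int guest_gso)
    {
            if (!(mtu > ETH_DATA_LEN || guest_gso))
                    return 0;    /* big packets are not used at all */
            return guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE);
    }

    int main(void)
    {
            printf("mtu 9000, no guest GSO: %u pages per buffer (was %u)\n",
                   big_packet_frags(9000, 0), MAX_SKB_FRAGS);
            printf("mtu 9000, guest GSO on: %u pages per buffer\n",
                   big_packet_frags(9000, 1));
            printf("mtu 1500, no guest GSO: %u (big packets unused)\n",
                   big_packet_frags(1500, 0));
            return 0;
    }

With a 9000-byte MTU and no guest GSO, each posted buffer drops from 17 chained pages to 3 (plus the two extra sg entries that add_recvbuf_big() uses for the header and the rest of the first page), which is where the queue-depth gain discussed later in the thread comes from.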
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets [not found] ` <20220901021038.84751-3-gavinl@nvidia.com> 2022-09-07 2:17 ` [virtio-dev] [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets Jason Wang @ 2022-09-07 5:31 ` Michael S. Tsirkin [not found] ` <0355d1e4-a3cf-5b16-8c7f-b39b1ec14ade@nvidia.com> 2022-09-22 9:35 ` Michael S. Tsirkin 2 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 5:31 UTC (permalink / raw) To: Gavin Li Cc: alexander.h.duyck, virtio-dev, sridhar.samudrala, jesse.brandeburg, gavi, virtualization, stephen, loseweigh, netdev, kuba, davem On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > packets even when GUEST_* offloads are not present on the device. > However, if guest GSO is not supported, it would be sufficient to > allocate segments to cover just up the MTU size and no further. > Allocating the maximum amount of segments results in a large waste of > buffer space in the queue, which limits the number of packets that can > be buffered and can result in reduced performance. > > Therefore, if guest GSO is not supported, use the MTU to calculate the > optimal amount of segments required. > > When guest offload is enabled at runtime, RQ already has packets of bytes > less than 64K. So when packet of 64KB arrives, all the packets of such > size will be dropped. and RQ is now not usable. > > So this means that during set_guest_offloads() phase, RQs have to be > destroyed and recreated, which requires almost driver reload. > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it should > always treat them as GSO enabled. > > Accordingly, for now the assumption is that if guest GSO has been > negotiated then it has been enabled, even if it's actually been disabled > at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA for > 1 VQ, queue size 1024, before and after the change, with the iperf > server running over the virtio-net interface. > > MTU(Bytes)/Bandwidth (Gbit/s) > Before After > 1500 22.5 22.4 > 9000 12.8 25.9 > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > Reviewed-by: Parav Pandit <parav@nvidia.com> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Which configurations were tested? Did you test devices without VIRTIO_NET_F_MTU ? > --- > changelog: > v4->v5 > - Addressed comments from Michael S. Tsirkin > - Improve commit message > v3->v4 > - Addressed comments from Si-Wei > - Rename big_packets_sg_num with big_packets_num_skbfrags > v2->v3 > - Addressed comments from Si-Wei > - Simplify the condition check to enable the optimization > v1->v2 > - Addressed comments from Jason, Michael, Si-Wei. > - Remove the flag of guest GSO support, set sg_num for big packets and > use it directly > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > - Replace the round up algorithm with DIV_ROUND_UP > --- > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > 1 file changed, 24 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index f831a0290998..dbffd5f56fb8 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -225,6 +225,9 @@ struct virtnet_info { > /* I like... big packets and I cannot lie! 
*/ > bool big_packets; > > + /* number of sg entries allocated for big packets */ > + unsigned int big_packets_num_skbfrags; > + > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > bool mergeable_rx_bufs; > > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > char *p; > int i, err, offset; > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > first = get_a_page(rq, gfp); > if (!first) { > if (list) > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > > /* chain first in list head */ > first->private = (unsigned long)list; > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > + err = virtqueue_add_inbuf(rq->vq, rq->sg, vi->big_packets_num_skbfrags + 2, > first, gfp); > if (err < 0) > give_pages(rq, first); > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const struct virtnet_info *vi) > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); > } > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, const int mtu) I'd rename this to just virtnet_set_big_packets. > +{ > + bool guest_gso = virtnet_check_guest_gso(vi); > + > + /* If device can receive ANY guest GSO packets, regardless of mtu, > + * allocate packets of maximum size, otherwise limit it to only > + * mtu size worth only. > + */ > + if (mtu > ETH_DATA_LEN || guest_gso) { > + vi->big_packets = true; > + vi->big_packets_num_skbfrags = guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > + } > +} > + > static int virtnet_probe(struct virtio_device *vdev) > { > int i, err = -ENOMEM; > struct net_device *dev; > struct virtnet_info *vi; > u16 max_queue_pairs; > - int mtu; > + int mtu = 0; > > /* Find if host supports multiqueue/rss virtio_net device */ > max_queue_pairs = 1; > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device *vdev) > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > spin_lock_init(&vi->refill_lock); > > - /* If we can receive ANY GSO packets, we must allocate large ones. */ > - if (virtnet_check_guest_gso(vi)) > - vi->big_packets = true; > - > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > vi->mergeable_rx_bufs = true; > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device *vdev) > > dev->mtu = mtu; > dev->max_mtu = mtu; > - > - /* TODO: size buffers correctly in this case. */ > - if (dev->mtu > ETH_DATA_LEN) > - vi->big_packets = true; > } > > + virtnet_set_big_packets_fields(vi, mtu); > + If VIRTIO_NET_F_MTU is off, then mtu is uninitialized. You should move it to within if () above to fix. > if (vi->any_header_sg) > dev->needed_headroom = vi->hdr_len; > > -- > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <0355d1e4-a3cf-5b16-8c7f-b39b1ec14ade@nvidia.com>]
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets [not found] ` <0355d1e4-a3cf-5b16-8c7f-b39b1ec14ade@nvidia.com> @ 2022-09-07 9:26 ` Michael S. Tsirkin 2022-09-07 14:08 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 9:26 UTC (permalink / raw) To: Gavin Li Cc: alexander.h.duyck, virtio-dev, sridhar.samudrala, jesse.brandeburg, gavi, virtualization, stephen, loseweigh, netdev, kuba, davem On Wed, Sep 07, 2022 at 04:08:54PM +0800, Gavin Li wrote: > > On 9/7/2022 1:31 PM, Michael S. Tsirkin wrote: > > External email: Use caution opening links or attachments > > > > > > On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > > > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > > > packets even when GUEST_* offloads are not present on the device. > > > However, if guest GSO is not supported, it would be sufficient to > > > allocate segments to cover just up the MTU size and no further. > > > Allocating the maximum amount of segments results in a large waste of > > > buffer space in the queue, which limits the number of packets that can > > > be buffered and can result in reduced performance. actually how does this waste space? Is this because your device does not have INDIRECT? > > > > > > Therefore, if guest GSO is not supported, use the MTU to calculate the > > > optimal amount of segments required. > > > > > > When guest offload is enabled at runtime, RQ already has packets of bytes > > > less than 64K. So when packet of 64KB arrives, all the packets of such > > > size will be dropped. and RQ is now not usable. > > > > > > So this means that during set_guest_offloads() phase, RQs have to be > > > destroyed and recreated, which requires almost driver reload. > > > > > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it should > > > always treat them as GSO enabled. > > > > > > Accordingly, for now the assumption is that if guest GSO has been > > > negotiated then it has been enabled, even if it's actually been disabled > > > at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > > > > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA for > > > 1 VQ, queue size 1024, before and after the change, with the iperf > > > server running over the virtio-net interface. > > > > > > MTU(Bytes)/Bandwidth (Gbit/s) > > > Before After > > > 1500 22.5 22.4 > > > 9000 12.8 25.9 is this buffer space? just the overhead of allocating/freeing the buffers? of using INDIRECT? > > > > > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > > > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > > > Reviewed-by: Parav Pandit <parav@nvidia.com> > > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > > > Which configurations were tested? > I tested it with DPDK vDPA + qemu vhost. Do you mean the feature set of the > VM? yes > > Did you test devices without VIRTIO_NET_F_MTU ? > No. It will need code changes. Testing to make sure nothing broke should not require changes. > > > > > --- > > > changelog: > > > v4->v5 > > > - Addressed comments from Michael S. Tsirkin > > > - Improve commit message > > > v3->v4 > > > - Addressed comments from Si-Wei > > > - Rename big_packets_sg_num with big_packets_num_skbfrags > > > v2->v3 > > > - Addressed comments from Si-Wei > > > - Simplify the condition check to enable the optimization > > > v1->v2 > > > - Addressed comments from Jason, Michael, Si-Wei. 
> > > - Remove the flag of guest GSO support, set sg_num for big packets and > > > use it directly > > > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > > > - Replace the round up algorithm with DIV_ROUND_UP > > > --- > > > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > > > 1 file changed, 24 insertions(+), 13 deletions(-) > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > > index f831a0290998..dbffd5f56fb8 100644 > > > --- a/drivers/net/virtio_net.c > > > +++ b/drivers/net/virtio_net.c > > > @@ -225,6 +225,9 @@ struct virtnet_info { > > > /* I like... big packets and I cannot lie! */ > > > bool big_packets; > > > > > > + /* number of sg entries allocated for big packets */ > > > + unsigned int big_packets_num_skbfrags; > > > + > > > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > > > bool mergeable_rx_bufs; > > > > > > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > > > char *p; > > > int i, err, offset; > > > > > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > > > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > > > > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > > > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > > > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > > > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > > > first = get_a_page(rq, gfp); > > > if (!first) { > > > if (list) > > > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > > > > > > /* chain first in list head */ > > > first->private = (unsigned long)list; > > > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > > > + err = virtqueue_add_inbuf(rq->vq, rq->sg, vi->big_packets_num_skbfrags + 2, > > > first, gfp); > > > if (err < 0) > > > give_pages(rq, first); > > > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const struct virtnet_info *vi) > > > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); > > > } > > > > > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, const int mtu) > > > > I'd rename this to just virtnet_set_big_packets. > ACK > > > > > > > +{ > > > + bool guest_gso = virtnet_check_guest_gso(vi); > > > + > > > + /* If device can receive ANY guest GSO packets, regardless of mtu, > > > + * allocate packets of maximum size, otherwise limit it to only > > > + * mtu size worth only. > > > + */ > > > + if (mtu > ETH_DATA_LEN || guest_gso) { > > > + vi->big_packets = true; > > > + vi->big_packets_num_skbfrags = guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > > > + } > > > +} > > > + > > > static int virtnet_probe(struct virtio_device *vdev) > > > { > > > int i, err = -ENOMEM; > > > struct net_device *dev; > > > struct virtnet_info *vi; > > > u16 max_queue_pairs; > > > - int mtu; > > > + int mtu = 0; > > > > > > /* Find if host supports multiqueue/rss virtio_net device */ > > > max_queue_pairs = 1; > > > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device *vdev) > > > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > > > spin_lock_init(&vi->refill_lock); > > > > > > - /* If we can receive ANY GSO packets, we must allocate large ones. 
*/ > > > - if (virtnet_check_guest_gso(vi)) > > > - vi->big_packets = true; > > > - > > > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > > > vi->mergeable_rx_bufs = true; > > > > > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device *vdev) > > > > > > dev->mtu = mtu; > > > dev->max_mtu = mtu; > > > - > > > - /* TODO: size buffers correctly in this case. */ > > > - if (dev->mtu > ETH_DATA_LEN) > > > - vi->big_packets = true; > > > } > > > > > > + virtnet_set_big_packets_fields(vi, mtu); > > > + > > If VIRTIO_NET_F_MTU is off, then mtu is uninitialized. > > You should move it to within if () above to fix. > mtu was initialized to 0 at the beginning of probe if VIRTIO_NET_F_MTU is > off. > > In this case, big_packets_num_skbfrags will be set according to guest gso. > > If guest gso is supported, it will be set to MAX_SKB_FRAGS else zero---- do > you > > think this is a bug to be fixed? yes I think with no mtu this should behave as it did historically. > > > > > if (vi->any_header_sg) > > > dev->needed_headroom = vi->hdr_len; > > > > > > -- > > > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 9:26 ` Michael S. Tsirkin @ 2022-09-07 14:08 ` Parav Pandit via Virtualization 2022-09-07 14:29 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 14:08 UTC (permalink / raw) To: Michael S. Tsirkin, Gavin Li Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 5:27 AM > > On Wed, Sep 07, 2022 at 04:08:54PM +0800, Gavin Li wrote: > > > > On 9/7/2022 1:31 PM, Michael S. Tsirkin wrote: > > > External email: Use caution opening links or attachments > > > > > > > > > On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > > > > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for > > > > big packets even when GUEST_* offloads are not present on the > device. > > > > However, if guest GSO is not supported, it would be sufficient to > > > > allocate segments to cover just up the MTU size and no further. > > > > Allocating the maximum amount of segments results in a large waste > > > > of buffer space in the queue, which limits the number of packets > > > > that can be buffered and can result in reduced performance. > > actually how does this waste space? Is this because your device does not > have INDIRECT? VQ is 256 entries deep. Driver posted total of 256 descriptors. Each descriptor points to a page of 4K. These descriptors are chained as 4K * 16. So total packets that can be serviced are 256/16 = 16. So effective queue depth = 16. So, when GSO is off, for 9K mtu, packet buffer needed = 3 pages. (12k). So, 13 descriptors (= 13 x 4K =52K) per packet buffer is wasted. After this improvement, these 13 descriptors are available, increasing the effective queue depth = 256/3 = 85. [..] > > > > > > > > MTU(Bytes)/Bandwidth (Gbit/s) > > > > Before After > > > > 1500 22.5 22.4 > > > > 9000 12.8 25.9 > > > is this buffer space? Above performance numbers are showing improvement in bandwidth. In Gbps/sec. > just the overhead of allocating/freeing the buffers? > of using INDIRECT? The effective queue depth is so small, device cannot receive all the packets at given bw-delay product. > > > > > > Which configurations were tested? > > I tested it with DPDK vDPA + qemu vhost. Do you mean the feature set > > of the VM? > The configuration of interest is mtu, not the backend. Which is different mtu as shown in above perf numbers. > > > Did you test devices without VIRTIO_NET_F_MTU ? > > No. It will need code changes. No. It doesn't need any code changes. This is misleading/vague. This patch doesn't have any relation to a device which doesn't offer VIRTIO_NET_F_MTU. Just the code restructuring is touching this area, that may require some existing tests. I assume virtio tree will have some automation tests for such a device? > > > > > > > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct > > > > virtio_device *vdev) > > > > > > > > dev->mtu = mtu; > > > > dev->max_mtu = mtu; > > > > - > > > > - /* TODO: size buffers correctly in this case. */ > > > > - if (dev->mtu > ETH_DATA_LEN) > > > > - vi->big_packets = true; > > > > } > > > > > > > > + virtnet_set_big_packets_fields(vi, mtu); > > > > + > > > If VIRTIO_NET_F_MTU is off, then mtu is uninitialized. 
> > > You should move it to within if () above to fix. > > mtu was initialized to 0 at the beginning of probe if VIRTIO_NET_F_MTU > > is off. > > > > In this case, big_packets_num_skbfrags will be set according to guest gso. > > > > If guest gso is supported, it will be set to MAX_SKB_FRAGS else > > zero---- do you > > > > think this is a bug to be fixed? > > > yes I think with no mtu this should behave as it did historically. > Michael is right. It should behave as today. There is no new bug introduced by this patch. dev->mtu and dev->max_mtu is set only when VIRTIO_NET_F_MTU is offered with/without this patch. Please have mtu related fix/change in different patch. > > > > > > > if (vi->any_header_sg) > > > > dev->needed_headroom = vi->hdr_len; > > > > > > > > -- > > > > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
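Restating the arithmetic above as a worked example (assuming a 256-entry ring without INDIRECT, 4 KiB pages, a 9000-byte MTU, and the approximate 16-descriptor chain used in the thread):

    before the patch: 256 descriptors / ~16 per buffer (16 x 4K = 64K) = ~16 buffers outstanding
    after the patch:  256 descriptors /   3 per buffer ( 3 x 4K = 12K) = ~85 buffers outstanding

This matches the point that, at the given bandwidth-delay product, the device simply runs out of posted receive buffers before the driver can repost them.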
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 14:08 ` Parav Pandit via Virtualization @ 2022-09-07 14:29 ` Michael S. Tsirkin 2022-09-07 14:33 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 14:29 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 02:08:18PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 5:27 AM > > > > On Wed, Sep 07, 2022 at 04:08:54PM +0800, Gavin Li wrote: > > > > > > On 9/7/2022 1:31 PM, Michael S. Tsirkin wrote: > > > > External email: Use caution opening links or attachments > > > > > > > > > > > > On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > > > > > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for > > > > > big packets even when GUEST_* offloads are not present on the > > device. > > > > > However, if guest GSO is not supported, it would be sufficient to > > > > > allocate segments to cover just up the MTU size and no further. > > > > > Allocating the maximum amount of segments results in a large waste > > > > > of buffer space in the queue, which limits the number of packets > > > > > that can be buffered and can result in reduced performance. > > > > actually how does this waste space? Is this because your device does not > > have INDIRECT? > VQ is 256 entries deep. > Driver posted total of 256 descriptors. > Each descriptor points to a page of 4K. > These descriptors are chained as 4K * 16. So without indirect then? with indirect each descriptor can point to 16 entries. > So total packets that can be serviced are 256/16 = 16. > So effective queue depth = 16. > > So, when GSO is off, for 9K mtu, packet buffer needed = 3 pages. (12k). > So, 13 descriptors (= 13 x 4K =52K) per packet buffer is wasted. > > After this improvement, these 13 descriptors are available, increasing the effective queue depth = 256/3 = 85. > > [..] > > > > > > > > > > MTU(Bytes)/Bandwidth (Gbit/s) > > > > > Before After > > > > > 1500 22.5 22.4 > > > > > 9000 12.8 25.9 > > > > > > is this buffer space? > Above performance numbers are showing improvement in bandwidth. In Gbps/sec. > > > just the overhead of allocating/freeing the buffers? > > of using INDIRECT? > The effective queue depth is so small, device cannot receive all the packets at given bw-delay product. > > > > > > > > > Which configurations were tested? > > > I tested it with DPDK vDPA + qemu vhost. Do you mean the feature set > > > of the VM? > > > The configuration of interest is mtu, not the backend. > Which is different mtu as shown in above perf numbers. > > > > > Did you test devices without VIRTIO_NET_F_MTU ? > > > No. It will need code changes. > No. It doesn't need any code changes. This is misleading/vague. > > This patch doesn't have any relation to a device which doesn't offer VIRTIO_NET_F_MTU. > Just the code restructuring is touching this area, that may require some existing tests. > I assume virtio tree will have some automation tests for such a device? I have some automated tests but I also expect developer to do testing. 
> > > > > > > > > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct > > > > > virtio_device *vdev) > > > > > > > > > > dev->mtu = mtu; > > > > > dev->max_mtu = mtu; > > > > > - > > > > > - /* TODO: size buffers correctly in this case. */ > > > > > - if (dev->mtu > ETH_DATA_LEN) > > > > > - vi->big_packets = true; > > > > > } > > > > > > > > > > + virtnet_set_big_packets_fields(vi, mtu); > > > > > + > > > > If VIRTIO_NET_F_MTU is off, then mtu is uninitialized. > > > > You should move it to within if () above to fix. > > > mtu was initialized to 0 at the beginning of probe if VIRTIO_NET_F_MTU > > > is off. > > > > > > In this case, big_packets_num_skbfrags will be set according to guest gso. > > > > > > If guest gso is supported, it will be set to MAX_SKB_FRAGS else > > > zero---- do you > > > > > > think this is a bug to be fixed? > > > > > > yes I think with no mtu this should behave as it did historically. > > > Michael is right. > It should behave as today. There is no new bug introduced by this patch. > dev->mtu and dev->max_mtu is set only when VIRTIO_NET_F_MTU is offered with/without this patch. > > Please have mtu related fix/change in different patch. > > > > > > > > > > if (vi->any_header_sg) > > > > > dev->needed_headroom = vi->hdr_len; > > > > > > > > > > -- > > > > > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 14:29 ` Michael S. Tsirkin @ 2022-09-07 14:33 ` Parav Pandit via Virtualization 2022-09-07 14:40 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 14:33 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 10:30 AM [..] > > > actually how does this waste space? Is this because your device does > > > not have INDIRECT? > > VQ is 256 entries deep. > > Driver posted total of 256 descriptors. > > Each descriptor points to a page of 4K. > > These descriptors are chained as 4K * 16. > > So without indirect then? with indirect each descriptor can point to 16 > entries. > With indirect, can it post 256 * 16 size buffers even though vq depth is 256 entries? I recall that total number of descriptors with direct/indirect descriptors is limited to vq depth. Was there some recent clarification occurred in the spec to clarify this? _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 14:33 ` Parav Pandit via Virtualization @ 2022-09-07 14:40 ` Michael S. Tsirkin 2022-09-07 16:12 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 14:40 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 10:30 AM > > [..] > > > > actually how does this waste space? Is this because your device does > > > > not have INDIRECT? > > > VQ is 256 entries deep. > > > Driver posted total of 256 descriptors. > > > Each descriptor points to a page of 4K. > > > These descriptors are chained as 4K * 16. > > > > So without indirect then? with indirect each descriptor can point to 16 > > entries. > > > With indirect, can it post 256 * 16 size buffers even though vq depth is 256 entries? > I recall that total number of descriptors with direct/indirect descriptors is limited to vq depth. > Was there some recent clarification occurred in the spec to clarify this? This would make INDIRECT completely pointless. So I don't think we ever had such a limitation. The only thing that comes to mind is this: A driver MUST NOT create a descriptor chain longer than the Queue Size of the device. but this limits individual chain length not the total length of all chains. We have an open bug that we forgot to include this requirement in the packed ring documentation. -- MST _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 14:40 ` Michael S. Tsirkin @ 2022-09-07 16:12 ` Parav Pandit via Virtualization 2022-09-07 18:15 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 16:12 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 10:40 AM > > On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Wednesday, September 7, 2022 10:30 AM > > > > [..] > > > > > actually how does this waste space? Is this because your device > > > > > does not have INDIRECT? > > > > VQ is 256 entries deep. > > > > Driver posted total of 256 descriptors. > > > > Each descriptor points to a page of 4K. > > > > These descriptors are chained as 4K * 16. > > > > > > So without indirect then? with indirect each descriptor can point to > > > 16 entries. > > > > > With indirect, can it post 256 * 16 size buffers even though vq depth is 256 > entries? > > I recall that total number of descriptors with direct/indirect descriptors is > limited to vq depth. > > > > Was there some recent clarification occurred in the spec to clarify this? > > > This would make INDIRECT completely pointless. So I don't think we ever > had such a limitation. > The only thing that comes to mind is this: > > A driver MUST NOT create a descriptor chain longer than the Queue > Size of > the device. > > but this limits individual chain length not the total length of all chains. > Right. I double checked in virtqueue_add_split() which doesn't count table entries towards desc count of VQ for indirect case. With indirect descriptors without this patch the situation is even worse with memory usage. Driver will allocate 64K * 256 = 16MB buffer per VQ, while needed (and used) buffer is only 2.3 Mbytes. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
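Spelling out the memory figures above under the same assumptions (256-entry RQ, 4 KiB pages, 9000-byte MTU); the last line is an illustrative estimate, not a number quoted in the thread:

    posted per RQ at 64K per buffer (e.g. INDIRECT, unpatched): 256 x 16 pages x 4 KiB ~= 16 MB
    actually needed for 9000-byte packets:                      256 x ~9 KB            ~= 2.3 MB
    posted per RQ with this patch (3 pages per buffer):         256 x 12 KiB           ~= 3 MB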
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 16:12 ` Parav Pandit via Virtualization @ 2022-09-07 18:15 ` Michael S. Tsirkin 2022-09-07 19:06 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 18:15 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 04:12:47PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 10:40 AM > > > > On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote: > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > Sent: Wednesday, September 7, 2022 10:30 AM > > > > > > [..] > > > > > > actually how does this waste space? Is this because your device > > > > > > does not have INDIRECT? > > > > > VQ is 256 entries deep. > > > > > Driver posted total of 256 descriptors. > > > > > Each descriptor points to a page of 4K. > > > > > These descriptors are chained as 4K * 16. > > > > > > > > So without indirect then? with indirect each descriptor can point to > > > > 16 entries. > > > > > > > With indirect, can it post 256 * 16 size buffers even though vq depth is 256 > > entries? > > > I recall that total number of descriptors with direct/indirect descriptors is > > limited to vq depth. > > > > > > > Was there some recent clarification occurred in the spec to clarify this? > > > > > > This would make INDIRECT completely pointless. So I don't think we ever > > had such a limitation. > > The only thing that comes to mind is this: > > > > A driver MUST NOT create a descriptor chain longer than the Queue > > Size of > > the device. > > > > but this limits individual chain length not the total length of all chains. > > > Right. > I double checked in virtqueue_add_split() which doesn't count table entries towards desc count of VQ for indirect case. > > With indirect descriptors without this patch the situation is even worse with memory usage. > Driver will allocate 64K * 256 = 16MB buffer per VQ, while needed (and used) buffer is only 2.3 Mbytes. Yes. So just so we understand the reason for the performance improvement is this because of memory usage? Or is this because device does not have INDIRECT? Thanks, -- MST _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 18:15 ` Michael S. Tsirkin @ 2022-09-07 19:06 ` Parav Pandit via Virtualization 2022-09-07 19:11 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 19:06 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 2:16 PM > > On Wed, Sep 07, 2022 at 04:12:47PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Wednesday, September 7, 2022 10:40 AM > > > > > > On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote: > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > Sent: Wednesday, September 7, 2022 10:30 AM > > > > > > > > [..] > > > > > > > actually how does this waste space? Is this because your > > > > > > > device does not have INDIRECT? > > > > > > VQ is 256 entries deep. > > > > > > Driver posted total of 256 descriptors. > > > > > > Each descriptor points to a page of 4K. > > > > > > These descriptors are chained as 4K * 16. > > > > > > > > > > So without indirect then? with indirect each descriptor can > > > > > point to > > > > > 16 entries. > > > > > > > > > With indirect, can it post 256 * 16 size buffers even though vq > > > > depth is 256 > > > entries? > > > > I recall that total number of descriptors with direct/indirect > > > > descriptors is > > > limited to vq depth. > > > > > > > > > > Was there some recent clarification occurred in the spec to clarify this? > > > > > > > > > This would make INDIRECT completely pointless. So I don't think we > > > ever had such a limitation. > > > The only thing that comes to mind is this: > > > > > > A driver MUST NOT create a descriptor chain longer than the Queue > > > Size of > > > the device. > > > > > > but this limits individual chain length not the total length of all chains. > > > > > Right. > > I double checked in virtqueue_add_split() which doesn't count table > entries towards desc count of VQ for indirect case. > > > > With indirect descriptors without this patch the situation is even worse > with memory usage. > > Driver will allocate 64K * 256 = 16MB buffer per VQ, while needed (and > used) buffer is only 2.3 Mbytes. > > Yes. So just so we understand the reason for the performance improvement > is this because of memory usage? Or is this because device does not have > INDIRECT? Because of shallow queue of 16 entries deep. With driver turn around time to repost buffers, device is idle without any RQ buffers. With this improvement, device has 85 buffers instead of 16 to receive packets. Enabling indirect in device can help at cost of 7x higher memory per VQ in the guest VM. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:06 ` Parav Pandit via Virtualization @ 2022-09-07 19:11 ` Michael S. Tsirkin 2022-09-07 19:18 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 19:11 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 07:06:09PM +0000, Parav Pandit wrote: > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 2:16 PM > > > > On Wed, Sep 07, 2022 at 04:12:47PM +0000, Parav Pandit wrote: > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > Sent: Wednesday, September 7, 2022 10:40 AM > > > > > > > > On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote: > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > Sent: Wednesday, September 7, 2022 10:30 AM > > > > > > > > > > [..] > > > > > > > > actually how does this waste space? Is this because your > > > > > > > > device does not have INDIRECT? > > > > > > > VQ is 256 entries deep. > > > > > > > Driver posted total of 256 descriptors. > > > > > > > Each descriptor points to a page of 4K. > > > > > > > These descriptors are chained as 4K * 16. > > > > > > > > > > > > So without indirect then? with indirect each descriptor can > > > > > > point to > > > > > > 16 entries. > > > > > > > > > > > With indirect, can it post 256 * 16 size buffers even though vq > > > > > depth is 256 > > > > entries? > > > > > I recall that total number of descriptors with direct/indirect > > > > > descriptors is > > > > limited to vq depth. > > > > > > > > > > > > > Was there some recent clarification occurred in the spec to clarify this? > > > > > > > > > > > > This would make INDIRECT completely pointless. So I don't think we > > > > ever had such a limitation. > > > > The only thing that comes to mind is this: > > > > > > > > A driver MUST NOT create a descriptor chain longer than the Queue > > > > Size of > > > > the device. > > > > > > > > but this limits individual chain length not the total length of all chains. > > > > > > > Right. > > > I double checked in virtqueue_add_split() which doesn't count table > > entries towards desc count of VQ for indirect case. > > > > > > With indirect descriptors without this patch the situation is even worse > > with memory usage. > > > Driver will allocate 64K * 256 = 16MB buffer per VQ, while needed (and > > used) buffer is only 2.3 Mbytes. > > > > Yes. So just so we understand the reason for the performance improvement > > is this because of memory usage? Or is this because device does not have > > INDIRECT? > > Because of shallow queue of 16 entries deep. but why is the queue just 16 entries? does the device not support indirect? because with indirect you get 256 entries, with 16 s/g each. > With driver turn around time to repost buffers, device is idle without any RQ buffers. > With this improvement, device has 85 buffers instead of 16 to receive packets. > > Enabling indirect in device can help at cost of 7x higher memory per VQ in the guest VM. 
_______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
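To summarize the descriptor accounting being compared here, using the thread's own approximate numbers:

    without INDIRECT: 256 ring entries / ~16-descriptor chains  = ~16 buffers postable at once
    with INDIRECT:    256 ring entries x 1 indirect table each  = 256 buffers postable at once
                      (at ~64 KB per buffer, roughly the 16 MB per RQ estimated earlier)

Either way, sizing the buffers from the MTU reduces what each buffer consumes; the open question in the thread is whether the measured gain comes from the deeper effective queue, the smaller memory footprint, or both.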
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:11 ` Michael S. Tsirkin @ 2022-09-07 19:18 ` Parav Pandit via Virtualization 2022-09-07 19:23 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 19:18 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 3:12 PM > > Because of shallow queue of 16 entries deep. > > but why is the queue just 16 entries? I explained the calculation in [1] about 16 entries. [1] https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC419@PH0PR12MB5481.namprd12.prod.outlook.com/ > does the device not support indirect? > Yes, indirect feature bit is disabled on the device. > because with indirect you get 256 entries, with 16 s/g each. > Sure. I explained below that indirect comes with 7x memory cost that is not desired. (Ignored the table memory allocation cost and extra latency). Hence don't want to enable indirect in this scenario. This optimization also works with indirect with smaller indirect table. > > > With driver turn around time to repost buffers, device is idle without any > RQ buffers. > > With this improvement, device has 85 buffers instead of 16 to receive > packets. > > > > Enabling indirect in device can help at cost of 7x higher memory per VQ in > the guest VM. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:18 ` Parav Pandit via Virtualization @ 2022-09-07 19:23 ` Michael S. Tsirkin 2022-09-07 19:27 ` Parav Pandit via Virtualization 0 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 19:23 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > Because of shallow queue of 16 entries deep. > > > > but why is the queue just 16 entries? > I explained the calculation in [1] about 16 entries. > > [1] https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC419@PH0PR12MB5481.namprd12.prod.outlook.com/ > > > does the device not support indirect? > > > Yes, indirect feature bit is disabled on the device. OK that explains it. > > because with indirect you get 256 entries, with 16 s/g each. > > > Sure. I explained below that indirect comes with 7x memory cost that is not desired. > (Ignored the table memory allocation cost and extra latency). Oh sure, it's a waste. I wonder what effect the patch has on bandwidth with indirect enabled, though. > Hence don't want to enable indirect in this scenario. > This optimization also works with indirect with smaller indirect table. > > > > > > With driver turn around time to repost buffers, device is idle without any > > RQ buffers. > > > With this improvement, device has 85 buffers instead of 16 to receive > > packets. > > > > > > Enabling indirect in device can help at cost of 7x higher memory per VQ in > > the guest VM. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:23 ` Michael S. Tsirkin @ 2022-09-07 19:27 ` Parav Pandit via Virtualization 2022-09-07 19:36 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 19:27 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 3:24 PM > > On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > > > Because of shallow queue of 16 entries deep. > > > > > > but why is the queue just 16 entries? > > I explained the calculation in [1] about 16 entries. > > > > [1] > > > https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC > 419@ > > PH0PR12MB5481.namprd12.prod.outlook.com/ > > > > > does the device not support indirect? > > > > > Yes, indirect feature bit is disabled on the device. > > OK that explains it. So can we proceed with v6 to contain (a) updated commit message and (b) function name change you suggested to drop _fields suffix? _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:27 ` Parav Pandit via Virtualization @ 2022-09-07 19:36 ` Michael S. Tsirkin 2022-09-07 19:37 ` Michael S. Tsirkin ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 19:36 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 07:27:16PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 3:24 PM > > > > On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > > > > > Because of shallow queue of 16 entries deep. > > > > > > > > but why is the queue just 16 entries? > > > I explained the calculation in [1] about 16 entries. > > > > > > [1] > > > > > https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC > > 419@ > > > PH0PR12MB5481.namprd12.prod.outlook.com/ > > > > > > > does the device not support indirect? > > > > > > > Yes, indirect feature bit is disabled on the device. > > > > OK that explains it. > > So can we proceed with v6 to contain > (a) updated commit message and > (b) function name change you suggested to drop _fields suffix? (c) replace mtu = 0 with sensibly not calling the function when mtu is unknown. And I'd like commit log to include results of perf testing - with indirect feature on - with mtu feature off just to make sure nothing breaks. -- MST _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
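One possible shape for item (c), shown only as a hedged sketch of the idea rather than the actual v6 change: call the (renamed, per item (b)) sizing helper only when the device actually reported an MTU, and otherwise keep the historical probe-time behaviour.

    /* Hypothetical sketch in virtnet_probe(), not the real follow-up patch. */
    if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
            /* ... existing MTU validation and dev->mtu/dev->max_mtu setup ... */
            virtnet_set_big_packets(vi, mtu);
    } else if (virtnet_check_guest_gso(vi)) {
            /* No device MTU: behave exactly as before this patch. */
            vi->big_packets = true;
            vi->big_packets_num_skbfrags = MAX_SKB_FRAGS;
    }

How the actual respin handles this is up to the author; the sketch only illustrates the "do not call the function when the mtu is unknown" requirement.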
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:36 ` Michael S. Tsirkin @ 2022-09-07 19:37 ` Michael S. Tsirkin 2022-09-07 19:54 ` Parav Pandit via Virtualization 2022-09-07 19:51 ` Parav Pandit via Virtualization 2022-09-07 20:04 ` Parav Pandit via Virtualization 2 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-07 19:37 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 03:36:16PM -0400, Michael S. Tsirkin wrote: > On Wed, Sep 07, 2022 at 07:27:16PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Wednesday, September 7, 2022 3:24 PM > > > > > > On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > > > > > > > Because of shallow queue of 16 entries deep. > > > > > > > > > > but why is the queue just 16 entries? > > > > I explained the calculation in [1] about 16 entries. > > > > > > > > [1] > > > > > > > https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC > > > 419@ > > > > PH0PR12MB5481.namprd12.prod.outlook.com/ > > > > > > > > > does the device not support indirect? > > > > > > > > > Yes, indirect feature bit is disabled on the device. > > > > > > OK that explains it. > > > > So can we proceed with v6 to contain > > (a) updated commit message and > > (b) function name change you suggested to drop _fields suffix? > > (c) replace mtu = 0 with sensibly not calling the function > when mtu is unknown. > > > And I'd like commit log to include results of perf testing > - with indirect feature on > - with mtu feature off > just to make sure nothing breaks. and if possible a larger ring. 1k? > -- > MST _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:37 ` Michael S. Tsirkin @ 2022-09-07 19:54 ` Parav Pandit via Virtualization 0 siblings, 0 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 19:54 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 3:38 PM > > and if possible a larger ring. 1k? What do you expect to see here for which test report should be added to commit log? What is special about 1k vs 512, 128 and 2k? is 1K default for some configuration? _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:36 ` Michael S. Tsirkin 2022-09-07 19:37 ` Michael S. Tsirkin @ 2022-09-07 19:51 ` Parav Pandit via Virtualization 2022-09-07 21:39 ` [virtio-dev] " Si-Wei Liu 2022-09-22 9:26 ` Michael S. Tsirkin 2022-09-07 20:04 ` Parav Pandit via Virtualization 2 siblings, 2 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 19:51 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 3:36 PM > > On Wed, Sep 07, 2022 at 07:27:16PM +0000, Parav Pandit wrote: > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > Sent: Wednesday, September 7, 2022 3:24 PM > > > > > > On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > > > > > > > Because of shallow queue of 16 entries deep. > > > > > > > > > > but why is the queue just 16 entries? > > > > I explained the calculation in [1] about 16 entries. > > > > > > > > [1] > > > > > > > > https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC > > > 419@ > > > > PH0PR12MB5481.namprd12.prod.outlook.com/ > > > > > > > > > does the device not support indirect? > > > > > > > > > Yes, indirect feature bit is disabled on the device. > > > > > > OK that explains it. > > > > So can we proceed with v6 to contain > > (a) updated commit message and > > (b) function name change you suggested to drop _fields suffix? > > (c) replace mtu = 0 with sensibly not calling the function when mtu is > unknown. > > > And I'd like commit log to include results of perf testing > - with indirect feature on Which device do you suggest using for this test? > - with mtu feature off Why is this needed when it is not touching the area of mtu being not offered? > just to make sure nothing breaks. Not sure why you demand this. Can you please share the link to other patches that ensured that nothing breaks, for example I didn't see a similar "test ask" in v14 series [1]? What is so special about current patch of interest vs [1] that requires this special testing details in commit log, and it is not required in [1] or past patches? Do you have link to the tests done with synchronization tests by commit [2]? This will help to define test matrix for developers and internal regression and similar report in all subsequent patches like [1]. [1] https://lore.kernel.org/bpf/20220801063902.129329-41-xuanzhuo@linux.alibaba.com/ [2] 6213f07cb54 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [virtio-dev] RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:51 ` Parav Pandit via Virtualization @ 2022-09-07 21:39 ` Si-Wei Liu 2022-09-07 22:11 ` Parav Pandit via Virtualization 2022-09-22 9:26 ` Michael S. Tsirkin 1 sibling, 1 reply; 29+ messages in thread From: Si-Wei Liu @ 2022-09-07 21:39 UTC (permalink / raw) To: Parav Pandit, Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On 9/7/2022 12:51 PM, Parav Pandit wrote: >> And I'd like commit log to include results of perf testing >> - with indirect feature on > Which device do you suggest using for this test? > You may use software vhost-net backend with and without fix to compare. Since this driver fix effectively lowers down the buffer size for the indirect=on case as well, it's a natural request to make sure no perf degradation is seen on devices with indirect descriptor enabled. I don't expect degradation though and think this patch should improve efficiency with less memory foot print. Cheers, -Siwei _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
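Put differently, with INDIRECT on each posted buffer still occupies a single ring slot, so the gain from MTU-based sizing there is mostly memory footprint rather than queue depth. A rough estimate under the same assumptions as above (4 KiB pages, MAX_SKB_FRAGS of 17, MTU 9000, a 1024-entry RQ; illustrative numbers, not measurements):

    /* Pages pinned per big-packet buffer: num_skbfrags chained pages plus the
     * first page holding the header (see add_recvbuf_big()).
     *   GSO sizing: (17 + 1) * 4 KiB = 72 KiB per buffer -> ~72 MiB per 1024-deep RQ
     *   MTU sizing: ( 3 + 1) * 4 KiB = 16 KiB per buffer -> ~16 MiB per 1024-deep RQ
     */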
* RE: [virtio-dev] RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 21:39 ` [virtio-dev] " Si-Wei Liu @ 2022-09-07 22:11 ` Parav Pandit via Virtualization 2022-09-07 22:57 ` Si-Wei Liu 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 22:11 UTC (permalink / raw) To: Si-Wei Liu, Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Si-Wei Liu <si-wei.liu@oracle.com> > Sent: Wednesday, September 7, 2022 5:40 PM > > > On 9/7/2022 12:51 PM, Parav Pandit wrote: > >> And I'd like commit log to include results of perf testing > >> - with indirect feature on > > Which device do you suggest using for this test? > > > You may use software vhost-net backend with and without fix to compare. > Since this driver fix effectively lowers down the buffer size for the > indirect=on case as well, Do you have sample example for this? > it's a natural request to make sure no perf > degradation is seen on devices with indirect descriptor enabled. I don't > expect degradation though and think this patch should improve efficiency > with less memory foot print. > Any specific reason to discount test for the packed vq here because the change applies to packed vq too? It is counter intuitive to see degradation with smaller size buffers. But sure, code reviews can miss things that can bring regression for which it should be tested. I am not against the test itself. It is good thing to do more test coverage. What is puzzling me is, I fail to see test results not available in previous commits and cover letters, which is making this patch special for test coverage. Or a new trend begins with this specific patch onwards? For example, I don’t see a test results in commit [1], [2], [3] to indicate that no degradation is seen which heavily touches the lock in core data path. So want to know test reporting guidelines in the commit logs for this and other patches. [1] commit 5a2f966d0f [2] commit a7766ef18b33 [3] commit 22bc63c58e876 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [virtio-dev] RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 22:11 ` Parav Pandit via Virtualization @ 2022-09-07 22:57 ` Si-Wei Liu 0 siblings, 0 replies; 29+ messages in thread From: Si-Wei Liu @ 2022-09-07 22:57 UTC (permalink / raw) To: Parav Pandit, Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On 9/7/2022 3:11 PM, Parav Pandit wrote: >> From: Si-Wei Liu <si-wei.liu@oracle.com> >> Sent: Wednesday, September 7, 2022 5:40 PM >> >> >> On 9/7/2022 12:51 PM, Parav Pandit wrote: >>>> And I'd like commit log to include results of perf testing >>>> - with indirect feature on >>> Which device do you suggest using for this test? >>> >> You may use software vhost-net backend with and without fix to compare. >> Since this driver fix effectively lowers down the buffer size for the >> indirect=on case as well, > Do you have sample example for this? ip link add link ens300f1 name macvtap1 address $mac type macvtap mode bridge ifindex1=`cat /sys/class/net/macvtap1/ifindex` qemu-system-x86_64 \ ... -netdev type=tap,id=hnet1,vhost=on,fd=25,vhostfd=27 \ -device virtio-net-pci,netdev=hnet1,id=vnet1,mac=$mac,host_mtu=9000,mrg_rxbuf=off,guest_tso4=off,guest_tso6=off,guest_ecn=off,guest_ufo=off,indirect_desc=on \ 25<>/dev/tap${ifindex1} 27<>/dev/vhost-net indirect_desc=on is not actually required as it's on by default if not specified. -Siwei _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
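With a backend set up along those lines, the before/after comparison can be driven the same way as the numbers in the commit message: an iperf server on the virtio-net interface inside the guest and a client on the traffic peer, repeated for each MTU under test. A minimal example, with the guest address left as a placeholder (iperf3 shown; classic iperf accepts the same -s/-c/-t switches):

    # inside the guest, on the virtio-net interface
    iperf3 -s
    # on the peer, once per MTU under test (1500 and 9000)
    iperf3 -c <guest-ip> -t 60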
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:51 ` Parav Pandit via Virtualization 2022-09-07 21:39 ` [virtio-dev] " Si-Wei Liu @ 2022-09-22 9:26 ` Michael S. Tsirkin 2022-09-22 10:07 ` Parav Pandit via Virtualization 1 sibling, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-22 9:26 UTC (permalink / raw) To: Parav Pandit Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Wed, Sep 07, 2022 at 07:51:38PM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Wednesday, September 7, 2022 3:36 PM > > > > On Wed, Sep 07, 2022 at 07:27:16PM +0000, Parav Pandit wrote: > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > Sent: Wednesday, September 7, 2022 3:24 PM > > > > > > > > On Wed, Sep 07, 2022 at 07:18:06PM +0000, Parav Pandit wrote: > > > > > > > > > > > From: Michael S. Tsirkin <mst@redhat.com> > > > > > > Sent: Wednesday, September 7, 2022 3:12 PM > > > > > > > > > > > > Because of shallow queue of 16 entries deep. > > > > > > > > > > > > but why is the queue just 16 entries? > > > > > I explained the calculation in [1] about 16 entries. > > > > > > > > > > [1] > > > > > > > > > > > https://lore.kernel.org/netdev/PH0PR12MB54812EC7F4711C1EA4CAA119DC > > > > 419@ > > > > > PH0PR12MB5481.namprd12.prod.outlook.com/ > > > > > > > > > > > does the device not support indirect? > > > > > > > > > > > Yes, indirect feature bit is disabled on the device. > > > > > > > > OK that explains it. > > > > > > So can we proceed with v6 to contain > > > (a) updated commit message and > > > (b) function name change you suggested to drop _fields suffix? > > > > (c) replace mtu = 0 with sensibly not calling the function when mtu is > > unknown. > > > > > > > And I'd like commit log to include results of perf testing > > - with indirect feature on > Which device do you suggest using for this test? AFAIK most devices support INDIRECT, e.g. don't nvidia cards do this? > > - with mtu feature off > Why is this needed when it is not touching the area of mtu being not offered? I don't really like it that instead of checking the MTU feature bit everywhere the patch sets mtu variable to 0. Because of this it wasn't all that obvious that the patch did not affect !MTU performance (the code does change). Rereading it afresh I think it's ok. But explicit check for !MTU would be better imho making it obvious we do not need to test !MTU. -- MST _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
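One way to read the explicit-check suggestion, purely as a sketch and not the code that was eventually merged, is to key the call off the MTU feature bit instead of a sentinel value. That makes it obvious the !MTU case degenerates to the pre-patch ETH_DATA_LEN behaviour. Here virtnet_set_big_packets() stands for the helper from the patch with the _fields suffix dropped, as requested earlier in the thread:

    /* Sketch: avoid mtu = 0 by making the !MTU case explicit in probe. */
    if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU))
            virtnet_set_big_packets(vi, mtu);
    else
            virtnet_set_big_packets(vi, ETH_DATA_LEN);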
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-22 9:26 ` Michael S. Tsirkin @ 2022-09-22 10:07 ` Parav Pandit via Virtualization 0 siblings, 0 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-22 10:07 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Thursday, September 22, 2022 5:27 AM > > > > > > And I'd like commit log to include results of perf testing > > > - with indirect feature on > > Which device do you suggest using for this test? > > AFAIK most devices support INDIRECT, e.g. don't nvidia cards do this? The device of interest didn't have INDIRECT support. But the question is no more relevant as Gavin supplied the numbers with INDIRECT in commit log of v6. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-07 19:36 ` Michael S. Tsirkin 2022-09-07 19:37 ` Michael S. Tsirkin 2022-09-07 19:51 ` Parav Pandit via Virtualization @ 2022-09-07 20:04 ` Parav Pandit via Virtualization 2 siblings, 0 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-07 20:04 UTC (permalink / raw) To: Michael S. Tsirkin Cc: virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Wednesday, September 7, 2022 3:36 PM > > (c) replace mtu = 0 with sensibly not calling the function when mtu is > unknown. Even when mtu is zero, virtnet_set_big_packets() must be called to act on the gso bits. The current handling by virtnet_set_big_packets() seems simpler, taking care of both mtu and gso in one place. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
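For reference, the mtu == 0 path through the v5 helper quoted earlier is already benign on its own; the point above is that the guest GSO handling must still run. A short trace of the existing code (no new behaviour, just walking through the patch):

    /* virtnet_set_big_packets_fields(vi, 0), i.e. VIRTIO_NET_F_MTU not negotiated:
     *   guest_gso == true  -> big_packets = true, big_packets_num_skbfrags = MAX_SKB_FRAGS
     *   guest_gso == false -> (0 > ETH_DATA_LEN || false) is false, big_packets stays
     *                         false, matching the pre-patch behaviour for a 1500-byte MTU
     */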
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets [not found] ` <20220901021038.84751-3-gavinl@nvidia.com> 2022-09-07 2:17 ` [virtio-dev] [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets Jason Wang 2022-09-07 5:31 ` Michael S. Tsirkin @ 2022-09-22 9:35 ` Michael S. Tsirkin 2022-09-22 10:04 ` Parav Pandit via Virtualization 2 siblings, 1 reply; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-22 9:35 UTC (permalink / raw) To: Gavin Li Cc: alexander.h.duyck, virtio-dev, sridhar.samudrala, jesse.brandeburg, gavi, virtualization, stephen, loseweigh, netdev, kuba, davem On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > packets even when GUEST_* offloads are not present on the device. > However, if guest GSO is not supported, it would be sufficient to > allocate segments to cover just up the MTU size and no further. > Allocating the maximum amount of segments results in a large waste of > buffer space in the queue, which limits the number of packets that can > be buffered and can result in reduced performance. > > Therefore, if guest GSO is not supported, use the MTU to calculate the > optimal amount of segments required. > > When guest offload is enabled at runtime, RQ already has packets of bytes > less than 64K. So when packet of 64KB arrives, all the packets of such > size will be dropped. and RQ is now not usable. > > So this means that during set_guest_offloads() phase, RQs have to be > destroyed and recreated, which requires almost driver reload. > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it should > always treat them as GSO enabled. > > Accordingly, for now the assumption is that if guest GSO has been > negotiated then it has been enabled, even if it's actually been disabled > at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA for > 1 VQ, queue size 1024, before and after the change, with the iperf > server running over the virtio-net interface. > > MTU(Bytes)/Bandwidth (Gbit/s) > Before After > 1500 22.5 22.4 > 9000 12.8 25.9 > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > Reviewed-by: Parav Pandit <parav@nvidia.com> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> OK I think the logic is correct, it's just a bit harder to read than necessary. Small improvement suggestions: > --- > changelog: > v4->v5 > - Addressed comments from Michael S. Tsirkin > - Improve commit message > v3->v4 > - Addressed comments from Si-Wei > - Rename big_packets_sg_num with big_packets_num_skbfrags > v2->v3 > - Addressed comments from Si-Wei > - Simplify the condition check to enable the optimization > v1->v2 > - Addressed comments from Jason, Michael, Si-Wei. > - Remove the flag of guest GSO support, set sg_num for big packets and > use it directly > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > - Replace the round up algorithm with DIV_ROUND_UP > --- > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > 1 file changed, 24 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index f831a0290998..dbffd5f56fb8 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -225,6 +225,9 @@ struct virtnet_info { > /* I like... big packets and I cannot lie! 
*/ > bool big_packets; > > + /* number of sg entries allocated for big packets */ > + unsigned int big_packets_num_skbfrags; > + > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > bool mergeable_rx_bufs; > big_packets_num_skbfrags -> big_packet_num_skbfrags > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > char *p; > int i, err, offset; > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > first = get_a_page(rq, gfp); > if (!first) { > if (list) > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq, > > /* chain first in list head */ > first->private = (unsigned long)list; > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > + err = virtqueue_add_inbuf(rq->vq, rq->sg, vi->big_packets_num_skbfrags + 2, > first, gfp); > if (err < 0) > give_pages(rq, first); > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const struct virtnet_info *vi) > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); > } > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, const int mtu) > +{ > + bool guest_gso = virtnet_check_guest_gso(vi); > + > + /* If device can receive ANY guest GSO packets, regardless of mtu, > + * allocate packets of maximum size, otherwise limit it to only > + * mtu size worth only. > + */ > + if (mtu > ETH_DATA_LEN || guest_gso) { > + vi->big_packets = true; > + vi->big_packets_num_skbfrags = guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > + } > +} > + > static int virtnet_probe(struct virtio_device *vdev) > { > int i, err = -ENOMEM; > struct net_device *dev; > struct virtnet_info *vi; > u16 max_queue_pairs; > - int mtu; > + int mtu = 0; > I think it's better to drop this and instead just put the code where we already know the config. So: > /* Find if host supports multiqueue/rss virtio_net device */ > max_queue_pairs = 1; > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device *vdev) > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > spin_lock_init(&vi->refill_lock); > > - /* If we can receive ANY GSO packets, we must allocate large ones. */ > - if (virtnet_check_guest_gso(vi)) > - vi->big_packets = true; > - > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > vi->mergeable_rx_bufs = true; > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device *vdev) > > dev->mtu = mtu; > dev->max_mtu = mtu; > - > - /* TODO: size buffers correctly in this case. */ > - if (dev->mtu > ETH_DATA_LEN) > - vi->big_packets = true; /* Size buffers to fit mtu. */ if (mtu > ETH_DATA_LEN) { vi->big_packets = true; vi->big_packets_num_skbfrags = DIV_ROUND_UP(mtu, PAGE_SIZE); } > } > > + virtnet_set_big_packets_fields(vi, mtu); > + and here: /* If device can receive guest GSO packets, allocate buffers for * packets of maximum size, regardless of mtu. 
*/ if (virtnet_check_guest_gso(vi)) { vi->big_packets = true; vi->big_packets_num_skbfrags = MAX_SKB_FRAGS; } > if (vi->any_header_sg) > dev->needed_headroom = vi->hdr_len; > > -- > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-22 9:35 ` Michael S. Tsirkin @ 2022-09-22 10:04 ` Parav Pandit via Virtualization 2022-09-22 10:14 ` Michael S. Tsirkin 0 siblings, 1 reply; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-22 10:04 UTC (permalink / raw) To: Michael S. Tsirkin, Gavin Li Cc: alexander.h.duyck@intel.com, virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Thursday, September 22, 2022 5:35 AM > > On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > > packets even when GUEST_* offloads are not present on the device. > > However, if guest GSO is not supported, it would be sufficient to > > allocate segments to cover just up the MTU size and no further. > > Allocating the maximum amount of segments results in a large waste of > > buffer space in the queue, which limits the number of packets that can > > be buffered and can result in reduced performance. > > > > Therefore, if guest GSO is not supported, use the MTU to calculate the > > optimal amount of segments required. > > > > When guest offload is enabled at runtime, RQ already has packets of > > bytes less than 64K. So when packet of 64KB arrives, all the packets > > of such size will be dropped. and RQ is now not usable. > > > > So this means that during set_guest_offloads() phase, RQs have to be > > destroyed and recreated, which requires almost driver reload. > > > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it > > should always treat them as GSO enabled. > > > > Accordingly, for now the assumption is that if guest GSO has been > > negotiated then it has been enabled, even if it's actually been > > disabled at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA > > for > > 1 VQ, queue size 1024, before and after the change, with the iperf > > server running over the virtio-net interface. > > > > MTU(Bytes)/Bandwidth (Gbit/s) > > Before After > > 1500 22.5 22.4 > > 9000 12.8 25.9 > > > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > > Reviewed-by: Parav Pandit <parav@nvidia.com> > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > OK I think the logic is correct, it's just a bit harder to read than necessary. > Small improvement suggestions: > > > > --- > > changelog: > > v4->v5 > > - Addressed comments from Michael S. Tsirkin > > - Improve commit message > > v3->v4 > > - Addressed comments from Si-Wei > > - Rename big_packets_sg_num with big_packets_num_skbfrags > > v2->v3 > > - Addressed comments from Si-Wei > > - Simplify the condition check to enable the optimization > > v1->v2 > > - Addressed comments from Jason, Michael, Si-Wei. 
> > - Remove the flag of guest GSO support, set sg_num for big packets and > > use it directly > > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > > - Replace the round up algorithm with DIV_ROUND_UP > > --- > > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > > 1 file changed, 24 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index > > f831a0290998..dbffd5f56fb8 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -225,6 +225,9 @@ struct virtnet_info { > > /* I like... big packets and I cannot lie! */ > > bool big_packets; > > > > + /* number of sg entries allocated for big packets */ > > + unsigned int big_packets_num_skbfrags; > > + > > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > > bool mergeable_rx_bufs; > > > > big_packets_num_skbfrags -> big_packet_num_skbfrags > > > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info > *vi, struct receive_queue *rq, > > char *p; > > int i, err, offset; > > > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > > first = get_a_page(rq, gfp); > > if (!first) { > > if (list) > > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info > > *vi, struct receive_queue *rq, > > > > /* chain first in list head */ > > first->private = (unsigned long)list; > > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > > + err = virtqueue_add_inbuf(rq->vq, rq->sg, > > +vi->big_packets_num_skbfrags + 2, > > first, gfp); > > if (err < 0) > > give_pages(rq, first); > > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const > struct virtnet_info *vi) > > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); } > > > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, > > +const int mtu) { > > + bool guest_gso = virtnet_check_guest_gso(vi); > > + > > + /* If device can receive ANY guest GSO packets, regardless of mtu, > > + * allocate packets of maximum size, otherwise limit it to only > > + * mtu size worth only. > > + */ > > + if (mtu > ETH_DATA_LEN || guest_gso) { > > + vi->big_packets = true; > > + vi->big_packets_num_skbfrags = guest_gso ? > MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > > + } > > +} > > + > > static int virtnet_probe(struct virtio_device *vdev) { > > int i, err = -ENOMEM; > > struct net_device *dev; > > struct virtnet_info *vi; > > u16 max_queue_pairs; > > - int mtu; > > + int mtu = 0; > > > > I think it's better to drop this and instead just put the code > where we already know the config. So: > > > /* Find if host supports multiqueue/rss virtio_net device */ > > max_queue_pairs = 1; > > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device > *vdev) > > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > > spin_lock_init(&vi->refill_lock); > > > > - /* If we can receive ANY GSO packets, we must allocate large ones. 
> */ > > - if (virtnet_check_guest_gso(vi)) > > - vi->big_packets = true; > > - > > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > > vi->mergeable_rx_bufs = true; > > > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device > *vdev) > > > > dev->mtu = mtu; > > dev->max_mtu = mtu; > > - > > - /* TODO: size buffers correctly in this case. */ > > - if (dev->mtu > ETH_DATA_LEN) > > - vi->big_packets = true; > > /* Size buffers to fit mtu. */ > if (mtu > ETH_DATA_LEN) { > vi->big_packets = true; > vi->big_packets_num_skbfrags = DIV_ROUND_UP(mtu, > PAGE_SIZE); > } > How doing things twice is better i.e. when mtu is > ETH_DATA_LEN and gso is offered? It calculates big_packets variable twice. It also easier to read the code at single place where big_packets decision is taken. > > } > > > > + virtnet_set_big_packets_fields(vi, mtu); > > + > > and here: > /* If device can receive guest GSO packets, allocate buffers for > * packets of maximum size, regardless of mtu. > */ > > if (virtnet_check_guest_gso(vi)) { > vi->big_packets = true; > vi->big_packets_num_skbfrags = MAX_SKB_FRAGS; > } > > > > if (vi->any_header_sg) > > dev->needed_headroom = vi->hdr_len; > > > > -- > > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-22 10:04 ` Parav Pandit via Virtualization @ 2022-09-22 10:14 ` Michael S. Tsirkin 2022-09-22 10:29 ` Parav Pandit via Virtualization [not found] ` <20220922053458.66f31136@kernel.org> 0 siblings, 2 replies; 29+ messages in thread From: Michael S. Tsirkin @ 2022-09-22 10:14 UTC (permalink / raw) To: Parav Pandit Cc: alexander.h.duyck@intel.com, virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li On Thu, Sep 22, 2022 at 10:04:53AM +0000, Parav Pandit wrote: > > > From: Michael S. Tsirkin <mst@redhat.com> > > Sent: Thursday, September 22, 2022 5:35 AM > > > > On Thu, Sep 01, 2022 at 05:10:38AM +0300, Gavin Li wrote: > > > Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big > > > packets even when GUEST_* offloads are not present on the device. > > > However, if guest GSO is not supported, it would be sufficient to > > > allocate segments to cover just up the MTU size and no further. > > > Allocating the maximum amount of segments results in a large waste of > > > buffer space in the queue, which limits the number of packets that can > > > be buffered and can result in reduced performance. > > > > > > Therefore, if guest GSO is not supported, use the MTU to calculate the > > > optimal amount of segments required. > > > > > > When guest offload is enabled at runtime, RQ already has packets of > > > bytes less than 64K. So when packet of 64KB arrives, all the packets > > > of such size will be dropped. and RQ is now not usable. > > > > > > So this means that during set_guest_offloads() phase, RQs have to be > > > destroyed and recreated, which requires almost driver reload. > > > > > > If VIRTIO_NET_F_CTRL_GUEST_OFFLOADS has been negotiated, then it > > > should always treat them as GSO enabled. > > > > > > Accordingly, for now the assumption is that if guest GSO has been > > > negotiated then it has been enabled, even if it's actually been > > > disabled at runtime through VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > > > > > Below is the iperf TCP test results over a Mellanox NIC, using vDPA > > > for > > > 1 VQ, queue size 1024, before and after the change, with the iperf > > > server running over the virtio-net interface. > > > > > > MTU(Bytes)/Bandwidth (Gbit/s) > > > Before After > > > 1500 22.5 22.4 > > > 9000 12.8 25.9 > > > > > > Signed-off-by: Gavin Li <gavinl@nvidia.com> > > > Reviewed-by: Gavi Teitz <gavi@nvidia.com> > > > Reviewed-by: Parav Pandit <parav@nvidia.com> > > > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > > > OK I think the logic is correct, it's just a bit harder to read than necessary. > > Small improvement suggestions: > > > > > > > --- > > > changelog: > > > v4->v5 > > > - Addressed comments from Michael S. Tsirkin > > > - Improve commit message > > > v3->v4 > > > - Addressed comments from Si-Wei > > > - Rename big_packets_sg_num with big_packets_num_skbfrags > > > v2->v3 > > > - Addressed comments from Si-Wei > > > - Simplify the condition check to enable the optimization > > > v1->v2 > > > - Addressed comments from Jason, Michael, Si-Wei. 
> > > - Remove the flag of guest GSO support, set sg_num for big packets and > > > use it directly > > > - Recalculate sg_num for big packets in virtnet_set_guest_offloads > > > - Replace the round up algorithm with DIV_ROUND_UP > > > --- > > > drivers/net/virtio_net.c | 37 ++++++++++++++++++++++++------------- > > > 1 file changed, 24 insertions(+), 13 deletions(-) > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index > > > f831a0290998..dbffd5f56fb8 100644 > > > --- a/drivers/net/virtio_net.c > > > +++ b/drivers/net/virtio_net.c > > > @@ -225,6 +225,9 @@ struct virtnet_info { > > > /* I like... big packets and I cannot lie! */ > > > bool big_packets; > > > > > > + /* number of sg entries allocated for big packets */ > > > + unsigned int big_packets_num_skbfrags; > > > + > > > /* Host will merge rx buffers for big packets (shake it! shake it!) */ > > > bool mergeable_rx_bufs; > > > > > > > big_packets_num_skbfrags -> big_packet_num_skbfrags > > > > > @@ -1331,10 +1334,10 @@ static int add_recvbuf_big(struct virtnet_info > > *vi, struct receive_queue *rq, > > > char *p; > > > int i, err, offset; > > > > > > - sg_init_table(rq->sg, MAX_SKB_FRAGS + 2); > > > + sg_init_table(rq->sg, vi->big_packets_num_skbfrags + 2); > > > > > > - /* page in rq->sg[MAX_SKB_FRAGS + 1] is list tail */ > > > - for (i = MAX_SKB_FRAGS + 1; i > 1; --i) { > > > + /* page in rq->sg[vi->big_packets_num_skbfrags + 1] is list tail */ > > > + for (i = vi->big_packets_num_skbfrags + 1; i > 1; --i) { > > > first = get_a_page(rq, gfp); > > > if (!first) { > > > if (list) > > > @@ -1365,7 +1368,7 @@ static int add_recvbuf_big(struct virtnet_info > > > *vi, struct receive_queue *rq, > > > > > > /* chain first in list head */ > > > first->private = (unsigned long)list; > > > - err = virtqueue_add_inbuf(rq->vq, rq->sg, MAX_SKB_FRAGS + 2, > > > + err = virtqueue_add_inbuf(rq->vq, rq->sg, > > > +vi->big_packets_num_skbfrags + 2, > > > first, gfp); > > > if (err < 0) > > > give_pages(rq, first); > > > @@ -3690,13 +3693,27 @@ static bool virtnet_check_guest_gso(const > > struct virtnet_info *vi) > > > virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO); } > > > > > > +static void virtnet_set_big_packets_fields(struct virtnet_info *vi, > > > +const int mtu) { > > > + bool guest_gso = virtnet_check_guest_gso(vi); > > > + > > > + /* If device can receive ANY guest GSO packets, regardless of mtu, > > > + * allocate packets of maximum size, otherwise limit it to only > > > + * mtu size worth only. > > > + */ > > > + if (mtu > ETH_DATA_LEN || guest_gso) { > > > + vi->big_packets = true; > > > + vi->big_packets_num_skbfrags = guest_gso ? > > MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE); > > > + } > > > +} > > > + > > > static int virtnet_probe(struct virtio_device *vdev) { > > > int i, err = -ENOMEM; > > > struct net_device *dev; > > > struct virtnet_info *vi; > > > u16 max_queue_pairs; > > > - int mtu; > > > + int mtu = 0; > > > > > > > I think it's better to drop this and instead just put the code > > where we already know the config. So: > > > > > /* Find if host supports multiqueue/rss virtio_net device */ > > > max_queue_pairs = 1; > > > @@ -3784,10 +3801,6 @@ static int virtnet_probe(struct virtio_device > > *vdev) > > > INIT_WORK(&vi->config_work, virtnet_config_changed_work); > > > spin_lock_init(&vi->refill_lock); > > > > > > - /* If we can receive ANY GSO packets, we must allocate large ones. 
> > */ > > > - if (virtnet_check_guest_gso(vi)) > > > - vi->big_packets = true; > > > - > > > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) > > > vi->mergeable_rx_bufs = true; > > > > > > @@ -3853,12 +3866,10 @@ static int virtnet_probe(struct virtio_device > > *vdev) > > > > > > dev->mtu = mtu; > > > dev->max_mtu = mtu; > > > - > > > - /* TODO: size buffers correctly in this case. */ > > > - if (dev->mtu > ETH_DATA_LEN) > > > - vi->big_packets = true; > > > > /* Size buffers to fit mtu. */ > > if (mtu > ETH_DATA_LEN) { > > vi->big_packets = true; > > vi->big_packets_num_skbfrags = DIV_ROUND_UP(mtu, > > PAGE_SIZE); > > } > > > How doing things twice is better i.e. when mtu is > ETH_DATA_LEN and gso is offered? > It calculates big_packets variable twice. > > It also easier to read the code at single place where big_packets decision is taken. I guess it depends on what you want to keep in one place. I just wanted to reduce the testing burden on the submitter. What I proposed makes the functional change minimal. It's nitpicking to be frank. v6 arrived while I was traveling and I didn't notice it. I see Jason acked that so I guess I will just apply as is. Do you ack v6 too? > > > } > > > > > > + virtnet_set_big_packets_fields(vi, mtu); > > > + > > > > and here: > > /* If device can receive guest GSO packets, allocate buffers for > > * packets of maximum size, regardless of mtu. > > */ > > > > if (virtnet_check_guest_gso(vi)) { > > vi->big_packets = true; > > vi->big_packets_num_skbfrags = MAX_SKB_FRAGS; > > } > > > > > > > if (vi->any_header_sg) > > > dev->needed_headroom = vi->hdr_len; > > > > > > -- > > > 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets 2022-09-22 10:14 ` Michael S. Tsirkin @ 2022-09-22 10:29 ` Parav Pandit via Virtualization [not found] ` <20220922053458.66f31136@kernel.org> 1 sibling, 0 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-09-22 10:29 UTC (permalink / raw) To: Michael S. Tsirkin Cc: alexander.h.duyck@intel.com, virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, kuba@kernel.org, davem@davemloft.net, Gavin Li > From: Michael S. Tsirkin <mst@redhat.com> > Sent: Thursday, September 22, 2022 6:15 AM > > It's nitpicking to be frank. v6 arrived while I was traveling and I didn't notice it. > I see Jason acked that so I guess I will just apply as is. Do you ack v6 too? > Yes. I reviewed it. Gavin added reviewed-by. Thanks. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
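Since v6 itself is not quoted in this thread, the shape it takes, per the points agreed above, is essentially the v5 helper with the _fields suffix dropped; the sketch below is a rough rendering, and the v6 posting remains the authoritative version:

    static void virtnet_set_big_packets(struct virtnet_info *vi, const int mtu)
    {
            bool guest_gso = virtnet_check_guest_gso(vi);

            /* If the device can receive ANY guest GSO packets, regardless of mtu,
             * allocate buffers of maximum size; otherwise size them to the mtu.
             */
            if (mtu > ETH_DATA_LEN || guest_gso) {
                    vi->big_packets = true;
                    vi->big_packets_num_skbfrags = guest_gso ? MAX_SKB_FRAGS : DIV_ROUND_UP(mtu, PAGE_SIZE);
            }
    }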
[parent not found: <20220922053458.66f31136@kernel.org>]
* RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets [not found] ` <20220922053458.66f31136@kernel.org> @ 2022-10-05 10:29 ` Parav Pandit via Virtualization 0 siblings, 0 replies; 29+ messages in thread From: Parav Pandit via Virtualization @ 2022-10-05 10:29 UTC (permalink / raw) To: Jakub Kicinski, Michael S. Tsirkin Cc: alexander.h.duyck@intel.com, virtio-dev@lists.oasis-open.org, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, Gavi Teitz, virtualization@lists.linux-foundation.org, stephen@networkplumber.org, loseweigh@gmail.com, netdev@vger.kernel.org, davem@davemloft.net, Gavin Li > From: Jakub Kicinski <kuba@kernel.org> > Sent: Thursday, September 22, 2022 6:05 PM > > On Thu, 22 Sep 2022 06:14:59 -0400 Michael S. Tsirkin wrote: > > It's nitpicking to be frank. v6 arrived while I was traveling and I > > didn't notice it. I see Jason acked that so I guess I will just apply > > as is. > > Oh, you wanna take it? The split on who takes virtio_net is a touch unclear > but if it makes more sense here to go via virtio feel free to slap my ack on the > v6. Michael, Did you get a chance to take this? I don't see it at https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=linux-next _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread
Thread overview: 29+ messages
[not found] <20220901021038.84751-1-gavinl@nvidia.com>
[not found] ` <20220901021038.84751-3-gavinl@nvidia.com>
2022-09-07 2:17 ` [virtio-dev] [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets Jason Wang
2022-09-07 5:31 ` Michael S. Tsirkin
[not found] ` <0355d1e4-a3cf-5b16-8c7f-b39b1ec14ade@nvidia.com>
2022-09-07 9:26 ` Michael S. Tsirkin
2022-09-07 14:08 ` Parav Pandit via Virtualization
2022-09-07 14:29 ` Michael S. Tsirkin
2022-09-07 14:33 ` Parav Pandit via Virtualization
2022-09-07 14:40 ` Michael S. Tsirkin
2022-09-07 16:12 ` Parav Pandit via Virtualization
2022-09-07 18:15 ` Michael S. Tsirkin
2022-09-07 19:06 ` Parav Pandit via Virtualization
2022-09-07 19:11 ` Michael S. Tsirkin
2022-09-07 19:18 ` Parav Pandit via Virtualization
2022-09-07 19:23 ` Michael S. Tsirkin
2022-09-07 19:27 ` Parav Pandit via Virtualization
2022-09-07 19:36 ` Michael S. Tsirkin
2022-09-07 19:37 ` Michael S. Tsirkin
2022-09-07 19:54 ` Parav Pandit via Virtualization
2022-09-07 19:51 ` Parav Pandit via Virtualization
2022-09-07 21:39 ` [virtio-dev] " Si-Wei Liu
2022-09-07 22:11 ` Parav Pandit via Virtualization
2022-09-07 22:57 ` Si-Wei Liu
2022-09-22 9:26 ` Michael S. Tsirkin
2022-09-22 10:07 ` Parav Pandit via Virtualization
2022-09-07 20:04 ` Parav Pandit via Virtualization
2022-09-22 9:35 ` Michael S. Tsirkin
2022-09-22 10:04 ` Parav Pandit via Virtualization
2022-09-22 10:14 ` Michael S. Tsirkin
2022-09-22 10:29 ` Parav Pandit via Virtualization
[not found] ` <20220922053458.66f31136@kernel.org>
2022-10-05 10:29 ` Parav Pandit via Virtualization