* Re: [PATCH] vhost-net: add limitation of sent packets for tx polling
From: Michael S. Tsirkin @ 2018-04-03 11:59 UTC
To: haibinzhang(张海斌)
Cc: kvm@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
yunfangtai(台运方),
lidongchen(陈立东)
On Tue, Apr 03, 2018 at 08:08:26AM +0000, haibinzhang(张海斌) wrote:
> handle_tx() can delay rx for a long time when tx busy-polls UDP packets
> with small payloads (e.g. a 1-byte UDP payload), because the VHOST_NET_WEIGHT
> limit counts only the bytes sent, not the number of packets.
>
> Tests were done between two virtual machines using netperf (UDP_STREAM, len=1),
> while another machine pinged the client. The results are as follows:
>
> Packet#    Ping-Latency(ms)
>            min      avg      max
> Origin     3.319    18.489   57.503
> 64         1.643    2.021    2.552
> 128        1.825    2.600    3.224
> 256        1.997    2.710    4.295
> 512*       1.860    3.171    4.631
> 1024       2.002    4.173    9.056
> 2048       2.257    5.650    9.688
> 4096       2.093    8.508    15.943
>
> 512 is selected, which is a multiple of VRING_SIZE
There's no guarantee vring size is 256.
Could you pls try with a different tx ring size?
I suspect we want:
#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)
> and close to VHOST_NET_WEIGHT/MTU.
Puzzled by this part. Does tweaking MTU change anything?
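To make the suggestion above concrete, here is a rough, untested sketch of how
a per-vq packet weight could slot into handle_tx()'s existing exit check,
reusing the sent_pkts counter from the patch below (illustrative only, not the
posted patch):

#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)

	/* inside handle_tx()'s send loop, after each packet is submitted */
	if (unlikely(total_len >= VHOST_NET_WEIGHT) ||
	    unlikely(++sent_pkts >= VHOST_NET_PKT_WEIGHT(vq))) {
		vhost_poll_queue(&vq->poll);
		break;
	}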
> To evaluate this change, additional tests were done using netperf (RR, TX) between
> two machines with an Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz. The results below
> do not show obvious changes:
>
> TCP_RR
>
> size/sessions/+thu%/+normalize%
> 1/ 1/ -7%/ -2%
> 1/ 4/ +1%/ 0%
> 1/ 8/ +1%/ -2%
> 64/ 1/ -6%/ 0%
> 64/ 4/ 0%/ +2%
> 64/ 8/ 0%/ 0%
> 256/ 1/ -3%/ -4%
> 256/ 4/ +3%/ +4%
> 256/ 8/ +2%/ 0%
>
> UDP_RR
>
> size/sessions/+thu%/+normalize%
> 1/ 1/ -5%/ +1%
> 1/ 4/ +4%/ +1%
> 1/ 8/ -1%/ -1%
> 64/ 1/ -2%/ -3%
> 64/ 4/ -5%/ -1%
> 64/ 8/ 0%/ -1%
> 256/ 1/ +7%/ +1%
> 256/ 4/ +1%/ +1%
> 256/ 8/ +2%/ +2%
>
> TCP_STREAM
>
> size/sessions/+thu%/+normalize%
> 64/ 1/ 0%/ -3%
> 64/ 4/ +3%/ -1%
> 64/ 8/ +9%/ -4%
> 256/ 1/ +1%/ -4%
> 256/ 4/ -1%/ -1%
> 256/ 8/ +7%/ +5%
> 512/ 1/ +1%/ 0%
> 512/ 4/ +1%/ -1%
> 512/ 8/ +7%/ -5%
> 1024/ 1/ 0%/ -1%
> 1024/ 4/ +3%/ 0%
> 1024/ 8/ +8%/ +5%
> 2048/ 1/ +2%/ +2%
> 2048/ 4/ +1%/ 0%
> 2048/ 8/ -2%/ 0%
> 4096/ 1/ -2%/ 0%
> 4096/ 4/ +2%/ 0%
> 4096/ 8/ +9%/ -2%
>
> Signed-off-by: Haibin Zhang <haibinzhang@tencent.com>
> Signed-off-by: Yunfang Tai <yunfangtai@tencent.com>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
> drivers/vhost/net.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 8139bc70ad7d..13a23f3f3ea4 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
> * Using this limit prevents one virtqueue from starving others. */
> #define VHOST_NET_WEIGHT 0x80000
>
> +/* Max number of packets transferred before requeueing the job.
> + * Using this limit prevents one virtqueue from starving rx. */
> +#define VHOST_NET_PKT_WEIGHT 512
> +
> /* MAX number of TX used buffers for outstanding zerocopy */
> #define VHOST_MAX_PEND 128
> #define VHOST_GOODCOPY_LEN 256
> @@ -473,6 +477,7 @@ static void handle_tx(struct vhost_net *net)
> struct socket *sock;
> struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
> bool zcopy, zcopy_used;
> + int sent_pkts = 0;
>
> mutex_lock(&vq->mutex);
> sock = vq->private_data;
> @@ -580,7 +585,8 @@ static void handle_tx(struct vhost_net *net)
> else
> vhost_zerocopy_signal_used(net, vq);
> vhost_net_tx_packet(net);
> - if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> + if (unlikely(total_len >= VHOST_NET_WEIGHT) ||
> + unlikely(++sent_pkts >= VHOST_NET_PKT_WEIGHT)) {
> vhost_poll_queue(&vq->poll);
> break;
> }
> --
> 2.12.3
>
* Re: [PATCH] vhost-net: add limitation of sent packets for tx polling
From: Michael S. Tsirkin @ 2018-04-03 13:26 UTC
To: haibinzhang(张海斌)
Cc: kvm@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
yunfangtai(台运方),
lidongchen(陈立东)
On Tue, Apr 03, 2018 at 12:29:47PM +0000, haibinzhang(张海斌) wrote:
>
> >On Tue, Apr 03, 2018 at 08:08:26AM +0000, haibinzhang wrote:
> >> handle_tx() can delay rx for a long time when tx busy-polls UDP packets
> >> with small payloads (e.g. a 1-byte UDP payload), because the VHOST_NET_WEIGHT
> >> limit counts only the bytes sent, not the number of packets.
> >>
> >> Tests were done between two virtual machines using netperf (UDP_STREAM, len=1),
> >> while another machine pinged the client. The results are as follows:
> >>
> >> Packet#    Ping-Latency(ms)
> >>            min      avg      max
> >> Origin     3.319    18.489   57.503
> >> 64         1.643    2.021    2.552
> >> 128        1.825    2.600    3.224
> >> 256        1.997    2.710    4.295
> >> 512*       1.860    3.171    4.631
> >> 1024       2.002    4.173    9.056
> >> 2048       2.257    5.650    9.688
> >> 4096       2.093    8.508    15.943
> >>
> >> 512 is selected, which is a multiple of VRING_SIZE
> >
> >There's no guarantee vring size is 256.
> >
> >Could you pls try with a different tx ring size?
> >
> >I suspect we want:
> >
> >#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)
> >
> >
> >> and close to VHOST_NET_WEIGHT/MTU.
> >
> >Puzzled by this part. Does tweaking MTU change anything?
>
> The Ethernet MTU is 1500, so VHOST_NET_WEIGHT/MTU equals 0x80000/1500 ≈ 350.
We should include the 12-byte header, so it's a bit lower.
> Then the bytes sent cannot reach VHOST_NET_WEIGHT in one handle_tx() run, even with
> 1500-byte frames, if the packet limit is less than 350. So the packet limit must be
> bigger than 350, and 512 meets this condition
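As a quick check of that figure with the 12-byte header counted (an
illustrative stand-alone snippet, not part of the patch):

	#include <stdio.h>

	#define VHOST_NET_WEIGHT	0x80000	/* byte limit from drivers/vhost/net.c */

	int main(void)
	{
		int mtu = 1500;		/* Ethernet MTU */
		int vnet_hdr = 12;	/* virtio-net header added to each frame */

		/* full-sized frames needed to hit the byte limit: prints 346 */
		printf("%d\n", VHOST_NET_WEIGHT / (mtu + vnet_hdr));
		return 0;
	}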
What you seem to say is this:
imagine MTU-sized buffers. With these we stop after ~350
packets, so adding another limit above 350 will not
slow us down.
Fair enough, but that won't apply with smaller packet
sizes, will it?
I still think a simpler argument carries more weight:
the ring size is a hint from the device about the burst size
it can tolerate. Based on benchmarks, we tweak
the limit to 2 * vq size, as that seems to
perform a bit better and is still safer
than having no limit on the number of packets, as is done now.
But this needs testing with another ring size.
Could you try that, please?
> and is also aligned with the default VRING_SIZE.
Neither Linux nor virtio has a default vring size. It's a historical
construct that exists in qemu for qemu compatibility
reasons.
> >
> >> To evaluate this change, additional tests were done using netperf (RR, TX) between
> >> two machines with an Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz. The results below
> >> do not show obvious changes:
> >>
> >> TCP_RR
> >>
> >> size/sessions/+thu%/+normalize%
> >> 1/ 1/ -7%/ -2%
> >> 1/ 4/ +1%/ 0%
> >> 1/ 8/ +1%/ -2%
> >> 64/ 1/ -6%/ 0%
> >> 64/ 4/ 0%/ +2%
> >> 64/ 8/ 0%/ 0%
> >> 256/ 1/ -3%/ -4%
> >> 256/ 4/ +3%/ +4%
> >> 256/ 8/ +2%/ 0%
> >>
> >> UDP_RR
> >>
> >> size/sessions/+thu%/+normalize%
> >> 1/ 1/ -5%/ +1%
> >> 1/ 4/ +4%/ +1%
> >> 1/ 8/ -1%/ -1%
> >> 64/ 1/ -2%/ -3%
> >> 64/ 4/ -5%/ -1%
> >> 64/ 8/ 0%/ -1%
> >> 256/ 1/ +7%/ +1%
> >> 256/ 4/ +1%/ +1%
> >> 256/ 8/ +2%/ +2%
> >>
> >> TCP_STREAM
> >>
> >> size/sessions/+thu%/+normalize%
> >> 64/ 1/ 0%/ -3%
> >> 64/ 4/ +3%/ -1%
> >> 64/ 8/ +9%/ -4%
> >> 256/ 1/ +1%/ -4%
> >> 256/ 4/ -1%/ -1%
> >> 256/ 8/ +7%/ +5%
> >> 512/ 1/ +1%/ 0%
> >> 512/ 4/ +1%/ -1%
> >> 512/ 8/ +7%/ -5%
> >> 1024/ 1/ 0%/ -1%
> >> 1024/ 4/ +3%/ 0%
> >> 1024/ 8/ +8%/ +5%
> >> 2048/ 1/ +2%/ +2%
> >> 2048/ 4/ +1%/ 0%
> >> 2048/ 8/ -2%/ 0%
> >> 4096/ 1/ -2%/ 0%
> >> 4096/ 4/ +2%/ 0%
> >> 4096/ 8/ +9%/ -2%
> >>
> >> Signed-off-by: Haibin Zhang <haibinzhang@tencent.com>
> >> Signed-off-by: Yunfang Tai <yunfangtai@tencent.com>
> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> >> ---
> >> drivers/vhost/net.c | 8 +++++++-
> >> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> >> index 8139bc70ad7d..13a23f3f3ea4 100644
> >> --- a/drivers/vhost/net.c
> >> +++ b/drivers/vhost/net.c
> >> @@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
> >> * Using this limit prevents one virtqueue from starving others. */
> >> #define VHOST_NET_WEIGHT 0x80000
> >>
> >> +/* Max number of packets transferred before requeueing the job.
> >> + * Using this limit prevents one virtqueue from starving rx. */
> >> +#define VHOST_NET_PKT_WEIGHT 512
> >> +
> >> /* MAX number of TX used buffers for outstanding zerocopy */
> >> #define VHOST_MAX_PEND 128
> >> #define VHOST_GOODCOPY_LEN 256
> >> @@ -473,6 +477,7 @@ static void handle_tx(struct vhost_net *net)
> >> struct socket *sock;
> >> struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
> >> bool zcopy, zcopy_used;
> >> + int sent_pkts = 0;
> >>
> >> mutex_lock(&vq->mutex);
> >> sock = vq->private_data;
> >> @@ -580,7 +585,8 @@ static void handle_tx(struct vhost_net *net)
> >> else
> >> vhost_zerocopy_signal_used(net, vq);
> >> vhost_net_tx_packet(net);
> >> - if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
> >> + if (unlikely(total_len >= VHOST_NET_WEIGHT) ||
> >> + unlikely(++sent_pkts >= VHOST_NET_PKT_WEIGHT)) {
> >> vhost_poll_queue(&vq->poll);
> >> break;
> >> }
> >> --
> >> 2.12.3
> >>
>