* virtio-net + BQL
@ 2024-02-23 8:49 Xuan Zhuo
From: Xuan Zhuo @ 2024-02-23 8:49 UTC (permalink / raw)
To: Dave Taht; +Cc: Jason Wang, Michael S. Tsirkin, hengqi, netdev
Hi Dave,

We have been studying BQL recently.

For virtio-net, skb orphan mode is the obstacle to BQL. But now that we have
netdim, maybe it is time for a change. @Heng is working on netdim.
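[To make the conflict concrete, here is a minimal sketch, not the actual
virtio-net code: skb_orphan() is the real helper from <linux/skbuff.h>, the
xmit function wrapped around it is illustrative.]

#include <linux/netdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t xmit_orphan_mode(struct sk_buff *skb,
				    struct net_device *dev)
{
	/* Orphan mode detaches the skb from its socket now, so the
	 * sender's socket/TSQ accounting stops seeing queued bytes and
	 * backpressure is lost before the device ever drains the ring.
	 * BQL, by contrast, needs timely per-completion byte accounting,
	 * which orphan mode was designed to avoid waiting for. */
	skb_orphan(skb);

	/* ... enqueue the skb to the TX vring and kick the device ... */
	return NETDEV_TX_OK;
}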
But the performance numbers from https://lwn.net/Articles/469652/ do not
appeal to me.

The numbers below are good, but they only help when the NIC is busy:

No BQL, tso on: 3000-3200K bytes in queue, 36 tps
BQL, tso on: 156-194K bytes in queue, 535 tps

Or am I missing something?

Thanks.
* Re: virtio-net + BQL
From: Dave Taht @ 2024-02-23 12:58 UTC (permalink / raw)
To: Xuan Zhuo; +Cc: Jason Wang, Michael S. Tsirkin, hengqi, netdev
On Fri, Feb 23, 2024 at 3:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Hi Dave,
>
> We have been studying BQL recently.
>
> For virtio-net, skb orphan mode is the obstacle to BQL. But now that we have
> netdim, maybe it is time for a change. @Heng is working on netdim.
>
> But the performance numbers from https://lwn.net/Articles/469652/ do not
> appeal to me.
>
> The numbers below are good, but they only help when the NIC is busy:
>
> No BQL, tso on: 3000-3200K bytes in queue, 36 tps
> BQL, tso on: 156-194K bytes in queue, 535 tps
That is data from 2011 against a gbit interface. Each of those BQL
queues is additive.
> Or am I missing something?
What I see nowadays is 16+ Mbytes vanishing into ring buffers and
affecting packet pacing, fair queueing, and QoS behaviors. Certainly
my own efforts with eBPF and LibreQos are helping observability here,
but it seems to me that the virtualized stack is not getting enough
pushback from the underlying cloudy driver - be it this one, or nitro.
Most of the time the packet shaping seems to take place in the cloud
network or driver on a per-VM basis.

I know that adding BQL to virtio has been tried before, and I keep
hoping it gets tried again, this time measuring latency under load.

BQL has sprouted some new latency issues since 2011, given the enormous
number of hardware queues now exposed, which I talked about a bit in my
netdevconf talk here:

https://www.youtube.com/watch?v=rWnb543Sdk8&t=2603s

I am also interested in how similar AI workloads are to the infamous
rrul test in a virtualized environment.

There is also misunderstood AFAP ("as fast as possible") thinking, with
a really mind-bogglingly-wrong application of it documented over here,
where 15ms of delay in the stack is considered good:

https://github.com/cilium/cilium/issues/29083#issuecomment-1824756141

So my overall concern is a bit broader than "just add bql", but in
other drivers, it was only 6 lines of code (roughly sketched below)....
> Thanks.
>
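[For reference, a minimal sketch of what that per-driver wiring usually
looks like, assuming nothing about virtio-net itself: the
netdev_tx_sent_queue()/netdev_tx_completed_queue() helpers are the real
BQL API from <linux/netdevice.h>, while the drv_* functions are
illustrative placeholders.]

#include <linux/netdevice.h>

/* TX path: account bytes into BQL once the skb is posted to the ring. */
static netdev_tx_t drv_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq =
		netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

	/* ... post the skb to the TX ring here ... */

	netdev_tx_sent_queue(txq, skb->len);
	return NETDEV_TX_OK;
}

/* Completion path (e.g. NAPI poll): release the reclaimed bytes.
 * BQL stops the queue when too many bytes are in flight and wakes
 * it again here, keeping the ring shallow. */
static void drv_tx_complete(struct net_device *dev, unsigned int qidx,
			    unsigned int pkts, unsigned int bytes)
{
	netdev_tx_completed_queue(netdev_get_tx_queue(dev, qidx), pkts, bytes);
}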
--
https://blog.cerowrt.org/post/2024_predictions/
Dave Täht CSO, LibreQos
* Re: virtio-net + BQL
From: Michael S. Tsirkin @ 2024-02-25 18:36 UTC (permalink / raw)
To: Dave Taht; +Cc: Xuan Zhuo, Jason Wang, hengqi, netdev
On Fri, Feb 23, 2024 at 07:58:34AM -0500, Dave Taht wrote:
> On Fri, Feb 23, 2024 at 3:59 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > [...]
>
> That is data from 2011 against a gbit interface. Each of those BQL
> queues is additive.
>
> > Or am I missing something?
>
> What I see nowadays is 16+ Mbytes vanishing into ring buffers and
> affecting packet pacing, fair queueing, and QoS behaviors. Certainly
> my own efforts with eBPF and LibreQos are helping observability here,
> but it seems to me that the virtualized stack is not getting enough
> pushback from the underlying cloudy driver - be it this one, or nitro.
> Most of the time the packet shaping seems to take place in the cloud
> network or driver on a per-VM basis.
>
> I know that adding BQL to virtio has been tried before, and I keep
> hoping it gets tried again, this time measuring latency under load.
>
> BQL has sprouted some new latency issues since 2011, given the enormous
> number of hardware queues now exposed, which I talked about a bit in my
> netdevconf talk here:
>
> https://www.youtube.com/watch?v=rWnb543Sdk8&t=2603s
>
> I am also interested in how similar AI workloads are to the infamous
> rrul test in a virtualized environment.
>
> There is also misunderstood AFAP ("as fast as possible") thinking, with
> a really mind-bogglingly-wrong application of it documented over here,
> where 15ms of delay in the stack is considered good:
>
> https://github.com/cilium/cilium/issues/29083#issuecomment-1824756141
>
> So my overall concern is a bit broader than "just add bql", but in
> other drivers, it was only 6 lines of code....

It is less BQL and more TCP small queues, which do not
seem to work well when your kernel isn't running part of the
time because the hypervisor scheduled it out. Wireless has some
of the same problem, with huge variance in latency unrelated
to load, and IIRC worked around it by
tuning the socket queue size slightly differently.
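[As a concrete illustration: the wireless workaround alluded to above is,
as far as I know, mac80211 lowering the per-socket pacing shift so TSQ
tolerates more latency variance. sk_pacing_shift_update() is the real
helper from <net/sock.h>; the wrapper below is a sketch, not the mac80211
code.]

#include <net/sock.h>

/* TSQ normally caps a socket at roughly 1ms of data at the current
 * pacing rate (sk_pacing_shift = 10). Lowering the shift to 8 allows
 * about 4x more bytes in flight, riding out jittery completions. */
static void relax_tsq_for_jittery_link(struct sk_buff *skb)
{
	if (skb->sk)
		sk_pacing_shift_update(skb->sk, 8);
}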
--
MST
* Re: virtio-net + BQL
From: Dave Taht @ 2024-02-25 18:58 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Xuan Zhuo, Jason Wang, hengqi, netdev
On Sun, Feb 25, 2024 at 1:36 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 23, 2024 at 07:58:34AM -0500, Dave Taht wrote:
> > [...]
>
> It is less BQL and more TCP small queues, which do not
> seem to work well when your kernel isn't running part of the
> time because the hypervisor scheduled it out. Wireless has some
> of the same problem, with huge variance in latency unrelated
> to load, and IIRC worked around it by
> tuning the socket queue size slightly differently.
Add that to the problems-with-virtualization list, then. :/ I was
aghast at a fix Jakub put in to kick things at 7ms that went by
recently.

Wireless is kind of an overly broad topic. I was (6 years ago) pretty
happy with all the fixes we put in there for WiFi softmac devices; the
mt76 and the new mt79 seem to be performing rather well. Ath9k is
still good, ath10k not horrible, I have no data about ath11k, and
let's not talk about the Broadcom nightmare.

This was still a pretty good day, in my memory:
https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002

Is something else in wifi going to hell? There are still, oh, 200
drivers left to fix. ENOFUNDING.

And so far as I know, the 3GPP (5G) work is entirely out of tree and
almost entirely DPDK or eBPF?
--
https://blog.cerowrt.org/post/2024_predictions/
Dave Täht CSO, LibreQos
* Re: virtio-net + BQL
From: Michael S. Tsirkin @ 2024-02-25 20:26 UTC (permalink / raw)
To: Dave Taht; +Cc: Xuan Zhuo, Jason Wang, hengqi, netdev
On Sun, Feb 25, 2024 at 01:58:53PM -0500, Dave Taht wrote:
> On Sun, Feb 25, 2024 at 1:36 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > [...]
> > It is less BQL and more TCP small queues, which do not
> > seem to work well when your kernel isn't running part of the
> > time because the hypervisor scheduled it out. Wireless has some
> > of the same problem, with huge variance in latency unrelated
> > to load, and IIRC worked around it by
> > tuning the socket queue size slightly differently.
>
> Add that to the problems-with-virtualization list, then. :/
yep

For example, attempts to drop packets to fight bufferbloat do
not work well, because as you start dropping packets you have less
work to do on the host, and so the VM starts going even faster,
flooding you with even more packets.

Virtualization has to be treated more like userspace than like
a physical machine.
> I was
> aghast at a fix Jakub put in to kick things at 7ms that went by
> recently.
which one is it?
* Re: virtio-net + BQL
From: Jason Wang @ 2024-02-26 5:03 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Dave Taht, Xuan Zhuo, hengqi, netdev
On Mon, Feb 26, 2024 at 4:26 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> [...]
>
> > Add that to the problems-with-virtualization list, then. :/
>
> yep
>
> For example, attempts to drop packets to fight bufferbloat do
> not work well, because as you start dropping packets you have less
> work to do on the host, and so the VM starts going even faster,
> flooding you with even more packets.
>
> Virtualization has to be treated more like userspace than like
> a physical machine.
Probably, but I think we need a new RFC with a benchmark for more
information (there's no need to bother with the mode switching, so it
should be a tiny patch).

One interesting thing is that gve implements BQL.

Thanks
* Re: virtio-net + BQL
From: Michael S. Tsirkin @ 2024-02-26 11:42 UTC (permalink / raw)
To: Jason Wang; +Cc: Dave Taht, Xuan Zhuo, hengqi, netdev
On Mon, Feb 26, 2024 at 01:03:12PM +0800, Jason Wang wrote:
> [...]
>
> Probably, but I think we need a new RFC with a benchmark for more
> information (there's no need to bother with the mode switching, so it
> should be a tiny patch).
>
> One interesting thing is that gve implements BQL.
>
> Thanks
Yeah, all this talk is rather pointless. Someone interested has to try.

Trying to activate the zerocopy TX machinery in vhost, even when the
packet is actually copied, could be one way to create feedback into the VM.
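[A conceptual sketch of that feedback path, with entirely hypothetical
names - the real vhost zerocopy machinery (ubuf_info callbacks) differs.
The idea: defer completing the guest's TX descriptor until the host has
really transmitted, so guest-side TSQ/BQL sees the true drain rate rather
than the copy rate.]

/* All types and hyp_* helpers below are hypothetical, for illustration. */
struct tx_token {
	struct hyp_virtqueue *vq;  /* queue owning the descriptor */
	unsigned int head;         /* descriptor chain to complete */
	unsigned int len;          /* bytes for the used-ring entry */
};

/* Called by the host stack when the packet has actually left the box.
 * Only now does the guest see its buffer consumed, so its TX ring
 * fills up and backpressure propagates - even if the data was copied
 * rather than sent zerocopy. */
static void tx_really_done(struct tx_token *tok)
{
	hyp_vq_add_used(tok->vq, tok->head, tok->len);
	hyp_vq_signal_guest(tok->vq);
}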
* Re: virtio-net + BQL
From: Xuan Zhuo @ 2024-02-27 2:20 UTC (permalink / raw)
To: Dave Taht; +Cc: Jason Wang, Michael S. Tsirkin, hengqi, netdev
On Fri, 23 Feb 2024 07:58:34 -0500, Dave Taht <dave.taht@gmail.com> wrote:
> [...]
>
> What I see nowadays is 16+ Mbytes vanishing into ring buffers and
> affecting packet pacing, fair queueing, and QoS behaviors. Certainly
> my own efforts with eBPF and LibreQos are helping observability here,
> but it seems to me that the virtualized stack is not getting enough
> pushback from the underlying cloudy driver - be it this one, or nitro.
> Most of the time the packet shaping seems to take place in the cloud
> network or driver on a per-VM basis.
By "the virtualized stack", do you mean virtio-net + tap on the host?

Nowadays, in the cloud, the virtio-net devices are DPUs in most cases.
The DPU is passed through to the VM, so the virtio-net devices work
more like hardware devices.

In that case, I can do some benchmarks, but I want to run the tests
when the NIC is not saturated, to simulate normal user cases.
Can BQL help to reduce latency or increase throughput,
or is there some other benefit?

Thanks.
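[One way to measure this while benchmarking: BQL exposes its per-queue
state through real sysfs files under
/sys/class/net/<dev>/queues/tx-<n>/byte_queue_limits/. A small hedged
userspace sampler, with "eth0" and "tx-0" as assumed names:]

#include <stdio.h>

int main(void)
{
	char buf[64];
	/* "inflight" is the byte count BQL currently has outstanding;
	 * "limit" (same directory) is the dynamically tuned cap. The
	 * device and queue names here are assumptions for this sketch. */
	FILE *f = fopen("/sys/class/net/eth0/queues/tx-0/"
			"byte_queue_limits/inflight", "r");

	if (f && fgets(buf, sizeof(buf), f))
		printf("BQL inflight bytes: %s", buf);
	if (f)
		fclose(f);
	return 0;
}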
* Re: virtio-net + BQL
From: Xuan Zhuo @ 2024-02-27 2:32 UTC (permalink / raw)
To: Jason Wang; +Cc: Dave Taht, hengqi, netdev, Michael S. Tsirkin
On Mon, 26 Feb 2024 13:03:12 +0800, Jason Wang <jasowang@redhat.com> wrote:
> [...]
>
> Probably, but I think we need a new RFC with a benchmark for more
> information (there's no need to bother with the mode switching, so it
> should be a tiny patch).
YES.

We need to know which cases BQL can improve; then I can do some
benchmarks on them.

I don't think the orphan mode is a problem. We can make it clear that
the no-orphan mode is the future, so we can skip the orphan mode.

Thanks.
Thread overview: 9+ messages
2024-02-23 8:49 virtio-net + BQL Xuan Zhuo
2024-02-23 12:58 ` Dave Taht
2024-02-25 18:36 ` Michael S. Tsirkin
2024-02-25 18:58 ` Dave Taht
2024-02-25 20:26 ` Michael S. Tsirkin
2024-02-26 5:03 ` Jason Wang
2024-02-26 11:42 ` Michael S. Tsirkin
2024-02-27 2:32 ` Xuan Zhuo
2024-02-27 2:20 ` Xuan Zhuo