* qlen check in tun.c
From: Jerry Chu @ 2013-06-19 2:31 UTC
To: netdev@vger.kernel.org, Jason Wang
In tun_net_xmit() the max qlen is computed as
dev->tx_queue_len / tun->numqueues. For a multi-queue configuration the
latter may be way too small, forcing one to adjust txqueuelen based on
the number of queues created. (The default txqueuelen of
500/TUN_READQ_SIZE already seems too small even for a single queue.)
Wouldn't it be better to simply use dev->tx_queue_len to cap the qlen of
each queue? That would also be more consistent with h/w multi-queues.
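For reference, a rough sketch of the check in question, approximating the
3.10-era tun.c (not a verbatim patch):

    /* tun_net_xmit(), current behaviour: split txqueuelen across the
     * attached queues.
     */
    if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) >=
        dev->tx_queue_len / tun->numqueues)
            goto drop;

    /* Proposed: cap each queue at the full txqueuelen instead, the way
     * h/w multi-queue NICs size each ring independently.
     */
    if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) >=
        dev->tx_queue_len)
            goto drop;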
Also, is there any objection to increasing MAX_TAP_QUEUES from 8 to 16?
Yes, it will take up more space in struct tun_struct, but we are hitting
the perf limit of 8 queues.
Thanks,
Jerry
* Re: qlen check in tun.c
From: Jason Wang @ 2013-06-19 3:29 UTC
To: Jerry Chu; +Cc: netdev@vger.kernel.org, Michael S. Tsirkin
On 06/19/2013 10:31 AM, Jerry Chu wrote:
> In tun_net_xmit() the max qlen is computed as
> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
> latter may be way too small, forcing one to adjust txqueuelen based
> on number of queues created. (Well the default txqueuelen of
> 500/TUN_READQ_SIZE already seems too small even for single queue.)
Hi Jerry:
Do you have any test results for this? Anyway, tun allows userspace to
adjust this value based on its requirements.
>
> Wouldn't it be better to simply use dev->tx_queue_len to cap the qlen of
> each queue? This also seems to be more consistent with h/w multi-queues.
Makes sense. Michael, any ideas on this?
>
> Also is there any objection to increase MAX_TAP_QUEUES from 8 to 16?
> Yes it will take up more space in struct tun_struct. But we are
> hitting the perf limit of 8 queues.
It's not only tun_struct; another issue is sizeof(struct netdev_queue),
which is currently 320 bytes. With 16 queues the queue array may exceed
4096 bytes, which leads to high-order page allocation. That needs a
solution such as a flex array or an array of pointers.
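A rough sketch of the size math and of the pointer-array alternative (the
names nqueues/txq below are illustrative, not from an actual patch):

    /* 16 queues * sizeof(struct netdev_queue) (~320 bytes) = 5120 bytes,
     * which no longer fits in one 4096-byte page, so a flat kcalloc() of
     * the queue array becomes an order-1 allocation. An array of
     * per-queue pointers keeps each allocation within a page:
     */
    struct netdev_queue **txq;
    unsigned int i;

    txq = kcalloc(nqueues, sizeof(*txq), GFP_KERNEL);  /* 16 * 8 bytes */
    if (!txq)
            return -ENOMEM;
    for (i = 0; i < nqueues; i++) {
            txq[i] = kzalloc(sizeof(struct netdev_queue), GFP_KERNEL);
            if (!txq[i]) {
                    while (i--)
                            kfree(txq[i]);
                    kfree(txq);
                    return -ENOMEM;
            }
    }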
Btw, I have draft patches for both and will post them as RFC.
Thanks
> Thanks,
>
> Jerry
* Re: qlen check in tun.c
From: Jerry Chu @ 2013-06-19 19:39 UTC
To: Jason Wang; +Cc: netdev@vger.kernel.org, Michael S. Tsirkin
Hi Jason,
On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>> In tun_net_xmit() the max qlen is computed as
>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>> latter may be way too small, forcing one to adjust txqueuelen based
>> on number of queues created. (Well the default txqueuelen of
>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>
> Hi Jerry:
>
> Do you have some test result of this? Anyway, tun allows userspace to
> adjust this value based on its requirement.
Sure, but the default size of 500 is just way too small. The queue
overflows even with a simple single-stream throughput test through
Openvswitch, due to a CPU scheduler anomaly. On our loaded multi-stream
test even 8192 can't prevent queue overflow, but with 8192 we'll be deep
into "buffer bloat" territory.
We haven't figured out an optimal strategy for throughput vs. latency,
but suffice it to say 500 is too small.
Jerry
>>
>> Wouldn't it be better to simply use dev->tx_queue_len to cap the qlen of
>> each queue? This also seems to be more consistent with h/w multi-queues.
>
> Make sense. Michael, any ideas on this?
>>
>> Also is there any objection to increase MAX_TAP_QUEUES from 8 to 16?
>> Yes it will take up more space in struct tun_struct. But we are
>> hitting the perf limit of 8 queues.
>
> Not only the tun_struct, another issue of this is sizeof(netdev_queue)
> which is 320 currently, if we use 16, it may be greater than 4096 which
> lead high order page allocation. Need a solution such as flex array or
> array of pointers.
>
> Btw, I have draft patch on both, will post as rfc.
>
> Thanks
>> Thanks,
>>
>> Jerry
>
* Re: qlen check in tun.c
From: Rick Jones @ 2013-06-19 19:49 UTC
To: Jerry Chu; +Cc: Jason Wang, netdev@vger.kernel.org, Michael S. Tsirkin
On 06/19/2013 12:39 PM, Jerry Chu wrote:
> Hi Jason,
>
> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>>> In tun_net_xmit() the max qlen is computed as
>>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>>> latter may be way too small, forcing one to adjust txqueuelen based
>>> on number of queues created. (Well the default txqueuelen of
>>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>>
>> Hi Jerry:
>>
>> Do you have some test result of this? Anyway, tun allows userspace to
>> adjust this value based on its requirement.
>
> Sure, but the default size of 500 is just way too small. queue overflows even
> with a simple single-stream throughput test through Openvswitch due to CPU
> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
> queue overflow. But then with 8192 we'll be deep into the "buffer
> bloat" territory.
Assuming this single-stream is a netperf test, what happens when you cap
the socket buffers to 724000 bytes? Put another way, is this simply a
situation where the autotuning of the socket buffers/window is taking a
connection somewhere it shouldn't go?
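(For reference, a minimal sketch of such a cap from the application side -
explicitly setting SO_SNDBUF/SO_RCVBUF also locks out autotuning for that
socket; with netperf the rough equivalent is the test-specific -s/-S
options:)

    #include <sys/socket.h>

    /* Cap both socket buffers to 724000 bytes before connecting; the
     * kernel then stops autotuning them for this socket. Sketch only;
     * fd is an already-created TCP socket.
     */
    static int cap_socket_buffers(int fd)
    {
            int bytes = 724000;

            if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
                    return -1;
            return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
    }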
> We haven't figured out an optimal strategy for thruput vs latency, but
> suffice to say 500 is too small.
Just what is the bandwidthXdelay product through the openvswitch?
happy benchmarking,
rick
* Re: qlen check in tun.c
From: Jerry Chu @ 2013-06-19 20:42 UTC
To: Rick Jones; +Cc: Jason Wang, netdev@vger.kernel.org, Michael S. Tsirkin
On Wed, Jun 19, 2013 at 12:49 PM, Rick Jones <rick.jones2@hp.com> wrote:
> On 06/19/2013 12:39 PM, Jerry Chu wrote:
>>
>> Hi Jason,
>>
>> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>
>>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>>>>
>>>> In tun_net_xmit() the max qlen is computed as
>>>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>>>> latter may be way too small, forcing one to adjust txqueuelen based
>>>> on number of queues created. (Well the default txqueuelen of
>>>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>>>
>>>
>>> Hi Jerry:
>>>
>>> Do you have some test result of this? Anyway, tun allows userspace to
>>> adjust this value based on its requirement.
>>
>>
>> Sure, but the default size of 500 is just way too small. queue overflows
>> even
>> with a simple single-stream throughput test through Openvswitch due to CPU
>> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
>> queue overflow. But then with 8192 we'll be deep into the "buffer
>> bloat" territory.
>
>
> Assuming this single-stream is a netperf test, what happens when you cap the
> socket buffers to 724000 bytes? Put another way, is this simply a situation
> where the autotuning of the socket buffers/window is taking a connection
> somewhere it shouldn't go?
You have a good point - for single-stream netperf the TCP window seems
to grow much larger than necessary. Manually capping the socket buffer
seems to make the problem go away without hurting throughput - but only
to some extent. Unfortunately manual settings are undesirable, and the
autotuning code is difficult to "tune".
>
>
>> We haven't figured out an optimal strategy for thruput vs latency, but
>> suffice to say 500 is too small.
>
>
> Just what is the bandwidthXdelay product through the openvswitch?
Unlike a traditional NIC, for tuntap it'd be CPU bandwidth times
scheduling delay. Both can have a large variance. I haven't figured out
how to right-size the qlen in this scenario.
Jerry
>
> happy benchmarking,
>
> rick
* Re: qlen check in tun.c
From: Rick Jones @ 2013-06-19 21:37 UTC
To: Jerry Chu; +Cc: Jason Wang, netdev@vger.kernel.org, Michael S. Tsirkin
On 06/19/2013 01:42 PM, Jerry Chu wrote:
> On Wed, Jun 19, 2013 at 12:49 PM, Rick Jones <rick.jones2@hp.com> wrote:
>> Assuming this single-stream is a netperf test, what happens when you cap the
>> socket buffers to 724000 bytes? Put another way, is this simply a situation
>> where the autotuning of the socket buffers/window is taking a connection
>> somewhere it shouldn't go?
>
> You have a good point - for single netperf streaming the TCP window seems to
> grow much larger than necessary. Manually capping socket buffer seems to make
> the problem go away without hurting throughput - but only to some extent.
> Unfortunately manual setting is undesirable, and the autotuning code
> is difficult to "tune".
>
...
>> Just what is the bandwidthXdelay product through the openvswitch?
>
> Unlike the traditional NIC, for tuntap it'd be CPU b/w times scheduling delay.
> Both can have a large variance. I haven't figured out how to right size the
> qlen in this scenario.
In perhaps overly broad, handwaving terms, doesn't wireless have a
similar problem with highly variable latency/delay?
In theory, if your max scheduling delay is 10 milliseconds, your
still-not-large-enough 8192-entry queue should still get you 1 GB/s,
assuming the drain rate between scheduling delays is >> 1 GB/s. Is there
really an expectation/requirement to do better than that, or to cope with
even larger scheduling delays? The existence of 40 and 100 GbE (even
bonded 10 GbE) notwithstanding, once one is talking about 1 GB/s one is
looking more at SR-IOV, I would think, not going through an Openvswitch.
Do you actually still see single-stream drops at 8192? That should be
something like 11 MB of queuing - I don't think I've seen tcp_[wr]mem go
above 6 MB thus far...
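(Spelling out the arithmetic: 8192 entries of ~1448-byte MSS-sized
segments is roughly 11.9 MB, and a 1 GB/s stream stalled for 10
milliseconds needs about 10 MB of buffering, so those two figures are
indeed of the same order.)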
rick
* Re: qlen check in tun.c
From: Michael S. Tsirkin @ 2013-06-20 8:07 UTC
To: Jerry Chu; +Cc: Jason Wang, netdev@vger.kernel.org
On Wed, Jun 19, 2013 at 12:39:34PM -0700, Jerry Chu wrote:
> Hi Jason,
>
> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
> > On 06/19/2013 10:31 AM, Jerry Chu wrote:
> >> In tun_net_xmit() the max qlen is computed as
> >> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
> >> latter may be way too small, forcing one to adjust txqueuelen based
> >> on number of queues created. (Well the default txqueuelen of
> >> 500/TUN_READQ_SIZE already seems too small even for single queue.)
> >
> > Hi Jerry:
> >
> > Do you have some test result of this? Anyway, tun allows userspace to
> > adjust this value based on its requirement.
>
> Sure, but the default size of 500 is just way too small. queue overflows even
> with a simple single-stream throughput test through Openvswitch due to CPU
> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
> queue overflow. But then with 8192 we'll be deep into the "buffer
> bloat" territory.
> We haven't figured out an optimal strategy for thruput vs latency, but
> suffice to
> say 500 is too small.
>
> Jerry
Maybe TSO is off for you?
With TSO you can get 64Kbyte packets, 500 of these is 30 Mbytes!
We really should consider setting byte limits, not packet limits.
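A minimal sketch of what a byte cap could look like in tun_net_xmit();
the queued_bytes and byte_limit fields are hypothetical, not existing tun
members:

    /* Hypothetical byte-based cap (sketch only; tun today counts only
     * packets per queue). queued_bytes would be decremented again when
     * tun_do_read() hands the packet to userspace.
     */
    if (tfile->queued_bytes + skb->len > tun->byte_limit)
            goto drop;
    tfile->queued_bytes += skb->len;
    skb_queue_tail(&tfile->socket.sk->sk_receive_queue, skb);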
* Re: qlen check in tun.c
From: Jason Wang @ 2013-06-21 6:44 UTC
To: Jerry Chu; +Cc: netdev@vger.kernel.org, Michael S. Tsirkin
On 06/20/2013 03:39 AM, Jerry Chu wrote:
> Hi Jason,
>
> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>>> In tun_net_xmit() the max qlen is computed as
>>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>>> latter may be way too small, forcing one to adjust txqueuelen based
>>> on number of queues created. (Well the default txqueuelen of
>>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>> Hi Jerry:
>>
>> Do you have some test result of this? Anyway, tun allows userspace to
>> adjust this value based on its requirement.
> Sure, but the default size of 500 is just way too small. queue overflows even
> with a simple single-stream throughput test through Openvswitch due to CPU
> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
> queue overflow. But then with 8192 we'll be deep into the "buffer
> bloat" territory.
Does the overflow also happen with a bridge? If not, maybe it's a bug in
openvswitch?
Btw, if the scheduler brings unexpected latency, increasing the tx queue
length may just make things worse.
> We haven't figured out an optimal strategy for thruput vs latency, but
> suffice to
> say 500 is too small.
>
> Jerry
>
>>> Wouldn't it be better to simply use dev->tx_queue_len to cap the qlen of
>>> each queue? This also seems to be more consistent with h/w multi-queues.
>> Make sense. Michael, any ideas on this?
>>> Also is there any objection to increase MAX_TAP_QUEUES from 8 to 16?
>>> Yes it will take up more space in struct tun_struct. But we are
>>> hitting the perf limit of 8 queues.
>> Not only the tun_struct, another issue of this is sizeof(netdev_queue)
>> which is 320 currently, if we use 16, it may be greater than 4096 which
>> lead high order page allocation. Need a solution such as flex array or
>> array of pointers.
>>
>> Btw, I have draft patch on both, will post as rfc.
>>
>> Thanks
>>> Thanks,
>>>
>>> Jerry
* Re: qlen check in tun.c
From: Jason Wang @ 2013-06-24 5:13 UTC
To: Michael S. Tsirkin; +Cc: Jerry Chu, netdev@vger.kernel.org
On 06/20/2013 04:07 PM, Michael S. Tsirkin wrote:
> On Wed, Jun 19, 2013 at 12:39:34PM -0700, Jerry Chu wrote:
>> Hi Jason,
>>
>> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>>>> In tun_net_xmit() the max qlen is computed as
>>>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>>>> latter may be way too small, forcing one to adjust txqueuelen based
>>>> on number of queues created. (Well the default txqueuelen of
>>>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>>> Hi Jerry:
>>>
>>> Do you have some test result of this? Anyway, tun allows userspace to
>>> adjust this value based on its requirement.
>> Sure, but the default size of 500 is just way too small. queue overflows even
>> with a simple single-stream throughput test through Openvswitch due to CPU
>> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
>> queue overflow. But then with 8192 we'll be deep into the "buffer
>> bloat" territory.
>> We haven't figured out an optimal strategy for thruput vs latency, but
>> suffice to
>> say 500 is too small.
>>
>> Jerry
> Maybe TSO is off for you?
> With TSO you can get 64Kbyte packets, 500 of these is 30 Mbytes!
> We really should consider setting byte limits, not packet limits.
Do you mean BQL for tuntap?
* Re: qlen check in tun.c
From: Jerry Chu @ 2013-06-25 22:23 UTC
To: Michael S. Tsirkin; +Cc: Jason Wang, netdev@vger.kernel.org
On Thu, Jun 20, 2013 at 1:07 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Jun 19, 2013 at 12:39:34PM -0700, Jerry Chu wrote:
>> Hi Jason,
>>
>> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>> > On 06/19/2013 10:31 AM, Jerry Chu wrote:
>> >> In tun_net_xmit() the max qlen is computed as
>> >> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>> >> latter may be way too small, forcing one to adjust txqueuelen based
>> >> on number of queues created. (Well the default txqueuelen of
>> >> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>> >
>> > Hi Jerry:
>> >
>> > Do you have some test result of this? Anyway, tun allows userspace to
>> > adjust this value based on its requirement.
>>
>> Sure, but the default size of 500 is just way too small. queue overflows even
>> with a simple single-stream throughput test through Openvswitch due to CPU
>> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
>> queue overflow. But then with 8192 we'll be deep into the "buffer
>> bloat" territory.
>> We haven't figured out an optimal strategy for thruput vs latency, but
>> suffice to
>> say 500 is too small.
>>
>> Jerry
>
> Maybe TSO is off for you?
> With TSO you can get 64Kbyte packets, 500 of these is 30 Mbytes!
> We really should consider setting byte limits, not packet limits.
Sorry for the delay. TSO was on when I was seeing lots of packet drops.
But I realized the catch was that GRO was off, which caused lots of
MTU-size packets to pile up on the receive side, overflowing the small
tuntap queue.
I just finished implementing GRE support in the GRO stack. When I turned
it on, there were far fewer packet drops. I do notice now that the many
ACKs triggered by the throughput tests will cause the tuntap queue to
overflow.
In any case, a large tx queue should probably come with some queue
management or BQL logic.
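For reference, the stock BQL hooks would map onto tuntap roughly as below,
with "completion" meaning userspace has read the packet (txq_index and
copied are placeholder names; this is a sketch, not a tun patch):

    /* In tun_net_xmit(), after queuing the skb for userspace: */
    netdev_tx_sent_queue(netdev_get_tx_queue(dev, txq_index), skb->len);

    /* In tun_do_read(), once userspace has consumed the packet: */
    netdev_tx_completed_queue(netdev_get_tx_queue(tun->dev, txq_index),
                              1, copied);

    /* And when a queue is detached or flushed: */
    netdev_tx_reset_queue(netdev_get_tx_queue(tun->dev, txq_index));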
Jerry
* Re: qlen check in tun.c
From: Jason Wang @ 2013-06-26 5:23 UTC
To: Jerry Chu; +Cc: Michael S. Tsirkin, netdev@vger.kernel.org
On 06/26/2013 06:23 AM, Jerry Chu wrote:
> On Thu, Jun 20, 2013 at 1:07 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Wed, Jun 19, 2013 at 12:39:34PM -0700, Jerry Chu wrote:
>>> Hi Jason,
>>>
>>> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
>>>>> In tun_net_xmit() the max qlen is computed as
>>>>> dev->tx_queue_len / tun->numqueues. For multi-queue configuration the
>>>>> latter may be way too small, forcing one to adjust txqueuelen based
>>>>> on number of queues created. (Well the default txqueuelen of
>>>>> 500/TUN_READQ_SIZE already seems too small even for single queue.)
>>>> Hi Jerry:
>>>>
>>>> Do you have some test result of this? Anyway, tun allows userspace to
>>>> adjust this value based on its requirement.
>>> Sure, but the default size of 500 is just way too small. queue overflows even
>>> with a simple single-stream throughput test through Openvswitch due to CPU
>>> scheduler anomaly. On our loaded multi-stream test even 8192 can't prevent
>>> queue overflow. But then with 8192 we'll be deep into the "buffer
>>> bloat" territory.
>>> We haven't figured out an optimal strategy for thruput vs latency, but
>>> suffice to
>>> say 500 is too small.
>>>
>>> Jerry
>> Maybe TSO is off for you?
>> With TSO you can get 64Kbyte packets, 500 of these is 30 Mbytes!
>> We really should consider setting byte limits, not packet limits.
> Sorry for the delay. TSO was on when I was seeing lots of pkts drops.
> But I realized the catch was GRO was off, which caused lots of MTU
> size pkts to pile up on the receive side overflowing the small tuntap
> queue.
>
> I just finished implementing GRE support in the GRO stack. When I
> turned it on, there were much less pkt drops. I do notice now the many
> acks triggered by the thruput tests will cause the tuntap queue to
> overflow.
Looks like you've modified the tuntap code, since transmitting GRE GSO
packets is currently forbidden.
>
> In any case, with a large tx queue there should probably have some
> queue mgmt or BQL logic going with it.
It's not hard to do BQL for tuntap, but it may cause packets to be
queued in the qdisc, which seems to conflict with commit
5d097109257c03a71845729f8db6b5770c4bbedc ("tun: only queue packets on
device"), which does the queuing only in the device.
Btw, I suspect this may be another reason the packets are being dropped
in your case.
> Jerry
* Re: qlen check in tun.c
From: Jason Wang @ 2013-06-26 5:44 UTC
To: Jerry Chu; +Cc: Michael S. Tsirkin, netdev@vger.kernel.org
On 06/26/2013 01:32 PM, Jerry Chu wrote:
> On Tue, Jun 25, 2013 at 10:23 PM, Jason Wang <jasowang@redhat.com> wrote:
>
> On 06/26/2013 06:23 AM, Jerry Chu wrote:
> > On Thu, Jun 20, 2013 at 1:07 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> On Wed, Jun 19, 2013 at 12:39:34PM -0700, Jerry Chu wrote:
> >>> Hi Jason,
> >>>
> >>> On Tue, Jun 18, 2013 at 8:29 PM, Jason Wang <jasowang@redhat.com> wrote:
> >>>> On 06/19/2013 10:31 AM, Jerry Chu wrote:
> >>>>> In tun_net_xmit() the max qlen is computed as
> >>>>> dev->tx_queue_len / tun->numqueues. For multi-queue
> configuration the
> >>>>> latter may be way too small, forcing one to adjust
> txqueuelen based
> >>>>> on number of queues created. (Well the default txqueuelen of
> >>>>> 500/TUN_READQ_SIZE already seems too small even for single
> queue.)
> >>>> Hi Jerry:
> >>>>
> >>>> Do you have some test result of this? Anyway, tun allows
> userspace to
> >>>> adjust this value based on its requirement.
> >>> Sure, but the default size of 500 is just way too small. queue
> overflows even
> >>> with a simple single-stream throughput test through
> Openvswitch due to CPU
> >>> scheduler anomaly. On our loaded multi-stream test even 8192
> can't prevent
> >>> queue overflow. But then with 8192 we'll be deep into the "buffer
> >>> bloat" territory.
> >>> We haven't figured out an optimal strategy for thruput vs
> latency, but
> >>> suffice to
> >>> say 500 is too small.
> >>>
> >>> Jerry
> >> Maybe TSO is off for you?
> >> With TSO you can get 64Kbyte packets, 500 of these is 30 Mbytes!
> >> We really should consider setting byte limits, not packet limits.
> > Sorry for the delay. TSO was on when I was seeing lots of pkts
> drops.
> > But I realized the catch was GRO was off, which caused lots of MTU
> > size pkts to pile up on the receive side overflowing the small
> tuntap
> > queue.
> >
> > I just finished implementing GRE support in the GRO stack. When I
> > turned it on, there were much less pkt drops. I do notice now
> the many
> > acks triggered by the thruput tests will cause the tuntap queue to
> > overflow.
>
> Looks like you've modified tuntap codes since currently transmit
> GRE gso
> packet were forbidden.
>
>
> Not sure what you meant above. The change I made was all in the GRO
> stack (to support GRE pkts). No change to tuntap code. (Hopefully I can
> find the time to submit the patch in the near future.)
I inferred that from the above, since you said "I just finished
implementing GRE support in the GRO stack. When I turned it on, there
were much less pkt drops." Since virtio-net does not use GRO, I thought
GRE GRO was enabled by the host and you saw fewer packet drops in
tuntap? If so, that looks strange, since tuntap can't transmit GRO GSO
packets.
>
>
> >
> > In any case, with a large tx queue there should probably have some
> > queue mgmt or BQL logic going with it.
>
> It's not hard to do BQL for tuntap, but since it may cause packets
> to be
> queued in qdisc which seems conflict with
> 5d097109257c03a71845729f8db6b5770c4bbedc (tun: only queue packets on
> device) who just does the queuing in device.
>
>
> I don't see how changing the max qlen in the device conflicts with the
> above change, which is simply done by never flow-controlling the qdisc
> above it.
I mean BQL may conflict with that change. Just changing the max qlen is
OK, but it may be hard to find a value that is good for all kinds of
workloads.
>
>
> Btw, I suspect this may be another reason to cause the packets to be
> dropped in your case.
>
> > Jerry
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> <mailto:majordomo@vger.kernel.org>
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
Thread overview: 12+ messages
2013-06-19 2:31 qlen check in tun.c Jerry Chu
2013-06-19 3:29 ` Jason Wang
2013-06-19 19:39 ` Jerry Chu
2013-06-19 19:49 ` Rick Jones
2013-06-19 20:42 ` Jerry Chu
2013-06-19 21:37 ` Rick Jones
2013-06-20 8:07 ` Michael S. Tsirkin
2013-06-24 5:13 ` Jason Wang
2013-06-25 22:23 ` Jerry Chu
2013-06-26 5:23 ` Jason Wang
[not found] ` <CAPshTCjbOnZJ6c2tyLbBqZ3Pz=xi+cBMvi=0BqPDyiXJ+jDDOA@mail.gmail.com>
2013-06-26 5:44 ` Jason Wang
2013-06-21 6:44 ` Jason Wang