* VLAN I/F's and TX queue.
@ 2010-05-03 11:34 Joakim Tjernlund
  0 siblings, 0 replies; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-03 11:34 UTC (permalink / raw)
  To: netdev


We noted dropped pkgs on our VLAN interfaces and I started to look
for a cause. Here is an ifconfig example:

eth0      Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8886910 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8880219 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:1626842951 (1.5 GiB)  TX bytes:1555540810 (1.4 GiB)

eth0.1    Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2467090557 (2.2 GiB)  TX bytes:2480246455 (2.3 GiB)

eth0.1.1  Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2458437901 (2.2 GiB)  TX bytes:2471598683 (2.3 GiB)

Here I note that txqueuelen is 0 for eth0.1/eth0.1.1 and 100 for eth0, and
that only eth0.1 and eth0.1.1 drop pkgs. It feels as if eth0.1
bypasses eth0's tx queue and passes pkgs directly to the HW driver. Is that so?
If so, that feels a bit strange and I am not sure how best to
fix this. Any ideas?

Using kernel 2.6.33

     Jocke



* Re: VLAN I/F's and TX queue.
       [not found] <OF5A42C874.3AF220FE-ONC1257718.003ABC6E-C1257718.003F94D2@LocalDomain>
@ 2010-05-07  8:04 ` Joakim Tjernlund
  2010-05-07  8:53   ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-07  8:04 UTC (permalink / raw)
  To: netdev

So I did some more testing:
two nodes, A and B, connected over a slow link.
Create two VLANs as above and start pinging from A to B
with pkg size 9600, using a few (4-10) parallel ping processes.

Now I see dropped packets on B, the receiver of pings, and no
pkg loss on A.

1) since the link is symmetrical, why do I only see pkg loss
   at B?

2) pkg loss in B only manifests on the VLAN interfaces and
   always in pairs, as if one lost pkg is counted twice?

3) I would expect lost pkgs to be accounted on eth0 instead of
   the VLAN interface(s) since that is where the pkg is lost, why
   isn't it so?

    Jocke



* Re: VLAN I/F's and TX queue.
  2010-05-07  8:04 ` VLAN I/F's and TX queue Joakim Tjernlund
@ 2010-05-07  8:53   ` Eric Dumazet
  2010-05-07  9:29     ` Joakim Tjernlund
  2010-05-10 14:26     ` Patrick McHardy
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Dumazet @ 2010-05-07  8:53 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev, Patrick McHardy

On Friday, 07 May 2010 at 10:04 +0200, Joakim Tjernlund wrote:
> So I did some more testing
> two nodes A and B connected over a slow link.
> Create two VLAN's as above and start pinging from A to B
> with pkg size 9600, start a few(4-10) parallel ping processes.
> 
> Now I see dropped packages on B, the receiver of pings, and no
> pkg loss on A.
> 

dropped on RX path or TX path ?

> 1) since the link is symmetrical, why do I only see pkg loss
>    at B?
> 
> 2) pkg loss in B only manifests on the VLAN's interfaces and
>    always in pair as if one lost pkg is counted twice?
> 

Congestion notifications can be stacked since commit cbbef5e183079
(vlan/macvlan: propagate transmission state to upper layers)

> 3) I would expect lost pkgs to be accounted on eth0 instead of
>    the VLAN interface(s) since that is where the pkg is lost, why
>    isn't it so?

You try to send packets on eth0.XXX, some are dropped, and accounted for
on eth0.XXX stats. What is wrong with this ?

If you want to avoid this, just add queues to your vlans

ip link add link eth0 eth0.103 txqueuelen 100 type vlan id 103

Patrick what do you think of special casing NET_XMIT_CN ?


diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b5249c5..c671b1a 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -327,6 +327,8 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
+	ret = net_xmit_eval(ret);
+
 	if (likely(ret == NET_XMIT_SUCCESS)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
@@ -353,6 +355,8 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
+	ret = net_xmit_eval(ret);
+
 	if (likely(ret == NET_XMIT_SUCCESS)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;




* Re: VLAN I/F's and TX queue.
  2010-05-07  8:53   ` Eric Dumazet
@ 2010-05-07  9:29     ` Joakim Tjernlund
  2010-05-10 14:33       ` Patrick McHardy
  2010-05-10 14:26     ` Patrick McHardy
  1 sibling, 1 reply; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-07  9:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Patrick McHardy, netdev

Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/05/07 10:53:23:
>
> On Friday, 07 May 2010 at 10:04 +0200, Joakim Tjernlund wrote:
> > So I did some more testing
> > two nodes A and B connected over a slow link.
> > Create two VLAN's as above and start pinging from A to B
> > with pkg size 9600, start a few(4-10) parallel ping processes.
> >
> > Now I see dropped packages on B, the receiver of pings, and no
> > pkg loss on A.
> >
>
> dropped on RX path or TX path ?

On the TX path (see the ifconfig listing above)

>
> > 1) since the link is symmetrical, why do I only see pkg loss
> >    at B?
> >
> > 2) pkg loss in B only manifests on the VLAN's interfaces and
> >    always in pair as if one lost pkg is counted twice?
> >
>
> Congestion notifications can be stacked since commit cbbef5e183079
> (vlan/macvlan: propagate transmission state to upper layers)

I see.

>
> > 3) I would expect lost pkgs to be accounted on eth0 instead of
> >    the VLAN interface(s) since that is where the pkg is lost, why
> >    isn't it so?
>
> You try to send packets on eth0.XXX, some are dropped, and accounted for
> on eth0.XXX stats. What is wrong with this ?

In this case one lost pkg is accounted for twice, once on eth0.1 and
once more on eth0.1.1. Note that eth0.1.1 is stacked on
top of eth0.1

I would at least expect eth0 to account for lost pkgs too.
I was confused by the current accounting as I knew that
the underlying HW I/F should be the only I/F that could
drop pkgs.

>
> If you want to avoid this, just add queues to your vlans
>
> ip link add link eth0 eth0.103 txqueuelen 100 type vlan id 103

From memory now, but that didn't help. It still accounts pkgs
as described. Why would where pkgs are accounted be affected by
having a queue or not?

  Jocke



* Re: VLAN I/F's and TX queue.
  2010-05-07  8:53   ` Eric Dumazet
  2010-05-07  9:29     ` Joakim Tjernlund
@ 2010-05-10 14:26     ` Patrick McHardy
  2010-05-10 14:36       ` Eric Dumazet
  1 sibling, 1 reply; 14+ messages in thread
From: Patrick McHardy @ 2010-05-10 14:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Joakim Tjernlund, netdev

Eric Dumazet wrote:
> On Friday, 07 May 2010 at 10:04 +0200, Joakim Tjernlund wrote:
>> So I did some more testing
>> two nodes A and B connected over a slow link.
>> Create two VLAN's as above and start pinging from A to B
>> with pkg size 9600, start a few(4-10) parallel ping processes.
>>
>> Now I see dropped packages on B, the receiver of pings, and no
>> pkg loss on A.
>>
> 
> dropped on RX path or TX path ?
> 
>> 1) since the link is symmetrical, why do I only see pkg loss
>>    at B?
>>
>> 2) pkg loss in B only manifests on the VLAN's interfaces and
>>    always in pair as if one lost pkg is counted twice?
>>
> 
> Congestion notifications can be stacked since commit cbbef5e183079
> (vlan/macvlan: propagate transmission state to upper layers)
> 
>> 3) I would expect lost pkgs to be accounted on eth0 instead of
>>    the VLAN interface(s) since that is where the pkg is lost, why
>>    isn't it so?
> 
> You try to send packets on eth0.XXX, some are dropped, and accounted for
> on eth0.XXX stats. What is wrong with this ?
> 
> If you want to avoid this, just add queues to your vlans
> 
> ip link add link eth0 eth0.103 txqueuelen 100 type vlan id 103
> 
> Patrick what do you think of special casing NET_XMIT_CN ?

Is the intention just to avoid accounting the packet as dropped?
That seems fine to me since in case of NET_XMIT_CN it's actually
not the currently transmitted packet that was dropped.

But part of the intention of the above mentioned patch was actually
to inform higher layers of congestion so they can take action if
desired, which would be defeated by this patch.

> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index b5249c5..c671b1a 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -327,6 +327,8 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
>  	len = skb->len;
>  	ret = dev_queue_xmit(skb);
>  
> +	ret = net_xmit_eval(ret);
> +
>  	if (likely(ret == NET_XMIT_SUCCESS)) {
>  		txq->tx_packets++;
>  		txq->tx_bytes += len;
> @@ -353,6 +355,8 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
>  	len = skb->len;
>  	ret = dev_queue_xmit(skb);
>  
> +	ret = net_xmit_eval(ret);
> +
>  	if (likely(ret == NET_XMIT_SUCCESS)) {
>  		txq->tx_packets++;
>  		txq->tx_bytes += len;
> 
> 



* Re: VLAN I/F's and TX queue.
  2010-05-07  9:29     ` Joakim Tjernlund
@ 2010-05-10 14:33       ` Patrick McHardy
  2010-05-10 14:50         ` Joakim Tjernlund
  0 siblings, 1 reply; 14+ messages in thread
From: Patrick McHardy @ 2010-05-10 14:33 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, netdev

Joakim Tjernlund wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/05/07 10:53:23:
>>> 3) I would expect lost pkgs to be accounted on eth0 instead of
>>>    the VLAN interface(s) since that is where the pkg is lost, why
>>>    isn't it so?
>> You try to send packets on eth0.XXX, some are dropped, and accounted for
>> on eth0.XXX stats. What is wrong with this ?
> 
> In this case one lost pkg is accounted for twice, once on eth0.1 and
> once more on eth0.1.1. Note that eth0.1.1 is stacked on
> top of eth0.1
> 
> I would at least expect eth0 to also account lost pkgs too.
> I was confused by the current accounting as I knew that
> the underlying HW I/F should be the only I/F that could
> drop pkgs.

In case of NET_XMIT_CN, the packet is dropped by the qdisc before
it reaches eth0, so it's only accounted on the upper devices.

>> If you want to avoid this, just add queues to your vlans
>>
>> ip link add link eth0 eth0.103 txqueuelen 100 type vlan id 103
> 
> From memory now, but that didn't help. Still accounts pkgs
> as described. Why would where to account pkgs be affected by
> queue or no queue?

If a queue is used on the vlan device, it will queue the packet
until the lower device is able to transmit it (unless its own
queue overflows).


* Re: VLAN I/F's and TX queue.
  2010-05-10 14:26     ` Patrick McHardy
@ 2010-05-10 14:36       ` Eric Dumazet
  2010-05-10 14:41         ` Patrick McHardy
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2010-05-10 14:36 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Joakim Tjernlund, netdev

On Monday, 10 May 2010 at 16:26 +0200, Patrick McHardy wrote:

> 
> Is the intention just to avoid accounting the packet as dropped?
> That seems fine to me since in case of NET_XMIT_CN its actually
> not the currently transmitted packet that was dropped.
> 
> But part of the intention of the above mentioned patch was actually
> to inform higher layers of congestion so they can take action if
> desired, which would be defeated by this patch.
> 

I see, so maybe we want the following patch instead?

(letting NET_XMIT_CN be passed to the caller, but accounting the current packet
as transmitted?)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b5249c5..55be908 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -327,7 +327,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
-	if (likely(ret == NET_XMIT_SUCCESS)) {
+	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
 	} else
@@ -353,7 +353,7 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
-	if (likely(ret == NET_XMIT_SUCCESS)) {
+	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
 	} else
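
To make the difference between the two proposals concrete, here is a small
userspace sketch. It is not kernel code: the return-code values and the
net_xmit_eval() macro are copied from include/linux/netdevice.h as I
understand them for this kernel era, while struct tx_stats and the two
xmit_*() helpers are hypothetical stand-ins for the VLAN transmit routine,
which is assumed to hand its return code back to the caller.

```c
#include <assert.h>

/* Values assumed to match include/linux/netdevice.h (2.6.33 era). */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02

/* Same shape as the kernel macro: fold "congestion" into "success". */
#define net_xmit_eval(e) ((e) == NET_XMIT_CN ? 0 : (e))

/* Hypothetical stand-in for the per-queue TX counters. */
struct tx_stats {
	unsigned long tx_packets;
	unsigned long tx_dropped;
};

/* Model of the first proposal: mask NET_XMIT_CN before accounting. */
static int xmit_masking_cn(struct tx_stats *s, int lower_ret)
{
	int ret = net_xmit_eval(lower_ret);

	if (ret == NET_XMIT_SUCCESS)
		s->tx_packets++;
	else
		s->tx_dropped++;
	return ret;	/* the caller can no longer see the congestion */
}

/* Model of the second proposal: count CN as sent, but still pass it up. */
static int xmit_passing_cn(struct tx_stats *s, int lower_ret)
{
	if (lower_ret == NET_XMIT_SUCCESS || lower_ret == NET_XMIT_CN)
		s->tx_packets++;
	else
		s->tx_dropped++;
	return lower_ret;	/* the congestion indication survives */
}
```

Both variants stop counting a congestion notification as a drop; only the
second one leaves NET_XMIT_CN visible to the upper layers.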




* Re: VLAN I/F's and TX queue.
  2010-05-10 14:36       ` Eric Dumazet
@ 2010-05-10 14:41         ` Patrick McHardy
  2010-05-10 14:51           ` Eric Dumazet
  2010-05-10 14:54           ` Joakim Tjernlund
  0 siblings, 2 replies; 14+ messages in thread
From: Patrick McHardy @ 2010-05-10 14:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Joakim Tjernlund, netdev

Eric Dumazet wrote:
> On Monday, 10 May 2010 at 16:26 +0200, Patrick McHardy wrote:
> 
>> Is the intention just to avoid accounting the packet as dropped?
>> That seems fine to me since in case of NET_XMIT_CN its actually
>> not the currently transmitted packet that was dropped.
>>
>> But part of the intention of the above mentioned patch was actually
>> to inform higher layers of congestion so they can take action if
>> desired, which would be defeated by this patch.
>>
> 
> I see, so maybe we want following patch instead ?
> 
> (letting NET_XMIT_CN be given to caller, but accounting current packet
> as transmitted ?)

Perfect, thanks. I'd suggest changing macvlan in a similar fashion
for consistency though.

In any case please feel free to add my

Acked-by: Patrick McHardy <kaber@trash.net>

> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index b5249c5..55be908 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -327,7 +327,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
>  	len = skb->len;
>  	ret = dev_queue_xmit(skb);
>  
> -	if (likely(ret == NET_XMIT_SUCCESS)) {
> +	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
>  		txq->tx_packets++;
>  		txq->tx_bytes += len;
>  	} else
> @@ -353,7 +353,7 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
>  	len = skb->len;
>  	ret = dev_queue_xmit(skb);
>  
> -	if (likely(ret == NET_XMIT_SUCCESS)) {
> +	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
>  		txq->tx_packets++;
>  		txq->tx_bytes += len;
>  	} else
> 
> 



* Re: VLAN I/F's and TX queue.
  2010-05-10 14:33       ` Patrick McHardy
@ 2010-05-10 14:50         ` Joakim Tjernlund
  2010-05-16  7:40           ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-10 14:50 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric Dumazet, netdev

Patrick McHardy <kaber@trash.net> wrote on 2010/05/10 16:33:00:
>
> Joakim Tjernlund wrote:
> > Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/05/07 10:53:23:
> >>> 3) I would expect lost pkgs to be accounted on eth0 instead of
> >>>    the VLAN interface(s) since that is where the pkg is lost, why
> >>>    isn't it so?
> >> You try to send packets on eth0.XXX, some are dropped, and accounted for
> >> on eth0.XXX stats. What is wrong with this ?
> >
> > In this case one lost pkg is accounted for twice, once on eth0.1 and
> > once more on eth0.1.1. Note that eth0.1.1 is stacked on
> > top of eth0.1
> >
> > I would at least expect eth0 to also account lost pkgs too.
> > I was confused by the current accounting as I knew that
> > the underlying HW I/F should be the only I/F that could
> > drop pkgs.
>
> In case of NET_XMIT_CN, the packet is dropped by the qdisc before
> it reaches eth0, so its only accounted on the upper devices.

hmm, I am afraid I don't follow this. Why would a pkg be dropped before
it reaches eth0?

>
> >> If you want to avoid this, just add queues to your vlans
> >>
> >> ip link add link eth0 eth0.103 txqueuelen 100 type vlan id 103
> >
> > From memory now, but that didn't help. Still accounts pkgs
> > as described. Why would where to account pkgs be affected by
> > queue or no queue?
>
> If a queue is used on the vlan device, it will queue the packet
> until the lower device is able to transmit it (unless its own
> queue overflows).

And if a pkg is lost, this also changes where the drop is accounted?
I don't understand this. The queue may prevent pkg loss to some degree,
but I don't get why a queue != 0 would change on which interface
lost pkgs are accounted.

      Jocke



* Re: VLAN I/F's and TX queue.
  2010-05-10 14:41         ` Patrick McHardy
@ 2010-05-10 14:51           ` Eric Dumazet
  2010-05-10 14:54           ` Joakim Tjernlund
  1 sibling, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2010-05-10 14:51 UTC (permalink / raw)
  To: Patrick McHardy, David Miller; +Cc: Joakim Tjernlund, netdev

On Monday, 10 May 2010 at 16:41 +0200, Patrick McHardy wrote:
> Eric Dumazet wrote:
> > On Monday, 10 May 2010 at 16:26 +0200, Patrick McHardy wrote:
> > 
> >> Is the intention just to avoid accounting the packet as dropped?
> >> That seems fine to me since in case of NET_XMIT_CN its actually
> >> not the currently transmitted packet that was dropped.
> >>
> >> But part of the intention of the above mentioned patch was actually
> >> to inform higher layers of congestion so they can take action if
> >> desired, which would be defeated by this patch.
> >>
> > 
> > I see, so maybe we want following patch instead ?
> > 
> > (letting NET_XMIT_CN be given to caller, but accounting current packet
> > as transmitted ?)
> 
> Perfect, thanks. I'd suggest to change macvlan in a similar fashion
> for consistency though.
> 
> In any case please feel free to add my
> 
> Acked-by: Patrick McHardy <kaber@trash.net>

Indeed, thanks !

[PATCH net-next-2.6] net: congestion notifications are not dropped packets

vlan/macvlan start_xmit() can inform caller of congestion with
NET_XMIT_CN return value. This doesnt mean packet was dropped.
Increment normal stat counters instead of tx_dropped.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
---
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 9a939d8..0f8cd95 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -243,7 +243,7 @@ netdev_tx_t macvlan_start_xmit(struct sk_buff *skb,
 	int ret;
 
 	ret = macvlan_queue_xmit(skb, dev);
-	if (likely(ret == NET_XMIT_SUCCESS)) {
+	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
 	} else
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b5249c5..55be908 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -327,7 +327,7 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
-	if (likely(ret == NET_XMIT_SUCCESS)) {
+	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
 	} else
@@ -353,7 +353,7 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
 	len = skb->len;
 	ret = dev_queue_xmit(skb);
 
-	if (likely(ret == NET_XMIT_SUCCESS)) {
+	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
 		txq->tx_packets++;
 		txq->tx_bytes += len;
 	} else




* Re: VLAN I/F's and TX queue.
  2010-05-10 14:41         ` Patrick McHardy
  2010-05-10 14:51           ` Eric Dumazet
@ 2010-05-10 14:54           ` Joakim Tjernlund
  2010-05-10 15:14             ` Eric Dumazet
  1 sibling, 1 reply; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-10 14:54 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric Dumazet, netdev

Patrick McHardy <kaber@trash.net> wrote on 2010/05/10 16:41:40:
>
> Eric Dumazet wrote:
> > On Monday, 10 May 2010 at 16:26 +0200, Patrick McHardy wrote:
> >
> >> Is the intention just to avoid accounting the packet as dropped?
> >> That seems fine to me since in case of NET_XMIT_CN its actually
> >> not the currently transmitted packet that was dropped.
> >>
> >> But part of the intention of the above mentioned patch was actually
> >> to inform higher layers of congestion so they can take action if
> >> desired, which would be defeated by this patch.
> >>
> >
> > I see, so maybe we want following patch instead ?
> >
> > (letting NET_XMIT_CN be given to caller, but accounting current packet
> > as transmitted ?)
>
> Perfect, thanks. I'd suggest to change macvlan in a similar fashion
> for consistency though.
>
> In any case please feel free to add my
>
> Acked-by: Patrick McHardy <kaber@trash.net>

Hmm, as I don't follow this, could you tell me where the dropped pkgs
are accounted in my case:
 eth0      Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:8886910 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8880219 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:100
           RX bytes:1626842951 (1.5 GiB)  TX bytes:1555540810 (1.4 GiB)

 eth0.1    Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
           TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:2467090557 (2.2 GiB)  TX bytes:2480246455 (2.3 GiB)

 eth0.1.1  Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
           TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:2458437901 (2.2 GiB)  TX bytes:2471598683 (2.3 GiB)



* Re: VLAN I/F's and TX queue.
  2010-05-10 14:54           ` Joakim Tjernlund
@ 2010-05-10 15:14             ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2010-05-10 15:14 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Patrick McHardy, netdev

On Monday, 10 May 2010 at 16:54 +0200, Joakim Tjernlund wrote:
> Patrick McHardy <kaber@trash.net> wrote on 2010/05/10 16:41:40:
> >
> > Eric Dumazet wrote:
> > > On Monday, 10 May 2010 at 16:26 +0200, Patrick McHardy wrote:
> > >
> > >> Is the intention just to avoid accounting the packet as dropped?
> > >> That seems fine to me since in case of NET_XMIT_CN its actually
> > >> not the currently transmitted packet that was dropped.
> > >>
> > >> But part of the intention of the above mentioned patch was actually
> > >> to inform higher layers of congestion so they can take action if
> > >> desired, which would be defeated by this patch.
> > >>
> > >
> > > I see, so maybe we want following patch instead ?
> > >
> > > (letting NET_XMIT_CN be given to caller, but accounting current packet
> > > as transmitted ?)
> >
> > Perfect, thanks. I'd suggest to change macvlan in a similar fashion
> > for consistency though.
> >
> > In any case please feel free to add my
> >
> > Acked-by: Patrick McHardy <kaber@trash.net>
> 
> hmm, as I don't follow this could you tell me where the dropped pkgs
> are accounted in my case:
>  eth0      Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:8886910 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:8880219 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:100
>            RX bytes:1626842951 (1.5 GiB)  TX bytes:1555540810 (1.4 GiB)
> 
>  eth0.1    Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
>            collisions:0 txqueuelen:0
>            RX bytes:2467090557 (2.2 GiB)  TX bytes:2480246455 (2.3 GiB)
> 
>  eth0.1.1  Link encap:Ethernet  HWaddr 00:AA:BB:CC:DD:EE
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:2163164 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:2161943 errors:0 dropped:98 overruns:0 carrier:0
>            collisions:0 txqueuelen:0
>            RX bytes:2458437901 (2.2 GiB)  TX bytes:2471598683 (2.3 GiB)
> 
> -

As I already told you, this is a side effect of your eth0.1.1
and eth0.1 having no queue.

The packet is not dropped; it's only a congestion indication.

If you use a queue on eth0.1.1, the queue itself handles the queueing
(obviously), and queue drops are not reported in 'ifconfig -a', but in
'tc -s -d qdisc'.
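
The "always in pairs" symptom can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: struct model_dev and
model_xmit() are hypothetical and do not exist in the kernel, the return-code
values are assumed to match include/linux/netdevice.h, and the
treat_cn_as_sent flag merely switches between the accounting before and after
the change discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed to match include/linux/netdevice.h. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_CN      0x02

/* Hypothetical stand-in for a net_device; not the kernel structure. */
struct model_dev {
	const char *name;
	struct model_dev *lower;	/* NULL for the physical device */
	unsigned long tx_packets;
	unsigned long tx_dropped;
};

/*
 * A queue-less VLAN device hands the packet to its lower device and
 * accounts according to the code that comes back. qdisc_ret is what the
 * qdisc of the physical device answers.
 */
static int model_xmit(struct model_dev *dev, int qdisc_ret,
		      int treat_cn_as_sent)
{
	int ret;

	if (dev->lower == NULL)
		return qdisc_ret;	/* eth0: its qdisc keeps its own stats */

	ret = model_xmit(dev->lower, qdisc_ret, treat_cn_as_sent);
	if (ret == NET_XMIT_SUCCESS ||
	    (treat_cn_as_sent && ret == NET_XMIT_CN))
		dev->tx_packets++;
	else
		dev->tx_dropped++;
	return ret;			/* passed further up unchanged */
}
```

With eth0.1.1 stacked on eth0.1 stacked on eth0, a single NET_XMIT_CN from
eth0's qdisc bumps tx_dropped on both VLAN devices and on neither counter of
eth0 when treat_cn_as_sent is 0, which matches the ifconfig listing.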





* Re: VLAN I/F's and TX queue.
  2010-05-10 14:50         ` Joakim Tjernlund
@ 2010-05-16  7:40           ` David Miller
  2010-05-16 14:22             ` Joakim Tjernlund
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2010-05-16  7:40 UTC (permalink / raw)
  To: joakim.tjernlund; +Cc: kaber, eric.dumazet, netdev

From: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Date: Mon, 10 May 2010 16:50:20 +0200

> Patrick McHardy <kaber@trash.net> wrote on 2010/05/10 16:33:00:
>>
>> Joakim Tjernlund wrote:
>> > Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/05/07 10:53:23:
>> >>> 3) I would expect lost pkgs to be accounted on eth0 instead of
>> >>>    the VLAN interface(s) since that is where the pkg is lost, why
>> >>>    isn't it so?
>> >> You try to send packets on eth0.XXX, some are dropped, and accounted for
>> >> on eth0.XXX stats. What is wrong with this ?
>> >
>> > In this case one lost pkg is accounted for twice, once on eth0.1 and
>> > once more on eth0.1.1. Note that eth0.1.1 is stacked on
>> > top of eth0.1
>> >
>> > I would at least expect eth0 to also account lost pkgs too.
>> > I was confused by the current accounting as I knew that
>> > the underlying HW I/F should be the only I/F that could
>> > drop pkgs.
>>
>> In case of NET_XMIT_CN, the packet is dropped by the qdisc before
>> it reaches eth0, so its only accounted on the upper devices.
> 
> hmm, I am afraid I don't follow this. Why would a pkg be dropped before
> it reaches eth0?

Because we have packet schedulers that sit before the device transmit
happens, and those packet schedulers enforce limits based upon
classification results or other criteria, and if those limits are
exceeded, packets are dropped and NET_XMIT_CN is returned back up into
the transmit path of the networking stack.

The device never sees that packet get submitted to its ->ndo_start_xmit()
routine, and this is entirely intentional.  And it is entirely intentional
that NET_XMIT_CN gets passed up into the caller, where protocols such as
TCP can key off this information to make congestion control decisions.
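
As a rough illustration of a packet scheduler sitting in front of the device,
here is a minimal userspace sketch in C. It is not kernel code: toy_qdisc,
toy_driver, toy_enqueue() and toy_drain() are hypothetical names, and the
policy (a plain packet-count limit that answers NET_XMIT_CN on overflow) is an
assumption made for the example. Which code a real qdisc returns depends on
the qdisc.

```c
/* Return codes as defined in the kernel's include/linux/netdevice.h. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_CN      0x02

/* Hypothetical bounded FIFO scheduler; only packet counts are tracked. */
struct toy_qdisc {
    unsigned int limit;        /* maximum number of queued packets */
    unsigned int qlen;         /* packets currently queued */
    unsigned long drops;       /* what a 'tc -s qdisc' style counter would show */
};

/* Hypothetical driver: counts how often its transmit routine is entered. */
struct toy_driver {
    unsigned long start_xmit_calls;
};

/* Enqueue one packet. Over the limit, the packet is rejected here and the
 * driver is never involved. */
static int toy_enqueue(struct toy_qdisc *q)
{
    if (q->qlen >= q->limit) {
        q->drops++;
        return NET_XMIT_CN;    /* congestion is signalled to the caller */
    }
    q->qlen++;
    return NET_XMIT_SUCCESS;
}

/* Hand up to 'budget' queued packets to the driver; returns how many. */
static unsigned int toy_drain(struct toy_qdisc *q, struct toy_driver *drv,
                              unsigned int budget)
{
    unsigned int sent = 0;

    while (q->qlen > 0 && sent < budget) {
        q->qlen--;
        drv->start_xmit_calls++;
        sent++;
    }
    return sent;
}
```

With limit set to 2, a third toy_enqueue() before any drain returns
NET_XMIT_CN and bumps the scheduler's own drop counter, and a later
toy_drain() enters the driver only twice.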



* Re: VLAN I/F's and TX queue.
  2010-05-16  7:40           ` David Miller
@ 2010-05-16 14:22             ` Joakim Tjernlund
  0 siblings, 0 replies; 14+ messages in thread
From: Joakim Tjernlund @ 2010-05-16 14:22 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, kaber, netdev


David Miller <davem@davemloft.net> wrote on 2010/05/16 09:40:41:
>
> From: Joakim Tjernlund <joakim.tjernlund@transmode.se>
> Date: Mon, 10 May 2010 16:50:20 +0200
>
> > Patrick McHardy <kaber@trash.net> wrote on 2010/05/10 16:33:00:
> >>
> >> Joakim Tjernlund wrote:
> >> > Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/05/07 10:53:23:
> >> >>> 3) I would expect lost pkgs to be accounted on eth0 instead of
> >> >>>    the VLAN interface(s) since that is where the pkg is lost, why
> >> >>>    isn't it so?
> >> >> You try to send packets on eth0.XXX, some are dropped, and accounted for
> >> >> on eth0.XXX stats. What is wrong with this ?
> >> >
> >> > In this case one lost pkg is accounted for twice, once on eth0.1 and
> >> > once more on eth0.1.1. Note that eth0.1.1 is stacked on
> >> > top of eth0.1
> >> >
> >> > I would at least expect eth0 to also account lost pkgs too.
> >> > I was confused by the current accounting as I knew that
> >> > the underlying HW I/F should be the only I/F that could
> >> > drop pkgs.
> >>
> >> In case of NET_XMIT_CN, the packet is dropped by the qdisc before
> >> it reaches eth0, so its only accounted on the upper devices.
> >
> > hmm, I am afraid I don't follow this. Why would a pkg be dropped before
> > it reaches eth0?
>
> Because we have packet schedulers that sit before the device transmit
> happens, and those packet schedulers enforce limits based upon
> classification results or other criteria, and if those limits are
> exceeded, packets are dropped and NET_XMIT_CN is returned back up into
> the transmit path of the networking stack.

OK, but what I don't get is whether pkgs are dropped as soon as the underlying
device cannot handle the pkg directly (returns !NETDEV_TX_OK or stops the queue).
Are !NETDEV_TX_OK and stopping the queue handled differently by upper layers?
I would have expected the pkg to be added to the TX queue and transmitted somewhat later.
If not, what is the TX queue for?

>
> The device never sees that packet get submitted to its ->ndo_start_xmit()
> routine, and this is entirely intentional.  And it is entirely intentional
> that NET_XMIT_CN gets passed up into the caller, where protocols such as
> TCP can key off this information to make congestion control decisions.

In this case it gets passed up to the VLAN driver. Should the VLAN driver
do something else to use the TX queue?

      Jocke




Thread overview: 14+ messages
     [not found] <OF5A42C874.3AF220FE-ONC1257718.003ABC6E-C1257718.003F94D2@LocalDomain>
2010-05-07  8:04 ` VLAN I/F's and TX queue Joakim Tjernlund
2010-05-07  8:53   ` Eric Dumazet
2010-05-07  9:29     ` Joakim Tjernlund
2010-05-10 14:33       ` Patrick McHardy
2010-05-10 14:50         ` Joakim Tjernlund
2010-05-16  7:40           ` David Miller
2010-05-16 14:22             ` Joakim Tjernlund
2010-05-10 14:26     ` Patrick McHardy
2010-05-10 14:36       ` Eric Dumazet
2010-05-10 14:41         ` Patrick McHardy
2010-05-10 14:51           ` Eric Dumazet
2010-05-10 14:54           ` Joakim Tjernlund
2010-05-10 15:14             ` Eric Dumazet
2010-05-03 11:34 Joakim Tjernlund
