From mboxrd@z Thu Jan 1 00:00:00 1970
From: Karl Hiramoto
Subject: [RFC] atm/br2684 netif_*_queue, packet loss or performance loss.
Date: Wed, 26 Aug 2009 11:45:16 +0200
Message-ID: <4A95042C.3010209@hiramoto.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: netdev@vger.kernel.org
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi,

With a br2684/IPoE bridge I was seeing more than 50% packet loss under heavy load without this patch, on vanilla 2.6.28.9 and 2.6.30 kernels. With this patch I now have less than 5% packet loss, measured by running a ping while HTTP downloads run over other connections. Without this patch one TCP connection tends to monopolize the line, because the packet loss keeps the other TCP connections from ramping up.

The problem with this patch is that my maximum throughput on a 24Mbps down / 1Mbps up line drops by about 30%, so I get only about 15Mbps utilization; watching traffic on the line I see gaps when nothing is being transmitted.

How much latency should netif_wake_queue() introduce? It seems that I call netif_wake_queue(), but the upper layers in net/core/ sometimes take a while to resume sending packets to atm/br2684.c, and I see these gaps in data transmission.
Note: if I change the ADSL2+ line speed to 4Mbps down / 320Kbps up, this patch seems to work perfectly; I get very low packet loss and maximum utilization of the line.

I tried playing with the internal buffering the hardware driver does in the high-speed 24Mbps down / 1Mbps up case, but it did not seem to make any difference. The TX path seemed to be starved of data because the upper layers were not calling hard_start_xmit().

I see the atm/clip driver calling netif_wake_queue() and netif_stop_queue(), so I was trying to do the same in br2684. I don't have any hardware at the moment to test CLIP at high speed, though.

My setup is:

test client pc eth0 <---> eth0 <--> arm ixp425 cpu <-- br2684 --> atm driver & device <--> ADSL2+ DSLAM <---> test server eth0

You can see in my patch that I remove the line calling dev_kfree_skb(skb) that causes the packet loss; then I add code to call netif_stop_queue() and netif_wake_queue().

--- linux-2.6.28.9.orig/net/atm/br2684.c	2009-03-23 22:55:52.000000000 +0100
+++ linux-2.6.28.9/net/atm/br2684.c	2009-08-26 11:03:34.000000000 +0200
@@ -69,7 +69,7 @@ struct br2684_vcc {
 	struct net_device *device;
 	/* keep old push, pop functions for chaining */
 	void (*old_push) (struct atm_vcc * vcc, struct sk_buff * skb);
-	/* void (*old_pop)(struct atm_vcc *vcc, struct sk_buff *skb); */
+	void (*old_pop)(struct atm_vcc *vcc, struct sk_buff *skb);
 	enum br2684_encaps encaps;
 	struct list_head brvccs;
 #ifdef CONFIG_ATM_BR2684_IPFILTER
@@ -143,6 +143,26 @@ static struct net_device *br2684_find_de
 	return NULL;
 }
 
+/* chained vcc->pop function.  Check if we should wake the netif_queue */
+static void br2684_pop(struct atm_vcc *vcc, struct sk_buff *skb)
+{
+	struct br2684_vcc *brvcc = BR2684_VCC(vcc);
+	struct net_device *net_dev = skb->dev;
+
+	pr_debug("br2684_pop(vcc %p ; net_dev %p )\n", vcc, net_dev);
+	brvcc->old_pop(vcc, skb);
+
+	if (!net_dev)
+		return;
+
+	if (atm_may_send(vcc, 0)) {
+		if (netif_queue_stopped(net_dev)) {
+			netif_wake_queue(net_dev);
+		}
+	}
+
+}
+
 /*
  * Send a packet out a particular vcc.  Not to useful right now, but paves
  * the way for multiple vcc's per itf.  Returns true if we can send,
@@ -200,20 +220,20 @@ static int br2684_xmit_vcc(struct sk_buf
 	ATM_SKB(skb)->vcc = atmvcc = brvcc->atmvcc;
 	pr_debug("atm_skb(%p)->vcc(%p)->dev(%p)\n", skb, atmvcc, atmvcc->dev);
-	if (!atm_may_send(atmvcc, skb->truesize)) {
-		/*
-		 * We free this here for now, because we cannot know in a higher
-		 * layer whether the skb pointer it supplied wasn't freed yet.
-		 * Now, it always is.
-		 */
-		dev_kfree_skb(skb);
-		return 0;
-	}
+	atomic_add(skb->truesize, &sk_atm(atmvcc)->sk_wmem_alloc);
 	ATM_SKB(skb)->atm_options = atmvcc->atm_options;
 	brdev->stats.tx_packets++;
 	brdev->stats.tx_bytes += skb->len;
+
 	atmvcc->send(atmvcc, skb);
+
+	if (!atm_may_send(atmvcc, 0)) {
+		/* the queue is now full so stop it
+		 * Restart the queue in br2684_pop */
+		netif_stop_queue(brvcc->device);
+	}
+
 	return 1;
 }
 
@@ -240,6 +260,8 @@ static int br2684_start_xmit(struct sk_b
 		read_unlock(&devs_lock);
 		return 0;
 	}
+	netif_tx_lock_bh(brdev->net_dev);
+
 	if (!br2684_xmit_vcc(skb, brdev, brvcc)) {
 		/*
 		 * We should probably use netif_*_queue() here, but that
@@ -251,6 +273,7 @@ static int br2684_start_xmit(struct sk_b
 		brdev->stats.tx_errors++;
 		brdev->stats.tx_fifo_errors++;
 	}
+	netif_tx_unlock_bh(brdev->net_dev);
 	read_unlock(&devs_lock);
 	return 0;
 }
@@ -509,8 +532,10 @@ static int br2684_regvcc(struct atm_vcc
 	atmvcc->user_back = brvcc;
 	brvcc->encaps = (enum br2684_encaps)be.encaps;
 	brvcc->old_push = atmvcc->push;
+	brvcc->old_pop = atmvcc->pop;
 	barrier();
 	atmvcc->push = br2684_push;
+	atmvcc->pop = br2684_pop;
 	rq = &sk_atm(atmvcc)->sk_receive_queue;
@@ -621,6 +646,8 @@ static int br2684_create(void __user * a
 	write_lock_irq(&devs_lock);
 	brdev->payload = payload;
+	netif_start_queue(netdev);
+
 	brdev->number = list_empty(&br2684_devs) ? 1 :
 	    BRPRIV(list_entry_brdev(br2684_devs.prev))->number + 1;
 	list_add_tail(&brdev->br2684_devs, &br2684_devs);