Netdev List
* Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1
From: Rafał Miłecki @ 2011-11-23  7:37 UTC (permalink / raw)
  To: Linus Torvalds, stephen hemminger
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Maciej Rutecki,
	Florian Mickler, Andrew Morton, Kernel Testers List,
	Network Development, Linux ACPI, Linux PM List, Linux SCSI List,
	Linux Wireless List, DRI
In-Reply-To: <CA+55aFyy19VYSdZW0+jNxAb8ix0xpX2j9YFw9oQi3jm3+mDEvw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 21 November 2011 at 23:22, Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
>>
>> Subject    : [3.1-rc8 REGRESSION] sky2 hangs machine on turning off or suspending
>> Submitter  : Rafał Miłecki <zajec5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date       : 2011-11-09 11:46
>> Message-ID : CACna6ryTdLcWVYgHu=_mRFga1sFivpE_DyZOY-HMmKggkWAJAw@mail.gmail.com
>> References : http://marc.info/?l=linux-netdev&m=132083922228088&w=4
>
> This should be fixed by commit 1401a8008a09 ("sky2: fix hang on
> shutdown (and other irq issues)") in current -git.

This patch doesn't fix my hang.

However, git also contains:
sky2: fix hang in napi_disable
This is the one that fixes my case.

So the bug is resolved, but I'm a little disappointed that no one
pinged me about those patches. I spent some time bisecting this
issue and expected to get some response :/

-- 
Rafał

^ permalink raw reply

* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Eric Dumazet @ 2011-11-23  7:20 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Christoph Lameter, Christian Kujau, Benjamin Herrenschmidt,
	Alex,Shi, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Pekka Enberg, Matt Mackall, netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <20111123071349.GA1671@x4.trippels.de>

On Wednesday, 23 November 2011 at 08:13 +0100, Markus Trippelsdorf wrote:
> On 2011.11.22 at 09:27 +0100, Eric Dumazet wrote:
> > On Tuesday, 22 November 2011 at 08:48 +0100, Eric Dumazet wrote:
> > 
> > > For x86, I wonder if our !X86_FEATURE_CX16 support is correct on SMP
> > > machines.
> > > 
> > 
> > 
> > By the way, I wonder why we still emit this_cpu_cmpxchg16b_emu() code
> > and calls when compiling a kernel for a cpu implementing cmpxchg16b
> > 
> > (CONFIG_MCORE2=y)
> 
> Yeah, it's strange (CONFIG_MK8):
> 
> ffffffff811058b0 <__kmalloc>:
> ...
> ffffffff8110594f:       48 8d 4a 04             lea    0x4(%rdx),%rcx
> ffffffff81105953:       49 8b 1c 04             mov    (%r12,%rax,1),%rbx
> ffffffff81105957:       4c 89 e0                mov    %r12,%rax
> ffffffff8110595a:       e8 11 70 10 00          callq  ffffffff8120c970 <this_cpu_cmpxchg16b_emu>
> ffffffff8110595f:       66 66 90                data32 xchg %ax,%ax
> ffffffff81105962:       84 c0                   test   %al,%al
> ffffffff81105964:       74 c6                   je     ffffffff8110592c <__kmalloc+0x7c>
> ...
> 

This is patched at boot time (asm alternative).



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [patch] isdn: make sure strings are null terminated
From: Dan Carpenter @ 2011-11-23  7:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Karsten Keil, netdev, kernel-janitors
In-Reply-To: <1322031811.1298.38.camel@edumazet-laptop>

[-- Attachment #1: Type: text/plain, Size: 594 bytes --]

On Wed, Nov 23, 2011 at 08:03:31AM +0100, Eric Dumazet wrote:
> > +			if (strlen(dioctl.cf_ctrl.msn) >= sizeof(dioctl.cf_ctrl.msn))
> > +				return -EINVAL;
> 
> This looks buggy.
> 
> If the string is not NUL-terminated, how will strlen() stop you from going
> out of bounds and triggering some run-time checker?
> 
> strnlen() would be more effective...
> 

Aw crap.  My first version used strnlen() and I redid it to be
simpler.  I just figured that it doesn't take long to hit a zeroed
u8.

I'll resend all three strlen() patches to use strnlen().
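
For illustration, the bounded check discussed above could look roughly like
the following userspace sketch. The struct and helper names are hypothetical
and are not taken from the isdn driver; the only point is that strnlen()
stops after the given number of bytes, so an unterminated buffer is detected
without reading past its end.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-size field copied from user space. */
struct msn_field {
	char msn[8];
};

/*
 * Returns 1 if buf contains a NUL within its first size bytes, 0 otherwise.
 * strnlen() never examines more than size bytes, so an unterminated buffer
 * is reported as length == size instead of being overrun.
 */
static int is_terminated(const char *buf, size_t size)
{
	return strnlen(buf, size) < size;
}
```

A caller would reject the request (for example with -EINVAL) when
is_terminated() returns 0, before any strcpy() of the field takes place.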

regards,
dan carpenter


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Markus Trippelsdorf @ 2011-11-23  7:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Christian Kujau, Benjamin Herrenschmidt,
	Alex,Shi, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Pekka Enberg, Matt Mackall, netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <1321950432.27077.27.camel@edumazet-laptop>

On 2011.11.22 at 09:27 +0100, Eric Dumazet wrote:
> On Tuesday, 22 November 2011 at 08:48 +0100, Eric Dumazet wrote:
> 
> > For x86, I wonder if our !X86_FEATURE_CX16 support is correct on SMP
> > machines.
> > 
> 
> 
> By the way, I wonder why we still emit this_cpu_cmpxchg16b_emu() code
> and calls when compiling a kernel for a cpu implementing cmpxchg16b
> 
> (CONFIG_MCORE2=y)

Yeah, it's strange (CONFIG_MK8):

ffffffff811058b0 <__kmalloc>:
...
ffffffff8110594f:       48 8d 4a 04             lea    0x4(%rdx),%rcx
ffffffff81105953:       49 8b 1c 04             mov    (%r12,%rax,1),%rbx
ffffffff81105957:       4c 89 e0                mov    %r12,%rax
ffffffff8110595a:       e8 11 70 10 00          callq  ffffffff8120c970 <this_cpu_cmpxchg16b_emu>
ffffffff8110595f:       66 66 90                data32 xchg %ax,%ax
ffffffff81105962:       84 c0                   test   %al,%al
ffffffff81105964:       74 c6                   je     ffffffff8110592c <__kmalloc+0x7c>
...

There is a comment in arch/x86/include/asm/percpu.h:

 * Pretty complex macro to generate cmpxchg16 instruction.  The instruction
 * is not supported on early AMD64 processors so we must be able to emulate
 * it in software.  The address used in the cmpxchg16 instruction must be
 * aligned to a 16 byte boundary.
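
As a rough illustration of what such a software emulation has to do, here is
a plain-C model of a 16-byte compare-and-exchange. The type and function
names are hypothetical, and the model is not atomic: as far as I know, the
kernel's emulation relies on running with interrupts disabled on the local
CPU, which a userspace sketch cannot reproduce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two adjacent 64-bit words, aligned to 16 bytes as the instruction requires. */
struct pair128 {
	_Alignas(16) uint64_t lo;
	uint64_t hi;
};

/*
 * Model of a 16-byte compare-and-exchange: if *p equals the expected pair,
 * store the new pair and return true; otherwise leave *p untouched and
 * return false.  NOT atomic; illustration of the semantics only.
 */
static bool cmpxchg16b_model(struct pair128 *p,
			     uint64_t old_lo, uint64_t old_hi,
			     uint64_t new_lo, uint64_t new_hi)
{
	if (p->lo != old_lo || p->hi != old_hi)
		return false;
	p->lo = new_lo;
	p->hi = new_hi;
	return true;
}
```

The boolean result corresponds to the "test %al,%al" that follows the call in
the disassembly above: on failure the caller loops and retries.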
-- 
Markus


^ permalink raw reply

* Re: MPLS for Linux kernel
From: Igor Maravić @ 2011-11-23  7:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20111122.164909.156852889818363753.davem@davemloft.net>

OK,
Thanks

2011/11/22 David Miller <davem@davemloft.net>:
> From: Igor Maravić <igorm@etf.rs>
> Date: Tue, 22 Nov 2011 22:41:44 +0100
>
>> I would like to know what is necessary for MPLS implementation to have,
>> and to do, so it would be accepted in upstream kernel?
>
> A long and laborious back and forth review process, taking into consideration
> not just the technical details of the patches themselves, but the top level
> and overall design.
>
> That's what it will take.
>
> Taking someone else's work, fixing all the bugs and cleaning them up is
> far from sufficient for a feature of this nature.  There is natural
> overlap all over and we have to make sure the implementation bits are
> going into the right places.
>
> One issue of constant contention is that people want to add all of their
> favorite packet filtering and packet mangling into their protocol handling
> code, with all kinds of custom controls and configuration mechanisms.
>
> WE HATE THIS.
>
> We have the packet scheduler classifiers and packet actions for a reason,
> and we want them to be used instead of ignored.
>
> We are going through the same thing in the review process for the openvswitch
> code, which brings up another design question for MPLS, which is whether MPLS
> can be better implemented in terms of openvswitch.
>
> You're in the unfortunate position of submitting a feature that has a
> lot of overlap with many other subsystems, existing code, and features
> being submitted at the same time.  We want as much reuse as possible,
> and we want it all designed right before it gets integrated.
>
> I frankly don't care very much about MPLS personally, it's such a
> fringe facility.  So if people just argue themselves into oblivion and
> no forward progress is made, just like last time an MPLS submission
> was attempted, that's also fine with me :-)
>

^ permalink raw reply

* Re: [patch] isdn: make sure strings are null terminated
From: Eric Dumazet @ 2011-11-23  7:03 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Karsten Keil, netdev, kernel-janitors
In-Reply-To: <20111123064204.GA6871@elgon.mountain>

On Wednesday, 23 November 2011 at 09:42 +0300, Dan Carpenter wrote:
> These strings come from the user.  We strcpy() them inside
> cf_command() so we should check that they are NUL terminated and
> return an error if not.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> diff --git a/drivers/isdn/divert/divert_procfs.c b/drivers/isdn/divert/divert_procfs.c
> index 33ec9e4..0c16687 100644
> --- a/drivers/isdn/divert/divert_procfs.c
> +++ b/drivers/isdn/divert/divert_procfs.c
> @@ -242,6 +242,10 @@ static int isdn_divert_ioctl_unlocked(struct file *file, uint cmd, ulong arg)
>  		case IIOCDOCFINT:
>  			if (!divert_if.drv_to_name(dioctl.cf_ctrl.drvid))
>  				return (-EINVAL);	/* invalid driver */
> +			if (strlen(dioctl.cf_ctrl.msn) >= sizeof(dioctl.cf_ctrl.msn))
> +				return -EINVAL;

This looks buggy.

If the string is not NUL-terminated, how will strlen() stop you from going
out of bounds and triggering some run-time checker?

strnlen() would be more effective...

> +			if (strlen(dioctl.cf_ctrl.fwd_nr) >= sizeof(dioctl.cf_ctrl.fwd_nr))
> +				return -EINVAL;
>  			if ((i = cf_command(dioctl.cf_ctrl.drvid,
>  					    (cmd == IIOCDOCFACT) ? 1 : (cmd == IIOCDOCFDIS) ? 0 : 2,
>  					    dioctl.cf_ctrl.cfproc,
> --

^ permalink raw reply

* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Pekka Enberg @ 2011-11-23  6:59 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Benjamin Herrenschmidt, Eric Dumazet, Christoph Lameter,
	Markus Trippelsdorf, Alex,Shi, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Matt Mackall, netdev@vger.kernel.org,
	Tejun Heo, David Rientjes
In-Reply-To: <alpine.DEB.2.01.1111222145470.8000@trent.utfs.org>

2011/11/23 Christian Kujau <lists@nerdbynature.de>:
> OK, with Christoph's patch applied, 3.2.0-rc2-00274-g6fe4c6d-dirty survives
> on this machine, with the disk & cpu workload that caused the machine to
> panic w/o the patch. Load was at 4-5 this time, which is expected for this
> box. I'll run a few more tests later on, but it seems ok for now.
>
> I couldn't resist and ran "slabinfo" anyway (after the workload!) - the
> box survived, nothing was printed in syslog either. Output attached.

Christoph, Eric, would you mind sending me the final patches that
Christian tested? Maybe CC David too for extra pair of eyes.

                                Pekka


^ permalink raw reply

* [patch] netrom: check that user string is terminated
From: Dan Carpenter @ 2011-11-23  6:52 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: David S. Miller, linux-hams, netdev, kernel-janitors

We do a strcpy() of mnemonic in nr_add_node(), so check first that the
user-supplied string is NUL terminated.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index 915a87b..e126c48 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -670,6 +670,8 @@ int nr_rt_ioctl(unsigned int cmd, void __user *arg)
 	case SIOCADDRT:
 		if (copy_from_user(&nr_route, arg, sizeof(struct nr_route_struct)))
 			return -EFAULT;
+		if (strlen(nr_route.mnemonic) >= sizeof(nr_route.mnemonic))
+			return -EINVAL;
 		if ((dev = nr_ax25_dev_get(nr_route.device)) == NULL)
 			return -EINVAL;
 		if (nr_route.ndigis < 0 || nr_route.ndigis > AX25_MAX_DIGIS) {

^ permalink raw reply related

* [patch] isdn: avoid copying too long drvid
From: Dan Carpenter @ 2011-11-23  6:43 UTC (permalink / raw)
  To: Karsten Keil
  Cc: David S. Miller, Lucas De Marchi, Neil Horman, netdev,
	kernel-janitors

"cfg->drvid" comes from the user so there is a possibility they
didn't NUL terminate properly.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/drivers/isdn/i4l/isdn_net.c b/drivers/isdn/i4l/isdn_net.c
index 1f73d7f..487d214 100644
--- a/drivers/isdn/i4l/isdn_net.c
+++ b/drivers/isdn/i4l/isdn_net.c
@@ -2756,6 +2756,8 @@ isdn_net_setcfg(isdn_net_ioctl_cfg * cfg)
 			char *c,
 			*e;
 
+			if (strlen(cfg->drvid) >= sizeof(drvid))
+				return -EINVAL;
 			drvidx = -1;
 			chidx = -1;
 			strcpy(drvid, cfg->drvid);

^ permalink raw reply related

* [patch] isdn: make sure strings are null terminated
From: Dan Carpenter @ 2011-11-23  6:42 UTC (permalink / raw)
  To: Karsten Keil; +Cc: netdev, kernel-janitors

These strings come from the user.  We strcpy() them inside
cf_command() so we should check that they are NUL terminated and
return an error if not.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/drivers/isdn/divert/divert_procfs.c b/drivers/isdn/divert/divert_procfs.c
index 33ec9e4..0c16687 100644
--- a/drivers/isdn/divert/divert_procfs.c
+++ b/drivers/isdn/divert/divert_procfs.c
@@ -242,6 +242,10 @@ static int isdn_divert_ioctl_unlocked(struct file *file, uint cmd, ulong arg)
 		case IIOCDOCFINT:
 			if (!divert_if.drv_to_name(dioctl.cf_ctrl.drvid))
 				return (-EINVAL);	/* invalid driver */
+			if (strlen(dioctl.cf_ctrl.msn) >= sizeof(dioctl.cf_ctrl.msn))
+				return -EINVAL;
+			if (strlen(dioctl.cf_ctrl.fwd_nr) >= sizeof(dioctl.cf_ctrl.fwd_nr))
+				return -EINVAL;
 			if ((i = cf_command(dioctl.cf_ctrl.drvid,
 					    (cmd == IIOCDOCFACT) ? 1 : (cmd == IIOCDOCFDIS) ? 0 : 2,
 					    dioctl.cf_ctrl.cfproc,

^ permalink raw reply related

* [PATCH] ipv6: fix a bug in ndisc_send_redirect
From: Li Wei @ 2011-11-23  6:18 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

Release the skb when the transmit rate limit does _not_ allow sending.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
---
 net/ipv6/ndisc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 44e5b7f..0cb78d7 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1571,7 +1571,7 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 	if (!rt->rt6i_peer)
 		rt6_bind_peer(rt, 1);
-	if (inet_peer_xrlim_allow(rt->rt6i_peer, 1*HZ))
+	if (!inet_peer_xrlim_allow(rt->rt6i_peer, 1*HZ))
 		goto release;
 
 	if (dev->addr_len) {
-- 
1.7.3.2
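
To illustrate the inverted test that this patch corrects, here is a minimal
userspace sketch. The limiter below is a hypothetical, simplified stand-in
(at most one send per timeout interval) and is not the real
inet_peer_xrlim_allow() logic; it only shows that the early exit belongs on
the "not allowed" branch.

```c
#include <stdbool.h>

/* Hypothetical limiter: permits at most one send per `timeout` ticks. */
struct rate_limiter {
	unsigned long last_sent;
	bool ever_sent;
};

/* Returns true when a send is permitted at time `now`. */
static bool xrlim_allow(struct rate_limiter *rl, unsigned long now,
			unsigned long timeout)
{
	if (rl->ever_sent && now - rl->last_sent < timeout)
		return false;
	rl->ever_sent = true;
	rl->last_sent = now;
	return true;
}

/* Returns true if the redirect would be sent, false if it is dropped. */
static bool send_redirect(struct rate_limiter *rl, unsigned long now)
{
	/* Bail out only when the limiter does NOT allow the send. */
	if (!xrlim_allow(rl, now, 100))
		return false;
	return true;
}
```

Without the negation, the function would give up exactly when sending is
permitted and proceed when it is not.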

^ permalink raw reply related

* Re: Kernel v3.0.8 igb driver dies when pulling network cable
From: Stefan Priebe - Profihost AG @ 2011-11-23  6:15 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Stable Tree, stable, Greg KH, LKML, Linux Netdev List,
	Jeff Kirsher,
	Jesse Brandeburg <jesse.brandeburg@intel.com>, Bruce Allan,
	Carolyn Wyborny, Don Skidmore, Greg Rose, PJ Waskiewicz,
	John Ronciak
In-Reply-To: <4ECC1592.60607@intel.com>

Hi Alex,

> It seems like there might be an issue with something specific to your
> board since I tried reproducing the issue here on an 82576 based adapter
> and the stable 3.0.9 kernel I have and I have not had much success.
But as it works fine with the up-to-date igb module, it has evidently been
fixed already ;-)

To reproduce, try the following (it gave me a 100% success rate): configure
eth0 for DHCP and leave eth1 unconfigured.
Boot the system with the LAN cable plugged into eth1!
When the boot has finished, move the LAN cable from eth1 to eth0.

> I'm assuming the device that is failing is eth0.  I was wondering if you
> could send me the output of the following three commands so that I can
> do some further work to try and isolate the root cause for this issue:
> ethtool eth0
> ethtool -e eth0
> grep eth0 /proc/interrupts

I will send them as soon as I have access to the machine again.

> The issue seems to be that your adapter is not detecting that the cable
> was unplugged.  This in turn is leaving stale packets on the Tx ring and
> is what is resulting in the dev_watchdog message you are seeing.
Are you sure? Because when you look at the dmesg you'll see this:

igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

STACKTRACE!

igb 0000:0a:00.0: eth0: Reset adapter

eth0: no IPv6 routers present

igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

igb 0000:0a:00.0: eth0: Reset adapter

igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

So it happens when the cable is plugged in and NOT when I unplug it.

Stefan

^ permalink raw reply

* Re: Missing TCP SYN on loopback, retransmits after 1s
From: Eric Dumazet @ 2011-11-23  6:13 UTC (permalink / raw)
  To: John Heffner; +Cc: Jesse Young, netdev
In-Reply-To: <1322028540.1298.8.camel@edumazet-laptop>

On Wednesday, 23 November 2011 at 07:09 +0100, Eric Dumazet wrote:
> On Wednesday, 23 November 2011 at 06:58 +0100, Eric Dumazet wrote:
> 
> > First connection went well.
> > 
> > Now we try to reuse tuple  (ports 49374, 8009 on loopback) while a socket is in TIMEWAIT, and first
> > SYN packet (time 06:48:20.337335) is dropped (considered as a packet part of previous session)
> > 
> > Now why the first SYN packet is dropped and not the second one, I dont know yet.
> 
> 
> in netstat -s output, the suspect increasing counter is :
> 
> "45 congestion windows recovered without slow start after partial ack"
> 
> 

# vi +2831 net/ipv4/tcp_input.c

/* Undo during loss recovery after partial ACK. */
static int tcp_try_undo_loss(struct sock *sk)
{
        struct tcp_sock *tp = tcp_sk(sk);

        if (tcp_may_undo(tp)) {
                struct sk_buff *skb;
                tcp_for_write_queue(skb, sk) {
                        if (skb == tcp_send_head(sk))
                                break;
                        TCP_SKB_CB(skb)->sacked &= ~TCPCB_LOST;
                }

                tcp_clear_all_retrans_hints(tp);

                DBGUNDO(sk, "partial loss");
                tp->lost_out = 0;
                tcp_undo_cwr(sk, true);
HERE:            NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPLOSSUNDO);
                inet_csk(sk)->icsk_retransmits = 0;
                tp->undo_marker = 0;
                if (tcp_is_sack(tp))
                        tcp_set_ca_state(sk, TCP_CA_Open);
                return 1;
        }
        return 0;
}

^ permalink raw reply

* Re: Missing TCP SYN on loopback, retransmits after 1s
From: Eric Dumazet @ 2011-11-23  6:09 UTC (permalink / raw)
  To: John Heffner; +Cc: Jesse Young, netdev
In-Reply-To: <1322027911.1298.4.camel@edumazet-laptop>

On Wednesday, 23 November 2011 at 06:58 +0100, Eric Dumazet wrote:

> First connection went well.
> 
> Now we try to reuse the tuple (ports 49374, 8009 on loopback) while a socket is in TIMEWAIT, and the first
> SYN packet (time 06:48:20.337335) is dropped (considered part of the previous session).
> 
> Why the first SYN packet is dropped and not the second one, I don't know yet.


In the netstat -s output, the suspicious increasing counter is:

"45 congestion windows recovered without slow start after partial ack"

^ permalink raw reply

* [PATCH v3 10/10] sfc: Support for byte queue limits
From: Tom Herbert @ 2011-11-23  5:53 UTC (permalink / raw)
  To: davem, netdev

Changes to sfc to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/ethernet/sfc/tx.c |   27 +++++++++++++++++++++------
 1 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index df88c543..60bcbe1 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -31,7 +31,9 @@
 #define EFX_TXQ_THRESHOLD(_efx) ((_efx)->txq_entries / 2u)
 
 static void efx_dequeue_buffer(struct efx_tx_queue *tx_queue,
-			       struct efx_tx_buffer *buffer)
+			       struct efx_tx_buffer *buffer,
+			       unsigned int *pkts_compl,
+			       unsigned int *bytes_compl)
 {
 	if (buffer->unmap_len) {
 		struct pci_dev *pci_dev = tx_queue->efx->pci_dev;
@@ -48,6 +50,8 @@ static void efx_dequeue_buffer(struct efx_tx_queue *tx_queue,
 	}
 
 	if (buffer->skb) {
+		(*pkts_compl)++;
+		(*bytes_compl) += buffer->skb->len;
 		dev_kfree_skb_any((struct sk_buff *) buffer->skb);
 		buffer->skb = NULL;
 		netif_vdbg(tx_queue->efx, tx_done, tx_queue->efx->net_dev,
@@ -250,6 +254,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	buffer->skb = skb;
 	buffer->continuation = false;
 
+	netdev_tx_sent_queue(tx_queue->core_txq, 1, skb->len);
+
 	/* Pass off to hardware */
 	efx_nic_push_buffers(tx_queue);
 
@@ -267,10 +273,11 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
  unwind:
 	/* Work backwards until we hit the original insert pointer value */
 	while (tx_queue->insert_count != tx_queue->write_count) {
+		unsigned int pkts_compl = 0, bytes_compl = 0;
 		--tx_queue->insert_count;
 		insert_ptr = tx_queue->insert_count & tx_queue->ptr_mask;
 		buffer = &tx_queue->buffer[insert_ptr];
-		efx_dequeue_buffer(tx_queue, buffer);
+		efx_dequeue_buffer(tx_queue, buffer, &pkts_compl, &bytes_compl);
 		buffer->len = 0;
 	}
 
@@ -293,7 +300,9 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
  * specified index.
  */
 static void efx_dequeue_buffers(struct efx_tx_queue *tx_queue,
-				unsigned int index)
+				unsigned int index,
+				unsigned int *pkts_compl,
+				unsigned int *bytes_compl)
 {
 	struct efx_nic *efx = tx_queue->efx;
 	unsigned int stop_index, read_ptr;
@@ -311,7 +320,7 @@ static void efx_dequeue_buffers(struct efx_tx_queue *tx_queue,
 			return;
 		}
 
-		efx_dequeue_buffer(tx_queue, buffer);
+		efx_dequeue_buffer(tx_queue, buffer, pkts_compl, bytes_compl);
 		buffer->continuation = true;
 		buffer->len = 0;
 
@@ -422,10 +431,12 @@ void efx_xmit_done(struct efx_tx_queue *tx_queue, unsigned int index)
 {
 	unsigned fill_level;
 	struct efx_nic *efx = tx_queue->efx;
+	unsigned int pkts_compl = 0, bytes_compl = 0;
 
 	EFX_BUG_ON_PARANOID(index > tx_queue->ptr_mask);
 
-	efx_dequeue_buffers(tx_queue, index);
+	efx_dequeue_buffers(tx_queue, index, &pkts_compl, &bytes_compl);
+	netdev_tx_completed_queue(tx_queue->core_txq, pkts_compl, bytes_compl);
 
 	/* See if we need to restart the netif queue.  This barrier
 	 * separates the update of read_count from the test of the
@@ -515,13 +526,15 @@ void efx_release_tx_buffers(struct efx_tx_queue *tx_queue)
 
 	/* Free any buffers left in the ring */
 	while (tx_queue->read_count != tx_queue->write_count) {
+		unsigned int pkts_compl = 0, bytes_compl = 0;
 		buffer = &tx_queue->buffer[tx_queue->read_count & tx_queue->ptr_mask];
-		efx_dequeue_buffer(tx_queue, buffer);
+		efx_dequeue_buffer(tx_queue, buffer, &pkts_compl, &bytes_compl);
 		buffer->continuation = true;
 		buffer->len = 0;
 
 		++tx_queue->read_count;
 	}
+	netdev_tx_reset_queue(tx_queue->core_txq);
 }
 
 void efx_fini_tx_queue(struct efx_tx_queue *tx_queue)
@@ -1163,6 +1176,8 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 	/* Pass off to hardware */
 	efx_nic_push_buffers(tx_queue);
 
+	netdev_tx_sent_queue(tx_queue->core_txq, 1, skb->len);
+
 	tx_queue->tso_bursts++;
 	return NETDEV_TX_OK;
 
-- 
1.7.3.1

^ permalink raw reply related

* Re: [PATCH v3 0/10] bql: Byte Queue Limits
From: Eric Dumazet @ 2011-11-23  6:00 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.2.00.1111222134590.9126@pokey.mtv.corp.google.com>

On Tuesday, 22 November 2011 at 21:52 -0800, Tom Herbert wrote:
> Changes from last version:
>   - Rebase to 3.2
>   - Added CONFIG_BQL and CONFIG_DQL
>   - Added some cache alignment in struct dql, to split read only, writeable
>     elements, and split those elements written on transmit from those
>     written at transmit completion (suggested by Eric).
>   - Split out adding xps_queue_release as its own patch.
>   - Some minor performance changes, use likely and unlikely for some
>     conditionals.
>   - Cleaned up some "show" functions for bql (pointed out by Ben).
>   - Change netdev_tx_completed_queue to do check xoff, check
>     availability, and then check xoff again.  This to prevent potential
>     race conditions with netdev_sent_queue (as Ben pointed out).
>   - Did some more testing trying to evaluate overhead of BQL in the
>     transmit path.  I see about 1-3% degradation in CPU utilization
>     and maximum pps when BQL is enabled.  Any ideas to beat this
>     down as much as possible would be appreciated!
>   - Added high versus low priority traffic test to results below.
>   

Excellent, I plan to review and test this today.

Thanks!

^ permalink raw reply

* [PATCH v3 09/10] bnx2x: Support for byte queue limits
From: Tom Herbert @ 2011-11-23  5:53 UTC (permalink / raw)
  To: davem, netdev

Changes to bnx2x to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |   26 +++++++++++++++++++---
 1 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 0d60b9e..3fe9460 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -102,7 +102,8 @@ int load_count[2][3] = { {0} }; /* per-path: 0-common, 1-port0, 2-port1 */
  * return idx of last bd freed
  */
 static u16 bnx2x_free_tx_pkt(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata,
-			     u16 idx)
+			     u16 idx, unsigned int *pkts_compl,
+			     unsigned int *bytes_compl)
 {
 	struct sw_tx_bd *tx_buf = &txdata->tx_buf_ring[idx];
 	struct eth_tx_start_bd *tx_start_bd;
@@ -159,6 +160,10 @@ static u16 bnx2x_free_tx_pkt(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata,
 
 	/* release skb */
 	WARN_ON(!skb);
+	if (skb) {
+		(*pkts_compl)++;
+		(*bytes_compl) += skb->len;
+	}
 	dev_kfree_skb_any(skb);
 	tx_buf->first_bd = 0;
 	tx_buf->skb = NULL;
@@ -170,6 +175,7 @@ int bnx2x_tx_int(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata)
 {
 	struct netdev_queue *txq;
 	u16 hw_cons, sw_cons, bd_cons = txdata->tx_bd_cons;
+	unsigned int pkts_compl = 0, bytes_compl = 0;
 
 #ifdef BNX2X_STOP_ON_ERROR
 	if (unlikely(bp->panic))
@@ -189,10 +195,14 @@ int bnx2x_tx_int(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata)
 				      " pkt_cons %u\n",
 		   txdata->txq_index, hw_cons, sw_cons, pkt_cons);
 
-		bd_cons = bnx2x_free_tx_pkt(bp, txdata, pkt_cons);
+		bd_cons = bnx2x_free_tx_pkt(bp, txdata, pkt_cons,
+		    &pkts_compl, &bytes_compl);
+
 		sw_cons++;
 	}
 
+	netdev_tx_completed_queue(txq, pkts_compl, bytes_compl);
+
 	txdata->tx_pkt_cons = sw_cons;
 	txdata->tx_bd_cons = bd_cons;
 
@@ -1077,14 +1087,18 @@ static void bnx2x_free_tx_skbs(struct bnx2x *bp)
 		struct bnx2x_fastpath *fp = &bp->fp[i];
 		for_each_cos_in_tx_queue(fp, cos) {
 			struct bnx2x_fp_txdata *txdata = &fp->txdata[cos];
+			unsigned pkts_compl = 0, bytes_compl = 0;
 
 			u16 sw_prod = txdata->tx_pkt_prod;
 			u16 sw_cons = txdata->tx_pkt_cons;
 
 			while (sw_cons != sw_prod) {
-				bnx2x_free_tx_pkt(bp, txdata, TX_BD(sw_cons));
+				bnx2x_free_tx_pkt(bp, txdata, TX_BD(sw_cons),
+				    &pkts_compl, &bytes_compl);
 				sw_cons++;
 			}
+			netdev_tx_reset_queue(
+			    netdev_get_tx_queue(bp->dev, txdata->txq_index));
 		}
 	}
 }
@@ -2788,6 +2802,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		mapping = skb_frag_dma_map(&bp->pdev->dev, frag, 0,
 					   skb_frag_size(frag), DMA_TO_DEVICE);
 		if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
+			unsigned int pkts_compl = 0, bytes_compl = 0;
 
 			DP(NETIF_MSG_TX_QUEUED, "Unable to map page - "
 						"dropping packet...\n");
@@ -2799,7 +2814,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			 */
 			first_bd->nbd = cpu_to_le16(nbd);
 			bnx2x_free_tx_pkt(bp, txdata,
-					  TX_BD(txdata->tx_pkt_prod));
+					  TX_BD(txdata->tx_pkt_prod),
+					  &pkts_compl, &bytes_compl);
 			return NETDEV_TX_OK;
 		}
 
@@ -2860,6 +2876,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		   pbd_e2->parsing_data);
 	DP(NETIF_MSG_TX_QUEUED, "doorbell: nbd %d  bd %u\n", nbd, bd_prod);
 
+	netdev_tx_sent_queue(txq, 1, skb->len);
+
 	txdata->tx_pkt_prod++;
 	/*
 	 * Make sure that the BD data is updated before updating the producer
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH v3 05/10] bql: Byte queue limits
From: Tom Herbert @ 2011-11-23  5:52 UTC (permalink / raw)
  To: davem, netdev

Networking stack support for byte queue limits, using the dynamic queue
limits library.  Byte queue limits are maintained per transmit queue,
and a dql structure has been added to the netdev_queue structure for this
purpose.

Configuration of bql is in the tx-<n> sysfs directory for the queue
under the byte_queue_limits directory.  Configuration includes:
limit_min, bql minimum limit
limit_max, bql maximum limit
hold_time, bql slack hold time

Also under the directory are:
limit, current byte limit
inflight, current number of bytes on the queue
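
As a rough illustration of the accounting described above, here is a
single-threaded userspace model. All names are hypothetical and the limit is
fixed, whereas the dql library adjusts it dynamically; the sketch only shows
how bytes in flight relate to the limit and to the stack XOFF state.

```c
#include <stdbool.h>

/* Hypothetical, single-threaded model of per-queue byte accounting. */
struct bql_model {
	unsigned int limit;      /* bytes allowed in flight (fixed here) */
	unsigned int queued;     /* total bytes handed to the hardware */
	unsigned int completed;  /* total bytes the hardware has finished */
	bool stack_xoff;         /* true: the stack must stop feeding the queue */
};

/* Remaining budget; negative when the queue is over its limit. */
static int bql_avail(const struct bql_model *q)
{
	return (int)q->limit - (int)(q->queued - q->completed);
}

/* Called when a packet is handed to the hardware. */
static void bql_sent(struct bql_model *q, unsigned int bytes)
{
	q->queued += bytes;
	if (bql_avail(q) < 0)
		q->stack_xoff = true;
}

/* Called on transmit completion; returns true if the queue was restarted. */
static bool bql_completed(struct bql_model *q, unsigned int bytes)
{
	if (!bytes)
		return false;
	q->completed += bytes;
	if (q->stack_xoff && bql_avail(q) >= 0) {
		q->stack_xoff = false;
		return true;
	}
	return false;
}
```

In this model "inflight" corresponds to queued minus completed, and "limit"
to the fixed limit field. The real code additionally re-checks availability
after setting the XOFF bit to close a race with the completion path, which a
single-threaded sketch does not need.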

Signed-off-by: Tom Herbert <therbert@google.com>
---
 include/linux/netdevice.h |   28 ++++++++
 net/Kconfig               |   13 ++++
 net/core/dev.c            |    3 +
 net/core/net-sysfs.c      |  150 ++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 186 insertions(+), 8 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8b3eb8a..e17ece6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -43,6 +43,7 @@
 #include <linux/rculist.h>
 #include <linux/dmaengine.h>
 #include <linux/workqueue.h>
+#include <linux/dynamic_queue_limits.h>
 
 #include <linux/ethtool.h>
 #include <net/net_namespace.h>
@@ -557,6 +558,9 @@ struct netdev_queue {
 	 * please use this field instead of dev->trans_start
 	 */
 	unsigned long		trans_start;
+#ifdef CONFIG_BQL
+	struct dql		dql;
+#endif
 } ____cacheline_aligned_in_smp;
 
 static inline int netdev_queue_numa_node_read(const struct netdev_queue *q)
@@ -1927,6 +1931,15 @@ static inline int netif_xmit_frozen_or_stopped(const struct netdev_queue *dev_qu
 static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue,
 					unsigned int pkts, unsigned int bytes)
 {
+#ifdef CONFIG_BQL
+	dql_queued(&dev_queue->dql, bytes);
+	if (unlikely(dql_avail(&dev_queue->dql) < 0)) {
+		set_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state);
+		if (unlikely(dql_avail(&dev_queue->dql) >= 0))
+			clear_bit(__QUEUE_STATE_STACK_XOFF,
+			    &dev_queue->state);
+	}
+#endif
 }
 
 static inline void netdev_sent_queue(struct net_device *dev,
@@ -1938,6 +1951,18 @@ static inline void netdev_sent_queue(struct net_device *dev,
 static inline void netdev_tx_completed_queue(struct netdev_queue *dev_queue,
 					     unsigned pkts, unsigned bytes)
 {
+#ifdef CONFIG_BQL
+	if (likely(bytes)) {
+		dql_completed(&dev_queue->dql, bytes);
+		if (unlikely(test_bit(__QUEUE_STATE_STACK_XOFF,
+		    &dev_queue->state) &&
+		    dql_avail(&dev_queue->dql) >= 0)) {
+			if (test_and_clear_bit(__QUEUE_STATE_STACK_XOFF,
+			     &dev_queue->state))
+				netif_schedule_queue(dev_queue);
+		}
+	}
+#endif
 }
 
 static inline void netdev_completed_queue(struct net_device *dev,
@@ -1948,6 +1973,9 @@ static inline void netdev_completed_queue(struct net_device *dev,
 
 static inline void netdev_tx_reset_queue(struct netdev_queue *q)
 {
+#ifdef CONFIG_BQL
+	dql_reset(&q->dql);
+#endif
 }
 
 static inline void netdev_reset_queue(struct net_device *dev_queue)
diff --git a/net/Kconfig b/net/Kconfig
index a073148..217ae0a 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -232,6 +232,19 @@ config XPS
 	depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
 	default y
 
+config BQL
+	bool "Byte Queue Limits"
+	depends on SYSFS
+	select DQL
+	default y
+	---help---
+	  Byte queue limits uses a dynamic algorithm to limit the number of
+	  bytes that are queued to a NIC HW queue.  By limiting this number
+	  latencies and head-of-line blocking of high priority packets
+	  can be reduced.
+
+	  This feature requires driver support.
+
 config HAVE_BPF_JIT
 	bool
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 8ca56c0..49ef8c1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5548,6 +5548,9 @@ static void netdev_init_one_queue(struct net_device *dev,
 	queue->xmit_lock_owner = -1;
 	netdev_queue_numa_node_write(queue, NUMA_NO_NODE);
 	queue->dev = dev;
+#ifdef CONFIG_BQL
+	dql_init(&queue->dql, HZ);
+#endif
 }
 
 static int netif_alloc_netdev_queues(struct net_device *dev)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index fffd5b2..27c9046 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -21,6 +21,7 @@
 #include <linux/wireless.h>
 #include <linux/vmalloc.h>
 #include <linux/export.h>
+#include <linux/jiffies.h>
 #include <net/wext.h>
 
 #include "net-sysfs.h"
@@ -780,7 +781,7 @@ net_rx_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 #endif
 }
 
-#ifdef CONFIG_XPS
+#if defined(CONFIG_XPS) || defined(CONFIG_BQL)
 /*
  * netdev_queue sysfs structures and functions.
  */
@@ -839,8 +840,119 @@ static inline unsigned int get_netdev_queue_index(struct netdev_queue *queue)
 
 	return i;
 }
+#endif /* defined(CONFIG_XPS) || defined(CONFIG_BQL) */
+
+#ifdef CONFIG_BQL
+/*
+ * Byte queue limits sysfs structures and functions.
+ */
+static ssize_t bql_show(char *buf, unsigned long value)
+{
+	return sprintf(buf, "%lu\n", value);
+}
+
+static ssize_t bql_set(const char *buf, const size_t count,
+		       unsigned long *pvalue)
+{
+	unsigned long value;
+	int err;
+
+	if (!strcmp(buf, "max") || !strcmp(buf, "max\n"))
+		value = DQL_MAX_LIMIT;
+	else {
+		err = kstrtoul(buf, 10, &value);
+		if (err < 0)
+			return err;
+		if (value > DQL_MAX_LIMIT)
+			return -EINVAL;
+	}
+
+	*pvalue = value;
+
+	return count;
+}
+
+static ssize_t bql_show_hold_time(struct netdev_queue *queue,
+				  struct netdev_queue_attribute *attr,
+				  char *buf)
+{
+	struct dql *dql = &queue->dql;
+
+	return sprintf(buf, "%u\n", jiffies_to_msecs(dql->slack_hold_time));
+}
+
+static ssize_t bql_set_hold_time(struct netdev_queue *queue,
+				 struct netdev_queue_attribute *attribute,
+				 const char *buf, size_t len)
+{
+	struct dql *dql = &queue->dql;
+	unsigned value;
+	int err;
+
+	err = kstrtouint(buf, 10, &value);
+	if (err < 0)
+		return err;
+
+	dql->slack_hold_time = msecs_to_jiffies(value);
+
+	return len;
+}
+
+static struct netdev_queue_attribute bql_hold_time_attribute =
+	__ATTR(hold_time, S_IRUGO | S_IWUSR, bql_show_hold_time,
+	    bql_set_hold_time);
+
+static ssize_t bql_show_inflight(struct netdev_queue *queue,
+				 struct netdev_queue_attribute *attr,
+				 char *buf)
+{
+	struct dql *dql = &queue->dql;
+
+	return sprintf(buf, "%lu\n", dql->num_queued - dql->num_completed);
+}
+
+static struct netdev_queue_attribute bql_inflight_attribute =
+	__ATTR(inflight, S_IRUGO, bql_show_inflight, NULL);
+
+#define BQL_ATTR(NAME, FIELD)						\
+static ssize_t bql_show_ ## NAME(struct netdev_queue *queue,		\
+				 struct netdev_queue_attribute *attr,	\
+				 char *buf)				\
+{									\
+	return bql_show(buf, queue->dql.FIELD);				\
+}									\
+									\
+static ssize_t bql_set_ ## NAME(struct netdev_queue *queue,		\
+				struct netdev_queue_attribute *attr,	\
+				const char *buf, size_t len)		\
+{									\
+	return bql_set(buf, len, &queue->dql.FIELD);			\
+}									\
+									\
+static struct netdev_queue_attribute bql_ ## NAME ## _attribute =	\
+	__ATTR(NAME, S_IRUGO | S_IWUSR, bql_show_ ## NAME,		\
+	    bql_set_ ## NAME);
+
+BQL_ATTR(limit, limit)
+BQL_ATTR(limit_max, max_limit)
+BQL_ATTR(limit_min, min_limit)
+
+static struct attribute *dql_attrs[] = {
+	&bql_limit_attribute.attr,
+	&bql_limit_max_attribute.attr,
+	&bql_limit_min_attribute.attr,
+	&bql_hold_time_attribute.attr,
+	&bql_inflight_attribute.attr,
+	NULL
+};
 
+static struct attribute_group dql_group = {
+	.name  = "byte_queue_limits",
+	.attrs  = dql_attrs,
+};
+#endif /* CONFIG_BQL */
 
+#ifdef CONFIG_XPS
 static ssize_t show_xps_map(struct netdev_queue *queue,
 			    struct netdev_queue_attribute *attribute, char *buf)
 {
@@ -1067,8 +1179,14 @@ error:
 static struct netdev_queue_attribute xps_cpus_attribute =
     __ATTR(xps_cpus, S_IRUGO | S_IWUSR, show_xps_map, store_xps_map);
 
+#endif /* CONFIG_XPS */
+
+#if defined(CONFIG_XPS) || defined(CONFIG_BQL)
+
 static struct attribute *netdev_queue_default_attrs[] = {
+#ifdef CONFIG_XPS
 	&xps_cpus_attribute.attr,
+#endif
 	NULL
 };
 
@@ -1076,7 +1194,9 @@ static void netdev_queue_release(struct kobject *kobj)
 {
 	struct netdev_queue *queue = to_netdev_queue(kobj);
 
+#ifdef CONFIG_XPS
 	xps_queue_release(queue);
+#endif
 
 	memset(kobj, 0, sizeof(*kobj));
 	dev_put(queue->dev);
@@ -1097,22 +1217,30 @@ static int netdev_queue_add_kobject(struct net_device *net, int index)
 	kobj->kset = net->queues_kset;
 	error = kobject_init_and_add(kobj, &netdev_queue_ktype, NULL,
 	    "tx-%u", index);
+	if (error)
+		goto exit;
+
+#ifdef CONFIG_BQL
+	error = sysfs_create_group(kobj, &dql_group);
 	if (error) {
 		kobject_put(kobj);
-		return error;
+		goto exit;
 	}
+#endif
 
 	kobject_uevent(kobj, KOBJ_ADD);
 	dev_hold(queue->dev);
 
+	return 0;
+exit:
 	return error;
 }
-#endif /* CONFIG_XPS */
+#endif /* defined(CONFIG_XPS) || defined(CONFIG_BQL) */
 
 int
 netdev_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 {
-#ifdef CONFIG_XPS
+#if defined(CONFIG_XPS) || defined(CONFIG_BQL)
 	int i;
 	int error = 0;
 
@@ -1124,8 +1252,14 @@ netdev_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 		}
 	}
 
-	while (--i >= new_num)
-		kobject_put(&net->_tx[i].kobj);
+	while (--i >= new_num) {
+		struct netdev_queue *queue = net->_tx + i;
+
+#ifdef CONFIG_BQL
+		sysfs_remove_group(&queue->kobj, &dql_group);
+#endif
+		kobject_put(&queue->kobj);
+	}
 
 	return error;
 #else
@@ -1137,7 +1271,7 @@ static int register_queue_kobjects(struct net_device *net)
 {
 	int error = 0, txq = 0, rxq = 0, real_rx = 0, real_tx = 0;
 
-#if defined(CONFIG_RPS) || defined(CONFIG_XPS)
+#if defined(CONFIG_RPS) || defined(CONFIG_XPS) || defined(CONFIG_BQL)
 	net->queues_kset = kset_create_and_add("queues",
 	    NULL, &net->dev.kobj);
 	if (!net->queues_kset)
@@ -1178,7 +1312,7 @@ static void remove_queue_kobjects(struct net_device *net)
 
 	net_rx_queue_update_kobjects(net, real_rx, 0);
 	netdev_queue_update_kobjects(net, real_tx, 0);
-#if defined(CONFIG_RPS) || defined(CONFIG_XPS)
+#if defined(CONFIG_RPS) || defined(CONFIG_XPS) || defined(CONFIG_BQL)
 	kset_unregister(net->queues_kset);
 #endif
 }
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH v3 08/10] tg3: Support for byte queue limits
From: Tom Herbert @ 2011-11-23  5:53 UTC (permalink / raw)
  To: davem, netdev

Changes to tg3 to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/ethernet/broadcom/tg3.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index cd36234..076c0e5 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -5320,6 +5320,7 @@ static void tg3_tx(struct tg3_napi *tnapi)
 	u32 sw_idx = tnapi->tx_cons;
 	struct netdev_queue *txq;
 	int index = tnapi - tp->napi;
+	unsigned int pkts_compl = 0, bytes_compl = 0;
 
 	if (tg3_flag(tp, ENABLE_TSS))
 		index--;
@@ -5370,6 +5371,9 @@ static void tg3_tx(struct tg3_napi *tnapi)
 			sw_idx = NEXT_TX(sw_idx);
 		}
 
+		pkts_compl++;
+		bytes_compl += skb->len;
+
 		dev_kfree_skb(skb);
 
 		if (unlikely(tx_bug)) {
@@ -5378,6 +5382,8 @@ static void tg3_tx(struct tg3_napi *tnapi)
 		}
 	}
 
+	netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
+
 	tnapi->tx_cons = sw_idx;
 
 	/* Need to make the tx_cons update visible to tg3_start_xmit()
@@ -6816,6 +6822,7 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	skb_tx_timestamp(skb);
+	netdev_sent_queue(tp->dev, 1, skb->len);
 
 	/* Packets are ready, update Tx producer idx local and on card. */
 	tw32_tx_mbox(tnapi->prodmbox, entry);
@@ -7297,6 +7304,7 @@ static void tg3_free_rings(struct tg3 *tp)
 			dev_kfree_skb_any(skb);
 		}
 	}
+	netdev_reset_queue(tp->dev);
 }
 
 /* Initialize tx/rx rings for packet processing.
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH v3 06/10] e1000e: Support for byte queue limits
From: Tom Herbert @ 2011-11-23  5:52 UTC (permalink / raw)
  To: davem, netdev

Changes to e1000e to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index a855db1..e337611 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1095,6 +1095,7 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter)
 	unsigned int i, eop;
 	unsigned int count = 0;
 	unsigned int total_tx_bytes = 0, total_tx_packets = 0;
+	unsigned int bytes_compl = 0, pkts_compl = 0;
 
 	i = tx_ring->next_to_clean;
 	eop = tx_ring->buffer_info[i].next_to_watch;
@@ -1112,6 +1113,10 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter)
 			if (cleaned) {
 				total_tx_packets += buffer_info->segs;
 				total_tx_bytes += buffer_info->bytecount;
+				if (buffer_info->skb) {
+					bytes_compl += buffer_info->skb->len;
+					pkts_compl++;
+				}
 			}
 
 			e1000_put_txbuf(adapter, buffer_info);
@@ -1130,6 +1135,8 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter)
 
 	tx_ring->next_to_clean = i;
 
+	netdev_completed_queue(netdev, pkts_compl, bytes_compl);
+
 #define TX_WAKE_THRESHOLD 32
 	if (count && netif_carrier_ok(netdev) &&
 	    e1000_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD) {
@@ -2260,6 +2267,7 @@ static void e1000_clean_tx_ring(struct e1000_adapter *adapter)
 		e1000_put_txbuf(adapter, buffer_info);
 	}
 
+	netdev_reset_queue(adapter->netdev);
 	size = sizeof(struct e1000_buffer) * tx_ring->count;
 	memset(tx_ring->buffer_info, 0, size);
 
@@ -4985,6 +4993,7 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 	/* if count is 0 then mapping error has occurred */
 	count = e1000_tx_map(adapter, skb, first, max_per_txd, nr_frags, mss);
 	if (count) {
+		netdev_sent_queue(netdev, 1, skb->len);
 		e1000_tx_queue(adapter, tx_flags, count);
 		/* Make sure there is space in the ring for the next send. */
 		e1000_maybe_stop_tx(netdev, MAX_SKB_FRAGS + 2);
-- 
1.7.3.1

^ permalink raw reply related

* Re: Missing TCP SYN on loopback, retransmits after 1s
From: Eric Dumazet @ 2011-11-23  5:58 UTC (permalink / raw)
  To: John Heffner; +Cc: Jesse Young, netdev
In-Reply-To: <1322025880.29324.1.camel@edumazet-laptop>

On Wednesday 23 November 2011 at 06:24 +0100, Eric Dumazet wrote:
> On Tuesday 22 November 2011 at 21:06 -0500, John Heffner wrote:
> > Offhand, I'd guess you're overflowing the TCP SYN queue.  (You can try
> > tuning tcp_max_syn_backlog.)
> > 
> 
> There is one little thing called "netstat -s", a very useful tool,
> included in many distros :)

Is this related to the TIMEWAIT syndrome?

06:47:42.090522 IP6 ::1.49374 > ::1.8009: Flags [SEW], seq 2646115915, win 32752, options [mss 16376,sackOK,TS val 26574090 ecr 0,nop,wscale 6], length 0
06:47:42.090579 IP6 ::1.8009 > ::1.49374: Flags [S.E], seq 184529170, ack 2646115916, win 32728, options [mss 16376,sackOK,TS val 26574090 ecr 26574090,nop,wscale 6], length 0
06:47:42.090616 IP6 ::1.49374 > ::1.8009: Flags [.], ack 1, win 512, options [nop,nop,TS val 26574090 ecr 26574090], length 0
06:47:42.090718 IP6 ::1.8009 > ::1.49374: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 26574090 ecr 26574090], length 0
06:47:42.090780 IP6 ::1.49374 > ::1.8009: Flags [F.], seq 1, ack 2, win 512, options [nop,nop,TS val 26574090 ecr 26574090], length 0
06:47:42.090843 IP6 ::1.8009 > ::1.49374: Flags [.], ack 2, win 512, options [nop,nop,TS val 26574090 ecr 26574090], length 0

First connection went well.

Now we try to reuse the tuple (ports 49374, 8009 on loopback) while a socket is in TIMEWAIT, and the first
SYN packet (time 06:48:20.337335) is dropped (considered as a packet that is part of the previous session).

Why the first SYN packet is dropped and not the second one, I don't know yet.

06:48:20.337335 IP6 ::1.49374 > ::1.8009: Flags [SEW], seq 3243722104, win 32752, options [mss 16376,sackOK,TS val 26612337 ecr 0,nop,wscale 6], length 0
06:48:21.340112 IP6 ::1.49374 > ::1.8009: Flags [SEW], seq 3243722104, win 32752, options [mss 16376,sackOK,TS val 26613340 ecr 0,nop,wscale 6], length 0
06:48:21.340162 IP6 ::1.8009 > ::1.49374: Flags [S.E], seq 797804014, ack 3243722105, win 32728, options [mss 16376,sackOK,TS val 26613340 ecr 26613340,nop,wscale 6], length 0
06:48:21.340217 IP6 ::1.49374 > ::1.8009: Flags [.], ack 1, win 512, options [nop,nop,TS val 26613340 ecr 26613340], length 0
06:48:21.340360 IP6 ::1.8009 > ::1.49374: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 26613340 ecr 26613340], length 0
06:48:21.340466 IP6 ::1.49374 > ::1.8009: Flags [F.], seq 1, ack 2, win 512, options [nop,nop,TS val 26613340 ecr 26613340], length 0
06:48:21.340541 IP6 ::1.8009 > ::1.49374: Flags [.], ack 2, win 512, options [nop,nop,TS val 26613340 ecr 26613340], length 0

^ permalink raw reply

* [PATCH v3 04/10] xps: Add xps_queue_release function
From: Tom Herbert @ 2011-11-23  5:52 UTC (permalink / raw)
  To: davem, netdev

This patch moves the xps specific parts in netdev_queue_release into
its own function which netdev_queue_release can call.  This allows
netdev_queue_release to be more generic (for adding new attributes
to tx queues).

Signed-off-by: Tom Herbert <therbert@google.com>
---
 net/core/net-sysfs.c |   89 ++++++++++++++++++++++++++-----------------------
 1 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index c71c434..fffd5b2 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -890,6 +890,52 @@ static DEFINE_MUTEX(xps_map_mutex);
 #define xmap_dereference(P)		\
 	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
 
+static void xps_queue_release(struct netdev_queue *queue)
+{
+	struct net_device *dev = queue->dev;
+	struct xps_dev_maps *dev_maps;
+	struct xps_map *map;
+	unsigned long index;
+	int i, pos, nonempty = 0;
+
+	index = get_netdev_queue_index(queue);
+
+	mutex_lock(&xps_map_mutex);
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	if (dev_maps) {
+		for_each_possible_cpu(i) {
+			map = xmap_dereference(dev_maps->cpu_map[i]);
+			if (!map)
+				continue;
+
+			for (pos = 0; pos < map->len; pos++)
+				if (map->queues[pos] == index)
+					break;
+
+			if (pos < map->len) {
+				if (map->len > 1)
+					map->queues[pos] =
+					    map->queues[--map->len];
+				else {
+					RCU_INIT_POINTER(dev_maps->cpu_map[i],
+					    NULL);
+					kfree_rcu(map, rcu);
+					map = NULL;
+				}
+			}
+			if (map)
+				nonempty = 1;
+		}
+
+		if (!nonempty) {
+			RCU_INIT_POINTER(dev->xps_maps, NULL);
+			kfree_rcu(dev_maps, rcu);
+		}
+	}
+	mutex_unlock(&xps_map_mutex);
+}
+
 static ssize_t store_xps_map(struct netdev_queue *queue,
 		      struct netdev_queue_attribute *attribute,
 		      const char *buf, size_t len)
@@ -1029,49 +1075,8 @@ static struct attribute *netdev_queue_default_attrs[] = {
 static void netdev_queue_release(struct kobject *kobj)
 {
 	struct netdev_queue *queue = to_netdev_queue(kobj);
-	struct net_device *dev = queue->dev;
-	struct xps_dev_maps *dev_maps;
-	struct xps_map *map;
-	unsigned long index;
-	int i, pos, nonempty = 0;
-
-	index = get_netdev_queue_index(queue);
-
-	mutex_lock(&xps_map_mutex);
-	dev_maps = xmap_dereference(dev->xps_maps);
 
-	if (dev_maps) {
-		for_each_possible_cpu(i) {
-			map = xmap_dereference(dev_maps->cpu_map[i]);
-			if (!map)
-				continue;
-
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-
-			if (pos < map->len) {
-				if (map->len > 1)
-					map->queues[pos] =
-					    map->queues[--map->len];
-				else {
-					RCU_INIT_POINTER(dev_maps->cpu_map[i],
-					    NULL);
-					kfree_rcu(map, rcu);
-					map = NULL;
-				}
-			}
-			if (map)
-				nonempty = 1;
-		}
-
-		if (!nonempty) {
-			RCU_INIT_POINTER(dev->xps_maps, NULL);
-			kfree_rcu(dev_maps, rcu);
-		}
-	}
-
-	mutex_unlock(&xps_map_mutex);
+	xps_queue_release(queue);
 
 	memset(kobj, 0, sizeof(*kobj));
 	dev_put(queue->dev);
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH v3 0/10] bql: Byte Queue Limits
From: Tom Herbert @ 2011-11-23  5:52 UTC (permalink / raw)
  To: davem, netdev

Changes from last version:
  - Rebase to 3.2
  - Added CONFIG_BQL and CONFIG_DQL
  - Added some cache alignment in struct dql, to split read only, writeable
    elements, and split those elements written on transmit from those
    written at transmit completion (suggested by Eric).
  - Split out adding xps_queue_release as its own patch.
  - Some minor performance changes, use likely and unlikely for some
    conditionals.
  - Cleaned up some "show" functions for bql (pointed out by Ben).
  - Change netdev_tx_completed_queue to check xoff, check
    availability, and then check xoff again.  This is to prevent potential
    race conditions with netdev_sent_queue (as Ben pointed out).
  - Did some more testing trying to evaluate overhead of BQL in the
    transmit path.  I see about 1-3% degradation in CPU utilization
    and maximum pps when BQL is enabled.  Any ideas to beat this
    down as much as possible would be appreciated!
  - Added high versus low priority traffic test to results below.
  
----

This patch series implements byte queue limits (bql) for NIC TX queues.

Byte queue limits are a mechanism to limit the size of the transmit
hardware queue on a NIC by number of bytes. The goal of these byte
limits is to reduce latency (HOL blocking) caused by excessive queuing
in hardware (aka buffer bloat) without sacrificing throughput.

Hardware queuing limits are typically specified in terms of a number of
hardware descriptors, each of which has a variable size. The variability
of the size of individual queued items can have a very wide range. For
instance with the e1000 NIC the size could range from 64 bytes to 4K
(with TSO enabled). This variability makes it next to impossible to
choose a single queue limit that prevents starvation and provides lowest
possible latency.
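
To put rough numbers on that variability, here is a small sketch in plain
userspace C.  The helper names are made up for illustration, the 1024-entry
ring is the size used in the tests further down, and the 1 Gbit/s link speed
is purely an assumption (the actual speed of the test link is not stated here).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes sitting in the ring if each of `descs` descriptors holds
 * `bytes_per_desc` bytes.  Hypothetical helper, not kernel code. */
static uint64_t ring_bytes(uint32_t descs, uint32_t bytes_per_desc)
{
	assert(descs > 0);
	return (uint64_t)descs * bytes_per_desc;
}

/* Microseconds needed to drain `bytes` at `link_mbps` megabits per
 * second (1 Mbit/s is 1 bit per microsecond), rounded down. */
static uint64_t drain_time_us(uint64_t bytes, uint32_t link_mbps)
{
	assert(link_mbps > 0);
	return bytes * 8 / link_mbps;
}
```

With those assumptions, a full ring of 64 byte descriptors holds 64K bytes
and drains in about half a millisecond, while a full ring of 4K descriptors
holds 4M bytes and takes over 33 milliseconds: the same descriptor limit
covers a 64x range in queuing delay.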

The objective of byte queue limits is to set the limit to be the
minimum needed to prevent starvation between successive transmissions to
the hardware. The latency between two transmissions can be variable in a
system. It is dependent on interrupt frequency, NAPI polling latencies,
scheduling of the queuing discipline, lock contention, etc. Therefore we
propose that byte queue limits should be dynamic and change in
accordance with networking stack latencies a system encounters.  BQL
should not need to take the underlying link speed as input, it should
automatically adjust to whatever the speed is (even if that in itself is
dynamic).

Patches to implement this:
- Dynamic queue limits (dql) library.  This provides the general
queuing algorithm.
- netdev changes that use dql to support byte queue limits.
- Support in drivers for byte queue limits.
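
As a rough illustration of how these pieces fit together, below is a
single-threaded userspace model of the sent/completed/reset accounting that
the netdev hooks perform.  Everything named toy_* is hypothetical; the limit
is fixed here, whereas the dql library adjusts it dynamically, and the
"available = limit - bytes in flight" rule and the reset behaviour are
assumptions about that library rather than a copy of it.  The extra re-check
of availability that the real hooks do to close races is omitted because the
model has no concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-queue state; all names are hypothetical. */
struct toy_queue {
	unsigned int limit;         /* fixed in this model */
	unsigned int num_queued;    /* bytes handed to the hardware */
	unsigned int num_completed; /* bytes the hardware finished */
	bool stack_xoff;            /* models __QUEUE_STATE_STACK_XOFF */
};

/* Bytes that may still be queued; negative once over the limit. */
static long toy_avail(const struct toy_queue *q)
{
	unsigned int inflight = q->num_queued - q->num_completed;

	return (long)q->limit - (long)inflight;
}

/* Transmit path: account the bytes, stop the stack when over the limit. */
static void toy_sent(struct toy_queue *q, unsigned int bytes)
{
	q->num_queued += bytes;
	if (toy_avail(q) < 0)
		q->stack_xoff = true;
}

/* Completion path: returns true when the queue should be rescheduled. */
static bool toy_completed(struct toy_queue *q, unsigned int bytes)
{
	if (!bytes)
		return false;
	/* Model sanity check: cannot complete more than is in flight. */
	assert(bytes <= q->num_queued - q->num_completed);
	q->num_completed += bytes;
	if (q->stack_xoff && toy_avail(q) >= 0) {
		q->stack_xoff = false;
		return true;
	}
	return false;
}

/* Ring teardown/reinit: forget all in-flight accounting (assumed). */
static void toy_reset(struct toy_queue *q)
{
	q->num_queued = 0;
	q->num_completed = 0;
}
```

A driver would call the counterpart of toy_sent() from its xmit routine, the
counterpart of toy_completed() once per TX cleanup pass with the totals for
that pass, and the counterpart of toy_reset() whenever the TX ring is freed
or reinitialized.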

The effects of BQL are demonstrated in the benchmark results below.

--- High priority versus low priority traffic:

In this test 100 netperf TCP_STREAMs were started to saturate the link.
A single instance of a netperf TCP_RR was run with high priority set.
Queuing discipline is pfifo_fast, NIC is e1000 with TX ring size set to
1024.  tps for the high priority RR is listed.

No BQL, tso on: 3000-3200K bytes in queue, 36 tps
BQL, tso on: 156-194K bytes in queue, 535 tps
No BQL, tso off: 453-454K bytes in queue, 234 tps
BQL, tso off: 66K bytes in queue, 914 tps

---  Various RR sizes

These tests were done running 200 streams of netperf RR tests.  The
results demonstrate the reduction in queuing and also illustrate
the overhead due to BQL (in small RR sizes).

140000 rr size
BQL: 80-215K bytes in queue, 856 tps, 3.26% cpu
No BQL: 2700-2930K bytes in queue, 854 tps, 3.71% cpu

14000 rr size
BQL: 25-55K bytes in queue, 8500 tps
No BQL: 1500-1622K bytes in queue, 8523 tps, 4.53% cpu

1400 rr size
BQL: 20-38K bytes in queue, 86582 tps, 7.38% cpu
No BQL: 29-117K bytes in queue, 85738 tps, 7.67% cpu

140 rr size
BQL: 1-10K bytes in queue, 320540 tps, 34.6% cpu
No BQL: 1-13K bytes in queue, 323158 tps, 37.16% cpu

1 rr size
BQL: 0-3K bytes in queue, 338811 tps, 41.41% cpu
No BQL: 0-3K bytes in queue, 339947 tps, 42.36% cpu

So the amount of queuing in the NIC can be reduced up to 90% or more.
Accordingly, the latency for high priority packets in the presence
of low priority bulk throughput traffic can be reduced by 90% or more.
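
For reference, the queue reduction can be computed from the upper ends of
the "bytes in queue" ranges listed above.  This is only a sketch; the helper
name is made up, and comparing the upper bounds (in K bytes) is one
reasonable choice among several.

```c
#include <assert.h>

/* Percentage reduction going from `before` to `after`, rounded down.
 * Hypothetical helper for the arithmetic only. */
static unsigned int reduction_pct(unsigned int before, unsigned int after)
{
	assert(before > 0 && after <= before);
	return (before - after) * 100 / before;
}
```

Applied to the figures above: tso on goes from 3200K to 194K (93%), tso off
from 454K to 66K (85%), the 140000 byte RR test from 2930K to 215K (92%),
and the 14000 byte RR test from 1622K to 55K (96%).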

Since BQL accounting is in the transmit path for every packet, and the
function to recompute the byte limit is run once per transmit
completion, there will be some overhead in using BQL.  So far, I've seen
the overhead to be in the range of 1-3% for CPU utilization and maximum
pps.

^ permalink raw reply

* [PATCH v3 07/10] forcedeth: Support for byte queue limits
From: Tom Herbert @ 2011-11-23  5:53 UTC (permalink / raw)
  To: davem, netdev

Changes to forcedeth to use byte queue limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index e8a5ae3..98e5464 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -1842,6 +1842,7 @@ static void nv_init_tx(struct net_device *dev)
 		np->last_tx.ex = &np->tx_ring.ex[np->tx_ring_size-1];
 	np->get_tx_ctx = np->put_tx_ctx = np->first_tx_ctx = np->tx_skb;
 	np->last_tx_ctx = &np->tx_skb[np->tx_ring_size-1];
+	netdev_reset_queue(np->dev);
 	np->tx_pkts_in_progress = 0;
 	np->tx_change_owner = NULL;
 	np->tx_end_flip = NULL;
@@ -2187,6 +2188,9 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	/* set tx flags */
 	start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra);
+
+	netdev_sent_queue(np->dev, 1, skb->len);
+
 	np->put_tx.orig = put_tx;
 
 	spin_unlock_irqrestore(&np->lock, flags);
@@ -2331,6 +2335,9 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 
 	/* set tx flags */
 	start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra);
+
+	netdev_sent_queue(np->dev, 1, skb->len);
+
 	np->put_tx.ex = put_tx;
 
 	spin_unlock_irqrestore(&np->lock, flags);
@@ -2368,6 +2375,7 @@ static int nv_tx_done(struct net_device *dev, int limit)
 	u32 flags;
 	int tx_work = 0;
 	struct ring_desc *orig_get_tx = np->get_tx.orig;
+	unsigned int bytes_compl = 0;
 
 	while ((np->get_tx.orig != np->put_tx.orig) &&
 	       !((flags = le32_to_cpu(np->get_tx.orig->flaglen)) & NV_TX_VALID) &&
@@ -2381,6 +2389,7 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					if ((flags & NV_TX_RETRYERROR) && !(flags & NV_TX_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				}
+				bytes_compl += np->get_tx_ctx->skb->len;
 				dev_kfree_skb_any(np->get_tx_ctx->skb);
 				np->get_tx_ctx->skb = NULL;
 				tx_work++;
@@ -2391,6 +2400,7 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					if ((flags & NV_TX2_RETRYERROR) && !(flags & NV_TX2_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				}
+				bytes_compl += np->get_tx_ctx->skb->len;
 				dev_kfree_skb_any(np->get_tx_ctx->skb);
 				np->get_tx_ctx->skb = NULL;
 				tx_work++;
@@ -2401,6 +2411,9 @@ static int nv_tx_done(struct net_device *dev, int limit)
 		if (unlikely(np->get_tx_ctx++ == np->last_tx_ctx))
 			np->get_tx_ctx = np->first_tx_ctx;
 	}
+
+	netdev_completed_queue(np->dev, tx_work, bytes_compl);
+
 	if (unlikely((np->tx_stop == 1) && (np->get_tx.orig != orig_get_tx))) {
 		np->tx_stop = 0;
 		netif_wake_queue(dev);
@@ -2414,6 +2427,7 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 	u32 flags;
 	int tx_work = 0;
 	struct ring_desc_ex *orig_get_tx = np->get_tx.ex;
+	unsigned long bytes_cleaned = 0;
 
 	while ((np->get_tx.ex != np->put_tx.ex) &&
 	       !((flags = le32_to_cpu(np->get_tx.ex->flaglen)) & NV_TX2_VALID) &&
@@ -2431,6 +2445,7 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 				}
 			}
 
+			bytes_cleaned += np->get_tx_ctx->skb->len;
 			dev_kfree_skb_any(np->get_tx_ctx->skb);
 			np->get_tx_ctx->skb = NULL;
 			tx_work++;
@@ -2438,6 +2453,9 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 			if (np->tx_limit)
 				nv_tx_flip_ownership(dev);
 		}
+
+		netdev_completed_queue(np->dev, tx_work, bytes_cleaned);
+
 		if (unlikely(np->get_tx.ex++ == np->last_tx.ex))
 			np->get_tx.ex = np->first_tx.ex;
 		if (unlikely(np->get_tx_ctx++ == np->last_tx_ctx))
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH v3 03/10] net: Add netdev interfaces recording send/compl
From: Tom Herbert @ 2011-11-23  5:52 UTC (permalink / raw)
  To: davem, netdev

Add interfaces for drivers to call for recording number of packets and
bytes at send time and transmit completion.  Also, added a function to
"reset" a queue.  These will be used by Byte Queue Limits.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 include/linux/netdevice.h |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index dfb50ed..8b3eb8a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1924,6 +1924,35 @@ static inline int netif_xmit_frozen_or_stopped(const struct netdev_queue *dev_qu
 	return dev_queue->state & QUEUE_STATE_ANY_XOFF_OR_FROZEN;
 }
 
+static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue,
+					unsigned int pkts, unsigned int bytes)
+{
+}
+
+static inline void netdev_sent_queue(struct net_device *dev,
+				     unsigned int pkts, unsigned int bytes)
+{
+	netdev_tx_sent_queue(netdev_get_tx_queue(dev, 0), pkts, bytes);
+}
+
+static inline void netdev_tx_completed_queue(struct netdev_queue *dev_queue,
+					     unsigned pkts, unsigned bytes)
+{
+}
+
+static inline void netdev_completed_queue(struct net_device *dev,
+					  unsigned pkts, unsigned bytes)
+{
+	netdev_tx_completed_queue(netdev_get_tx_queue(dev, 0), pkts, bytes);
+}
+
+static inline void netdev_tx_reset_queue(struct netdev_queue *q)
+{
+}
+
+static inline void netdev_reset_queue(struct net_device *dev_queue)
+{
+	netdev_tx_reset_queue(netdev_get_tx_queue(dev_queue, 0));
 }
 
 /**
-- 
1.7.3.1

^ permalink raw reply related

