Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] CHOKe flow scheduler (0.8)
From: Patrick McHardy @ 2011-01-14 13:55 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <4D305598.1010207@trash.net>

On 14.01.2011 14:54, Patrick McHardy wrote:
> On 14.01.2011 00:34, Stephen Hemminger wrote:
>> +static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>> +{
>> ...
>> +	/* Is queue small? */
>> +	if (p->qavg <= p->qth_min)
>> +		p->qcount = -1;
>> +	else {
>> +		struct sk_buff *oskb;
>> +		unsigned int idx;
>> +
>> +		/* Draw a packet at random from queue */
>> +		oskb = choke_peek_random(sch, &idx);
>> +
>> +		/* Both packets from same flow ? */
>> +		if (*(unsigned int *)(qdisc_skb_cb(oskb)->data) == hash) {
>> +			/* Drop both packets */
>> +			q->stats.matched++;
>> +			choke_drop_by_idx(q, idx);
>> +			sch->qstats.backlog -= qdisc_pkt_len(skb);
>> +			--sch->q.qlen;
>> +			qdisc_drop(oskb, sch);
> 
> You need to adjust the qlen values of parent qdiscs by calling
> qdisc_tree_decrease_qlen(), they are not aware that a second
> packet has been dropped.

I just saw that Eric already fixed this :)

^ permalink raw reply

* [PATCH resend] Bluetooth: l2cap: fix misuse of logical operation in place of bitop
From: David Sterba @ 2011-01-14 13:59 UTC (permalink / raw)
  To: akpm
  Cc: marcel, linux-bluetooth, netdev, linux-kernel, David Sterba,
	Gustavo F. Padovan, João Paulo Rechi Vita

CC: Marcel Holtmann <marcel@holtmann.org>
CC: "Gustavo F. Padovan" <padovan@profusion.mobi>
CC: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: David Sterba <dsterba@suse.cz>
---

Andrew, this has not been picked up by maintainers since 27.12., please consider it for -mm.

 net/bluetooth/l2cap.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c
index cd8f6ea..bdfdfdc 100644
--- a/net/bluetooth/l2cap.c
+++ b/net/bluetooth/l2cap.c
@@ -1893,8 +1893,8 @@ static int l2cap_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct ms
 		if (pi->mode == L2CAP_MODE_STREAMING) {
 			l2cap_streaming_send(sk);
 		} else {
-			if (pi->conn_state & L2CAP_CONN_REMOTE_BUSY &&
-					pi->conn_state && L2CAP_CONN_WAIT_F) {
+			if ((pi->conn_state & L2CAP_CONN_REMOTE_BUSY) &&
+					(pi->conn_state & L2CAP_CONN_WAIT_F)) {
 				err = len;
 				break;
 			}
-- 
1.7.3.4.626.g73e7b


^ permalink raw reply related

* Re: net 00/05: routing based send-to-self implementation
From: Jonathan Corbet @ 2011-01-14 13:40 UTC (permalink / raw)
  To: Kirill Smelkov; +Cc: Patrick McHardy, davem, netdev, Boris Kocherov
In-Reply-To: <20110114101832.GA3170@tugrik.mns.mnsspb.ru>

On Fri, 14 Jan 2011 13:18:32 +0300
Kirill Smelkov <kirr@mns.spb.ru> wrote:

>  ( Jonathan, I though something like this could be useful for LDD4 in
>    revised snull not needing to play dirty tricks with IP addresses anymore )

Nice thought, thanks.  I don't know yet whether we'd want to do that or
just run stuff on a virtual machine for LDD4 - lots of stuff to figure
out still.

Thanks,

jon

^ permalink raw reply

* pull request: sfc-2.6 2011-01-14
From: Ben Hutchings @ 2011-01-14 14:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, sf-linux-drivers

The following changes since commit 5b919f833d9d60588d026ad82d17f17e8872c7a9:

  net: ax25: fix information leak to userland harder (2011-01-12 00:34:49 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6.git master

A minor optimisation and a regression fix.

Ben.

Ben Hutchings (2):
      sfc: Make efx_get_tx_queue() an inline function
      sfc: Restore the effect of the rss_cpus module parameter

 drivers/net/sfc/efx.c        |   18 ++++++------------
 drivers/net/sfc/net_driver.h |   10 ++++++++--
 2 files changed, 14 insertions(+), 14 deletions(-)

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.




^ permalink raw reply

* Re: [PATCH] CHOKe flow scheduler (0.8)
From: Eric Dumazet @ 2011-01-14 14:24 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, David Miller, netdev
In-Reply-To: <4D3055C2.3060807@trash.net>

Le vendredi 14 janvier 2011 à 14:55 +0100, Patrick McHardy a écrit :
> On 14.01.2011 14:54, Patrick McHardy wrote:
> > On 14.01.2011 00:34, Stephen Hemminger wrote:
> >> +static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> >> +{
> >> ...
> >> +	/* Is queue small? */
> >> +	if (p->qavg <= p->qth_min)
> >> +		p->qcount = -1;
> >> +	else {
> >> +		struct sk_buff *oskb;
> >> +		unsigned int idx;
> >> +
> >> +		/* Draw a packet at random from queue */
> >> +		oskb = choke_peek_random(sch, &idx);
> >> +
> >> +		/* Both packets from same flow ? */
> >> +		if (*(unsigned int *)(qdisc_skb_cb(oskb)->data) == hash) {
> >> +			/* Drop both packets */
> >> +			q->stats.matched++;
> >> +			choke_drop_by_idx(q, idx);
> >> +			sch->qstats.backlog -= qdisc_pkt_len(skb);
> >> +			--sch->q.qlen;
> >> +			qdisc_drop(oskb, sch);
> > 
> > You need to adjust the qlen values of parent qdiscs by calling
> > qdisc_tree_decrease_qlen(), they are not aware that a second
> > packet has been dropped.
> 
> I just saw that Eric already fixed this :)
> --

Thanks Patrick, I did that in choke_change() only, not on this part of
the code.


qdisc cbq 1: root refcnt 2 rate 1000Mbit cell 8b (bounded,isolated) prio no-transmit/8 weight 1000Mbit allot 1514b 
level 2 ewma 5 avpkt 1000b maxidle 0us 
 Sent 92016573 bytes 167811 pkt (dropped 615544, overlimits 982637 requeues 0) 
 rate 6248bit 9pps backlog 0b 191202p requeues 0 
  borrowed 0 overactions 0 avgidle 125 undertime 0
qdisc choke 11: parent 1:11 limit 130000b min 10833b max 32500b ewma 13 Plog 21 Scell_log 30
 Sent 91969929 bytes 167257 pkt (dropped 806746, overlimits 424342 requeues 0) 
 rate 56bit 0pps backlog 0b 0p requeues 0 
  marked 0 early 424342 pdrop 0 other 0 matched 191202


So yes : we see 191202 were CHOKed, and upper cbq leaks 191202 packets in 'backlog'

After following patch :

diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index 29a91d8..5479f7e 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -223,6 +223,7 @@ static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 			q->stats.matched++;
 			choke_drop_by_idx(q, idx);
 			sch->qstats.backlog -= qdisc_pkt_len(oskb);
+			qdisc_tree_decrease_qlen(sch, 1);
 			--sch->q.qlen;
 			qdisc_drop(oskb, sch);
 			goto congestion_drop;

Everything seems fine :

qdisc cbq 1: root refcnt 2 rate 1000Mbit cell 8b (bounded,isolated) prio no-transmit/8 weight 1000Mbit allot 1514b 
level 2 ewma 5 avpkt 1000b maxidle 0us 
 Sent 93962148 bytes 170940 pkt (dropped 633053, overlimits 1010490 requeues 0) 
 rate 568bit 1pps backlog 0b 0p requeues 0 
  borrowed 0 overactions 0 avgidle 125 undertime 0
qdisc choke 11: parent 1:11 limit 130000b min 10833b max 32500b ewma 13 Plog 21 Scell_log 30
 Sent 93957621 bytes 170888 pkt (dropped 829160, overlimits 436946 requeues 0) 
 rate 104bit 0pps backlog 0b 0p requeues 0 
  marked 0 early 436946 pdrop 0 other 0 matched 196107



^ permalink raw reply related

* [PATCH net-2.6 1/2] sfc: Make efx_get_tx_queue() an inline function
From: Ben Hutchings @ 2011-01-14 14:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1295014889.5386.1.camel@bwh-desktop>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/efx.c        |   15 +++------------
 drivers/net/sfc/net_driver.h |   10 ++++++++--
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 711449c..c2dc9a5 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1266,27 +1266,18 @@ static void efx_remove_interrupts(struct efx_nic *efx)
 	efx->legacy_irq = 0;
 }
 
-struct efx_tx_queue *
-efx_get_tx_queue(struct efx_nic *efx, unsigned index, unsigned type)
-{
-	unsigned tx_channel_offset =
-		separate_tx_channels ? efx->n_channels - efx->n_tx_channels : 0;
-	EFX_BUG_ON_PARANOID(index >= efx->n_tx_channels ||
-			    type >= EFX_TXQ_TYPES);
-	return &efx->channel[tx_channel_offset + index]->tx_queue[type];
-}
-
 static void efx_set_channels(struct efx_nic *efx)
 {
 	struct efx_channel *channel;
 	struct efx_tx_queue *tx_queue;
-	unsigned tx_channel_offset =
+
+	efx->tx_channel_offset =
 		separate_tx_channels ? efx->n_channels - efx->n_tx_channels : 0;
 
 	/* Channel pointers were set in efx_init_struct() but we now
 	 * need to clear them for TX queues in any RX-only channels. */
 	efx_for_each_channel(channel, efx) {
-		if (channel->channel - tx_channel_offset >=
+		if (channel->channel - efx->tx_channel_offset >=
 		    efx->n_tx_channels) {
 			efx_for_each_channel_tx_queue(tx_queue, channel)
 				tx_queue->channel = NULL;
diff --git a/drivers/net/sfc/net_driver.h b/drivers/net/sfc/net_driver.h
index bdce66d..28df866 100644
--- a/drivers/net/sfc/net_driver.h
+++ b/drivers/net/sfc/net_driver.h
@@ -735,6 +735,7 @@ struct efx_nic {
 	unsigned next_buffer_table;
 	unsigned n_channels;
 	unsigned n_rx_channels;
+	unsigned tx_channel_offset;
 	unsigned n_tx_channels;
 	unsigned int rx_buffer_len;
 	unsigned int rx_buffer_order;
@@ -929,8 +930,13 @@ efx_get_channel(struct efx_nic *efx, unsigned index)
 	     _channel = (_channel->channel + 1 < (_efx)->n_channels) ?	\
 		     (_efx)->channel[_channel->channel + 1] : NULL)
 
-extern struct efx_tx_queue *
-efx_get_tx_queue(struct efx_nic *efx, unsigned index, unsigned type);
+static inline struct efx_tx_queue *
+efx_get_tx_queue(struct efx_nic *efx, unsigned index, unsigned type)
+{
+	EFX_BUG_ON_PARANOID(index >= efx->n_tx_channels ||
+			    type >= EFX_TXQ_TYPES);
+	return &efx->channel[efx->tx_channel_offset + index]->tx_queue[type];
+}
 
 static inline struct efx_tx_queue *
 efx_channel_get_tx_queue(struct efx_channel *channel, unsigned type)
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-2.6 2/2] sfc: Restore the effect of the rss_cpus module parameter
From: Ben Hutchings @ 2011-01-14 14:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1295014889.5386.1.camel@bwh-desktop>

Commit a4900ac ("sfc: Create multiple TX queues") accidentally
disabled the rss_cpus module parameter.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/efx.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index c2dc9a5..002bac7 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1153,6 +1153,9 @@ static int efx_wanted_channels(void)
 	int count;
 	int cpu;
 
+	if (rss_cpus)
+		return rss_cpus;
+
 	if (unlikely(!zalloc_cpumask_var(&core_mask, GFP_KERNEL))) {
 		printk(KERN_WARNING
 		       "sfc: RSS disabled due to allocation failure\n");
-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* Re: net 00/05: routing based send-to-self implementation
From: Kirill Smelkov @ 2011-01-14 15:02 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Patrick McHardy, davem, netdev, Boris Kocherov
In-Reply-To: <20110114064048.5829220a@bike.lwn.net>

On Fri, Jan 14, 2011 at 06:40:48AM -0700, Jonathan Corbet wrote:
> On Fri, 14 Jan 2011 13:18:32 +0300
> Kirill Smelkov <kirr@mns.spb.ru> wrote:
> 
> >  ( Jonathan, I though something like this could be useful for LDD4 in
> >    revised snull not needing to play dirty tricks with IP addresses anymore )
> 
> Nice thought, thanks.  I don't know yet whether we'd want to do that or
> just run stuff on a virtual machine for LDD4 - lots of stuff to figure
> out still.

Just FYI: I've started with virtual machines, but figured out it is
(sometimes, maybe my fault) a bit of pain to coherently setup and also
that not all hardware have support for KVM - for example my netbook with
Atom N455, which I use half the time, does not support VT and they've
killed support for KQEMU in QEMU recently...

Anyway, good luck with LDD4!

Thanks,
Kirill

^ permalink raw reply

* Re: sch_sfb
From: Eric Dumazet @ 2011-01-14 15:09 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Patrick McHardy, David Miller, netdev
In-Reply-To: <7ivd1rsj6n.fsf@lanthane.pps.jussieu.fr>

Le vendredi 14 janvier 2011 à 14:34 +0100, Juliusz Chroboczek a écrit :
> > I just looked at it out of interest after already having started my
> > own version.
> 
> >>   http://thread.gmane.org/gmane.linux.network/90225
> >>   http://thread.gmane.org/gmane.linux.network/90375
> 
> >> It was reviewed in particular by one Patrick McHardy.
> 
> > There's no reason to be pissed
> 
> Yes, there is.
> 
> First you object to my patch by making a bunch of unreasonable requests
> (notably that I use the in-kernel classifiers, which are not usable with
> Bloom filters).  Then it turns out you're implementing your own version
> "from scratch".  And then you claim that you never saw my version in the
> first place?
> 
> Patrick, what you're doing is not merely rude, it's actually unethical.
> 

Juliusz, I believe you overreact on this one.

Patrick reviews are really good and very reasonable. Yet, for a beginner
it might sounds difficult to understand and make the requested changes.

I sent you a private mail 2 weeks ago to offer my services to take SFB
and get it into kernel, because I read your previous attempts and
understood you had litle time to make the needed changes.

I got distracted by SFQ/CHOKe experiments but was about to work on SFB
when Patrick sent his mail. I really think we should work together, and
eventually offer SFB as a new qdisc to network admins :)

Thanks !



^ permalink raw reply

* Re: [PATCH v3 1/2] net_device: add support for network device groups
From: jamal @ 2011-01-14 15:10 UTC (permalink / raw)
  To: Vlad Dogaru; +Cc: netdev, Octavian Purdila
In-Reply-To: <1294997911-13866-2-git-send-email-ddvlad@rosedu.org>

On Fri, 2011-01-14 at 11:38 +0200, Vlad Dogaru wrote:
> Net devices can now be grouped, enabling simpler manipulation from
> userspace. This patch adds a group field to the net_device structure, as
> well as rtnetlink support to query and modify it.
> 
> Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>

Good stuff.

Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>

cheers,
jamal


^ permalink raw reply

* Re: [PATCH v3 2/2] netlink: support setting devgroup parameters
From: jamal @ 2011-01-14 15:10 UTC (permalink / raw)
  To: Vlad Dogaru; +Cc: netdev, Octavian Purdila
In-Reply-To: <1294997911-13866-3-git-send-email-ddvlad@rosedu.org>

On Fri, 2011-01-14 at 11:38 +0200, Vlad Dogaru wrote:
> If a rtnetlink request specifies a negative or zero ifindex and has no
> interface name attribute, but has a group attribute, then the chenges
> are made to all the interfaces belonging to the specified group.
> 
> Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>

Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>


cheers,
jamal


^ permalink raw reply

* [PATCH v6 08/10] ARM: mxs: add ocotp read function
From: Shawn Guo @ 2011-01-14 15:11 UTC (permalink / raw)
  To: davem, gerg, baruch, eric, bryan.wu, r64343, B32542,
	u.kleine-koenig
  Cc: Shawn Guo
In-Reply-To: <1294297998-26930-1-git-send-email-shawn.guo@freescale.com>

Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
---
 arch/arm/mach-mxs/Kconfig               |    4 ++
 arch/arm/mach-mxs/Makefile              |    2 +
 arch/arm/mach-mxs/include/mach/common.h |    1 +
 arch/arm/mach-mxs/ocotp.c               |   90 +++++++++++++++++++++++++++++++
 4 files changed, 97 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mach-mxs/ocotp.c

diff --git a/arch/arm/mach-mxs/Kconfig b/arch/arm/mach-mxs/Kconfig
index 8bfc8df..cd2fbdf 100644
--- a/arch/arm/mach-mxs/Kconfig
+++ b/arch/arm/mach-mxs/Kconfig
@@ -2,6 +2,9 @@ if ARCH_MXS
 
 source "arch/arm/mach-mxs/devices/Kconfig"
 
+config MXS_OCOTP
+	bool
+
 config SOC_IMX23
 	bool
 	select CPU_ARM926T
@@ -26,6 +29,7 @@ config MACH_MX28EVK
 	select SOC_IMX28
 	select MXS_HAVE_AMBA_DUART
 	select MXS_HAVE_PLATFORM_FEC
+	select MXS_OCOTP
 	default y
 	help
 	  Include support for MX28EVK platform. This includes specific
diff --git a/arch/arm/mach-mxs/Makefile b/arch/arm/mach-mxs/Makefile
index 39d3f9c..df501a8 100644
--- a/arch/arm/mach-mxs/Makefile
+++ b/arch/arm/mach-mxs/Makefile
@@ -1,6 +1,8 @@
 # Common support
 obj-y := clock.o devices.o gpio.o icoll.o iomux.o system.o timer.o
 
+obj-$(CONFIG_MXS_OCOTP) += ocotp.o
+
 obj-$(CONFIG_SOC_IMX23) += clock-mx23.o mm-mx23.o
 obj-$(CONFIG_SOC_IMX28) += clock-mx28.o mm-mx28.o
 
diff --git a/arch/arm/mach-mxs/include/mach/common.h b/arch/arm/mach-mxs/include/mach/common.h
index 59133eb..635bb5d 100644
--- a/arch/arm/mach-mxs/include/mach/common.h
+++ b/arch/arm/mach-mxs/include/mach/common.h
@@ -13,6 +13,7 @@
 
 struct clk;
 
+extern const u32 *mxs_get_ocotp(void);
 extern int mxs_reset_block(void __iomem *);
 extern void mxs_timer_init(struct clk *, int);
 
diff --git a/arch/arm/mach-mxs/ocotp.c b/arch/arm/mach-mxs/ocotp.c
new file mode 100644
index 0000000..65157a3
--- /dev/null
+++ b/arch/arm/mach-mxs/ocotp.c
@@ -0,0 +1,90 @@
+/*
+ * Copyright 2010 Freescale Semiconductor, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/mutex.h>
+
+#include <mach/mxs.h>
+
+#define OCOTP_WORD_OFFSET		0x20
+#define OCOTP_WORD_COUNT		0x20
+
+#define BM_OCOTP_CTRL_BUSY		(1 << 8)
+#define BM_OCOTP_CTRL_ERROR		(1 << 9)
+#define BM_OCOTP_CTRL_RD_BANK_OPEN	(1 << 12)
+
+static DEFINE_MUTEX(ocotp_mutex);
+static u32 ocotp_words[OCOTP_WORD_COUNT];
+
+const u32 *mxs_get_ocotp(void)
+{
+	void __iomem *ocotp_base = MXS_IO_ADDRESS(MXS_OCOTP_BASE_ADDR);
+	int timeout = 0x400;
+	size_t i;
+	static int once = 0;
+
+	if (once)
+		return ocotp_words;
+
+	mutex_lock(&ocotp_mutex);
+
+	/*
+	 * clk_enable(hbus_clk) for ocotp can be skipped
+	 * as it must be on when system is running.
+	 */
+
+	/* try to clear ERROR bit */
+	__mxs_clrl(BM_OCOTP_CTRL_ERROR, ocotp_base);
+
+	/* check both BUSY and ERROR cleared */
+	while ((__raw_readl(ocotp_base) &
+		(BM_OCOTP_CTRL_BUSY | BM_OCOTP_CTRL_ERROR)) && --timeout)
+		cpu_relax();
+
+	if (unlikely(!timeout))
+		goto error_unlock;
+
+	/* open OCOTP banks for read */
+	__mxs_setl(BM_OCOTP_CTRL_RD_BANK_OPEN, ocotp_base);
+
+	/* approximately wait 32 hclk cycles */
+	udelay(1);
+
+	/* poll BUSY bit becoming cleared */
+	timeout = 0x400;
+	while ((__raw_readl(ocotp_base) & BM_OCOTP_CTRL_BUSY) && --timeout)
+		cpu_relax();
+
+	if (unlikely(!timeout))
+		goto error_unlock;
+
+	for (i = 0; i < OCOTP_WORD_COUNT; i++)
+		ocotp_words[i] = __raw_readl(ocotp_base + OCOTP_WORD_OFFSET +
+						i * 0x10);
+
+	/* close banks for power saving */
+	__mxs_clrl(BM_OCOTP_CTRL_RD_BANK_OPEN, ocotp_base);
+
+	once = 1;
+
+	mutex_unlock(&ocotp_mutex);
+
+	return ocotp_words;
+
+error_unlock:
+	mutex_unlock(&ocotp_mutex);
+	pr_err("%s: timeout in reading OCOTP\n", __func__);
+	return NULL;
+}
-- 
1.7.1



^ permalink raw reply related

* Re: STMMAC driver: NFS Problem on 2.6.37
From: Chuck Lever @ 2011-01-14 15:35 UTC (permalink / raw)
  To: deepaksi
  Cc: Armando VISCONTI,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Shiraz HASHIM,
	Viresh KUMAR, Peppe CAVALLARO, Amit GOEL
In-Reply-To: <4D301DD1.9070104-qxv4g6HH51o@public.gmane.org>


On Jan 14, 2011, at 4:56 AM, deepaksi wrote:

> The nfs work fine when you use the dhcp server, but still it is a
> totally different proposition. The problem comes when we assign a static
> IP in the bootargs, and then try to boot the system using NFS. here is
> the bootargs setting at the u-boot.
> 
> bootargs = console=ttyAMA0,115200 mem=128M root=/dev/nfs
> nfsroot=192.168.1.1:/opt/STM/STLinux-2.3/devkit/arm/target
> ip=192.168.1.10:192.168.1.1:192.168.1.1:255.255.255.0::eth0
> 
> These settings have been working fine for earlier kernels. The tcp dump
> while doing the nfs boot is quite different between the 2.6.32 and
> 2.6.37 kernels.

Have you tried any kernel versions between 2.6.32 and 2.6.37?  I made significant changes in 2.6.36, so it would be helpful to know when this stopped working.

Also, in 2.6.36, I added a new debugging feature for NFSROOT.  See "nfsrootdebug" description in Documentation/filesystems/nfs/nfsroot.txt.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [net-next-2.6 PATCH v7 1/2] net: implement mechanism for HW based QOS
From: John Fastabend @ 2011-01-14 16:01 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem@davemloft.net, jarkao2@gmail.com, hadi@cyberus.ca,
	eric.dumazet@gmail.com, shemminger@vyatta.com,
	tgraf@infradead.org, nhorman@tuxdriver.com,
	netdev@vger.kernel.org
In-Reply-To: <1294957832.3946.61.camel@bwh-desktop>

On 1/13/2011 2:30 PM, Ben Hutchings wrote:
> I'm actually having a go at implementing this, as a result of which I
> have some more questions:
> 

Nice.

> On Fri, 2011-01-07 at 14:45 -0800, John Fastabend wrote:
> [...]
>> With the mechanism in this patch users can set skb priority using
>> expected methods ie setsockopt() or the stack can set the priority
>> directly. Then the skb will be steered to the correct tx queues
>> aligned with hardware QOS traffic classes. In the normal case with
>> a single traffic class and all queues in this class everything
>> works as is until the LLD enables multiple tcs.
>>
>> To steer the skb we mask out the lower 4 bits of the priority
>> and allow the hardware to configure upto 15 distinct classes
>> of traffic. This is expected to be sufficient for most applications
>> at any rate it is more then the 8021Q spec designates and is
>> equal to the number of prio bands currently implemented in
>> the default qdisc.
> 
> What is the meaning of a class number?  Is it simply higher number =>
> higher priority?  If there are exactly 8 classes, can they be assumed to
> match the 802.1q priority classes?

The exact meaning will depend on how the hardware is configured. With DCB
configured the class number is going to have a different meaning then when
DCB is disabled. Paring any specific configuration using 'strict priority'
should be expected. This would simply be higher number => higher priority.

Merely having 8 classes should not be assumed to match the 802.1q priority
classes. As soon as you say 802.1q priority classes I expect to see 802.1q
priority tagged frames. This patch doesn't enforce any sort of tagging.
However if you set up 8 traffic classes and built a qos egress map with
'ip link' that matched then this would be fair to say. I intentionally
tried to leave this up to user space policy to configure no reason to try
and enforce configurations IMHO. Notice you could go wrong with this and
create configurations that probably make little sense but again I think this
is a user space problem and at least in the DCB case we can have 'lldpad'
manage this.

By the way defaulting to 'strict priority' is the 802.1Q spec suggestion
for bridges that do not support the credit-based shaper algorithm. And the
default mapping for 8 traffic classes is priorities map 1:1 with classes
so this all lines up to get what you assumed.

> 
>> This in conjunction with a userspace application such as
>> lldpad can be used to implement 8021Q transmission selection
>> algorithms one of these algorithms being the extended transmission
>> selection algorithm currently being used for DCB.
> 
> Should an Ethernet driver/hardware insert a 802.1q priority tag if it
> implements this, or should that be left to higher levels?

I am not so sure about this. I tend to like using the mechanisms already
in place, the egress qos mapping, to add 802.1q priority tag. But it looks
like some drivers add the tags explicitly. And also there appears to be
some expectation that the 802.1q priority tag is added to all packets even
those sent on non-vlan devices at least in the DCB case.

> 
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>>
>>  include/linux/netdevice.h |   65 +++++++++++++++++++++++++++++++++++++++++++++
>>  net/core/dev.c            |   61 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 125 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 0f6b1c9..b1dbbed 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
> [...]
>> @@ -756,6 +764,7 @@ struct xps_dev_maps {
>>   * int (*ndo_set_vf_port)(struct net_device *dev, int vf,
>>   *			  struct nlattr *port[]);
>>   * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
>> + * void (*ndo_setup_tc)(struct net_device *dev, u8 tc, unsigned int txq)
> [...]
> 
> This is not documentation, it is just repetition!
> 

Point taken. It does seem to be the trend though.

> Please specify what the parameters mean.  In particular, what is the
> purpose of the txq parameter; can it be different from
> dev->real_num_tx_queues?

'tc' is the number of traffic classes to enable and 'txq' is the number
of queues in use. This should be dev->real_num_tx_queues except when
called from netif_set_real_num_tx_queues where the new number of queues
is used. This is done to prevent steering skb's to rings that may no
longer exist.

> 
> Is this operation allowed to change the number of TX queues?  I was
> looking at scaling the number of queues according to the number of
> classes.


As is no. Changing the number of TX queues can't be done cleanly because
calling netif_set_real_num_tx_queues() calls ndo_tc_setup(). I was
expecting other netlink messages would change the tx queue numbers.

However I think I like this better... In order to make this possible
I can remove the ndo_tc_setup() call from netif_set_real_num_tc_queues().
Then the 'txq' param can be removed making this call a bit neater as well.
The downside of this is drivers will need to ensure that calling
netif_set_real_num_tx_queues() will not break the mapping otherwise
the mapping gets invalidated. I'll have to see but this might be OK.

Thanks,
John.


> 
> Ben.
> 


^ permalink raw reply

* Re: Kernel 2.6.37-git10 build failure: cassini.c
From: Anca Emanuel @ 2011-01-14 16:02 UTC (permalink / raw)
  To: LKML
  Cc: netdev, Grant Likely, David S. Miller, Eric Dumazet, Joe Perches,
	Christoph Egger, Jiri Pirko
In-Reply-To: <AANLkTin61iqrD0Jbax+cvQ1wxD8b=c4KJv7pBSvU7kbv@mail.gmail.com>

On Fri, Jan 14, 2011 at 10:09 AM, Anca Emanuel <anca.emanuel@gmail.com> wrote:
> drivers/net/cassini.c: In function ‘cas_get_vpd_info’:
> drivers/net/cassini.c:3358: error: implicit declaration of function
> ‘of_get_property’
> drivers/net/cassini.c:3358: warning: assignment makes pointer from
> integer without a cast
> drivers/net/cassini.c: In function ‘cas_init_one’:
> drivers/net/cassini.c:5035: error: implicit declaration of function
> ‘pci_device_to_OF_node’
> drivers/net/cassini.c:5035: warning: assignment makes pointer from
> integer without a cast
> make[3]: *** [drivers/net/cassini.o] Error 1
> make[2]: *** [drivers/net] Error 2
>
>
> The first part can be solved with:
>
> diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c
> index 7206ab2..fe3795e 100644
> --- a/drivers/net/cassini.c
> +++ b/drivers/net/cassini.c
> @@ -94,6 +94,7 @@
>  #include <linux/tcp.h>
>  #include <linux/mutex.h>
>  #include <linux/firmware.h>
> +#include <linux/of.h>
>
>  #include <net/checksum.h>
>

The second part it's a little bit harder, because:
http://lxr.linux.no/linux+v2.6.37/+search=pci_device_to_OF_node
there is no generic function, only microblaze, powerpc and sparc are
there as users.

Still not solved in git11.
I'm testing this from x86, Ubuntu 10.10 standard config.

^ permalink raw reply

* Re: [PATCH] r8169: keep firmware in memory.
From: Linus Torvalds @ 2011-01-14 16:05 UTC (permalink / raw)
  To: Michael Tokarev, David Woodhouse, Johannes Berg,
	Greg Kroah-Hartman, Rafael J. Wysocki
  Cc: Francois Romieu, David Miller, netdev, Jarek Kamiński, Hayes,
	Ben Hutchings
In-Reply-To: <4D2FF29C.1050101@msgid.tls.msk.ru>

[ background for new people in discussion: once more, a driver resume
didn't get a working firmware load. ]

On Thu, Jan 13, 2011 at 10:52 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> Given all this I think this is somewhat clumsy still.  How
> does other NIC drivers handles this situation - e.g. tg3?
> Maybe this needs to be a generic solution instead of per-driver?

We've had this bug several times - and not just for network drivers
either. It's almost universal that drivers just load their firmware at
initialization, and then don't realize that they need to do it at
resume too and have no filesystems accessible. I think the firmware
access interface is partly to blame, because I suspect both the
documentation and the code (and probably any examples: as mentioned,
this is very common) tends to be about that initial one-shot "bring
the device up" rather than thinking about "oh, resume is a bootup too,
and unlike initial boot, it needs to come up in a configured state".

Maybe the firmware loader could be made more useful to automatically
handle the caching (it already knows about built-in firmware) for the
case where CONFIG_PM is enabled. So that drivers _could_ just
basically do "request_firmware()" in their resume functions, and it
would get satisfied from RAM for the re-request case.

Or maybe we could have some honking big comments, at least ;)

Added some device core/fw-class people to the discussion for comments.
Clearly the fw loading _works_, but it does end up being a common bug
that it's messed up at resume. This time it was the "oh, I moved the
firmware loading around so now it's done at a different stage", but
quite often it's literally just "oh, it never worked, I had just
forgotten about resume".

                       Linus

^ permalink raw reply

* Re: [PATCH v3 0/2] net: add device groups
From: Stephen Hemminger @ 2011-01-14 16:08 UTC (permalink / raw)
  To: Vlad Dogaru; +Cc: netdev, jamal, Octavian Purdila
In-Reply-To: <1294997911-13866-1-git-send-email-ddvlad@rosedu.org>

On Fri, 14 Jan 2011 11:38:29 +0200
Vlad Dogaru <ddvlad@rosedu.org> wrote:

> This patchset implements network device grouping and simple manipulation
> of groups. Netlink has been updated to provide group information and
> means of applying changes to members of a specific group via a single
> message.
> 
> The patchset has a corresponding one for iproute2, which implements the
> new functionality in userspace.
> 
> Some basic testing, using 1024 dummy network interfaces (all of them in
> group 0):
> 
> # time { for i in `seq 0 1023`; do ip l s dev dummy$i up; done }
> 
> real	0m7.70s
> user	0m0.36s
> sys	0m4.85s
> 
> # time ip l s devgroup 0 up
> 
> real	0m0.14s
> user	0m0.00s
> sys	0m0.14s
> 
> # time { for i in `seq 0 1023`; do ip l s dev dummy$i mtu 2000; done }
> 
> real	0m7.43s
> user	0m0.48s
> sys	0m4.72s
> 
> # time ip l s devgroup 0 mtu 2000
> 
> real	0m0.02s
> user	0m0.00s
> sys	0m0.02s
> 
> Improvement stems both from less user-kernel communication and from less
> processes being created.
> 
> Changes since version 2:
>  * fix net_device field type (should be int, not unsigned int).
>  * fix typo in commit message.
> 
> Changes since version 1:
>  * we avoid adding a new attribute type by using the following
>    convention: if no device name is specified, the interface index is
>    negative, and there is a group specified, we change parameters for
>    the whole group.
>  * the dummy module is no longer modified to include an initial group
>    for the devices it creates. The user is responsible for moving them
>    to a different group by means of the provided netlink interface.
> 
> Vlad Dogaru (2):
>   net_device: add support for network device groups
>   netlink: support setting devgroup parameters
> 
>  include/linux/if_link.h   |    1 +
>  include/linux/netdevice.h |    7 +++++++
>  net/core/dev.c            |   12 ++++++++++++
>  net/core/rtnetlink.c      |   38 ++++++++++++++++++++++++++++++++++----
>  4 files changed, 54 insertions(+), 4 deletions(-)

What about a read/write sysfs interface as well?
  /sys/class/net/eth0/devgroup

Not sure if numeric devgroup is best choice. Since this is more of
a human interface parameter maybe it should be a string? Or have
a translation in the utilities /etc/iproute2/devgroup?



-- 

^ permalink raw reply

* [PATCH] net_sched: accurate bytes/packets stats/rates
From: Eric Dumazet @ 2011-01-14 16:16 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Patrick McHardy, Stephen Hemminger, jamal,
	Jarek Poplawski

In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (over estimated)

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts dont have
effect on dequeue() rate.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: jamal <hadi@cyberus.ca>
---
 include/net/sch_generic.h |   14 ++++++++------
 net/sched/sch_cbq.c       |    3 +--
 net/sched/sch_drr.c       |    2 +-
 net/sched/sch_dsmark.c    |    2 +-
 net/sched/sch_fifo.c      |    2 +-
 net/sched/sch_generic.c   |    2 +-
 net/sched/sch_hfsc.c      |    2 +-
 net/sched/sch_htb.c       |   12 +++++-------
 net/sched/sch_multiq.c    |    2 +-
 net/sched/sch_netem.c     |    3 +--
 net/sched/sch_prio.c      |    2 +-
 net/sched/sch_red.c       |   11 ++++++-----
 net/sched/sch_sfq.c       |    5 ++---
 net/sched/sch_tbf.c       |    2 +-
 net/sched/sch_teql.c      |    3 ++-
 15 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index e9eee99..8fed414 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -445,7 +445,6 @@ static inline int __qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
 {
 	__skb_queue_tail(list, skb);
 	sch->qstats.backlog += qdisc_pkt_len(skb);
-	qdisc_bstats_update(sch, skb);
 
 	return NET_XMIT_SUCCESS;
 }
@@ -456,25 +455,28 @@ static inline int qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch)
 }
 
 static inline struct sk_buff *__qdisc_dequeue_head(struct Qdisc *sch,
-						   struct sk_buff_head *list)
+						   struct sk_buff_head *list,
+						   bool stats_update)
 {
 	struct sk_buff *skb = __skb_dequeue(list);
 
-	if (likely(skb != NULL))
+	if (likely(skb != NULL)) {
+		if (stats_update)
+			qdisc_bstats_update(sch, skb);
 		sch->qstats.backlog -= qdisc_pkt_len(skb);
-
+	}
 	return skb;
 }
 
 static inline struct sk_buff *qdisc_dequeue_head(struct Qdisc *sch)
 {
-	return __qdisc_dequeue_head(sch, &sch->q);
+	return __qdisc_dequeue_head(sch, &sch->q, true);
 }
 
 static inline unsigned int __qdisc_queue_drop_head(struct Qdisc *sch,
 					      struct sk_buff_head *list)
 {
-	struct sk_buff *skb = __qdisc_dequeue_head(sch, list);
+	struct sk_buff *skb = __qdisc_dequeue_head(sch, list, false);
 
 	if (likely(skb != NULL)) {
 		unsigned int len = qdisc_pkt_len(skb);
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index c80d1c2..5f63ec5 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -390,7 +390,6 @@ cbq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	ret = qdisc_enqueue(skb, cl->q);
 	if (ret == NET_XMIT_SUCCESS) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 		cbq_mark_toplevel(q, cl);
 		if (!cl->next_alive)
 			cbq_activate_class(cl);
@@ -649,7 +648,6 @@ static int cbq_reshape_fail(struct sk_buff *skb, struct Qdisc *child)
 		ret = qdisc_enqueue(skb, cl->q);
 		if (ret == NET_XMIT_SUCCESS) {
 			sch->q.qlen++;
-			qdisc_bstats_update(sch, skb);
 			if (!cl->next_alive)
 				cbq_activate_class(cl);
 			return 0;
@@ -971,6 +969,7 @@ cbq_dequeue(struct Qdisc *sch)
 
 		skb = cbq_dequeue_1(sch);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index de55e64..6b7fe4a 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -376,7 +376,6 @@ static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	}
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 
 	sch->q.qlen++;
 	return err;
@@ -403,6 +402,7 @@ static struct sk_buff *drr_dequeue(struct Qdisc *sch)
 			skb = qdisc_dequeue_peeked(cl->qdisc);
 			if (cl->qdisc->q.qlen == 0)
 				list_del(&cl->alist);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c
index 60f4bdd..0f7bf3f 100644
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -260,7 +260,6 @@ static int dsmark_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		return err;
 	}
 
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -283,6 +282,7 @@ static struct sk_buff *dsmark_dequeue(struct Qdisc *sch)
 	if (skb == NULL)
 		return NULL;
 
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	index = skb->tc_index & (p->indices - 1);
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index aa4d633..f2557e0 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -53,7 +53,7 @@ static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 		return qdisc_enqueue_tail(skb, sch);
 
 	/* queue full, remove one skb to fulfill the limit */
-	skb_head = qdisc_dequeue_head(sch);
+	skb_head = __qdisc_dequeue_head(sch, &sch->q, false);
 	sch->qstats.drops++;
 	kfree_skb(skb_head);
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 34dc598..eeb6343 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -467,7 +467,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc* qdisc)
 
 	if (likely(band >= 0)) {
 		struct sk_buff_head *list = band2list(priv, band);
-		struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list);
+		struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list, true);
 
 		qdisc->q.qlen--;
 		if (skb_queue_empty(list))
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 2e45791..14a799d 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1600,7 +1600,6 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		set_active(cl, qdisc_pkt_len(skb));
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -1666,6 +1665,7 @@ hfsc_dequeue(struct Qdisc *sch)
 	}
 
 	sch->flags &= ~TCQ_F_THROTTLED;
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	return skb;
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 984c1b0..fc12fe6 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -574,7 +574,6 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -842,7 +841,7 @@ next:
 
 static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 {
-	struct sk_buff *skb = NULL;
+	struct sk_buff *skb;
 	struct htb_sched *q = qdisc_priv(sch);
 	int level;
 	psched_time_t next_event;
@@ -851,6 +850,8 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 	/* try to dequeue direct packets as high prio (!) to minimize cpu work */
 	skb = __skb_dequeue(&q->direct_queue);
 	if (skb != NULL) {
+ok:
+		qdisc_bstats_update(sch, skb);
 		sch->flags &= ~TCQ_F_THROTTLED;
 		sch->q.qlen--;
 		return skb;
@@ -884,11 +885,8 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 			int prio = ffz(m);
 			m |= 1 << prio;
 			skb = htb_dequeue_tree(q, prio, level);
-			if (likely(skb != NULL)) {
-				sch->q.qlen--;
-				sch->flags &= ~TCQ_F_THROTTLED;
-				goto fin;
-			}
+			if (likely(skb != NULL))
+				goto ok;
 		}
 	}
 	sch->qstats.overlimits++;
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index 21f13da..436a2e7 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -83,7 +83,6 @@ multiq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -112,6 +111,7 @@ static struct sk_buff *multiq_dequeue(struct Qdisc *sch)
 			qdisc = q->queues[q->curband];
 			skb = qdisc->dequeue(qdisc);
 			if (skb) {
+				qdisc_bstats_update(sch, skb);
 				sch->q.qlen--;
 				return skb;
 			}
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 1c4bce8..6a3006b 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -240,7 +240,6 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	if (likely(ret == NET_XMIT_SUCCESS)) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 	} else if (net_xmit_drop_count(ret)) {
 		sch->qstats.drops++;
 	}
@@ -289,6 +288,7 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 				skb->tstamp.tv64 = 0;
 #endif
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
@@ -476,7 +476,6 @@ static int tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
 		__skb_queue_after(list, skb, nskb);
 
 		sch->qstats.backlog += qdisc_pkt_len(nskb);
-		qdisc_bstats_update(sch, nskb);
 
 		return NET_XMIT_SUCCESS;
 	}
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 966158d..fbd710d 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -84,7 +84,6 @@ prio_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -116,6 +115,7 @@ static struct sk_buff *prio_dequeue(struct Qdisc* sch)
 		struct Qdisc *qdisc = q->queues[prio];
 		struct sk_buff *skb = qdisc->dequeue(qdisc);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index a6009c5..9f98dbd 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -94,7 +94,6 @@ static int red_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 
 	ret = qdisc_enqueue(skb, child);
 	if (likely(ret == NET_XMIT_SUCCESS)) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 	} else if (net_xmit_drop_count(ret)) {
 		q->stats.pdrop++;
@@ -114,11 +113,13 @@ static struct sk_buff * red_dequeue(struct Qdisc* sch)
 	struct Qdisc *child = q->qdisc;
 
 	skb = child->dequeue(child);
-	if (skb)
+	if (skb) {
+		qdisc_bstats_update(sch, skb);
 		sch->q.qlen--;
-	else if (!red_is_idling(&q->parms))
-		red_start_of_idle_period(&q->parms);
-
+	} else {
+		if (!red_is_idling(&q->parms))
+			red_start_of_idle_period(&q->parms);
+	}
 	return skb;
 }
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 239ec53..edea8ce 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -402,10 +402,8 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		q->tail = slot;
 		slot->allot = q->scaled_quantum;
 	}
-	if (++sch->q.qlen <= q->limit) {
-		qdisc_bstats_update(sch, skb);
+	if (++sch->q.qlen <= q->limit)
 		return NET_XMIT_SUCCESS;
-	}
 
 	sfq_drop(sch);
 	return NET_XMIT_CN;
@@ -445,6 +443,7 @@ next_slot:
 	}
 	skb = slot_dequeue_head(slot);
 	sfq_dec(q, a);
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 	sch->qstats.backlog -= qdisc_pkt_len(skb);
 
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 77565e7..e931658 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -134,7 +134,6 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -187,6 +186,7 @@ static struct sk_buff *tbf_dequeue(struct Qdisc* sch)
 			q->ptokens = ptoks;
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
+			qdisc_bstats_update(sch, skb);
 			return skb;
 		}
 
diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c
index af9360d..2ecd9af 100644
--- a/net/sched/sch_teql.c
+++ b/net/sched/sch_teql.c
@@ -83,7 +83,6 @@ teql_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 
 	if (q->q.qlen < dev->tx_queue_len) {
 		__skb_queue_tail(&q->q, skb);
-		qdisc_bstats_update(sch, skb);
 		return NET_XMIT_SUCCESS;
 	}
 
@@ -107,6 +106,8 @@ teql_dequeue(struct Qdisc* sch)
 			dat->m->slaves = sch;
 			netif_wake_queue(m);
 		}
+	} else {
+		qdisc_bstats_update(sch, skb);
 	}
 	sch->q.qlen = dat->q.qlen + dat_queue->qdisc->q.qlen;
 	return skb;



^ permalink raw reply related

* Re: [PATCH resend] Bluetooth: l2cap: fix misuse of logical operation in place of bitop
From: Gustavo F. Padovan @ 2011-01-14 16:30 UTC (permalink / raw)
  To: David Sterba
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	marcel-kz+m5ild9QBg9hUCZPvPmw,
	linux-bluetooth-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, João Paulo Rechi Vita
In-Reply-To: <1295013584-15592-1-git-send-email-dsterba-AlSwsSmVLrQ@public.gmane.org>

Hi all,

* David Sterba <dsterba-AlSwsSmVLrQ@public.gmane.org> [2011-01-14 14:59:44 +0100]:

> CC: Marcel Holtmann <marcel-kz+m5ild9QBg9hUCZPvPmw@public.gmane.org>
> CC: "Gustavo F. Padovan" <padovan-Y3ZbgMPKUGA34EUeqzHoZw@public.gmane.org>
> CC: João Paulo Rechi Vita <jprvita-Y3ZbgMPKUGA34EUeqzHoZw@public.gmane.org>
> Signed-off-by: David Sterba <dsterba-AlSwsSmVLrQ@public.gmane.org>
> ---
> 
> Andrew, this has not been picked up by maintainers since 27.12., please consider it for -mm.
> 
>  net/bluetooth/l2cap.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)

I'm taking care of this. Sorry for the delay, I've been away a bit.
Patch is now applied to Bluetooth tree. Thanks.

-- 
Gustavo F. Padovan
http://profusion.mobi

^ permalink raw reply

* Re: [PATCH] r8169: keep firmware in memory.
From: Ben Hutchings @ 2011-01-14 16:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Tokarev, David Woodhouse, Johannes Berg,
	Greg Kroah-Hartman, Rafael J. Wysocki, Francois Romieu,
	David Miller, netdev, Jarek Kamiński, Hayes
In-Reply-To: <AANLkTikiGeSfae3guWRMp_hsh61JibacPpwGZVbgSy7h@mail.gmail.com>

On Fri, Jan 14, 2011 at 08:05:22AM -0800, Linus Torvalds wrote:
> [ background for new people in discussion: once more, a driver resume
> didn't get a working firmware load. ]
> 
> On Thu, Jan 13, 2011 at 10:52 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> >
> > Given all this I think this is somewhat clumsy still.  How
> > does other NIC drivers handles this situation - e.g. tg3?
> > Maybe this needs to be a generic solution instead of per-driver?
> 
> We've had this bug several times - and not just for network drivers
> either.
[...]
> Maybe the firmware loader could be made more useful to automatically
> handle the caching (it already knows about built-in firmware) for the
> case where CONFIG_PM is enabled. So that drivers _could_ just
> basically do "request_firmware()" in their resume functions, and it
> would get satisfied from RAM for the re-request case.
[...]
 
This is something I started to implement, but never got finished.  I
don't think it can be done without an API change, though, as we need
to know when to drop firmware from the cache.  But perhaps this could
be done with a hook in the device-driver binding code.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus

^ permalink raw reply

* Re: [PATCH] r8169: keep firmware in memory.
From: Linus Torvalds @ 2011-01-14 17:21 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Michael Tokarev, David Woodhouse, Johannes Berg,
	Greg Kroah-Hartman, Rafael J. Wysocki, Francois Romieu,
	David Miller, netdev, Jarek Kamiński, Hayes
In-Reply-To: <20110114163035.GY3702@decadent.org.uk>

On Fri, Jan 14, 2011 at 8:30 AM, Ben Hutchings <benh@debian.org> wrote:
>
> This is something I started to implement, but never got finished.  I
> don't think it can be done without an API change, though, as we need
> to know when to drop firmware from the cache.  But perhaps this could
> be done with a hook in the device-driver binding code.

Or just associate the firmware with a module?

So if the firmware gets loaded, it stays in memory until the module is unloaded?

And this all would only be the case if CONFIG_PM is set, so you'd not
waste memory unnecessarily.

                Linus

^ permalink raw reply

* [PATCH] can: test size of struct sockaddr
From: Kurt Van Dijck @ 2011-01-14 17:23 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w

I think this patch makes the CAN socket code comform to the
manpages of sendmsg & recvmsg.

Signed-off-by: Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>
---
 net/can/bcm.c |    6 +++++-
 net/can/raw.c |    6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index 6faa825..dc0d5d6 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1256,6 +1256,9 @@ static int bcm_sendmsg(struct kiocb *iocb, struct socket *sock,
 		struct sockaddr_can *addr =
 			(struct sockaddr_can *)msg->msg_name;
 
+		if (msg->msg_namelen < sizeof(*addr))
+			return -EINVAL;
+
 		if (addr->can_family != AF_CAN)
 			return -EINVAL;
 
@@ -1557,8 +1560,9 @@ static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
 	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (msg->msg_name) {
+		memcpy(msg->msg_name, skb->cb, MIN(msg->msg_namelen,
+					sizeof(struct sockaddr_can)));
 		msg->msg_namelen = sizeof(struct sockaddr_can);
-		memcpy(msg->msg_name, skb->cb, msg->msg_namelen);
 	}
 
 	skb_free_datagram(sk, skb);
diff --git a/net/can/raw.c b/net/can/raw.c
index e88f610..e68a6d3 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -649,6 +649,9 @@ static int raw_sendmsg(struct kiocb *iocb, struct socket *sock,
 		struct sockaddr_can *addr =
 			(struct sockaddr_can *)msg->msg_name;
 
+		if (msg->msg_namelen < sizeof(*addr))
+			return -EINVAL;
+
 		if (addr->can_family != AF_CAN)
 			return -EINVAL;
 
@@ -727,8 +730,9 @@ static int raw_recvmsg(struct kiocb *iocb, struct socket *sock,
 	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (msg->msg_name) {
+		memcpy(msg->msg_name, skb->cb, MIN(msg->msg_namelen,
+					sizeof(struct sockaddr_can)));
 		msg->msg_namelen = sizeof(struct sockaddr_can);
-		memcpy(msg->msg_name, skb->cb, msg->msg_namelen);
 	}
 
 	/* assign the flags that have been recorded in raw_rcv() */

^ permalink raw reply related

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Jesse Gross @ 2011-01-14 17:49 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Michael Leun, Michael Chan, Eric Dumazet, David Miller,
	Ben Greear, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <20110114011533.GA29723@mcarlson.broadcom.com>

On Thu, Jan 13, 2011 at 8:15 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> On Thu, Jan 13, 2011 at 01:58:40PM -0800, Jesse Gross wrote:
>> On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> > On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
>> >> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
>> >> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
>> >> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
>> >> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
>> >> >> >> <lkml20101129@newton.leun.net> wrote:
>> >> >> >> > OK - all tests done on that DL320G5:
>> >> >> >> >
>> >> >> >> > For completeness, 2.6.37-rc5 unpatched:
>> >> >> >> >
>> >> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
>> >> >> >> > without tag, single or untagged packets missing at all
>> >> >> >>
>> >> >> >> Random behavior? ?This one is somewhat hard to explain - maybe there
>> >> >> >> are some other factors. ?eth0 has ASF on, so it always strips tags. ?I
>> >> >> >> would expect it to behave like the vlan configured case.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
>> >> >> >> Missing tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > eth1 same as originally reported:
>> >> >> >> > without vlan configured see vlan tags (single and double tagged as
>> >> >> >> > expected)
>> >> >> >>
>> >> >> >> No ASF and no vlan group means tag stripping is disabled. ?Have tag.
>> >> >> >>
>> >> >> >> > with vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> Configuring vlan group causes stripping to be enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
>> >> >> >> >
>> >> >> >> > eth0, no vlan configured: ?see packets without vlan tag (see double
>> >> >> >> > tagged packets with one vlan tag)
>> >> >> >>
>> >> >> >> ASF enables tag stripping. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
>> >> >> >> > expected)
>> >> >> >>
>> >> >> >> No ASF, no vlan group means no stripping. ?Have tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > eth0, vlan configured: as without vlan
>> >> >> >>
>> >> >> >> ASF enables stripping. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1, vlan configured: as without vlan
>> >> >> >>
>> >> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
>> >> >> >> stripping. ?Have tag.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
>> >> >> >> >
>> >> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
>> >> >> >> > packets with one vlan tag)
>> >> >> >>
>> >> >> >> With the second patch, vlan stripping is always enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> > eth1 with vlan: the same
>> >> >> >>
>> >> >> >> Stripping still always enabled. ?Missing tag.
>> >> >> >>
>> >> >> >> The bottom line is whenever vlan stripping is enabled we're missing
>> >> >> >> the outer tag. ?It might be worth adding some debugging in the area
>> >> >> >> before napi_gro_receive/vlan_gro_receive (depending on version). ?My
>> >> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
>> >> >> >> vlan packets on this NIC.
>> >> >> >>
>> >> >> >> You said that everything works on the 5752? ?Matt, is it possible that
>> >> >> >> the 5714 either has a problem with vlan stripping or a different way
>> >> >> >> of reporting it?
>> >> >> >
>> >> >> > I don't think this is a 5714 specific issue. ?I think the problem is
>> >> >> > rooted in the fact that the VLAN tag stripping is enabled.
>> >> >>
>> >> >> It's definitely related to vlan stripping being enabled. ?Other cards
>> >> >> using tg3 seem to work fine with stripping though, which is why I
>> >> >> thought it might be specific to the 5714.
>> >> >
>> >> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
>> >> > today. ?It does the right thing in both cases (2nd tg3 patch ommited /
>> >> > applied). ?The tag is always visible in the packet stream as seen from
>> >> > tcpdump.
>> >> >
>> >> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
>> >> >> >
>> >> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
>> >> >> > stack a VLAN tag is present. ?If this is indeed a reporting problem, I'm
>> >> >> > not sure what else the driver should be doing.
>> >> >>
>> >> >> The code to hand off the tag to the stack looks OK to me. ?Michael was
>> >> >> seeing this on older versions of the kernel as well with this NIC,
>> >> >> which predates both this patch and the larger vlan changes so it
>> >> >> doesn't seem like a problem with passing the tag to the network stack.
>> >> >> ?It's hard to know exactly what is going on though without seeing what
>> >> >> the hardware is reporting.
>> >> >
>> >> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
>> >> > when receiving a packet. ?The driver skips the __vlan_hwaccel_put_tag()
>> >> > call.
>> >> >
>> >> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
>> >> > __vlan_hwaccel_put_tag() is called to reinject the packet.
>> >>
>> >> OK, thanks for testing it out. ?I'm not sure that there's anything
>> >> more we can do without hearing from Michael.
>> >
>> > In the meantime, I think what we have should go upstream. ?Just to be
>> > absolutely clear though, your position is that VLAN tags should always
>> > be stripped?
>>
>> That's what the other converted drivers do by default (though most of
>> them also provide an ethtool set_flags() method to change this).  It's
>> generally the most efficient and is now safe to do in all cases.  It's
>> also the consistent with what was happening before, since stripping
>> was enabled when a vlan device was configured.  So, yes, normally I
>> think stripping should be enabled.
>>
>> I assumed that disabling stripping in most situations was just an
>> oversight.  Was there a reason why you feel it is better not to use
>> it?
>
> Actually, the tg3 driver was trying to disable VLAN tag stripping
> when possible.  I believe this was primarily to support the raw packet
> interface.

Hmm, is this really true?

        if (!tp->vlgrp &&
            !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
                rx_mode |= RX_MODE_KEEP_VLAN_TAG;

So if a vlan device is registered or ASF is enabled we will use
stripping.  That seems like it is using stripping in the common case
when vlans are used.

Before 2.6.37 it was only possible to deliver stripped tags if a vlan
group was configured.  This means that if ASF was enabled and forced
stripping but no group was configured we would lose the tag.  In this
situation tg3 manually reinserts the tags so raw packet capture will
see them, as you say.  However, in the current tree this limitation no
longer exists and it is always possible to deliver stripped tags to
the networking core, which should do the right thing in all
situations.

So I believe the only reason to disable tag stripping is for debugging
or some other special purpose situation.

^ permalink raw reply

* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: Stephen Hemminger @ 2011-01-14 17:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Patrick McHardy, jamal, Jarek Poplawski
In-Reply-To: <1295021808.3937.110.camel@edumazet-laptop>

By using __qdisc_queue_drop_head in sch_fifo.c the stats_update parameter won't be
needed.

>From Eric Dumazet <eric.dumazet@gmail.com>

In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (over estimated)

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts dont have
effect on dequeue() rate.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: jamal <hadi@cyberus.ca>
---
 include/net/sch_generic.h |    8 +++++---
 net/sched/sch_cbq.c       |    3 +--
 net/sched/sch_drr.c       |    2 +-
 net/sched/sch_dsmark.c    |    2 +-
 net/sched/sch_fifo.c      |    6 +-----
 net/sched/sch_hfsc.c      |    2 +-
 net/sched/sch_htb.c       |   12 +++++-------
 net/sched/sch_multiq.c    |    2 +-
 net/sched/sch_netem.c     |    3 +--
 net/sched/sch_prio.c      |    2 +-
 net/sched/sch_red.c       |   11 ++++++-----
 net/sched/sch_sfq.c       |    5 ++---
 net/sched/sch_tbf.c       |    2 +-
 net/sched/sch_teql.c      |    3 ++-
 14 files changed, 29 insertions(+), 34 deletions(-)

--- a/include/net/sch_generic.h	2011-01-14 09:19:00.730849868 -0800
+++ b/include/net/sch_generic.h	2011-01-14 09:47:59.058551676 -0800
@@ -445,7 +445,6 @@ static inline int __qdisc_enqueue_tail(s
 {
 	__skb_queue_tail(list, skb);
 	sch->qstats.backlog += qdisc_pkt_len(skb);
-	qdisc_bstats_update(sch, skb);
 
 	return NET_XMIT_SUCCESS;
 }
@@ -460,8 +459,10 @@ static inline struct sk_buff *__qdisc_de
 {
 	struct sk_buff *skb = __skb_dequeue(list);
 
-	if (likely(skb != NULL))
+	if (likely(skb != NULL)) {
 		sch->qstats.backlog -= qdisc_pkt_len(skb);
+		qdisc_bstats_update(sch, skb);
+	}
 
 	return skb;
 }
@@ -474,10 +475,11 @@ static inline struct sk_buff *qdisc_dequ
 static inline unsigned int __qdisc_queue_drop_head(struct Qdisc *sch,
 					      struct sk_buff_head *list)
 {
-	struct sk_buff *skb = __qdisc_dequeue_head(sch, list);
+	struct sk_buff *skb = __skb_dequeue(list);
 
 	if (likely(skb != NULL)) {
 		unsigned int len = qdisc_pkt_len(skb);
+		sch->qstats.backlog -= len;
 		kfree_skb(skb);
 		return len;
 	}
--- a/net/sched/sch_cbq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_cbq.c	2011-01-14 09:28:20.398631228 -0800
@@ -390,7 +390,6 @@ cbq_enqueue(struct sk_buff *skb, struct
 	ret = qdisc_enqueue(skb, cl->q);
 	if (ret == NET_XMIT_SUCCESS) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 		cbq_mark_toplevel(q, cl);
 		if (!cl->next_alive)
 			cbq_activate_class(cl);
@@ -649,7 +648,6 @@ static int cbq_reshape_fail(struct sk_bu
 		ret = qdisc_enqueue(skb, cl->q);
 		if (ret == NET_XMIT_SUCCESS) {
 			sch->q.qlen++;
-			qdisc_bstats_update(sch, skb);
 			if (!cl->next_alive)
 				cbq_activate_class(cl);
 			return 0;
@@ -971,6 +969,7 @@ cbq_dequeue(struct Qdisc *sch)
 
 		skb = cbq_dequeue_1(sch);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
--- a/net/sched/sch_drr.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_drr.c	2011-01-14 09:28:20.398631228 -0800
@@ -376,7 +376,6 @@ static int drr_enqueue(struct sk_buff *s
 	}
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 
 	sch->q.qlen++;
 	return err;
@@ -403,6 +402,7 @@ static struct sk_buff *drr_dequeue(struc
 			skb = qdisc_dequeue_peeked(cl->qdisc);
 			if (cl->qdisc->q.qlen == 0)
 				list_del(&cl->alist);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
--- a/net/sched/sch_dsmark.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_dsmark.c	2011-01-14 09:28:20.398631228 -0800
@@ -260,7 +260,6 @@ static int dsmark_enqueue(struct sk_buff
 		return err;
 	}
 
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -283,6 +282,7 @@ static struct sk_buff *dsmark_dequeue(st
 	if (skb == NULL)
 		return NULL;
 
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	index = skb->tc_index & (p->indices - 1);
--- a/net/sched/sch_fifo.c	2011-01-09 09:34:32.690685246 -0800
+++ b/net/sched/sch_fifo.c	2011-01-14 09:50:37.082839684 -0800
@@ -46,17 +46,13 @@ static int pfifo_enqueue(struct sk_buff
 
 static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 {
-	struct sk_buff *skb_head;
 	struct fifo_sched_data *q = qdisc_priv(sch);
 
 	if (likely(skb_queue_len(&sch->q) < q->limit))
 		return qdisc_enqueue_tail(skb, sch);
 
 	/* queue full, remove one skb to fulfill the limit */
-	skb_head = qdisc_dequeue_head(sch);
-	sch->qstats.drops++;
-	kfree_skb(skb_head);
-
+	__qdisc_queue_drop_head(sch, &sch->q);
 	qdisc_enqueue_tail(skb, sch);
 
 	return NET_XMIT_CN;
--- a/net/sched/sch_hfsc.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_hfsc.c	2011-01-14 09:28:20.428633918 -0800
@@ -1600,7 +1600,6 @@ hfsc_enqueue(struct sk_buff *skb, struct
 		set_active(cl, qdisc_pkt_len(skb));
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -1666,6 +1665,7 @@ hfsc_dequeue(struct Qdisc *sch)
 	}
 
 	sch->flags &= ~TCQ_F_THROTTLED;
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	return skb;
--- a/net/sched/sch_htb.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_htb.c	2011-01-14 09:28:20.438634799 -0800
@@ -574,7 +574,6 @@ static int htb_enqueue(struct sk_buff *s
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -842,7 +841,7 @@ next:
 
 static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 {
-	struct sk_buff *skb = NULL;
+	struct sk_buff *skb;
 	struct htb_sched *q = qdisc_priv(sch);
 	int level;
 	psched_time_t next_event;
@@ -851,6 +850,8 @@ static struct sk_buff *htb_dequeue(struc
 	/* try to dequeue direct packets as high prio (!) to minimize cpu work */
 	skb = __skb_dequeue(&q->direct_queue);
 	if (skb != NULL) {
+ok:
+		qdisc_bstats_update(sch, skb);
 		sch->flags &= ~TCQ_F_THROTTLED;
 		sch->q.qlen--;
 		return skb;
@@ -884,11 +885,8 @@ static struct sk_buff *htb_dequeue(struc
 			int prio = ffz(m);
 			m |= 1 << prio;
 			skb = htb_dequeue_tree(q, prio, level);
-			if (likely(skb != NULL)) {
-				sch->q.qlen--;
-				sch->flags &= ~TCQ_F_THROTTLED;
-				goto fin;
-			}
+			if (likely(skb != NULL))
+				goto ok;
 		}
 	}
 	sch->qstats.overlimits++;
--- a/net/sched/sch_multiq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_multiq.c	2011-01-14 09:28:20.438634799 -0800
@@ -83,7 +83,6 @@ multiq_enqueue(struct sk_buff *skb, stru
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -112,6 +111,7 @@ static struct sk_buff *multiq_dequeue(st
 			qdisc = q->queues[q->curband];
 			skb = qdisc->dequeue(qdisc);
 			if (skb) {
+				qdisc_bstats_update(sch, skb);
 				sch->q.qlen--;
 				return skb;
 			}
--- a/net/sched/sch_netem.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_netem.c	2011-01-14 09:28:20.438634799 -0800
@@ -240,7 +240,6 @@ static int netem_enqueue(struct sk_buff
 
 	if (likely(ret == NET_XMIT_SUCCESS)) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 	} else if (net_xmit_drop_count(ret)) {
 		sch->qstats.drops++;
 	}
@@ -289,6 +288,7 @@ static struct sk_buff *netem_dequeue(str
 				skb->tstamp.tv64 = 0;
 #endif
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
@@ -476,7 +476,6 @@ static int tfifo_enqueue(struct sk_buff
 		__skb_queue_after(list, skb, nskb);
 
 		sch->qstats.backlog += qdisc_pkt_len(nskb);
-		qdisc_bstats_update(sch, nskb);
 
 		return NET_XMIT_SUCCESS;
 	}
--- a/net/sched/sch_prio.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_prio.c	2011-01-14 09:28:20.438634799 -0800
@@ -84,7 +84,6 @@ prio_enqueue(struct sk_buff *skb, struct
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -116,6 +115,7 @@ static struct sk_buff *prio_dequeue(stru
 		struct Qdisc *qdisc = q->queues[prio];
 		struct sk_buff *skb = qdisc->dequeue(qdisc);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
--- a/net/sched/sch_red.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_red.c	2011-01-14 09:28:20.438634799 -0800
@@ -94,7 +94,6 @@ static int red_enqueue(struct sk_buff *s
 
 	ret = qdisc_enqueue(skb, child);
 	if (likely(ret == NET_XMIT_SUCCESS)) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 	} else if (net_xmit_drop_count(ret)) {
 		q->stats.pdrop++;
@@ -114,11 +113,13 @@ static struct sk_buff * red_dequeue(stru
 	struct Qdisc *child = q->qdisc;
 
 	skb = child->dequeue(child);
-	if (skb)
+	if (skb) {
+		qdisc_bstats_update(sch, skb);
 		sch->q.qlen--;
-	else if (!red_is_idling(&q->parms))
-		red_start_of_idle_period(&q->parms);
-
+	} else {
+		if (!red_is_idling(&q->parms))
+			red_start_of_idle_period(&q->parms);
+	}
 	return skb;
 }
 
--- a/net/sched/sch_sfq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_sfq.c	2011-01-14 09:28:20.438634799 -0800
@@ -402,10 +402,8 @@ sfq_enqueue(struct sk_buff *skb, struct
 		q->tail = slot;
 		slot->allot = q->scaled_quantum;
 	}
-	if (++sch->q.qlen <= q->limit) {
-		qdisc_bstats_update(sch, skb);
+	if (++sch->q.qlen <= q->limit)
 		return NET_XMIT_SUCCESS;
-	}
 
 	sfq_drop(sch);
 	return NET_XMIT_CN;
@@ -445,6 +443,7 @@ next_slot:
 	}
 	skb = slot_dequeue_head(slot);
 	sfq_dec(q, a);
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 	sch->qstats.backlog -= qdisc_pkt_len(skb);
 
--- a/net/sched/sch_tbf.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_tbf.c	2011-01-14 09:28:20.438634799 -0800
@@ -134,7 +134,6 @@ static int tbf_enqueue(struct sk_buff *s
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -187,6 +186,7 @@ static struct sk_buff *tbf_dequeue(struc
 			q->ptokens = ptoks;
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
+			qdisc_bstats_update(sch, skb);
 			return skb;
 		}
 
--- a/net/sched/sch_teql.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_teql.c	2011-01-14 09:28:20.438634799 -0800
@@ -83,7 +83,6 @@ teql_enqueue(struct sk_buff *skb, struct
 
 	if (q->q.qlen < dev->tx_queue_len) {
 		__skb_queue_tail(&q->q, skb);
-		qdisc_bstats_update(sch, skb);
 		return NET_XMIT_SUCCESS;
 	}
 
@@ -107,6 +106,8 @@ teql_dequeue(struct Qdisc* sch)
 			dat->m->slaves = sch;
 			netif_wake_queue(m);
 		}
+	} else {
+		qdisc_bstats_update(sch, skb);
 	}
 	sch->q.qlen = dat->q.qlen + dat_queue->qdisc->q.qlen;
 	return skb;

^ permalink raw reply

* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: Eric Dumazet @ 2011-01-14 18:08 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, netdev, Patrick McHardy, jamal, Jarek Poplawski
In-Reply-To: <20110114095201.4fc58a45@nehalam>

Le vendredi 14 janvier 2011 à 09:52 -0800, Stephen Hemminger a écrit :
> By using __qdisc_queue_drop_head in sch_fifo.c the stats_update parameter won't be
> needed.


Hmm, this is a constant parameter in inline function so removed by
compiler.

And you add a bug in pfifo_tail_enqueue(), it now lacks the
sch->qstats.drops++;

> 
> From Eric Dumazet <eric.dumazet@gmail.com>
> 

Note : This line should be the first one in mail ;)




^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox