Netdev List
* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: Eric Dumazet @ 2011-01-14 18:08 UTC
  To: Stephen Hemminger
  Cc: David Miller, netdev, Patrick McHardy, jamal, Jarek Poplawski
In-Reply-To: <20110114095201.4fc58a45@nehalam>

On Friday 14 January 2011 at 09:52 -0800, Stephen Hemminger wrote:
> By using __qdisc_queue_drop_head in sch_fifo.c the stats_update parameter won't be
> needed.


Hmm, this is a constant parameter in an inline function, so the compiler
removes it.

And you added a bug in pfifo_tail_enqueue(): it now lacks the
sch->qstats.drops++;

> 
> From Eric Dumazet <eric.dumazet@gmail.com>
> 

Note: this line should be the first one in the mail ;)





* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Matt Carlson @ 2011-01-14 18:38 UTC
  To: Jesse Gross
  Cc: Matthew Carlson, Michael Leun, Michael Chan, Eric Dumazet,
	David Miller, Ben Greear, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <AANLkTi=Z4mtV5SUqORivPKPxEGddsmQQ2sVdkeOAogeY@mail.gmail.com>

On Fri, Jan 14, 2011 at 09:49:47AM -0800, Jesse Gross wrote:
> On Thu, Jan 13, 2011 at 8:15 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Thu, Jan 13, 2011 at 01:58:40PM -0800, Jesse Gross wrote:
> >> On Thu, Jan 13, 2011 at 3:50 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> > On Thu, Jan 13, 2011 at 07:06:22AM -0800, Jesse Gross wrote:
> >> >> On Wed, Jan 12, 2011 at 8:21 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> >> > On Thu, Jan 06, 2011 at 08:36:27PM -0800, Jesse Gross wrote:
> >> >> >> On Thu, Jan 6, 2011 at 10:24 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> >> >> >> > On Sat, Dec 18, 2010 at 07:38:00PM -0800, Jesse Gross wrote:
> >> >> >> >> On Tue, Dec 14, 2010 at 11:16 PM, Michael Leun
> >> >> >> >> <lkml20101129@newton.leun.net> wrote:
> >> >> >> >> > OK - all tests done on that DL320G5:
> >> >> >> >> >
> >> >> >> >> > For completeness, 2.6.37-rc5 unpatched:
> >> >> >> >> >
> >> >> >> >> > eth0, no vlan configured: totally broken - see double tagged vlans
> >> >> >> >> > without tag, single or untagged packets missing at all
> >> >> >> >>
> >> >> >> >> Random behavior?  This one is somewhat hard to explain - maybe there
> >> >> >> >> are some other factors.  eth0 has ASF on, so it always strips tags.  I
> >> >> >> >> would expect it to behave like the vlan configured case.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > eth0, vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> Both ASF and vlan group configured cause tag stripping to be enabled.
> >> >> >> >> Missing tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > eth1 same as originally reported:
> >> >> >> >> > without vlan configured see vlan tags (single and double tagged as
> >> >> >> >> > expected)
> >> >> >> >>
> >> >> >> >> No ASF and no vlan group means tag stripping is disabled.  Have tag.
> >> >> >> >>
> >> >> >> >> > with vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> Configuring vlan group causes stripping to be enabled.  Missing tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch:
> >> >> >> >> >
> >> >> >> >> > eth0, no vlan configured: see packets without vlan tag (see double
> >> >> >> >> > tagged packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> ASF enables tag stripping.  Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1, no vlan configured: see vlan tags (single and double tagged as
> >> >> >> >> > expected)
> >> >> >> >>
> >> >> >> >> No ASF, no vlan group means no stripping.  Have tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > eth0, vlan configured: as without vlan
> >> >> >> >>
> >> >> >> >> ASF enables stripping.  Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1, vlan configured: as without vlan
> >> >> >> >>
> >> >> >> >> With this patch vlan stripping is only enabled when ASF is on, so no
> >> >> >> >> stripping.  Have tag.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > 2.6.37-rc5, your tg3 use new vlan-code patch with test patch ontop
> >> >> >> >> >
> >> >> >> >> > eth1 no vlan configured: see packets without vlan tag (see double tagged
> >> >> >> >> > packets with one vlan tag)
> >> >> >> >>
> >> >> >> >> With the second patch, vlan stripping is always enabled.  Missing tag.
> >> >> >> >>
> >> >> >> >> > eth1 with vlan: the same
> >> >> >> >>
> >> >> >> >> Stripping still always enabled.  Missing tag.
> >> >> >> >>
> >> >> >> >> The bottom line is whenever vlan stripping is enabled we're missing
> >> >> >> >> the outer tag.  It might be worth adding some debugging in the area
> >> >> >> >> before napi_gro_receive/vlan_gro_receive (depending on version).  My
> >> >> >> >> guess is that (desc->type_flags & RXD_FLAG_VLAN) is false even for
> >> >> >> >> vlan packets on this NIC.
> >> >> >> >>
> >> >> >> >> You said that everything works on the 5752?  Matt, is it possible that
> >> >> >> >> the 5714 either has a problem with vlan stripping or a different way
> >> >> >> >> of reporting it?
> >> >> >> >
> >> >> >> > I don't think this is a 5714 specific issue.  I think the problem is
> >> >> >> > rooted in the fact that the VLAN tag stripping is enabled.
> >> >> >>
> >> >> >> It's definitely related to vlan stripping being enabled.  Other cards
> >> >> >> using tg3 seem to work fine with stripping though, which is why I
> >> >> >> thought it might be specific to the 5714.
> >> >> >
> >> >> > I just tested this on a 5714S, using a net-next-2.6 snapshot obtained
> >> >> > today.  It does the right thing in both cases (2nd tg3 patch omitted /
> >> >> > applied).  The tag is always visible in the packet stream as seen from
> >> >> > tcpdump.
> >> >> >
> >> >> >> > Your RXD_FLAG_VLAN idea sounds unlikely to me, but it's worth a check.
> >> >> >> >
> >> >> >> > The patch here is using __vlan_hwaccel_put_tag(), which informs the
> >> >> >> > stack a VLAN tag is present.  If this is indeed a reporting problem, I'm
> >> >> >> > not sure what else the driver should be doing.
> >> >> >>
> >> >> >> The code to hand off the tag to the stack looks OK to me.  Michael was
> >> >> >> seeing this on older versions of the kernel as well with this NIC,
> >> >> >> which predates both this patch and the larger vlan changes so it
> >> >> >> doesn't seem like a problem with passing the tag to the network stack.
> >> >> >> It's hard to know exactly what is going on though without seeing what
> >> >> >> the hardware is reporting.
> >> >> >
> >> >> > When RX_MODE_KEEP_VLAN_TAG is set, the RXD_FLAG_VLAN flag will not be set
> >> >> > when receiving a packet.  The driver skips the __vlan_hwaccel_put_tag()
> >> >> > call.
> >> >> >
> >> >> > When RX_MODE_KEEP_VLAN_TAG is unset, the RXD_FLAG_VLAN flag is set, and
> >> >> > __vlan_hwaccel_put_tag() is called to reinject the packet.
> >> >>
> >> >> OK, thanks for testing it out.  I'm not sure that there's anything
> >> >> more we can do without hearing from Michael.
> >> >
> >> > In the meantime, I think what we have should go upstream.  Just to be
> >> > absolutely clear though, your position is that VLAN tags should always
> >> > be stripped?
> >>
> >> That's what the other converted drivers do by default (though most of
> >> them also provide an ethtool set_flags() method to change this).  It's
> >> generally the most efficient and is now safe to do in all cases.  It's
> >> also consistent with what was happening before, since stripping
> >> was enabled when a vlan device was configured.  So, yes, normally I
> >> think stripping should be enabled.
> >>
> >> I assumed that disabling stripping in most situations was just an
> >> oversight.  Was there a reason why you feel it is better not to use
> >> it?
> >
> > Actually, the tg3 driver was trying to disable VLAN tag stripping
> > when possible.  I believe this was primarily to support the raw packet
> > interface.
> 
> Hmm, is this really true?
> 
>         if (!tp->vlgrp &&
>             !(tp->tg3_flags & TG3_FLAG_ENABLE_ASF))
>                 rx_mode |= RX_MODE_KEEP_VLAN_TAG;
> 
> So if a vlan device is registered or ASF is enabled we will use
> stripping.  That seems like it is using stripping in the common case
> when vlans are used.

Right.

> Before 2.6.37 it was only possible to deliver stripped tags if a vlan
> group was configured.  This means that if ASF was enabled and forced
> stripping but no group was configured we would lose the tag.  In this
> situation tg3 manually reinserts the tags so raw packet capture will
> see them, as you say.

Right.  VLAN tagged frames can still arrive if CONFIG_VLAN_8021Q or
CONFIG_VLAN_8021Q_MODULE is not set.  That's why the driver was trying
to keep them inline: to eliminate the unnecessary overhead of
reinjecting the VLAN tag.

> However, in the current tree this limitation no
> longer exists and it is always possible to deliver stripped tags to
> the networking core, which should do the right thing in all
> situations.

Yes.  Even though the stack is capable of reinjecting the VLAN tag,
doesn't it make sense to avoid that if we knew they would never be
needed out-of-packet?

> So I believe the only reason to disable tag stripping is for debugging
> or some other special purpose situation.

Nope.  I think we covered it all.  Thanks for the info.
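
For reference, the "reinjecting the VLAN tag" overhead discussed above can be pictured with a small userspace sketch.  It assumes a plain Ethernet frame whose 802.1Q tag was stripped by the hardware and handed over separately as a 16-bit TCI.  The helper name vlan_reinsert_tag() is hypothetical and is not the tg3 or kernel code; it only shows the copy that is avoided when the tag is kept inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6       /* bytes in a MAC address */
#define VLAN_HLEN   4       /* bytes in an 802.1Q tag (TPID + TCI) */
#define ETH_P_8021Q 0x8100  /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper (illustration only): rebuild the on-wire frame from
 * a frame whose 802.1Q tag was stripped and delivered out-of-band as tci.
 * out must have room for len + VLAN_HLEN bytes.  Returns the new length.
 */
static size_t vlan_reinsert_tag(const uint8_t *frame, size_t len,
				uint16_t tci, uint8_t *out)
{
	/* need at least dst MAC + src MAC + EtherType */
	assert(len >= 2 * ETH_ALEN + 2);

	/* destination and source MAC addresses stay where they are */
	memcpy(out, frame, 2 * ETH_ALEN);

	/* 802.1Q header: TPID then TCI, both big-endian */
	out[12] = (uint8_t)(ETH_P_8021Q >> 8);
	out[13] = (uint8_t)(ETH_P_8021Q & 0xff);
	out[14] = (uint8_t)(tci >> 8);
	out[15] = (uint8_t)(tci & 0xff);

	/* the original EtherType and payload follow the tag */
	memcpy(out + 2 * ETH_ALEN + VLAN_HLEN, frame + 2 * ETH_ALEN,
	       len - 2 * ETH_ALEN);

	return len + VLAN_HLEN;
}
```

Every byte after the MAC addresses has to move by four bytes, which is the per-packet cost that keeping the tag inline avoids when nothing needs the tag out-of-packet.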


* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: Stephen Hemminger @ 2011-01-14 19:03 UTC
  To: Eric Dumazet
  Cc: David Miller, netdev, Patrick McHardy, jamal, Jarek Poplawski
In-Reply-To: <1295028502.3937.116.camel@edumazet-laptop>

From Eric Dumazet <eric.dumazet@gmail.com>

In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets.

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (overestimated).

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts don't have
an effect on the dequeue() rate.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>

CC: Patrick McHardy <kaber@trash.net>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: jamal <hadi@cyberus.ca>
---
sch_fifo now changed to use __qdisc_queue_drop_head which
keeps correct statistics and is actually clearer.

 include/net/sch_generic.h |    8 +++++---
 net/sched/sch_cbq.c       |    3 +--
 net/sched/sch_drr.c       |    2 +-
 net/sched/sch_dsmark.c    |    2 +-
 net/sched/sch_fifo.c      |    6 +-----
 net/sched/sch_hfsc.c      |    2 +-
 net/sched/sch_htb.c       |   12 +++++-------
 net/sched/sch_multiq.c    |    2 +-
 net/sched/sch_netem.c     |    3 +--
 net/sched/sch_prio.c      |    2 +-
 net/sched/sch_red.c       |   11 ++++++-----
 net/sched/sch_sfq.c       |    5 ++---
 net/sched/sch_tbf.c       |    2 +-
 net/sched/sch_teql.c      |    3 ++-
 14 files changed, 29 insertions(+), 34 deletions(-)

--- a/include/net/sch_generic.h	2011-01-14 09:19:00.730849868 -0800
+++ b/include/net/sch_generic.h	2011-01-14 09:47:59.058551676 -0800
@@ -445,7 +445,6 @@ static inline int __qdisc_enqueue_tail(s
 {
 	__skb_queue_tail(list, skb);
 	sch->qstats.backlog += qdisc_pkt_len(skb);
-	qdisc_bstats_update(sch, skb);
 
 	return NET_XMIT_SUCCESS;
 }
@@ -460,8 +459,10 @@ static inline struct sk_buff *__qdisc_de
 {
 	struct sk_buff *skb = __skb_dequeue(list);
 
-	if (likely(skb != NULL))
+	if (likely(skb != NULL)) {
 		sch->qstats.backlog -= qdisc_pkt_len(skb);
+		qdisc_bstats_update(sch, skb);
+	}
 
 	return skb;
 }
@@ -474,10 +475,11 @@ static inline struct sk_buff *qdisc_dequ
 static inline unsigned int __qdisc_queue_drop_head(struct Qdisc *sch,
 					      struct sk_buff_head *list)
 {
-	struct sk_buff *skb = __qdisc_dequeue_head(sch, list);
+	struct sk_buff *skb = __skb_dequeue(list);
 
 	if (likely(skb != NULL)) {
 		unsigned int len = qdisc_pkt_len(skb);
+		sch->qstats.backlog -= len;
 		kfree_skb(skb);
 		return len;
 	}
--- a/net/sched/sch_cbq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_cbq.c	2011-01-14 09:28:20.398631228 -0800
@@ -390,7 +390,6 @@ cbq_enqueue(struct sk_buff *skb, struct
 	ret = qdisc_enqueue(skb, cl->q);
 	if (ret == NET_XMIT_SUCCESS) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 		cbq_mark_toplevel(q, cl);
 		if (!cl->next_alive)
 			cbq_activate_class(cl);
@@ -649,7 +648,6 @@ static int cbq_reshape_fail(struct sk_bu
 		ret = qdisc_enqueue(skb, cl->q);
 		if (ret == NET_XMIT_SUCCESS) {
 			sch->q.qlen++;
-			qdisc_bstats_update(sch, skb);
 			if (!cl->next_alive)
 				cbq_activate_class(cl);
 			return 0;
@@ -971,6 +969,7 @@ cbq_dequeue(struct Qdisc *sch)
 
 		skb = cbq_dequeue_1(sch);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
--- a/net/sched/sch_drr.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_drr.c	2011-01-14 09:28:20.398631228 -0800
@@ -376,7 +376,6 @@ static int drr_enqueue(struct sk_buff *s
 	}
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 
 	sch->q.qlen++;
 	return err;
@@ -403,6 +402,7 @@ static struct sk_buff *drr_dequeue(struc
 			skb = qdisc_dequeue_peeked(cl->qdisc);
 			if (cl->qdisc->q.qlen == 0)
 				list_del(&cl->alist);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
--- a/net/sched/sch_dsmark.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_dsmark.c	2011-01-14 09:28:20.398631228 -0800
@@ -260,7 +260,6 @@ static int dsmark_enqueue(struct sk_buff
 		return err;
 	}
 
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -283,6 +282,7 @@ static struct sk_buff *dsmark_dequeue(st
 	if (skb == NULL)
 		return NULL;
 
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	index = skb->tc_index & (p->indices - 1);
--- a/net/sched/sch_fifo.c	2011-01-09 09:34:32.690685246 -0800
+++ b/net/sched/sch_fifo.c	2011-01-14 10:43:39.534246186 -0800
@@ -46,17 +46,14 @@ static int pfifo_enqueue(struct sk_buff
 
 static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 {
-	struct sk_buff *skb_head;
 	struct fifo_sched_data *q = qdisc_priv(sch);
 
 	if (likely(skb_queue_len(&sch->q) < q->limit))
 		return qdisc_enqueue_tail(skb, sch);
 
 	/* queue full, remove one skb to fulfill the limit */
-	skb_head = qdisc_dequeue_head(sch);
+	__qdisc_queue_drop_head(sch, &sch->q);
 	sch->qstats.drops++;
-	kfree_skb(skb_head);
-
 	qdisc_enqueue_tail(skb, sch);
 
 	return NET_XMIT_CN;
--- a/net/sched/sch_hfsc.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_hfsc.c	2011-01-14 09:28:20.428633918 -0800
@@ -1600,7 +1600,6 @@ hfsc_enqueue(struct sk_buff *skb, struct
 		set_active(cl, qdisc_pkt_len(skb));
 
 	bstats_update(&cl->bstats, skb);
-	qdisc_bstats_update(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -1666,6 +1665,7 @@ hfsc_dequeue(struct Qdisc *sch)
 	}
 
 	sch->flags &= ~TCQ_F_THROTTLED;
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 
 	return skb;
--- a/net/sched/sch_htb.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_htb.c	2011-01-14 09:28:20.438634799 -0800
@@ -574,7 +574,6 @@ static int htb_enqueue(struct sk_buff *s
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -842,7 +841,7 @@ next:
 
 static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 {
-	struct sk_buff *skb = NULL;
+	struct sk_buff *skb;
 	struct htb_sched *q = qdisc_priv(sch);
 	int level;
 	psched_time_t next_event;
@@ -851,6 +850,8 @@ static struct sk_buff *htb_dequeue(struc
 	/* try to dequeue direct packets as high prio (!) to minimize cpu work */
 	skb = __skb_dequeue(&q->direct_queue);
 	if (skb != NULL) {
+ok:
+		qdisc_bstats_update(sch, skb);
 		sch->flags &= ~TCQ_F_THROTTLED;
 		sch->q.qlen--;
 		return skb;
@@ -884,11 +885,8 @@ static struct sk_buff *htb_dequeue(struc
 			int prio = ffz(m);
 			m |= 1 << prio;
 			skb = htb_dequeue_tree(q, prio, level);
-			if (likely(skb != NULL)) {
-				sch->q.qlen--;
-				sch->flags &= ~TCQ_F_THROTTLED;
-				goto fin;
-			}
+			if (likely(skb != NULL))
+				goto ok;
 		}
 	}
 	sch->qstats.overlimits++;
--- a/net/sched/sch_multiq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_multiq.c	2011-01-14 09:28:20.438634799 -0800
@@ -83,7 +83,6 @@ multiq_enqueue(struct sk_buff *skb, stru
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -112,6 +111,7 @@ static struct sk_buff *multiq_dequeue(st
 			qdisc = q->queues[q->curband];
 			skb = qdisc->dequeue(qdisc);
 			if (skb) {
+				qdisc_bstats_update(sch, skb);
 				sch->q.qlen--;
 				return skb;
 			}
--- a/net/sched/sch_netem.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_netem.c	2011-01-14 09:28:20.438634799 -0800
@@ -240,7 +240,6 @@ static int netem_enqueue(struct sk_buff
 
 	if (likely(ret == NET_XMIT_SUCCESS)) {
 		sch->q.qlen++;
-		qdisc_bstats_update(sch, skb);
 	} else if (net_xmit_drop_count(ret)) {
 		sch->qstats.drops++;
 	}
@@ -289,6 +288,7 @@ static struct sk_buff *netem_dequeue(str
 				skb->tstamp.tv64 = 0;
 #endif
 			pr_debug("netem_dequeue: return skb=%p\n", skb);
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
@@ -476,7 +476,6 @@ static int tfifo_enqueue(struct sk_buff
 		__skb_queue_after(list, skb, nskb);
 
 		sch->qstats.backlog += qdisc_pkt_len(nskb);
-		qdisc_bstats_update(sch, nskb);
 
 		return NET_XMIT_SUCCESS;
 	}
--- a/net/sched/sch_prio.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_prio.c	2011-01-14 09:28:20.438634799 -0800
@@ -84,7 +84,6 @@ prio_enqueue(struct sk_buff *skb, struct
 
 	ret = qdisc_enqueue(skb, qdisc);
 	if (ret == NET_XMIT_SUCCESS) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 		return NET_XMIT_SUCCESS;
 	}
@@ -116,6 +115,7 @@ static struct sk_buff *prio_dequeue(stru
 		struct Qdisc *qdisc = q->queues[prio];
 		struct sk_buff *skb = qdisc->dequeue(qdisc);
 		if (skb) {
+			qdisc_bstats_update(sch, skb);
 			sch->q.qlen--;
 			return skb;
 		}
--- a/net/sched/sch_red.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_red.c	2011-01-14 09:28:20.438634799 -0800
@@ -94,7 +94,6 @@ static int red_enqueue(struct sk_buff *s
 
 	ret = qdisc_enqueue(skb, child);
 	if (likely(ret == NET_XMIT_SUCCESS)) {
-		qdisc_bstats_update(sch, skb);
 		sch->q.qlen++;
 	} else if (net_xmit_drop_count(ret)) {
 		q->stats.pdrop++;
@@ -114,11 +113,13 @@ static struct sk_buff * red_dequeue(stru
 	struct Qdisc *child = q->qdisc;
 
 	skb = child->dequeue(child);
-	if (skb)
+	if (skb) {
+		qdisc_bstats_update(sch, skb);
 		sch->q.qlen--;
-	else if (!red_is_idling(&q->parms))
-		red_start_of_idle_period(&q->parms);
-
+	} else {
+		if (!red_is_idling(&q->parms))
+			red_start_of_idle_period(&q->parms);
+	}
 	return skb;
 }
 
--- a/net/sched/sch_sfq.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_sfq.c	2011-01-14 09:28:20.438634799 -0800
@@ -402,10 +402,8 @@ sfq_enqueue(struct sk_buff *skb, struct
 		q->tail = slot;
 		slot->allot = q->scaled_quantum;
 	}
-	if (++sch->q.qlen <= q->limit) {
-		qdisc_bstats_update(sch, skb);
+	if (++sch->q.qlen <= q->limit)
 		return NET_XMIT_SUCCESS;
-	}
 
 	sfq_drop(sch);
 	return NET_XMIT_CN;
@@ -445,6 +443,7 @@ next_slot:
 	}
 	skb = slot_dequeue_head(slot);
 	sfq_dec(q, a);
+	qdisc_bstats_update(sch, skb);
 	sch->q.qlen--;
 	sch->qstats.backlog -= qdisc_pkt_len(skb);
 
--- a/net/sched/sch_tbf.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_tbf.c	2011-01-14 09:28:20.438634799 -0800
@@ -134,7 +134,6 @@ static int tbf_enqueue(struct sk_buff *s
 	}
 
 	sch->q.qlen++;
-	qdisc_bstats_update(sch, skb);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -187,6 +186,7 @@ static struct sk_buff *tbf_dequeue(struc
 			q->ptokens = ptoks;
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
+			qdisc_bstats_update(sch, skb);
 			return skb;
 		}
 
--- a/net/sched/sch_teql.c	2011-01-14 09:19:00.830857886 -0800
+++ b/net/sched/sch_teql.c	2011-01-14 09:28:20.438634799 -0800
@@ -83,7 +83,6 @@ teql_enqueue(struct sk_buff *skb, struct
 
 	if (q->q.qlen < dev->tx_queue_len) {
 		__skb_queue_tail(&q->q, skb);
-		qdisc_bstats_update(sch, skb);
 		return NET_XMIT_SUCCESS;
 	}
 
@@ -107,6 +106,8 @@ teql_dequeue(struct Qdisc* sch)
 			dat->m->slaves = sch;
 			netif_wake_queue(m);
 		}
+	} else {
+		qdisc_bstats_update(sch, skb);
 	}
 	sch->q.qlen = dat->q.qlen + dat_queue->qdisc->q.qlen;
 	return skb;
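
The accounting change in this patch can be illustrated with a small userspace model.  This is only a sketch under stated assumptions: struct model_qdisc, MODEL_LIMIT and the model_*() helpers are hypothetical stand-ins for a head-drop FIFO, not the kernel structures or functions touched by the diff.  It shows that when bytes/packets are counted at dequeue time, packets dropped from the head of a full queue never reach the counters.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_LIMIT 4	/* hypothetical queue limit, in packets */

/* Toy stand-in for a qdisc: a tiny FIFO of packet lengths plus counters. */
struct model_qdisc {
	unsigned int len[MODEL_LIMIT];	/* queued packet lengths, oldest first */
	unsigned int qlen;		/* packets currently queued */
	unsigned long long bytes;	/* models bstats.bytes */
	unsigned int packets;		/* models bstats.packets */
	unsigned int backlog;		/* models qstats.backlog */
	unsigned int drops;		/* models qstats.drops */
};

/* Remove the oldest packet and fix up the backlog; no bstats update here. */
static unsigned int model_pop_head(struct model_qdisc *q)
{
	unsigned int len, i;

	assert(q->qlen > 0);
	len = q->len[0];
	for (i = 1; i < q->qlen; i++)
		q->len[i - 1] = q->len[i];
	q->qlen--;
	q->backlog -= len;
	return len;
}

/* Head-drop enqueue: when full, drop the oldest packet, then queue the new one. */
static void model_enqueue(struct model_qdisc *q, unsigned int len)
{
	if (q->qlen == MODEL_LIMIT) {
		model_pop_head(q);	/* dropped: bytes/packets stay untouched */
		q->drops++;
	}
	q->len[q->qlen++] = len;
	q->backlog += len;
}

/* Dequeue for transmission: the only place bytes/packets are counted. */
static unsigned int model_dequeue(struct model_qdisc *q)
{
	unsigned int len;

	if (q->qlen == 0)
		return 0;
	len = model_pop_head(q);
	q->bytes += len;
	q->packets++;
	return len;
}
```

With six 100-byte packets enqueued into this model before any dequeue, two are dropped and only the four that are actually dequeued show up in bytes/packets, which is the behaviour the commit message describes.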


* [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing
From: Oleg V. Ukhno @ 2011-01-14 19:07 UTC
  To: netdev; +Cc: Jay Vosburgh, David S. Miller

This patch introduces a new hashing policy for 802.3ad bonding mode.
The policy can be used (and was tested) only for round-robin balancing
of iSCSI traffic: a single TCP session is balanced per-packet over all
slave interfaces.
General requirements for using this hashing policy are:
1) the switch must be configured with the src-dst-mac or src-mac hashing
policy
2) the number of bond slaves on the sending and receiving machines should
be equal and preferably even, or at least even; otherwise you may get
asymmetric load on the receiving machine
3) the hashing policy must not be used when the round trip time between
the source and destination machines for slaves in the same bond is
expected to be significantly different (it works fine when all slaves
are plugged into a single switch)

Signed-off-by: Oleg V. Ukhno <olegu@yandex-team.ru>
---

 Documentation/networking/bonding.txt |   27 +++++++++++++++++++++++++++
 drivers/net/bonding/bond_3ad.c       |    6 ++++++
 drivers/net/bonding/bond_main.c      |   18 +++++++++++++++++-
 include/linux/if_bonding.h           |    1 +
 4 files changed, 51 insertions(+), 1 deletion(-)

diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/Documentation/networking/bonding.txt linux-2.6.37.my/Documentation/networking/bonding.txt
--- linux-2.6.37-vanilla/Documentation/networking/bonding.txt	2011-01-05 03:50:19.000000000 +0300
+++ linux-2.6.37.my/Documentation/networking/bonding.txt	2011-01-14 21:34:46.635268000 +0300
@@ -759,6 +759,33 @@ xmit_hash_policy
 		most UDP traffic is not involved in extended
 		conversations.  Other implementations of 802.3ad may
 		or may not tolerate this noncompliance.
+
+	simple-rr or 3
+		This policy simply sends every next packet via "next"
+		slave interface. When sending, it resets mac-address
+		within packet to real mac-address of the slave interface.
+
+		When switch is configured properly, and receiving machine
+		has even and equal number of interfaces, this guarantees
+		quite precise rx/tx load balancing for any single TCP
+		session. The typical use case for this mode is iSCSI (for which
+		the patch was developed), because it uses a single TCP session
+		to transmit data.
+
+		It is important to remember that all slaves should be
+		plugged into a single switch to avoid out-of-order packets.
+		It is recommended to have an equal and even number of slave
+		interfaces in the sending and receiving machines' bonds,
+		otherwise you will get asymmetric load on the receiving host.
+		Another caveat is that this hashing policy must not be used when
+		the round trip time between source and destination machines for
+		slaves in the same bond is expected to be significantly different
+		(it works fine when all slaves are plugged into a single switch).
+
+		For correct load balancing on the receiving side you must
+		configure the switch to use src-dst-mac or src-mac hashing
+		mode.
+
 
 	The default value is layer2.  This option was added in bonding
 	version 2.6.3.  In earlier versions of bonding, this parameter
diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c linux-2.6.37.my/drivers/net/bonding/bond_3ad.c
--- linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c	2011-01-14 19:39:05.575268000 +0300
+++ linux-2.6.37.my/drivers/net/bonding/bond_3ad.c	2011-01-14 19:47:03.815268000 +0300
@@ -2395,6 +2395,7 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
 	int i;
 	struct ad_info ad_info;
 	int res = 1;
+	struct ethhdr *eth_data;
 
 	/* make sure that the slaves list will
 	 * not change during tx
@@ -2447,6 +2448,11 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
 			slave_agg_id = agg->aggregator_identifier;
 
 		if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
+			if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYERRR && ntohs(skb->protocol) == ETH_P_IP) {
+				skb_reset_mac_header(skb);
+				eth_data = eth_hdr(skb);
+				memcpy(eth_data->h_source, slave->perm_hwaddr, ETH_ALEN);
+			}
 			res = bond_dev_queue_xmit(bond, skb, slave->dev);
 			break;
 		}
diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c linux-2.6.37.my/drivers/net/bonding/bond_main.c
--- linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c	2011-01-14 19:39:05.575268000 +0300
+++ linux-2.6.37.my/drivers/net/bonding/bond_main.c	2011-01-14 19:47:55.835268001 +0300
@@ -152,7 +152,9 @@ module_param(ad_select, charp, 0);
 MODULE_PARM_DESC(ad_select, "803.ad aggregation selection logic: stable (0, default), bandwidth (1), count (2)");
 module_param(xmit_hash_policy, charp, 0);
 MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method: 0 for layer 2 (default)"
-				   ", 1 for layer 3+4");
+				   ", 1 for layer 3+4"
+				   ", 2 for layer 2+3"
+				   ", 3 for round-robin");
 module_param(arp_interval, int, 0);
 MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
 module_param_array(arp_ip_target, charp, NULL, 0);
@@ -206,6 +208,7 @@ const struct bond_parm_tbl xmit_hashtype
 {	"layer2",		BOND_XMIT_POLICY_LAYER2},
 {	"layer3+4",		BOND_XMIT_POLICY_LAYER34},
 {	"layer2+3",		BOND_XMIT_POLICY_LAYER23},
+{	"simple-rr",		BOND_XMIT_POLICY_LAYERRR},
 {	NULL,			-1},
 };
 
@@ -3762,6 +3765,16 @@ static int bond_xmit_hash_policy_l2(stru
 	return (data->h_dest[5] ^ data->h_source[5]) % count;
 }
 
+/*
+ * simply round robin
+ */
+static int bond_xmit_hash_policy_rr(struct sk_buff *skb,
+				   struct net_device *bond_dev, int count)
+{
+	struct bonding *bond = netdev_priv(bond_dev);
+	return bond->rr_tx_counter++ % count;
+}
+
 /*-------------------------- Device entry points ----------------------------*/
 
 static int bond_open(struct net_device *bond_dev)
@@ -4482,6 +4495,9 @@ out:
 static void bond_set_xmit_hash_policy(struct bonding *bond)
 {
 	switch (bond->params.xmit_policy) {
+	case BOND_XMIT_POLICY_LAYERRR:
+		bond->xmit_hash_policy = bond_xmit_hash_policy_rr;
+		break;
 	case BOND_XMIT_POLICY_LAYER23:
 		bond->xmit_hash_policy = bond_xmit_hash_policy_l23;
 		break;
diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/include/linux/if_bonding.h linux-2.6.37.my/include/linux/if_bonding.h
--- linux-2.6.37-vanilla/include/linux/if_bonding.h	2011-01-05 03:50:19.000000000 +0300
+++ linux-2.6.37.my/include/linux/if_bonding.h	2011-01-14 19:34:29.755268001 +0300
@@ -91,6 +91,7 @@
 #define BOND_XMIT_POLICY_LAYER2		0 /* layer 2 (MAC only), default */
 #define BOND_XMIT_POLICY_LAYER34	1 /* layer 3+4 (IP ^ (TCP || UDP)) */
 #define BOND_XMIT_POLICY_LAYER23	2 /* layer 2+3 (IP ^ MAC) */
+#define BOND_XMIT_POLICY_LAYERRR	3 /* round-robin */
 
 typedef struct ifbond {
 	__s32 bond_mode;
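
The two pieces of the policy described in this patch (pick the "next" slave per packet, then rewrite the source MAC to that slave's address) can be sketched in userspace as follows.  This is a minimal illustration, not the bonding driver code: struct rr_bond, rr_pick_slave() and frame_set_src_mac() are hypothetical names, and the frame is assumed to be a plain Ethernet frame with the source MAC at byte offset 6.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6	/* bytes in an Ethernet MAC address */

/* Hypothetical per-bond state: just a transmit counter. */
struct rr_bond {
	unsigned int tx_counter;
};

/* Return the index of the slave to use for the next packet. */
static unsigned int rr_pick_slave(struct rr_bond *bond, unsigned int count)
{
	assert(count > 0);	/* modulo by zero is undefined */
	return bond->tx_counter++ % count;
}

/* Overwrite the source MAC (bytes 6..11 of an Ethernet header). */
static void frame_set_src_mac(uint8_t *frame, const uint8_t mac[MAC_LEN])
{
	memcpy(frame + MAC_LEN, mac, MAC_LEN);
}
```

Two caveats apply to this sketch: the counter is a plain integer, so concurrent senders would need an atomic increment, and the slave sequence is uneven at the point where the counter wraps unless the slave count divides the counter range.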


* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: Eric Dumazet @ 2011-01-14 19:21 UTC
  To: Stephen Hemminger
  Cc: David Miller, netdev, Patrick McHardy, jamal, Jarek Poplawski
In-Reply-To: <20110114110342.4d95ad5b@nehalam>

On Friday 14 January 2011 at 11:03 -0800, Stephen Hemminger wrote:
> From Eric Dumazet <eric.dumazet@gmail.com>
> 
> In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
> a problem with pfifo_head drops that incorrectly decreased
> sch->bstats.bytes and sch->bstats.packets.
> 
> Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
> previously enqueued packet, and bstats cannot be changed, so
> bstats/rates are not accurate (overestimated).
> 
> This patch changes the qdisc_bstats updates to be done at dequeue() time
> instead of enqueue() time. bstats counters no longer account for dropped
> frames, and rates are more correct, since enqueue() bursts don't have
> an effect on the dequeue() rate.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> CC: Patrick McHardy <kaber@trash.net>
> CC: Jarek Poplawski <jarkao2@gmail.com>
> CC: jamal <hadi@cyberus.ca>
> ---
> sch_fifo now changed to use __qdisc_queue_drop_head which
> keeps correct statistics and is actually clearer.
> 
>  

Thanks for doing this Stephen, this version seems fine.




* Re: [PATCH 1/2] genirq: Add IRQ affinity notifiers
From: Thomas Gleixner @ 2011-01-14 19:47 UTC
  To: Ben Hutchings
  Cc: David Miller, Tom Herbert, linux-kernel, netdev,
	linux-net-drivers
In-Reply-To: <1294169919.3636.33.camel@bwh-desktop>

On Tue, 4 Jan 2011, Ben Hutchings wrote:
> +/**
> + * struct irq_affinity_notify - context for notification of IRQ affinity changes
> + * @irq:		Interrupt to which notification applies
> + * @kref:		Reference count, for internal use
> + * @work:		Work item, for internal use
> + * @notify:		Function to be called on change.  This will be
> + *			called in process context.
> + * @release:		Function to be called on release.  This will be
> + *			called in process context.  Once registered, the
> + *			structure must only be freed when this function is
> + *			called or later.
> + */
> +struct irq_affinity_notify {
> +        unsigned int irq;
> +        struct kref kref;
> +#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)

The whole affinity thing is SMP and GENERIC_HARDIRQS only anyway, so
what's the point of this ifdeffery ?

> +        struct work_struct work;
> +#endif
> +        void (*notify)(struct irq_affinity_notify *, const cpumask_t *mask);
> +        void (*release)(struct kref *ref);
> +};
> +

> +/**
> + *	irq_set_affinity_notifier - control notification of IRQ affinity changes
> + *	@irq:		Interrupt for which to enable/disable notification
> + *	@notify:	Context for notification, or %NULL to disable
> + *			notification.  Function pointers must be initialised;
> + *			the other fields will be initialised by this function.
> + *
> + *	Must be called in process context.  Notification may only be enabled
> + *	after the IRQ is allocated but before it is bound with request_irq()

Why? And if there is that restriction, then it needs to be
checked. But I don't see why this is necessary.

> + *	and must be disabled before the IRQ is freed using free_irq().
> + */

> +#ifdef CONFIG_SMP
> +	BUG_ON(desc->affinity_notify);

We should be nice here and just WARN and fixup the wreckage by
uninstalling it.

Thanks,

	tglx


* Re: [PATCH 1/2] genirq: Add IRQ affinity notifiers
From: Ben Hutchings @ 2011-01-14 20:06 UTC
  To: Thomas Gleixner
  Cc: David Miller, Tom Herbert, linux-kernel, netdev,
	linux-net-drivers
In-Reply-To: <alpine.LFD.2.00.1101141928210.2678@localhost6.localdomain6>

On Fri, 2011-01-14 at 20:47 +0100, Thomas Gleixner wrote:
> On Tue, 4 Jan 2011, Ben Hutchings wrote:
> > +/**
> > + * struct irq_affinity_notify - context for notification of IRQ affinity changes
> > + * @irq:		Interrupt to which notification applies
> > + * @kref:		Reference count, for internal use
> > + * @work:		Work item, for internal use
> > + * @notify:		Function to be called on change.  This will be
> > + *			called in process context.
> > + * @release:		Function to be called on release.  This will be
> > + *			called in process context.  Once registered, the
> > + *			structure must only be freed when this function is
> > + *			called or later.
> > + */
> > +struct irq_affinity_notify {
> > +        unsigned int irq;
> > +        struct kref kref;
> > +#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)
> 
> The whole affinity thing is SMP and GENERIC_HARDIRQS only anyway, so
> what's the point of this ifdeffery ?

The intent is that code using this can be compiled even if those config
options are not set.  The work_struct is not needed in that case.  I
think this is probably pointless though.

> > +        struct work_struct work;
> > +#endif
> > +        void (*notify)(struct irq_affinity_notify *, const cpumask_t *mask);
> > +        void (*release)(struct kref *ref);
> > +};
> > +
> 
> > +/**
> > + *	irq_set_affinity_notifier - control notification of IRQ affinity changes
> > + *	@irq:		Interrupt for which to enable/disable notification
> > + *	@notify:	Context for notification, or %NULL to disable
> > + *			notification.  Function pointers must be initialised;
> > + *			the other fields will be initialised by this function.
> > + *
> > + *	Must be called in process context.  Notification may only be enabled
> > + *	after the IRQ is allocated but before it is bound with request_irq()
> 
> Why? And if there is that restriction, then it needs to be
> checked. But I don't see why this is necessary.

Which restriction?

> > + *	and must be disabled before the IRQ is freed using free_irq().
> > + */
> 
> > +#ifdef CONFIG_SMP
> > +	BUG_ON(desc->affinity_notify);
> 
> We should be nice here and just WARN and fixup the wreckage by
> uninstalling it.

OK.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing
From: John Fastabend @ 2011-01-14 20:10 UTC (permalink / raw)
  To: Oleg V. Ukhno; +Cc: netdev@vger.kernel.org, Jay Vosburgh, David S. Miller
In-Reply-To: <20110114190714.GA11655@yandex-team.ru>

On 1/14/2011 11:07 AM, Oleg V. Ukhno wrote:
> Patch introduces new hashing policy for 802.3ad bonding mode.
> This hashing policy can be used(was tested) only for round-robin
> balancing of ISCSI traffic(single TCP session is balanced (per-packet)
> over all slave interfaces. 
> General requirements for this hashing policy usage are:
> 1) switch must be configured with src-dst-mac or src-mac hashing policy 
> 2) number of bond slaves on sending and receiving machine should be equal
> and preferrably even; or simply even, otherwise you may get asymmetric 
> load on receiving machine
> 3) hashing policy must not be used when round trip time between source 
> and destination machines for slaves in same bond is expected to be 
> significanly different (it works fine when all slaves are plugged into
> single switch)
> 
> Signed-off-by: Oleg V. Ukhno <olegu@yandex-team.ru>
> ---

I think you want this patch against net-next not 2.6.37.

> 
>  Documentation/networking/bonding.txt |   27 +++++++++++++++++++++++++++
>  drivers/net/bonding/bond_3ad.c       |    6 ++++++
>  drivers/net/bonding/bond_main.c      |   18 +++++++++++++++++-
>  include/linux/if_bonding.h           |    1 +
>  4 files changed, 51 insertions(+), 1 deletion(-)
> 
> diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/Documentation/networking/bonding.txt linux-2.6.37.my/Documentation/networking/bonding.txt
> --- linux-2.6.37-vanilla/Documentation/networking/bonding.txt	2011-01-05 03:50:19.000000000 +0300
> +++ linux-2.6.37.my/Documentation/networking/bonding.txt	2011-01-14 21:34:46.635268000 +0300
> @@ -759,6 +759,33 @@ xmit_hash_policy
>  		most UDP traffic is not involved in extended
>  		conversations.  Other implementations of 802.3ad may
>  		or may not tolerate this noncompliance.
> +
> +	simple-rr or 3
> +		This policy simply sends every next packet via "next"
> +		slave interface. When sending, it resets mac-address
> +		within packet to real mac-address of the slave interface.
> +
> +		When switch is configured properly, and receiving machine
> +		has even and equal number of interfaces, this guarantees
> +		quite precise rx/tx load balancing for any single TCP
> +		session. Typical use-case for this mode is ISCSI(and patch was
> +		developed for), because it ises single TCP session to
> +		transmit data.

Oleg, sorry, but I don't follow. If this simply sends each successive packet
via the "next" slave interface, how do packets avoid getting out of order? If
the links have different RTTs this would seem problematic.
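
To make the concern concrete, here is a small user-space sketch (not
bonding code). It assumes packets are sent at a fixed interval and that
each link adds a fixed one-way latency; the function name and the time
units are invented for illustration:

```c
#include <stddef.h>

/*
 * Packet i is sent at time i * gap on link (i % n_links) and arrives
 * latency[link] later. Count the packets that arrive earlier than some
 * packet that was sent before them, i.e. that overtake a predecessor.
 */
size_t count_reordered(size_t n_packets, size_t n_links,
		       const unsigned long *latency, unsigned long gap)
{
	unsigned long max_arrival = 0;
	size_t reordered = 0;
	size_t i;

	if (n_links == 0)
		return 0;

	for (i = 0; i < n_packets; i++) {
		unsigned long arrival = i * gap + latency[i % n_links];

		if (arrival < max_arrival)
			reordered++;		/* overtook an earlier packet */
		else
			max_arrival = arrival;
	}
	return reordered;
}
```

With two links of equal latency nothing is reordered. With latencies of
100 and 150 and a gap of 10, every packet on the faster link after the
first overtakes its predecessor, so 3 of 8 packets arrive out of order.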

Have you considered using multipath at the block layer? This is how I generally
handle load balancing over iSCSI/FCoE and it works reasonably well.

see ./drivers/md/dm-mpath.c

> +
> +		It is important to remember, that all slaves should be
> +		plugged into single switch to avoid out-of-order packets
> +		It is recommended to have equal and even number of slave
> +		interfaces in sending and receviving machines bond's,
> +		otherwise you will get asymmetric load on receiving host.
> +		Another caveat is that hashing policy must not be used when
> +		round trip time between source and destination machines for
> +		slaves in same bond is expected to be significanly different
> +		(it works fine when all slaves are plugged into single switch)
> +
> +		For correct load baalncing on the receiving side you must
> +		configure switch for using src-dst-mac or src-mac hashing
> +		mode.
> +
>  
>  	The default value is layer2.  This option was added in bonding
>  	version 2.6.3.  In earlier versions of bonding, this parameter
> diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c linux-2.6.37.my/drivers/net/bonding/bond_3ad.c
> --- linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c	2011-01-14 19:39:05.575268000 +0300
> +++ linux-2.6.37.my/drivers/net/bonding/bond_3ad.c	2011-01-14 19:47:03.815268000 +0300
> @@ -2395,6 +2395,7 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
>  	int i;
>  	struct ad_info ad_info;
>  	int res = 1;
> +	struct ethhdr *eth_data;
>  
>  	/* make sure that the slaves list will
>  	 * not change during tx
> @@ -2447,6 +2448,11 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
>  			slave_agg_id = agg->aggregator_identifier;
>  
>  		if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
> +			if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYERRR && ntohs(skb->protocol) == ETH_P_IP) {
> +				skb_reset_mac_header(skb);
> +				eth_data = eth_hdr(skb);
> +				memcpy(eth_data->h_source, slave->perm_hwaddr, ETH_ALEN);
> +			}
>  			res = bond_dev_queue_xmit(bond, skb, slave->dev);
>  			break;
>  		}
> diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c linux-2.6.37.my/drivers/net/bonding/bond_main.c
> --- linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c	2011-01-14 19:39:05.575268000 +0300
> +++ linux-2.6.37.my/drivers/net/bonding/bond_main.c	2011-01-14 19:47:55.835268001 +0300
> @@ -152,7 +152,9 @@ module_param(ad_select, charp, 0);
>  MODULE_PARM_DESC(ad_select, "803.ad aggregation selection logic: stable (0, default), bandwidth (1), count (2)");
>  module_param(xmit_hash_policy, charp, 0);
>  MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method: 0 for layer 2 (default)"
> -				   ", 1 for layer 3+4");
> +				   ", 1 for layer 3+4"
> +				   ", 2 for layer 2+3"
> +				   ", 3 for round-robin");
>  module_param(arp_interval, int, 0);
>  MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
>  module_param_array(arp_ip_target, charp, NULL, 0);
> @@ -206,6 +208,7 @@ const struct bond_parm_tbl xmit_hashtype
>  {	"layer2",		BOND_XMIT_POLICY_LAYER2},
>  {	"layer3+4",		BOND_XMIT_POLICY_LAYER34},
>  {	"layer2+3",		BOND_XMIT_POLICY_LAYER23},
> +{	"simple-rr",		BOND_XMIT_POLICY_LAYERRR},
>  {	NULL,			-1},
>  };
>  
> @@ -3762,6 +3765,16 @@ static int bond_xmit_hash_policy_l2(stru
>  	return (data->h_dest[5] ^ data->h_source[5]) % count;
>  }
>  
> +/*
> + * simply round robin
> + */
> +static int bond_xmit_hash_policy_rr(struct sk_buff *skb,
> +				   struct net_device *bond_dev, int count)

Here's one reason why this won't work on net-next-2.6: the hook there takes
only the skb and the slave count, not the net_device.

int      (*xmit_hash_policy)(struct sk_buff *, int);


Thanks,
John

^ permalink raw reply

* Re: [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing
From: Jay Vosburgh @ 2011-01-14 20:13 UTC (permalink / raw)
  To: Oleg V. Ukhno; +Cc: netdev, David S. Miller
In-Reply-To: <20110114190714.GA11655@yandex-team.ru>

Oleg V. Ukhno <olegu@yandex-team.ru> wrote:

>Patch introduces new hashing policy for 802.3ad bonding mode.
>This hashing policy can be used(was tested) only for round-robin
>balancing of ISCSI traffic(single TCP session is balanced (per-packet)
>over all slave interfaces. 

	This is a violation of the 802.3ad (now 802.1ax) standard, 5.2.1
(f), which requires that all frames of a given "conversation" are passed
to a single port.

	The existing layer3+4 hash has a similar problem (it may send
packets from one conversation to multiple ports), but there it is an
unlikely exception that arises only with IP fragmentation; here it is
the norm.  At a minimum, this must be clearly documented.

	Also, what does a round robin in 802.3ad provide that the
existing round robin does not?  My presumption is that you're looking to
get the aggregator autoconfiguration that 802.3ad provides, but you
don't say.

	I don't necessarily think this is a bad cheat (round robining on
802.3ad as an explicit non-standard extension), since everybody wants to
stripe their traffic across multiple slaves.  I've given some thought to
making round robin into just another hash mode, but this also does some
magic to the MAC addresses of the outgoing frames (more on that below).
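
	To spell out the difference, a user-space sketch of the two
selection rules. The layer2 expression is the one visible in the diff
context below and the round robin expression is the one from the patch;
the function names and the counter's type are assumptions made for the
example:

```c
#include <stdint.h>

/*
 * Layer 2 policy: XOR of the last byte of the destination and source
 * MAC, modulo the slave count. The same address pair always yields the
 * same slave, so a conversation stays on one port.
 */
int l2_hash(const uint8_t dest[6], const uint8_t source[6], int count)
{
	return (dest[5] ^ source[5]) % count;
}

/*
 * Round robin as proposed: a counter that ignores the packet contents.
 * "counter" stands in for bond->rr_tx_counter.
 */
int rr_next(unsigned int *counter, int count)
{
	return (int)((*counter)++ % (unsigned int)count);
}
```

	The first always returns the same slave for a given pair of
addresses; the second walks 0, 1, 2, 0, ... regardless of which
conversation a frame belongs to, which is exactly what 5.2.1 (f) rules
out.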

>General requirements for this hashing policy usage are:
>1) switch must be configured with src-dst-mac or src-mac hashing policy 
>2) number of bond slaves on sending and receiving machine should be equal
>and preferrably even; or simply even, otherwise you may get asymmetric 
>load on receiving machine
>3) hashing policy must not be used when round trip time between source 
>and destination machines for slaves in same bond is expected to be 
>significanly different (it works fine when all slaves are plugged into
>single switch)
>
>Signed-off-by: Oleg V. Ukhno <olegu@yandex-team.ru>
>---
>
> Documentation/networking/bonding.txt |   27 +++++++++++++++++++++++++++
> drivers/net/bonding/bond_3ad.c       |    6 ++++++
> drivers/net/bonding/bond_main.c      |   18 +++++++++++++++++-
> include/linux/if_bonding.h           |    1 +
> 4 files changed, 51 insertions(+), 1 deletion(-)
>
>diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/Documentation/networking/bonding.txt linux-2.6.37.my/Documentation/networking/bonding.txt
>--- linux-2.6.37-vanilla/Documentation/networking/bonding.txt	2011-01-05 03:50:19.000000000 +0300
>+++ linux-2.6.37.my/Documentation/networking/bonding.txt	2011-01-14 21:34:46.635268000 +0300
>@@ -759,6 +759,33 @@ xmit_hash_policy
> 		most UDP traffic is not involved in extended
> 		conversations.  Other implementations of 802.3ad may
> 		or may not tolerate this noncompliance.
>+
>+	simple-rr or 3
>+		This policy simply sends every next packet via "next"
>+		slave interface. When sending, it resets mac-address
>+		within packet to real mac-address of the slave interface.

	Why is the MAC address reset done?  This is also a violation of
802.3ad, 5.2.1 (j).

>+		When switch is configured properly, and receiving machine
>+		has even and equal number of interfaces, this guarantees
>+		quite precise rx/tx load balancing for any single TCP
>+		session. Typical use-case for this mode is ISCSI(and patch was
>+		developed for), because it ises single TCP session to
>+		transmit data.
>+
>+		It is important to remember, that all slaves should be
>+		plugged into single switch to avoid out-of-order packets
>+		It is recommended to have equal and even number of slave
>+		interfaces in sending and receviving machines bond's,
>+		otherwise you will get asymmetric load on receiving host.
>+		Another caveat is that hashing policy must not be used when
>+		round trip time between source and destination machines for
>+		slaves in same bond is expected to be significanly different
>+		(it works fine when all slaves are plugged into single switch)
>+
>+		For correct load baalncing on the receiving side you must
>+		configure switch for using src-dst-mac or src-mac hashing
>+		mode.
>+
>
> 	The default value is layer2.  This option was added in bonding
> 	version 2.6.3.  In earlier versions of bonding, this parameter
>diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c linux-2.6.37.my/drivers/net/bonding/bond_3ad.c
>--- linux-2.6.37-vanilla/drivers/net/bonding/bond_3ad.c	2011-01-14 19:39:05.575268000 +0300
>+++ linux-2.6.37.my/drivers/net/bonding/bond_3ad.c	2011-01-14 19:47:03.815268000 +0300
>@@ -2395,6 +2395,7 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
> 	int i;
> 	struct ad_info ad_info;
> 	int res = 1;
>+	struct ethhdr *eth_data;
>
> 	/* make sure that the slaves list will
> 	 * not change during tx
>@@ -2447,6 +2448,11 @@ int bond_3ad_xmit_xor(struct sk_buff *sk
> 			slave_agg_id = agg->aggregator_identifier;
>
> 		if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
>+			if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYERRR && ntohs(skb->protocol) == ETH_P_IP) {
>+				skb_reset_mac_header(skb);
>+				eth_data = eth_hdr(skb);
>+				memcpy(eth_data->h_source, slave->perm_hwaddr, ETH_ALEN);
>+			}

	This is the code that resets the MAC header as described above.
It doesn't quite match the documentation, since it only resets the MAC
for ETH_P_IP packets.
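
	A user-space sketch of that behaviour, to show the mismatch.
The header struct and function name are stand-ins, not the kernel's
struct ethhdr or any bonding function:

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN		6
#define ETHERTYPE_IPV4	0x0800

/* Simplified Ethernet header; the protocol is kept in wire order. */
struct eth_frame_hdr {
	uint8_t dest[MAC_LEN];
	uint8_t source[MAC_LEN];
	uint8_t proto[2];
};

/*
 * Mirror of the patch's condition: the source MAC is overwritten with
 * the slave's address only when the frame carries IPv4. Returns 1 if
 * the header was rewritten, 0 if it was left alone.
 */
int maybe_rewrite_source(struct eth_frame_hdr *hdr, const uint8_t *slave_mac)
{
	uint16_t proto = (uint16_t)((hdr->proto[0] << 8) | hdr->proto[1]);

	if (proto != ETHERTYPE_IPV4)
		return 0;

	memcpy(hdr->source, slave_mac, MAC_LEN);
	return 1;
}
```

	An ARP or IPv6 frame goes out with the bond's MAC while IPv4
frames carry the slave's, so the documentation's "resets mac-address
within packet" holds only for IPv4.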

> 			res = bond_dev_queue_xmit(bond, skb, slave->dev);
> 			break;
> 		}
>diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c linux-2.6.37.my/drivers/net/bonding/bond_main.c
>--- linux-2.6.37-vanilla/drivers/net/bonding/bond_main.c	2011-01-14 19:39:05.575268000 +0300
>+++ linux-2.6.37.my/drivers/net/bonding/bond_main.c	2011-01-14 19:47:55.835268001 +0300
>@@ -152,7 +152,9 @@ module_param(ad_select, charp, 0);
> MODULE_PARM_DESC(ad_select, "803.ad aggregation selection logic: stable (0, default), bandwidth (1), count (2)");
> module_param(xmit_hash_policy, charp, 0);
> MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method: 0 for layer 2 (default)"
>-				   ", 1 for layer 3+4");
>+				   ", 1 for layer 3+4"
>+				   ", 2 for layer 2+3"
>+				   ", 3 for round-robin");
> module_param(arp_interval, int, 0);
> MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds");
> module_param_array(arp_ip_target, charp, NULL, 0);
>@@ -206,6 +208,7 @@ const struct bond_parm_tbl xmit_hashtype
> {	"layer2",		BOND_XMIT_POLICY_LAYER2},
> {	"layer3+4",		BOND_XMIT_POLICY_LAYER34},
> {	"layer2+3",		BOND_XMIT_POLICY_LAYER23},
>+{	"simple-rr",		BOND_XMIT_POLICY_LAYERRR},

	I'd just call it "round-robin" instead of "simple-rr".

> {	NULL,			-1},
> };
>
>@@ -3762,6 +3765,16 @@ static int bond_xmit_hash_policy_l2(stru
> 	return (data->h_dest[5] ^ data->h_source[5]) % count;
> }
>
>+/*
>+ * simply round robin
>+ */
>+static int bond_xmit_hash_policy_rr(struct sk_buff *skb,
>+				   struct net_device *bond_dev, int count)
>+{
>+	struct bonding *bond = netdev_priv(bond_dev);
>+	return bond->rr_tx_counter++ % count;
>+}
>+
> /*-------------------------- Device entry points ----------------------------*/
>
> static int bond_open(struct net_device *bond_dev)
>@@ -4482,6 +4495,9 @@ out:
> static void bond_set_xmit_hash_policy(struct bonding *bond)
> {
> 	switch (bond->params.xmit_policy) {
>+	case BOND_XMIT_POLICY_LAYERRR:
>+		bond->xmit_hash_policy = bond_xmit_hash_policy_rr;
>+		break;
> 	case BOND_XMIT_POLICY_LAYER23:
> 		bond->xmit_hash_policy = bond_xmit_hash_policy_l23;
> 		break;
>diff -uprN -X linux-2.6.37-vanilla/Documentation/dontdiff linux-2.6.37-vanilla/include/linux/if_bonding.h linux-2.6.37.my/include/linux/if_bonding.h
>--- linux-2.6.37-vanilla/include/linux/if_bonding.h	2011-01-05 03:50:19.000000000 +0300
>+++ linux-2.6.37.my/include/linux/if_bonding.h	2011-01-14 19:34:29.755268001 +0300
>@@ -91,6 +91,7 @@
> #define BOND_XMIT_POLICY_LAYER2		0 /* layer 2 (MAC only), default */
> #define BOND_XMIT_POLICY_LAYER34	1 /* layer 3+4 (IP ^ (TCP || UDP)) */
> #define BOND_XMIT_POLICY_LAYER23	2 /* layer 2+3 (IP ^ MAC) */
>+#define BOND_XMIT_POLICY_LAYERRR	3 /* round-robin */
>
> typedef struct ifbond {
> 	__s32 bond_mode;

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [PATCH 1/2] genirq: Add IRQ affinity notifiers
From: Thomas Gleixner @ 2011-01-14 20:40 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, Tom Herbert, linux-kernel, netdev,
	linux-net-drivers
In-Reply-To: <1295035597.5386.8.camel@bwh-desktop>

On Fri, 14 Jan 2011, Ben Hutchings wrote:
> On Fri, 2011-01-14 at 20:47 +0100, Thomas Gleixner wrote:
> > > +#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)
> > 
> > The whole affinity thing is SMP and GENERIC_HARDIRQS only anyway, so
> > what's the point of this ifdeffery ?
> 
> The intent is that code using this can be compiled even if those config
> options are not set.  The work_struct is not needed in that case.  I
> think this is probably pointless though.

Yup, work_struct is defined for the !SMP and !GENERIC_HARDIRQS case as
well :)
 
> > > +        struct work_struct work;
> > > +#endif
> > > +        void (*notify)(struct irq_affinity_notify *, const cpumask_t *mask);
> > > +        void (*release)(struct kref *ref);
> > > +};
> > > +
> > 
> > > +/**
> > > + *	irq_set_affinity_notifier - control notification of IRQ affinity changes
> > > + *	@irq:		Interrupt for which to enable/disable notification
> > > + *	@notify:	Context for notification, or %NULL to disable
> > > + *			notification.  Function pointers must be initialised;
> > > + *			the other fields will be initialised by this function.
> > > + *
> > > + *	Must be called in process context.  Notification may only be enabled
> > > + *	after the IRQ is allocated but before it is bound with request_irq()
> > 
> > Why? And if there is that restriction, then it needs to be
> > checked. But I don't see why this is necessary.
> 
> Which restriction?

  Notification may only be enabled after the IRQ is allocated but
  before it is bound with request_irq()

That it must happen after the IRQ is allocated is obvious, but why does
it need to be done _before_ request_irq()?

Thanks,

	tglx

^ permalink raw reply

* Re: Kernel 2.6.37-git10 build failure: cassini.c
From: David Miller @ 2011-01-14 20:41 UTC (permalink / raw)
  To: anca.emanuel
  Cc: linux-kernel, netdev, grant.likely, eric.dumazet, joe, siccegge,
	jpirko
In-Reply-To: <AANLkTin61iqrD0Jbax+cvQ1wxD8b=c4KJv7pBSvU7kbv@mail.gmail.com>

From: Anca Emanuel <anca.emanuel@gmail.com>
Date: Fri, 14 Jan 2011 10:09:43 +0200

> drivers/net/cassini.c: In function ‘cas_get_vpd_info’:
> drivers/net/cassini.c:3358: error: implicit declaration of function
> ‘of_get_property’
> drivers/net/cassini.c:3358: warning: assignment makes pointer from
> integer without a cast
> drivers/net/cassini.c: In function ‘cas_init_one’:
> drivers/net/cassini.c:5035: error: implicit declaration of function
> ‘pci_device_to_OF_node’
> drivers/net/cassini.c:5035: warning: assignment makes pointer from
> integer without a cast
> make[3]: *** [drivers/net/cassini.o] Error 1
> make[2]: *** [drivers/net] Error 2

This is the fix I'll be using:

--------------------
cassini: Fix build bustage on x86.

Unfortunately, not all CONFIG_OF platforms provide
pci_device_to_OF_node().

Change the test to CONFIG_SPARC for now to deal with
the build regressions.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/cassini.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c
index 7206ab2..3437613 100644
--- a/drivers/net/cassini.c
+++ b/drivers/net/cassini.c
@@ -3203,7 +3203,7 @@ static int cas_get_vpd_info(struct cas *cp, unsigned char *dev_addr,
 	int phy_type = CAS_PHY_MII_MDIO0; /* default phy type */
 	int mac_off  = 0;
 
-#if defined(CONFIG_OF)
+#if defined(CONFIG_SPARC)
 	const unsigned char *addr;
 #endif
 
@@ -3354,7 +3354,7 @@ use_random_mac_addr:
 	if (found & VPD_FOUND_MAC)
 		goto done;
 
-#if defined(CONFIG_OF)
+#if defined(CONFIG_SPARC)
 	addr = of_get_property(cp->of_node, "local-mac-address", NULL);
 	if (addr != NULL) {
 		memcpy(dev_addr, addr, 6);
@@ -5031,7 +5031,7 @@ static int __devinit cas_init_one(struct pci_dev *pdev,
 	cp->msg_enable = (cassini_debug < 0) ? CAS_DEF_MSG_ENABLE :
 	  cassini_debug;
 
-#if defined(CONFIG_OF)
+#if defined(CONFIG_SPARC)
 	cp->of_node = pci_device_to_OF_node(pdev);
 #endif
 
-- 
1.7.3.4


^ permalink raw reply related

* Re: [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing
From: Nicolas de Pesloüan @ 2011-01-14 20:41 UTC (permalink / raw)
  To: Oleg V. Ukhno; +Cc: netdev, Jay Vosburgh, David S. Miller
In-Reply-To: <20110114190714.GA11655@yandex-team.ru>

Le 14/01/2011 20:07, Oleg V. Ukhno a écrit :

> +
> +		For correct load baalncing on the receiving side you must
> +		configure switch for using src-dst-mac or src-mac hashing
> +		mode.

Typo in baalncing -> balancing.

	Nicolas.

^ permalink raw reply

* Re: [PULL] vhost-net: 2.6.38 fix
From: David Miller @ 2011-01-14 20:41 UTC (permalink / raw)
  To: mst; +Cc: kvm, virtualization, netdev, linux-kernel
In-Reply-To: <20110114093302.GA702@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Fri, 14 Jan 2011 11:33:02 +0200

> Please pull the following for 2.6.38.
> Thanks!
> 
> The following changes since commit 0c21e3aaf6ae85bee804a325aa29c325209180fd:
> 
>   Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus (2011-01-07 17:16:27 -0800)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net
> 
> Michael S. Tsirkin (1):
>       vhost: fix signed/unsigned comparison

Pulled, thanks.

^ permalink raw reply

* Re: pull request: sfc-2.6 2011-01-14
From: David Miller @ 2011-01-14 20:42 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, linux-net-drivers
In-Reply-To: <1295014889.5386.1.camel@bwh-desktop>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 14 Jan 2011 14:21:29 +0000

> The following changes since commit 5b919f833d9d60588d026ad82d17f17e8872c7a9:
> 
>   net: ax25: fix information leak to userland harder (2011-01-12 00:34:49 -0800)
> 
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6.git master
> 
> A minor optimisation and a regression fix.

Pulled, thanks Ben.

^ permalink raw reply

* Re: [net-2.6 0/3][pull-request] Intel Wired LAN Driver Updates
From: David Miller @ 2011-01-14 20:43 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, bphilips
In-Reply-To: <1295005350-28124-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Fri, 14 Jan 2011 03:42:27 -0800

> The following series contains a fix for e1000 and trivial fixes
> for e1000e.
> 
> The following are changes since commit 1949e084bfd143c76e22c0b37f370d6e7bf4bfdd:
>   Merge branch 'master' of git://1984.lsi.us.es/net-2.6
> 
> and are available in the git repository at:
>   master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-2.6 master
> 
> Bruce Allan (2):
>   e1000e: update Copyright for 2011
>   e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs
> 
> Jesse Brandeburg (1):
>   e1000: Avoid unhandled IRQ

Pulled, thanks Jeff.

^ permalink raw reply

* Re: [PATCH 1/7 v2] GRETH: added raw AMBA vendor/device number to match against.
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-1-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:37 +0100

> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/7 v2] GRETH: fix opening/closing
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-2-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:38 +0100

> When NAPI is disabled there is no point in having IRQs enabled, TX/RX
> should be off before clearing the TX/RX descriptor rings.
> 
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 3/7 v2] GRETH: GBit transmit descriptor handling optimization
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-3-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:39 +0100

> It is safe to enable all fragments before enabling the first descriptor,
> this way all descriptors don't have to be processed twice, added extra
> memory barrier.
> 
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 4/7 v2] GRETH: fixed skb buffer memory leak on frame errors
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-4-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:40 +0100

> A new SKB buffer should not be allocated when the old SKB is reused.
> 
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 5/7 v2] GRETH: avoid writing bad speed/duplex when setting transfer mode
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-5-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:41 +0100

> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 6/7 v2] GRETH: handle frame error interrupts
From: David Miller @ 2011-01-14 20:46 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-6-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:42 +0100

> Frame error interrupts must also be handled since the RX flag only indicates
> successful reception, it is unlikely but the old code may lead to dead lock
> if 128 error frames are recieved in a row.
> 
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: [PATCH 7/7 v2] GRETH: resolve SMP issues and other problems
From: David Miller @ 2011-01-14 20:47 UTC (permalink / raw)
  To: daniel; +Cc: netdev, kristoffer
In-Reply-To: <1295010163-2585-7-git-send-email-daniel@gaisler.com>

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Fri, 14 Jan 2011 14:02:43 +0100

> Fixes the following:
 ...
> Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>

Applied.

^ permalink raw reply

* Re: Kernel 2.6.37-git10 build failure: cassini.c
From: Grant Likely @ 2011-01-14 21:01 UTC (permalink / raw)
  To: David Miller
  Cc: anca.emanuel, linux-kernel, netdev, eric.dumazet, joe, siccegge,
	jpirko
In-Reply-To: <20110114.124116.197952526.davem@davemloft.net>

2011/1/14 David Miller <davem@davemloft.net>:
> From: Anca Emanuel <anca.emanuel@gmail.com>
> Date: Fri, 14 Jan 2011 10:09:43 +0200
>
>> drivers/net/cassini.c: In function ‘cas_get_vpd_info’:
>> drivers/net/cassini.c:3358: error: implicit declaration of function
>> ‘of_get_property’
>> drivers/net/cassini.c:3358: warning: assignment makes pointer from
>> integer without a cast
>> drivers/net/cassini.c: In function ‘cas_init_one’:
>> drivers/net/cassini.c:5035: error: implicit declaration of function
>> ‘pci_device_to_OF_node’
>> drivers/net/cassini.c:5035: warning: assignment makes pointer from
>> integer without a cast
>> make[3]: *** [drivers/net/cassini.o] Error 1
>> make[2]: *** [drivers/net] Error 2
>
> This is the fix I'll be using:
>
> --------------------
> cassini: Fix build bustage on x86.
>
> Unfortunately, not all CONFIG_OF platforms provide
> pci_device_to_OF_node().
>
> Change the test to CONFIG_SPARC for now to deal with
> the build regressions.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Grant Likely <grant.likely@secretlab.ca>

pci_device_to_OF_node() will probably become available for all
CONFIG_OF users in 2.6.39.  In the meantime, I agree with this
solution.

g.

> ---
>  drivers/net/cassini.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c
> index 7206ab2..3437613 100644
> --- a/drivers/net/cassini.c
> +++ b/drivers/net/cassini.c
> @@ -3203,7 +3203,7 @@ static int cas_get_vpd_info(struct cas *cp, unsigned char *dev_addr,
>        int phy_type = CAS_PHY_MII_MDIO0; /* default phy type */
>        int mac_off  = 0;
>
> -#if defined(CONFIG_OF)
> +#if defined(CONFIG_SPARC)
>        const unsigned char *addr;
>  #endif
>
> @@ -3354,7 +3354,7 @@ use_random_mac_addr:
>        if (found & VPD_FOUND_MAC)
>                goto done;
>
> -#if defined(CONFIG_OF)
> +#if defined(CONFIG_SPARC)
>        addr = of_get_property(cp->of_node, "local-mac-address", NULL);
>        if (addr != NULL) {
>                memcpy(dev_addr, addr, 6);
> @@ -5031,7 +5031,7 @@ static int __devinit cas_init_one(struct pci_dev *pdev,
>        cp->msg_enable = (cassini_debug < 0) ? CAS_DEF_MSG_ENABLE :
>          cassini_debug;
>
> -#if defined(CONFIG_OF)
> +#if defined(CONFIG_SPARC)
>        cp->of_node = pci_device_to_OF_node(pdev);
>  #endif
>
> --
> 1.7.3.4
>
>



-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2011-01-14 21:03 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) NAPI and SMP locking bug fixes in GRETH from Daniel Hellstrom.

2) Fix Cassini driver build on x86.

3) Fix unhandled IRQs in e1000, from Jesse Brandeburg.

4) SFC accidentally stopped adhering to rss_cpus module parm, from
   Ben Hutchings.

5) IPV6 forwarding path must check skb->packet_type for PACKET_HOST,
   otherwise we get packet storms, fix from Alexey Kuznetsov.

6) rndis driver can deadlock in stats handling.  Part of the problem is
   the use of dev_txq_stats_fold(), which makes this situation too easy
   to get into.  Kill the interface and convert the small number of
   existing users, thus fixing the rndis deadlocks.  From Eric Dumazet.

7) tproxy w/o conntrack build fix in netfilter, from KOVACS Krisztian.

8) ath9k wireless fixes from Sujith Manoharan.

9) Fix ctnetlink error signalling such that we don't loop forever
   in some situations, from Pablo Neira Ayuso.

10) Kernel doc fixups from Randy Dunlap.

11) Wireless stack kernel doc and other comment fixes from Johannes Berg.

Please pull, thanks a lot!

The following changes since commit 4162cf64973df51fc885825bc9ca4d055891c49f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 (2011-01-11 16:32:41 -0800)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Alexey Kuznetsov (1):
      inet6: prevent network storms caused by linux IPv6 routers

Ben Hutchings (4):
      sfc: Make efx_get_tx_queue() an inline function
      sfc: Restore the effect of the rss_cpus module parameter
      ks8695net: Disable non-working ethtool operations
      ks8695net: Use default implementation of ethtool_ops::get_link

Bruce Allan (2):
      e1000e: update Copyright for 2011
      e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs

Christian Lamparter (1):
      p54: fix sequence no. accounting off-by-one error

Daniel Hellstrom (7):
      GRETH: added raw AMBA vendor/device number to match against.
      GRETH: fix opening/closing
      GRETH: GBit transmit descriptor handling optimization
      GRETH: fixed skb buffer memory leak on frame errors
      GRETH: avoid writing bad speed/duplex when setting transfer mode
      GRETH: handle frame error interrupts
      GRETH: resolve SMP issues and other problems

David S. Miller (7):
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
      Merge branch 'master' of git://1984.lsi.us.es/net-2.6
      Merge branch 'master' of git://1984.lsi.us.es/net-2.6
      cassini: Fix build bustage on x86.
      Merge branch 'vhost-net' of git://git.kernel.org/.../mst/vhost
      Merge branch 'master' of git://git.kernel.org/.../bwh/sfc-2.6
      Merge branch 'master' of master.kernel.org:/.../jkirsher/net-2.6

Eric Dumazet (1):
      net: remove dev_txq_stats_fold()

Indan Zupancic (1):
      ipw2200: Check for -1 INTA in tasklet too.

Jesper Juhl (2):
      vxge: Remember to release firmware after upgrading firmware
      USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable.

Jesse Brandeburg (1):
      e1000: Avoid unhandled IRQ

Joe Perches (2):
      bna: Remove unnecessary memset(,0,)
      netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr

Johannes Berg (5):
      mac80211: add remain-on-channel docs
      mac80211: add missing docs for off-chan TX flag
      cfg80211: add mesh join/leave callback docs
      nl80211: add/fix mesh docs
      mac80211: add doc short section on LED triggers

KOVACS Krisztian (1):
      netfilter: fix compilation when conntrack is disabled but tproxy is enabled

Kees Cook (1):
      net: ax25: fix information leak to userland harder

Michael Buesch (1):
      ssb: Ignore dangling ethernet cores on wireless devices

Michael S. Tsirkin (1):
      vhost: fix signed/unsigned comparison

Nicolas Dichtel (1):
      ipsec: update MAX_AH_AUTH_LEN to support sha512

Pablo Neira Ayuso (1):
      netfilter: ctnetlink: fix loop in ctnetlink_get_conntrack()

Randy Dunlap (1):
      eth: fix new kernel-doc warning

Stanislaw Gruszka (1):
      hostap_cs: fix sleeping function called from invalid context

Sujith Manoharan (5):
      ath9k_hw: Fix chip test
      ath9k_hw: Fix calibration for AR9287 devices
      ath9k_hw: Fix thermal issue with UB94
      ath9k_hw: Fix RX handling for USB devices
      ath9k_htc: Really fix packet injection

Tobias Klauser (4):
      netdev: ucc_geth: Use is_multicast_ether_addr helper
      netdev: bfin_mac: Use is_multicast_ether_addr helper
      etherdevice.h: Add is_unicast_ether_addr function
      netdev: tilepro: Use is_unicast_ether_addr helper

françois romieu (1):
      r8169: keep firmware in memory.

stephen hemminger (1):
      sched: remove unused backlog in RED stats

 Documentation/DocBook/80211.tmpl               |   21 ++-
 drivers/net/arm/ks8695net.c                    |  288 ++++++++----------------
 drivers/net/bfin_mac.c                         |    9 +-
 drivers/net/bna/bnad_ethtool.c                 |    1 -
 drivers/net/cassini.c                          |    6 +-
 drivers/net/e1000/e1000_main.c                 |   10 +-
 drivers/net/e1000e/82571.c                     |    4 +-
 drivers/net/e1000e/Makefile                    |    2 +-
 drivers/net/e1000e/defines.h                   |    2 +-
 drivers/net/e1000e/e1000.h                     |    2 +-
 drivers/net/e1000e/es2lan.c                    |    2 +-
 drivers/net/e1000e/ethtool.c                   |    2 +-
 drivers/net/e1000e/hw.h                        |    4 +-
 drivers/net/e1000e/ich8lan.c                   |    2 +-
 drivers/net/e1000e/lib.c                       |   20 +-
 drivers/net/e1000e/netdev.c                    |  223 +++++++++---------
 drivers/net/e1000e/param.c                     |    6 +-
 drivers/net/e1000e/phy.c                       |    4 +-
 drivers/net/gianfar.c                          |   10 +-
 drivers/net/gianfar.h                          |   10 +
 drivers/net/greth.c                            |  221 +++++++++++--------
 drivers/net/greth.h                            |    2 +
 drivers/net/ixgbe/ixgbe_main.c                 |   23 ++-
 drivers/net/macvtap.c                          |    2 +-
 drivers/net/r8169.c                            |   43 +++-
 drivers/net/sfc/efx.c                          |   18 +-
 drivers/net/sfc/net_driver.h                   |   10 +-
 drivers/net/tile/tilepro.c                     |   10 +-
 drivers/net/ucc_geth.c                         |    2 +-
 drivers/net/usb/cdc_ncm.c                      |    4 +-
 drivers/net/vxge/vxge-main.c                   |    1 +
 drivers/net/wireless/ath/ath9k/ar9002_calib.c  |    3 +
 drivers/net/wireless/ath/ath9k/eeprom_def.c    |    4 +
 drivers/net/wireless/ath/ath9k/htc.h           |    1 +
 drivers/net/wireless/ath/ath9k/htc_drv_main.c  |   37 +++-
 drivers/net/wireless/ath/ath9k/hw.c            |    5 +-
 drivers/net/wireless/hostap/hostap_cs.c        |   15 +-
 drivers/net/wireless/ipw2x00/ipw2200.c         |    7 +
 drivers/net/wireless/p54/txrx.c                |    2 +-
 drivers/ssb/scan.c                             |   10 +
 drivers/vhost/vhost.c                          |   18 +-
 include/linux/etherdevice.h                    |   11 +
 include/linux/netdevice.h                      |    5 -
 include/linux/nl80211.h                        |   20 ++-
 include/linux/skbuff.h                         |   15 ++
 include/net/ah.h                               |    2 +-
 include/net/cfg80211.h                         |    2 +
 include/net/mac80211.h                         |   14 ++
 include/net/netfilter/ipv6/nf_conntrack_ipv6.h |   10 -
 include/net/netfilter/ipv6/nf_defrag_ipv6.h    |   10 +
 include/net/red.h                              |    1 -
 net/ax25/af_ax25.c                             |    2 +-
 net/core/dev.c                                 |   29 ---
 net/core/skbuff.c                              |    2 +
 net/ethernet/eth.c                             |    2 +-
 net/ipv6/ip6_output.c                          |    3 +
 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c      |    8 +-
 net/netfilter/nf_conntrack_netlink.c           |    3 +-
 net/sched/sch_teql.c                           |   26 ++-
 59 files changed, 655 insertions(+), 576 deletions(-)

^ permalink raw reply

* Re: sch_sfb
From: Jarek Poplawski @ 2011-01-14 21:06 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Patrick McHardy, netdev, David Miller
In-Reply-To: <7ivd1rsj6n.fsf@lanthane.pps.jussieu.fr>

Juliusz Chroboczek wrote:
>> I just looked at it out of interest after already having started my
>> own version.
> 
>>>   http://thread.gmane.org/gmane.linux.network/90225
>>>   http://thread.gmane.org/gmane.linux.network/90375
> 
>>> It was reviewed in particular by one Patrick McHardy.
> 
>> There's no reason to be pissed
> 
> Yes, there is.
> 
> First you object to my patch by making a bunch of unreasonable requests
> (notably that I use the in-kernel classifiers, which are not usable with
> Bloom filters).  Then it turns out you're implementing your own version
> "from scratch".  And then you claim that you never saw my version in the
> first place?
> 
> Patrick, what you're doing is not merely rude, it's actually unethical.

Or Linux Classics ;-) Vide: Molnar vs Kolivas.

Jarek P.

^ permalink raw reply

