Netdev List


* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Emmanuel Grumbach @ 2012-06-30 18:18 UTC (permalink / raw)
  To: Paul Bolle
  Cc: Stanislaw Gruszka, John W. Linville,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1341062406.1911.76.camel-uMdlDhfIn7prKue/0VVhAg@public.gmane.org>

>
> On Wed, 2012-06-27 at 10:36 +0200, Paul Bolle wrote:
> > Every now and then, after resuming from suspend, the iwlegacy driver
> > prints
> >     iwl4965 0000:03:00.0: Queue 2 stuck for 2000 ms.
> >     iwl4965 0000:03:00.0: On demand firmware reload
> >
> > I have no idea what causes these errors. But the code currently uses
> > wd_timeout in the first error. wd_timeout will generally be set at
> > IL_DEF_WD_TIMEOUT (ie, 2000). Perhaps printing for how long the queue
> > was actually stuck can clarify the cause of these errors.
>

You may want to try this one:
http://www.spinics.net/lists/stable-commits/msg18110.html

^ permalink raw reply

* Re: problems with iproute2 m_xt again...
From: Jan Engelhardt @ 2012-06-30 13:17 UTC (permalink / raw)
  To: Andreas Henriksson
  Cc: shemminger, Linux Networking Developer Mailing List, YANG Zhe
In-Reply-To: <20120630120939.GA18134@amd64.fatal.se>

On Saturday 2012-06-30 14:09, Andreas Henriksson wrote:

>Hello!
>
>Mailing you in hope that you could help out with the xt module of iproute2 tc
>once more, as you've done in the past.... It seems to be broken again. sigh.
>
>amd64:~# iptables -nL | grep test
>Chain test (0 references)
>amd64:~# tc filter add dev fon parent ffff: protocol ip prio 10 u32 action xt -j test
> failed to find target test
>
>bad action parsing
>parse_action: bad value (3:xt)!
>Illegal "action"
>amd64:~#

Looks more like a syntax error. I don't see any functions of m_xt.so
ever being called (and it only has two).

>And maybe even more interesting is when I try to use a built-in chain 
>like DROP:

DROP is not a chain, but a verdict.

>amd64:~# tc filter add dev fon parent ffff: protocol ip prio 10 u32 
>action xt -j DROP
>tablename: mangle hook: NF_IP_PRE_ROUTING
>Segmentation fault

Rather than a segfault, this gets me "bad action parsing" in 
iproute2-3.4.0 as well.

^ permalink raw reply

* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Paul Bolle @ 2012-06-30 13:20 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <1340786187.1911.24.camel@x61.thuisdomein>

On Wed, 2012-06-27 at 10:36 +0200, Paul Bolle wrote:
> Every now and then, after resuming from suspend, the iwlegacy driver
> prints
>     iwl4965 0000:03:00.0: Queue 2 stuck for 2000 ms.
>     iwl4965 0000:03:00.0: On demand firmware reload
> 
> I have no idea what causes these errors. But the code currently uses
> wd_timeout in the first error. wd_timeout will generally be set at
> IL_DEF_WD_TIMEOUT (ie, 2000). Perhaps printing for how long the queue
> was actually stuck can clarify the cause of these errors.

0) It's not just after resume! I just found the following lines in
dmesg (note that, for some reason, every message in dmesg during this
period was wlan related):

[...]
[114649.872338] wlan0: associated
[115837.970798] wlan0: deauthenticated from [...] (Reason: 7)
[115837.993405] cfg80211: Calling CRDA to update world regulatory domain
[115838.011979] cfg80211: World regulatory domain updated:
[115838.011986] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[115838.011995] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[115838.012033] cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[115838.012041] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[115838.012048] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[115838.012055] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[115838.012108] cfg80211: Calling CRDA for country: NL
[115838.022615] cfg80211: Regulatory domain changed to country: NL
[115838.022622] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[115838.022630] cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[115838.022637] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[115838.022644] cfg80211:   (5250000 KHz - 5330000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[115838.022651] cfg80211:   (5490000 KHz - 5710000 KHz @ 40000 KHz), (N/A, 2700 mBm)
[115840.219977] wlan0: authenticate with [...]
[115840.228865] wlan0: send auth to [...] (try 1/3)
[115840.429054] wlan0: send auth to [...] (try 2/3)
[115840.630026] wlan0: send auth to [...] (try 3/3)
[115840.831051] wlan0: authentication with [...] timed out
[115848.022336] wlan0: authenticate with [...]
[115848.022495] wlan0: direct probe to [...] (try 1/3)
[115848.223052] wlan0: direct probe to [...] (try 2/3)
[115848.424052] wlan0: direct probe to [...] (try 3/3)
[115848.625048] wlan0: authentication with [...] timed out
[115855.702461] wlan0: authenticate with [...]
[115855.702623] wlan0: direct probe to [...] (try 1/3)
[115855.903053] wlan0: direct probe to [...] (try 2/3)
[115856.104061] wlan0: direct probe to [...] (try 3/3)
[115856.305050] wlan0: authentication with [...] timed out
[115863.464067] wlan0: authenticate with [...]
[115863.464221] wlan0: direct probe to [...] (try 1/3)
[115863.665054] wlan0: direct probe to [...] (try 2/3)
[115863.866058] wlan0: direct probe to [...] (try 3/3)
[115864.067051] wlan0: authentication with [...] timed out
[115871.267216] wlan0: authenticate with [...]
[115871.267376] wlan0: send auth to [...] (try 1/3)
[115871.269191] wlan0: authenticated
[115871.279459] wlan0: associate with [...] (try 1/3)
[115871.281715] wlan0: RX AssocResp from [...] (capab=0x411 status=0 aid=2)
[115871.281723] wlan0: associated
[115871.457043] iwl4965 0000:03:00.0: Queue 2 stuck for 33487 ms.
[115871.457048] iwl4965 0000:03:00.0: On demand firmware reload
[115871.457149] ieee80211 phy0: Hardware restart was requested
[117985.197630] wlan0: deauthenticating from [...] by local choice (reason=3)
[...]

1) My guess is that I left my laptop unattended for quite some time (so
_perhaps_ generating little to no wlan traffic), and in that period
NetworkManager and friends ran into trouble re-authenticating to my
wireless router. By the time authentication succeeded it just looked
like "Queue 2" was stuck. Wild, uneducated guess, actually.

2) It's always "Queue 2" that's stuck. What does that queue do?


Paul Bolle

^ permalink raw reply
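
A quick cross-check of the numbers in the log above, as a minimal sketch:
subtracting the reported stuck time from the timestamp of the "Queue 2
stuck" line gives the moment the queue was last seen making progress. The
helper names and the microsecond representation are assumptions made for
illustration; they are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given the dmesg timestamp of the "stuck" message
 * (in microseconds) and the reported stuck time (in milliseconds),
 * return the timestamp at which the queue last made progress.
 */
static uint64_t last_progress_us(uint64_t report_us, uint64_t stuck_ms)
{
	assert(report_us >= stuck_ms * 1000);
	return report_us - stuck_ms * 1000;
}

/* Distance between two timestamps, in microseconds. */
static uint64_t abs_diff_us(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}
```

With the values from the log (report at 115871.457043 s, 33487 ms stuck)
this gives 115837.970043 s, which is less than a millisecond away from the
"deauthenticated" line at 115837.970798. That would be consistent with
guess 1) in the message, though it does not by itself explain the cause.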

* Re: AF_BUS socket address family
From: Alan Cox @ 2012-06-30 13:12 UTC (permalink / raw)
  To: David Miller; +Cc: vincent.sanders, netdev, linux-kernel
In-Reply-To: <20120629.165023.1605284574408858612.davem@davemloft.net>

> What you really don't get is that packet drops and event losses are
> absolutely fundamental.

The world is full of "receiver reliable" multicast transport providers
which provide ordered defined message delivery properties.

They are reliable in the sense that a message is either queued to the
other ends or is not queued. They are not reliable in the sense of "we
wait forever".

In fact if you look up the stack you'll find a large number of multicast
messaging systems which do reliable transport built on top of IP. In fact
Red Hat provides a high level messaging cluster service that does exactly
this. (as well as dbus which does it on the desktop level) plus a ton of
stuff on top of that (JGroups etc)

Everybody at the application level has been using these 'receiver
reliable'  multicast services for years (Websphere MQ, TIBCO, RTPGM,
OpenPGM, MS-PGM, you name it). There are even accelerators for PGM based
protocols in things like Cisco routers and Solarflare can do much of it
on the card for 10Gbit.

> As long as receivers lack infinite receive queue this will always be
> the case.
> 
> Multicast operates in non-reliable transports only so that one stuck
> or malfunctioning receiver doesn't screw things over for everyone nor
> unduly burden the sender.

All the world is not IP. Dealing with a malfunctioning receiver is
something dbus already has to deal with. "Unduly burden the sender" is
you talking out of your underwear. The sender is already implementing
this property set - in user space. So there can't be any more burdening,
in fact the point of this is to get rid of excess burdens caused by lack
of kernel support.

This is a latency issue not a throughput one so you can't hide it with
buffers. A few ms shaved off desktop behaviour here and there makes a
massive difference to perceived responsiveness. Less task switches and
daemons means a lot less tasks bouncing around processors which means
less power consumption.

Alan

^ permalink raw reply

* [PATCH v6] sctp: be more restrictive in transport selection on bundled sacks
From: Neil Horman @ 2012-06-30 13:04 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Vlad Yasevich, David S. Miller, linux-sctp
In-Reply-To: <1340742704-2192-1-git-send-email-nhorman@tuxdriver.com>

It was noticed recently that when we send data on a transport, it's possible that
we might bundle a sack that arrived on a different transport.  While this isn't
a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
2960:

 An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
   etc.) to the same destination transport address from which it
   received the DATA or control chunk to which it is replying.  This
   rule should also be followed if the endpoint is bundling DATA chunks
   together with the reply chunk.

This patch seeks to correct that.  It restricts the bundling of sack operations
to only those transports which have moved the ctsn of the association forward
since the last sack.  By doing this we guarantee that we only bundle outbound
sacks on a transport that has received a chunk since the last sack.  This brings
us into stricter compliance with the RFC.

Vlad had initially suggested that we strictly allow only sack bundling on the
transport that last moved the ctsn forward.  While this makes sense, I was
concerned that doing so prevented us from bundling in the case where we had
received chunks that moved the ctsn on multiple transports.  In those cases, the
RFC allows us to select any of the transports having received chunks to bundle
the sack on.  So I've modified the approach to allow for that, by adding a state
variable to each transport that tracks whether it has moved the ctsn since the
last sack.  This I think keeps our behavior (and performance), close enough to
our current profile that I think we can do this without a sysctl knob to
enable/disable it.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Reported-by: Michele Baldessari <michele@redhat.com>
Reported-by: sorin serban <sserban@redhat.com>

---
Change Notes:
V2)
	* Removed unused variable as per Dave M. Request
	* Delayed rwnd adjustment until we are sure we will sack (Vlad Y.)
V3)
	* Switched test to use pkt->transport rather than chunk->transport
	* Modified detection of sack-able transport.  Instead of just setting
	  and clearing a flag, we now mark each transport and association with
	  a sack generation tag.  We increment the association's generation on
	  every sack, and assign that generation tag to every transport that
	  updates the ctsn.  This prevents us from having to iterate over a for
	  loop on every sack, which is much more scalable.
V4)
	* Fixed up wrapping comment and logic
V5)
	* Simplified wrap logic further per request from vlad
V6)
	* Changed some style point as per request from Dave M.
---
 include/net/sctp/structs.h |    4 ++++
 include/net/sctp/tsnmap.h  |    3 ++-
 net/sctp/associola.c       |    1 +
 net/sctp/output.c          |    5 +++++
 net/sctp/sm_make_chunk.c   |   16 ++++++++++++++++
 net/sctp/sm_sideeffect.c   |    2 +-
 net/sctp/transport.c       |    2 ++
 net/sctp/tsnmap.c          |    6 +++++-
 net/sctp/ulpevent.c        |    3 ++-
 net/sctp/ulpqueue.c        |    2 +-
 10 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index e4652fe..fecdf31 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -912,6 +912,9 @@ struct sctp_transport {
 		/* Is this structure kfree()able? */
 		malloced:1;
 
+	/* Has this transport moved the ctsn since we last sacked */
+	__u32 sack_generation;
+
 	struct flowi fl;
 
 	/* This is the peer's IP address and port. */
@@ -1584,6 +1587,7 @@ struct sctp_association {
 		 */
 		__u8    sack_needed;     /* Do we need to sack the peer? */
 		__u32	sack_cnt;
+		__u32	sack_generation;
 
 		/* These are capabilities which our peer advertised.  */
 		__u8	ecn_capable:1,	    /* Can peer do ECN? */
diff --git a/include/net/sctp/tsnmap.h b/include/net/sctp/tsnmap.h
index e7728bc..2c5d2b4 100644
--- a/include/net/sctp/tsnmap.h
+++ b/include/net/sctp/tsnmap.h
@@ -117,7 +117,8 @@ void sctp_tsnmap_free(struct sctp_tsnmap *map);
 int sctp_tsnmap_check(const struct sctp_tsnmap *, __u32 tsn);
 
 /* Mark this TSN as seen.  */
-int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn);
+int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn,
+		     struct sctp_transport *trans);
 
 /* Mark this TSN and all lower as seen. */
 void sctp_tsnmap_skip(struct sctp_tsnmap *map, __u32 tsn);
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 5bc9ab1..b16517e 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -271,6 +271,7 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
 	 */
 	asoc->peer.sack_needed = 1;
 	asoc->peer.sack_cnt = 0;
+	asoc->peer.sack_generation = 1;
 
 	/* Assume that the peer will tell us if he recognizes ASCONF
 	 * as part of INIT exchange.
diff --git a/net/sctp/output.c b/net/sctp/output.c
index f1b7d4b..6ae47ac 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -248,6 +248,11 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
 		/* If the SACK timer is running, we have a pending SACK */
 		if (timer_pending(timer)) {
 			struct sctp_chunk *sack;
+
+			if (pkt->transport->sack_generation !=
+			    pkt->transport->asoc->peer.sack_generation)
+				return retval;
+
 			asoc->a_rwnd = asoc->rwnd;
 			sack = sctp_make_sack(asoc);
 			if (sack) {
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index a85eeeb..098cff5 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -734,8 +734,10 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
 	int len;
 	__u32 ctsn;
 	__u16 num_gabs, num_dup_tsns;
+	struct sctp_association *aptr = (struct sctp_association *)asoc;
 	struct sctp_tsnmap *map = (struct sctp_tsnmap *)&asoc->peer.tsn_map;
 	struct sctp_gap_ack_block gabs[SCTP_MAX_GABS];
+	struct sctp_transport *trans;
 
 	memset(gabs, 0, sizeof(gabs));
 	ctsn = sctp_tsnmap_get_ctsn(map);
@@ -805,6 +807,20 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
 		sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
 				 sctp_tsnmap_get_dups(map));
 
+	/* Once we have a sack generated, check to see what our sack
+	 * generation is, if its 0, reset the transports to 0, and reset
+	 * the association generation to 1
+	 *
+	 * The idea is that zero is never used as a valid generation for the
+	 * association so no transport will match after a wrap event like this,
+	 * Until the next sack
+	 */ 
+	if (++aptr->peer.sack_generation == 0) {
+		list_for_each_entry(trans, &asoc->peer.transport_addr_list,
+				    transports)
+			trans->sack_generation = 0;
+		aptr->peer.sack_generation = 1;
+	}
 nodata:
 	return retval;
 }
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index c96d1a8..8716da1 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1268,7 +1268,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
 		case SCTP_CMD_REPORT_TSN:
 			/* Record the arrival of a TSN.  */
 			error = sctp_tsnmap_mark(&asoc->peer.tsn_map,
-						 cmd->obj.u32);
+						 cmd->obj.u32, NULL);
 			break;
 
 		case SCTP_CMD_REPORT_FWDTSN:
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index b026ba0..1dcceb6 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -68,6 +68,8 @@ static struct sctp_transport *sctp_transport_init(struct sctp_transport *peer,
 	peer->af_specific = sctp_get_af_specific(addr->sa.sa_family);
 	memset(&peer->saddr, 0, sizeof(union sctp_addr));
 
+	peer->sack_generation = 0;
+
 	/* From 6.3.1 RTO Calculation:
 	 *
 	 * C1) Until an RTT measurement has been made for a packet sent to the
diff --git a/net/sctp/tsnmap.c b/net/sctp/tsnmap.c
index f1e40ceb..b5fb7c4 100644
--- a/net/sctp/tsnmap.c
+++ b/net/sctp/tsnmap.c
@@ -114,7 +114,8 @@ int sctp_tsnmap_check(const struct sctp_tsnmap *map, __u32 tsn)
 
 
 /* Mark this TSN as seen.  */
-int sctp_tsnmap_mark(struct sctp_tsnmap *map, __u32 tsn)
+int sctp_tsnmap_mark(struct sctp_tsnmap *map, __u32 tsn,
+		     struct sctp_transport *trans)
 {
 	u16 gap;
 
@@ -133,6 +134,9 @@ int sctp_tsnmap_mark(struct sctp_tsnmap *map, __u32 tsn)
 		 */
 		map->max_tsn_seen++;
 		map->cumulative_tsn_ack_point++;
+		if (trans)
+			trans->sack_generation =
+				trans->asoc->peer.sack_generation;
 		map->base_tsn++;
 	} else {
 		/* Either we already have a gap, or about to record a gap, so
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 8a84017..33d8947 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -715,7 +715,8 @@ struct sctp_ulpevent *sctp_ulpevent_make_rcvmsg(struct sctp_association *asoc,
 	 * can mark it as received so the tsn_map is updated correctly.
 	 */
 	if (sctp_tsnmap_mark(&asoc->peer.tsn_map,
-			     ntohl(chunk->subh.data_hdr->tsn)))
+			     ntohl(chunk->subh.data_hdr->tsn),
+			     chunk->transport))
 		goto fail_mark;
 
 	/* First calculate the padding, so we don't inadvertently
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index f2d1de7..f5a6a4f 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -1051,7 +1051,7 @@ void sctp_ulpq_renege(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 	if (chunk && (freed >= needed)) {
 		__u32 tsn;
 		tsn = ntohl(chunk->subh.data_hdr->tsn);
-		sctp_tsnmap_mark(&asoc->peer.tsn_map, tsn);
+		sctp_tsnmap_mark(&asoc->peer.tsn_map, tsn, chunk->transport);
 		sctp_ulpq_tail_data(ulpq, chunk, gfp);
 
 		sctp_ulpq_partial_delivery(ulpq, chunk, gfp);
-- 
1.7.7.6

^ permalink raw reply related
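
The generation-tag scheme described in the change notes above can be
sketched in plain userspace C. This is only an illustration of the idea
under simplified assumptions: the struct layouts and the function names
(transport_moved_ctsn, may_bundle_sack, sack_generated) are hypothetical
and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct transport {
	uint32_t sack_generation;	/* 0 never matches an association */
};

struct association {
	uint32_t sack_generation;	/* starts at 1, never left at 0 */
	struct transport *transports;
	size_t ntransports;
};

/* A transport moved the cumulative TSN: tag it with the current generation. */
static void transport_moved_ctsn(struct association *asoc, struct transport *t)
{
	assert(asoc->sack_generation != 0);
	t->sack_generation = asoc->sack_generation;
}

/* May a pending SACK be bundled on this transport? */
static int may_bundle_sack(const struct association *asoc,
			   const struct transport *t)
{
	return t->sack_generation == asoc->sack_generation;
}

/* A SACK was generated: start a new generation, handling wrap-around. */
static void sack_generated(struct association *asoc)
{
	size_t i;

	if (++asoc->sack_generation == 0) {
		/* Wrapped: clear every tag so no stale value can match. */
		for (i = 0; i < asoc->ntransports; i++)
			asoc->transports[i].sack_generation = 0;
		asoc->sack_generation = 1;
	}
}
```

The point of the counter is that the common path costs one increment and
one comparison; the loop over all transports only runs on the rare wrap.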

* Re: AF_BUS socket address family
From: Alan Cox @ 2012-06-30 12:52 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Vincent Sanders, David Miller, netdev, linux-kernel
In-Reply-To: <20120630001350.GS21968@kvack.org>

On Fri, 29 Jun 2012 20:13:50 -0400
Benjamin LaHaise <bcrl@kvack.org> wrote:

> On Sat, Jun 30, 2012 at 12:42:30AM +0100, Vincent Sanders wrote:
> > The current users are suffering from the issues outlined in my
> > introductory mail all the time. These issues are caused by emulating an
> > IPC system over AF_UNIX in userspace.
> 
> Nothing in your introductory statements indicate how your requirements 
> can't be met through a hybrid socket + shared memory solution.  The IPC 
> facilities of the kernel are already quite rich, and sufficient for 
> building many kinds of complex systems.  What's so different about DBus' 
> requirements?

dbus wants to
- multicast
- pass file handles
- never lose an event
- be fast
- have a security model

The security model makes a shared memory hack impractical, the file
handle passing means at least some of it needs to be AF_UNIX. The event
loss handling/speed argue for putting it in kernel.

I'm not convinced AF_BUS entirely sorts this either. In particular the
failure case dbus currently has to handle for not losing events allows it
to identify who in a "group" has jammed the bus by not listening (eg by
locking up). This information appears to be lost in the AF_BUS case and
that's slightly catastrophic for error recovery.

Alan

^ permalink raw reply
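
To make the "receiver reliable" property discussed in this thread more
concrete, here is a toy sketch in plain C: a message is either queued to
every receiver or to none, and on failure the sender learns which receiver
had no room. All names (struct rx_queue, bus_multicast, RXQ_DEPTH) are
hypothetical; this is neither the dbus nor the AF_BUS implementation, only
an illustration of the semantics.

```c
#include <assert.h>
#include <stddef.h>

#define RXQ_DEPTH 4	/* hypothetical per-receiver queue depth */

struct rx_queue {
	int msgs[RXQ_DEPTH];
	size_t count;
};

/* Queue msg to all n receivers, or to none of them.
 * Returns -1 on success, otherwise the index of the first receiver
 * whose queue is full, so the caller can tell who jammed the bus.
 */
static int bus_multicast(struct rx_queue *rx, size_t n, int msg)
{
	size_t i;

	assert(rx != NULL || n == 0);

	/* First pass: make sure every receiver has room. */
	for (i = 0; i < n; i++)
		if (rx[i].count == RXQ_DEPTH)
			return (int)i;

	/* Second pass: nothing can fail now, so deliver everywhere. */
	for (i = 0; i < n; i++)
		rx[i].msgs[rx[i].count++] = msg;

	return -1;
}
```

Returning the index of the full queue is the piece of information that,
according to the message above, appears to get lost in the AF_BUS case.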

* Re: [PATCH v5] sctp: be more restrictive in transport selection on bundled sacks
From: Neil Horman @ 2012-06-30 12:26 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, vyasevich, linux-sctp
In-Reply-To: <20120629.163408.1111786778802302299.davem@davemloft.net>

On Fri, Jun 29, 2012 at 04:34:08PM -0700, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Fri, 29 Jun 2012 16:15:29 -0400
> 
> > +	/* Has this transport moved the ctsn since we last sacked */
> > +	__u32 sack_generation;
> > +
>  ...
> > +		__u32	sack_generation;
> 
> These are __u32 but they only take on the value '1' or '0'.  Please
> use bool and give it a more reasonable name, a name that describes
> how it is really a predicate.
> 
This is wrong.  It's a counter that increments every time we call sctp_make_sack,
so that we can create a unique generation identifier for use in tagging which
transports move ctsn in a given generation.  It saves us from having to iterate
over a list every time we send a sack. 

> > -		struct sctp_association *asoc;
> >  		struct timer_list *timer;
> > -		asoc = pkt->transport->asoc;
> > +		struct sctp_association *asoc = pkt->transport->asoc;
> > +
> 
> Please leave asoc where it was, on the first line.  We encourage
> listing local variables such that the longest lines come first,
> then gradually shorter and shorter lines.
> 
> > +	/*
> > +	 * Once we have a sack generated, check to see what our sack
> > +	 * generation is, if its 0, reset the transports to 0, and reset
> 
> Please format:
> 
>   /* Like
>    * this.
>    */
> 
> Thanks.
> 
Very well, I'll repost in the next few days
Neil

> 

^ permalink raw reply

* possible integer underflow in __sctp_auth_cid()
From: Dan Carpenter @ 2012-06-30 12:17 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: linux-sctp, netdev

In 555d3d5d "SCTP: Fix chunk acceptance when no authenticated chunks
were listed.", we added a check for if (param->param_hdr.length == 0).
Shouldn't that check be a check for if
(param->param_hdr.length < sizeof(sctp_paramhdr_t))?  Otherwise,
when we do the subtraction on the next line we would unintentionally
end up with a high positive number.

I had a similar question about sctp_auth_ep_add_chunkid():

net/sctp/auth.c
   770          /* Check if we can add this chunk to the array */
   771          param_len = ntohs(p->param_hdr.length);
   772          nchunks = param_len - sizeof(sctp_paramhdr_t);
   773          if (nchunks == SCTP_NUM_CHUNK_TYPES)
   774                  return -EINVAL;
   775
   776          p->chunks[nchunks] = chunk_id;

If param_len is less than sizeof(sctp_paramhdr_t) we could write past
the end of the array.  There are a couple other places with this same
subtraction as well.

regards,
dan carpenter

^ permalink raw reply
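
The wrap-around described in this message is easy to reproduce in
userspace. The sketch below is an assumption-laden illustration: struct
paramhdr is a hypothetical 4-byte stand-in for sctp_paramhdr_t, the
function names are made up, and the variable types in the real code may
differ. It only shows why the length has to be compared against the
header size before subtracting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sctp_paramhdr_t: two 16-bit fields, 4 bytes. */
struct paramhdr {
	uint16_t type;
	uint16_t length;
};

/* Unchecked: same subtraction pattern as in the excerpt. The arithmetic
 * is done in size_t, so a too-small param_len wraps around.
 */
static size_t nchunks_unchecked(uint16_t param_len)
{
	return param_len - sizeof(struct paramhdr);
}

/* Checked: reject lengths shorter than the header before subtracting.
 * Returns 0 and stores the result in *nchunks, or -1 on a bad length.
 */
static int nchunks_checked(uint16_t param_len, size_t *nchunks)
{
	assert(nchunks != NULL);

	if (param_len < sizeof(struct paramhdr))
		return -1;
	*nchunks = param_len - sizeof(struct paramhdr);
	return 0;
}
```

With param_len == 2 the unchecked variant yields SIZE_MAX - 1, which is
the "high positive number" the message warns about; used as an array
index it would land far outside the array.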

* problems with iproute2 m_xt again...
From: Andreas Henriksson @ 2012-06-30 12:09 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: shemminger, netdev, YANG Zhe

Hello!

Mailing you in hope that you could help out with the xt module of iproute2 tc
once more, as you've done in the past.... It seems to be broken again. sigh.

amd64:~# iptables -nL | grep test
Chain test (0 references)
amd64:~# tc filter add dev fon parent ffff: protocol ip prio 10 u32 action xt -j test
 failed to find target test

bad action parsing
parse_action: bad value (3:xt)!
Illegal "action"
amd64:~#

And maybe even more interesting is when I try to use a built-in chain like DROP:

amd64:~# tc filter add dev fon parent ffff: protocol ip prio 10 u32 action xt -j DROP
tablename: mangle hook: NF_IP_PRE_ROUTING
Segmentation fault
amd64:~# 

(gdb) set args filter add dev fon parent ffff: protocol ip prio 10 u32 action xt -j DROP
(gdb) run
Starting program: /home/gem/opt/pkg-iproute/iproute/tc/tc filter add dev fon parent ffff: protocol ip prio 10 u32 action xt -j DROP
tablename: mangle hook: NF_IP_PRE_ROUTING

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff71b87a0 in parse_ipt (a=0x7ffff73b9500, argc_p=0x7fffffff9e94, 
    argv_p=0x7fffffff9e88, tca_id=2, n=0x7fffffffa840) at m_xt.c:230
#2  0x000000000040dc11 in parse_action (argc_p=0x7fffffff9eec, 
    argv_p=0x7fffffff9ee0, tca_id=7, n=0x7fffffffa840) at m_action.c:214
#3  0x000000000042177e in u32_parse_opt (qu=0x648c80, handle=0x0, argc=3, 
    argv=0x7fffffffea50, n=0x7fffffffa840) at f_u32.c:1126
#4  0x0000000000409bf8 in tc_filter_modify (cmd=44, flags=1536, argc=4, 
    argv=0x7fffffffea48) at tc_filter.c:142
#5  0x000000000040a620 in do_filter (argc=14, argv=0x7fffffffe9f8)
    at tc_filter.c:357
#6  0x0000000000406c74 in do_cmd (argc=15, argv=0x7fffffffe9f0) at tc.c:199
#7  0x00000000004071ae in main (argc=16, argv=0x7fffffffe9e8) at tc.c:316

This is with the iproute package version 20120521-3 on Debian unstable.
Did I screw anything up? Did I use the wrong commands? Or are things just
broken again?

Btw. I have iptables package version 1.4.14-2

-- 
Andreas Henriksson

^ permalink raw reply

* Re: [PATCH V3 2/2] bonding support for IPv6 transmit hashing
From: Hannes Frederic Sowa @ 2012-06-30 11:59 UTC (permalink / raw)
  To: John; +Cc: netdev
In-Reply-To: <4FEE99EE.2000001@8192.net>

On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
> diff --git a/Documentation/networking/bonding.txt
> b/Documentation/networking/bonding.txt
> index bfea8a3..5db14fe 100644
> --- a/Documentation/networking/bonding.txt
> +++ b/Documentation/networking/bonding.txt
> @@ -752,12 +752,22 @@ xmit_hash_policy
>                 protocol information to generate the hash.
>
>                 Uses XOR of hardware MAC addresses and IP addresses to
> -               generate the hash.  The formula is
> +               generate the hash.  The IPv4 formula is
>
>                 (((source IP XOR dest IP) AND 0xffff) XOR
>                         ( source MAC XOR destination MAC ))
>                                 modulo slave count
>
> +               The IPv6 forumla is
> +
> +               iphash =
> +                       (source ip quad 2 XOR dest IP quad 2) XOR
> +                       (source ip quad 3 XOR dest IP quad 3) XOR
> +                       (source ip quad 4 XOR dest IP quad 4)
> +
> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
> +                       modulo slave count
> +

Wouldn't it be beneficial to include the ipv6 flow label in the hash
calculation?

Greetings,

  Hannes

^ permalink raw reply
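
For reference, the IPv6 formula quoted from the documentation patch can be
written out as a small C function. This is a sketch, not the bonding
driver's code: the function name is hypothetical, "quad 2" to "quad 4" are
read here as the last three 32-bit words of the address (array indices 1
to 3), and byte order is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the documented IPv6 transmit hash.
 * src and dst hold the address as four 32-bit words; slave_count
 * must be non-zero.
 */
static uint32_t ipv6_xmit_hash(const uint32_t src[4], const uint32_t dst[4],
			       uint32_t slave_count)
{
	uint32_t iphash;

	assert(slave_count != 0);

	iphash = (src[1] ^ dst[1]) ^
		 (src[2] ^ dst[2]) ^
		 (src[3] ^ dst[3]);

	return ((iphash >> 16) ^ (iphash >> 8) ^ iphash) % slave_count;
}
```

Only address words enter the calculation. The flow label asked about in
the reply lives in the IPv6 header rather than in the addresses, so a
formula of this shape does not take it into account.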

* [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: Dan Carpenter @ 2012-06-30 11:49 UTC (permalink / raw)
  To: James E.J. Bottomley, Barak Witkowski
  Cc: Eddie Wai, Michael Chan, linux-scsi, netdev, David S. Miller

DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
12 bytes from the stack so it's a small information leak.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
This was just added to linux-next yesterday, but I'm not sure which tree
it came from.

diff --git a/drivers/scsi/bnx2i/bnx2i_init.c b/drivers/scsi/bnx2i/bnx2i_init.c
index 7729a52..b17637a 100644
--- a/drivers/scsi/bnx2i/bnx2i_init.c
+++ b/drivers/scsi/bnx2i/bnx2i_init.c
@@ -400,7 +400,7 @@ int bnx2i_get_stats(void *handle)
 	if (!stats)
 		return -ENOMEM;

-	memcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
+	strlcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
 	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);

 	stats->max_frame_size = hba->netdev->mtu;

^ permalink raw reply related
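
The difference between the two calls in the patch above can be shown with
a userspace stand-in. bounded_copy() below is a hypothetical helper with
strlcpy()-like semantics, written out because strlcpy() is not available
in every C library; it is a sketch, not the kernel's implementation. The
key property is that it stops reading at the end of the source string
instead of always reading the full size of the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most size - 1 bytes of src into dst, always NUL-terminate
 * (when size is non-zero), and never read past the end of src.
 * Returns strlen(src), so truncation shows up as a result >= size.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(src != NULL);
	len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Copying "2.7.2.2" into a 12-byte buffer this way touches only the first
8 bytes of the destination; the remaining bytes keep whatever they held
before, so a caller that exports the whole buffer still has to make sure
it was initialized.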

* [patch -next] netfilter: use kfree_skb() not kfree()
From: Dan Carpenter @ 2012-06-30 11:48 UTC (permalink / raw)
  To: Bart De Schuymer
  Cc: coreteam, netdev, bridge, kernel-janitors, David S. Miller,
	netfilter, netfilter-devel, Stephen Hemminger, Pablo Neira Ayuso

This should be a kfree_skb() here to free the sk_buff pointer.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/net/bridge/netfilter/ebt_ulog.c b/net/bridge/netfilter/ebt_ulog.c
index 1bd1732..dfbb019 100644
--- a/net/bridge/netfilter/ebt_ulog.c
+++ b/net/bridge/netfilter/ebt_ulog.c
@@ -156,7 +156,7 @@ static void ebt_ulog_packet(unsigned int hooknr, const struct sk_buff *skb,
 	nlh = nlmsg_put(ub->skb, 0, ub->qlen, 0,
 			size - NLMSG_ALIGN(sizeof(*nlh)), 0);
 	if (!nlh) {
-		kfree(ub->skb);
+		kfree_skb(ub->skb);
 		ub->skb = NULL;
 		goto unlock;
 	}

^ permalink raw reply related

* [IPv6] How to wait for pending SLAAC
From: Marc Haber @ 2012-06-30 11:10 UTC (permalink / raw)
  To: netdev

Hi,

I would like to have my system wait for the first SLAAC attempt during
bootup. To do this, I have written a script that waits until no more
IP address is found in "tentative" state.

Unfortunately, this is so fast that the script terminates before the
first address reaches "tentative" state.

Is there a way to find out whether the system is trying to do SLAAC on
any interface (for example information inside /proc or /sys), so that
one can wait for the IP address to appear, or is a fixed "sleep 5" the
only thing one can do here?

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Mannheim, Germany  |  lose things."    Winona Ryder | Fon: *49 621 31958061
Nordisch by Nature |  How to make an American Quilt | Fax: *49 621 31958062

^ permalink raw reply
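
A minimal sketch of the kind of "tentative" check described in this
message, written in C rather than as a shell script. It assumes the
/proc/net/if_inet6 line layout (address, interface index, prefix length,
scope and flags in hex, then the device name) and that the tentative flag
is 0x40, matching IFA_F_TENTATIVE in <linux/if_addr.h>. The function name
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TENTATIVE_FLAG 0x40	/* assumed to match IFA_F_TENTATIVE */

/* Parse one /proc/net/if_inet6 line.
 * Returns 1 if the address is tentative, 0 if not, -1 on a parse error.
 */
static int line_is_tentative(const char *line)
{
	char addr[33], dev[32];
	unsigned int ifindex, plen, scope, flags;

	assert(line != NULL);

	if (sscanf(line, "%32s %x %x %x %x %31s",
		   addr, &ifindex, &plen, &scope, &flags, dev) != 6)
		return -1;
	return (flags & TENTATIVE_FLAG) != 0;
}
```

A check like this only answers "is any listed address still tentative?".
It does not cover the race described above, where the loop runs before
the first address has been added at all.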

* Re: [BUG, regression, bisected] Marvell 88E8055 NIC (sky2) fails to detect link after resume from S3
From: Michal Zatloukal @ 2012-06-30 11:21 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev, Mirko Lindner, Stephen Hemminger
In-Reply-To: <CAKKZj2D-VVZF81=7-y+WtFGumzKJUfuaiB8k8X=BFXiXi0nfbg@mail.gmail.com>

Well, that didn't help. The 800-ms delay I introduced also showed up in
dmesg AFTER reading all the ffs:

[   52.890768] sky2 0000:04:00.0: eth0: disabling interface
[   55.768112] sky2 0000:04:00.0: PME# enabled
[   55.768118] sky2 0000:04:00.0: wake-up capability enabled by ACPI
[   55.924104] PM: late suspend of drv:sky2 dev:0000:04:00.0 complete after 156.048 msecs
[   56.236280] sky2 0000:04:00.0: Refused to change power state, currently in D3
[   56.236288] sky2 0000:04:00.0: restoring config space at offset 0xf (was 0xffffffff, writing 0x10a)
[   56.236292] sky2 0000:04:00.0: restoring config space at offset 0xe (was 0xffffffff, writing 0x0)
[   56.236297] sky2 0000:04:00.0: restoring config space at offset 0xd (was 0xffffffff, writing 0x48)
[   56.236301] sky2 0000:04:00.0: restoring config space at offset 0xc (was 0xffffffff, writing 0x0)
[   56.236305] sky2 0000:04:00.0: restoring config space at offset 0xb (was 0xffffffff, writing 0x110f1734)
[   56.236310] sky2 0000:04:00.0: restoring config space at offset 0xa (was 0xffffffff, writing 0x0)
[   56.236314] sky2 0000:04:00.0: restoring config space at offset 0x9 (was 0xffffffff, writing 0x0)
[   56.236318] sky2 0000:04:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x0)
[   56.236323] sky2 0000:04:00.0: restoring config space at offset 0x7 (was 0xffffffff, writing 0x0)
[   56.236327] sky2 0000:04:00.0: restoring config space at offset 0x6 (was 0xffffffff, writing 0x3001)
[   56.236331] sky2 0000:04:00.0: restoring config space at offset 0x5 (was 0xffffffff, writing 0x0)
[   56.236335] sky2 0000:04:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0xf8000004)
[   56.236340] sky2 0000:04:00.0: restoring config space at offset 0x3 (was 0xffffffff, writing 0x10)
[   56.236344] sky2 0000:04:00.0: restoring config space at offset 0x2 (was 0xffffffff, writing 0x2000014)
[   56.236348] sky2 0000:04:00.0: restoring config space at offset 0x1 (was 0xffffffff, writing 0x100507)
[   56.236352] sky2 0000:04:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0x436311ab)
[   56.241868] sky2 0000:04:00.0: wake-up capability disabled by ACPI
[   56.241872] sky2 0000:04:00.0: PME# disabled
[   57.035970] sky2 0000:04:00.0: ignoring stuck error report bit
[   57.035993] PM: resume of drv:sky2 dev:0000:04:00.0 complete after 794.124 msecs
[   58.421169] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421174] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421178] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421181] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421184] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421187] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421190] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421193] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421196] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421199] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421203] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421206] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421208] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421212] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421214] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421217] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421220] sky2 0000:04:00.0: eth0: phy I/O error
[   58.421348] sky2 0000:04:00.0: eth0: enabling interface

MZ

On Sat, Jun 30, 2012 at 12:36 AM, Michal Zatloukal <myxal.mxl@gmail.com> wrote:
> Sorry, not a developer, but I assume you mean something like:
>  if (!hw)
>                 return 0;
>
> +        mdelay(800);
>         /* Re-enable all clocks */
> Right? In that case, I'm building the kernel now and should be able to
> report by tomorrow.
>
> MZ
>
> On Fri, Jun 29, 2012 at 11:58 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
>> (maintainers Cced)
>>
>> Michal Zatloukal <myxal.mxl@gmail.com> :
>> [7afe1845dd1e7c90828c942daed7e57ffa7c38d6 induced regression]
>>> My uneducated guess is that by making the resume from S3 shorter, the
>>> driver catches the hardware with its pants down and freaks out.
>>> You can find all details/files (dmesg, lspci, dmidecode, config...)
>>> collected by apport in the ubuntu bug linked above. Let me know if I
>>> should supply any more info.
>>> Note: Please CC me into replies, I'm not subscribed. Thank you.
>>
>> Can you workaround it by enforcing some mdelay() in sky2_resume before
>> it does any real work ?
>>
>> --
>> Ueimor

^ permalink raw reply

* Re: one pci_id missing in sky2 driver
From: Xose Vazquez Perez @ 2012-06-30 11:10 UTC (permalink / raw)
  To: mlindner, netdev, shemminger

Mirko Lindner wrote:

> On Sunday, April 01, 2012 12:22:35 PM Stephen Hemminger wrote:

>> I would like to use this opportunity to have the developers at Marvell, test
>> and submit this. They have the hardware (I don't) and often new chips
>> require other special tweaks.  Marvell expressed interest in taking over
>> maintaining the sky2 driver, this would be a good first step.

> Thanks Stephen. The ID belongs to our newest device. We'll include the code
> into the driver and send a patch to the list as soon the driver has passed our
> internal tests.

Mirko, any news on this ?

^ permalink raw reply

* Re: [PATCH] ipv4: Elide fib_validate_source() completely when possible.
From: Julian Anastasov @ 2012-06-30 10:45 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120629.020552.2190372012516348013.davem@davemloft.net>


	Hello,

On Fri, 29 Jun 2012, David Miller wrote:

> If rpfilter is off (or the SKB has an IPSEC path) and there are not
> tclassid users, we don't have to do anything at all when
> fib_validate_source() is invoked besides setting the itag to zero.
> 
> We monitor tclassid uses with a counter (modified only under RTNL and
> marked __read_mostly) and we protect the fib_validate_source() real
> work with a test against this counter and whether rpfilter is to be
> done.
> 
> Having a way to know whether we need no tclassid processing or not
> also opens the door for future optimized rpfilter algorithms that do
> not perform full FIB lookups.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>

> +/* Ignore rp_filter for packets protected by IPsec. */
> +int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
> +			u8 tos, int oif, struct net_device *dev,
> +			struct in_device *idev, u32 *itag)
> +{
> +	int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev);
> +

	It seems now we change the IN_DEV_ACCEPT_LOCAL policy
to depend on IN_DEV_RPFILTER. Isn't that dangerous for
home routers that disable rp_filter when using 2 or more
uplinks? I know that now rp_filter can be enabled for such
setups but sometimes users just update the kernel and can
miss such change.

	If we need the old behavior we should also check
IN_DEV_ACCEPT_LOCAL here. Then servers protected by firewall
can avoid this check by enabling accept_local. Established
stream sockets are anyways optimized by the new demux code.

	Not sure how fatal the case of forwarding with
saddr=local_address is. Maybe the risk of loops is the same.

	If we really want a change in behavior we should
at least update the accept_local info in
Documentation/networking/ip-sysctl.txt ?

> +	if (!r && !fib_num_tclassid_users) {
> +		*itag = 0;
> +		return 0;
> +	}
> +	return __fib_validate_source(skb, src, dst, tos, oif, dev, r, idev, itag);
> +}
> +

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply
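
The fast path discussed in the message above can be modelled outside the
kernel. The following is only a sketch under stated assumptions: struct
rx_policy and both function names are hypothetical, the booleans stand in
for IN_DEV_RPFILTER() and IN_DEV_ACCEPT_LOCAL(), and the "strict" variant is
one possible reading of the review suggestion, not code from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device settings consulted on receive. */
struct rx_policy {
	bool rp_filter;		/* models IN_DEV_RPFILTER(idev) */
	bool accept_local;	/* models IN_DEV_ACCEPT_LOCAL(idev) */
};

/*
 * Fast path as posted in the patch: the full source validation is skipped
 * when reverse-path filtering is not wanted (rp_filter off, or the packet
 * carries an IPsec path) and nobody uses tclassid.
 */
static bool can_skip_validation(const struct rx_policy *p, bool has_secpath,
				int tclassid_users)
{
	bool r = has_secpath ? false : p->rp_filter;

	return !r && tclassid_users == 0;
}

/*
 * Variant along the lines of the review comment (an interpretation): keep
 * the old local-source check unless the administrator enabled accept_local.
 */
static bool can_skip_validation_strict(const struct rx_policy *p,
				       bool has_secpath, int tclassid_users)
{
	return can_skip_validation(p, has_secpath, tclassid_users) &&
	       p->accept_local;
}
```

With rp_filter off and accept_local off, the first function skips the
validation while the second still performs it, which is the behavioral
difference the review comment is concerned about.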

* [net-next] e1000e: remove use of IP payload checksum
From: Jeff Kirsher @ 2012-06-30 10:35 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher

From: Bruce Allan <bruce.w.allan@intel.com>

Currently only used when packet split mode is enabled with jumbo frames,
IP payload checksum (for fragmented UDP packets) is mutually exclusive with
receive hashing offload since the hardware uses the same space in the
receive descriptor for the hardware-provided packet checksum and the RSS
hash, respectively.  Users currently must disable jumbos when receive
hashing offload is enabled, or vice versa, because of this incompatibility.
Since testing has shown that IP payload checksum does not provide any real
benefit, just remove it so that there is no longer a choice between jumbos
or receive hashing offload but not both as done in other Intel GbE drivers
(e.g. e1000, igb).

Also, add a missing check for IP checksum error reported by the hardware;
let the stack verify the checksum when this happens.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/defines.h |    1 +
 drivers/net/ethernet/intel/e1000e/netdev.c  |   75 +++++----------------------
 2 files changed, 15 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 351a409..76edbc1 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -103,6 +103,7 @@
 #define E1000_RXD_ERR_SEQ       0x04    /* Sequence Error */
 #define E1000_RXD_ERR_CXE       0x10    /* Carrier Extension Error */
 #define E1000_RXD_ERR_TCPE      0x20    /* TCP/UDP Checksum Error */
+#define E1000_RXD_ERR_IPE       0x40    /* IP Checksum Error */
 #define E1000_RXD_ERR_RXE       0x80    /* Rx Data Error */
 #define E1000_RXD_SPC_VLAN_MASK 0x0FFF  /* VLAN ID is in lower 12 bits */
 
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index ba86b3f..a166efc 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -496,7 +496,7 @@ static void e1000_receive_skb(struct e1000_adapter *adapter,
  * @sk_buff: socket buffer with received data
  **/
 static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
-			      __le16 csum, struct sk_buff *skb)
+			      struct sk_buff *skb)
 {
 	u16 status = (u16)status_err;
 	u8 errors = (u8)(status_err >> 24);
@@ -511,8 +511,8 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 	if (status & E1000_RXD_STAT_IXSM)
 		return;
 
-	/* TCP/UDP checksum error bit is set */
-	if (errors & E1000_RXD_ERR_TCPE) {
+	/* TCP/UDP checksum error bit or IP checksum error bit is set */
+	if (errors & (E1000_RXD_ERR_TCPE | E1000_RXD_ERR_IPE)) {
 		/* let the stack verify checksum errors */
 		adapter->hw_csum_err++;
 		return;
@@ -523,19 +523,7 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 		return;
 
 	/* It must be a TCP or UDP packet with a valid checksum */
-	if (status & E1000_RXD_STAT_TCPCS) {
-		/* TCP checksum is good */
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
-	} else {
-		/*
-		 * IP fragment with UDP payload
-		 * Hardware complements the payload checksum, so we undo it
-		 * and then put the value in host order for further stack use.
-		 */
-		__sum16 sum = (__force __sum16)swab16((__force u16)csum);
-		skb->csum = csum_unfold(~sum);
-		skb->ip_summed = CHECKSUM_COMPLETE;
-	}
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	adapter->hw_csum_good++;
 }
 
@@ -954,8 +942,7 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 		skb_put(skb, length);
 
 		/* Receive Checksum Offload */
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -1341,8 +1328,7 @@ copydone:
 		total_rx_bytes += skb->len;
 		total_rx_packets++;
 
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -1512,9 +1498,8 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 			}
 		}
 
-		/* Receive Checksum Offload XXX recompute due to CRC strip? */
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		/* Receive Checksum Offload */
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -3098,19 +3083,10 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 
 	/* Enable Receive Checksum Offload for TCP and UDP */
 	rxcsum = er32(RXCSUM);
-	if (adapter->netdev->features & NETIF_F_RXCSUM) {
+	if (adapter->netdev->features & NETIF_F_RXCSUM)
 		rxcsum |= E1000_RXCSUM_TUOFL;
-
-		/*
-		 * IPv4 payload checksum for UDP fragments must be
-		 * used in conjunction with packet-split.
-		 */
-		if (adapter->rx_ps_pages)
-			rxcsum |= E1000_RXCSUM_IPPCSE;
-	} else {
+	else
 		rxcsum &= ~E1000_RXCSUM_TUOFL;
-		/* no need to clear IPPCSE as it defaults to 0 */
-	}
 	ew32(RXCSUM, rxcsum);
 
 	if (adapter->hw.mac.type == e1000_pch2lan) {
@@ -5241,22 +5217,10 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
 
 	/* Jumbo frame support */
-	if (max_frame > ETH_FRAME_LEN + ETH_FCS_LEN) {
-		if (!(adapter->flags & FLAG_HAS_JUMBO_FRAMES)) {
-			e_err("Jumbo Frames not supported.\n");
-			return -EINVAL;
-		}
-
-		/*
-		 * IP payload checksum (enabled with jumbos/packet-split when
-		 * Rx checksum is enabled) and generation of RSS hash is
-		 * mutually exclusive in the hardware.
-		 */
-		if ((netdev->features & NETIF_F_RXCSUM) &&
-		    (netdev->features & NETIF_F_RXHASH)) {
-			e_err("Jumbo frames cannot be enabled when both receive checksum offload and receive hashing are enabled.  Disable one of the receive offload features before enabling jumbos.\n");
-			return -EINVAL;
-		}
+	if ((max_frame > ETH_FRAME_LEN + ETH_FCS_LEN) &&
+	    !(adapter->flags & FLAG_HAS_JUMBO_FRAMES)) {
+		e_err("Jumbo Frames not supported.\n");
+		return -EINVAL;
 	}
 
 	/* Supported frame sizes */
@@ -6030,17 +5994,6 @@ static int e1000_set_features(struct net_device *netdev,
 			 NETIF_F_RXALL)))
 		return 0;
 
-	/*
-	 * IP payload checksum (enabled with jumbos/packet-split when Rx
-	 * checksum is enabled) and generation of RSS hash is mutually
-	 * exclusive in the hardware.
-	 */
-	if (adapter->rx_ps_pages &&
-	    (features & NETIF_F_RXCSUM) && (features & NETIF_F_RXHASH)) {
-		e_err("Enabling both receive checksum offload and receive hashing is not possible with jumbo frames.  Disable jumbos or enable only one of the receive offload features.\n");
-		return -EINVAL;
-	}
-
 	if (changed & NETIF_F_RXFCS) {
 		if (features & NETIF_F_RXFCS) {
 			adapter->flags2 &= ~FLAG2_CRC_STRIPPING;
-- 
1.7.10.4

^ permalink raw reply related
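
The decision logic that e1000_rx_checksum() is left with after this patch
can be sketched as a pure function. This is a simplified model, not the
driver code: classify_rx_checksum() and enum csum_result are hypothetical
names, the two error-bit values come from the defines.h hunk above, and the
three status-bit values are assumptions taken from memory of the e1000
descriptor layout. The NETIF_F_RXCSUM feature check and the statistics
counters are left out.

```c
#include <stdint.h>

/* Status bits (assumed values, not shown in the patch). */
#define E1000_RXD_STAT_IXSM	0x04	/* Ignore checksum */
#define E1000_RXD_STAT_UDPCS	0x10	/* UDP checksum calculated */
#define E1000_RXD_STAT_TCPCS	0x20	/* TCP checksum calculated */

/* Error bits (values as in the defines.h hunk of the patch). */
#define E1000_RXD_ERR_TCPE	0x20	/* TCP/UDP Checksum Error */
#define E1000_RXD_ERR_IPE	0x40	/* IP Checksum Error */

enum csum_result {
	CSUM_NONE,		/* stack must verify the checksum itself */
	CSUM_HW_ERROR,		/* hardware flagged an error; stack verifies */
	CSUM_UNNECESSARY	/* hardware validated TCP/UDP checksum */
};

static enum csum_result classify_rx_checksum(uint32_t status_err)
{
	uint16_t status = (uint16_t)status_err;
	uint8_t errors = (uint8_t)(status_err >> 24);

	/* Hardware asked us to ignore its checksum indication. */
	if (status & E1000_RXD_STAT_IXSM)
		return CSUM_NONE;

	/* TCP/UDP or IP checksum error: let the stack re-verify. */
	if (errors & (E1000_RXD_ERR_TCPE | E1000_RXD_ERR_IPE))
		return CSUM_HW_ERROR;

	/* Neither TCP nor UDP checksum was computed by the hardware. */
	if (!(status & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS)))
		return CSUM_NONE;

	return CSUM_UNNECESSARY;
}
```

Note that the CHECKSUM_COMPLETE branch for fragmented UDP no longer appears:
every packet that passes the checks ends up as "checksum unnecessary".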

* [net] igbvf: fix divide by zero
From: Jeff Kirsher @ 2012-06-30 10:23 UTC (permalink / raw)
  To: davem
  Cc: Mitch A Williams, netdev, gospo, sassmann, stable, daahern,
	Jeff Kirsher

From: Mitch A Williams <mitch.a.williams@intel.com>

Using ethtool -C ethX rx-usecs 0 crashes with a divide by zero.
Refactor this function to fix this issue and make it more clear
what the intent of each conditional is. Add comment regarding
using a setting of zero.

CC: stable <stable@vger.kernel.org> [3.3+]
CC: David Ahern <daahern@cisco.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igbvf/ethtool.c |   29 +++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/ethtool.c b/drivers/net/ethernet/intel/igbvf/ethtool.c
index 8ce6706..90eef07 100644
--- a/drivers/net/ethernet/intel/igbvf/ethtool.c
+++ b/drivers/net/ethernet/intel/igbvf/ethtool.c
@@ -357,21 +357,28 @@ static int igbvf_set_coalesce(struct net_device *netdev,
 	struct igbvf_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 
-	if ((ec->rx_coalesce_usecs > IGBVF_MAX_ITR_USECS) ||
-	    ((ec->rx_coalesce_usecs > 3) &&
-	     (ec->rx_coalesce_usecs < IGBVF_MIN_ITR_USECS)) ||
-	    (ec->rx_coalesce_usecs == 2))
-		return -EINVAL;
-
-	/* convert to rate of irq's per second */
-	if (ec->rx_coalesce_usecs && ec->rx_coalesce_usecs <= 3) {
+	if ((ec->rx_coalesce_usecs >= IGBVF_MIN_ITR_USECS) &&
+	     (ec->rx_coalesce_usecs <= IGBVF_MAX_ITR_USECS)) {
+		adapter->current_itr = ec->rx_coalesce_usecs << 2;
+		adapter->requested_itr = 1000000000 /
+					(adapter->current_itr * 256);
+	} else if ((ec->rx_coalesce_usecs == 3) ||
+		   (ec->rx_coalesce_usecs == 2)) {
 		adapter->current_itr = IGBVF_START_ITR;
 		adapter->requested_itr = ec->rx_coalesce_usecs;
-	} else {
-		adapter->current_itr = ec->rx_coalesce_usecs << 2;
+	} else if (ec->rx_coalesce_usecs == 0) {
+		/*
+		 * The user's desire is to turn off interrupt throttling
+		 * altogether, but due to HW limitations, we can't do that.
+		 * Instead we set a very small value in EITR, which would
+		 * allow ~967k interrupts per second, but allow the adapter's
+		 * internal clocking to still function properly.
+		 */
+		adapter->current_itr = 4;
 		adapter->requested_itr = 1000000000 /
 					(adapter->current_itr * 256);
-	}
+	} else
+		return -EINVAL;
 
 	writel(adapter->current_itr,
 	       hw->hw_addr + adapter->rx_ring->itr_register);
-- 
1.7.10.4

^ permalink raw reply related
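
The refactored range handling in the patch above can be exercised in user
space. The sketch below mirrors the new conditional chain; coalesce_to_itr()
and struct itr_setting are hypothetical names, and the three IGBVF_*
constants are assumed values (they are not shown in the patch). In the old
code a value of 0 fell through to the final else branch, where current_itr
became 0 << 2 = 0 and the division by current_itr * 256 faulted.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed values; the patch does not show these definitions. */
#define IGBVF_START_ITR		488
#define IGBVF_MIN_ITR_USECS	10
#define IGBVF_MAX_ITR_USECS	10000

struct itr_setting {
	uint32_t current_itr;	/* value that would be written to EITR */
	uint32_t requested_itr;	/* interrupts/s, or the special mode 2/3 */
};

/* Returns 0 on success, -EINVAL for values outside every accepted range. */
static int coalesce_to_itr(uint32_t usecs, struct itr_setting *out)
{
	if (usecs >= IGBVF_MIN_ITR_USECS && usecs <= IGBVF_MAX_ITR_USECS) {
		out->current_itr = usecs << 2;
		out->requested_itr = 1000000000 / (out->current_itr * 256);
	} else if (usecs == 3 || usecs == 2) {
		/* Special mode values are stored as-is. */
		out->current_itr = IGBVF_START_ITR;
		out->requested_itr = usecs;
	} else if (usecs == 0) {
		/* Throttling cannot be disabled; use the smallest interval. */
		out->current_itr = 4;
		out->requested_itr = 1000000000 / (out->current_itr * 256);
	} else {
		return -EINVAL;
	}
	return 0;
}
```

For usecs == 0 the integer division 1000000000 / (4 * 256) evaluates to
976562 in this model.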

* [PATCH] NFC: Prevent NULL deref when getting socket name
From: Sasha Levin @ 2012-06-30  9:56 UTC (permalink / raw)
  To: lauro.venancio, aloisio.almeida, sameo, linville
  Cc: linux-wireless, netdev, linux-kernel, Sasha Levin

llcp_sock_getname can be called without a device attached to the nfc_llcp_sock.

This would lead to the following BUG:

[  362.341807] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  362.341815] IP: [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[  362.341818] PGD 31b35067 PUD 30631067 PMD 0
[  362.341821] Oops: 0000 [#627] PREEMPT SMP DEBUG_PAGEALLOC
[  362.341826] CPU 3
[  362.341827] Pid: 7816, comm: trinity-child55 Tainted: G      D W    3.5.0-rc4-next-20120628-sasha-00005-g9f23eb7 #479
[  362.341831] RIP: 0010:[<ffffffff836258e5>]  [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[  362.341832] RSP: 0018:ffff8800304fde88  EFLAGS: 00010286
[  362.341834] RAX: 0000000000000000 RBX: ffff880033cb8000 RCX: 0000000000000001
[  362.341835] RDX: ffff8800304fdec4 RSI: ffff8800304fdec8 RDI: ffff8800304fdeda
[  362.341836] RBP: ffff8800304fdea8 R08: 7ebcebcb772b7ffb R09: 5fbfcb9c35bdfd53
[  362.341838] R10: 4220020c54326244 R11: 0000000000000246 R12: ffff8800304fdec8
[  362.341839] R13: ffff8800304fdec4 R14: ffff8800304fdec8 R15: 0000000000000044
[  362.341841] FS:  00007effa376e700(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
[  362.341843] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.341844] CR2: 0000000000000000 CR3: 0000000030438000 CR4: 00000000000406e0
[  362.341851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  362.341856] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  362.341858] Process trinity-child55 (pid: 7816, threadinfo ffff8800304fc000, task ffff880031270000)
[  362.341858] Stack:
[  362.341862]  ffff8800304fdea8 ffff880035156780 0000000000000000 0000000000001000
[  362.341865]  ffff8800304fdf78 ffffffff83183b40 00000000304fdec8 0000006000000000
[  362.341868]  ffff8800304f0027 ffffffff83729649 ffff8800304fdee8 ffff8800304fdf48
[  362.341869] Call Trace:
[  362.341874]  [<ffffffff83183b40>] sys_getpeername+0xa0/0x110
[  362.341877]  [<ffffffff83729649>] ? _raw_spin_unlock_irq+0x59/0x80
[  362.341882]  [<ffffffff810f342b>] ? do_setitimer+0x23b/0x290
[  362.341886]  [<ffffffff81985ede>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  362.341889]  [<ffffffff8372a539>] system_call_fastpath+0x16/0x1b
[  362.341921] Code: 84 00 00 00 00 00 b8 b3 ff ff ff 48 85 db 74 54 66 41 c7 04 24 27 00 49 8d 7c 24 12 41 c7 45 00 60 00 00 00 48 8b 83 28 05 00 00 <8b> 00 41 89 44 24 04 0f b6 83 41 05 00 00 41 88 44 24 10 0f b6
[  362.341924] RIP  [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
[  362.341925]  RSP <ffff8800304fde88>
[  362.341926] CR2: 0000000000000000
[  362.341928] ---[ end trace 6d450e935ee18bf3 ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/nfc/llcp/sock.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/nfc/llcp/sock.c b/net/nfc/llcp/sock.c
index 2c0b317..05ca5a6 100644
--- a/net/nfc/llcp/sock.c
+++ b/net/nfc/llcp/sock.c
@@ -292,7 +292,7 @@ static int llcp_sock_getname(struct socket *sock, struct sockaddr *addr,
 
 	pr_debug("%p\n", sk);
 
-	if (llcp_sock == NULL)
+	if (llcp_sock == NULL || llcp_sock->dev == NULL)
 		return -EBADFD;
 
 	addr->sa_family = AF_NFC;
-- 
1.7.8.6

^ permalink raw reply related
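
The guard added by the patch above can be illustrated with a small
user-space model. Everything here is hypothetical scaffolding: the two
*_model structs and getname_model() only imitate the shape of the kernel
code, and the assumption that the function reads a field through
llcp_sock->dev (here called idx) is inferred from the oops, not shown in
the diff.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EBADFD
#define EBADFD 77	/* Linux value; defined here for other platforms */
#endif

/* Hypothetical stand-ins for the NFC device and the LLCP socket. */
struct nfc_dev_model {
	unsigned int idx;
};

struct llcp_sock_model {
	struct nfc_dev_model *dev;	/* NULL until a device is attached */
};

/*
 * Models the patched check: reject the call when there is no socket or
 * no attached device, before anything is read through the dev pointer.
 */
static int getname_model(const struct llcp_sock_model *llcp_sock,
			 unsigned int *dev_idx)
{
	if (llcp_sock == NULL || llcp_sock->dev == NULL)
		return -EBADFD;

	*dev_idx = llcp_sock->dev->idx;
	return 0;
}
```

Without the second half of the condition, a socket that was created but
never bound to a device would reach the dev->idx read with dev == NULL.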

* Re: [PATCH] ipv4: Create and use fib_compute_spec_dst() helper.
From: Julian Anastasov @ 2012-06-30  9:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120628.184500.114483408843364230.davem@davemloft.net>


	Hello,

On Thu, 28 Jun 2012, David Miller wrote:

> ipv4: Fix bugs in fib_compute_spec_dst().

	Some more thoughts on this topic...

	I'm wondering, may be ip_options_echo wants to put
local IP for srr. ip_options_echo is called by ip_send_unicast_reply.
ip_send_unicast_reply supports source address spoofing for
tproxy (arg.flags & IP_REPLY_ARG_NOSRCCHECK).

	May be the tproxy users add local routes to redirect
the traffic to local stack but daddr is preserved (non-local).
So, rt_flags will have RTCF_LOCAL but for srr purposes we
need local address, right?

	There can be optimization in ip_options_echo to
avoid fib_compute_spec_dst if daddr is not needed. It seems
it is needed only in the sopt->srr case.

	It seems ip_options_compile can be called by
ip_rcv_options (ip_rcv_finish) just after ip_route_input_noref
but before dst_input. It means, it can happen for forwarding,
not just for local delivery.

	To summarize, we can not rely on iph->daddr to be
local address if RTCF_LOCAL is set. There is always the risk to
work with redirected or forwarded traffic. Even for the PKTINFO
case we should make sure ipi_spec_dst is a local address (original
daddr goes to ipi_addr anyways), in case later ipi_spec_dst
is used again for sending in PKTINFO.

	For now, I see only one possible optimization.
When fib_lookup returns res.fi and res.type is RTN_LOCAL
we can check fib_protocol. If fib_protocol is not
RTPROT_KERNEL we will add RTCF_MAYBE_LOCAL (new flag) to rt_flags.
It will lead to slow lookups to validate the iph->daddr
if used later as source address, like in the spec_dst case.

	For the common case of local routes created by fib_magic()
we will use iph->daddr in fib_compute_spec_dst as follows:

	if ((rt->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST |
			     RTCF_LOCAL | RTCF_MAYBE_LOCAL)) == RTCF_LOCAL)
		return ip_hdr(skb)->daddr;
	/* For mcast, forwarding and spoofing we take the slow path */

	If users add local RTPROT_KERNEL routes, later
the outgoing traffic will anyways fail in some output route lookup
because FLOWI_FLAG_ANYSRC is set in rare cases. But also
users can break srr in this way, so there is some risk.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply
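
The flag test proposed in the message above can be tried in isolation. This
is a sketch only: RTCF_MAYBE_LOCAL does not exist in the kernel (it is the
new flag the message proposes), so its value here is an arbitrary
placeholder bit, and can_use_daddr_as_spec_dst() is a hypothetical name. The
other three flag values are assumed to match include/linux/in_route.h.
Because == binds tighter than & in C, the masked value is parenthesized
before the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTCF_BROADCAST		0x10000000u
#define RTCF_MULTICAST		0x20000000u
#define RTCF_LOCAL		0x80000000u
#define RTCF_MAYBE_LOCAL	0x00100000u	/* hypothetical, placeholder bit */

/*
 * Fast path: iph->daddr may be used directly only when the route is local
 * and none of the broadcast, multicast or "maybe local" bits are set.
 */
static bool can_use_daddr_as_spec_dst(uint32_t rt_flags)
{
	const uint32_t mask = RTCF_BROADCAST | RTCF_MULTICAST |
			      RTCF_LOCAL | RTCF_MAYBE_LOCAL;

	return (rt_flags & mask) == RTCF_LOCAL;
}
```

Any other combination (multicast, forwarded, or a local route that was not
installed by the kernel itself) would fall through to the slow lookup.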

* Re: [BUG, regression, bisected] Marvell 88E8055 NIC (sky2) fails to detect link after resume from S3
From: Michal Zatloukal @ 2012-06-30  9:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20120629164610.1f343434@nehalam.linuxnetplumber.net>

On Sat, Jun 30, 2012 at 1:46 AM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
<snip>
>
> Is ubuntu still doing the stupid rmmod on suspend?

The modprobes in the posted dmesg were done manually by me to see if
it would help. It didn't.
As for Ubuntu doing it, I don't know. Any way I could tell? I've
looked into /usr/lib/pm-utils/pm-functions and the $SUSPEND_MODULES
variable is set to empty, so it doesn't look like it's doing it for
any modules at this point.

> Looks like PCI power management has turned the chip off (that is why it
> keeps reading ff to all requests).

Is there something I can try?

MZ

^ permalink raw reply

* Re: [PATCH] net: Update netdev_alloc_frag to work more efficiently with TCP and GRO
From: Eric Dumazet @ 2012-06-30  8:39 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Alexander Duyck, netdev, davem, jeffrey.t.kirsher
In-Reply-To: <4FEE3487.9080408@intel.com>

On Fri, 2012-06-29 at 16:04 -0700, Alexander Duyck wrote:

> I was wondering if there were any plans to clean this patch up and
> submit it to net-next?  If not, I can probably work on that since this
> addressed the concerns I had in my original patch.
> 

I used this patch for a while on my machines, but I am working on
something allowing fallback to order-0 allocations if memory gets
fragmented. This fallback should almost never happen, but we should have
it just in case ?

^ permalink raw reply

* Re: [patch net-next v2 0/4] net: introduce and use IFF_LIFE_ADDR_CHANGE
From: David Miller @ 2012-06-30  8:08 UTC (permalink / raw)
  To: jpirko
  Cc: mst, netdev, shimoda.hiroaki, virtualization, danny.kukawka,
	edumazet
In-Reply-To: <1340982608-897-1-git-send-email-jpirko@redhat.com>

From: Jiri Pirko <jpirko@redhat.com>
Date: Fri, 29 Jun 2012 17:10:04 +0200

> three drivers updated, but this can be used in many others.
> 
> v1->v2:
> %s/LIFE/LIVE
> 
> Jiri Pirko (4):
>   net: introduce new priv_flag indicating iface capable of change mac
>     when running
>   virtio_net: use IFF_LIVE_ADDR_CHANGE priv_flag
>   team: use IFF_LIVE_ADDR_CHANGE priv_flag
>   dummy: use IFF_LIVE_ADDR_CHANGE priv_flag

Applied, thanks Jiri.

^ permalink raw reply

* Re: [PATCH V3 1/2] bonding support for IPv6 transmit hashing
From: David Miller @ 2012-06-30  8:05 UTC (permalink / raw)
  To: linux; +Cc: netdev
In-Reply-To: <4FEE99E7.9010504@8192.net>

If you're going to post multiple patches, give them unique
subject line texts describing what each change does uniquely.
Do not use identical subject lines ever, that is very unhelpful
for the people reading your changes.

From: John <linux@8192.net>
Date: Fri, 29 Jun 2012 23:17:11 -0700

> + skb_network_header_len(skb) >= sizeof(struct ipv6hdr)) {
> +		ipv6h = ipv6_hdr(skb);
> +		v6hash =
> + (ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
> + (ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
> + (ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3]);
> +		v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
> + return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;

Either you formatted this terribly, or your email client corrupted
your patches.

^ permalink raw reply

* [PATCH V3 0/2] bonding support for IPv6 transmit hashing
From: John @ 2012-06-30  6:17 UTC (permalink / raw)
  To: netdev

Currently the "bonding" driver does not support load balancing outgoing
traffic in LACP mode for IPv6 traffic. IPv4 (and TCP or UDP over IPv4)
are currently supported; this patch adds transmit hashing for IPv6
(and TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support
in the bonding driver.

The algorithm chosen (xor'ing the bottom three quads and then xor'ing
the bottom three bytes of that) was chosen after testing almost 400,000
unique IPv6 addresses harvested from server logs. This algorithm
had the most even distribution for both big- and little-endian
architectures while still using few instructions.

Fragmented IPv6 packets are handled the same way as fragmented
IPv4 packets, ie, they are not balanced based on layer 4
information. Additionally, IPv6 packets with intermediate headers
are not balanced based on layer 4 information. In practice these
intermediate headers are not common and this should not cause any
problems, and the alternative (a packet-parsing loop and look-up table)
seemed slow and complicated for little gain.

This is an update to a prior patch I submitted. This version includes
a clarified description, more thorough bounds checking, updates
functions to call bond_xmit_hash_policy_l2 rather than re-implement
the same logic, incorporates Jay's style suggestions, and patches
against net-next. Patch has been tested and performs as expected.

John

^ permalink raw reply
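
The hashing scheme described in the cover letter above (XOR of the lower
three 32-bit words of source and destination address, folded, then mixed
with the last MAC bytes as in the layer 2 policy) can be sketched in user
space. struct flow_model and xmit_hash_v6() are hypothetical names; the
arithmetic follows the fragment quoted in the review of patch 1/2. The
address words are treated as plain integers here, whereas the kernel
operates on s6_addr32 in network byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flattened view of the fields the hash looks at. */
struct flow_model {
	uint32_t saddr[4];	/* IPv6 source address, four 32-bit words */
	uint32_t daddr[4];	/* IPv6 destination address */
	uint8_t src_mac_last;	/* models h_source[5] */
	uint8_t dst_mac_last;	/* models h_dest[5] */
};

/* Returns the slave index in [0, count); count must be non-zero. */
static unsigned int xmit_hash_v6(const struct flow_model *f,
				 unsigned int count)
{
	uint32_t h;

	assert(count > 0);

	/* XOR the lower three words of both addresses; word 0 is ignored. */
	h = (f->saddr[1] ^ f->daddr[1]) ^
	    (f->saddr[2] ^ f->daddr[2]) ^
	    (f->saddr[3] ^ f->daddr[3]);

	/* Fold the upper bytes down into the low byte. */
	h = (h >> 16) ^ (h >> 8) ^ h;

	return (h ^ f->dst_mac_last ^ f->src_mac_last) % count;
}
```

Since only XOR is used, swapping source and destination yields the same
slave, so both directions of a flow hash identically in this model.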
