Netdev List
* Re: [RFC PATCH 00/18] netfilter: IPv6 NAT
From: Amos Jeffries @ 2011-11-29 13:24 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Ulrich Weber, sclark46@earthlink.net, kaber@trash.net,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <alpine.LNX.2.01.1111291255200.20965@frira.zrqbmnf.qr>

On 30/11/2011 1:23 a.m., Jan Engelhardt wrote:
> On Tuesday 2011-11-29 10:19, Ulrich Weber wrote:
>> On 28.11.2011 23:03, Amos Jeffries wrote:
>>> I'm going to dare to call FUD on those statements...
>>>    * Load Balancing - what is preventing your routing rules or packet
>>>   marking using the same criteria as the NAT changer? nothing. Load
>>>   balancing works perfectly fine without NAT.
> Source address selection, having to occur on the source, would
> require that the source has to know all the parameters that a {what
> would have been your NAT GW} would need to know, which means you have
> to (a) collect and/or (b) distribute this information. Given two
> uplinks that only allow a certain source network address (different
> for each uplink), combined with the desire to balance on utilization,
> (a) a client is not in the position to easily obtain this data unless
> it is the router for all participants itself,

There you are adding a straw-man component into the mix. "Given two
uplinks that only allow a certain source network address": yes, I agree,
NAT is the sugar that makes these uplinks work, irrespective of load
balancing.

In the same way that security != NAT, so too load balancing != NAT, as
I'm sure you are all well aware.

Looking back at Ulrich's original statement after reading yours, it's
clear he probably meant that (I hope so, at least). On first reading,
though, it seemed to say that NAT by itself provided easy load balancing
and DMZ. I argue neither for nor against the validity of NAT, only that
the particular statement created a large amount of Fear, Uncertainty
and Doubt.

>   (b) the clients needs
> to cooperate, and one cannot always trust client devices, or hope for
> their technical cooperation (firewalled themselves off).
>
>

>> I fully agree. NAT can not replace your firewall rules.
>>
>> However with NAT you could get some kind of anonymity.
> Same network prefix, some cookies, or a login form. Blam, identified,
> or at least (Almost-)Uniquely Identified Visitor tagging.
>
> Everybody should come out of their worshipping NAT for anonymity
> now - at best, that is an Emperor's Clothes' kind of anonymity.
>
>> Think of Tor: If your server/client operates with private IP addresses,
>> your public IP address is still masked after a security breach.
> If one's tor peer was busted, they would have the address.
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply

* RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Myklebust, Trond @ 2011-11-29 13:35 UTC (permalink / raw)
  To: tao.peng, skinsbursky
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, jbottomley,
	bfields, davem, devel
In-Reply-To: <F19688880B763E40B28B2B462677FBF805E3A4AFFD@MX09A.corp.emc.com>

> -----Original Message-----
> From: tao.peng@emc.com [mailto:tao.peng@emc.com]
> Sent: Tuesday, November 29, 2011 7:40 AM
> To: skinsbursky@parallels.com
> Cc: Myklebust, Trond; linux-nfs@vger.kernel.org; xemul@parallels.com;
> neilb@suse.de; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> jbottomley@parallels.com; bfields@fieldses.org; davem@davemloft.net;
> devel@openvz.org
> Subject: RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> from blocklayout routines
> 
> > -----Original Message-----
> > From: Stanislav Kinsbursky [mailto:skinsbursky@parallels.com]
> > Sent: Tuesday, November 29, 2011 8:19 PM
> > To: Peng, Tao
> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Pavel
> > Emelianov; neilb@suse.de; netdev@vger.kernel.org;
> > linux-kernel@vger.kernel.org; James Bottomley; bfields@fieldses.org;
> > davem@davemloft.net; devel@openvz.org
> > Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> > from blocklayout routines
> >
> > On 29.11.2011 16:00, tao.peng@emc.com wrote:
> > >> -----Original Message-----
> > >> From: linux-nfs-owner@vger.kernel.org
> > >> [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of
> > Stanislav
> > >> Kinsbursky
> > >> Sent: Tuesday, November 29, 2011 6:11 PM
> > >> To: Trond.Myklebust@netapp.com
> > >> Cc: linux-nfs@vger.kernel.org; xemul@parallels.com; neilb@suse.de;
> > >> netdev@vger.kernel.org; linux- kernel@vger.kernel.org;
> > >> jbottomley@parallels.com; bfields@fieldses.org;
> > >> davem@davemloft.net; devel@openvz.org
> > >> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> > >> from blocklayout routines
> > >>
> > >> This is a cleanup patch. We don't need this reference anymore,
> > >> because blocklayout pipes dentries now creates and destroys in
> > >> per-net operations and on PipeFS mount/umount notification.
> > >> Note that nfs4blocklayout_register_net() now returns 0 instead of
> > >> -ENOENT in case of PipeFS superblock absence. This is ok, because
> > >> blocklayout pipe dentry will be created on PipeFS mount event.
> > > When is the "pipefs mount event" going to happen? When inserting
> > > kernel modules or when user issues
> > mount command?
> > >
> >
> > When user issues mount command.
> > Kernel mounts of PipeFS has been removed with all these patch sets
> > I've sent already.
> Then it is going to break the blocklayout user space program blkmapd, which is
> started before mounting any file system and tries to open the pipe file
> when started.

Why on earth is blkmapd doing this instead of listening for file creation notifications like the other rpc_pipefs daemons do?

Trond



* [PATCH net-next] sch_sfq: use skb_flow_dissect()
From: Eric Dumazet @ 2011-11-29 13:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/sched/sch_sfq.c |   65 ++++++------------------------------------
 1 file changed, 10 insertions(+), 55 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 4f5510e..c8da556 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -17,14 +17,13 @@
 #include <linux/in.h>
 #include <linux/errno.h>
 #include <linux/init.h>
-#include <linux/ipv6.h>
 #include <linux/skbuff.h>
 #include <linux/jhash.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
-#include <net/ip.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/flow_keys.h>
 
 
 /*	Stochastic Fairness Queuing algorithm.
@@ -137,61 +136,17 @@ static inline struct sfq_head *sfq_dep_head(struct sfq_sched_data *q, sfq_index
 	return &q->dep[val - SFQ_SLOTS];
 }
 
-static unsigned int sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 h1)
+static unsigned int sfq_hash(const struct sfq_sched_data *q,
+			     const struct sk_buff *skb)
 {
-	return jhash_2words(h, h1, q->perturbation) & (q->divisor - 1);
-}
-
-static unsigned int sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
-{
-	u32 h, h2;
-
-	switch (skb->protocol) {
-	case htons(ETH_P_IP):
-	{
-		const struct iphdr *iph;
-		int poff;
-
-		if (!pskb_network_may_pull(skb, sizeof(*iph)))
-			goto err;
-		iph = ip_hdr(skb);
-		h = (__force u32)iph->daddr;
-		h2 = (__force u32)iph->saddr ^ iph->protocol;
-		if (ip_is_fragment(iph))
-			break;
-		poff = proto_ports_offset(iph->protocol);
-		if (poff >= 0 &&
-		    pskb_network_may_pull(skb, iph->ihl * 4 + 4 + poff)) {
-			iph = ip_hdr(skb);
-			h2 ^= *(u32 *)((void *)iph + iph->ihl * 4 + poff);
-		}
-		break;
-	}
-	case htons(ETH_P_IPV6):
-	{
-		const struct ipv6hdr *iph;
-		int poff;
-
-		if (!pskb_network_may_pull(skb, sizeof(*iph)))
-			goto err;
-		iph = ipv6_hdr(skb);
-		h = (__force u32)iph->daddr.s6_addr32[3];
-		h2 = (__force u32)iph->saddr.s6_addr32[3] ^ iph->nexthdr;
-		poff = proto_ports_offset(iph->nexthdr);
-		if (poff >= 0 &&
-		    pskb_network_may_pull(skb, sizeof(*iph) + 4 + poff)) {
-			iph = ipv6_hdr(skb);
-			h2 ^= *(u32 *)((void *)iph + sizeof(*iph) + poff);
-		}
-		break;
-	}
-	default:
-err:
-		h = (unsigned long)skb_dst(skb) ^ (__force u32)skb->protocol;
-		h2 = (unsigned long)skb->sk;
-	}
+	struct flow_keys keys;
+	unsigned int hash;
 
-	return sfq_fold_hash(q, h, h2);
+	skb_flow_dissect(skb, &keys);
+	hash = jhash_3words((__force u32)keys.dst,
+			    (__force u32)keys.src ^ keys.ip_proto,
+			    (__force u32)keys.ports, q->perturbation);
+	return hash & (q->divisor - 1);
 }
 
 static unsigned int sfq_classify(struct sk_buff *skb, struct Qdisc *sch,
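
For readers without a kernel tree at hand, the bucket selection in the new sfq_hash() can be sketched in user space as follows. This is an illustration under stated assumptions: struct flow_keys_sketch and mix3() are hypothetical stand-ins for the kernel's struct flow_keys and jhash_3words(), and the mixing function is not the kernel's jhash. What carries over is the shape of the computation: three words derived from the dissected keys, seeded with the perturbation value, then masked with (divisor - 1), which assumes the divisor is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct flow_keys. */
struct flow_keys_sketch {
	uint32_t src;
	uint32_t dst;
	uint32_t ports;		/* source and destination port, packed */
	uint8_t  ip_proto;
};

/* Stand-in mixer; NOT the kernel's jhash_3words(). */
static uint32_t mix3(uint32_t a, uint32_t b, uint32_t c, uint32_t seed)
{
	uint32_t h = seed;

	h = (h ^ a) * 0x9e3779b1u;
	h ^= h >> 15;
	h = (h ^ b) * 0x85ebca6bu;
	h ^= h >> 13;
	h = (h ^ c) * 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Map a flow to one of `divisor` buckets; divisor must be a power of two. */
unsigned int bucket_for_flow(const struct flow_keys_sketch *keys,
			     uint32_t perturbation, unsigned int divisor)
{
	uint32_t hash;

	assert(divisor != 0 && (divisor & (divisor - 1)) == 0);

	hash = mix3(keys->dst, keys->src ^ keys->ip_proto, keys->ports,
		    perturbation);
	return hash & (divisor - 1);
}
```

Packets of the same flow always land in the same bucket for a given perturbation value; changing the perturbation reshuffles which flows share a bucket.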


* Re: pull request: wireless-next 2011-11-28
From: John W. Linville @ 2011-11-29 13:42 UTC (permalink / raw)
  To: Johannes Berg
  Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1322554972.4110.0.camel-8upI4CBIZJIJvtFkdXX2HixXY32XiHfO@public.gmane.org>

On Tue, Nov 29, 2011 at 09:22:52AM +0100, Johannes Berg wrote:
> On Mon, 2011-11-28 at 15:02 -0500, John W. Linville wrote:
> 
> > Johannes Berg (12):
> 
> >       mac80211: use skb list for fragments
> >       mac80211: move fragment flag adjustment
> >       mac80211: make TX LED handling independent of fragmentation
> >       mac80211: transmit fragment list to drivers
> 
> Since you included these patches, can you please pick up "mac80211: fix
> TX warning" relatively quickly? It fixes a harmless but possibly
> recurring WARN_ON in this code.

It's already in wireless-next. :-)

-- 
John W. Linville		Someday the world will need a hero, and you
linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org			might be all we have.  Be ready.


* Re: pull request: wireless-next 2011-11-28
From: Johannes Berg @ 2011-11-29 13:54 UTC (permalink / raw)
  To: John W. Linville; +Cc: davem, linux-wireless, netdev
In-Reply-To: <20111129134243.GB2705@tuxdriver.com>

On Tue, 2011-11-29 at 08:42 -0500, John W. Linville wrote:

> > > Johannes Berg (12):
> > 
> > >       mac80211: use skb list for fragments
> > >       mac80211: move fragment flag adjustment
> > >       mac80211: make TX LED handling independent of fragmentation
> > >       mac80211: transmit fragment list to drivers
> > 
> > Since you included these patches, can you please pick up "mac80211: fix
> > TX warning" relatively quickly? It fixes a harmless but possibly
> > recurring WARN_ON in this code.
> 
> It's already in wireless-next. :-)

Yeah, sorry, I noticed an hour later or so that I'd been checking the
wrong tree (without updating first)

johannes


* Re: [RFC PATCH 00/18] netfilter: IPv6 NAT
From: Patrick McHardy @ 2011-11-29 14:05 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: sclark46, netfilter-devel, netdev, ulrich.weber
In-Reply-To: <hr62jai.cb703ce1830a0044ae0461ba20654d29@obelix.schillstrom.com>

On 11/29/2011 01:50 PM, Hans Schillstrom wrote:
>>> Probably a dumb question, but are these patches for NATting IPv6 to
>>> IPv6 or IPv4 to IPv6?
>> IPv6 to IPv6. I haven't really considered IPv6 to IPv4 yet.
> Have you ever tried ivi, the IPv6 to IPv4 gateway?
> i.e. address mapping and packet translation between IPv4 and IPv6 networks

No, I'm currently looking at stateless IPv6 NAT mechanisms.

> I made a port to a recent kernel at the beginning of this year (I think it was 2.6.36 ...) and it seems to work.
> (however there are some issues left....)
>
> I like the idea of "ivi" for a larger scale of translation.

Do you have a pointer to your port?


* Re: Daughterboard Jetway JAD3RTLANG with three RTL-8110SC/8169SC Gigabit Ethernet is curious
From: Markus Feldmann @ 2011-11-29 14:16 UTC (permalink / raw)
  To: netdev
In-Reply-To: <201111291012.25426.carsten@wolffcarsten.de>

On 29.11.2011 10:12, Carsten Wolff wrote:
> Hi Markus,
>
> On Monday 28 November 2011, Markus Feldmann wrote:
>> Hi All,
>>
>> i have a mini-ITX PC (Motherboard Jetway JNF92-270-LF), whit 3 ethernet
>> devices (RTL-8110SC/8169SC Gigabit Ethernet) on one additional
>> daughterboard (Jetway JAD3RTLANG).
>>
>> Here is my<hwinfo --netcard>  output:
>> http://pastebin.com/raw.php?i=Khy1NX61
>>
>> and dmesg:
>> http://pastebin.com/raw.php?i=Ziq1mizy
>>
>> I am using this mini-ITX PC as server/router with a debian
>> squeeze/stable system.
>>
>> My problem is that only one of this three ethernet devices (from the
>> daughterboard not the internal one from motherboard) at once is working
>> correctly. Sometimes i have only one client to my server connected, but
>> only one specific ehernet port is working (mostly the physically upper
>> one). I didnt found any regularity yet.
> I guess "not working" means "link detected: no" in ethtool?
>
>> Any hints? is there a specific person whom i schould contact?
> I had a similar (the same?) problems with this hardware about 2 years ago. My
> "solution" was to install windows + drivers, then throw windows away and
> install debian. After that, I had no more problems with those NICs. Since
> then, there have been some patches to the initialization code for these RTL
> Linux drivers, but maybe the windows driver installs newer firmware, too,
> which may be required for that particular product.
>
> Carsten
Hi,

You mean the latest Linux kernel was patched for better
initialization? At the moment I am using kernel 3.0.8. As I
understand it, the firmware must be loaded at every power-on.

regards Markus


* [PATCH net-next] sch_choke: use skb_flow_dissect()
From: Eric Dumazet @ 2011-11-29 14:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/sched/sch_choke.c |  120 ++++++++++------------------------------
 1 file changed, 31 insertions(+), 89 deletions(-)

diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index 061bcb7..205d369 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -19,10 +19,7 @@
 #include <net/pkt_sched.h>
 #include <net/inet_ecn.h>
 #include <net/red.h>
-#include <linux/ip.h>
-#include <net/ip.h>
-#include <linux/ipv6.h>
-#include <net/ipv6.h>
+#include <net/flow_keys.h>
 
 /*
    CHOKe stateless AQM for fair bandwidth allocation
@@ -142,92 +139,10 @@ static void choke_drop_by_idx(struct Qdisc *sch, unsigned int idx)
 	--sch->q.qlen;
 }
 
-/*
- * Compare flow of two packets
- *  Returns true only if source and destination address and port match.
- *          false for special cases
- */
-static bool choke_match_flow(struct sk_buff *skb1,
-			     struct sk_buff *skb2)
-{
-	int off1, off2, poff;
-	const u32 *ports1, *ports2;
-	u32 _ports1, _ports2;
-	u8 ip_proto;
-	__u32 hash1;
-
-	if (skb1->protocol != skb2->protocol)
-		return false;
-
-	/* Use rxhash value as quick check */
-	hash1 = skb_get_rxhash(skb1);
-	if (!hash1 || hash1 != skb_get_rxhash(skb2))
-		return false;
-
-	/* Probably match, but be sure to avoid hash collisions */
-	off1 = skb_network_offset(skb1);
-	off2 = skb_network_offset(skb2);
-
-	switch (skb1->protocol) {
-	case __constant_htons(ETH_P_IP): {
-		const struct iphdr *ip1, *ip2;
-		struct iphdr _ip1, _ip2;
-
-		ip1 = skb_header_pointer(skb1, off1, sizeof(_ip1), &_ip1);
-		ip2 = skb_header_pointer(skb2, off2, sizeof(_ip2), &_ip2);
-		if (!ip1 || !ip2)
-			return false;
-		ip_proto = ip1->protocol;
-		if (ip_proto != ip2->protocol ||
-		    ip1->saddr != ip2->saddr || ip1->daddr != ip2->daddr)
-			return false;
-
-		if (ip_is_fragment(ip1) | ip_is_fragment(ip2))
-			ip_proto = 0;
-		off1 += ip1->ihl * 4;
-		off2 += ip2->ihl * 4;
-		break;
-	}
-
-	case __constant_htons(ETH_P_IPV6): {
-		const struct ipv6hdr *ip1, *ip2;
-		struct ipv6hdr _ip1, _ip2;
-
-		ip1 = skb_header_pointer(skb1, off1, sizeof(_ip1), &_ip1);
-		ip2 = skb_header_pointer(skb2, off2, sizeof(_ip2), &_ip2);
-		if (!ip1 || !ip2)
-			return false;
-
-		ip_proto = ip1->nexthdr;
-		if (ip_proto != ip2->nexthdr ||
-		    ipv6_addr_cmp(&ip1->saddr, &ip2->saddr) ||
-		    ipv6_addr_cmp(&ip1->daddr, &ip2->daddr))
-			return false;
-		off1 += 40;
-		off2 += 40;
-	}
-
-	default: /* Maybe compare MAC header here? */
-		return false;
-	}
-
-	poff = proto_ports_offset(ip_proto);
-	if (poff < 0)
-		return true;
-
-	off1 += poff;
-	off2 += poff;
-
-	ports1 = skb_header_pointer(skb1, off1, sizeof(_ports1), &_ports1);
-	ports2 = skb_header_pointer(skb2, off2, sizeof(_ports2), &_ports2);
-	if (!ports1 || !ports2)
-		return false;
-
-	return *ports1 == *ports2;
-}
-
 struct choke_skb_cb {
-	u16 classid;
+	u16			classid;
+	u8			keys_valid;
+	struct flow_keys	keys;
 };
 
 static inline struct choke_skb_cb *choke_skb_cb(const struct sk_buff *skb)
@@ -248,6 +163,32 @@ static u16 choke_get_classid(const struct sk_buff *skb)
 }
 
 /*
+ * Compare flow of two packets
+ *  Returns true only if source and destination address and port match.
+ *          false for special cases
+ */
+static bool choke_match_flow(struct sk_buff *skb1,
+			     struct sk_buff *skb2)
+{
+	if (skb1->protocol != skb2->protocol)
+		return false;
+
+	if (!choke_skb_cb(skb1)->keys_valid) {
+		choke_skb_cb(skb1)->keys_valid = 1;
+		skb_flow_dissect(skb1, &choke_skb_cb(skb1)->keys);
+	}
+
+	if (!choke_skb_cb(skb2)->keys_valid) {
+		choke_skb_cb(skb2)->keys_valid = 1;
+		skb_flow_dissect(skb2, &choke_skb_cb(skb2)->keys);
+	}
+
+	return !memcmp(&choke_skb_cb(skb1)->keys,
+		       &choke_skb_cb(skb2)->keys,
+		       sizeof(struct flow_keys));
+}
+
+/*
  * Classify flow using either:
  *  1. pre-existing classification result in skb
  *  2. fast internal classification
@@ -333,6 +274,7 @@ static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 			goto other_drop;	/* Packet was eaten by filter */
 	}
 
+	choke_skb_cb(skb)->keys_valid = 0;
 	/* Compute average queue usage (see RED) */
 	p->qavg = red_calc_qavg(p, sch->q.qlen);
 	if (red_is_idling(p))
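
The lazy per-packet caching used by the new choke_match_flow() can be illustrated with a small user-space sketch. The types and names below (flow_keys_sketch, pkt_sketch, pkt_keys(), same_flow()) are hypothetical and only mirror the shape of the patch: keys are extracted at most once per packet, a validity flag records that, and two packets match when their key structs compare equal bytewise. The sketch zeroes the key struct before filling it, so that padding bytes cannot make memcmp() report a difference between identical flows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct flow_keys. */
struct flow_keys_sketch {
	uint32_t src;
	uint32_t dst;
	uint32_t ports;
	uint8_t  ip_proto;	/* typically followed by padding bytes */
};

/* Hypothetical packet: raw header fields plus a per-packet key cache. */
struct pkt_sketch {
	uint32_t saddr, daddr, ports;
	uint8_t  proto;
	bool     keys_valid;	/* cleared when the packet is enqueued */
	struct flow_keys_sketch keys;
};

/* Return the cached keys, extracting them on first use only. */
static const struct flow_keys_sketch *pkt_keys(struct pkt_sketch *p)
{
	assert(p != NULL);

	if (!p->keys_valid) {
		/* Zero everything, padding included, before filling in. */
		memset(&p->keys, 0, sizeof(p->keys));
		p->keys.src = p->saddr;
		p->keys.dst = p->daddr;
		p->keys.ports = p->ports;
		p->keys.ip_proto = p->proto;
		p->keys_valid = true;
	}
	return &p->keys;
}

/* True only if both packets carry identical flow keys. */
bool same_flow(struct pkt_sketch *a, struct pkt_sketch *b)
{
	return memcmp(pkt_keys(a), pkt_keys(b),
		      sizeof(struct flow_keys_sketch)) == 0;
}
```

The cache pays off when a queued packet is compared against several new arrivals, since its keys are then computed only once.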


* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Ben Hutchings @ 2011-11-29 14:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Dave Taht, Tom Herbert, davem, netdev
In-Reply-To: <1322550138.2970.70.camel@edumazet-laptop>

On Tue, 2011-11-29 at 08:02 +0100, Eric Dumazet wrote:
> On Tuesday 29 November 2011 at 05:23 +0100, Dave Taht wrote:
> > > In this test 100 netperf TCP_STREAMs were started to saturate the link.
> > > A single instance of a netperf TCP_RR was run with high priority set.
> > > Queuing discipline in pfifo_fast, NIC is e1000 with TX ring size set to
> > > 1024.  tps for the high priority RR is listed.
> > >
> > > No BQL, tso on: 3000-3200K bytes in queue: 36 tps
> > > BQL, tso on: 156-194K bytes in queue, 535 tps
> > 
> > > No BQL, tso off: 453-454K bytes int queue, 234 tps
> > > BQL, tso off: 66K bytes in queue, 914 tps
> > 
> > 
> > Jeeze. Under what circumstances is tso a win? I've always
> > had great trouble with it, as some e1000 cards do it rather badly.
> > 
> > I assume these are while running at GigE speeds?
> > 
> > What of 100Mbit? 10GigE? (I will duplicate your tests
> > at 100Mbit, but as for 10gigE...)
> > 
> 
> TSO on means a low priority 65Kbytes packet can be in TX ring right
> before the high priority packet. If you cant afford the delay, you lose.
> 
> There is no mystery here.
> 
> If you want low latencies :
> - TSO must be disabled so that packets are at most one ethernet frame. 
[...]

Not if you separate hardware queues by priority (and your high priority
packets are non-TCP or PuSHed).

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Eric Dumazet @ 2011-11-29 14:29 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Dave Taht, Tom Herbert, davem, netdev
In-Reply-To: <1322576645.7454.48.camel@deadeye>

On Tuesday 29 November 2011 at 14:24 +0000, Ben Hutchings wrote:

> Not if you separate hardware queues by priority (and your high priority
> packets are non-TCP or PuSHed).

I mostly have tg3 and bnx2 cards, which are single-queue...

I presume Dave, working on small WiFi/ADSL routers, has the same kind of
hardware.


* Re: Proposed removal of DECnet support (was: Re: [BUG] 3.2-rc2: BUG kmalloc-8: Redzone overwritten)
From: Philipp Schafft @ 2011-11-29 14:47 UTC (permalink / raw)
  To: Steven Whitehouse
  Cc: Eric Dumazet, Sasha Levin, David Miller, Matt Mackall,
	Christoph Lameter, Pekka Enberg, linux-mm, linux-kernel, netdev,
	Chrissie Caulfield, Linux-DECnet user, RoarAudio
In-Reply-To: <1322490161.2711.26.camel@menhir>


reflum,

On Tue, 2011-11-29 at 15:34 +0100, Steven Whitehouse wrote:

> Has anybody actually tested it
> > >> lately against "real" DEC implementations?
> > > I doubt it :-)
> > DECnet is in use against real DEC implementations - I have checked it 
> > quite recently against a VAX running OpenVMS. How many people are 
> > actually using it for real work is a different question though.
> > 
> Ok, thats useful info.

I confirmed parts of it with tcpdump and the specs some weeks ago. The
parts I worked on passed :) I also considered sending tcpdump
upstream a patch for protocol decoding.


> > It's also true that it's not really supported by anyone as I orphaned it 
> > some time ago and nobody else seems to care enough to take it over. So 
> > if it's becoming a burden on people doing real kernel work then I don't 
> > think many tears will be wept for its removal.
> > Chrissie
> 
> Really the only issue with keeping it around is the maintenance burden I
> think. It doesn't look like anybody wants to take it on, but maybe we
> should give it another few days for someone to speak up, just in case
> they are on holiday or something at the moment.
> 
> Also, I've updated the subject of the thread, to make it more obvious
> what is being discussed, as well as bcc'ing it again to the DECnet list,

I'm very interested in the module. My problem is that I have not done
any kernel coding yet, so I'm currently searching for a new maintainer
for it (I only learned about this thread today).
If somebody is interested in this and only needs some "motivation", or
maybe someone would like to get me into kernel coding, please just
reply :)

-- 
Philipp.
 (Rah of PH2)



* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Eric Dumazet @ 2011-11-29 14:57 UTC (permalink / raw)
  To: Dave Taht
  Cc: John Fastabend, Tom Herbert, davem@davemloft.net,
	netdev@vger.kernel.org
In-Reply-To: <CAA93jw6bvzASrtvjhKL+haZb0VEPvQQD0guf7aBTPxJbxydEAg@mail.gmail.com>

On Tuesday 29 November 2011 at 09:51 +0100, Dave Taht wrote:

> Perhaps I don't understand the gross effects of TSO very well, but if you have
> 100 streams coming from a server, destined to X different destinations,
> and you FQ to each on a per packet basis, you end up impacting the downstream
> receive buffers throughout much less than if you send each stream as a burst.

TSO makes packets larger, to lower cpu use in different layers (netfilter, qdisc, ...).

Imagine you could have MSS=65000 on your ethernet wire.

If you need to send a high prio packet while a prior big one is
in-flight on a dumb device (a single TX FIFO), there is nothing you
can do but wait for the last bit of the big packet to hit the wire.

Even with one flow you lose. A hundred flows don't matter
(as long as you have proper classification in Qdisc layer, of course)
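
To put a rough number on that wait: the time a packet occupies the wire is its size in bits divided by the link rate. The helper below is a back-of-the-envelope sketch; the function name is made up for illustration, and it ignores Ethernet framing overhead (preamble, inter-frame gap), so real figures are slightly higher.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time in microseconds needed to serialize `bytes` onto a link running
 * at `link_bps` bits per second (integer division, rounds down).
 */
uint64_t serialization_delay_us(uint64_t bytes, uint64_t link_bps)
{
	assert(link_bps > 0);
	return bytes * 8 * 1000000 / link_bps;
}
```

With the MSS=65000 thought experiment above, a 65000-byte packet holds a 1 Gbit/s link for 520 us and a 100 Mbit/s link for 5200 us, against 12 us and 120 us for a single 1500-byte frame.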

Most setups don't care.

The ones that care dedicate a link for exclusive use, making sure it won't
cross loaded trunks. (Heartbeats in clusters)

Even disabling TSO won't be enough for them, if a single tcp flow can compete with them.


* Re: [PATCH] sctp: integer overflow in sctp_auth_create_key()
From: Vladislav Yasevich @ 2011-11-29 15:03 UTC (permalink / raw)
  To: Xi Wang
  Cc: linux-kernel, Sridhar Samudrala, David S. Miller, linux-sctp,
	netdev, security
In-Reply-To: <147F953A-CC69-41FF-ACD4-64E5E2956411@gmail.com>

On 11/29/2011 02:33 AM, Xi Wang wrote:
> I agree that this is not a security issue if key_len can never get large.
> 
> So how about just removing the overflow check at all?

That should be ok as well.  There is an overflow guard in the api
entry point so that should guard against overflows from user space.

On the network end I miscalculated a little.  The key is actually made up
of user_key (1 short) + 2 * key_vector (3 shorts) for a total of 7*MAX_USHORT;
however, that still will not overflow 32 bits.
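
The arithmetic can be checked with a few lines of C. This is only a restatement of the accounting given above (one user key plus two key vectors of up to three u16-sized parts each); the function name is made up for illustration and nothing here is taken from the SCTP code itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Worst-case concatenated key length in bytes, per the accounting above. */
uint32_t worst_case_key_len(void)
{
	const uint32_t max_ushort = 65535;		/* largest u16 value */
	const uint32_t user_key = max_ushort;		/* 1 short */
	const uint32_t key_vector = 3 * max_ushort;	/* 3 shorts */
	const uint32_t total = user_key + 2 * key_vector;

	assert(total == 7 * max_ushort);
	return total;
}
```

The result is 458745 bytes, several orders of magnitude below INT_MAX (2147483647 with a 32-bit int).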

-vlad

> 
> - xi
> 
> On Nov 28, 2011, at 10:45 AM, Vladislav Yasevich wrote:
>>
>> Hmm.  Yes, this is a more correct check.
>>
>> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>>
>>
>> However, I don't think this is a security issue.  As I've written before, this function is
>> called from 2 places:
>>
>>  1) setsockopt() code path
>>
>>  2) sctp_auth_asoc_set_secret() code path
>>
>> In case (1), sca_keylength is never going to exceed 65535 since it's
>> bounded by a u16 from the user api.  As such, The integer promotion will
>> not impact anything and the malloc() will never overflow.
>>
>> In case (2), sca_keylength is computed based on the key the user provided
>> (MAX_USHORT) and the combination of protocol negotiated data where that
>> combination has a max size of 3 * MAX_USHORT (see sctp_auth_make_key_vector()).
>> So, even this case, our maximum key length can be 4* MAX_USHORT which still
>> will always be below MAX_INT and will not overflow.
>>
>> So, I don't think there is big security consideration here, just a bad
>> check that just happens to always work.
>>
>> -vlad
> 
> 


* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Peng Tao @ 2011-11-29 15:05 UTC (permalink / raw)
  To: Stanislav Kinsbursky
  Cc: tao.peng-mb1K0bWo544@public.gmane.org,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Pavel Emelianov, neilb-l3A5Bk7waGM@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	James Bottomley, bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org,
	devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
In-Reply-To: <4ED4DA80.9080303-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

On Tue, Nov 29, 2011 at 9:13 PM, Stanislav Kinsbursky
<skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:
> On 29.11.2011 16:40, tao.peng-mb1K0bWo544@public.gmane.org wrote:
>
>>> -----Original Message-----
>>> From: Stanislav Kinsbursky [mailto:skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org]
>>> Sent: Tuesday, November 29, 2011 8:19 PM
>>> To: Peng, Tao
>>> Cc: Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Pavel
>>> Emelianov; neilb-l3A5Bk7waGM@public.gmane.org;
>>> netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; James Bottomley;
>>> bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org;
>>> davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org; devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
>>> Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>>> from blocklayout routines
>>>
>>> On 29.11.2011 16:00, tao.peng-mb1K0bWo544@public.gmane.org wrote:
>>>>>
>>>>> -----Original Message-----
>>>>> From: linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>>> [mailto:linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of
>>>
>>> Stanislav
>>>>>
>>>>> Kinsbursky
>>>>> Sent: Tuesday, November 29, 2011 6:11 PM
>>>>> To: Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
>>>>> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org; neilb-l3A5Bk7waGM@public.gmane.org;
>>>>> netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-
>>>>> kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; jbottomley-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org; bfields@fieldses.org;
>>>>> davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org;
>>>>> devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
>>>>> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from
>>>>> blocklayout routines
>>>>>
>>>>> This is a cleanup patch. We don't need this reference anymore, because
>>>>> blocklayout pipes dentries now creates and destroys in per-net
>>>>> operations and
>>>>> on PipeFS mount/umount notification.
>>>>> Note that nfs4blocklayout_register_net() now returns 0 instead of
>>>>> -ENOENT in
>>>>> case of PipeFS superblock absence. This is ok, because blocklayout pipe
>>>>> dentry
>>>>> will be created on PipeFS mount event.
>>>>
>>>> When is the "pipefs mount event" going to happen? When inserting kernel
>>>> modules or when user issues
>>>
>>> mount command?
>>>>
>>>>
>>>
>>> When user issues mount command.
>>> Kernel mounts of PipeFS has been removed with all these patch sets I've
>>> sent
>>> already.
>>
>> Then it is going to break the blocklayout user space program blkmapd, which is
>> started before mounting any file system and tries to open the pipe file
>> when started.
>
>
> Sorry, but I don't get it. Probably we have misunderstanding here.
> You said, that "blkmapd ... tries to open the pipe file when started". This
> pipe file is located on PipeFS, isn't it?
> If yes, then PipeFS have to be mounted already in user-space. And if it has
> been mounted - then pipe dentry is present.
> IOW, pipe (without dentry) will be created on module load. Pipe dentry will
> be created right after that (like it was before) if PipeFS was mounted from
> user-space. If not - then pipe dentry will be created  on PipeFS (!) mount
> (not NFS or pNFS mount) from user-space.
Sorry, I misunderstood. I was thinking about mounting NFS or pNFS when
you said "when user issues mount command". Thanks for the explanation.

Regards,
Tao
>
> Or I'm missing something in your reply?
>
>
>> Not sure if you implement the same logic on nfs pipe as well. But if you
>> do, then nfs client user space program idmapd will fail to start for the
>> same reason.
>>
>
> The same logic here.
>
>
>> Why not just fail to load module if you fail to initialize pipefs? When is
>> rpc_get_sb_net() going to fail?
>>
>
> Sorry, but I don't understand, what is your idea. And why do we need to fail
> at all.
> BTW, rpc_get_sb_net() just checks, was PipeFS mounted in passed net, or not.
> If not - not a problem. Dentries will be created on mount event. If yes,
> then it returns locked PipeFS sb and the next step is dentry creation.
>
>
>>>
>>>
>>>> Thanks,
>>>> Tao
>>>>
>>>>>
>>>>> Signed-off-by: Stanislav Kinsbursky<skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
>>>>>
>>>>> ---
>>>>>   fs/nfs/blocklayout/blocklayout.c |    9 +--------
>>>>>   1 files changed, 1 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfs/blocklayout/blocklayout.c
>>>>> b/fs/nfs/blocklayout/blocklayout.c
>>>>> index acf7ac9..8211ffd 100644
>>>>> --- a/fs/nfs/blocklayout/blocklayout.c
>>>>> +++ b/fs/nfs/blocklayout/blocklayout.c
>>>>> @@ -1032,7 +1032,7 @@ static struct dentry
>>>>> *nfs4blocklayout_register_net(struct net *net,
>>>>>
>>>>>        pipefs_sb = rpc_get_sb_net(net);
>>>>>        if (!pipefs_sb)
>>>>> -               return ERR_PTR(-ENOENT);
>>>>> +               return 0;
>>>>>        dentry = nfs4blocklayout_register_sb(pipefs_sb, pipe);
>>>>>        rpc_put_sb_net(net);
>>>>>        return dentry;
>>>>> @@ -1083,7 +1083,6 @@ static struct pernet_operations
>>>>> nfs4blocklayout_net_ops = {
>>>>>
>>>>>   static int __init nfs4blocklayout_init(void)
>>>>>   {
>>>>> -       struct vfsmount *mnt;
>>>>>        int ret;
>>>>>
>>>>>        dprintk("%s: NFSv4 Block Layout Driver Registering...\n",
>>>>> __func__);
>>>>> @@ -1093,12 +1092,6 @@ static int __init nfs4blocklayout_init(void)
>>>>>                goto out;
>>>>>
>>>>>        init_waitqueue_head(&bl_wq);
>>>>> -
>>>>> -       mnt = rpc_get_mount();
>>>>> -       if (IS_ERR(mnt)) {
>>>>> -               ret = PTR_ERR(mnt);
>>>>> -               goto out_remove;
>>>>> -       }
>>>>>        ret = rpc_pipefs_notifier_register(&nfs4blocklayout_block);
>>>>>        if (ret)
>>>>>                goto out_remove;
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Stanislav Kinsbursky
>>
>>
>
>
> --
> Best regards,
> Stanislav Kinsbursky
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Peng Tao @ 2011-11-29 15:10 UTC (permalink / raw)
  To: Myklebust, Trond
  Cc: tao.peng-mb1K0bWo544, skinsbursky-bzQdu9zFT3WakBO8gow8eQ,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, xemul-bzQdu9zFT3WakBO8gow8eQ,
	neilb-l3A5Bk7waGM, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	jbottomley-bzQdu9zFT3WakBO8gow8eQ, bfields-uC3wQj2KruNg9hUCZPvPmw,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, devel-GEFAQzZX7r8dnm+yROfE0A
In-Reply-To: <2E1EB2CF9ED1CB4AA966F0EB76EAB4430C3CBC23-hX7t0kiaRRrlMGe9HJ1VYQK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>

On Tue, Nov 29, 2011 at 9:35 PM, Myklebust, Trond
<Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org> wrote:
>> -----Original Message-----
>> From: tao.peng-mb1K0bWo544@public.gmane.org [mailto:tao.peng-mb1K0bWo544@public.gmane.org]
>> Sent: Tuesday, November 29, 2011 7:40 AM
>> To: skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org
>> Cc: Myklebust, Trond; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org;
>> neilb-l3A5Bk7waGM@public.gmane.org; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
>> jbottomley-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org; bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org; davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org;
>> devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
>> Subject: RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> from blocklayout routines
>>
>> > -----Original Message-----
>> > From: Stanislav Kinsbursky [mailto:skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org]
>> > Sent: Tuesday, November 29, 2011 8:19 PM
>> > To: Peng, Tao
>> > Cc: Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Pavel
>> > Emelianov; neilb-l3A5Bk7waGM@public.gmane.org; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
>> > linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; James Bottomley; bfields-uC3wQj2KruMpug/h7KTFAQ@public.gmane.orgg;
>> > davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org; devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
>> > Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> > from blocklayout routines
>> >
>> > 29.11.2011 16:00, tao.peng-mb1K0bWo544@public.gmane.org wrote:
>> > >> -----Original Message-----
>> > >> From: linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> > >> [mailto:linux-nfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of
>> > Stanislav
>> > >> Kinsbursky
>> > >> Sent: Tuesday, November 29, 2011 6:11 PM
>> > >> To: Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
>> > >> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org; neilb@suse.de;
>> > >> netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux- kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
>> > >> jbottomley-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org; bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org;
>> > >> davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org; devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
>> > >> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> > >> from blocklayout routines
>> > >>
>> > >> This is a cleanup patch. We don't need this reference anymore,
>> > >> because blocklayout pipe dentries are now created and destroyed in
>> > >> per-net operations and on PipeFS mount/umount notification.
>> > >> Note that nfs4blocklayout_register_net() now returns 0 instead of
>> > >> -ENOENT in case of PipeFS superblock absence. This is ok, because
>> > >> blocklayout pipe dentry will be created on PipeFS mount event.
>> > > When is the "pipefs mount event" going to happen? When inserting
>> > > kernel modules or when user issues
>> > mount command?
>> > >
>> >
>> > When user issues mount command.
>> > Kernel mounts of PipeFS have been removed by the patch sets
>> > I've already sent.
>> Then it is going to break blocklayout user space program blkmapd, which is
>> started before mounting any file system and it tries to open the pipe file
>> when started.
>
> Why on earth is blkmapd doing this instead of listening for file creation notifications like the other rpc_pipefs daemons do?
Not sure why the original implementer chose this, but I think it is
likely because we do not expect the pipe file to be created or deleted
dynamically.

-- 
Thanks,
Tao

^ permalink raw reply

* Re: [PATCH] staging: hv: move hv_netvsc out of staging area
From: Greg KH @ 2011-11-29 15:16 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: gregkh@suse.de, linux-kernel@vger.kernel.org,
	devel@linuxdriverproject.org, NetDev
In-Reply-To: <A1F3067C9B68744AA19F6802BAB8FFDC0CCCC087@TK5EX14MBXC227.redmond.corp.microsoft.com>

On Mon, Nov 28, 2011 at 10:00:35PM +0000, Haiyang Zhang wrote:
> > -----Original Message-----
> > From: Haiyang Zhang [mailto:haiyangz@microsoft.com]
> > Sent: Monday, November 28, 2011 4:36 PM
> > To: Haiyang Zhang; KY Srinivasan; gregkh@suse.de; linux-
> > kernel@vger.kernel.org; devel@linuxdriverproject.org
> > Cc: Mike Sterling; NetDev
> > Subject: [PATCH] staging: hv: move hv_netvsc out of staging area
> > 
> > hv_netvsc has been reviewed on netdev mailing list on 6/09/2011.
> > All recommended changes have been made. We are requesting to move it
> > out of staging area.
> > 
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > Signed-off-by: KY Srinivasan <kys@microsoft.com>
> > Signed-off-by: Mike Sterling <Mike.Sterling@microsoft.com>
> > Cc: NetDev <netdev@vger.kernel.org>
> > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> > ---
> >  drivers/net/Kconfig                               |    2 ++
> >  drivers/net/Makefile                              |    2 ++
> >  drivers/net/hyperv/Kconfig                        |    5 +++++
> >  drivers/net/hyperv/Makefile                       |    3 +++
> >  drivers/{staging/hv => net/hyperv}/hyperv_net.h   |    0
> >  drivers/{staging/hv => net/hyperv}/netvsc.c       |    0
> >  drivers/{staging/hv => net/hyperv}/netvsc_drv.c   |    0
> >  drivers/{staging/hv => net/hyperv}/rndis_filter.c |    0
> >  drivers/staging/hv/Kconfig                        |    6 ------
> >  drivers/staging/hv/Makefile                       |    2 --
> >  drivers/staging/hv/TODO                           |    1 -
> >  11 files changed, 12 insertions(+), 9 deletions(-)  create mode 100644
> > drivers/net/hyperv/Kconfig  create mode 100644
> > drivers/net/hyperv/Makefile  rename drivers/{staging/hv =>
> > net/hyperv}/hyperv_net.h (100%)  rename drivers/{staging/hv =>
> > net/hyperv}/netvsc.c (100%)  rename drivers/{staging/hv =>
> > net/hyperv}/netvsc_drv.c (100%)  rename drivers/{staging/hv =>
> > net/hyperv}/rndis_filter.c (100%)
> > 
> 
> I have rebased the previous patch on the latest staging-next branch,
> and re-submitting it now.  In another email, the same patch was
> submitted without using the "-M" flag, in case anyone wants to read
> the unchanged code in the patch body.

This one was fine, now applied, thanks.

greg k-h

^ permalink raw reply

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Trond Myklebust @ 2011-11-29 15:18 UTC (permalink / raw)
  To: Peng Tao
  Cc: tao.peng, skinsbursky, linux-nfs, xemul, neilb, netdev,
	linux-kernel, jbottomley, bfields, davem, devel
In-Reply-To: <CA+a=Yy5GKMu8rbh8y3fOtK6pmqygPNBr=a+p7LfcmTqtXFH3iA@mail.gmail.com>

On Tue, 2011-11-29 at 23:10 +0800, Peng Tao wrote: 
> On Tue, Nov 29, 2011 at 9:35 PM, Myklebust, Trond
> <Trond.Myklebust@netapp.com> wrote:
> >> -----Original Message-----
> >> From: tao.peng@emc.com [mailto:tao.peng@emc.com]
> >> Sent: Tuesday, November 29, 2011 7:40 AM
> >> To: skinsbursky@parallels.com
> >> Cc: Myklebust, Trond; linux-nfs@vger.kernel.org; xemul@parallels.com;
> >> neilb@suse.de; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> jbottomley@parallels.com; bfields@fieldses.org; davem@davemloft.net;
> >> devel@openvz.org
> >> Subject: RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> from blocklayout routines
> >>
> >> > -----Original Message-----
> >> > From: Stanislav Kinsbursky [mailto:skinsbursky@parallels.com]
> >> > Sent: Tuesday, November 29, 2011 8:19 PM
> >> > To: Peng, Tao
> >> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Pavel
> >> > Emelianov; neilb@suse.de; netdev@vger.kernel.org;
> >> > linux-kernel@vger.kernel.org; James Bottomley; bfields@fieldses.org;
> >> > davem@davemloft.net; devel@openvz.org
> >> > Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> > from blocklayout routines
> >> >
> >> > 29.11.2011 16:00, tao.peng@emc.com wrote:
> >> > >> -----Original Message-----
> >> > >> From: linux-nfs-owner@vger.kernel.org
> >> > >> [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of
> >> > Stanislav
> >> > >> Kinsbursky
> >> > >> Sent: Tuesday, November 29, 2011 6:11 PM
> >> > >> To: Trond.Myklebust@netapp.com
> >> > >> Cc: linux-nfs@vger.kernel.org; xemul@parallels.com; neilb@suse.de;
> >> > >> netdev@vger.kernel.org; linux- kernel@vger.kernel.org;
> >> > >> jbottomley@parallels.com; bfields@fieldses.org;
> >> > >> davem@davemloft.net; devel@openvz.org
> >> > >> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> > >> from blocklayout routines
> >> > >>
> >> > >> This is a cleanup patch. We don't need this reference anymore,
> >> > >> because blocklayout pipe dentries are now created and destroyed in
> >> > >> per-net operations and on PipeFS mount/umount notification.
> >> > >> Note that nfs4blocklayout_register_net() now returns 0 instead of
> >> > >> -ENOENT in case of PipeFS superblock absence. This is ok, because
> >> > >> blocklayout pipe dentry will be created on PipeFS mount event.
> >> > > When is the "pipefs mount event" going to happen? When inserting
> >> > > kernel modules or when user issues
> >> > mount command?
> >> > >
> >> >
> >> > When user issues mount command.
> >> > Kernel mounts of PipeFS have been removed by the patch sets
> >> > I've already sent.
> >> Then it is going to break blocklayout user space program blkmapd, which is
> >> started before mounting any file system and it tries to open the pipe file
> >> when started.
> >
> > Why on earth is blkmapd doing this instead of listening for file creation notifications like the other rpc_pipefs daemons do?
> Not sure why the original implementer chose this, but I think it is
> likely because we do not expect the pipe file to be created or deleted
> dynamically.

Unless blkmapd can pin the sunrpc module (which it shouldn't be able to)
then that assumption would be wrong. Please look into fixing blkmapd...

   Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Peng Tao @ 2011-11-29 15:30 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: tao.peng, skinsbursky, linux-nfs, xemul, neilb, netdev,
	linux-kernel, jbottomley, bfields, davem, devel
In-Reply-To: <1322579906.3619.1.camel@lade.trondhjem.org>

On Tue, Nov 29, 2011 at 11:18 PM, Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:
> On Tue, 2011-11-29 at 23:10 +0800, Peng Tao wrote:
>> On Tue, Nov 29, 2011 at 9:35 PM, Myklebust, Trond
>> <Trond.Myklebust@netapp.com> wrote:
>> >> -----Original Message-----
>> >> From: tao.peng@emc.com [mailto:tao.peng@emc.com]
>> >> Sent: Tuesday, November 29, 2011 7:40 AM
>> >> To: skinsbursky@parallels.com
>> >> Cc: Myklebust, Trond; linux-nfs@vger.kernel.org; xemul@parallels.com;
>> >> neilb@suse.de; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
>> >> jbottomley@parallels.com; bfields@fieldses.org; davem@davemloft.net;
>> >> devel@openvz.org
>> >> Subject: RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> >> from blocklayout routines
>> >>
>> >> > -----Original Message-----
>> >> > From: Stanislav Kinsbursky [mailto:skinsbursky@parallels.com]
>> >> > Sent: Tuesday, November 29, 2011 8:19 PM
>> >> > To: Peng, Tao
>> >> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Pavel
>> >> > Emelianov; neilb@suse.de; netdev@vger.kernel.org;
>> >> > linux-kernel@vger.kernel.org; James Bottomley; bfields@fieldses.org;
>> >> > davem@davemloft.net; devel@openvz.org
>> >> > Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> >> > from blocklayout routines
>> >> >
>> >> > 29.11.2011 16:00, tao.peng@emc.com wrote:
>> >> > >> -----Original Message-----
>> >> > >> From: linux-nfs-owner@vger.kernel.org
>> >> > >> [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of
>> >> > Stanislav
>> >> > >> Kinsbursky
>> >> > >> Sent: Tuesday, November 29, 2011 6:11 PM
>> >> > >> To: Trond.Myklebust@netapp.com
>> >> > >> Cc: linux-nfs@vger.kernel.org; xemul@parallels.com; neilb@suse.de;
>> >> > >> netdev@vger.kernel.org; linux- kernel@vger.kernel.org;
>> >> > >> jbottomley@parallels.com; bfields@fieldses.org;
>> >> > >> davem@davemloft.net; devel@openvz.org
>> >> > >> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
>> >> > >> from blocklayout routines
>> >> > >>
>> >> > >> This is a cleanup patch. We don't need this reference anymore,
>> >> > >> because blocklayout pipe dentries are now created and destroyed in
>> >> > >> per-net operations and on PipeFS mount/umount notification.
>> >> > >> Note that nfs4blocklayout_register_net() now returns 0 instead of
>> >> > >> -ENOENT in case of PipeFS superblock absence. This is ok, because
>> >> > >> blocklayout pipe dentry will be created on PipeFS mount event.
>> >> > > When is the "pipefs mount event" going to happen? When inserting
>> >> > > kernel modules or when user issues
>> >> > mount command?
>> >> > >
>> >> >
>> >> > When user issues mount command.
>> >> > Kernel mounts of PipeFS have been removed by the patch sets
>> >> > I've already sent.
>> >> Then it is going to break blocklayout user space program blkmapd, which is
>> >> started before mounting any file system and it tries to open the pipe file
>> >> when started.
>> >
>> > Why on earth is blkmapd doing this instead of listening for file creation notifications like the other rpc_pipefs daemons do?
>> Not sure why the original implementer chose this, but I think it is
>> likely because we do not expect the pipe file to be created or deleted
>> dynamically.
>
> Unless blkmapd can pin the sunrpc module (which it shouldn't be able to)
> then that assumption would be wrong. Please look into fixing blkmapd...
Sorry, I don't quite get it. Do you mean sunrpc module may be removed
while nfs/blocklayout modules are still in use? Please explain it a
bit. Thanks.

Best,
Tao
>
>   Trond
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
>



-- 
Thanks,
Tao

^ permalink raw reply

* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Dave Taht @ 2011-11-29 16:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ben Hutchings, Tom Herbert, davem, netdev
In-Reply-To: <1322576986.2465.10.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

On Tue, Nov 29, 2011 at 3:29 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday 29 November 2011 at 14:24 +0000, Ben Hutchings wrote:
>
>> Not if you separate hardware queues by priority (and your high priority
>> packets are non-TCP or PuSHed).
>
> I mostly have tg3 and bnx2 cards, mono queues...
>
> I presume Dave, working on small WiFi/ADSL routers, has the same kind of
> hardware.

Nothing but mono queues here on wired; 4 queues on wireless, however.

My focus is on making sure the 10Gig guys don't swamp the 128Kbit to
100Mbit guys. Everything in that bandwidth range is what I care about,
mostly against GigE servers...

( I'm still waiting on some 10Gig hw donations to arrive)

However the hardware array is much larger than you presume.

We have a variety of hardware: 7 cerowrt routers located in bloatlab #1
at ISC.org, a couple of x86_64 based multicore servers there as well,
and a variety of related (mostly wireless) hardware, such as a bunch of
OLPCs.

Bloatlab #1 is in California, and is connected to the internet via 10GigE
and on a dedicated GigE connection all its own.

http://www.bufferbloat.net/projects/cerowrt/wiki/BloatLab_1

With overly reduced TX rings to combat bufferbloat, the best the
routers in the lab can do is about 290Mbit. They have excellent TCP_RR
stats, though. With larger rings, they do 540Mbit+. It's my hope with
BQL on the router to get closer to the larger figure.

One of the x86 machines in the lab does TSO and it's ugly...

I'm now based in Paris specifically to test FQ and AQM solutions
over the 170 ms LFN between here and there, and have been working on
QFQ + RED (while awaiting 'RED light'), both at 100Mbit line rates and
at software simulated rates below that, common to actual end user
connectivity to the internet.

http://www.bufferbloat.net/issues/312

I have 3 additional routers and  several e1000e machines here in Paris.

And I'm checking the interactions of all this with everything
else against a variety of models. ISC has made the bloatlab
available to all, I note. If anyone wants to run a test there, let me
know....


>
>
>



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net

^ permalink raw reply

* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Dave Taht @ 2011-11-29 16:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: John Fastabend, Tom Herbert, davem@davemloft.net,
	netdev@vger.kernel.org
In-Reply-To: <1322578667.2465.13.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

On Tue, Nov 29, 2011 at 3:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday 29 November 2011 at 09:51 +0100, Dave Taht wrote:
>
>> Perhaps I don't understand the gross effects of TSO very well, but if you have
>> 100 streams coming from a server, destined to X different destinations,
>> and you FQ to each on a per packet basis, you end up impacting the downstream
>> receive buffers throughout much less than if you send each stream as a burst.
>
> TSO makes packets larger, to lower cpu use in different layers (netfilter, qdisc, ...).
>
> Imagine you could have MSS=65000 on your ethernet wire.
>
> If you need to send a high prio packet while a prior big one is
> in-flight on a dumb device (a single TX FIFO), there is nothing you
> can do but wait for the last bit of the big packet to hit the wire.
>
> Even with one flow you lose. A hundred flows don't matter
> (as long as you have proper classification in Qdisc layer, of course)
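
To put a rough number on the wait described above, here is a small worked
sketch of the wire time of one packet. It is a back-of-the-envelope helper
under stated assumptions: serialization_delay_us() is a hypothetical name, and
the calculation ignores Ethernet framing, preamble and inter-frame gap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time needed to clock `bytes` onto a link running at `rate_bps`,
 * in microseconds, rounded down. Framing overhead is ignored.
 */
uint64_t serialization_delay_us(uint64_t bytes, uint64_t rate_bps)
{
	assert(rate_bps != 0);
	return bytes * 8u * 1000000u / rate_bps;
}
```

With these assumptions a 65000 byte packet occupies a 100Mbit link for
5200 us (5.2 ms) and a 1Gbit link for 520 us, while a 1500 byte packet at
100Mbit takes 120 us. A high prio packet queued behind the big one on a
single TX FIFO waits at least that long.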

People keep talking about 'prioritization' as if it can apply.

It doesn't. Prioritization and classification are nearly hopeless
exercises when you have high rate streams. It worked at low rates for
some traffic, but...

The focus for fixing bufferbloat is

               "better queueing"...

and what that translates out to is some form of

fair queuing

- at the moment I'm enthralled with QFQ btw -

coupled with some form of active queue management that works. (RED
used to work but was rather flawed - it's still better than the
alternative of drop tail)

It doesn't necessarily translate out to more unmanaged dumb queues; it
may translate out to more *managed* queues.

I wouldn't mind TSO AT ALL if the hardware did some of the above
underneath it. I've heard some rumblings that that might happen.  We
spent all that engineering time making TCP go fast and minimized the
hardware impact of that - why not spend a little more time - in the
next generation of hw/sw - making TCP work *better* on the network?

Cisco did that in the 90s, what's so hard about trying now, in
software and/or hardware?

Now look, this thread has got way off the original topic, which was on
BQL, and BQL mostly rocks. The MIAD (as opposed to AIMD) controller in
it bothers me, but that can be looked at harder later.
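
For readers unfamiliar with the two acronyms, the difference between an AIMD
and a MIAD controller can be sketched as below. This is a generic
illustration of the two update rules only, not the BQL limit calculation;
the function names, the step size of 1 and the factor of 2 are assumptions
chosen for the example.

```c
#include <assert.h>

/* AIMD: additive increase, multiplicative decrease. */
unsigned int aimd_step(unsigned int limit, int backoff)
{
	assert(limit > 0);
	if (backoff)
		return limit / 2 > 0 ? limit / 2 : 1;	/* halve, floor at 1 */
	return limit + 1;				/* grow slowly */
}

/* MIAD: multiplicative increase, additive decrease. */
unsigned int miad_step(unsigned int limit, int backoff)
{
	assert(limit > 0);
	if (backoff)
		return limit > 1 ? limit - 1 : 1;	/* shrink slowly */
	return limit * 2;				/* grow quickly */
}
```

Starting from a limit of 10, one increase step gives 11 under AIMD and 20
under MIAD, while one backoff step gives 5 under AIMD and 9 under MIAD. In
other words, MIAD is quick to raise the limit and slow to lower it, which is
the opposite of how TCP-style AIMD behaves.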

And I want to go back to making it work on top of my testbed so I can
finally see models matching reality and vice versa, and keep working on
producing demonstrable results that can help fix some problems in
software now for end users, gateways, routers, servers, and data
centers... reducing latencies from trips around the moon to around your
living room...

...and get slammed into hardware someday, if there is ever market
demand for things like interactive video, gaming, or voice
applications that just work.

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net

^ permalink raw reply

* list_debug WARNINGs followed by BUG in inetpeer.c
From: Bradley Peterson @ 2011-11-29 16:35 UTC (permalink / raw)
  To: netdev

Hello,

I have been seeing an issue where servers will report a series of
list_debug WARNINGs, usually lasting about 10 minutes, followed by a
BUG in inetpeer.c and a kernel panic.  This is across a variety of
hardware.  The kernels are 2.6.38.8 modified with some fixes in the
pptp, gre, and l2tp modules.  The server load is mostly bulk network
traffic, but they also act as VPN endpoints.

Here is an example.  In this case, it looks like the initial WARNING
was for a GRE packet, but that is not always the case.  I can provide
other examples, if needed.

The first two list_debug warnings:
[1029266.080589] ------------[ cut here ]------------
[1029266.085408] WARNING: at lib/list_debug.c:56 __list_del_entry+0x8d/0x98()
[1029266.092302] Hardware name: HDAMA
[1029266.095726]9266.277510]  [<ffffffff8141111d>] ip_defrag+0xce/0x9be
[1029266.282840]  [<ffffffffa026a9df>] ? pptp_rcv+0xaf/0xc3 [pptp]
[1029266.288775]  [<ffffffffa02640a3>] ? gre_rcv+0x62/0x75 [gre]
[1029266.294538]  [<ffffffff814498f5>] ipv4_conntrack_defrag+0xf6/0x125
[1029266.300907]  [<ffffffff81400e80>] nf_iterate+0x48/0x83
[1029266.306237]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[1029266.311999]  [<ffffffff81400f25>] nf_hook_slow+0x6a/0xe9
[1029266.317525]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[1029266.323279]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[1029266.329041]  [<ffffffff8141083b>] NF_HOOK.clone.7+0x46/0x58
[1029266.334795]  [<ffffffff81410bb5>] ip_rcv+0x21b/0x246
[1029266.339944]  [<ffffffff813dd584>] __netif_receive_skb+0x426/0x45c
[1029266.346224]  [<ffffffff813e002e>] ? dev_hard_start_xmit+0x3dc/0x4d8
[1029266.352688]  [<ffffffff813dd641>] process_backlog+0x87/0x15d
[1029266.358530]  [<ffffffff813de528>] net_rx_action+0xac/0x1b1
[1029266.364214]  [<ffffffff8105efaa>] __do_softirq+0xd2/0x19e
[1029266.369795]  [<ffffffff81078a95>] ? sched_clock_tick+0x70/0x75
[1029266.375818]  [<ffffffff8100bb5c>] call_softirq+0x1c/0x30
[1029266.381319]  [<ffffffff8100d287>] do_softirq+0x46/0x83
[1029266.386646]  [<ffffffff8105f132>] irq_exit+0x49/0x8b
[1029266.391796]  [<ffffffff81022b66>] smp_call_function_single_interrupt+0x25/0x27
[1029266.399204]  [<ffffffff8100b7b3>] call_function_single_interrupt+0x13/0x20
[1029266.406264]  <EOI>
[1029266.408378] ---[ end trace 44734be3fa460007 ]---
[1029266.413380] ------------[ cut here ]------------
[1029266.418196] WARNING: at lib/list_debug.c:30 __list_add+0x68/0x80()
[1029266.424554] Hardware name: HDAMA
[1029266.427967] list_add corruption. prev->next should be next (ffffffff81a7be60), but was ffff8801735f8428. (prev=ffff8801735f8428).
[1029266.439800] Modules linked in: authenc esp4 xfrm4_mode_transport
arc4 ppp_mppe tcp_diag inet_diag xt_NOTRACK iptable_raw pptp gre
l2tp_ppp pppox ppp_generic slhc l2tp_netlin
k l2tp_core tunrcv+0x21b/0x246
[1029266.638169]  [<ffffffff813dd584>] __netif_receive_skb+0x426/0x45c
[1029266.644440]  [<ffffffff81085756>] ? __smp_call_function_single+0xa9/0xb2
[1029266.651321]  [<ffffffff813dd641>] process_backlog+0x87/0x15d
[1029266.657161]  [<ffffffff8100b7b3>] ? call_function_single_interrupt+0x13/0x20
[1029266.664388]  [<ffffffff813de528>] net_rx_action+0xac/0x1b1
[1029266.670064]  [<ffffffff8105efaa>] __do_softirq+0xd2/0x19e
[1029266.675644]  [<ffffffff810245cf>] ? ack_APIC_irq+0x15/0x17
[1029266.681309]  [<ffffffff8100bb5c>] call_softirq+0x1c/0x30
[1029266.686804]  [<ffffffff8100d287>] do_softirq+0x46/0x83
[1029266.692132]  [<ffffffff8105f132>] irq_exit+0x49/0x8b
[1029266.697283]  [<ffffffff8148ff5e>] do_IRQ+0x8e/0xa5
[1029266.702256]  [<ffffffff81489d93>] ret_from_intr+0x0/0x15
[1029266.707778]  <EOI>  [<ffffffff810b8394>] ? rcu_needs_cpu+0x10e/0x1bf
[1029266.714413]  [<ffffffff8102c61d>] ? native_safe_halt+0xb/0xd
[1029266.720270]  [<ffffffff81011fac>] ? need_resched+0x23/0x2d
[1029266.725937]  [<ffffffff810120fa>] default_idle+0x4e/0x86
[1029266.731430]  [<ffffffff8100932a>] cpu_idle+0xaa/0xcc
[1029266.736577]  [<ffffffff81471cce>] rest_init+0x72/0x74
[1029266.741821]  [<ffffffff81b58c44>] start_kernel+0x3f3/0x3fe
[1029266.747487]  [<ffffffff81b582cb>] x86_64_start_reservations+0xb6/0xba
[1029266.754106]  [<ffffffff81b583d5>] x86_64_start_kernel+0x106/0x115
[1029266.760388] ---[ end trace 44734be3fa460008 ]---
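
For context, the "list_add corruption" message above comes from a consistency
check on the neighbours of the insertion point. The following is a simplified
user-space sketch of that kind of check, not the kernel's lib/list_debug.c
code; struct list_node and list_add_is_sane() are hypothetical names used
only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked, circular list node (user-space stand-in). */
struct list_node {
	struct list_node *next, *prev;
};

/*
 * Before inserting between `prev` and `next`, the two neighbours must
 * point at each other. If they do not, the list is already corrupted.
 */
bool list_add_is_sane(const struct list_node *prev,
		      const struct list_node *next)
{
	assert(prev != NULL && next != NULL);
	return next->prev == prev && prev->next == next;
}
```

In the report above, prev->next holds the same address as prev itself, so a
check of this kind fails: the entry points at itself instead of at the list
head.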

These continue with different traces, and different list functions,
until a BUG in inetpeer.c causes a panic:

[1029996.182197] ------------[ cut here ]------------
[1029996.187042] kernel BUG at net/ipv4/inetpeer.c:386!
[1029996.19200, threadinfo ffffffff81a00000, task ffffffff81a0b020)
[1029996.378121] Stack:
[1029996.380314]  ffff880172da9010 ffffffff81a7be98 ffff8800efc03cb0 000000013d5fe800
[1029996.387982]  00000000000927c0 ffff8800efc03ce0 ffff8800efc03ea0 ffffffff81a01fd8
[1029996.395644]  ffff8800efc03e40 ffffffff8140ff27 ffffffff81a7be90 0000000000000086
[1029996.403312] Call Trace:
[1029996.405946]  <IRQ>
[1029996.408253]  [<ffffffff8140ff27>] peer_check_expire+0x88/0x110
[1029996.414267]  [<ffffffff813de8e1>] ? __napi_schedule+0x48/0x4f
[1029996.420195]  [<ffffffff8123d047>] ? radix_tree_lookup+0xb/0xd
[1029996.426120]  [<ffffffff812e6be5>] ? add_interrupt_randomness+0x29/0x2e
[1029996.432830]  [<ffffffff810245b8>] ? apic_write+0x16/0x18
[1029996.438322]  [<ffffffff810245cf>] ? ack_APIC_irq+0x15/0x17
[1029996.443988]  [<ffffffff81025fb7>] ? ack_apic_level+0x61/0xf7
[1029996.449829]  [<ffffffff810b5951>] ? handle_fasteoi_irq+0xc9/0xd9
[1029996.456021]  [<ffffffff8106521c>] ? internal_add_timer+0xcf/0xd1
[1029996.462202]  [<ffffffff810652e5>] ? cascade+0x65/0x7f
[1029996.467436]  [<ffffffff810654c8>] run_timer_softirq+0x1c9/0x294
[1029996.473536]  [<ffffffff81489d93>] ? ret_from_intr+0x0/0x15
[1029996.479207]  [<ffffffff8140fe9f>] ? peer_check_expire+0x0/0x110
[1029996.485307]  [<ffffffff8105efaa>] __do_softirq+0xd2/0x19e
[1029996.490887]  [<ffffffff8107ff34>] ? tick_program_event+0x1f/0x21
[1029996.497072]  [<ffffffff8100bb5c>] call_softirq+0x1c/0x30
[1029996.502563]  [<ffffffff8100d287>] do_softirq+0x46/0x83
[1029996.507885]  [<ffffffff8105f132>] irq_exit+0x49/0x8b
[1029996.513043]  [<ffffffff8148fff3>] smp_apic_timer_interrupt+0x7e/0x8c
[1029996.519575]  [<ffffffff8100b613>] apic_timer_interrupt+0x13/0x20
[1029996.525758]  <EOI>
[1029996.528065]  [<ffffffff810b8394>] ? rcu_needs_cpu+0x10e/0x1bf
[1029996.533993]  [<ffffffff8102c61d>] ? native_safe_halt+0xb/0xd
[1029996.539834]  [<ffffffff81011fac>] ? need_resched+0x23/0x2d
[1029996.545498]  [<ffffffff810120fa>] default_idle+0x4e/0x86
[1029996.550994]  [<ffffffff8100932a>] cpu_idle+0xaa/0xcc
[1029996.556140]  [<ffffffff81471cce>] rest_init+0x72/0x74
[1029996.561377]  [<ffffffff81b58c44>] start_kernel+0x3f3/0x3fe
[1029996.567042]  [<ffffffff81b582cb>] x86_64_start_reservations+0xb6/0xba
[1029996.573663]  [<ffffffff81b583d5>] x86_64_start_kernel+0x106/0x115
[1029996.579935] Code: fd ff ff 85 c0 74 1f 49 8d 54 24 08 ff c0 49 0f
44 d4 49 89 55 00 4c 8b 22 49 83 c5 08 49 81 fc 70 dc 66 81 75 cf 49
39 dc 74 02 <0f> 0b 48 81 3b 70 dc 66
81 49 8d 75 f8 75 0d 49 8b 45 f8 48 8b
[1029996.600244] RIP  [<ffffffff8140fdca>] cleanup_once+0x117/0x1ec
[1029996.606284]  RSP <ffff8800efc03c90>
[1029996.610416] ---[ end trace 44734be3fa46001d ]---
[1029996.615210] Kernel panic - not syncing: Fatal exception in interrupt
[1029996.621746] Pid: 0, comm: swapper Tainted: G      D W
2.6.38.8-32.3.fix.fc14.x86_64 #1
[1029996.630013] Call Trace:
[1029996.632644]  <IRQ>  [<ffffffff81487898>] panic+0x91/0x1a4
[1029996.638251]  [<ffffffff8148ab64>] oops_end+0xb7/0xc7
[1029996.643401]  [<ffffffff8100e5f0>] die+0x5a/0x66
[1029996.648114]  [<ffffffff8148a448>] do_trap+0x121/0x130
[1029996.653347]  [<ffffffff8100bfed>] do_invalid_op+0x98/0xa1
[1029996.658927]  [<ffffffff8140fdca>] ? cleanup_once+0x117/0x1ec
[1029996.664768]  [<ffffffff81487a13>] ? printk+0x68/0x6d
[1029996.669914]  [<ffffffff8100e576>] ? show_trace+0x15/0x17
[1029996.675409]  [<ffffffff81b583d5>] ? x86_64_start_kernel+0x106/0x115
[1029996.681859]  [<ffffffff8100b8db>] invalid_op+0x1b/0x20
[1029996.687176]  [<ffffffff81410634>] ? ip_local_deliver_finish+0x0/0x1c1
[1029996.693798]  [<ffffffff8140fdca>] ? cleanup_once+0x117/0x1ec
[1029996.699636]  [<ffffffff8140ff27>] peer_check_expire+0x88/0x110
[1029996.705671]  [<ffffffff813de8e1>] ? __napi_schedule+0x48/0x4f
[1029996.711594]  [<ffffffff8123d047>] ? radix_tree_lookup+0xb/0xd
[1029996.717520]  [<ffffffff812e6be5>] ? add_interrupt_randomness+0x29/0x2e
[1029996.724230]  [<ffffffff810245b8>] ? apic_write+0x16/0x18
[1029996.729720]  [<ffffffff810245cf>] ? ack_APIC_irq+0x15/0x17
[1029996.735388]  [<ffffffff81025fb7>] ? ack_apic_level+0x61/0xf7
[1029996.741228]  [<ffffffff810b5951>] ? handle_fasteoi_irq+0xc9/0xd9
[1029996.747417]  [<ffffffff8106521c>] ? internal_add_timer+0xcf/0xd1
[1029996.753603]  [<ffffffff810652e5>] ? cascade+0x65/0x7f
[1029996.758837]  [<ffffffff810654c8>] run_timer_softirq+0x1c9/0x294
[1029996.764937]  [<ffffffff81489d93>] ? ret_from_intr+0x0/0x15
[1029996.770604]  [<ffffffff8140fe9f>] ? peer_check_expire+0x0/0x110
[1029996.776707]  [<ffffffff8105efaa>] __do_softirq+0xd2/0x19e
[1029996.782285]  [<ffffffff8107ff34>] ? tick_program_event+0x1f/0x21
[1029996.788478]  [<ffffffff8100bb5c>] call_softirq+0x1c/0x30
[1029996.793974]  [<ffffffff8100d287>] do_softirq+0x46/0x83
[1029996.799294]  [<ffffffff8105f132>] irq_exit+0x49/0x8b
[1029996.804442]  [<ffffffff8148fff3>] smp_apic_timer_interrupt+0x7e/0x8c
[1029996.810976]  [<ffffffff8100b613>] apic_timer_interrupt+0x13/0x20
[1029996.817159]  <EOI>  [<ffffffff810b8394>] ? rcu_needs_cpu+0x10e/0x1bf
[1029996.823721]  [<ffffffff8102c61d>] ? native_safe_halt+0xb/0xd
[1029996.829560]  [<ffffffff81011fac>] ? need_resched+0x23/0x2d
[1029996.835228]  [<ffffffff810120fa>] default_idle+0x4e/0x86
[1029996.840721]  [<ffffffff8100932a>] cpu_idle+0xaa/0xcc
[1029996.845868]  [<ffffffff81471cce>] rest_init+0x72/0x74
[1029996.851102]  [<ffffffff81b58c44>] start_kernel+0x3f3/0x3fe
[1029996.856771]  [<ffffffff81b582cb>] x86_64_start_reservations+0xb6/0xba
[1029996.863390]  [<ffffffff81b583d5>] x86_64_start_kernel+0x106/0x115

Any ideas how to troubleshoot this?

Thanks,
Bradley Peterson

^ permalink raw reply

* Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode
From: Ben Hutchings @ 2011-11-29 16:35 UTC (permalink / raw)
  To: Greg Rose
  Cc: Roopa Prabhu, netdev@vger.kernel.org, davem@davemloft.net,
	chrisw@redhat.com, sri@us.ibm.com, dragos.tatulea@gmail.com,
	kvm@vger.kernel.org, arnd@arndb.de, mst@redhat.com,
	mchan@broadcom.com, dwang2@cisco.com, shemminger@vyatta.com,
	eric.dumazet@gmail.com, kaber@trash.net, benve@cisco.com
In-Reply-To: <4ECA8D50.9080603@intel.com>

On Mon, 2011-11-21 at 09:41 -0800, Greg Rose wrote:
> On 11/18/2011 9:40 AM, Ben Hutchings wrote:
[...]
> > What concerns me is that this seems to be a workaround rather than a fix
> > for over-use of promiscuous mode, and it changes the semantics of
> > filtering modes in ways that haven't been well-specified.
> 
> I feel the opposite is true.  It allows a known set of receive filters 
> so that you don't have to use promiscuous mode, which cuts down on 
> overhead from processing packets the upper layer stack isn't really 
> interested in.
>
> >
> > What if there's a software bridge between two net devices corresponding
> > to separate physical ports, so that they really need to be promiscuous?
> > What if the administrator runs tcpdump and really wants the (PF) net
> > device to be promiscuous?
> 
> I don't believe there is anything in this patch set that removes 
> promiscuous mode operation as it is commonly used.  Perhaps I've missed 
> something.
[...]

Maybe I missed something!

Let's be clear on what our models are for filtering.  At the moment we
have MAC filters set through ndo_set_rx_mode and VF filters set through
ndo_set_vf_{mac,vlan}.

Ignoring anti-spoofing for the moment, should the currently defined
filters look like this (a):

                TX ^   | RX
                   |   v
+------------------+---+-----------------+
|                  |  ++------------+    |
|                  |  |RX MAC filter|    |
|                  |  ++------------+    |
|                  |   |match            |
|                  ^   v                 |
|                  |  ++------------+    |
|                  |  |RX VF filters|    |
|                  |  +-------+-----+    |
|                 /|\     no /|\         |
|                | | \ match/ | |match 2 |
|                | ^  \    /  v |        |
|                | |   \  /match|        |
|                |  \   \/  1/  |        |
|                |   \  /\  /   |        |
|                ^    \/  \/    v        |
|                |    /\  /\    |        |
|                |   /  ||  \   |        |
|                |  /   ||   \  |        |
|                | /    ||    \ |        |
|                ||     ||     ||        |
+----------------++-----++-----++--------+
                 ||     ||     ||
                 PF    VF 1   VF 2

or like this (b):

                TX ^   | RX
                   |   v
+------------------+---+-----------------+
|                  |  ++------------+    |
|                  |  |RX VF filters|    |
|                  |  ++--------+---+    |
|                  | no|match  /|        |
|                  ^   v      | |        |
|                  | +-+----+ | |        |
|                  | |RX MAC| | |        |
|                  | |filter| | |        |
|                  | +------+ | |        |
|                  |   |match | |        |
|                 /|\  |      | |        |
|                | | \ | match| |match 2 |
|                | ^  \/    1 v |        |
|                | |  /\      | |        |
|                |  \/  \    /  |        |
|                |  /\   \  /   |        |
|                ^ /  \   \/    v        |
|                ||    \  /\    |        |
|                ||     ||  \   |        |
|                ||     ||   \  |        |
|                ||     ||    \ |        |
|                ||     ||     ||        |
+----------------++-----++-----++--------+
                 ||     ||     ||
                 PF    VF 1   VF 2

I think the current model is (a); do you agree?
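
To make the difference between (a) and (b) concrete, the two diagrams can
be read as two orderings of the same pair of lookups. The sketch below
only illustrates the diagrams: the type and function names (nic_filters,
classify_a, classify_b) are hypothetical, two VFs are hard-coded to match
the pictures, and anti-spoofing is left out as above. It also assumes
that a frame matching no filter is dropped, which the diagrams do not
show explicitly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the diagrams; not taken from any driver. */
enum rx_target { RX_DROP, RX_PF, RX_VF1, RX_VF2 };

struct mac_addr { uint8_t b[6]; };

struct nic_filters {
	const struct mac_addr *rx_mac;	/* list set through ndo_set_rx_mode */
	size_t n_rx_mac;
	struct mac_addr vf_mac[2];	/* set through ndo_set_vf_mac */
};

static bool mac_eq(const struct mac_addr *a, const struct mac_addr *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* "RX MAC filter" box: does the destination match the PF's filter list? */
static bool rx_mac_match(const struct nic_filters *f,
			 const struct mac_addr *dst)
{
	for (size_t i = 0; i < f->n_rx_mac; i++)
		if (mac_eq(&f->rx_mac[i], dst))
			return true;
	return false;
}

/* "RX VF filters" box: returns RX_DROP to mean "no VF matched". */
static enum rx_target vf_match(const struct nic_filters *f,
			       const struct mac_addr *dst)
{
	if (mac_eq(&f->vf_mac[0], dst))
		return RX_VF1;
	if (mac_eq(&f->vf_mac[1], dst))
		return RX_VF2;
	return RX_DROP;
}

/* Model (a): MAC filter first, then VF filters; unmatched goes to the PF. */
enum rx_target classify_a(const struct nic_filters *f,
			  const struct mac_addr *dst)
{
	enum rx_target vf;

	if (!rx_mac_match(f, dst))
		return RX_DROP;
	vf = vf_match(f, dst);
	return vf != RX_DROP ? vf : RX_PF;
}

/* Model (b): VF filters first; only unmatched frames see the MAC filter. */
enum rx_target classify_b(const struct nic_filters *f,
			  const struct mac_addr *dst)
{
	enum rx_target vf = vf_match(f, dst);

	if (vf != RX_DROP)
		return vf;
	return rx_mac_match(f, dst) ? RX_PF : RX_DROP;
}
```

With the PF address as the only RX MAC filter entry, a frame addressed to
VF 1 is dropped under (a) but delivered to VF 1 under (b); that is the
observable difference between the two models.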

So is the proposed new model something like this (c):

                TX ^   | RX
                   |   v
+------------------+---+-----------------+
|                  |  ++------------+    |
|                  |  |RX MAC filter|    |
|                  ^  ++------------+    |
|                  |   |match            |
|          no match|   v                 |
|  +----------------+ ++------------+    |
|  |loopback filters| |RX VF filters|    |
|  +---------+-----++ +-------+-----+    |
|           /|\   /|\ match  /|\         |
|          v | `-+>+-+-.2   / | |        |
|           \ \  | |m \ \   / | |        |
|     match 0\ `-+-+.a \ \ /  v |        |
|             \  | | \t \ X   / |        |
|              \ |  \ \c X X /  |        |
|               \|\  \ \h \ X   |        |
|                \ \  \/\1 X \  v        |
|                ||   /\ |/ \ \ |        |
|                |v  /  ||   \ \|        |
|                || /   ^|    \ |        |
|                ||/    |v     \|        |
|                ||     ||     ||        |
+----------------++-----++-----++--------+
                 ||     ||     ||
                 PF    VF 1   VF 2

(I've labelled the new filters as loopback filters here, and I'm still
leaving out anti-spoofing.)

If not, please explain what the new model *is*.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Trond Myklebust @ 2011-11-29 16:40 UTC (permalink / raw)
  To: Peng Tao
  Cc: tao.peng, skinsbursky, linux-nfs, xemul, neilb, netdev,
	linux-kernel, jbottomley, bfields, davem, devel
In-Reply-To: <CA+a=Yy6kJe0NemSiRwM+t8gz5-UF5fOdwdS2so+q=nSYjBx--w@mail.gmail.com>

On Tue, 2011-11-29 at 23:30 +0800, Peng Tao wrote: 
> On Tue, Nov 29, 2011 at 11:18 PM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
> > On Tue, 2011-11-29 at 23:10 +0800, Peng Tao wrote:
> >> On Tue, Nov 29, 2011 at 9:35 PM, Myklebust, Trond
> >> <Trond.Myklebust@netapp.com> wrote:
> >> >> -----Original Message-----
> >> >> From: tao.peng@emc.com [mailto:tao.peng@emc.com]
> >> >> Sent: Tuesday, November 29, 2011 7:40 AM
> >> >> To: skinsbursky@parallels.com
> >> >> Cc: Myklebust, Trond; linux-nfs@vger.kernel.org; xemul@parallels.com;
> >> >> neilb@suse.de; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> >> jbottomley@parallels.com; bfields@fieldses.org; davem@davemloft.net;
> >> >> devel@openvz.org
> >> >> Subject: RE: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> >> from blocklayout routines
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Stanislav Kinsbursky [mailto:skinsbursky@parallels.com]
> >> >> > Sent: Tuesday, November 29, 2011 8:19 PM
> >> >> > To: Peng, Tao
> >> >> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Pavel
> >> >> > Emelianov; neilb@suse.de; netdev@vger.kernel.org;
> >> >> > linux-kernel@vger.kernel.org; James Bottomley; bfields@fieldses.org;
> >> >> > davem@davemloft.net; devel@openvz.org
> >> >> > Subject: Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> >> > from blocklayout routines
> >> >> >
> >> >> > 29.11.2011 16:00, tao.peng@emc.com wrote:
> >> >> > >> -----Original Message-----
> >> >> > >> From: linux-nfs-owner@vger.kernel.org
> >> >> > >> [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of
> >> >> > >> Stanislav Kinsbursky
> >> >> > >> Sent: Tuesday, November 29, 2011 6:11 PM
> >> >> > >> To: Trond.Myklebust@netapp.com
> >> >> > >> Cc: linux-nfs@vger.kernel.org; xemul@parallels.com; neilb@suse.de;
> >> >> > >> netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> >> > >> jbottomley@parallels.com; bfields@fieldses.org;
> >> >> > >> davem@davemloft.net; devel@openvz.org
> >> >> > >> Subject: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference
> >> >> > >> from blocklayout routines
> >> >> > >>
> >> >> > >> This is a cleanup patch. We don't need this reference anymore,
> >> >> > >> because blocklayout pipe dentries are now created and destroyed in
> >> >> > >> per-net operations and on PipeFS mount/umount notification.
> >> >> > >> Note that nfs4blocklayout_register_net() now returns 0 instead of
> >> >> > >> -ENOENT when the PipeFS superblock is absent. This is OK, because
> >> >> > >> the blocklayout pipe dentry will be created on the PipeFS mount event.
> >> >> > > When is the "pipefs mount event" going to happen? When inserting
> >> >> > > kernel modules, or when the user issues a mount command?
> >> >> > >
> >> >> >
> >> >> > When the user issues a mount command.
> >> >> > Kernel mounts of PipeFS have been removed by the patch sets
> >> >> > I've already sent.
> >> >> Then it is going to break the blocklayout user-space program blkmapd,
> >> >> which is started before any file system is mounted and tries to open
> >> >> the pipe file at startup.
> >> >
> >> > Why on earth is blkmapd doing this instead of listening for file creation notifications like the other rpc_pipefs daemons do?
> >> Not sure why the original implementer chose this, but I think it is
> >> likely because we did not expect the pipe file to be created or deleted
> >> dynamically.
> >
> > Unless blkmapd can pin the sunrpc module (which it shouldn't be able to)
> > then that assumption would be wrong. Please look into fixing blkmapd...
> Sorry, I don't quite get it. Do you mean the sunrpc module may be removed
> while the nfs/blocklayout modules are still in use? Please explain it a
> bit. Thanks.

I mean that I'm perfectly entitled to do

'modprobe -r blocklayoutdriver'

and when I do that, then I expect blkmapd to close the rpc pipe and wait
for a new one to be created just like rpc.idmapd and rpc.gssd do when I
remove the nfs and sunrpc modules.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply

* Re: [PATCH v4 0/10] bql: Byte Queue Limits
From: Ben Hutchings @ 2011-11-29 16:41 UTC (permalink / raw)
  To: Dave Taht; +Cc: Eric Dumazet, Tom Herbert, davem, netdev
In-Reply-To: <CAA93jw5UG4=QRN3Wnh82wRg8YCSV7vDqGp0HyeVxsihUwLuioQ@mail.gmail.com>

On Tue, 2011-11-29 at 17:06 +0100, Dave Taht wrote:
> On Tue, Nov 29, 2011 at 3:29 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tuesday 29 November 2011 at 14:24 +0000, Ben Hutchings wrote:
> >
> >> Not if you separate hardware queues by priority (and your high priority
> >> packets are non-TCP or PuSHed).
> >
> > I mostly have tg3 and bnx2 cards, mono queues...
> >
> > I presume Dave, working on small WiFi/ADSL routers, has the same kind
> > of hardware.
> 
> Nothing but mono queues here on wired - 4 queues on wireless, however.
> 
> My focus is on trying to make sure the 10Gig guys don't swamp the
> 128Kbit to 100Mbit guys; everything in that bandwidth range is what I
> care about, mostly against GigE servers...

I'm not objecting to that, just the assertion that TSO can be a problem
even on 10G hardware.  In fact it makes a big improvement to CPU
efficiency (even if you do it in the driver, it can be better than GSO)
and almost all 10G hardware has multiple queues which can be used to
avoid the latency penalty.
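
For a sense of scale, the latency penalty is roughly the time it takes
to serialize one TSO burst that is queued ahead of a high-priority
packet. The helper below is a back-of-the-envelope sketch; the function
name is hypothetical, and the 64 KiB burst size used in the note after
it is an assumption rather than a figure from this thread.

```c
#include <stdint.h>

/*
 * Time in microseconds needed to put `bytes` on the wire at
 * `bits_per_sec`, ignoring framing overhead (integer division,
 * so the result is rounded down).
 */
uint64_t serialization_delay_us(uint64_t bytes, uint64_t bits_per_sec)
{
	return bytes * 8 * 1000000 / bits_per_sec;
}
```

For a 65536-byte burst this gives about 52 us at 10 Gbit/s, about 5.2 ms
at 100 Mbit/s, and about 4.1 s at 128 kbit/s, which is why the same
burst size matters far more at the low end of that range.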

> ( I'm still waiting on some 10Gig hw donations to arrive)
[...]

If you have a proposal to do interesting things with 10G hardware and
drivers then I can forward it for consideration here.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: J. Bruce Fields @ 2011-11-29 16:42 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peng Tao, tao.peng, skinsbursky, linux-nfs, xemul, neilb, netdev,
	linux-kernel, jbottomley, davem, devel
In-Reply-To: <1322584830.4174.16.camel-SyLVLa/KEI9HwK5hSS5vWB2eb7JE58TQ@public.gmane.org>

On Tue, Nov 29, 2011 at 11:40:30AM -0500, Trond Myklebust wrote:
> I mean that I'm perfectly entitled to do
> 
> 'modprobe -r blocklayoutdriver'
> 
> and when I do that, then I expect blkmapd to close the rpc pipe and wait
> for a new one to be created just like rpc.idmapd and rpc.gssd do when I
> remove the nfs and sunrpc modules.

The rpc pipefs mount doesn't hold a reference on the sunrpc module?

--b.

^ permalink raw reply

