Netdev List


* sky2 still badly broken
From: Niccolò Belli @ 2012-04-27 20:15 UTC (permalink / raw)
  To: netdev

dmesg is full of

[ 1464.914044] sky2 0000:06:00.0: eth0: rx error, status 0x5220002 length 1314
[ 1465.005628] sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 532
[ 1465.204459] sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 532
[ 1465.825909] sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 532
[ 1468.715858] net_ratelimit: 3 callbacks suppressed
[ 1468.715865] sky2 0000:06:00.0: eth0: rx error, status 0x7ffc0001 length 532

DHCP does not work (after a few minutes it fails to renew the IP address).

Under heavy load the card resets and I have to unload and reload the
module.

The transfer rate is also very low.

Lowering the MTU helps a bit, but it does not work miracles.

kernel 3.4.0-rc4

06:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8055 PCI-E Gigabit Ethernet Controller (rev 13)

http://forums.gentoo.org/viewtopic-t-487018-start-0-postdays-0-postorder-asc-highlight-.html

Thanks,
Niccolò


* Re: [PATCH 2/2] net/ieee802154/6lowpan.c:  reuse eth_mac_addr()
From: Tony Cheneau @ 2012-04-27 20:21 UTC (permalink / raw)
  To: Danny Kukawka
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Danny Kukawka,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	David S. Miller
In-Reply-To: <1329914199-30862-3-git-send-email-danny.kukawka-2YacvwyR+KOzQB+pC5nmwQ@public.gmane.org>

Hello,

I believe the following patch should not have been applied and should 
now be reverted.
The reason is that eth_mac_addr() is not functionally equivalent to the
lowpan_set_address() it replaces:
- lowpan_set_address() copies dev->addr_len bytes, where dev->addr_len
is set to 8 bytes for IEEE802154 devices (this is an IEEE EUI-64
address). Hence, the function copies the full 8 bytes of the address.
- eth_mac_addr() copies ETH_ALEN bytes, where ETH_ALEN is set to 6.
Hence, the function copies only 6 bytes of the 8-byte address, so the
last 2 bytes of the address are never copied.
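
To make the length mismatch concrete, here is a minimal userspace sketch
(not kernel code). It assumes an 8-byte device address and mimics the two
copy lengths with plain memcpy(); set_addr(), stale_bytes() and EUI64_ALEN
are hypothetical names used only for illustration.

```c
#include <string.h>

#define ETH_ALEN   6   /* Ethernet address length, same value as in the kernel */
#define EUI64_ALEN 8   /* hypothetical name for the 8-byte IEEE EUI-64 length */

/* Copy the first `len` bytes of `src` into an 8-byte device address. */
static void set_addr(unsigned char dev_addr[EUI64_ALEN],
                     const unsigned char *src, size_t len)
{
    memcpy(dev_addr, src, len);
}

/* Count how many bytes of `dev_addr` still differ from the wanted address. */
static int stale_bytes(const unsigned char *dev_addr,
                       const unsigned char *want)
{
    int i, stale = 0;

    for (i = 0; i < EUI64_ALEN; i++)
        if (dev_addr[i] != want[i])
            stale++;
    return stale;
}
```

Starting from an all-zero device address, setting 02:11:22:33:44:55:66:77
with a length of ETH_ALEN leaves the last two bytes untouched, so
stale_bytes() reports 2; with a length of EUI64_ALEN it reports 0.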

Does this sound reasonable?

Regards,
Tony


On 22.02.2012 14:36, Danny Kukawka wrote:
> Use eth_mac_addr() for .ndo_set_mac_address, remove
> lowpan_set_address since it currently does the same as
> eth_mac_addr(). Additional advantage: eth_mac_addr() already
> checks if the given address is valid.
>
> Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
> ---
>  net/ieee802154/6lowpan.c |   16 ++--------------
>  1 files changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
> index e4ecc1e..3685158 100644
> --- a/net/ieee802154/6lowpan.c
> +++ b/net/ieee802154/6lowpan.c
> @@ -55,6 +55,7 @@
>  #include <linux/module.h>
>  #include <linux/moduleparam.h>
>  #include <linux/netdevice.h>
> +#include <linux/etherdevice.h>
>  #include <net/af_ieee802154.h>
>  #include <net/ieee802154.h>
>  #include <net/ieee802154_netdev.h>
> @@ -924,19 +925,6 @@ drop:
>  	return -EINVAL;
>  }
>
> -static int lowpan_set_address(struct net_device *dev, void *p)
> -{
> -	struct sockaddr *sa = p;
> -
> -	if (netif_running(dev))
> -		return -EBUSY;
> -
> -	/* TODO: validate addr */
> -	memcpy(dev->dev_addr, sa->sa_data, dev->addr_len);
> -
> -	return 0;
> -}
> -
>  static int lowpan_get_mac_header_length(struct sk_buff *skb)
>  {
>  	/*
> @@ -1062,7 +1050,7 @@ static struct header_ops lowpan_header_ops = {
>
>  static const struct net_device_ops lowpan_netdev_ops = {
>  	.ndo_start_xmit		= lowpan_xmit,
> -	.ndo_set_mac_address	= lowpan_set_address,
> +	.ndo_set_mac_address	= eth_mac_addr,
>  };
>
>  static void lowpan_setup(struct net_device *dev)



* Re: [PATCH RFC 1/2] netfilter: conntrack: remove RCU usage in conntrack notifier
From: Benjamin Poirier @ 2012-04-27 20:14 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	Andrew Morton, Mike Frysinger, Arun Sharma, netfilter-devel,
	netfilter, coreteam, linux-kernel, Paul E. McKenney
In-Reply-To: <1335553131.2775.252.camel@edumazet-glaptop>

On 2012/04/27 20:58, Eric Dumazet wrote:
> On Fri, 2012-04-27 at 14:28 -0400, Benjamin Poirier wrote:
> > I think that the rcu usage in this code is pointless. It should either be
> > removed or, if it was intended to protect against something, it ought to make
> > that clear.
> > 
> > 1) The code does not make use of the deferred deletion/wait for completion rcu
> > api (ie. synchronize_rcu(), call_rcu()).
> > 2) It does not benefit from the barriers implied by the rcu primitives used.
> > The code deals with callback pointers. There's no need to order writes to the
> > function code (!) before writes to the function pointers here.
> > ---
> 
> At a first glance, this seems pretty wrong.
> 
> code can disappear under you, thats for sure.
> 
> CONFIG_NF_CT_NETLINK=m
> 
> hint : module unload contains an rcu barrier.
> 

Thank you Eric. I had indeed failed to consider module load/unload
cases, which are effectively "writes to function code".


* [PATCH v2 2/2] drop_monitor: Make updating data->skb smp safe
From: Neil Horman @ 2012-04-27 20:11 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, David Miller
In-Reply-To: <1335557509-32726-1-git-send-email-nhorman@tuxdriver.com>

Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
its smp protections.  Specifically, it's possible to replace data->skb while it's
being written.  This patch corrects that by making data->skb an rcu protected
variable.  That will prevent it from being overwritten while a tracepoint is
modifying it.
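
As a rough userspace analogy of the reader/writer split described above
(not the kernel RCU API), the sketch below uses C11 atomics: the reader
loads the shared pointer once into a local and tolerates NULL, and the
writer publishes a replacement with a single atomic exchange. The names
alert_buf, cur_buf, record_drop() and swap_buf() are hypothetical, and the
grace period that synchronize_rcu() provides in the patch is deliberately
not modeled here.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_POINTS 4

/* Stand-in for the netlink alert message being filled in. */
struct alert_buf {
    int entries;
    long points[MAX_POINTS];
};

/* Shared pointer, analogous to data->skb in the patch. */
static _Atomic(struct alert_buf *) cur_buf;

/* Reader side: take one snapshot of the pointer and only use that. */
static int record_drop(long location)
{
    struct alert_buf *b =
        atomic_load_explicit(&cur_buf, memory_order_acquire);

    if (b == NULL || b->entries >= MAX_POINTS)
        return 0;               /* no buffer, or buffer full: discard */
    b->points[b->entries++] = location;
    return 1;
}

/* Writer side: publish `fresh` and hand back the previous buffer. */
static struct alert_buf *swap_buf(struct alert_buf *fresh)
{
    return atomic_exchange_explicit(&cur_buf, fresh,
                                    memory_order_acq_rel);
}
```

In the real patch the writer must also wait for in-flight readers before
shipping the old buffer; that waiting step is what synchronize_rcu() adds
and what this sketch leaves out.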

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
---
 net/core/drop_monitor.c |   70 ++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index a221a5b..4e04cf6 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -46,7 +46,7 @@ static DEFINE_MUTEX(trace_state_mutex);
 
 struct per_cpu_dm_data {
 	struct work_struct dm_alert_work;
-	struct sk_buff *skb;
+	struct sk_buff __rcu *skb;
 	atomic_t dm_hit_count;
 	struct timer_list send_timer;
 };
@@ -73,35 +73,58 @@ static int dm_hit_limit = 64;
 static int dm_delay = 1;
 static unsigned long dm_hw_check_delta = 2*HZ;
 static LIST_HEAD(hw_stats_list);
+static int initalized = 0;
 
 static void reset_per_cpu_data(struct per_cpu_dm_data *data)
 {
 	size_t al;
 	struct net_dm_alert_msg *msg;
 	struct nlattr *nla;
+	struct sk_buff *skb;
+	struct sk_buff *oskb = rcu_dereference_protected(data->skb, 1);
 
 	al = sizeof(struct net_dm_alert_msg);
 	al += dm_hit_limit * sizeof(struct net_dm_drop_point);
 	al += sizeof(struct nlattr);
 
-	data->skb = genlmsg_new(al, GFP_KERNEL);
-	genlmsg_put(data->skb, 0, 0, &net_drop_monitor_family,
-			0, NET_DM_CMD_ALERT);
-	nla = nla_reserve(data->skb, NLA_UNSPEC, sizeof(struct net_dm_alert_msg));
-	msg = nla_data(nla);
-	memset(msg, 0, al);
-	atomic_set(&data->dm_hit_count, dm_hit_limit);
+	skb = genlmsg_new(al, GFP_KERNEL);
+
+	if (skb) {
+		genlmsg_put(skb, 0, 0, &net_drop_monitor_family,
+				0, NET_DM_CMD_ALERT);
+		nla = nla_reserve(skb, NLA_UNSPEC,
+				  sizeof(struct net_dm_alert_msg));
+		msg = nla_data(nla);
+		memset(msg, 0, al);
+	} else if (initalized)
+		schedule_work_on(smp_processor_id(), &data->dm_alert_work);
+
+	/*
+	 * Don't need to lock this, since we are guaranteed to only
+	 * run this on a single cpu at a time.
+	 * Note also that we only update data->skb if the old and new skb
+	 * pointers don't match.  This ensures that we don't continually call
+	 * synchronize_rcu if we repeatedly fail to alloc a new netlink message.
+	 */
+	if (skb != oskb) {
+		rcu_assign_pointer(data->skb, skb);
+
+		synchronize_rcu();
+
+		atomic_set(&data->dm_hit_count, dm_hit_limit);
+	}
+
 }
 
 static void send_dm_alert(struct work_struct *unused)
 {
 	struct sk_buff *skb;
-	struct per_cpu_dm_data *data = &__get_cpu_var(dm_cpu_data);
+	struct per_cpu_dm_data *data = &get_cpu_var(dm_cpu_data);
 
 	/*
 	 * Grab the skb we're about to send
 	 */
-	skb = data->skb;
+	skb = rcu_dereference_protected(data->skb, 1);
 
 	/*
 	 * Replace it with a new one
@@ -111,8 +134,10 @@ static void send_dm_alert(struct work_struct *unused)
 	/*
 	 * Ship it!
 	 */
-	genlmsg_multicast(skb, 0, NET_DM_GRP_ALERT, GFP_KERNEL);
+	if (skb)
+		genlmsg_multicast(skb, 0, NET_DM_GRP_ALERT, GFP_KERNEL);
 
+	put_cpu_var(dm_cpu_data);
 }
 
 /*
@@ -123,9 +148,11 @@ static void send_dm_alert(struct work_struct *unused)
  */
 static void sched_send_work(unsigned long unused)
 {
-	struct per_cpu_dm_data *data =  &__get_cpu_var(dm_cpu_data);
+	struct per_cpu_dm_data *data =  &get_cpu_var(dm_cpu_data);
+
+	schedule_work_on(smp_processor_id(), &data->dm_alert_work);
 
-	schedule_work(&data->dm_alert_work);
+	put_cpu_var(dm_cpu_data);
 }
 
 static void trace_drop_common(struct sk_buff *skb, void *location)
@@ -134,9 +161,16 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
 	struct nlmsghdr *nlh;
 	struct nlattr *nla;
 	int i;
-	struct per_cpu_dm_data *data = &__get_cpu_var(dm_cpu_data);
+	struct sk_buff *dskb;
+	struct per_cpu_dm_data *data = &get_cpu_var(dm_cpu_data);
 
 
+	rcu_read_lock();
+	dskb = rcu_dereference(data->skb);
+
+	if (!dskb)
+		goto out;
+
 	if (!atomic_add_unless(&data->dm_hit_count, -1, 0)) {
 		/*
 		 * we're already at zero, discard this hit
@@ -144,7 +178,7 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
 		goto out;
 	}
 
-	nlh = (struct nlmsghdr *)data->skb->data;
+	nlh = (struct nlmsghdr *)dskb->data;
 	nla = genlmsg_data(nlmsg_data(nlh));
 	msg = nla_data(nla);
 	for (i = 0; i < msg->entries; i++) {
@@ -158,7 +192,7 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
 	/*
 	 * We need to create a new entry
 	 */
-	__nla_reserve_nohdr(data->skb, sizeof(struct net_dm_drop_point));
+	__nla_reserve_nohdr(dskb, sizeof(struct net_dm_drop_point));
 	nla->nla_len += NLA_ALIGN(sizeof(struct net_dm_drop_point));
 	memcpy(msg->points[msg->entries].pc, &location, sizeof(void *));
 	msg->points[msg->entries].count = 1;
@@ -170,6 +204,8 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
 	}
 
 out:
+	rcu_read_unlock();
+	put_cpu_var(dm_cpu_data);
 	return;
 }
 
@@ -375,6 +411,8 @@ static int __init init_net_drop_monitor(void)
 		data->send_timer.function = sched_send_work;
 	}
 
+	initalized = 1;
+
 	goto out;
 
 out_unreg:
-- 
1.7.7.6


* [PATCH v2 1/2] drop_monitor: fix sleeping in invalid context warning
From: Neil Horman @ 2012-04-27 20:11 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, David Miller
In-Reply-To: <1335557509-32726-1-git-send-email-nhorman@tuxdriver.com>

Eric Dumazet pointed out this warning in the drop_monitor protocol to me:

[   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
[   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
[   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
[   38.352582] Call Trace:
[   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
[   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
[   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
[   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
[   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
[   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
[   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
[   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
[   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
[   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
[   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
[   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
[   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
[   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
[   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
[   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
[   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
[   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
[   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
[   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
[   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
[   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
[   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b

It stems from holding a spinlock (trace_state_lock) while attempting to register
or unregister tracepoint hooks. That makes in_atomic() true in this context and
triggers the warning when the tracepoint code calls might_sleep() while taking a
mutex. Since we only use the trace_state_lock to prevent trace protocol state
races, as well as hardware stat list updates on an rcu write side, we can just
convert the spinlock to a mutex to avoid this problem.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
---
 net/core/drop_monitor.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 5c3c81a..a221a5b 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -42,7 +42,7 @@ static void send_dm_alert(struct work_struct *unused);
  * netlink alerts
  */
 static int trace_state = TRACE_OFF;
-static DEFINE_SPINLOCK(trace_state_lock);
+static DEFINE_MUTEX(trace_state_mutex);
 
 struct per_cpu_dm_data {
 	struct work_struct dm_alert_work;
@@ -214,7 +214,7 @@ static int set_all_monitor_traces(int state)
 	struct dm_hw_stat_delta *new_stat = NULL;
 	struct dm_hw_stat_delta *temp;
 
-	spin_lock(&trace_state_lock);
+	mutex_lock(&trace_state_mutex);
 
 	if (state == trace_state) {
 		rc = -EAGAIN;
@@ -253,7 +253,7 @@ static int set_all_monitor_traces(int state)
 		rc = -EINPROGRESS;
 
 out_unlock:
-	spin_unlock(&trace_state_lock);
+	mutex_unlock(&trace_state_mutex);
 
 	return rc;
 }
@@ -296,12 +296,12 @@ static int dropmon_net_event(struct notifier_block *ev_block,
 
 		new_stat->dev = dev;
 		new_stat->last_rx = jiffies;
-		spin_lock(&trace_state_lock);
+		mutex_lock(&trace_state_mutex);
 		list_add_rcu(&new_stat->list, &hw_stats_list);
-		spin_unlock(&trace_state_lock);
+		mutex_unlock(&trace_state_mutex);
 		break;
 	case NETDEV_UNREGISTER:
-		spin_lock(&trace_state_lock);
+		mutex_lock(&trace_state_mutex);
 		list_for_each_entry_safe(new_stat, tmp, &hw_stats_list, list) {
 			if (new_stat->dev == dev) {
 				new_stat->dev = NULL;
@@ -312,7 +312,7 @@ static int dropmon_net_event(struct notifier_block *ev_block,
 				}
 			}
 		}
-		spin_unlock(&trace_state_lock);
+		mutex_unlock(&trace_state_mutex);
 		break;
 	}
 out:
-- 
1.7.7.6


* [PATCH v2 0/2] drop_monitor: misc fixes for some recently reported bugs
From: Neil Horman @ 2012-04-27 20:11 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Eric Dumazet, David Miller
In-Reply-To: <1335466022-32661-1-git-send-email-nhorman@tuxdriver.com>

Hey-
        Eric was using dropwatch recently and reported a few bugs to me that he
had noted.  This short series should fix them all up.

Change Notes:
V2)
	renamed trace_state_lock to trace_state_mutex
	cleaned up rcu write side access to use rcu_dereference_protected
	handled some NULL allocation failure cases


Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>


* Re: [PATCH] e1000e:  MSI interrupt test failed, using legacy interrupt
From: Prasanna Panchamukhi @ 2012-04-27 19:52 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: netdev, bruce.w.allan, jesse.brandeburg, e1000-devel, gospo,
	sassmann
In-Reply-To: <1335490840.4056.137.camel@deadeye>

On 04/26/2012 06:40 PM, Ben Hutchings wrote:
> On Thu, 2012-04-26 at 17:05 -0700, prasanna.panchamukhi@riverbed.com
> wrote:
>> From: Prasanna S. Panchamukhi <ppanchamukhi@riverbed.com>
>>
>> The following logs were seen on systems with multiple NICs & ports
>> while using MSI interrupts, as shown below:
> [...]
>> This patch changes the IRQ tests to use polling loops starting with a
>> delay of 1 tick and doubling that if necessary up to a maximum total
>> delay of approximately 1 second.
> [...]
>
> I don't think you entirely understood my point.  This exponential
> back-off was the best I could come up with for a self-test
> (ethtool -t) where we have to answer the question 'are interrupts
> working' in a reasonable time, and the user can decide for themselves
> whether this was a false negative.
>
> If you want a definite answer in order to make an automatic decision
> about interrupt modes, there really is no valid time limit.  I think the
This issue is not about interrupt mode selection.

-Prasanna
> proper way to do interrupt mode selection in an in-tree driver is to
> rely on no-MSI quirks for chipsets and devices in the PCI core.
>
> Ben.
>
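
The back-off quoted above (start at 1 tick, double each time, give up once
the total delay would pass a cap) can be sketched as follows. This is an
illustration only, not the e1000e code: poll_with_backoff(), irq_seen_fn and
seen_after() are hypothetical names, and sleeping is replaced by simply
adding to an elapsed-tick counter.

```c
/* Callback reporting whether the interrupt was observed yet. */
typedef int (*irq_seen_fn)(long elapsed_ticks, void *ctx);

/*
 * Poll with delays of 1, 2, 4, ... ticks.
 * Returns the total ticks waited when `seen` succeeds, or -1 once the
 * next delay would push the total past `max_total_ticks`.
 */
static long poll_with_backoff(irq_seen_fn seen, void *ctx,
                              long max_total_ticks)
{
    long delay = 1;
    long elapsed = 0;

    while (elapsed + delay <= max_total_ticks) {
        elapsed += delay;       /* stand-in for sleeping `delay` ticks */
        if (seen(elapsed, ctx))
            return elapsed;
        delay *= 2;             /* exponential back-off */
    }
    return -1;
}

/* Example callback: the "interrupt" shows up after *ctx ticks. */
static int seen_after(long elapsed_ticks, void *ctx)
{
    return elapsed_ticks >= *(long *)ctx;
}
```

With a cap of 1000 ticks, an event that shows up after 5 ticks is detected
at the third poll (1 + 2 + 4 = 7 ticks elapsed), while an event that never
shows up is abandoned after 511 ticks because the next 512-tick delay would
exceed the cap.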



* Re: [PATCH 3/4] bnx2x: add transmit timestamping support
From: Eilon Greenstein @ 2012-04-27 19:30 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: davem, netdev, jeffrey.t.kirscher, aabdulla, meravs
In-Reply-To: <1335553447-11964-4-git-send-email-willemb@google.com>

On Fri, 2012-04-27 at 15:04 -0400, Willem de Bruijn wrote:
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Acked-by: Eilon Greenstein <eilong@broadcom.com>

Thanks Willem - Merav was just testing exactly the same change, so far
it seems to be working on our side as well :)


* [PATCH 4/4] forcedeth: add transmit timestamping support
From: Willem de Bruijn @ 2012-04-27 19:04 UTC (permalink / raw)
  To: davem, netdev, jeffrey.t.kirscher, eilong, aabdulla; +Cc: Willem de Bruijn
In-Reply-To: <1335553447-11964-1-git-send-email-willemb@google.com>

Insert an skb_tx_timestamp call in both ndo_start_xmit routines.
Tested to work for the nv_start_xmit_optimized case.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index d93a088..928913c 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -2279,6 +2279,8 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	netdev_sent_queue(np->dev, skb->len);
 
+	skb_tx_timestamp(skb);
+
 	np->put_tx.orig = put_tx;
 
 	spin_unlock_irqrestore(&np->lock, flags);
@@ -2426,6 +2428,8 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 
 	netdev_sent_queue(np->dev, skb->len);
 
+	skb_tx_timestamp(skb);
+
 	np->put_tx.ex = put_tx;
 
 	spin_unlock_irqrestore(&np->lock, flags);
-- 
1.7.7.3


* [PATCH 2/4] e1000e: add transmit timestamping support
From: Willem de Bruijn @ 2012-04-27 19:04 UTC (permalink / raw)
  To: davem, netdev, jeffrey.t.kirscher, eilong, aabdulla; +Cc: Willem de Bruijn
In-Reply-To: <1335553447-11964-1-git-send-email-willemb@google.com>

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 851f793..c5d9ad7 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5150,6 +5150,8 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 	/* if count is 0 then mapping error has occurred */
 	count = e1000_tx_map(tx_ring, skb, first, max_per_txd, nr_frags, mss);
 	if (count) {
+		skb_tx_timestamp(skb);
+
 		netdev_sent_queue(netdev, skb->len);
 		e1000_tx_queue(tx_ring, tx_flags, count);
 		/* Make sure there is space in the ring for the next send. */
-- 
1.7.7.3


* [PATCH 3/4] bnx2x: add transmit timestamping support
From: Willem de Bruijn @ 2012-04-27 19:04 UTC (permalink / raw)
  To: davem, netdev, jeffrey.t.kirscher, eilong, aabdulla; +Cc: Willem de Bruijn
In-Reply-To: <1335553447-11964-1-git-send-email-willemb@google.com>

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 60d5b54..afa6cbb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -2953,6 +2953,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	netdev_tx_sent_queue(txq, skb->len);
 
+	skb_tx_timestamp(skb);
+
 	txdata->tx_pkt_prod++;
 	/*
 	 * Make sure that the BD data is updated before updating the producer
-- 
1.7.7.3


* [PATCH 1/4] e1000: add transmit timestamping support
From: Willem de Bruijn @ 2012-04-27 19:04 UTC (permalink / raw)
  To: davem, netdev, jeffrey.t.kirscher, eilong, aabdulla; +Cc: Willem de Bruijn
In-Reply-To: <1335553447-11964-1-git-send-email-willemb@google.com>

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 3d712f2..40438ea 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3261,6 +3261,8 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 	                     nr_frags, mss);
 
 	if (count) {
+		skb_tx_timestamp(skb);
+
 		e1000_tx_queue(adapter, tx_ring, tx_flags, count);
 		/* Make sure there is space in the ring for the next send. */
 		e1000_maybe_stop_tx(netdev, tx_ring, MAX_SKB_FRAGS + 2);
-- 
1.7.7.3


* [PATCH 0/4 net-next] another transmit timestamping driver set
From: Willem de Bruijn @ 2012-04-27 19:04 UTC (permalink / raw)
  To: davem, netdev, jeffrey.t.kirscher, eilong, aabdulla

This series adds skb_tx_timestamp calls into the ndo_start_xmit
implementations of a few more drivers. All calls are added before
increasing the txqueue write offset, to avoid the race conditions
with the completion handler experienced earlier.
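
The ordering constraint can be illustrated with a toy transmit ring (a
sketch, not driver code; tx_ring, tx_slot, xmit() and complete_tx() are
hypothetical names). The packet is stamped while the slot is still owned by
the transmit path, and only then is the producer index advanced, since a
completion handler may reclaim any slot below that index.

```c
#define RING_SIZE 4

struct tx_slot {
    int in_use;
    long tstamp;
};

struct tx_ring {
    struct tx_slot slot[RING_SIZE];
    unsigned int prod;      /* next slot the transmit path will fill */
    unsigned int cons;      /* next slot the completion path will reclaim */
};

/* Transmit path: stamp first, publish second. */
static void xmit(struct tx_ring *r, long now)
{
    struct tx_slot *s = &r->slot[r->prod % RING_SIZE];

    s->in_use = 1;
    s->tstamp = now;        /* analogous to calling skb_tx_timestamp() */
    r->prod++;              /* from here on the slot may be reclaimed */
}

/* Completion path: reclaim every published slot, return how many. */
static unsigned int complete_tx(struct tx_ring *r)
{
    unsigned int n = 0;

    while (r->cons != r->prod) {
        r->slot[r->cons % RING_SIZE].in_use = 0;
        r->cons++;
        n++;
    }
    return n;
}
```

If the two statements in xmit() were swapped, a completion running between
them could release the slot before it was stamped, which is the kind of
race the series avoids by placing the call before the write offset is
increased.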

All changes have been tested with the txtimestamp test at
http://kernel.googlecode.com/files/txtimestamp.c before and after
applying the change (to a slightly different kernel, though) to
verify that it actually adds the missing functionality and
introduces no obvious bugs.

Apologies if I missed any driver maintainers.

Signed-off-by: Willem de Bruijn <willemb@google.com>


* Re: [PATCH RFC 1/2] netfilter: conntrack: remove RCU usage in conntrack notifier
From: Eric Dumazet @ 2012-04-27 18:58 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: netdev, Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	Andrew Morton, Mike Frysinger, Arun Sharma, netfilter-devel,
	netfilter, coreteam, linux-kernel, Paul E. McKenney
In-Reply-To: <1335551333-6103-1-git-send-email-bpoirier@suse.de>

On Fri, 2012-04-27 at 14:28 -0400, Benjamin Poirier wrote:
> I think that the rcu usage in this code is pointless. It should either be
> removed or, if it was intended to protect against something, it ought to make
> that clear.
> 
> 1) The code does not make use of the deferred deletion/wait for completion rcu
> api (ie. synchronize_rcu(), call_rcu()).
> 2) It does not benefit from the barriers implied by the rcu primitives used.
> The code deals with callback pointers. There's no need to order writes to the
> function code (!) before writes to the function pointers here.
> ---

At a first glance, this seems pretty wrong.

code can disappear under you, thats for sure.

CONFIG_NF_CT_NETLINK=m

hint : module unload contains an rcu barrier.





* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Lennert Buytenhek @ 2012-04-27 18:58 UTC (permalink / raw)
  To: sedat.dilek
  Cc: John W. Linville, linux-wireless, netdev, linux-kernel, lautriv,
	Jim Cromie
In-Reply-To: <CA+icZUV76wSBj88EpzmWbO=u2dP23yi5P98nN2KZehBUni9mEg@mail.gmail.com>

On Fri, Apr 27, 2012 at 03:29:26PM +0200, Sedat Dilek wrote:

> >>> >> >> >> Are you planning to or even working on support (for) 8361 devices?
> >>> >> >> >
> >>> >> >> > I don't have any such plans, and I don't know of anyone who does.
> >>> >> >>
> >>> >> >> Does 8361 require firmware support?
> >>> >> >
> >>> >> > What do you mean by that?
> >>> >>
> >>> >> IIRC ath5k for example needs no external firmware file.
> >>> >
> >>> > The 8361 requires firmware to be loaded into it before it'll do
> >>> > anything useful.
> >>> >
> >>> >
> >>> >> >> Does a firmware file exist (name?)?
> >>> >> >
> >>> >> > There's firmware for the 8361 out there, however, that version of
> >>> >> > the firmware implements a firmware API that is different from the
> >>> >> > one that mwl8k currently implements.
> >>> >> >
> >>> >> > You could add 8361 support to mwl8k, but then you'd have to go over
> >>> >> > all the firmware command invocations in mwl8k and make sure that they
> >>> >> > will work on the 8361 firmware that you're trying to support as well.
> >>> >>
> >>> >> Without having a 8361 this will be even harder to walk through.
> >>> >>
> >>> >> Anyway, thanks for your detailed explanations.
> >>> >>
> >>> >> What's the alternative for such affected users?
> >>> >> Use ndis-wrapper?
> >>> >
> >>> > I'm not sure.  I've never tried to get a 8361 work under Linux.
> >>>
> >>> Just found on [1] this same wrong patch in [2] :-).
> >>> Dunno if [3] worked and from where they have stolen fw-files.
> >>>
> >>> - Sedat -
> >>>
> >>> [1] https://dev.openwrt.org/ticket/7209
> >>> [2] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_fix_pci_id.patch
> >>> [3] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch
> >>
> >> I doubt that [3] is really all that's needed to make it work.  But if
> >> there's someone for whom it works, I'd like them to run some tests on
> >> mwl8k + [3] on 88w8361p.
> >
> > Just asked lautriv to do so (he will test against linux-3.3.3), let's see.
> > Also, I refreshed and adapted a bit the instructions (see attachment).
> >
> > - Sedat -
> 
> [ CC Jim Cromie ]
> 
> Now, with extracted firmware files renamed & copied to
> /lib/firmware/mwl8k/ and applied patch (see attachment) against
> Linux-3.4-rc4+ I am seeing with modinfo:
> 
> $ sudo modinfo mwl8k
> filename:       /lib/modules/3.4.0-rc4-5-generic/kernel/drivers/net/wireless/mwl8k.ko
> license:        GPL
> author:         Lennert Buytenhek <buytenh@marvell.com>
> version:        0.13
> description:    Marvell TOPDOG(R) 802.11 Wireless Network Driver
> firmware:       mwl8k/fmimage_8366_ap-2.fw
> firmware:       mwl8k/fmimage_8366.fw
> firmware:       mwl8k/helper_8366.fw
> firmware:       mwl8k/fmimage_8687.fw
> firmware:       mwl8k/helper_8687.fw
> firmware:       mwl8k/fmimage_8363.fw
> firmware:       mwl8k/helper_8363.fw
> firmware:       mwl8k/fmimage_8361p.fw
> firmware:       mwl8k/helper_8361p.fw
> srcversion:     9E1479A05C8D67E6AE90746
> alias:          pci:v000011ABd00002A43sv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A40sv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A30sv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A2Bsv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A24sv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A0Csv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A0Asv*sd*bc*sc*i*
> alias:          pci:v000011ABd00002A02sv*sd*bc*sc*i* <--- 8361P: 0x2a02 PCI device-id
> depends:        mac80211,cfg80211
> vermagic:       3.4.0-rc4-5-generic SMP mod_unload modversions
> parm:           ap_mode_default:Set to 1 to make ap mode the default instead of sta mode (bool)
> 
> $ ls -l /lib/firmware/mwl8k/
> total 456
> -rw-r--r-- 1 root root  75848 Apr 27 13:49 fmimage_8361p.fw <--- 8361P: Firmware image
> -rw-r--r-- 1 root root 101780 Mar 19 19:32 fmimage_8366_ap-1.fw
> -rw-r--r-- 1 root root 101976 Mar 19 19:32 fmimage_8366_ap-2.fw
> -rw-r--r-- 1 root root  96664 Mar 19 19:32 fmimage_8366.fw
> -rw-r--r-- 1 root root  73252 Feb 23 20:07 fmimage_8687.fw
> -rw-r--r-- 1 root root   2476 Apr 27 13:49 helper_8361p.fw <--- 8361P: Helper image
> -rw-r--r-- 1 root root   2476 Mar 19 19:32 helper_8366.fw
> -rw-r--r-- 1 root root   2476 Feb 23 20:07 helper_8687.fw
> 
> As said... ***compile-tested*** only here.

Which doesn't say much at all, but..


> I had a short query with lautriv on #linux-wireless this afternoon:
> * mwl8k kernel-module was autoloaded
> * wlan0 interface got fired up
> * ESSID was accepted
> * logs reported 802.11bgn support is active
> * (he tested on Linux-3.3.3)
> 
> With WPA/WPA2 lautriv had some problems as his installation was
> missing wpasupplicant.
> He set up a classic /etc/network/interfaces.
> After resetting his router mwl8k worked nicely.
> 
> Unfortunately, he was on the run and promised me to send logs and do
> more testing this evening.

..but this is pretty interesting.  I thought that the only available
8361p firmware used an incompatible firmware API, and I didn't know
of the existence of an apparently compatible firmware.


> So, Lennert if you want more testing - What? How? etc.

For one, the output of 'iw phy', please.

Also, does monitor mode work?  Do you get plausible channel/rxpower
values in tcpdump in monitor mode?

Are there any messages in the syslog about failing commands?


thanks,
Lennert

^ permalink raw reply

* [PATCH RFC 2/2] netfilter: conntrack: replace mutex with cmpxchg
From: Benjamin Poirier @ 2012-04-27 18:28 UTC (permalink / raw)
  To: netdev
  Cc: Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	Andrew Morton, Eric Dumazet, Mike Frysinger, Arun Sharma,
	netfilter-devel, netfilter, coreteam, linux-kernel,
	Paul E. McKenney
In-Reply-To: <1335551333-6103-1-git-send-email-bpoirier@suse.de>

This mutex protects a single pointer.
---
 net/netfilter/nf_conntrack_ecache.c |   38 +++++++++-------------------------
 1 files changed, 10 insertions(+), 28 deletions(-)

diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index 0134009..603eb69 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -25,8 +25,6 @@
 #include <net/netfilter/nf_conntrack_core.h>
 #include <net/netfilter/nf_conntrack_extend.h>
 
-static DEFINE_MUTEX(nf_ct_ecache_mutex);
-
 /* deliver cached events and clear cache entry - must be called with locally
  * disabled softirqs */
 void nf_ct_deliver_cached_events(struct nf_conn *ct)
@@ -80,52 +78,36 @@ EXPORT_SYMBOL_GPL(nf_ct_deliver_cached_events);
 int nf_conntrack_register_notifier(struct net *net,
 				   struct nf_ct_event_notifier *new)
 {
-	int ret = 0;
-
-	mutex_lock(&nf_ct_ecache_mutex);
-	if (net->ct.nf_conntrack_event_cb != NULL)
-		ret = -EBUSY;
+	if (cmpxchg(&net->ct.nf_conntrack_event_cb, NULL, new) != NULL)
+		return -EBUSY;
 	else
-		net->ct.nf_conntrack_event_cb = new;
-	mutex_unlock(&nf_ct_ecache_mutex);
-
-	return ret;
+		return 0;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_register_notifier);
 
 void nf_conntrack_unregister_notifier(struct net *net,
 				      struct nf_ct_event_notifier *new)
 {
-	mutex_lock(&nf_ct_ecache_mutex);
-	BUG_ON(net->ct.nf_conntrack_event_cb != new);
-	net->ct.nf_conntrack_event_cb = NULL;
-	mutex_unlock(&nf_ct_ecache_mutex);
+	if (xchg(&net->ct.nf_conntrack_event_cb, NULL) != new)
+		BUG();
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_unregister_notifier);
 
 int nf_ct_expect_register_notifier(struct net *net,
 				   struct nf_exp_event_notifier *new)
 {
-	int ret = 0;
-
-	mutex_lock(&nf_ct_ecache_mutex);
-	if (net->ct.nf_expect_event_cb != NULL)
-		ret = -EBUSY;
+	if (cmpxchg(&net->ct.nf_expect_event_cb, NULL, new) != NULL)
+		return -EBUSY;
 	else
-		net->ct.nf_expect_event_cb = new;
-	mutex_unlock(&nf_ct_ecache_mutex);
-
-	return ret;
+		return 0;
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_register_notifier);
 
 void nf_ct_expect_unregister_notifier(struct net *net,
 				      struct nf_exp_event_notifier *new)
 {
-	mutex_lock(&nf_ct_ecache_mutex);
-	BUG_ON(net->ct.nf_expect_event_cb != new);
-	net->ct.nf_expect_event_cb = NULL;
-	mutex_unlock(&nf_ct_ecache_mutex);
+	if (xchg(&net->ct.nf_expect_event_cb, NULL) != new)
+		BUG();
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_unregister_notifier);
 
-- 
1.7.7
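
For readers following the thread, here is a minimal userspace sketch of the
"single pointer guarded by compare-and-exchange" idea from the patch above.
It is not the kernel code: it uses C11 <stdatomic.h> instead of the kernel's
cmpxchg()/xchg(), and struct notifier, event_cb, register_notifier() and
unregister_notifier() are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nf_ct_event_notifier. */
struct notifier {
	int (*fcn)(unsigned int events);
};

/* Hypothetical stand-in for net->ct.nf_conntrack_event_cb. */
static _Atomic(struct notifier *) event_cb;

/* Install 'new' only if the slot is still empty (compare-and-exchange). */
static int register_notifier(struct notifier *new)
{
	struct notifier *expected = NULL;

	if (!atomic_compare_exchange_strong(&event_cb, &expected, new))
		return -EBUSY;	/* somebody else already registered */
	return 0;
}

/* Clear the slot and check that it held 'new' (unconditional exchange). */
static void unregister_notifier(struct notifier *new)
{
	struct notifier *old = atomic_exchange(&event_cb,
					       (struct notifier *)NULL);

	assert(old == new);	/* the patch calls BUG() at this point */
}
```

The sketch only covers the register/unregister side. Whether readers of the
pointer need additional protection is a separate question that this example
does not address.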


^ permalink raw reply related

* [PATCH RFC 1/2] netfilter: conntrack: remove RCU usage in conntrack notifier
From: Benjamin Poirier @ 2012-04-27 18:28 UTC (permalink / raw)
  To: netdev
  Cc: Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	Andrew Morton, Eric Dumazet, Mike Frysinger, Arun Sharma,
	netfilter-devel, netfilter, coreteam, linux-kernel,
	Paul E. McKenney

I think that the RCU usage in this code is pointless. It should either be
removed or, if it was intended to protect against something, the code ought
to make that clear.

1) The code does not make use of the deferred deletion/wait for completion RCU
API (i.e. synchronize_rcu(), call_rcu()).
2) It does not benefit from the barriers implied by the RCU primitives used.
The code deals with callback pointers. There's no need to order writes to the
function code (!) before writes to the function pointers here.
---
 include/net/netfilter/nf_conntrack_ecache.h |   20 +++------
 include/net/netns/conntrack.h               |    4 +-
 net/netfilter/nf_conntrack_ecache.c         |   58 +++++++-------------------
 3 files changed, 25 insertions(+), 57 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h
index a88fb69..ac5b8d2 100644
--- a/include/net/netfilter/nf_conntrack_ecache.h
+++ b/include/net/netfilter/nf_conntrack_ecache.h
@@ -99,14 +99,13 @@ nf_conntrack_eventmask_report(unsigned int eventmask,
 	struct nf_ct_event_notifier *notify;
 	struct nf_conntrack_ecache *e;
 
-	rcu_read_lock();
-	notify = rcu_dereference(net->ct.nf_conntrack_event_cb);
+	notify = net->ct.nf_conntrack_event_cb;
 	if (notify == NULL)
-		goto out_unlock;
+		return ret;
 
 	e = nf_ct_ecache_find(ct);
 	if (e == NULL)
-		goto out_unlock;
+		return ret;
 
 	if (nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct)) {
 		struct nf_ct_event item = {
@@ -118,7 +117,7 @@ nf_conntrack_eventmask_report(unsigned int eventmask,
 		unsigned long missed = e->pid ? 0 : e->missed;
 
 		if (!((eventmask | missed) & e->ctmask))
-			goto out_unlock;
+			return ret;
 
 		ret = notify->fcn(eventmask | missed, &item);
 		if (unlikely(ret < 0 || missed)) {
@@ -137,8 +136,6 @@ nf_conntrack_eventmask_report(unsigned int eventmask,
 			spin_unlock_bh(&ct->lock);
 		}
 	}
-out_unlock:
-	rcu_read_unlock();
 	return ret;
 }
 
@@ -178,14 +175,13 @@ nf_ct_expect_event_report(enum ip_conntrack_expect_events event,
 	struct nf_exp_event_notifier *notify;
 	struct nf_conntrack_ecache *e;
 
-	rcu_read_lock();
-	notify = rcu_dereference(net->ct.nf_expect_event_cb);
+	notify = net->ct.nf_expect_event_cb;
 	if (notify == NULL)
-		goto out_unlock;
+		return;
 
 	e = nf_ct_ecache_find(exp->master);
 	if (e == NULL)
-		goto out_unlock;
+		return;
 
 	if (e->expmask & (1 << event)) {
 		struct nf_exp_event item = {
@@ -195,8 +191,6 @@ nf_ct_expect_event_report(enum ip_conntrack_expect_events event,
 		};
 		notify->fcn(1 << event, &item);
 	}
-out_unlock:
-	rcu_read_unlock();
 }
 
 static inline void
diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
index 7a911ec..c96fd8c 100644
--- a/include/net/netns/conntrack.h
+++ b/include/net/netns/conntrack.h
@@ -18,8 +18,8 @@ struct netns_ct {
 	struct hlist_nulls_head	unconfirmed;
 	struct hlist_nulls_head	dying;
 	struct ip_conntrack_stat __percpu *stat;
-	struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb;
-	struct nf_exp_event_notifier __rcu *nf_expect_event_cb;
+	struct nf_ct_event_notifier *nf_conntrack_event_cb;
+	struct nf_exp_event_notifier *nf_expect_event_cb;
 	int			sysctl_events;
 	unsigned int		sysctl_events_retry_timeout;
 	int			sysctl_acct;
diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index b924f3a..0134009 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -38,19 +38,18 @@ void nf_ct_deliver_cached_events(struct nf_conn *ct)
 	struct nf_ct_event item;
 	int ret;
 
-	rcu_read_lock();
-	notify = rcu_dereference(net->ct.nf_conntrack_event_cb);
+	notify = net->ct.nf_conntrack_event_cb;
 	if (notify == NULL)
-		goto out_unlock;
+		return;
 
 	e = nf_ct_ecache_find(ct);
 	if (e == NULL)
-		goto out_unlock;
+		return;
 
 	events = xchg(&e->cache, 0);
 
 	if (!nf_ct_is_confirmed(ct) || nf_ct_is_dying(ct) || !events)
-		goto out_unlock;
+		return;
 
 	/* We make a copy of the missed event cache without taking
 	 * the lock, thus we may send missed events twice. However,
@@ -58,7 +57,7 @@ void nf_ct_deliver_cached_events(struct nf_conn *ct)
 	missed = e->missed;
 
 	if (!((events | missed) & e->ctmask))
-		goto out_unlock;
+		return;
 
 	item.ct = ct;
 	item.pid = 0;
@@ -67,7 +66,7 @@ void nf_ct_deliver_cached_events(struct nf_conn *ct)
 	ret = notify->fcn(events | missed, &item);
 
 	if (likely(ret >= 0 && !missed))
-		goto out_unlock;
+		return;
 
 	spin_lock_bh(&ct->lock);
 	if (ret < 0)
@@ -75,9 +74,6 @@ void nf_ct_deliver_cached_events(struct nf_conn *ct)
 	else
 		e->missed &= ~missed;
 	spin_unlock_bh(&ct->lock);
-
-out_unlock:
-	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(nf_ct_deliver_cached_events);
 
@@ -85,21 +81,14 @@ int nf_conntrack_register_notifier(struct net *net,
 				   struct nf_ct_event_notifier *new)
 {
 	int ret = 0;
-	struct nf_ct_event_notifier *notify;
 
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference_protected(net->ct.nf_conntrack_event_cb,
-					   lockdep_is_held(&nf_ct_ecache_mutex));
-	if (notify != NULL) {
+	if (net->ct.nf_conntrack_event_cb != NULL)
 		ret = -EBUSY;
-		goto out_unlock;
-	}
-	rcu_assign_pointer(net->ct.nf_conntrack_event_cb, new);
+	else
+		net->ct.nf_conntrack_event_cb = new;
 	mutex_unlock(&nf_ct_ecache_mutex);
-	return ret;
 
-out_unlock:
-	mutex_unlock(&nf_ct_ecache_mutex);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_register_notifier);
@@ -107,13 +96,9 @@ EXPORT_SYMBOL_GPL(nf_conntrack_register_notifier);
 void nf_conntrack_unregister_notifier(struct net *net,
 				      struct nf_ct_event_notifier *new)
 {
-	struct nf_ct_event_notifier *notify;
-
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference_protected(net->ct.nf_conntrack_event_cb,
-					   lockdep_is_held(&nf_ct_ecache_mutex));
-	BUG_ON(notify != new);
-	RCU_INIT_POINTER(net->ct.nf_conntrack_event_cb, NULL);
+	BUG_ON(net->ct.nf_conntrack_event_cb != new);
+	net->ct.nf_conntrack_event_cb = NULL;
 	mutex_unlock(&nf_ct_ecache_mutex);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_unregister_notifier);
@@ -122,21 +107,14 @@ int nf_ct_expect_register_notifier(struct net *net,
 				   struct nf_exp_event_notifier *new)
 {
 	int ret = 0;
-	struct nf_exp_event_notifier *notify;
 
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference_protected(net->ct.nf_expect_event_cb,
-					   lockdep_is_held(&nf_ct_ecache_mutex));
-	if (notify != NULL) {
+	if (net->ct.nf_expect_event_cb != NULL)
 		ret = -EBUSY;
-		goto out_unlock;
-	}
-	rcu_assign_pointer(net->ct.nf_expect_event_cb, new);
+	else
+		net->ct.nf_expect_event_cb = new;
 	mutex_unlock(&nf_ct_ecache_mutex);
-	return ret;
 
-out_unlock:
-	mutex_unlock(&nf_ct_ecache_mutex);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_register_notifier);
@@ -144,13 +122,9 @@ EXPORT_SYMBOL_GPL(nf_ct_expect_register_notifier);
 void nf_ct_expect_unregister_notifier(struct net *net,
 				      struct nf_exp_event_notifier *new)
 {
-	struct nf_exp_event_notifier *notify;
-
 	mutex_lock(&nf_ct_ecache_mutex);
-	notify = rcu_dereference_protected(net->ct.nf_expect_event_cb,
-					   lockdep_is_held(&nf_ct_ecache_mutex));
-	BUG_ON(notify != new);
-	RCU_INIT_POINTER(net->ct.nf_expect_event_cb, NULL);
+	BUG_ON(net->ct.nf_expect_event_cb != new);
+	net->ct.nf_expect_event_cb = NULL;
 	mutex_unlock(&nf_ct_ecache_mutex);
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_unregister_notifier);
-- 
1.7.7


^ permalink raw reply related

* [PATCH v2 4/4 net-next] net/l2tp: add support for L2TP over IPv6 UDP
From: Benjamin LaHaise @ 2012-04-27 18:24 UTC (permalink / raw)
  To: David S. Miller, James Chapman; +Cc: netdev

Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6.
Support has been tested with and without hardware offloading.  This
version fixes the L2TP over localhost issue with incorrect checksums
being reported.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
 include/linux/if_pppol2tp.h |   28 +++++++++++++-
 include/linux/if_pppox.h    |   12 ++++++
 net/l2tp/l2tp_core.c        |   89 +++++++++++++++++++++++++++++++++++++------
 net/l2tp/l2tp_ppp.c         |   42 ++++++++++++++++++++-
 4 files changed, 157 insertions(+), 14 deletions(-)

diff --git a/include/linux/if_pppol2tp.h b/include/linux/if_pppol2tp.h
index 23cefa1..b477541 100644
--- a/include/linux/if_pppol2tp.h
+++ b/include/linux/if_pppol2tp.h
@@ -19,10 +19,11 @@
 
 #ifdef __KERNEL__
 #include <linux/in.h>
+#include <linux/in6.h>
 #endif
 
 /* Structure used to connect() the socket to a particular tunnel UDP
- * socket.
+ * socket over IPv4.
  */
 struct pppol2tp_addr {
 	__kernel_pid_t	pid;		/* pid that owns the fd.
@@ -35,6 +36,20 @@ struct pppol2tp_addr {
 	__u16 d_tunnel, d_session;	/* For sending outgoing packets */
 };
 
+/* Structure used to connect() the socket to a particular tunnel UDP
+ * socket over IPv6.
+ */
+struct pppol2tpin6_addr {
+	__kernel_pid_t	pid;		/* pid that owns the fd.
+					 * 0 => current */
+	int	fd;			/* FD of UDP socket to use */
+
+	__u16 s_tunnel, s_session;	/* For matching incoming packets */
+	__u16 d_tunnel, d_session;	/* For sending outgoing packets */
+
+	struct sockaddr_in6 addr;	/* IP address and port to send to */
+};
+
 /* The L2TPv3 protocol changes tunnel and session ids from 16 to 32
  * bits. So we need a different sockaddr structure.
  */
@@ -49,6 +64,17 @@ struct pppol2tpv3_addr {
 	__u32 d_tunnel, d_session;	/* For sending outgoing packets */
 };
 
+struct pppol2tpv3in6_addr {
+	__kernel_pid_t	pid;		/* pid that owns the fd.
+					 * 0 => current */
+	int	fd;			/* FD of UDP or IP socket to use */
+
+	__u32 s_tunnel, s_session;	/* For matching incoming packets */
+	__u32 d_tunnel, d_session;	/* For sending outgoing packets */
+
+	struct sockaddr_in6 addr;	/* IP address and port to send to */
+};
+
 /* Socket options:
  * DEBUG	- bitmask of debug message categories
  * SENDSEQ	- 0 => don't send packets with sequence numbers
diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h
index b5f927f..6720d57 100644
--- a/include/linux/if_pppox.h
+++ b/include/linux/if_pppox.h
@@ -83,6 +83,12 @@ struct sockaddr_pppol2tp {
 	struct pppol2tp_addr pppol2tp;
 } __attribute__((packed));
 
+struct sockaddr_pppol2tpin6 {
+	__kernel_sa_family_t sa_family; /* address family, AF_PPPOX */
+	unsigned int    sa_protocol;    /* protocol identifier */
+	struct pppol2tpin6_addr pppol2tp;
+} __attribute__((packed));
+
 /* The L2TPv3 protocol changes tunnel and session ids from 16 to 32
  * bits. So we need a different sockaddr structure.
  */
@@ -92,6 +98,12 @@ struct sockaddr_pppol2tpv3 {
 	struct pppol2tpv3_addr pppol2tp;
 } __attribute__((packed));
 
+struct sockaddr_pppol2tpv3in6 {
+	__kernel_sa_family_t sa_family; /* address family, AF_PPPOX */
+	unsigned int    sa_protocol;    /* protocol identifier */
+	struct pppol2tpv3in6_addr pppol2tp;
+} __attribute__((packed));
+
 /*********************************************************************
  *
  * ioctl interface for defining forwarding of connections
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index f6732b6..e91d559 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -53,6 +53,9 @@
 #include <net/inet_common.h>
 #include <net/xfrm.h>
 #include <net/protocol.h>
+#include <net/inet6_connection_sock.h>
+#include <net/inet_ecn.h>
+#include <net/ip6_route.h>
 
 #include <asm/byteorder.h>
 #include <linux/atomic.h>
@@ -446,21 +449,43 @@ static inline int l2tp_verify_udp_checksum(struct sock *sk,
 {
 	struct udphdr *uh = udp_hdr(skb);
 	u16 ulen = ntohs(uh->len);
-	struct inet_sock *inet;
 	__wsum psum;
 
-	if (sk->sk_no_check || skb_csum_unnecessary(skb) || !uh->check)
-		return 0;
-
-	inet = inet_sk(sk);
-	psum = csum_tcpudp_nofold(inet->inet_saddr, inet->inet_daddr, ulen,
-				  IPPROTO_UDP, 0);
-
-	if ((skb->ip_summed == CHECKSUM_COMPLETE) &&
-	    !csum_fold(csum_add(psum, skb->csum)))
+	if (sk->sk_no_check || skb_csum_unnecessary(skb))
 		return 0;
 
-	skb->csum = psum;
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_family == PF_INET6) {
+		if (!uh->check) {
+			LIMIT_NETDEBUG(KERN_INFO "L2TP: IPv6: checksum is 0\n");
+			return 1;
+		}
+		if ((skb->ip_summed == CHECKSUM_COMPLETE) &&
+		    !csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+				     &ipv6_hdr(skb)->daddr, ulen,
+				     IPPROTO_UDP, skb->csum)) {
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+			return 0;
+		}
+		skb->csum = ~csum_unfold(csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+							 &ipv6_hdr(skb)->daddr,
+							 skb->len, IPPROTO_UDP,
+							 0));
+	} else
+#endif
+	{
+		struct inet_sock *inet;
+		if (!uh->check)
+			return 0;
+		inet = inet_sk(sk);
+		psum = csum_tcpudp_nofold(inet->inet_saddr, inet->inet_daddr,
+					  ulen, IPPROTO_UDP, 0);
+
+		if ((skb->ip_summed == CHECKSUM_COMPLETE) &&
+		    !csum_fold(csum_add(psum, skb->csum)))
+			return 0;
+		skb->csum = psum;
+	}
 
 	return __skb_checksum_complete(skb);
 }
@@ -988,7 +1013,12 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 
 	/* Queue the packet to IP for output */
 	skb->local_df = 1;
-	error = ip_queue_xmit(skb, fl);
+#if IS_ENABLED(CONFIG_IPV6)
+	if (skb->sk->sk_family == PF_INET6)
+		error = inet6_csk_xmit(skb, NULL);
+	else
+#endif
+		error = ip_queue_xmit(skb, fl);
 
 	/* Update stats */
 	if (error >= 0) {
@@ -1021,6 +1051,31 @@ static inline void l2tp_skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
 	skb->destructor = l2tp_sock_wfree;
 }
 
+#if IS_ENABLED(CONFIG_IPV6)
+static void l2tp_xmit_ipv6_csum(struct sock *sk, struct sk_buff *skb,
+				int udp_len)
+{
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct udphdr *uh = udp_hdr(skb);
+
+	if (!skb_dst(skb) || !skb_dst(skb)->dev ||
+	    !(skb_dst(skb)->dev->features & NETIF_F_IPV6_CSUM)) {
+		__wsum csum = skb_checksum(skb, 0, udp_len, 0);
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+		uh->check = csum_ipv6_magic(&np->saddr, &np->daddr, udp_len,
+					    IPPROTO_UDP, csum);
+		if (uh->check == 0)
+			uh->check = CSUM_MANGLED_0;
+	} else {
+		skb->ip_summed = CHECKSUM_PARTIAL;
+		skb->csum_start = skb_transport_header(skb) - skb->head;
+		skb->csum_offset = offsetof(struct udphdr, check);
+		uh->check = ~csum_ipv6_magic(&np->saddr, &np->daddr,
+					     udp_len, IPPROTO_UDP, 0);
+	}
+}
+#endif
+
 /* If caller requires the skb to have a ppp header, the header must be
  * inserted in the skb data before calling this function.
  */
@@ -1089,6 +1144,11 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
 		uh->check = 0;
 
 		/* Calculate UDP checksum if configured to do so */
+#if IS_ENABLED(CONFIG_IPV6)
+		if (sk->sk_family == PF_INET6)
+			l2tp_xmit_ipv6_csum(sk, skb, udp_len);
+		else
+#endif
 		if (sk->sk_no_check == UDP_CSUM_NOXMIT)
 			skb->ip_summed = CHECKSUM_NONE;
 		else if ((skb_dst(skb) && skb_dst(skb)->dev) &&
@@ -1424,6 +1484,11 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 		/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
 		udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
 		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
+#if IS_ENABLED(CONFIG_IPV6)
+		if (sk->sk_family == PF_INET6)
+			udpv6_encap_enable();
+		else
+#endif
 		udp_encap_enable();
 	}
 
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index 1addd9f..27b9dec 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -916,7 +916,7 @@ static int pppol2tp_getname(struct socket *sock, struct sockaddr *uaddr,
 	}
 
 	inet = inet_sk(tunnel->sock);
-	if (tunnel->version == 2) {
+	if ((tunnel->version == 2) && (tunnel->sock->sk_family == AF_INET)) {
 		struct sockaddr_pppol2tp sp;
 		len = sizeof(sp);
 		memset(&sp, 0, len);
@@ -932,6 +932,46 @@ static int pppol2tp_getname(struct socket *sock, struct sockaddr *uaddr,
 		sp.pppol2tp.addr.sin_port = inet->inet_dport;
 		sp.pppol2tp.addr.sin_addr.s_addr = inet->inet_daddr;
 		memcpy(uaddr, &sp, len);
+#if IS_ENABLED(CONFIG_IPV6)
+	} else if ((tunnel->version == 2) &&
+		   (tunnel->sock->sk_family == AF_INET6)) {
+		struct ipv6_pinfo *np = inet6_sk(tunnel->sock);
+		struct sockaddr_pppol2tpin6 sp;
+		len = sizeof(sp);
+		memset(&sp, 0, len);
+		sp.sa_family	= AF_PPPOX;
+		sp.sa_protocol	= PX_PROTO_OL2TP;
+		sp.pppol2tp.fd  = tunnel->fd;
+		sp.pppol2tp.pid = pls->owner;
+		sp.pppol2tp.s_tunnel = tunnel->tunnel_id;
+		sp.pppol2tp.d_tunnel = tunnel->peer_tunnel_id;
+		sp.pppol2tp.s_session = session->session_id;
+		sp.pppol2tp.d_session = session->peer_session_id;
+		sp.pppol2tp.addr.sin6_family = AF_INET6;
+		sp.pppol2tp.addr.sin6_port = inet->inet_dport;
+		memcpy(&sp.pppol2tp.addr.sin6_addr, &np->daddr,
+		       sizeof(np->daddr));
+		memcpy(uaddr, &sp, len);
+	} else if ((tunnel->version == 3) &&
+		   (tunnel->sock->sk_family == AF_INET6)) {
+		struct ipv6_pinfo *np = inet6_sk(tunnel->sock);
+		struct sockaddr_pppol2tpv3in6 sp;
+		len = sizeof(sp);
+		memset(&sp, 0, len);
+		sp.sa_family	= AF_PPPOX;
+		sp.sa_protocol	= PX_PROTO_OL2TP;
+		sp.pppol2tp.fd  = tunnel->fd;
+		sp.pppol2tp.pid = pls->owner;
+		sp.pppol2tp.s_tunnel = tunnel->tunnel_id;
+		sp.pppol2tp.d_tunnel = tunnel->peer_tunnel_id;
+		sp.pppol2tp.s_session = session->session_id;
+		sp.pppol2tp.d_session = session->peer_session_id;
+		sp.pppol2tp.addr.sin6_family = AF_INET6;
+		sp.pppol2tp.addr.sin6_port = inet->inet_dport;
+		memcpy(&sp.pppol2tp.addr.sin6_addr, &np->daddr,
+		       sizeof(np->daddr));
+		memcpy(uaddr, &sp, len);
+#endif
 	} else if (tunnel->version == 3) {
 		struct sockaddr_pppol2tpv3 sp;
 		len = sizeof(sp);
-- 
1.7.4.1
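
As background for the checksum handling in this patch, the following is a
minimal userspace sketch of a UDP checksum over the IPv6 pseudo-header
(source address, destination address, 32-bit upper-layer length, next header
17). It mirrors the transmit-side rule that a computed value of zero is sent
as 0xffff. The helper names sum16() and udp6_checksum() are made up for this
example; the kernel uses csum_ipv6_magic() and related helpers instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Add 'len' bytes, read as big-endian 16-bit words, to a running sum. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;	/* pad odd byte with zero */
	return sum;
}

/*
 * Checksum over the IPv6 pseudo-header plus UDP header and payload.
 * 'udp' points at the UDP header with its checksum field set to zero.
 * The result is in host byte order.
 */
static uint16_t udp6_checksum(const uint8_t saddr[16], const uint8_t daddr[16],
			      const uint8_t *udp, uint32_t udp_len)
{
	uint32_t sum = 0;

	sum = sum16(saddr, 16, sum);
	sum = sum16(daddr, 16, sum);
	sum += udp_len >> 16;		/* upper-layer length, high half */
	sum += udp_len & 0xffff;	/* upper-layer length, low half */
	sum += 17;			/* next header: UDP */
	sum = sum16(udp, udp_len, sum);

	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);

	sum = ~sum & 0xffff;
	return sum ? (uint16_t)sum : 0xffff;	/* zero is sent as 0xffff */
}
```

The sketch assumes datagrams small enough that the 32-bit running sum cannot
overflow, which holds for any UDP length that fits the 16-bit length field.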


-- 
"Thought is the essence of where you are now."

^ permalink raw reply related

* [PATCH v2 3/4 net-next] net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6
From: Benjamin LaHaise @ 2012-04-27 18:24 UTC (permalink / raw)
  To: David S. Miller, James Chapman; +Cc: netdev

Now that the semantics of udpv6_queue_rcv_skb() match IPv4's
udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
 include/net/udp.h |    3 +++
 net/ipv6/udp.c    |   39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index e66fc29..065f379 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -268,4 +268,7 @@ extern int udp4_ufo_send_check(struct sk_buff *skb);
 extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
 	netdev_features_t features);
 extern void udp_encap_enable(void);
+#if IS_ENABLED(CONFIG_IPV6)
+extern void udpv6_encap_enable(void);
+#endif
 #endif	/* _UDP_H */
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index bc533ea..c1d91a7 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -525,6 +525,14 @@ static __inline__ void udpv6_err(struct sk_buff *skb,
 	__udp6_lib_err(skb, opt, type, code, offset, info, &udp_table);
 }
 
+static struct static_key udpv6_encap_needed __read_mostly;
+void udpv6_encap_enable(void)
+{
+	if (!static_key_enabled(&udpv6_encap_needed))
+		static_key_slow_inc(&udpv6_encap_needed);
+}
+EXPORT_SYMBOL(udpv6_encap_enable);
+
 int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct udp_sock *up = udp_sk(sk);
@@ -534,6 +542,37 @@ int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto drop;
 
+	if (static_key_false(&udpv6_encap_needed) && up->encap_type) {
+		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
+
+		/*
+		 * This is an encapsulation socket so pass the skb to
+		 * the socket's udp_encap_rcv() hook. Otherwise, just
+		 * fall through and pass this up the UDP socket.
+		 * up->encap_rcv() returns the following value:
+		 * =0 if skb was successfully passed to the encap
+		 *    handler or was discarded by it.
+		 * >0 if skb should be passed on to UDP.
+		 * <0 if skb should be resubmitted as proto -N
+		 */
+
+		/* if we're overly short, let UDP handle it */
+		encap_rcv = ACCESS_ONCE(up->encap_rcv);
+		if (skb->len > sizeof(struct udphdr) && encap_rcv != NULL) {
+			int ret;
+
+			ret = encap_rcv(sk, skb);
+			if (ret <= 0) {
+				UDP_INC_STATS_BH(sock_net(sk),
+						 UDP_MIB_INDATAGRAMS,
+						 is_udplite);
+				return -ret;
+			}
+		}
+
+		/* FALLTHROUGH -- it's a UDP Packet */
+	}
+
 	/*
 	 * UDP-Lite specific tests, ignored on UDP sockets (see net/ipv4/udp.c).
 	 */
-- 
1.7.4.1
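
The return-value convention documented in the comment above (0: consumed,
>0: hand the packet to normal UDP processing, <0: resubmit as protocol -N)
can be modelled in a few lines of userspace C. This is only a sketch of the
convention. struct pkt, dispatch() and the three sample handlers are
hypothetical names and do not exist in the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the length matters here. */
struct pkt {
	int len;
};

typedef int (*encap_rcv_fn)(struct pkt *p);

enum verdict { CONSUMED, PASS_TO_UDP, RESUBMIT };

struct result {
	enum verdict verdict;
	int proto;		/* only meaningful for RESUBMIT */
};

/* Sample handlers, one per branch of the convention. */
static int handler_consume(struct pkt *p)  { (void)p; return 0; }
static int handler_decline(struct pkt *p)  { (void)p; return 1; }
static int handler_resubmit(struct pkt *p) { (void)p; return -47; }

/* Apply the convention; packets no longer than the header skip the hook. */
static struct result dispatch(encap_rcv_fn encap_rcv, struct pkt *p,
			      int hdr_len)
{
	struct result r = { PASS_TO_UDP, 0 };
	int ret;

	if (encap_rcv == NULL || p->len <= hdr_len)
		return r;

	ret = encap_rcv(p);
	if (ret > 0)
		return r;		/* handler declined: normal UDP path */
	if (ret == 0) {
		r.verdict = CONSUMED;	/* handler took or discarded it */
		return r;
	}
	r.verdict = RESUBMIT;		/* resubmit as protocol -ret */
	r.proto = -ret;
	return r;
}
```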


-- 
"Thought is the essence of where you are now."

^ permalink raw reply related

* [PATCH v2 2/4 net-next] net/ipv6/udp: UDP encapsulation: move socket locking into udpv6_queue_rcv_skb()
From: Benjamin LaHaise @ 2012-04-27 18:23 UTC (permalink / raw)
  To: David S. Miller, James Chapman; +Cc: netdev

In order to make sure that when the encap_rcv() hook is introduced it is
not called with the socket lock held, move socket locking from callers into
udpv6_queue_rcv_skb(), matching what happens in IPv4.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
 net/ipv6/udp.c |   97 +++++++++++++++++++++++++------------------------------
 1 files changed, 44 insertions(+), 53 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 6c0367f..bc533ea 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -558,14 +558,25 @@ int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 			goto drop;
 	}
 
+	if (sk_rcvqueues_full(sk, skb, sk->sk_rcvbuf))
+		goto drop;
+
 	skb_dst_drop(skb);
 
-	rc = __udpv6_queue_rcv_skb(sk, skb);
+	bh_lock_sock(sk);
+	rc = 0;
+	if (!sock_owned_by_user(sk))
+		rc = __udpv6_queue_rcv_skb(sk, skb);
+	else if (sk_add_backlog(sk, skb, sk->sk_rcvbuf)) {
+		bh_unlock_sock(sk);
+		goto drop;
+	}
+	bh_unlock_sock(sk);
 
 	return rc;
 drop:
-	atomic_inc(&sk->sk_drops);
 	UDP6_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+	atomic_inc(&sk->sk_drops);
 	kfree_skb(skb);
 	return -1;
 }
@@ -614,37 +625,27 @@ static struct sock *udp_v6_mcast_next(struct net *net, struct sock *sk,
 static void flush_stack(struct sock **stack, unsigned int count,
 			struct sk_buff *skb, unsigned int final)
 {
-	unsigned int i;
+	struct sk_buff *skb1 = NULL;
 	struct sock *sk;
-	struct sk_buff *skb1;
+	unsigned int i;
 
 	for (i = 0; i < count; i++) {
-		skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC);
-
 		sk = stack[i];
-		if (skb1) {
-			if (sk_rcvqueues_full(sk, skb1, sk->sk_rcvbuf)) {
-				kfree_skb(skb1);
-				goto drop;
-			}
-			bh_lock_sock(sk);
-			if (!sock_owned_by_user(sk))
-				udpv6_queue_rcv_skb(sk, skb1);
-			else if (sk_add_backlog(sk, skb1, sk->sk_rcvbuf)) {
-				kfree_skb(skb1);
-				bh_unlock_sock(sk);
-				goto drop;
-			}
-			bh_unlock_sock(sk);
-			continue;
+		if (likely(skb1 == NULL))
+			skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC);
+		if (!skb1) {
+			atomic_inc(&sk->sk_drops);
+			UDP6_INC_STATS_BH(sock_net(sk), UDP_MIB_RCVBUFERRORS,
+					  IS_UDPLITE(sk));
+			UDP6_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS,
+					  IS_UDPLITE(sk));
 		}
-drop:
-		atomic_inc(&sk->sk_drops);
-		UDP6_INC_STATS_BH(sock_net(sk),
-				UDP_MIB_RCVBUFERRORS, IS_UDPLITE(sk));
-		UDP6_INC_STATS_BH(sock_net(sk),
-				UDP_MIB_INERRORS, IS_UDPLITE(sk));
+
+		if (skb1 && udpv6_queue_rcv_skb(sk, skb1) <= 0)
+			skb1 = NULL;
 	}
+	if (unlikely(skb1))
+		kfree_skb(skb1);
 }
 /*
  * Note: called only from the BH handler context,
@@ -784,39 +785,29 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	 * for sock caches... i'll skip this for now.
 	 */
 	sk = __udp6_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
+	if (sk != NULL) {
+		int ret = udpv6_queue_rcv_skb(sk, skb);
+		sock_put(sk);
 
-	if (sk == NULL) {
-		if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))
-			goto discard;
-
-		if (udp_lib_checksum_complete(skb))
-			goto discard;
-		UDP6_INC_STATS_BH(net, UDP_MIB_NOPORTS,
-				proto == IPPROTO_UDPLITE);
-
-		icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0);
+		/* a return value > 0 means to resubmit the input, but
+		 * it wants the return to be -protocol, or 0
+		 */
+		if (ret > 0)
+			return -ret;
 
-		kfree_skb(skb);
 		return 0;
 	}
 
-	/* deliver */
-
-	if (sk_rcvqueues_full(sk, skb, sk->sk_rcvbuf)) {
-		sock_put(sk);
+	if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard;
-	}
-	bh_lock_sock(sk);
-	if (!sock_owned_by_user(sk))
-		udpv6_queue_rcv_skb(sk, skb);
-	else if (sk_add_backlog(sk, skb, sk->sk_rcvbuf)) {
-		atomic_inc(&sk->sk_drops);
-		bh_unlock_sock(sk);
-		sock_put(sk);
+
+	if (udp_lib_checksum_complete(skb))
 		goto discard;
-	}
-	bh_unlock_sock(sk);
-	sock_put(sk);
+
+	UDP6_INC_STATS_BH(net, UDP_MIB_NOPORTS, proto == IPPROTO_UDPLITE);
+	icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0);
+
+	kfree_skb(skb);
 	return 0;
 
 short_packet:
-- 
1.7.4.1
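
The "deliver directly unless the socket is owned by a user context, else
append to the backlog" decision that this patch moves into
udpv6_queue_rcv_skb() can be illustrated with a single-threaded model. This
is a sketch under heavy assumptions: the real code takes bh_lock_sock()
around the decision, which is omitted here, and struct sock_model,
queue_rcv(), release_sock_model() and BACKLOG_MAX are invented for the
example.

```c
#include <stdbool.h>

#define BACKLOG_MAX 4	/* hypothetical limit standing in for sk_rcvbuf */

struct sock_model {
	bool owned_by_user;	/* a process context currently holds the socket */
	int received;		/* packets delivered to the receive queue */
	int backlog;		/* packets parked until the owner releases */
	int drops;		/* packets dropped because the backlog was full */
};

/* Returns 0 if the packet was delivered or backlogged, -1 if dropped. */
static int queue_rcv(struct sock_model *sk)
{
	if (!sk->owned_by_user) {
		sk->received++;		/* fast path: deliver right away */
		return 0;
	}
	if (sk->backlog >= BACKLOG_MAX) {
		sk->drops++;		/* backlog full: count and drop */
		return -1;
	}
	sk->backlog++;			/* defer until the owner is done */
	return 0;
}

/* The owner releases the socket and the backlog is processed. */
static void release_sock_model(struct sock_model *sk)
{
	sk->received += sk->backlog;
	sk->backlog = 0;
	sk->owned_by_user = false;
}
```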


-- 
"Thought is the essence of where you are now."

^ permalink raw reply related

* [PATCH v2 1/4 net-next] net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb
From: Benjamin LaHaise @ 2012-04-27 18:23 UTC (permalink / raw)
  To: David S. Miller, James Chapman; +Cc: netdev

This is the first step in reworking the IPv6 UDP code to be structured more
like the IPv4 UDP code.  This patch creates __udpv6_queue_rcv_skb() with
semantics equivalent to __udp_queue_rcv_skb(), and wires it up to the
backlog_rcv method.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
 net/ipv6/udp.c |   42 +++++++++++++++++++++++++++---------------
 1 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d39bbc9..6c0367f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -496,6 +496,28 @@ out:
 	sock_put(sk);
 }
 
+static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+{
+	int rc;
+
+	if (!ipv6_addr_any(&inet6_sk(sk)->daddr))
+		sock_rps_save_rxhash(sk, skb);
+
+	rc = sock_queue_rcv_skb(sk, skb);
+	if (rc < 0) {
+		int is_udplite = IS_UDPLITE(sk);
+
+		/* Note that an ENOMEM error is charged twice */
+		if (rc == -ENOMEM)
+			UDP6_INC_STATS_BH(sock_net(sk),
+					UDP_MIB_RCVBUFERRORS, is_udplite);
+		UDP6_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+		kfree_skb(skb);
+		return -1;
+	}
+	return 0;
+}
+
 static __inline__ void udpv6_err(struct sk_buff *skb,
 				 struct inet6_skb_parm *opt, u8 type,
 				 u8 code, int offset, __be32 info     )
@@ -503,15 +525,12 @@ static __inline__ void udpv6_err(struct sk_buff *skb,
 	__udp6_lib_err(skb, opt, type, code, offset, info, &udp_table);
 }
 
-int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb)
+int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct udp_sock *up = udp_sk(sk);
 	int rc;
 	int is_udplite = IS_UDPLITE(sk);
 
-	if (!ipv6_addr_any(&inet6_sk(sk)->daddr))
-		sock_rps_save_rxhash(sk, skb);
-
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto drop;
 
@@ -540,19 +559,12 @@ int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb)
 	}
 
 	skb_dst_drop(skb);
-	rc = sock_queue_rcv_skb(sk, skb);
-	if (rc < 0) {
-		/* Note that an ENOMEM error is charged twice */
-		if (rc == -ENOMEM)
-			UDP6_INC_STATS_BH(sock_net(sk),
-					UDP_MIB_RCVBUFERRORS, is_udplite);
-		goto drop_no_sk_drops_inc;
-	}
 
-	return 0;
+	rc = __udpv6_queue_rcv_skb(sk, skb);
+
+	return rc;
 drop:
 	atomic_inc(&sk->sk_drops);
-drop_no_sk_drops_inc:
 	UDP6_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
 	kfree_skb(skb);
 	return -1;
@@ -1471,7 +1483,7 @@ struct proto udpv6_prot = {
 	.getsockopt	   = udpv6_getsockopt,
 	.sendmsg	   = udpv6_sendmsg,
 	.recvmsg	   = udpv6_recvmsg,
-	.backlog_rcv	   = udpv6_queue_rcv_skb,
+	.backlog_rcv	   = __udpv6_queue_rcv_skb,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v6_rehash,
-- 
1.7.4.1
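
The "ENOMEM error is charged twice" remark in __udpv6_queue_rcv_skb() means
that a queueing failure with -ENOMEM bumps both the receive-buffer error
counter and the general input error counter, while other failures bump only
the latter. A small userspace sketch of that accounting, with hypothetical
names (struct udp_stats, account_queue_result()), might look like this:

```c
#include <errno.h>

/* Hypothetical counters standing in for the UDP MIB entries. */
struct udp_stats {
	unsigned long rcvbuf_errors;	/* models UDP_MIB_RCVBUFERRORS */
	unsigned long in_errors;	/* models UDP_MIB_INERRORS */
};

/* Account for the result 'rc' of a queueing attempt; 0 on success, else -1. */
static int account_queue_result(struct udp_stats *st, int rc)
{
	if (rc < 0) {
		if (rc == -ENOMEM)
			st->rcvbuf_errors++;	/* first charge */
		st->in_errors++;		/* charged for every failure */
		return -1;
	}
	return 0;
}
```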


-- 
"Thought is the essence of where you are now."

^ permalink raw reply related

* Re: [PATCH 2/2] ss: implement -M option to get all memory information
From: Stephen Hemminger @ 2012-04-27 17:21 UTC (permalink / raw)
  To: Shan Wei; +Cc: xemul, NetDev
In-Reply-To: <4F99FEA1.2000700@gmail.com>

On Fri, 27 Apr 2012 10:04:17 +0800
Shan Wei <shanwei88@gmail.com> wrote:

> Stephen Hemminger said, at 2012/4/26 1:41:
> 
> >>  "   -m, --memory        show socket memory usage\n"
> >> +"   -M, --skmem         show socket memory information.\n"
> >> +"                       this option which will replace -m, is valid since Linux 3.3.\n"
> > 
> > Please don't add reference to kernel version.
> > Why does this need to be a new option? Maybe just do the right thing
> > if present in the netlink response?
> 
> 
> Actually, I'm OK with your suggestion of using the existing option (-m) to
> export all memory information. But a new option is better for us.

Why, if you are arguing to replace the option anyway?

> Maybe some people use the latest iproute source with a kernel older than 3.3.0, and some don't.
> The -m option would then give a different user experience, especially for UDP sockets.
> Currently the -m option has no effect on UDP sockets; I will submit another patch
> to export memory information for them. New option, fresh experience.

Lots of options return more or different information based on kernel
version, probably the biggest example is how stats are processed.
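
As a rough sketch of what "do the right thing if present in the netlink response" could look like, the function below picks its output from whichever attribute was received. This is not the ss code: the attribute indices, both structs, and format_mem() are hypothetical names made up for illustration, and the attribute table is assumed to be already parsed into an array of pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical attribute indices for an already-parsed response. */
enum { ATTR_MEMINFO = 1, ATTR_SKMEMINFO = 2, ATTR_MAX = 2 };

/* Hypothetical payloads: the older, smaller set and the newer, fuller set. */
struct legacy_meminfo { unsigned int rmem, wmem; };
struct full_meminfo { unsigned int rmem_alloc, rcvbuf, wmem_alloc, sndbuf; };

/* Format memory information for a single -m option.
 * Prefers the fuller attribute when the kernel sent it, falls back to
 * the legacy one, and emits nothing when neither is present.
 * Returns the snprintf() result, or 0 if no attribute was found. */
static int format_mem(const void *attrs[ATTR_MAX + 1], char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	if (attrs[ATTR_SKMEMINFO]) {
		const struct full_meminfo *m = attrs[ATTR_SKMEMINFO];

		return snprintf(buf, len, "skmem:(r%u,rb%u,t%u,tb%u)",
				m->rmem_alloc, m->rcvbuf,
				m->wmem_alloc, m->sndbuf);
	}
	if (attrs[ATTR_MEMINFO]) {
		const struct legacy_meminfo *m = attrs[ATTR_MEMINFO];

		return snprintf(buf, len, "mem:(r%u,w%u)", m->rmem, m->wmem);
	}
	return 0;
}
```

The caller never checks a kernel version: an older kernel simply leaves the newer slot NULL, so the same option degrades to the legacy output.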

> (Exporting memory information will help us analyze dropped packets,
> which are recorded in RcvbufErrors and SndbufErrors.)

^ permalink raw reply

* Re: [PATCH 2/4 net-next] tg3: provide frags as skb head
From: Ben Hutchings @ 2012-04-27 16:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Ilpo Järvinen, Tom Herbert,
	Neal Cardwell, Matt Carlson, Jeff Kirsher, Michael Chan,
	Herbert Xu, Maciej Żenczykowski
In-Reply-To: <1335544964.2775.250.camel@edumazet-glaptop>

On Fri, 2012-04-27 at 18:42 +0200, Eric Dumazet wrote:
> On Fri, 2012-04-27 at 17:33 +0100, Ben Hutchings wrote:
> 
> > page_address() may return NULL if CONFIG_HIGHMEM is enabled.  And that's
> > going to be a general problem with this scheme on 32-bit machines.
> 
> kmalloc(2048, GFP_ATOMIC) and alloc_page(GFP_ATOMIC) use the same pool
> of LOWMEM pages, so there is no difference here. page_address() is
> guaranteed to work.

Oh, of course, __GFP_HIGH is not __GFP_HIGHMEM...

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply
