Netdev List
* [REGRESSION,v3.7-rc5,bisected] 100% CPU usage in softirqd, unable to shutdown
From: Lekensteyn @ 2012-11-12 21:20 UTC
  To: Xiaotian Feng, David S. Miller; +Cc: netdev

Hi,

After upgrading from 3.7-rc4 to 3.7-rc5 I found that I was unable to suspend 
without locking up the system afterwards. Nor was I able to shut down, as it 
would simply hang where it should halt. The second suspend/resume in a session 
would make NetworkManager hang.

When looking in my process list, I saw that ksoftirqd was using one full CPU 
core. Watching the contents of /proc/softirqs showed that the TASKLET count 
was increasing rapidly.

I got this message when trying to suspend for the second time in a session:

 Freezing user space processes ... 
 Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, 
wq_busy=0):
 NetworkManager  R  running task        0   332      1 0x00000004
  ffff88023169d628 ffffffff81549b86 ffff8802316e4470 ffff88023169dfd8
  ffff88023169dfd8 ffff88023169dfd8 ffff8802316e4470 ffff8802316e4470
  ffff88023169d698 ffff88023bc92a80 ffff88022fd3db70 ffff88022fd3dc90
 Call Trace:
  [<ffffffff81549b9e>] ? __schedule+0x13e/0x760
  [<ffffffff8154a4f9>] schedule+0x29/0x70
  [<ffffffff810722aa>] sys_sched_yield+0x4a/0x60
  [<ffffffff8154a7c2>] yield+0x32/0x40
  [<ffffffff810451f5>] tasklet_kill+0x35/0x80
  [<ffffffffa017c2f3>] jme_close+0xd3/0x850 [jme]
  [<ffffffff8146325d>] __dev_close_many+0x7d/0xc0
  [<ffffffff814632cd>] __dev_close+0x2d/0x40
  [<ffffffff81469551>] __dev_change_flags+0xa1/0x180
  [<ffffffff814696e8>] dev_change_flags+0x28/0x70
  [<ffffffff81475b68>] do_setlink+0x378/0xa00
  [<ffffffff81078a56>] ? find_busiest_group+0x36/0x490
  [<ffffffff812d5821>] ? nla_parse+0x31/0xe0
  [<ffffffff812d5821>] ? nla_parse+0x31/0xe0
  [<ffffffff81477e6e>] rtnl_newlink+0x36e/0x590
  [<ffffffff81286e16>] ? apparmor_capable+0x26/0x90
  [<ffffffff81477694>] rtnetlink_rcv_msg+0x114/0x300
  [<ffffffff8114d1c3>] ? __kmalloc_node_track_caller+0x63/0x1b0
  [<ffffffff8145b65b>] ? __alloc_skb+0x8b/0x290
  [<ffffffff81477580>] ? __rtnl_unlock+0x20/0x20
  [<ffffffff8148f471>] netlink_rcv_skb+0xb1/0xc0
  [<ffffffff814748f5>] rtnetlink_rcv+0x25/0x40
  [<ffffffff8148ed8b>] netlink_unicast+0x19b/0x220
  [<ffffffff8148f111>] netlink_sendmsg+0x301/0x3c0
  [<ffffffff8144f8ec>] sock_sendmsg+0xbc/0xf0
  [<ffffffff81450797>] ? sock_recvmsg+0xd7/0x110
  [<ffffffff8145045c>] __sys_sendmsg+0x3ac/0x3c0
  [<ffffffff810854dc>] ? ktime_get_ts+0x4c/0xf0
  [<ffffffff81452699>] sys_sendmsg+0x49/0x90
  [<ffffffff81553906>] system_call_fastpath+0x1a/0x1f

Bisecting leads to:
commit 175c0dffef310fc7d7f026ca4a7682beb2fbd8ec
Author: Xiaotian Feng <xtfeng@gmail.com>
Date:   Wed Oct 31 00:29:57 2012 +0000

    drivers/net: use tasklet_kill in device remove/close process
    
    Some driver uses tasklet_disable in device remove/close process,
    tasklet_disable will inc tasklet->count and return. If the tasklet
    is not handled yet because some softirq pressure, the tasklet will
    placed on the tasklet_vec, never have a chance to excute. This might
    lead to ksoftirqd heavy loaded, wakeup with pending_softirq, but
    tasklet is disabled. tasklet_kill should be used in this case.
    
    Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller <davem@davemloft.net>

(If it wasn't obvious, I have an Ethernet device that needs the "jme" driver: 
04:00.5 Ethernet controller [0200]: JMicron Technology Corp. JMC250 PCI 
Express Gigabit Ethernet Controller [197b:0250] (rev 03))
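
One way the trace above could arise, assuming the jme tasklet was both
disabled and still scheduled when jme_close() reached tasklet_kill(): a
disabled tasklet is re-queued on every softirq pass instead of being run, so
its "scheduled" state never clears and the waiter yields forever. The toy
user-space model below only illustrates that interaction. It is not kernel
code, and every toy_* name is hypothetical:

```c
#include <stdbool.h>

/* Toy model of one tasklet; NOT the kernel's struct tasklet_struct. */
struct toy_tasklet {
	int count;	/* > 0 means disabled, as after tasklet_disable() */
	bool sched;	/* models the "scheduled" state bit */
	int runs;	/* times the handler actually ran */
	int requeues;	/* times it was put back on the pending list */
};

/* One pass of the tasklet softirq over a single tasklet. */
static void toy_softirq_pass(struct toy_tasklet *t)
{
	if (!t->sched)
		return;
	if (t->count == 0) {
		t->sched = false;	/* cleared only when the handler may run */
		t->runs++;
	} else {
		t->requeues++;		/* disabled: re-queued, softirq raised again */
	}
}

/* Models the wait loop of a kill operation: wait until "sched" clears.
 * Returns the number of passes needed, or -1 once max_passes is used up. */
static int toy_kill(struct toy_tasklet *t, int max_passes)
{
	int passes = 0;

	while (t->sched) {
		if (passes == max_passes)
			return -1;
		toy_softirq_pass(t);	/* stands in for yield() */
		passes++;
	}
	return passes;
}
```

With count == 0 the wait ends after one pass. With count > 0 it never ends,
and each pass is one more re-queue, which would match both the busy
ksoftirqd and the task stuck in yield().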

Since 3.7, I sometimes get the below messages during suspend, but it would 
never hang:
smpboot: CPU 2 is now offline
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 202
NOHZ: local_softirq_pending 202
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 202
NOHZ: local_softirq_pending 202
smpboot: CPU 3 is now offline

Time for a revert, or do you have another fix to propose?

Regards,
Peter

* [RFC net-next] tun: implement BQL
From: Stephen Hemminger @ 2012-11-12 21:22 UTC
  To: David Miller; +Cc: netdev

Add multiqueue BQL to tap device. Tested with VPN, doesn't have a big impact,
most likely because there is little queueing occurring at the TAP device
except when forwarding.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/tun.c	2012-11-12 07:58:45.527037947 -0800
+++ b/drivers/net/tun.c	2012-11-12 11:20:26.017512742 -0800
@@ -412,6 +412,7 @@ static void __tun_detach(struct tun_file
 		tun_flow_delete_by_queue(tun, tun->numqueues + 1);
 		/* Drop read queue */
 		skb_queue_purge(&tfile->sk.sk_receive_queue);
+		netdev_tx_reset_queue(netdev_get_tx_queue(dev, index));
 		tun_set_real_num_queues(tun);
 
 		if (tun->numqueues == 0 && !(tun->flags & TUN_PERSIST))
@@ -455,6 +456,7 @@ static void tun_detach_all(struct net_de
 						  lockdep_rtnl_is_held());
 		/* Drop read queue */
 		skb_queue_purge(&tfile->sk.sk_receive_queue);
+		netdev_tx_reset_queue(netdev_get_tx_queue(dev, i));
 		sock_put(&tfile->sk);
 	}
 }
@@ -712,6 +714,8 @@ static netdev_tx_t tun_net_xmit(struct s
 	 * for indefinite time. */
 	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
 		goto drop;
+
+	netdev_tx_sent_queue(netdev_get_tx_queue(dev, txq), skb->len);
 	skb_orphan(skb);
 
 	/* Enqueue packet */
@@ -1324,6 +1328,10 @@ static ssize_t tun_do_read(struct tun_st
 		netif_wake_subqueue(tun->dev, tfile->queue_index);
 
 		ret = tun_put_user(tun, tfile, skb, iv, len);
+
+		netdev_tx_completed_queue(netdev_get_tx_queue(tun->dev,
+							      tfile->queue_index),
+					  1, skb->len);
 		kfree_skb(skb);
 		break;
 	}
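
The three hooks added by the patch pair up as enqueue, consume, and reset.
A toy user-space model of that accounting follows, as a sketch only: real
BQL sizes its limit dynamically, whereas the limit here is a fixed, assumed
value, and all toy_* names are hypothetical.

```c
#include <stdbool.h>

/* Toy per-queue byte accounting; NOT the kernel implementation. */
struct toy_txq {
	unsigned int inflight;	/* bytes queued but not yet read */
	unsigned int limit;	/* fixed limit (assumption; BQL adapts it) */
	bool stopped;
};

/* Analogue of netdev_tx_sent_queue(): account bytes at enqueue time. */
static void toy_sent(struct toy_txq *q, unsigned int bytes)
{
	q->inflight += bytes;
	if (q->inflight >= q->limit)
		q->stopped = true;
}

/* Analogue of netdev_tx_completed_queue(): account bytes once consumed. */
static void toy_completed(struct toy_txq *q, unsigned int bytes)
{
	if (bytes > q->inflight)	/* guard against underflow */
		bytes = q->inflight;
	q->inflight -= bytes;
	if (q->inflight < q->limit)
		q->stopped = false;
}

/* Analogue of netdev_tx_reset_queue(): forget everything on detach. */
static void toy_reset(struct toy_txq *q)
{
	q->inflight = 0;
	q->stopped = false;
}
```

In the model, as in the patch, every byte counted at enqueue must later be
counted as completed or wiped by a reset, otherwise the queue stays stopped.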

* Re: ath9k_htc-based adapter unfunctioning after commit ceb26a6013
From: Adrian Chadd @ 2012-11-12 21:39 UTC
  To: Corey Richardson
  Cc: linux-wireless, jouni, vthiagar, linville, ath9k-devel, netdev,
	nbd
In-Reply-To: <CA++BO6SuRgOu7A4BKobrdEeSpSc3UUYHXaECmEGE=R17w7HjQA@mail.gmail.com>

Hi!

Please create a bugzilla.kernel.org ticket with this info?

Thanks,



adrian


On 12 November 2012 05:34, Corey Richardson <corey@octayn.net> wrote:
> I was giving 3.7-rc4 a whirl when I discovered my wireless adapter [0] wasn't
> working: the LED didn't light up. Went and ran 'ip link' and it just hung there.
> Tried a few other things and discovered that sudo was broken too. With
> strace I found that they were hanging in sendto().
>
> If I then unplug the adapter, everything comes out of the hang.
>
> Bisected and found that commit ceb26a6013b962b82f644189ea29d802490fc8fc is
> to blame.
>
> Attached is my .config. The complete dmesg is attached, as well as
> dmesg_disconnect, which is what gets spat out on disconnect of the device.
> strace is an strace of 'ip link'.
>
> I feel like I haven't given enough information; anything else needed?
>
> (Please CC me in any reply to a list; I am not subscribed to them)
>
> [0] - http://support.netgear.com/product/WNA1100
>
> --
> Corey Richardson

* Re: ath9k_htc-based adapter unfunctioning after commit ceb26a6013
From: Corey Richardson @ 2012-11-12 21:49 UTC
  To: Adrian Chadd
  Cc: linux-wireless, jouni, vthiagar, linville, ath9k-devel, netdev,
	nbd
In-Reply-To: <CAJ-Vmon04wLvn0DDnD9X98bY8DxHRQf10jQJCbuckbztS261vQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Mon, Nov 12, 2012 at 4:39 PM, Adrian Chadd wrote:
> Hi!
>
> Please create a bugzilla.kernel.org ticket with this info?
>

https://bugzilla.kernel.org/show_bug.cgi?id=50441

--
Corey Richardson

* [PATCH] ipv4/ip_vti.c: VTI fix post-decryption forwarding
From: Saurabh @ 2012-11-12 22:17 UTC
  To: netdev



With the latest kernel, two things must be done after decryption so that
packets are forwarded:
 1. Remove the mark from the packet, so that the packet does not match the
 ipsec-policy again. However, doing this also causes the post-decryption
 check to fail, and the packet gets dropped (see cat /proc/net/xfrm_stat).
 2. Remove the sp association in the skbuff so that no policy check is done
 on the packet for VTI tunnels.

Due to #2 above, we must now do a security-policy check in the vti rcv path
before resetting the mark in the skbuff.

Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com>
Reported-by: Ruben Herold <ruben@puettmann.net>

---
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 1831092..858fddf 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -338,12 +338,17 @@ static int vti_rcv(struct sk_buff *skb)
 	if (tunnel != NULL) {
 		struct pcpu_tstats *tstats;
 
+		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
+			return -1;
+
 		tstats = this_cpu_ptr(tunnel->dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->rx_packets++;
 		tstats->rx_bytes += skb->len;
 		u64_stats_update_end(&tstats->syncp);
 
+		skb->mark = 0;
+		secpath_reset(skb);
 		skb->dev = tunnel->dev;
 		return 1;
 	}

* Re: [3.0.y, 3.2.y, 3.4.y] Re: [PATCH v2] r8169: Fix WoL on RTL8168d/8111d.
From: Francois Romieu @ 2012-11-12 23:03 UTC
  To: Jonathan Nieder
  Cc: davem, stable, Cyril Brulebois, netdev, nic_swsd, Hayes Wang,
	linux-kernel, florent.fourcot
In-Reply-To: <20121112163858.GA4343@elie.Belkin>

Jonathan Nieder <jrnieder@gmail.com> :
[...]
> This has been applied as commit b00e69dee4cc in mainline; thanks!
> 
> François and David, would this be a candidate for inclusion in
> 3.0- and newer stable kernels?

- 3.0.51
  b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
  aee77e4accbeb2c86b1d294cd84fec4a12dde3bd

- 3.4.18
  0481776b7a70f09acf7d9d97c288c3a8403fbfe4
  b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
  aee77e4accbeb2c86b1d294cd84fec4a12dde3bd

- 3.6.6
  0481776b7a70f09acf7d9d97c288c3a8403fbfe4
  b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
  aee77e4accbeb2c86b1d294cd84fec4a12dde3bd

I'll cook the patches.

-- 
Ueimor

* Re: [3.0.y, 3.2.y, 3.4.y] Re: [PATCH v2] r8169: Fix WoL on RTL8168d/8111d.
From: David Miller @ 2012-11-12 23:27 UTC
  To: romieu
  Cc: jrnieder, stable, kibi, netdev, nic_swsd, hayeswang, linux-kernel,
	florent.fourcot
In-Reply-To: <20121112230339.GA28555@electric-eye.fr.zoreil.com>

From: Francois Romieu <romieu@fr.zoreil.com>
Date: Tue, 13 Nov 2012 00:03:39 +0100

> Jonathan Nieder <jrnieder@gmail.com> :
> [...]
>> This has been applied as commit b00e69dee4cc in mainline; thanks!
>> 
>> François and David, would this be a candidate for inclusion in
>> 3.0- and newer stable kernels?
> 
> - 3.0.51
>   b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
>   aee77e4accbeb2c86b1d294cd84fec4a12dde3bd
> 
> - 3.4.18
>   0481776b7a70f09acf7d9d97c288c3a8403fbfe4
>   b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
>   aee77e4accbeb2c86b1d294cd84fec4a12dde3bd
> 
> - 3.6.6
>   0481776b7a70f09acf7d9d97c288c3a8403fbfe4
>   b00e69dee4ccbb3a19989e3d4f1385bc2e3406cd
>   aee77e4accbeb2c86b1d294cd84fec4a12dde3bd
> 
> I'll cook the patches.

Thank you, these all look fine for -stable to me.

* Re: [PATCH v3 1/9] net: core: use this_cpu_ptr per-cpu helper
From: Christoph Lameter @ 2012-11-12 23:29 UTC
  To: Eric Dumazet
  Cc: Shan Wei, David Miller, timo.teras, steffen.klassert, NetDev,
	Kernel-Maillist
In-Reply-To: <1352492144.19779.811.camel@edumazet-glaptop>

On Fri, 9 Nov 2012, Eric Dumazet wrote:

> It's not the case.
>
> We need the _pointer_ to call :
>
> tasklet_schedule(tasklet);

Ah. Ok. In that case

Reviewed-by: Christoph Lameter <cl@linux.com>

* Re: [PATCH v3 2/9] net: rds: use this_cpu_ptr per-cpu helper
From: Christoph Lameter @ 2012-11-12 23:31 UTC
  To: Shan Wei
  Cc: venkat.x.venkatsubra, David Miller, rds-devel, NetDev,
	Kernel-Maillist
In-Reply-To: <50A0EFE7.9010100@gmail.com>

On Mon, 12 Nov 2012, Shan Wei wrote:

> The __percpu annotation in struct rds_ib_refill_cache is missing.
>
> Do you mean reading/writing the fields of struct rds_ib_cache_head
> using __this_cpu_* operations, as in the following?
> How about it?
>
> diff --git a/net/rds/ib.h b/net/rds/ib.h
> index 8d2b3d5..7280ab8 100644
> --- a/net/rds/ib.h
> +++ b/net/rds/ib.h
> @@ -50,7 +50,7 @@ struct rds_ib_cache_head {
>  };
>
>  struct rds_ib_refill_cache {
> -	struct rds_ib_cache_head *percpu;
> +	struct rds_ib_cache_head __percpu *percpu;

That looks ok.

* Re: [PATCH] bonding: rlb mode of bond should not alter ARP replies originating via bridge
From: Jay Vosburgh @ 2012-11-12 23:40 UTC
  To: Zheng Li; +Cc: netdev, andy, linux-kernel, davem, joe.jin
In-Reply-To: <1352714143-28160-1-git-send-email-zheng.x.li@oracle.com>

Zheng Li <zheng.x.li@oracle.com> wrote:

>ARP traffic passing through a bridge and out via the bond (when the bond is a 
>port of the bridge) should not have its source MAC address adjusted by the 
>receive load balance code in rlb_arp_xmit.
>
>Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
>Cc: Jay Vosburgh <fubar@us.ibm.com>
>Cc: Andy Gospodarek <andy@greyhouse.net>
>Cc: "David S. Miller" <davem@davemloft.net>
>
>---
> drivers/net/bonding/bond_alb.c |   21 ++++++++++++++++++++-
> 1 files changed, 20 insertions(+), 1 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index e15cc11..a99e658 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -685,6 +685,18 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
> 	return assigned_slave;
> }
>
>+struct slave *bond_slave_has_mac(struct bonding *bond, const u8 *mac)
>+{
>+	int i = 0;
>+	struct slave *tmp;
>+
>+	bond_for_each_slave(bond, tmp, i)
>+		if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
>+			return tmp;
>+
>+	return NULL;
>+}

	This should go in bonding.h as a static inline, so that other
parts of bonding can use it in the future.  I'll also point out that
this function is one I suggested in a prior email, but did not test.
Presumably you've tested this.
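
	A user-space sketch of what such a header-style helper could
look like. The toy_* types are hypothetical stand-ins, not the bonding
driver's structures, and a plain memcmp() replaces the kernel's address
comparison:

```c
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/* Hypothetical stand-in for a bond's slave list entry. */
struct toy_slave {
	unsigned char dev_addr[TOY_ETH_ALEN];
	struct toy_slave *next;
};

/* "static inline" in a header lets every file that includes it use the
 * helper without creating a separate external symbol. */
static inline struct toy_slave *
toy_slave_has_mac(struct toy_slave *head, const unsigned char *mac)
{
	struct toy_slave *s;

	for (s = head; s != NULL; s = s->next)
		if (memcmp(s->dev_addr, mac, TOY_ETH_ALEN) == 0)
			return s;

	return NULL;	/* no slave owns this MAC address */
}
```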

> /* chooses (and returns) transmit channel for arp reply
>  * does not choose channel for other arp types since they are
>  * sent on the curr_active_slave
>@@ -700,7 +712,14 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
> 		*/
> 		tx_slave = rlb_choose_channel(skb, bond);
> 		if (tx_slave) {
>-			memcpy(arp->mac_src,tx_slave->dev->dev_addr, ETH_ALEN);
>+			struct slave *tmp_slave = NULL;

	This does not need to be initialized, as the next line of code
assigns a value to it.

	-J

>+			/* Only modify ARP's MAC if it originates locally;
>+			 * don't change ARPs arriving via a bridge.
>+			 */
>+			tmp_slave = bond_slave_has_mac(bond, arp->mac_src);
>+			if (tmp_slave)
>+				memcpy(arp->mac_src, tx_slave->dev->dev_addr,
>+				       ETH_ALEN);
> 		}
> 		pr_debug("Server sent ARP Reply packet\n");
> 	} else if (arp->op_code == htons(ARPOP_REQUEST)) {
>-- 
>1.7.6.5

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

* [PATCH 1/3] ipvs: remove silly double assignment
From: pablo @ 2012-11-13  0:06 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1352765203-4160-1-git-send-email-pablo@netfilter.org>

From: Alan Cox <alan@linux.intel.com>

I don't even want to think what the C spec says for this 8)

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipvs/ip_vs_nfct.c |    2 +-
 net/netfilter/ipvs/ip_vs_xmit.c |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_nfct.c b/net/netfilter/ipvs/ip_vs_nfct.c
index 022e77e..c8beafd 100644
--- a/net/netfilter/ipvs/ip_vs_nfct.c
+++ b/net/netfilter/ipvs/ip_vs_nfct.c
@@ -82,7 +82,7 @@ void
 ip_vs_update_conntrack(struct sk_buff *skb, struct ip_vs_conn *cp, int outin)
 {
 	enum ip_conntrack_info ctinfo;
-	struct nf_conn *ct = ct = nf_ct_get(skb, &ctinfo);
+	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 	struct nf_conntrack_tuple new_tuple;
 
 	if (ct == NULL || nf_ct_is_confirmed(ct) || nf_ct_is_untracked(ct) ||
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 12008b4..ee6b7a9 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -594,7 +594,7 @@ ip_vs_nat_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	if (cp->flags & IP_VS_CONN_F_SYNC && local) {
 		enum ip_conntrack_info ctinfo;
-		struct nf_conn *ct = ct = nf_ct_get(skb, &ctinfo);
+		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 		if (ct && !nf_ct_is_untracked(ct)) {
 			IP_VS_DBG_RL_PKT(10, AF_INET, pp, skb, 0,
@@ -710,7 +710,7 @@ ip_vs_nat_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	if (cp->flags & IP_VS_CONN_F_SYNC && local) {
 		enum ip_conntrack_info ctinfo;
-		struct nf_conn *ct = ct = nf_ct_get(skb, &ctinfo);
+		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 		if (ct && !nf_ct_is_untracked(ct)) {
 			IP_VS_DBG_RL_PKT(10, AF_INET6, pp, skb, 0,
@@ -1235,7 +1235,7 @@ ip_vs_icmp_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	if (cp->flags & IP_VS_CONN_F_SYNC && local) {
 		enum ip_conntrack_info ctinfo;
-		struct nf_conn *ct = ct = nf_ct_get(skb, &ctinfo);
+		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 		if (ct && !nf_ct_is_untracked(ct)) {
 			IP_VS_DBG(10, "%s(): "
@@ -1356,7 +1356,7 @@ ip_vs_icmp_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	if (cp->flags & IP_VS_CONN_F_SYNC && local) {
 		enum ip_conntrack_info ctinfo;
-		struct nf_conn *ct = ct = nf_ct_get(skb, &ctinfo);
+		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 		if (ct && !nf_ct_is_untracked(ct)) {
 			IP_VS_DBG(10, "%s(): "
-- 
1.7.10.4


* [PATCH 3/3] netfilter: ipv6: add getsockopt to retrieve origdst
From: pablo @ 2012-11-13  0:06 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1352765203-4160-1-git-send-email-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

userspace can query the original ipv4 destination address of a REDIRECTed
connection via
getsockopt(m_sock, SOL_IP, SO_ORIGINAL_DST, &m_server_addr, &addrsize)

but for ipv6 no such option existed.

This adds getsockopt(..., IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, ...).

Without this, userspace needs to parse /proc or use ctnetlink, which
appears to be overkill.

This uses option number 80 for IP6T_SO_ORIGINAL_DST, which is spare,
to use the same number we use in the IPv4 socket option SO_ORIGINAL_DST.
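
A minimal userspace sketch of the new call, assuming a kernel with this
patch applied. query_origdst6() is a hypothetical helper name, and the
fallback #define only mirrors the value 80 chosen above for builds whose
headers do not yet carry the constant:

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP6T_SO_ORIGINAL_DST
#define IP6T_SO_ORIGINAL_DST 80	/* value taken from this patch */
#endif

/* Hypothetical helper: fills *dst with the pre-REDIRECT destination of a
 * connected IPv6 socket. Returns 0 on success, -1 with errno set. */
static int query_origdst6(int fd, struct sockaddr_in6 *dst)
{
	socklen_t len = sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	return getsockopt(fd, IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, dst, &len);
}
```

Going by the handler in this patch, a caller should expect errno to be
ENOENT when no conntrack entry matches and ENOPROTOOPT for sockets that
are neither TCP nor SCTP.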

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/in6.h                       |    1 +
 include/uapi/linux/netfilter_ipv6/ip6_tables.h |    3 ++
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   61 ++++++++++++++++++++++++
 3 files changed, 65 insertions(+)

diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index 1e31599..f79c372 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -240,6 +240,7 @@ struct in6_flowlabel_req {
  *
  * IP6T_SO_GET_REVISION_MATCH	68
  * IP6T_SO_GET_REVISION_TARGET	69
+ * IP6T_SO_ORIGINAL_DST		80
  */
 
 /* RFC5014: Source address selection */
diff --git a/include/uapi/linux/netfilter_ipv6/ip6_tables.h b/include/uapi/linux/netfilter_ipv6/ip6_tables.h
index bf1ef65..649c680 100644
--- a/include/uapi/linux/netfilter_ipv6/ip6_tables.h
+++ b/include/uapi/linux/netfilter_ipv6/ip6_tables.h
@@ -178,6 +178,9 @@ struct ip6t_error {
 #define IP6T_SO_GET_REVISION_TARGET	(IP6T_BASE_CTL + 5)
 #define IP6T_SO_GET_MAX			IP6T_SO_GET_REVISION_TARGET
 
+/* obtain original address if REDIRECT'd connection */
+#define IP6T_SO_ORIGINAL_DST            80
+
 /* ICMP matching stuff */
 struct ip6t_icmp {
 	__u8 type;				/* type to match */
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index 8860d23..02dcafd 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -21,6 +21,7 @@
 
 #include <linux/netfilter_bridge.h>
 #include <linux/netfilter_ipv6.h>
+#include <linux/netfilter_ipv6/ip6_tables.h>
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_helper.h>
 #include <net/netfilter/nf_conntrack_l4proto.h>
@@ -295,6 +296,50 @@ static struct nf_hook_ops ipv6_conntrack_ops[] __read_mostly = {
 	},
 };
 
+static int
+ipv6_getorigdst(struct sock *sk, int optval, void __user *user, int *len)
+{
+	const struct inet_sock *inet = inet_sk(sk);
+	const struct ipv6_pinfo *inet6 = inet6_sk(sk);
+	const struct nf_conntrack_tuple_hash *h;
+	struct sockaddr_in6 sin6;
+	struct nf_conntrack_tuple tuple = { .src.l3num = NFPROTO_IPV6 };
+	struct nf_conn *ct;
+
+	tuple.src.u3.in6 = inet6->rcv_saddr;
+	tuple.src.u.tcp.port = inet->inet_sport;
+	tuple.dst.u3.in6 = inet6->daddr;
+	tuple.dst.u.tcp.port = inet->inet_dport;
+	tuple.dst.protonum = sk->sk_protocol;
+
+	if (sk->sk_protocol != IPPROTO_TCP && sk->sk_protocol != IPPROTO_SCTP)
+		return -ENOPROTOOPT;
+
+	if (*len < 0 || (unsigned int) *len < sizeof(sin6))
+		return -EINVAL;
+
+	h = nf_conntrack_find_get(sock_net(sk), NF_CT_DEFAULT_ZONE, &tuple);
+	if (!h) {
+		pr_debug("IP6T_SO_ORIGINAL_DST: Can't find %pI6c/%u-%pI6c/%u.\n",
+			 &tuple.src.u3.ip6, ntohs(tuple.src.u.tcp.port),
+			 &tuple.dst.u3.ip6, ntohs(tuple.dst.u.tcp.port));
+		return -ENOENT;
+	}
+
+	ct = nf_ct_tuplehash_to_ctrack(h);
+
+	sin6.sin6_family = AF_INET6;
+	sin6.sin6_port = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.tcp.port;
+	sin6.sin6_flowinfo = inet6->flow_label & IPV6_FLOWINFO_MASK;
+	memcpy(&sin6.sin6_addr,
+		&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.in6,
+					sizeof(sin6.sin6_addr));
+	sin6.sin6_scope_id = sk->sk_bound_dev_if;
+
+	nf_ct_put(ct);
+	return copy_to_user(user, &sin6, sizeof(sin6)) ? -EFAULT : 0;
+}
+
 #if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
 
 #include <linux/netfilter/nfnetlink.h>
@@ -359,6 +404,14 @@ MODULE_ALIAS("nf_conntrack-" __stringify(AF_INET6));
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Yasuyuki KOZAKAI @USAGI <yasuyuki.kozakai@toshiba.co.jp>");
 
+static struct nf_sockopt_ops so_getorigdst6 = {
+	.pf		= NFPROTO_IPV6,
+	.get_optmin	= IP6T_SO_ORIGINAL_DST,
+	.get_optmax	= IP6T_SO_ORIGINAL_DST + 1,
+	.get		= ipv6_getorigdst,
+	.owner		= THIS_MODULE,
+};
+
 static int ipv6_net_init(struct net *net)
 {
 	int ret = 0;
@@ -425,6 +478,12 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
 	need_conntrack();
 	nf_defrag_ipv6_enable();
 
+	ret = nf_register_sockopt(&so_getorigdst6);
+	if (ret < 0) {
+		pr_err("Unable to register netfilter socket option\n");
+		return ret;
+	}
+
 	ret = register_pernet_subsys(&ipv6_net_ops);
 	if (ret < 0)
 		goto cleanup_pernet;
@@ -440,6 +499,7 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
  cleanup_ipv6:
 	unregister_pernet_subsys(&ipv6_net_ops);
  cleanup_pernet:
+	nf_unregister_sockopt(&so_getorigdst6);
 	return ret;
 }
 
@@ -448,6 +508,7 @@ static void __exit nf_conntrack_l3proto_ipv6_fini(void)
 	synchronize_net();
 	nf_unregister_hooks(ipv6_conntrack_ops, ARRAY_SIZE(ipv6_conntrack_ops));
 	unregister_pernet_subsys(&ipv6_net_ops);
+	nf_unregister_sockopt(&so_getorigdst6);
 }
 
 module_init(nf_conntrack_l3proto_ipv6_init);
-- 
1.7.10.4


* [PATCH 0/3] netfilter updates for net-next
From: pablo @ 2012-11-13  0:06 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>

Hi David,

The following three patches contain updates for your net-next tree,
they include:

* A little cleanup for IPVS to remove a strange notation used to assign
  the conntrack object, from Alan Cox.

* getsockopt support to obtain the original IPv6 address after NAT,
  similar to the one that IPv4 provides, from Florian Westphal.

* Another little cleanup for nf_nat to save a couple of lines by using
  PTR_RET, from Wu Fengguang.

You can pull these changes from:

git://1984.lsi.us.es/nf-next master

Thanks!

Alan Cox (1):
  ipvs: remove silly double assignment

Florian Westphal (1):
  netfilter: ipv6: add getsockopt to retrieve origdst

Wu Fengguang (1):
  netfilter: nf_nat: use PTR_RET

 include/uapi/linux/in6.h                       |    1 +
 include/uapi/linux/netfilter_ipv6/ip6_tables.h |    3 ++
 net/ipv4/netfilter/iptable_nat.c               |    4 +-
 net/ipv6/netfilter/ip6table_nat.c              |    4 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   61 ++++++++++++++++++++++++
 net/netfilter/ipvs/ip_vs_nfct.c                |    2 +-
 net/netfilter/ipvs/ip_vs_xmit.c                |    8 ++--
 7 files changed, 72 insertions(+), 11 deletions(-)

-- 
1.7.10.4

* [PATCH 2/3] netfilter: nf_nat: use PTR_RET
From: pablo @ 2012-11-13  0:06 UTC
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1352765203-4160-1-git-send-email-pablo@netfilter.org>

From: Wu Fengguang <fengguang.wu@intel.com>

Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR
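
For readers unfamiliar with the idiom, here is a user-space re-creation of
the error-pointer helpers, as a sketch only: the toy_* names are
hypothetical and merely mirror the shape of the kernel's ERR_PTR, IS_ERR,
PTR_ERR and PTR_RET.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* A pointer in the top TOY_MAX_ERRNO addresses is an encoded error. */
static inline int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Decode the errno value back out of an error pointer. */
static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* The one-line form: the error code, or 0 if the pointer is valid. */
static inline int toy_ptr_ret(const void *p)
{
	return toy_is_err(p) ? (int)toy_ptr_err(p) : 0;
}
```

toy_ptr_ret() folds the three-line "if error, return code, else return 0"
sequence into a single return statement, which is the rewrite applied in
the diff below.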

Generated by: coccinelle/api/ptr_ret.cocci

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/iptable_nat.c  |    4 +---
 net/ipv6/netfilter/ip6table_nat.c |    4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c
index 9e0ffaf..8d65b74 100644
--- a/net/ipv4/netfilter/iptable_nat.c
+++ b/net/ipv4/netfilter/iptable_nat.c
@@ -274,9 +274,7 @@ static int __net_init iptable_nat_net_init(struct net *net)
 		return -ENOMEM;
 	net->ipv4.nat_table = ipt_register_table(net, &nf_nat_ipv4_table, repl);
 	kfree(repl);
-	if (IS_ERR(net->ipv4.nat_table))
-		return PTR_ERR(net->ipv4.nat_table);
-	return 0;
+	return PTR_RET(net->ipv4.nat_table);
 }
 
 static void __net_exit iptable_nat_net_exit(struct net *net)
diff --git a/net/ipv6/netfilter/ip6table_nat.c b/net/ipv6/netfilter/ip6table_nat.c
index e418bd6..4c8219e 100644
--- a/net/ipv6/netfilter/ip6table_nat.c
+++ b/net/ipv6/netfilter/ip6table_nat.c
@@ -275,9 +275,7 @@ static int __net_init ip6table_nat_net_init(struct net *net)
 		return -ENOMEM;
 	net->ipv6.ip6table_nat = ip6t_register_table(net, &nf_nat_ipv6_table, repl);
 	kfree(repl);
-	if (IS_ERR(net->ipv6.ip6table_nat))
-		return PTR_ERR(net->ipv6.ip6table_nat);
-	return 0;
+	return PTR_RET(net->ipv6.ip6table_nat);
 }
 
 static void __net_exit ip6table_nat_net_exit(struct net *net)
-- 
1.7.10.4

* Re: How to compile tcp fast open client app with kernel tfo client support ?
From: Vincent Li @ 2012-11-13  0:33 UTC
  To: Vijay Subramanian; +Cc: Eric Dumazet, Yuchung Cheng, netdev
In-Reply-To: <CAGK4HS_SpdECk_KCu98Z1X9W02G9j9u7QE9B6qHEnTXLeZbqgA@mail.gmail.com>

On Fri, Nov 9, 2012 at 12:54 PM, Vijay Subramanian
<subramanian.vijay@gmail.com> wrote:
> On 9 November 2012 12:27, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Fri, 2012-11-09 at 12:17 -0800, Vincent Li wrote:
>>
>>> I have tried to run fresh build net-next, for some reason the build
>>> stops ssh service from starting up, tried 3.7.0-rc4 too, same issue. I
>>> don't know if it is new kernel bug or ubuntu upstart bug, but that is
>>> another issue.
>>>
>>
>> Not sure Ubuntu 10.04 binaries can run a 3.7 kernel.
>>
>> Some bug fixes in kernel need bug fixes in userland.
>>
>> You could try a bisection ?
>>
>  I saw the same issue with Ubuntu starting around 3.7.0-rc1 but 3.6
> was fine if I recall. (I did a partial bisection but was not sure it
> was a kernel issue)
>  sshd would not start up on boot and I 'fixed' it by adding the
> following in /etc/rc.local since I assumed it was a userspace issue.
>
> if [ ! -d /var/run/sshd ]; then
>    mkdir /var/run/sshd
>    chmod 0755 /var/run/sshd
> fi
> /usr/sbin/sshd
>
>
> Thanks,
> Vijay

Hi Vijay,

Just FYI, I filed a bug report in Ubuntu. It appears that a kernel
commit breaks user-space code that still uses the deprecated oom_adj
kernel knob.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1077248

commit 01dc52ebdf472f77cca623ca693ca24cfc0f1bbe
Author: Davidlohr Bueso <dave@gnu.org>
Date: Mon Oct 8 16:29:30 2012 -0700

    oom: remove deprecated oom_adj

    The deprecated /proc/<pid>/oom_adj is scheduled for removal this month.

* Re: How to compile tcp fast open client app with kernel tfo client support ?
From: Vijay Subramanian @ 2012-11-13  1:13 UTC
  To: Vincent Li; +Cc: Eric Dumazet, Yuchung Cheng, netdev
In-Reply-To: <CAK3+h2y+Kctdfk6+q5UB56qemdEVY4Fw3AbqZwycL-uDWzD=Lg@mail.gmail.com>

> Hi Vijay,
>
> Just FYI, I filed a bug report in Ubuntu. It appears that a kernel
> commit breaks user-space code that still uses the deprecated oom_adj
> kernel knob.
>

Vincent,
Thanks for following up and finding the exact cause!!

Vijay

* [PATCH 0/9 v4] use efficient this_cpu_* helper
From: Shan Wei @ 2012-11-13  1:51 UTC
  To: cl, David Miller, NetDev, Kernel-Maillist, Shan Wei

this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id())
and can reduce memory accesses.
The latter helper needs to look up the offset for the current CPU,
and needs more assembler instructions, as the objdump output below shows.

this_cpu_ptr relocates an address. this_cpu_read() relocates the address
and performs the fetch. If you want to operate on rda (defined as per_cpu),
you can only use this_cpu_ptr. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

per_cpu_ptr(p, smp_processor_id()):
  1e:   65 8b 04 25 00 00 00 00         mov    %gs:0x0,%eax
  26:   48 98                           cltq
  28:   31 f6                           xor    %esi,%esi
  2a:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
  31:   48 8b 04 c5 00 00 00 00         mov    0x0(,%rax,8),%rax
  39:   c7 44 10 04 14 00 00 00         movl   $0x14,0x4(%rax,%rdx,1)

this_cpu_ptr(p)
  1e:   65 48 03 14 25 00 00 00 00      add    %gs:0x0,%rdx
  27:   31 f6                           xor    %esi,%esi
  29:   c7 42 04 14 00 00 00            movl   $0x14,0x4(%rdx)
  30:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
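
The difference between the two listings can be modelled in user space, as
a rough sketch only: a thread-local variable stands in for the %gs-based
per-CPU offset, and every toy_* name is hypothetical rather than a kernel
macro.

```c
#include <stddef.h>

#define TOY_NR_CPUS 4

static char toy_area[TOY_NR_CPUS][64];		/* one data area per CPU */
static size_t toy_offset[TOY_NR_CPUS];		/* plays the offset table */
static _Thread_local int toy_cpu;		/* plays smp_processor_id() */
static _Thread_local size_t toy_this_off;	/* plays the %gs base */

/* Pretend the calling thread now runs on the given CPU. */
static void toy_set_cpu(int cpu)
{
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++)
		toy_offset[i] = (size_t)i * sizeof(toy_area[0]);
	toy_cpu = cpu;
	toy_this_off = toy_offset[cpu];
}

/* Like per_cpu_ptr(p, smp_processor_id()): the CPU number is read first,
 * then used to index the offset table, then the offset is added. */
static void *toy_per_cpu_ptr(size_t var_off, int cpu)
{
	return (char *)toy_area + toy_offset[cpu] + var_off;
}

/* Like this_cpu_ptr(p): a single add against the already-known offset. */
static void *toy_this_cpu_ptr(size_t var_off)
{
	return (char *)toy_area + toy_this_off + var_off;
}
```

Both functions return the same address for the current CPU. The second
simply skips the table lookup, which corresponds to the two mov
instructions that disappear in the this_cpu_ptr listing.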



Changelog V4:
1. Read and write fields of struct rds_ib_cache_head using __this_cpu_*
   operations in the rds subsystem. See patch2.
2. Fix a bug in xfrm when reading a pointer. See patch3.
3. Avoid a type cast in patch7.

Changelog V3:
1. Use this_cpu_read to read a member of a per-cpu variable directly,
   dropping the this_cpu_ptr operation.
2. For the preemption-off and bottom-halves-off cases,
   use __this_cpu_read instead of this_cpu_read.

Changelog V2:
1. Use this_cpu_read directly instead of a reference to a field of a per-cpu variable.
2. Patch5 about ftrace is dropped from this series.
3. Add new patch9 to replace get_cpu;per_cpu_ptr;put_cpu with a this_cpu_add operation.
4. For the preemption-disabled case, use __this_cpu_read instead.
  

$ git diff --stat d4185bbf62a5d8d777ee445db1581beb17882a07
 drivers/clocksource/arm_generic.c |    2 +-
 kernel/padata.c                   |    5 ++---
 kernel/rcutree.c                  |    2 +-
 kernel/trace/blktrace.c           |    2 +-
 kernel/trace/trace.c              |    5 +----
 net/batman-adv/main.h             |    4 +---
 net/core/flow.c                   |    4 +---
 net/openvswitch/datapath.c        |    4 ++--
 net/openvswitch/vport.c           |    5 ++---
 net/rds/ib.h                      |    2 +-
 net/rds/ib_recv.c                 |   24 +++++++++++++-----------
 net/xfrm/xfrm_ipcomp.c            |    8 +++-----
 12 files changed, 29 insertions(+), 38 deletions(-)

^ permalink raw reply

* [PATCH v4 1/9] net: core: use this_cpu_ptr per-cpu helper
From: Shan Wei @ 2012-11-13  1:51 UTC (permalink / raw)
  To: David Miller, timo.teras, steffen.klassert, NetDev,
	Kernel-Maillist, cl, Shan Wei

From: Shan Wei <davidshan@tencent.com>

flush_tasklet is a struct, not a pointer, in the percpu variable,
so use this_cpu_ptr to get a pointer to the member.

Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
---
no changes vs v3.
---
 net/core/flow.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index e318c7e..b0901ee 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -327,11 +327,9 @@ static void flow_cache_flush_tasklet(unsigned long data)
 static void flow_cache_flush_per_cpu(void *data)
 {
 	struct flow_flush_info *info = data;
-	int cpu;
 	struct tasklet_struct *tasklet;
 
-	cpu = smp_processor_id();
-	tasklet = &per_cpu_ptr(info->cache->percpu, cpu)->flush_tasklet;
+	tasklet = this_cpu_ptr(&info->cache->percpu->flush_tasklet);
 	tasklet->data = (unsigned long)info;
 	tasklet_schedule(tasklet);
 }
-- 
1.7.1

^ permalink raw reply related

* [PATCH v4 2/9] net: rds: use this_cpu_* per-cpu helper
From: Shan Wei @ 2012-11-13  1:52 UTC (permalink / raw)
  To: venkat.x.venkatsubra, David Miller, rds-devel, NetDev,
	Kernel-Maillist, cl, Shan Wei

From: Shan Wei <davidshan@tencent.com>


Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
---
v4:
1. add missing __percpu annotations.
2. read and write fields of struct rds_ib_cache_head
using __this_cpu_* operations; drop per_cpu_ptr.
---
 net/rds/ib.h      |    2 +-
 net/rds/ib_recv.c |   24 +++++++++++++-----------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 8d2b3d5..7280ab8 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -50,7 +50,7 @@ struct rds_ib_cache_head {
 };
 
 struct rds_ib_refill_cache {
-	struct rds_ib_cache_head *percpu;
+	struct rds_ib_cache_head __percpu *percpu;
 	struct list_head	 *xfer;
 	struct list_head	 *ready;
 };
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 8d19491..8c5bc85 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -418,20 +418,21 @@ static void rds_ib_recv_cache_put(struct list_head *new_item,
 				 struct rds_ib_refill_cache *cache)
 {
 	unsigned long flags;
-	struct rds_ib_cache_head *chp;
 	struct list_head *old;
+	struct list_head __percpu *chpfirst;
 
 	local_irq_save(flags);
 
-	chp = per_cpu_ptr(cache->percpu, smp_processor_id());
-	if (!chp->first)
+	chpfirst = __this_cpu_read(cache->percpu->first);
+	if (!chpfirst)
 		INIT_LIST_HEAD(new_item);
 	else /* put on front */
-		list_add_tail(new_item, chp->first);
-	chp->first = new_item;
-	chp->count++;
+		list_add_tail(new_item, chpfirst);
 
-	if (chp->count < RDS_IB_RECYCLE_BATCH_COUNT)
+	__this_cpu_write(chpfirst, new_item);
+	__this_cpu_inc(cache->percpu->count);
+
+	if (__this_cpu_read(cache->percpu->count) < RDS_IB_RECYCLE_BATCH_COUNT)
 		goto end;
 
 	/*
@@ -443,12 +444,13 @@ static void rds_ib_recv_cache_put(struct list_head *new_item,
 	do {
 		old = xchg(&cache->xfer, NULL);
 		if (old)
-			list_splice_entire_tail(old, chp->first);
-		old = cmpxchg(&cache->xfer, NULL, chp->first);
+			list_splice_entire_tail(old, chpfirst);
+		old = cmpxchg(&cache->xfer, NULL, chpfirst);
 	} while (old);
 
-	chp->first = NULL;
-	chp->count = 0;
+
+	__this_cpu_write(chpfirst, NULL);
+	__this_cpu_write(cache->percpu->count, 0);
 end:
 	local_irq_restore(flags);
 }
-- 
1.7.1

^ permalink raw reply related

* [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper
From: Shan Wei @ 2012-11-13  1:52 UTC (permalink / raw)
  To: steffen.klassert, David Miller, NetDev, Herbert Xu,
	Kernel-Maillist, cl, Shan Wei

From: Shan Wei <davidshan@tencent.com>


Signed-off-by: Shan Wei <davidshan@tencent.com>
---
v4:
  dereference pointer before reading to avoid compile warning.
---
 net/xfrm/xfrm_ipcomp.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
 	struct crypto_comp * __percpu *tfms;
 	int cpu;
 
-	/* This can be any valid CPU ID so we don't need locking. */
-	cpu = raw_smp_processor_id();
 
 	list_for_each_entry(pos, &ipcomp_tfms_list, list) {
 		struct crypto_comp *tfm;
 
-		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		/* This can be any valid CPU ID so we don't need locking. */
+		tfm = __this_cpu_read(*pos->tfms);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
-			return tfms;
+			return pos->tfms;
 		}
 	}
 
-- 
1.7.1

^ permalink raw reply related

* [PATCH v4 4/9] net: openvswitch: use this_cpu_ptr per-cpu helper
From: Shan Wei @ 2012-11-13  1:52 UTC (permalink / raw)
  To: jesse-l0M0P4e3n4LQT0dZR+AlfA, dev-yBygre7rU0TnMu66kgdUjQ, NetDev,
	Kernel-Maillist, David Miller,
	cl-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, Shan Wei

From: Shan Wei <davidshan-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>

Use the faster this_cpu_ptr instead of per_cpu_ptr(p, smp_processor_id()).


Signed-off-by: Shan Wei <davidshan-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
Reviewed-by: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
---
no changes vs v3,v2.
---
 net/openvswitch/datapath.c |    4 ++--
 net/openvswitch/vport.c    |    5 ++---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 4c4b62c..77d16a5 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -208,7 +208,7 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb)
 	int error;
 	int key_len;
 
-	stats = per_cpu_ptr(dp->stats_percpu, smp_processor_id());
+	stats = this_cpu_ptr(dp->stats_percpu);
 
 	/* Extract flow from 'skb' into 'key'. */
 	error = ovs_flow_extract(skb, p->port_no, &key, &key_len);
@@ -282,7 +282,7 @@ int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb,
 	return 0;
 
 err:
-	stats = per_cpu_ptr(dp->stats_percpu, smp_processor_id());
+	stats = this_cpu_ptr(dp->stats_percpu);
 
 	u64_stats_update_begin(&stats->sync);
 	stats->n_lost++;
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 03779e8..70af0be 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -333,8 +333,7 @@ void ovs_vport_receive(struct vport *vport, struct sk_buff *skb)
 {
 	struct vport_percpu_stats *stats;
 
-	stats = per_cpu_ptr(vport->percpu_stats, smp_processor_id());
-
+	stats = this_cpu_ptr(vport->percpu_stats);
 	u64_stats_update_begin(&stats->sync);
 	stats->rx_packets++;
 	stats->rx_bytes += skb->len;
@@ -359,7 +358,7 @@ int ovs_vport_send(struct vport *vport, struct sk_buff *skb)
 	if (likely(sent)) {
 		struct vport_percpu_stats *stats;
 
-		stats = per_cpu_ptr(vport->percpu_stats, smp_processor_id());
+		stats = this_cpu_ptr(vport->percpu_stats);
 
 		u64_stats_update_begin(&stats->sync);
 		stats->tx_packets++;
-- 
1.7.1

^ permalink raw reply related

* [PATCH v4 9/9] net: batman-adv: use per_cpu_add helper
From: Shan Wei @ 2012-11-13  1:53 UTC (permalink / raw)
  To: lindner_marek-LWAfsSFWpa4, siwu-MaAgPAbsBIVS8oHt8HbXEIQuADTiUCJX,
	ordex-GaUfNO9RBHfsrOwW+9ziJQ,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r, David Miller,
	NetDev, Kernel-Maillist, Shan Wei, Christoph Lameter

From: Shan Wei <davidshan-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>

this_cpu_add is an atomic operation
and is faster than the per_cpu_ptr operation.

Signed-off-by: Shan Wei <davidshan-1Nz4purKYjRBDgjK7y7TUQ@public.gmane.org>
Reviewed-by: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
---
v4: no changes vs v3.
---
 net/batman-adv/main.h |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 897ba6a..3aef5b2 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -263,9 +263,7 @@ static inline bool batadv_has_timed_out(unsigned long timestamp,
 static inline void batadv_add_counter(struct batadv_priv *bat_priv, size_t idx,
 				      size_t count)
 {
-	int cpu = get_cpu();
-	per_cpu_ptr(bat_priv->bat_counters, cpu)[idx] += count;
-	put_cpu();
+	this_cpu_add(bat_priv->bat_counters[idx], count);
 }
 
 #define batadv_inc_counter(b, i) batadv_add_counter(b, i, 1)
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH 3/3] netfilter: ipv6: add getsockopt to retrieve origdst
From: YOSHIFUJI Hideaki @ 2012-11-13  3:29 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, davem, netdev, YOSHIFUJI Hideaki
In-Reply-To: <1352765203-4160-4-git-send-email-pablo@netfilter.org>

pablo@netfilter.org wrote:
> From: Florian Westphal <fw@strlen.de>
> +static int
> +ipv6_getorigdst(struct sock *sk, int optval, void __user *user, int *len)
> +{
> +	const struct inet_sock *inet = inet_sk(sk);
> +	const struct ipv6_pinfo *inet6 = inet6_sk(sk);
> +	const struct nf_conntrack_tuple_hash *h;
> +	struct sockaddr_in6 sin6;
> +	struct nf_conntrack_tuple tuple = { .src.l3num = NFPROTO_IPV6 };
> +	struct nf_conn *ct;
> +
> +	tuple.src.u3.in6 = inet6->rcv_saddr;
> +	tuple.src.u.tcp.port = inet->inet_sport;
> +	tuple.dst.u3.in6 = inet6->daddr;
> +	tuple.dst.u.tcp.port = inet->inet_dport;
> +	tuple.dst.protonum = sk->sk_protocol;
> +
> +	if (sk->sk_protocol != IPPROTO_TCP && sk->sk_protocol != IPPROTO_SCTP)
> +		return -ENOPROTOOPT;
> +
> +	if (*len < 0 || (unsigned int) *len < sizeof(sin6))
> +		return -EINVAL;
> +
> +	h = nf_conntrack_find_get(sock_net(sk), NF_CT_DEFAULT_ZONE, &tuple);
> +	if (!h) {
> +		pr_debug("IP6T_SO_ORIGINAL_DST: Can't find %pI6c/%u-%pI6c/%u.\n",
> +			 &tuple.src.u3.ip6, ntohs(tuple.src.u.tcp.port),
> +			 &tuple.dst.u3.ip6, ntohs(tuple.dst.u.tcp.port));
> +		return -ENOENT;
> +	}
> +
> +	ct = nf_ct_tuplehash_to_ctrack(h);
> +
> +	sin6.sin6_family = AF_INET6;
> +	sin6.sin6_port = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.tcp.port;
> +	sin6.sin6_flowinfo = inet6->flow_label & IPV6_FLOWINFO_MASK;
> +	memcpy(&sin6.sin6_addr,
> +		&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.in6,
> +					sizeof(sin6.sin6_addr));
> +	sin6.sin6_scope_id = sk->sk_bound_dev_if;
> +
> +	nf_ct_put(ct);
> +	return copy_to_user(user, &sin6, sizeof(sin6)) ? -EFAULT : 0;
> +}
> +

I think we should set sin6_scope_id to sk->sk_bound_dev_if only if the
destination is a link-local address.

--yoshfuji

^ permalink raw reply

* linux-next: manual merge of the arm-soc tree with the net-next tree
From: Stephen Rothwell @ 2012-11-13  4:15 UTC (permalink / raw)
  To: Olof Johansson, Arnd Bergmann, linux-arm-kernel
  Cc: linux-next, linux-kernel, Jean-Christophe PLAGNIOL-VILLARD,
	David Miller, netdev, Joachim Eastwood

[-- Attachment #1: Type: text/plain, Size: 26031 bytes --]

Hi all,

Today's linux-next merge of the arm-soc tree got a conflict in
drivers/net/ethernet/cadence/at91_ether.c between various commits from
the net-next tree and commit bcd2360c1ff9 ("arm: at91: move platfarm_data
to include/linux/platform_data/atmel.h") from the arm-soc tree.

I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc drivers/net/ethernet/cadence/at91_ether.c
index e7a476c,35fc6edb..0000000
--- a/drivers/net/ethernet/cadence/at91_ether.c
+++ b/drivers/net/ethernet/cadence/at91_ether.c
@@@ -25,53 -31,728 +25,54 @@@
  #include <linux/clk.h>
  #include <linux/gfp.h>
  #include <linux/phy.h>
 +#include <linux/io.h>
 +#include <linux/of.h>
 +#include <linux/of_device.h>
 +#include <linux/of_net.h>
 +#include <linux/pinctrl/consumer.h>
+ #include <linux/platform_data/atmel.h>
  
 -#include <asm/io.h>
 -#include <asm/uaccess.h>
 -#include <asm/mach-types.h>
 -
 -#include <mach/at91rm9200_emac.h>
 -#include <asm/gpio.h>
 -
 -#include "at91_ether.h"
 -
 -#define DRV_NAME	"at91_ether"
 -#define DRV_VERSION	"1.0"
 -
 -#define LINK_POLL_INTERVAL	(HZ)
 -
 -/* ..................................................................... */
 -
 -/*
 - * Read from a EMAC register.
 - */
 -static inline unsigned long at91_emac_read(struct at91_private *lp, unsigned int reg)
 -{
 -	return __raw_readl(lp->emac_base + reg);
 -}
 -
 -/*
 - * Write to a EMAC register.
 - */
 -static inline void at91_emac_write(struct at91_private *lp, unsigned int reg, unsigned long value)
 -{
 -	__raw_writel(value, lp->emac_base + reg);
 -}
 -
 -/* ........................... PHY INTERFACE ........................... */
 -
 -/*
 - * Enable the MDIO bit in MAC control register
 - * When not called from an interrupt-handler, access to the PHY must be
 - *  protected by a spinlock.
 - */
 -static void enable_mdi(struct at91_private *lp)
 -{
 -	unsigned long ctl;
 -
 -	ctl = at91_emac_read(lp, AT91_EMAC_CTL);
 -	at91_emac_write(lp, AT91_EMAC_CTL, ctl | AT91_EMAC_MPE);	/* enable management port */
 -}
 -
 -/*
 - * Disable the MDIO bit in the MAC control register
 - */
 -static void disable_mdi(struct at91_private *lp)
 -{
 -	unsigned long ctl;
 -
 -	ctl = at91_emac_read(lp, AT91_EMAC_CTL);
 -	at91_emac_write(lp, AT91_EMAC_CTL, ctl & ~AT91_EMAC_MPE);	/* disable management port */
 -}
 -
 -/*
 - * Wait until the PHY operation is complete.
 - */
 -static inline void at91_phy_wait(struct at91_private *lp)
 -{
 -	unsigned long timeout = jiffies + 2;
 -
 -	while (!(at91_emac_read(lp, AT91_EMAC_SR) & AT91_EMAC_SR_IDLE)) {
 -		if (time_after(jiffies, timeout)) {
 -			printk("at91_ether: MIO timeout\n");
 -			break;
 -		}
 -		cpu_relax();
 -	}
 -}
 -
 -/*
 - * Write value to the a PHY register
 - * Note: MDI interface is assumed to already have been enabled.
 - */
 -static void write_phy(struct at91_private *lp, unsigned char phy_addr, unsigned char address, unsigned int value)
 -{
 -	at91_emac_write(lp, AT91_EMAC_MAN, AT91_EMAC_MAN_802_3 | AT91_EMAC_RW_W
 -		| ((phy_addr & 0x1f) << 23) | (address << 18) | (value & AT91_EMAC_DATA));
 -
 -	/* Wait until IDLE bit in Network Status register is cleared */
 -	at91_phy_wait(lp);
 -}
 -
 -/*
 - * Read value stored in a PHY register.
 - * Note: MDI interface is assumed to already have been enabled.
 - */
 -static void read_phy(struct at91_private *lp, unsigned char phy_addr, unsigned char address, unsigned int *value)
 -{
 -	at91_emac_write(lp, AT91_EMAC_MAN, AT91_EMAC_MAN_802_3 | AT91_EMAC_RW_R
 -		| ((phy_addr & 0x1f) << 23) | (address << 18));
 -
 -	/* Wait until IDLE bit in Network Status register is cleared */
 -	at91_phy_wait(lp);
 -
 -	*value = at91_emac_read(lp, AT91_EMAC_MAN) & AT91_EMAC_DATA;
 -}
 -
 -/* ........................... PHY MANAGEMENT .......................... */
 -
 -/*
 - * Access the PHY to determine the current link speed and mode, and update the
 - * MAC accordingly.
 - * If no link or auto-negotiation is busy, then no changes are made.
 - */
 -static void update_linkspeed(struct net_device *dev, int silent)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int bmsr, bmcr, lpa, mac_cfg;
 -	unsigned int speed, duplex;
 -
 -	if (!mii_link_ok(&lp->mii)) {		/* no link */
 -		netif_carrier_off(dev);
 -		if (!silent)
 -			printk(KERN_INFO "%s: Link down.\n", dev->name);
 -		return;
 -	}
 -
 -	/* Link up, or auto-negotiation still in progress */
 -	read_phy(lp, lp->phy_address, MII_BMSR, &bmsr);
 -	read_phy(lp, lp->phy_address, MII_BMCR, &bmcr);
 -	if (bmcr & BMCR_ANENABLE) {				/* AutoNegotiation is enabled */
 -		if (!(bmsr & BMSR_ANEGCOMPLETE))
 -			return;			/* Do nothing - another interrupt generated when negotiation complete */
 -
 -		read_phy(lp, lp->phy_address, MII_LPA, &lpa);
 -		if ((lpa & LPA_100FULL) || (lpa & LPA_100HALF)) speed = SPEED_100;
 -		else speed = SPEED_10;
 -		if ((lpa & LPA_100FULL) || (lpa & LPA_10FULL)) duplex = DUPLEX_FULL;
 -		else duplex = DUPLEX_HALF;
 -	} else {
 -		speed = (bmcr & BMCR_SPEED100) ? SPEED_100 : SPEED_10;
 -		duplex = (bmcr & BMCR_FULLDPLX) ? DUPLEX_FULL : DUPLEX_HALF;
 -	}
 -
 -	/* Update the MAC */
 -	mac_cfg = at91_emac_read(lp, AT91_EMAC_CFG) & ~(AT91_EMAC_SPD | AT91_EMAC_FD);
 -	if (speed == SPEED_100) {
 -		if (duplex == DUPLEX_FULL)		/* 100 Full Duplex */
 -			mac_cfg |= AT91_EMAC_SPD | AT91_EMAC_FD;
 -		else					/* 100 Half Duplex */
 -			mac_cfg |= AT91_EMAC_SPD;
 -	} else {
 -		if (duplex == DUPLEX_FULL)		/* 10 Full Duplex */
 -			mac_cfg |= AT91_EMAC_FD;
 -		else {}					/* 10 Half Duplex */
 -	}
 -	at91_emac_write(lp, AT91_EMAC_CFG, mac_cfg);
 -
 -	if (!silent)
 -		printk(KERN_INFO "%s: Link now %i-%s\n", dev->name, speed, (duplex == DUPLEX_FULL) ? "FullDuplex" : "HalfDuplex");
 -	netif_carrier_on(dev);
 -}
 -
 -/*
 - * Handle interrupts from the PHY
 - */
 -static irqreturn_t at91ether_phy_interrupt(int irq, void *dev_id)
 -{
 -	struct net_device *dev = (struct net_device *) dev_id;
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int phy;
 -
 -	/*
 -	 * This hander is triggered on both edges, but the PHY chips expect
 -	 * level-triggering.  We therefore have to check if the PHY actually has
 -	 * an IRQ pending.
 -	 */
 -	enable_mdi(lp);
 -	if ((lp->phy_type == MII_DM9161_ID) || (lp->phy_type == MII_DM9161A_ID)) {
 -		read_phy(lp, lp->phy_address, MII_DSINTR_REG, &phy);	/* ack interrupt in Davicom PHY */
 -		if (!(phy & (1 << 0)))
 -			goto done;
 -	}
 -	else if (lp->phy_type == MII_LXT971A_ID) {
 -		read_phy(lp, lp->phy_address, MII_ISINTS_REG, &phy);	/* ack interrupt in Intel PHY */
 -		if (!(phy & (1 << 2)))
 -			goto done;
 -	}
 -	else if (lp->phy_type == MII_BCM5221_ID) {
 -		read_phy(lp, lp->phy_address, MII_BCMINTR_REG, &phy);	/* ack interrupt in Broadcom PHY */
 -		if (!(phy & (1 << 0)))
 -			goto done;
 -	}
 -	else if (lp->phy_type == MII_KS8721_ID) {
 -		read_phy(lp, lp->phy_address, MII_TPISTATUS, &phy);		/* ack interrupt in Micrel PHY */
 -		if (!(phy & ((1 << 2) | 1)))
 -			goto done;
 -	}
 -	else if (lp->phy_type == MII_T78Q21x3_ID) {					/* ack interrupt in Teridian PHY */
 -		read_phy(lp, lp->phy_address, MII_T78Q21INT_REG, &phy);
 -		if (!(phy & ((1 << 2) | 1)))
 -			goto done;
 -	}
 -	else if (lp->phy_type == MII_DP83848_ID) {
 -		read_phy(lp, lp->phy_address, MII_DPPHYSTS_REG, &phy);	/* ack interrupt in DP83848 PHY */
 -		if (!(phy & (1 << 7)))
 -			goto done;
 -	}
 -
 -	update_linkspeed(dev, 0);
 -
 -done:
 -	disable_mdi(lp);
 -
 -	return IRQ_HANDLED;
 -}
 -
 -/*
 - * Initialize and enable the PHY interrupt for link-state changes
 - */
 -static void enable_phyirq(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int dsintr, irq_number;
 -	int status;
 -
 -	if (!gpio_is_valid(lp->board_data.phy_irq_pin)) {
 -		/*
 -		 * PHY doesn't have an IRQ pin (RTL8201, DP83847, AC101L),
 -		 * or board does not have it connected.
 -		 */
 -		mod_timer(&lp->check_timer, jiffies + LINK_POLL_INTERVAL);
 -		return;
 -	}
 -
 -	irq_number = gpio_to_irq(lp->board_data.phy_irq_pin);
 -	status = request_irq(irq_number, at91ether_phy_interrupt, 0, dev->name, dev);
 -	if (status) {
 -		printk(KERN_ERR "at91_ether: PHY IRQ %d request failed - status %d!\n", irq_number, status);
 -		return;
 -	}
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	if ((lp->phy_type == MII_DM9161_ID) || (lp->phy_type == MII_DM9161A_ID)) {	/* for Davicom PHY */
 -		read_phy(lp, lp->phy_address, MII_DSINTR_REG, &dsintr);
 -		dsintr = dsintr & ~0xf00;		/* clear bits 8..11 */
 -		write_phy(lp, lp->phy_address, MII_DSINTR_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_LXT971A_ID) {	/* for Intel PHY */
 -		read_phy(lp, lp->phy_address, MII_ISINTE_REG, &dsintr);
 -		dsintr = dsintr | 0xf2;			/* set bits 1, 4..7 */
 -		write_phy(lp, lp->phy_address, MII_ISINTE_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_BCM5221_ID) {	/* for Broadcom PHY */
 -		dsintr = (1 << 15) | ( 1 << 14);
 -		write_phy(lp, lp->phy_address, MII_BCMINTR_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_KS8721_ID) {	/* for Micrel PHY */
 -		dsintr = (1 << 10) | ( 1 << 8);
 -		write_phy(lp, lp->phy_address, MII_TPISTATUS, dsintr);
 -	}
 -	else if (lp->phy_type == MII_T78Q21x3_ID) {	/* for Teridian PHY */
 -		read_phy(lp, lp->phy_address, MII_T78Q21INT_REG, &dsintr);
 -		dsintr = dsintr | 0x500;		/* set bits 8, 10 */
 -		write_phy(lp, lp->phy_address, MII_T78Q21INT_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_DP83848_ID) {	/* National Semiconductor DP83848 PHY */
 -		read_phy(lp, lp->phy_address, MII_DPMISR_REG, &dsintr);
 -		dsintr = dsintr | 0x3c;			/* set bits 2..5 */
 -		write_phy(lp, lp->phy_address, MII_DPMISR_REG, dsintr);
 -		read_phy(lp, lp->phy_address, MII_DPMICR_REG, &dsintr);
 -		dsintr = dsintr | 0x3;			/* set bits 0,1 */
 -		write_phy(lp, lp->phy_address, MII_DPMICR_REG, dsintr);
 -	}
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -}
 -
 -/*
 - * Disable the PHY interrupt
 - */
 -static void disable_phyirq(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int dsintr;
 -	unsigned int irq_number;
 -
 -	if (!gpio_is_valid(lp->board_data.phy_irq_pin)) {
 -		del_timer_sync(&lp->check_timer);
 -		return;
 -	}
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	if ((lp->phy_type == MII_DM9161_ID) || (lp->phy_type == MII_DM9161A_ID)) {	/* for Davicom PHY */
 -		read_phy(lp, lp->phy_address, MII_DSINTR_REG, &dsintr);
 -		dsintr = dsintr | 0xf00;			/* set bits 8..11 */
 -		write_phy(lp, lp->phy_address, MII_DSINTR_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_LXT971A_ID) {	/* for Intel PHY */
 -		read_phy(lp, lp->phy_address, MII_ISINTE_REG, &dsintr);
 -		dsintr = dsintr & ~0xf2;			/* clear bits 1, 4..7 */
 -		write_phy(lp, lp->phy_address, MII_ISINTE_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_BCM5221_ID) {	/* for Broadcom PHY */
 -		read_phy(lp, lp->phy_address, MII_BCMINTR_REG, &dsintr);
 -		dsintr = ~(1 << 14);
 -		write_phy(lp, lp->phy_address, MII_BCMINTR_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_KS8721_ID) {	/* for Micrel PHY */
 -		read_phy(lp, lp->phy_address, MII_TPISTATUS, &dsintr);
 -		dsintr = ~((1 << 10) | (1 << 8));
 -		write_phy(lp, lp->phy_address, MII_TPISTATUS, dsintr);
 -	}
 -	else if (lp->phy_type == MII_T78Q21x3_ID) {	/* for Teridian PHY */
 -		read_phy(lp, lp->phy_address, MII_T78Q21INT_REG, &dsintr);
 -		dsintr = dsintr & ~0x500;			/* clear bits 8, 10 */
 -		write_phy(lp, lp->phy_address, MII_T78Q21INT_REG, dsintr);
 -	}
 -	else if (lp->phy_type == MII_DP83848_ID) {	/* National Semiconductor DP83848 PHY */
 -		read_phy(lp, lp->phy_address, MII_DPMICR_REG, &dsintr);
 -		dsintr = dsintr & ~0x3;				/* clear bits 0, 1 */
 -		write_phy(lp, lp->phy_address, MII_DPMICR_REG, dsintr);
 -		read_phy(lp, lp->phy_address, MII_DPMISR_REG, &dsintr);
 -		dsintr = dsintr & ~0x3c;			/* clear bits 2..5 */
 -		write_phy(lp, lp->phy_address, MII_DPMISR_REG, dsintr);
 -	}
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -
 -	irq_number = gpio_to_irq(lp->board_data.phy_irq_pin);
 -	free_irq(irq_number, dev);			/* Free interrupt handler */
 -}
 -
 -/*
 - * Perform a software reset of the PHY.
 - */
 -#if 0
 -static void reset_phy(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int bmcr;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	/* Perform PHY reset */
 -	write_phy(lp, lp->phy_address, MII_BMCR, BMCR_RESET);
 -
 -	/* Wait until PHY reset is complete */
 -	do {
 -		read_phy(lp, lp->phy_address, MII_BMCR, &bmcr);
 -	} while (!(bmcr & BMCR_RESET));
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -}
 -#endif
 -
 -static void at91ether_check_link(unsigned long dev_id)
 -{
 -	struct net_device *dev = (struct net_device *) dev_id;
 -	struct at91_private *lp = netdev_priv(dev);
 -
 -	enable_mdi(lp);
 -	update_linkspeed(dev, 1);
 -	disable_mdi(lp);
 -
 -	mod_timer(&lp->check_timer, jiffies + LINK_POLL_INTERVAL);
 -}
 -
 -/*
 - * Perform any PHY-specific initialization.
 - */
 -static void __init initialize_phy(struct at91_private *lp)
 -{
 -	unsigned int val;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	if ((lp->phy_type == MII_DM9161_ID) || (lp->phy_type == MII_DM9161A_ID)) {
 -		read_phy(lp, lp->phy_address, MII_DSCR_REG, &val);
 -		if ((val & (1 << 10)) == 0)			/* DSCR bit 10 is 0 -- fiber mode */
 -			lp->phy_media = PORT_FIBRE;
 -	} else if (machine_is_csb337()) {
 -		/* mix link activity status into LED2 link state */
 -		write_phy(lp, lp->phy_address, MII_LEDCTRL_REG, 0x0d22);
 -	} else if (machine_is_ecbat91())
 -		write_phy(lp, lp->phy_address, MII_LEDCTRL_REG, 0x156A);
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -}
 -
 -/* ......................... ADDRESS MANAGEMENT ........................ */
 -
 -/*
 - * NOTE: Your bootloader must always set the MAC address correctly before
 - * booting into Linux.
 - *
 - * - It must always set the MAC address after reset, even if it doesn't
 - *   happen to access the Ethernet while it's booting.  Some versions of
 - *   U-Boot on the AT91RM9200-DK do not do this.
 - *
 - * - Likewise it must store the addresses in the correct byte order.
 - *   MicroMonitor (uMon) on the CSB337 does this incorrectly (and
 - *   continues to do so, for bug-compatibility).
 - */
 -
 -static short __init unpack_mac_address(struct net_device *dev, unsigned int hi, unsigned int lo)
 -{
 -	char addr[6];
 -
 -	if (machine_is_csb337()) {
 -		addr[5] = (lo & 0xff);			/* The CSB337 bootloader stores the MAC the wrong-way around */
 -		addr[4] = (lo & 0xff00) >> 8;
 -		addr[3] = (lo & 0xff0000) >> 16;
 -		addr[2] = (lo & 0xff000000) >> 24;
 -		addr[1] = (hi & 0xff);
 -		addr[0] = (hi & 0xff00) >> 8;
 -	}
 -	else {
 -		addr[0] = (lo & 0xff);
 -		addr[1] = (lo & 0xff00) >> 8;
 -		addr[2] = (lo & 0xff0000) >> 16;
 -		addr[3] = (lo & 0xff000000) >> 24;
 -		addr[4] = (hi & 0xff);
 -		addr[5] = (hi & 0xff00) >> 8;
 -	}
 -
 -	if (is_valid_ether_addr(addr)) {
 -		memcpy(dev->dev_addr, &addr, 6);
 -		return 1;
 -	}
 -	return 0;
 -}
 -
 -/*
 - * Set the ethernet MAC address in dev->dev_addr
 - */
 -static void __init get_mac_address(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -
 -	/* Check Specific-Address 1 */
 -	if (unpack_mac_address(dev, at91_emac_read(lp, AT91_EMAC_SA1H), at91_emac_read(lp, AT91_EMAC_SA1L)))
 -		return;
 -	/* Check Specific-Address 2 */
 -	if (unpack_mac_address(dev, at91_emac_read(lp, AT91_EMAC_SA2H), at91_emac_read(lp, AT91_EMAC_SA2L)))
 -		return;
 -	/* Check Specific-Address 3 */
 -	if (unpack_mac_address(dev, at91_emac_read(lp, AT91_EMAC_SA3H), at91_emac_read(lp, AT91_EMAC_SA3L)))
 -		return;
 -	/* Check Specific-Address 4 */
 -	if (unpack_mac_address(dev, at91_emac_read(lp, AT91_EMAC_SA4H), at91_emac_read(lp, AT91_EMAC_SA4L)))
 -		return;
 -
 -	printk(KERN_ERR "at91_ether: Your bootloader did not configure a MAC address.\n");
 -}
 -
 -/*
 - * Program the hardware MAC address from dev->dev_addr.
 - */
 -static void update_mac_address(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -
 -	at91_emac_write(lp, AT91_EMAC_SA1L, (dev->dev_addr[3] << 24) | (dev->dev_addr[2] << 16) | (dev->dev_addr[1] << 8) | (dev->dev_addr[0]));
 -	at91_emac_write(lp, AT91_EMAC_SA1H, (dev->dev_addr[5] << 8) | (dev->dev_addr[4]));
 -
 -	at91_emac_write(lp, AT91_EMAC_SA2L, 0);
 -	at91_emac_write(lp, AT91_EMAC_SA2H, 0);
 -}
 -
 -/*
 - * Store the new hardware address in dev->dev_addr, and update the MAC.
 - */
 -static int set_mac_address(struct net_device *dev, void* addr)
 -{
 -	struct sockaddr *address = addr;
 -
 -	if (!is_valid_ether_addr(address->sa_data))
 -		return -EADDRNOTAVAIL;
 -
 -	memcpy(dev->dev_addr, address->sa_data, dev->addr_len);
 -	update_mac_address(dev);
 +#include "macb.h"
  
 -	printk("%s: Setting MAC address to %pM\n", dev->name,
 -	       dev->dev_addr);
 +/* 1518 rounded up */
 +#define MAX_RBUFF_SZ	0x600
 +/* max number of receive buffers */
 +#define MAX_RX_DESCR	9
  
 -	return 0;
 -}
 -
 -static int inline hash_bit_value(int bitnr, __u8 *addr)
 -{
 -	if (addr[bitnr / 8] & (1 << (bitnr % 8)))
 -		return 1;
 -	return 0;
 -}
 -
 -/*
 - * The hash address register is 64 bits long and takes up two locations in the memory map.
 - * The least significant bits are stored in EMAC_HSL and the most significant
 - * bits in EMAC_HSH.
 - *
 - * The unicast hash enable and the multicast hash enable bits in the network configuration
 - *  register enable the reception of hash matched frames. The destination address is
 - *  reduced to a 6 bit index into the 64 bit hash register using the following hash function.
 - * The hash function is an exclusive or of every sixth bit of the destination address.
 - *   hash_index[5] = da[5] ^ da[11] ^ da[17] ^ da[23] ^ da[29] ^ da[35] ^ da[41] ^ da[47]
 - *   hash_index[4] = da[4] ^ da[10] ^ da[16] ^ da[22] ^ da[28] ^ da[34] ^ da[40] ^ da[46]
 - *   hash_index[3] = da[3] ^ da[09] ^ da[15] ^ da[21] ^ da[27] ^ da[33] ^ da[39] ^ da[45]
 - *   hash_index[2] = da[2] ^ da[08] ^ da[14] ^ da[20] ^ da[26] ^ da[32] ^ da[38] ^ da[44]
 - *   hash_index[1] = da[1] ^ da[07] ^ da[13] ^ da[19] ^ da[25] ^ da[31] ^ da[37] ^ da[43]
 - *   hash_index[0] = da[0] ^ da[06] ^ da[12] ^ da[18] ^ da[24] ^ da[30] ^ da[36] ^ da[42]
 - * da[0] represents the least significant bit of the first byte received, that is, the multicast/
 - *  unicast indicator, and da[47] represents the most significant bit of the last byte
 - *  received.
 - * If the hash index points to a bit that is set in the hash register then the frame will be
 - *  matched according to whether the frame is multicast or unicast.
 - * A multicast match will be signalled if the multicast hash enable bit is set, da[0] is 1 and
 - *  the hash index points to a bit set in the hash register.
 - * A unicast match will be signalled if the unicast hash enable bit is set, da[0] is 0 and the
 - *  hash index points to a bit set in the hash register.
 - * To receive all multicast frames, the hash register should be set with all ones and the
 - *  multicast hash enable bit should be set in the network configuration register.
 - */
 -
 -/*
 - * Return the hash index value for the specified address.
 - */
 -static int hash_get_index(__u8 *addr)
 -{
 -	int i, j, bitval;
 -	int hash_index = 0;
 -
 -	for (j = 0; j < 6; j++) {
 -		for (i = 0, bitval = 0; i < 8; i++)
 -			bitval ^= hash_bit_value(i*6 + j, addr);
 -
 -		hash_index |= (bitval << j);
 -	}
 -
 -	return hash_index;
 -}
 -
 -/*
 - * Add multicast addresses to the internal multicast-hash table.
 - */
 -static void at91ether_sethashtable(struct net_device *dev)
 +/* Initialize and start the Receiver and Transmit subsystems */
 +static int at91ether_start(struct net_device *dev)
  {
 -	struct at91_private *lp = netdev_priv(dev);
 -	struct netdev_hw_addr *ha;
 -	unsigned long mc_filter[2];
 -	unsigned int bitnr;
 -
 -	mc_filter[0] = mc_filter[1] = 0;
 -
 -	netdev_for_each_mc_addr(ha, dev) {
 -		bitnr = hash_get_index(ha->addr);
 -		mc_filter[bitnr >> 5] |= 1 << (bitnr & 31);
 -	}
 -
 -	at91_emac_write(lp, AT91_EMAC_HSL, mc_filter[0]);
 -	at91_emac_write(lp, AT91_EMAC_HSH, mc_filter[1]);
 -}
 +	struct macb *lp = netdev_priv(dev);
 +	dma_addr_t addr;
 +	u32 ctl;
 +	int i;
  
 -/*
 - * Enable/Disable promiscuous and multicast modes.
 - */
 -static void at91ether_set_multicast_list(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned long cfg;
 -
 -	cfg = at91_emac_read(lp, AT91_EMAC_CFG);
 -
 -	if (dev->flags & IFF_PROMISC)			/* Enable promiscuous mode */
 -		cfg |= AT91_EMAC_CAF;
 -	else if (dev->flags & (~IFF_PROMISC))		/* Disable promiscuous mode */
 -		cfg &= ~AT91_EMAC_CAF;
 -
 -	if (dev->flags & IFF_ALLMULTI) {		/* Enable all multicast mode */
 -		at91_emac_write(lp, AT91_EMAC_HSH, -1);
 -		at91_emac_write(lp, AT91_EMAC_HSL, -1);
 -		cfg |= AT91_EMAC_MTI;
 -	} else if (!netdev_mc_empty(dev)) { /* Enable specific multicasts */
 -		at91ether_sethashtable(dev);
 -		cfg |= AT91_EMAC_MTI;
 -	} else if (dev->flags & (~IFF_ALLMULTI)) {	/* Disable all multicast mode */
 -		at91_emac_write(lp, AT91_EMAC_HSH, 0);
 -		at91_emac_write(lp, AT91_EMAC_HSL, 0);
 -		cfg &= ~AT91_EMAC_MTI;
 +	lp->rx_ring = dma_alloc_coherent(&lp->pdev->dev,
 +					MAX_RX_DESCR * sizeof(struct macb_dma_desc),
 +					&lp->rx_ring_dma, GFP_KERNEL);
 +	if (!lp->rx_ring) {
 +		netdev_err(dev, "unable to alloc rx ring DMA buffer\n");
 +		return -ENOMEM;
  	}
  
 -	at91_emac_write(lp, AT91_EMAC_CFG, cfg);
 -}
 -
 -/* ......................... ETHTOOL SUPPORT ........................... */
 -
 -static int mdio_read(struct net_device *dev, int phy_id, int location)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	unsigned int value;
 -
 -	read_phy(lp, phy_id, location, &value);
 -	return value;
 -}
 -
 -static void mdio_write(struct net_device *dev, int phy_id, int location, int value)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -
 -	write_phy(lp, phy_id, location, value);
 -}
 -
 -static int at91ether_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	int ret;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	ret = mii_ethtool_gset(&lp->mii, cmd);
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 +	lp->rx_buffers = dma_alloc_coherent(&lp->pdev->dev,
 +					MAX_RX_DESCR * MAX_RBUFF_SZ,
 +					&lp->rx_buffers_dma, GFP_KERNEL);
 +	if (!lp->rx_buffers) {
 +		netdev_err(dev, "unable to alloc rx data DMA buffer\n");
  
 -	if (lp->phy_media == PORT_FIBRE) {		/* override media type since mii.c doesn't know */
 -		cmd->supported = SUPPORTED_FIBRE;
 -		cmd->port = PORT_FIBRE;
 +		dma_free_coherent(&lp->pdev->dev,
 +					MAX_RX_DESCR * sizeof(struct macb_dma_desc),
 +					lp->rx_ring, lp->rx_ring_dma);
 +		lp->rx_ring = NULL;
 +		return -ENOMEM;
  	}
  
 -	return ret;
 -}
 -
 -static int at91ether_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	int ret;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	ret = mii_ethtool_sset(&lp->mii, cmd);
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -
 -	return ret;
 -}
 -
 -static int at91ether_nwayreset(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	int ret;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -
 -	ret = mii_nway_restart(&lp->mii);
 -
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -
 -	return ret;
 -}
 -
 -static void at91ether_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 -{
 -	strlcpy(info->driver, DRV_NAME, sizeof(info->driver));
 -	strlcpy(info->version, DRV_VERSION, sizeof(info->version));
 -	strlcpy(info->bus_info, dev_name(dev->dev.parent), sizeof(info->bus_info));
 -}
 -
 -static const struct ethtool_ops at91ether_ethtool_ops = {
 -	.get_settings	= at91ether_get_settings,
 -	.set_settings	= at91ether_set_settings,
 -	.get_drvinfo	= at91ether_get_drvinfo,
 -	.nway_reset	= at91ether_nwayreset,
 -	.get_link	= ethtool_op_get_link,
 -};
 -
 -static int at91ether_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	int res;
 -
 -	if (!netif_running(dev))
 -		return -EINVAL;
 -
 -	spin_lock_irq(&lp->lock);
 -	enable_mdi(lp);
 -	res = generic_mii_ioctl(&lp->mii, if_mii(rq), cmd, NULL);
 -	disable_mdi(lp);
 -	spin_unlock_irq(&lp->lock);
 -
 -	return res;
 -}
 -
 -/* ................................ MAC ................................ */
 -
 -/*
 - * Initialize and start the Receiver and Transmit subsystems
 - */
 -static void at91ether_start(struct net_device *dev)
 -{
 -	struct at91_private *lp = netdev_priv(dev);
 -	struct recv_desc_bufs *dlist, *dlist_phys;
 -	int i;
 -	unsigned long ctl;
 -
 -	dlist = lp->dlist;
 -	dlist_phys = lp->dlist_phys;
 -
 +	addr = lp->rx_buffers_dma;
  	for (i = 0; i < MAX_RX_DESCR; i++) {
 -		dlist->descriptors[i].addr = (unsigned int) &dlist_phys->recv_buf[i][0];
 -		dlist->descriptors[i].size = 0;
 +		lp->rx_ring[i].addr = addr;
 +		lp->rx_ring[i].ctrl = 0;
 +		addr += MAX_RBUFF_SZ;
  	}
  
  	/* Set the Wrap bit on the last descriptor */
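
For readers following the new at91ether_start(), the descriptor setup in the hunk above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver code: struct rx_desc, rx_ring_init() and RX_WRAP_BIT are hypothetical names, the ring length and buffer size are illustrative values, and the wrap flag is assumed to live in bit 1 of the address word (so buffer addresses must leave the low bits clear).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RX_DESCR 9                  /* assumed ring length, illustrative only */
#define MAX_RBUFF_SZ 0x600              /* assumed per-buffer size in bytes */
#define RX_WRAP_BIT  (UINT32_C(1) << 1) /* assumed position of the wrap flag */

/* Hypothetical stand-in for a two-word receive descriptor. */
struct rx_desc {
	uint32_t addr;	/* buffer bus address, low bits carry flags */
	uint32_t ctrl;	/* status/length word, cleared before use */
};

/*
 * Point each descriptor at its slice of one contiguous buffer area,
 * clear its control word, then mark the last descriptor so the
 * controller wraps back to the start of the ring.
 */
static void rx_ring_init(struct rx_desc *ring, size_t n,
			 uint32_t buf_base, uint32_t buf_sz)
{
	uint32_t addr = buf_base;

	assert(ring != NULL && n > 0);
	for (size_t i = 0; i < n; i++) {
		ring[i].addr = addr;
		ring[i].ctrl = 0;
		addr += buf_sz;
	}
	ring[n - 1].addr |= RX_WRAP_BIT;
}
```

In the driver the base address would be the DMA handle returned by dma_alloc_coherent() (lp->rx_buffers_dma in the hunk), not an arbitrary constant.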

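The hash-index comment removed earlier in this patch describes a 6-bit index where bit j is the XOR of da[j], da[j+6], ..., da[j+42], with da[0] being the least significant bit of the first byte received. A minimal stand-alone sketch of that rule follows. It mirrors the removed hash_get_index() loop, but da_bit() and hash_filter_add() are hypothetical helpers written for this example (the original hash_bit_value() is not shown in this part of the diff), so treat it as an assumption-laden illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit n of the destination address; da[0] is the LSB of the first byte. */
static int da_bit(const uint8_t *addr, int n)
{
	return (addr[n / 8] >> (n % 8)) & 1;
}

/* 6-bit hash index: bit j is the XOR of da[j], da[j+6], ..., da[j+42]. */
static int hash_get_index(const uint8_t addr[6])
{
	int hash_index = 0;

	assert(addr != NULL);
	for (int j = 0; j < 6; j++) {
		int bitval = 0;

		for (int i = 0; i < 8; i++)
			bitval ^= da_bit(addr, i * 6 + j);
		hash_index |= bitval << j;
	}
	return hash_index;
}

/* Set the bit selected by addr in a 64-bit filter held as low/high words. */
static void hash_filter_add(uint32_t filter[2], const uint8_t addr[6])
{
	int bitnr = hash_get_index(addr);

	filter[bitnr >> 5] |= UINT32_C(1) << (bitnr & 31);
}
```

For the multicast address 01:00:5e:00:00:01 the set bits are da[0], da[17], da[18], da[19], da[20], da[22] and da[40], which gives index 38, i.e. bit 6 of the high filter word. The two filter words correspond to what the removed at91ether_sethashtable() wrote to AT91_EMAC_HSL and AT91_EMAC_HSH.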


