Netdev List
* Re: [PATCH v9 net-next 2/4] net: filter: split filter.h and expose eBPF to user space
From: Daniel Borkmann @ 2014-10-14 20:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, Andy Lutomirski,
	Steven Rostedt, Hannes Frederic Sowa, Chema Gonzalez,
	Eric Dumazet, Peter Zijlstra, H. Peter Anvin, Andrew Morton,
	Kees Cook, Linux API, Network Development, LKML
In-Reply-To: <CAMEtUuws+AtOdwud8_YjXhs=yomt8nY+49f_UuhofcmhV58c1Q@mail.gmail.com>

On 10/14/2014 10:43 AM, Alexei Starovoitov wrote:
> On Tue, Oct 14, 2014 at 12:34 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> On 10/13/2014 11:49 PM, Alexei Starovoitov wrote:
>>>
>>> On Mon, Oct 13, 2014 at 10:21 AM, Daniel Borkmann <dborkman@redhat.com>
>>> wrote:
>>>>
>>>> On 09/03/2014 05:46 PM, Daniel Borkmann wrote:
>>>> ...
>>>>>
>>>>> Ok, given you post the remaining two RFCs later on this window as
>>>>> you indicate, I have no objections:
>>>>>
>>>>> Acked-by: Daniel Borkmann <dborkman@redhat.com>
>>>>
>>>> Ping, Alexei, are you still sending the patch for bpf_common.h or
>>>> do you want me to take care of this?
>>>
>>> It's not forgotten.
>>> I'm not sending it only because net-next is closed
>>> and it seems to be -next material.
>>
>> Well, the point was since it's UAPI you're modifying, that it needs
>> to be shipped before it first gets exposed to user land ...
>>
>> I think that should be reason enough ... there's no point in doing
>> this at a later point in time.
>
> Moving common #defines from filter.h into bpf_common.h can
> be done at any point in time. For the sake of argument if
> there is an app that includes both filter.h and bpf.h, it will
> continue to work just fine.

Correct, but the argument was that we can _avoid_ this from the
very beginning. Thus, user space applications making use of eBPF
only need to include <linux/bpf.h>, nothing more.

Doing this at any later point in time will just lead to the need
to include both headers.
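
To make the single-include point concrete, here is a minimal user-space
sketch. It assumes the installed kernel headers already carry the split, so
that <linux/bpf.h> pulls in the shared classic defines by itself;
build_return_imm() is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/bpf.h>	/* assumed to include linux/bpf_common.h */

/* Hypothetical helper: emit the two-instruction program "r0 = imm; exit". */
static void build_return_imm(struct bpf_insn prog[2], int imm)
{
	assert(prog != NULL);

	/* BPF_ALU64, BPF_MOV and BPF_EXIT are eBPF defines from bpf.h;
	 * BPF_K and BPF_JMP are the defines shared with classic BPF.
	 */
	prog[0] = (struct bpf_insn) {
		.code = BPF_ALU64 | BPF_MOV | BPF_K,
		.dst_reg = BPF_REG_0,
		.imm = imm,
	};
	prog[1] = (struct bpf_insn) {
		.code = BPF_JMP | BPF_EXIT,
	};
}
```

Without the shared header, the same file would also need <linux/filter.h>
just to get BPF_K and BPF_JMP.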


* Re: [PATCH] dsa: mv88e6171: Fix tag_protocol check
From: David Miller @ 2014-10-14 20:23 UTC (permalink / raw)
  To: linux; +Cc: f.fainelli, netdev, linux-kernel, andrew
In-Reply-To: <1413310864-16830-1-git-send-email-linux@roeck-us.net>

From: Guenter Roeck <linux@roeck-us.net>
Date: Tue, 14 Oct 2014 11:21:04 -0700

> tag_protocol is now an enum, so drivers have to check against it.
> 
> Cc: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Applied, thanks.
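
A small stand-alone sketch of why the old comparison can never match once
tag_protocol is an enum. The enum values and MODEL_ETH_P_EDSA below are
assumptions modelled on the kernel definitions, reproduced here only so the
snippet builds in user space.

```c
#include <assert.h>	/* assert() for the checks in the note below */
#include <stdbool.h>
#include <arpa/inet.h>	/* htons() */

/* Assumed to match ETH_P_EDSA in linux/if_ether.h. */
#define MODEL_ETH_P_EDSA 0xDADA

/* Assumed layout of the enum; only the fact that the values are small
 * integers rather than EtherTypes matters here.
 */
enum dsa_tag_protocol {
	DSA_TAG_PROTO_NONE = 0,
	DSA_TAG_PROTO_DSA,
	DSA_TAG_PROTO_TRAILER,
	DSA_TAG_PROTO_EDSA,
};

/* Old check: compares an enum value against a network-order EtherType. */
static bool is_edsa_old(enum dsa_tag_protocol proto)
{
	return (unsigned int)proto == (unsigned int)htons(MODEL_ETH_P_EDSA);
}

/* Fixed check: compares against the enum constant. */
static bool is_edsa_new(enum dsa_tag_protocol proto)
{
	return proto == DSA_TAG_PROTO_EDSA;
}
```

With these definitions, is_edsa_old(DSA_TAG_PROTO_EDSA) is false while
is_edsa_new(DSA_TAG_PROTO_EDSA) is true, so the old code would always take
the "else" branch.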


* [PATCH net] cxgb4: Fix FW flash logic using ethtool
From: Hariprasad Shenai @ 2014-10-14 20:24 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai

Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since
t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed
enough to require a power cycle of the system.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h      |    2 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |   14 ++++++++++++--
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c      |    6 ++----
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 9b2c669..38d8234 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -986,6 +986,8 @@ static inline int t4_memory_write(struct adapter *adap, int mtype, u32 addr,
 int t4_seeprom_wp(struct adapter *adapter, bool enable);
 int get_vpd_params(struct adapter *adapter, struct vpd_params *p);
 int t4_load_fw(struct adapter *adapter, const u8 *fw_data, unsigned int size);
+int t4_fw_upgrade(struct adapter *adap, unsigned int mbox,
+		  const u8 *fw_data, unsigned int size, int force);
 unsigned int t4_flash_cfg_addr(struct adapter *adapter);
 int t4_get_fw_version(struct adapter *adapter, u32 *vers);
 int t4_get_tp_version(struct adapter *adapter, u32 *vers);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 321f3d9..54a135d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -2929,16 +2929,26 @@ static int set_flash(struct net_device *netdev, struct ethtool_flash *ef)
 	int ret;
 	const struct firmware *fw;
 	struct adapter *adap = netdev2adap(netdev);
+	unsigned int mbox = FW_PCIE_FW_MASTER_MASK + 1;
 
 	ef->data[sizeof(ef->data) - 1] = '\0';
 	ret = request_firmware(&fw, ef->data, adap->pdev_dev);
 	if (ret < 0)
 		return ret;
 
-	ret = t4_load_fw(adap, fw->data, fw->size);
+	/* If the adapter has been fully initialized then we'll go ahead and
+	 * try to get the firmware's cooperation in upgrading to the new
+	 * firmware image otherwise we'll try to do the entire job from the
+	 * host ... and we always "force" the operation in this path.
+	 */
+	if (adap->flags & FULL_INIT_DONE)
+		mbox = adap->mbox;
+
+	ret = t4_fw_upgrade(adap, mbox, fw->data, fw->size, 1);
 	release_firmware(fw);
 	if (!ret)
-		dev_info(adap->pdev_dev, "loaded firmware %s\n", ef->data);
+		dev_info(adap->pdev_dev, "loaded firmware %s,"
+			 " reload cxgb4 driver\n", ef->data);
 	return ret;
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 22d7581..b659b94 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -37,8 +37,6 @@
 #include "t4_regs.h"
 #include "t4fw_api.h"
 
-static int t4_fw_upgrade(struct adapter *adap, unsigned int mbox,
-			 const u8 *fw_data, unsigned int size, int force);
 /**
  *	t4_wait_op_done_val - wait until an operation is completed
  *	@adapter: the adapter performing the operation
@@ -3076,8 +3074,8 @@ static int t4_fw_restart(struct adapter *adap, unsigned int mbox, int reset)
  *	positive errno indicates that the adapter is ~probably~ intact, a
  *	negative errno indicates that things are looking bad ...
  */
-static int t4_fw_upgrade(struct adapter *adap, unsigned int mbox,
-			 const u8 *fw_data, unsigned int size, int force)
+int t4_fw_upgrade(struct adapter *adap, unsigned int mbox,
+		  const u8 *fw_data, unsigned int size, int force)
 {
 	const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data;
 	int reset, ret;
-- 
1.7.1


* Re: [PATCH v2] ipv4: dst_entry leak in ip_append_data()
From: David Miller @ 2014-10-14 20:12 UTC (permalink / raw)
  To: vvs; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber, eric.dumazet
In-Reply-To: <543CAD2A.3070701@parallels.com>

From: Vasily Averin <vvs@parallels.com>
Date: Tue, 14 Oct 2014 08:57:14 +0400

> v2: adjust the indentation of the arguments __ip_append_data() call
> 
> Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
> 
> If sk_write_queue is empty ip_append_data() executes ip_setup_cork()
> that "steals" dst entry from rt to cork. Later it calls __ip_append_data()
> that creates skb and adds it to sk_write_queue.
> 
> If skb was added successfully following ip_push_pending_frames() call
> reassign dst entries from cork to skb, and kfree_skb frees dst_entry.
> 
> However nobody frees stolen dst_entry if skb was not added into sk_write_queue.
> 
> Signed-off-by: Vasily Averin <vvs@parallels.com>

Why doesn't ip_make_skb() need the same fix?  It seems to do the same
thing.


* Re: [PATCH v2 0/4] Add SGMII based 1GbE support to APM X-Gene SoC ethernet driver
From: David Miller @ 2014-10-14 20:10 UTC (permalink / raw)
  To: isubramanian
  Cc: romieu, netdev, devicetree, linux-arm-kernel, patches, kchudgar
In-Reply-To: <1413245135-2989-1-git-send-email-isubramanian@apm.com>

From: Iyappan Subramanian <isubramanian@apm.com>
Date: Mon, 13 Oct 2014 17:05:31 -0700

> Adding SGMII based 1GbE basic support to APM X-Gene SoC ethernet driver.
> 
> v2: Address comments from v1
> * Split the patchset into two, the first one being preparatory patch
> * Added link_state function pointer to the xgene_mac_ops structure
> * Added xgene_indirect_ctl structure for indirect read/write arguments
> 
> v1:
> * Initial version

Series applied, thanks.


* Re: [PATCH net] net: filter: move common defines into bpf_common.h
From: David Miller @ 2014-10-14 20:07 UTC (permalink / raw)
  To: ast-uqk4Ao+rVK5Wk0Htik3J/w
  Cc: mingo-DgEjT+Ai2ygdnm+yROfE0A,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	luto-kltTT9wpgjJwATOyAt5JVQ, rostedt-nx8X9YLhiw1AfugRpC6u6w,
	dborkman-H+wXaHxf7aLQT0dZR+AlfA,
	hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r,
	chema-hpIqsD4AKlfQT0dZR+AlfA, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
	a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw, hpa-YMNOUZJC4hwAvxtiuMwx3w,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	keescook-F7+t8E8rja9g9hUCZPvPmw, linux-api-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1413277734-13053-1-git-send-email-ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>

From: Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
Date: Tue, 14 Oct 2014 02:08:54 -0700

> userspace programs that use eBPF instruction macros need to include two files:
> uapi/linux/filter.h and uapi/linux/bpf.h
> Move common macro definitions that are shared between classic BPF and eBPF
> into uapi/linux/bpf_common.h, so that user app can include only one bpf.h file
> 
> Cc: Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
> ---
> 
> Daniel believes that this patch has to be done for this merge window.
> I think it can wait till next, but it won't hurt now either.

Applied, thanks everyone.


* Re: [PATCH V2 2/2 net-next] caif_usb: use target structure member in memset
From: David Miller @ 2014-10-14 20:06 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, joe, dmitry.tarnyagin, netdev
In-Reply-To: <1413306074-8401-1-git-send-email-fabf@skynet.be>

From: Fabian Frederick <fabf@skynet.be>
Date: Tue, 14 Oct 2014 19:01:14 +0200

> parent cfusbl was used instead of first structure member 'layer'
> 
> Suggested-by: Joe Perches <joe@perches.com>
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

Applied.


* Re: [PATCH V2 1/2 net-next] caif_usb: remove redundant memory message
From: David Miller @ 2014-10-14 20:05 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, joe, dmitry.tarnyagin, netdev
In-Reply-To: <1413306055-8281-1-git-send-email-fabf@skynet.be>

From: Fabian Frederick <fabf@skynet.be>
Date: Tue, 14 Oct 2014 19:00:55 +0200

> Let MM subsystem display out of memory messages.
> 
> Signed-off-by: Fabian Frederick <fabf@skynet.be>
> ---
> V2: add second patch with memset fix

Applied.


* Re: Netlink mmap tx security?
From: David Miller @ 2014-10-14 20:00 UTC (permalink / raw)
  To: luto; +Cc: torvalds, kaber, netdev
In-Reply-To: <CALCETrUYBed_TBacxoDPMphM5y=iqxho2isrDruOQi=pmK2yoQ@mail.gmail.com>

From: Andy Lutomirski <luto@amacapital.net>
Date: Tue, 14 Oct 2014 12:33:43 -0700

> For full honesty, there is now the machinery developed for memfd
> sealing, but I can't imagine that this is ever faster than just
> copying the buffer.

I don't have much motivation to even check if it's a worthwhile
pursuit at this point.  

Someone who wants to can :-)
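
For reference, "just copying the buffer" amounts to snapshotting the frame
into private memory and validating only the snapshot. The sketch below is a
user-space model under that assumption; struct msg_hdr, struct private_msg
and snapshot_and_validate() are hypothetical names, not netlink structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 64

/* Hypothetical frame header, standing in for a real message header. */
struct msg_hdr {
	uint32_t len;	/* payload length in bytes */
	uint16_t type;
};

/* Private copy that the sender can no longer modify. */
struct private_msg {
	struct msg_hdr hdr;
	unsigned char payload[MAX_PAYLOAD];
};

/* Copy first, then validate the copy. The length is read from shared
 * memory exactly once, so a concurrent writer cannot change it between
 * the check and the use.
 */
static bool snapshot_and_validate(const unsigned char *shared,
				  size_t frame_size, struct private_msg *out)
{
	assert(shared != NULL && out != NULL);

	if (frame_size < sizeof(out->hdr))
		return false;
	memcpy(&out->hdr, shared, sizeof(out->hdr));

	if (out->hdr.len > MAX_PAYLOAD ||
	    out->hdr.len > frame_size - sizeof(out->hdr))
		return false;

	memcpy(out->payload, shared + sizeof(out->hdr), out->hdr.len);
	return true;
}
```

Once the call returns true, later writes to the shared frame have no effect
on the message being processed, which is the property the mmap TX path could
not guarantee.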

> I think that the NETLINK_SKB_TX declaration in include/linux/netlink.h
> should probably go, too.  And there's the last parameter to
> netlink_set_ring, too, and possibly even the nlk->tx_ring struct
> itself.

Agreed on all counts, here is the new patch:

====================
[PATCH] netlink: Remove TX mmap support.

There is no reasonable manner in which to absolutely make sure that another
thread of control cannot write to the pages in the mmap()'d area and thus
make sure that netlink messages do not change underneath us after we've
performed verifications.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/netlink.h  |   5 +-
 net/netlink/af_netlink.c | 161 +++++------------------------------------------
 net/netlink/af_netlink.h |   1 -
 3 files changed, 16 insertions(+), 151 deletions(-)

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 9e572da..57080a9 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -17,9 +17,8 @@ static inline struct nlmsghdr *nlmsg_hdr(const struct sk_buff *skb)
 
 enum netlink_skb_flags {
 	NETLINK_SKB_MMAPED	= 0x1,	/* Packet data is mmaped */
-	NETLINK_SKB_TX		= 0x2,	/* Packet was sent by userspace */
-	NETLINK_SKB_DELIVERED	= 0x4,	/* Packet was delivered */
-	NETLINK_SKB_DST		= 0x8,	/* Dst set in sendto or sendmsg */
+	NETLINK_SKB_DELIVERED	= 0x2,	/* Packet was delivered */
+	NETLINK_SKB_DST		= 0x4,	/* Dst set in sendto or sendmsg */
 };
 
 struct netlink_skb_parms {
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c416725..07ef0c9 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -289,11 +289,6 @@ static bool netlink_rx_is_mmaped(struct sock *sk)
 	return nlk_sk(sk)->rx_ring.pg_vec != NULL;
 }
 
-static bool netlink_tx_is_mmaped(struct sock *sk)
-{
-	return nlk_sk(sk)->tx_ring.pg_vec != NULL;
-}
-
 static __pure struct page *pgvec_to_page(const void *addr)
 {
 	if (is_vmalloc_addr(addr))
@@ -359,7 +354,7 @@ err1:
 }
 
 static int netlink_set_ring(struct sock *sk, struct nl_mmap_req *req,
-			    bool closing, bool tx_ring)
+			    bool closing)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 	struct netlink_ring *ring;
@@ -368,8 +363,8 @@ static int netlink_set_ring(struct sock *sk, struct nl_mmap_req *req,
 	unsigned int order = 0;
 	int err;
 
-	ring  = tx_ring ? &nlk->tx_ring : &nlk->rx_ring;
-	queue = tx_ring ? &sk->sk_write_queue : &sk->sk_receive_queue;
+	ring  = &nlk->rx_ring;
+	queue = &sk->sk_receive_queue;
 
 	if (!closing) {
 		if (atomic_read(&nlk->mapped))
@@ -476,11 +471,9 @@ static int netlink_mmap(struct file *file, struct socket *sock,
 	mutex_lock(&nlk->pg_vec_lock);
 
 	expected = 0;
-	for (ring = &nlk->rx_ring; ring <= &nlk->tx_ring; ring++) {
-		if (ring->pg_vec == NULL)
-			continue;
+	ring = &nlk->rx_ring;
+	if (ring->pg_vec)
 		expected += ring->pg_vec_len * ring->pg_vec_pages * PAGE_SIZE;
-	}
 
 	if (expected == 0)
 		goto out;
@@ -490,10 +483,8 @@ static int netlink_mmap(struct file *file, struct socket *sock,
 		goto out;
 
 	start = vma->vm_start;
-	for (ring = &nlk->rx_ring; ring <= &nlk->tx_ring; ring++) {
-		if (ring->pg_vec == NULL)
-			continue;
-
+	ring = &nlk->rx_ring;
+	if (ring->pg_vec) {
 		for (i = 0; i < ring->pg_vec_len; i++) {
 			struct page *page;
 			void *kaddr = ring->pg_vec[i];
@@ -662,13 +653,6 @@ static unsigned int netlink_poll(struct file *file, struct socket *sock,
 	}
 	spin_unlock_bh(&sk->sk_receive_queue.lock);
 
-	spin_lock_bh(&sk->sk_write_queue.lock);
-	if (nlk->tx_ring.pg_vec) {
-		if (netlink_current_frame(&nlk->tx_ring, NL_MMAP_STATUS_UNUSED))
-			mask |= POLLOUT | POLLWRNORM;
-	}
-	spin_unlock_bh(&sk->sk_write_queue.lock);
-
 	return mask;
 }
 
@@ -698,104 +682,6 @@ static void netlink_ring_setup_skb(struct sk_buff *skb, struct sock *sk,
 	NETLINK_CB(skb).sk = sk;
 }
 
-static int netlink_mmap_sendmsg(struct sock *sk, struct msghdr *msg,
-				u32 dst_portid, u32 dst_group,
-				struct sock_iocb *siocb)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-	struct netlink_ring *ring;
-	struct nl_mmap_hdr *hdr;
-	struct sk_buff *skb;
-	unsigned int maxlen;
-	bool excl = true;
-	int err = 0, len = 0;
-
-	/* Netlink messages are validated by the receiver before processing.
-	 * In order to avoid userspace changing the contents of the message
-	 * after validation, the socket and the ring may only be used by a
-	 * single process, otherwise we fall back to copying.
-	 */
-	if (atomic_long_read(&sk->sk_socket->file->f_count) > 2 ||
-	    atomic_read(&nlk->mapped) > 1)
-		excl = false;
-
-	mutex_lock(&nlk->pg_vec_lock);
-
-	ring   = &nlk->tx_ring;
-	maxlen = ring->frame_size - NL_MMAP_HDRLEN;
-
-	do {
-		hdr = netlink_current_frame(ring, NL_MMAP_STATUS_VALID);
-		if (hdr == NULL) {
-			if (!(msg->msg_flags & MSG_DONTWAIT) &&
-			    atomic_read(&nlk->tx_ring.pending))
-				schedule();
-			continue;
-		}
-		if (hdr->nm_len > maxlen) {
-			err = -EINVAL;
-			goto out;
-		}
-
-		netlink_frame_flush_dcache(hdr);
-
-		if (likely(dst_portid == 0 && dst_group == 0 && excl)) {
-			skb = alloc_skb_head(GFP_KERNEL);
-			if (skb == NULL) {
-				err = -ENOBUFS;
-				goto out;
-			}
-			sock_hold(sk);
-			netlink_ring_setup_skb(skb, sk, ring, hdr);
-			NETLINK_CB(skb).flags |= NETLINK_SKB_TX;
-			__skb_put(skb, hdr->nm_len);
-			netlink_set_status(hdr, NL_MMAP_STATUS_RESERVED);
-			atomic_inc(&ring->pending);
-		} else {
-			skb = alloc_skb(hdr->nm_len, GFP_KERNEL);
-			if (skb == NULL) {
-				err = -ENOBUFS;
-				goto out;
-			}
-			__skb_put(skb, hdr->nm_len);
-			memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len);
-			netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
-		}
-
-		netlink_increment_head(ring);
-
-		NETLINK_CB(skb).portid	  = nlk->portid;
-		NETLINK_CB(skb).dst_group = dst_group;
-		NETLINK_CB(skb).creds	  = siocb->scm->creds;
-
-		err = security_netlink_send(sk, skb);
-		if (err) {
-			kfree_skb(skb);
-			goto out;
-		}
-
-		if (unlikely(dst_group)) {
-			atomic_inc(&skb->users);
-			netlink_broadcast(sk, skb, dst_portid, dst_group,
-					  GFP_KERNEL);
-		}
-		err = netlink_unicast(sk, skb, dst_portid,
-				      msg->msg_flags & MSG_DONTWAIT);
-		if (err < 0)
-			goto out;
-		len += err;
-
-	} while (hdr != NULL ||
-		 (!(msg->msg_flags & MSG_DONTWAIT) &&
-		  atomic_read(&nlk->tx_ring.pending)));
-
-	if (len > 0)
-		err = len;
-out:
-	mutex_unlock(&nlk->pg_vec_lock);
-	return err;
-}
-
 static void netlink_queue_mmaped_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct nl_mmap_hdr *hdr;
@@ -842,10 +728,8 @@ static void netlink_ring_set_copied(struct sock *sk, struct sk_buff *skb)
 #else /* CONFIG_NETLINK_MMAP */
 #define netlink_skb_is_mmaped(skb)	false
 #define netlink_rx_is_mmaped(sk)	false
-#define netlink_tx_is_mmaped(sk)	false
 #define netlink_mmap			sock_no_mmap
 #define netlink_poll			datagram_poll
-#define netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group, siocb)	0
 #endif /* CONFIG_NETLINK_MMAP */
 
 static void netlink_skb_destructor(struct sk_buff *skb)
@@ -864,16 +748,11 @@ static void netlink_skb_destructor(struct sk_buff *skb)
 		hdr = netlink_mmap_hdr(skb);
 		sk = NETLINK_CB(skb).sk;
 
-		if (NETLINK_CB(skb).flags & NETLINK_SKB_TX) {
-			netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
-			ring = &nlk_sk(sk)->tx_ring;
-		} else {
-			if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
-				hdr->nm_len = 0;
-				netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
-			}
-			ring = &nlk_sk(sk)->rx_ring;
+		if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
+			hdr->nm_len = 0;
+			netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
 		}
+		ring = &nlk_sk(sk)->rx_ring;
 
 		WARN_ON(atomic_read(&ring->pending) == 0);
 		atomic_dec(&ring->pending);
@@ -921,10 +800,7 @@ static void netlink_sock_destruct(struct sock *sk)
 
 		memset(&req, 0, sizeof(req));
 		if (nlk->rx_ring.pg_vec)
-			netlink_set_ring(sk, &req, true, false);
-		memset(&req, 0, sizeof(req));
-		if (nlk->tx_ring.pg_vec)
-			netlink_set_ring(sk, &req, true, true);
+			netlink_set_ring(sk, &req, true);
 	}
 #endif /* CONFIG_NETLINK_MMAP */
 
@@ -2165,8 +2041,7 @@ static int netlink_setsockopt(struct socket *sock, int level, int optname,
 		err = 0;
 		break;
 #ifdef CONFIG_NETLINK_MMAP
-	case NETLINK_RX_RING:
-	case NETLINK_TX_RING: {
+	case NETLINK_RX_RING: {
 		struct nl_mmap_req req;
 
 		/* Rings might consume more memory than queue limits, require
@@ -2178,8 +2053,7 @@ static int netlink_setsockopt(struct socket *sock, int level, int optname,
 			return -EINVAL;
 		if (copy_from_user(&req, optval, sizeof(req)))
 			return -EFAULT;
-		err = netlink_set_ring(sk, &req, false,
-				       optname == NETLINK_TX_RING);
+		err = netlink_set_ring(sk, &req, false);
 		break;
 	}
 #endif /* CONFIG_NETLINK_MMAP */
@@ -2295,13 +2169,6 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 			goto out;
 	}
 
-	if (netlink_tx_is_mmaped(sk) &&
-	    msg->msg_iov->iov_base == NULL) {
-		err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
-					   siocb);
-		goto out;
-	}
-
 	err = -EMSGSIZE;
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index b20a173..4741b88 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -45,7 +45,6 @@ struct netlink_sock {
 #ifdef CONFIG_NETLINK_MMAP
 	struct mutex		pg_vec_lock;
 	struct netlink_ring	rx_ring;
-	struct netlink_ring	tx_ring;
 	atomic_t		mapped;
 #endif /* CONFIG_NETLINK_MMAP */
 
-- 
1.7.11.7


* Re: [PATCH] dsa: mv88e6171: Fix tag_protocol check
From: Andrew Lunn @ 2014-10-14 19:59 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: David S. Miller, Florian Fainelli, netdev, linux-kernel,
	Andrew Lunn
In-Reply-To: <1413310864-16830-1-git-send-email-linux@roeck-us.net>

On Tue, Oct 14, 2014 at 11:21:04AM -0700, Guenter Roeck wrote:
> tag_protocol is now an enum, so drivers have to check against it.
> 
> Cc: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Acked-by: Andrew Lunn <andrew@lunn.ch>

Thanks for fixing this.

       Andrew

> ---
>  drivers/net/dsa/mv88e6171.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
> index 6365e30..1020a7a 100644
> --- a/drivers/net/dsa/mv88e6171.c
> +++ b/drivers/net/dsa/mv88e6171.c
> @@ -206,7 +206,7 @@ static int mv88e6171_setup_port(struct dsa_switch *ds, int p)
>  	 */
>  	val = 0x0433;
>  	if (dsa_is_cpu_port(ds, p)) {
> -		if (ds->dst->tag_protocol == htons(ETH_P_EDSA))
> +		if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
>  			val |= 0x3300;
>  		else
>  			val |= 0x0100;
> -- 
> 1.9.1
> 


* Re: [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries
From: Eric W. Biederman @ 2014-10-14 19:56 UTC (permalink / raw)
  To: nicolas.dichtel
  Cc: netdev, linux-kernel, davem, akpm, adobriyan, rui.xiang, viro,
	oleg, gorcunov, kirill.shutemov, grant.likely, tytso,
	Linus Torvalds, Andrew Morton
In-Reply-To: <543BB42B.30505@6wind.com>

Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:

> On 07/10/2014 11:02, Nicolas Dichtel wrote:
>> The current implementation for the directories in /proc is using a single
>> linked list. This is slow when handling directories with large numbers of
>> entries (eg netdevice-related entries when lots of tunnels are opened).
>>
>> This patch replaces this linked list by a red-black tree.
>>
>> Here are some numbers:
>>
>> dummy30000.batch contains 30 000 times 'link add type dummy'.
>>
>> Before the patch:
>> $ time ip -b dummy30000.batch
>> real	2m31.950s
>> user	0m0.440s
>> sys	2m21.440s
>> $ time rmmod dummy
>> real	1m35.764s
>> user	0m0.000s
>> sys	1m24.088s
>>
>> After the patch:
>> $ time ip -b dummy30000.batch
>> real	2m0.874s
>> user	0m0.448s
>> sys	1m49.720s
>> $ time rmmod dummy
>> real	1m13.988s
>> user	0m0.000s
>> sys	1m1.008s
>>
>> The idea of improving this part was suggested by
>> Thierry Herbelot <thierry.herbelot@6wind.com>.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>
> I'm not sure who is in charge of taking this patch. Should I resend it to
> someone else or is it already included in a tree?

There are a couple of things going on here.

This patch came out at the beginning of the merge window which is a time
when everything that was ready and well tested ahead of time gets
merged.

Your numbers don't look too bad, so I expect this code is ready to go
(although I am a smidge disappointed in the small size of the
performance improvement). My quick read-through earlier did not show
anything scary. Do you have any idea why going from an O(N^2) algorithm
to an O(N log N) algorithm showed such a small performance improvement
with 30,000 entries?
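
To put rough numbers on that question, here is a back-of-the-envelope
sketch. It assumes each insertion walks all existing entries in the list
case and one root-to-leaf path in a balanced tree; it counts comparisons
only and says nothing about actual run time.

```c
#include <assert.h>

/* Comparisons to insert n entries one by one into a linked list when each
 * insertion scans every entry already present: 0 + 1 + ... + (n - 1).
 */
static unsigned long long list_insert_steps(unsigned long long n)
{
	return n * (n - 1) / 2;
}

/* Smallest b such that 2^b >= n, used as the height bound of a balanced tree. */
static unsigned int ceil_log2(unsigned long long n)
{
	unsigned int bits = 0;

	assert(n > 0);
	while ((1ULL << bits) < n)
		bits++;
	return bits;
}

/* Upper bound on comparisons to insert n entries into a balanced tree. */
static unsigned long long tree_insert_steps(unsigned long long n)
{
	return n * ceil_log2(n);
}
```

For n = 30000 this gives 449,985,000 list comparisons against at most
450,000 tree comparisons, roughly a factor of 1000, while the quoted timings
improved by about 20%. That gap suggests the directory lookup may not be the
dominant cost in this benchmark, though nothing here confirms where the
remaining time goes.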

Normally proc is looked after by a group of folks: Alexey Dobriyan,
Al Viro, and I sort of tag-team taking care of the proc infrastructure,
with Andrew Morton typically taking the patches (except for Al's) and
merging them.

I am currently in the middle of a move so I don't have the time to carry
this change or do much of anything until I am settled again.

What I would recommend is verifying your patch works against v3.18-rc1
at the beginning of next week and sending the code to Andrew Morton.

Eric


* Re: [net PATCH 0/3] bug fixes for davinci_cpdma and cpsw drivers
From: David Miller @ 2014-10-14 19:35 UTC (permalink / raw)
  To: mugunthanvnm; +Cc: netdev
In-Reply-To: <1413219067-15328-1-git-send-email-mugunthanvnm@ti.com>

From: Mugunthan V N <mugunthanvnm@ti.com>
Date: Mon, 13 Oct 2014 22:21:04 +0530

> Mugunthan V N (3):
>   drivers: net: davinci_cpdma: remove kfree on objects allocated with
>     devm_* apis
>   drivers: net: davinci_cpdma: remove spinlock as SOFTIRQ-unsafe lock
>     order detected
>   drivers: net: cpsw: remove child devices while driver detach

Series applied, thanks.


* [Patch net] rds: avoid calling sock_kfree_s() on allocation failure
From: Cong Wang @ 2014-10-14 19:35 UTC (permalink / raw)
  To: netdev; +Cc: davem, rds-devel, Chien Yen, Stephen Hemminger, Cong Wang,
	Cong Wang

From: Cong Wang <cwang@twopensource.com>

It is okay to free a NULL pointer but not okay to mischarge the socket optmem
accounting. Compile test only.

Reported-by: rucsoftsec@gmail.com
Cc: Chien Yen <chien.yen@oracle.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

---
diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index 4e37c1c..40084d8 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -564,12 +564,12 @@ int rds_cmsg_rdma_args(struct rds_sock *rs, struct rds_message *rm,
 
 	if (rs->rs_bound_addr == 0) {
 		ret = -ENOTCONN; /* XXX not a great errno */
-		goto out;
+		goto out_ret;
 	}
 
 	if (args->nr_local > UIO_MAXIOV) {
 		ret = -EMSGSIZE;
-		goto out;
+		goto out_ret;
 	}
 
 	/* Check whether to allocate the iovec area */
@@ -578,7 +578,7 @@ int rds_cmsg_rdma_args(struct rds_sock *rs, struct rds_message *rm,
 		iovs = sock_kmalloc(rds_rs_to_sk(rs), iov_size, GFP_KERNEL);
 		if (!iovs) {
 			ret = -ENOMEM;
-			goto out;
+			goto out_ret;
 		}
 	}
 
@@ -696,6 +696,7 @@ int rds_cmsg_rdma_args(struct rds_sock *rs, struct rds_message *rm,
 	if (iovs != iovstack)
 		sock_kfree_s(rds_rs_to_sk(rs), iovs, iov_size);
 	kfree(pages);
+out_ret:
 	if (ret)
 		rds_rdma_free_op(op);
 	else
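
The accounting problem described in the changelog can be modelled in user
space as follows. This is a simplified sketch: struct sock_model and the
model_* functions are hypothetical stand-ins, written on the assumption that
the free helper subtracts the size unconditionally, as the changelog implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the per-socket option memory counter. */
struct sock_model {
	long omem_alloc;
};

/* Charge the socket, allocate, and undo the charge if allocation fails.
 * The "fail" flag forces the failure path for demonstration.
 */
static void *model_sock_kmalloc(struct sock_model *sk, size_t size, bool fail)
{
	void *mem;

	assert(sk != NULL);
	sk->omem_alloc += (long)size;
	mem = fail ? NULL : malloc(size);
	if (!mem)
		sk->omem_alloc -= (long)size;
	return mem;
}

/* Free and uncharge. Freeing NULL is harmless, but the uncharge still
 * happens, which is the mischarge the patch avoids.
 */
static void model_sock_kfree_s(struct sock_model *sk, void *mem, size_t size)
{
	assert(sk != NULL);
	free(mem);
	sk->omem_alloc -= (long)size;
}
```

After a failed allocation of 128 bytes the counter is back at 0; calling
the free helper with the NULL pointer then drives it to -128. Jumping to a
label placed after the free call, as the patch does, skips that second
uncharge.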


* Re: [PATCH net-next] tg3: Add skb->xmit_more support
From: David Miller @ 2014-10-14 19:34 UTC (permalink / raw)
  To: prashant; +Cc: netdev, dborkman, mchan
In-Reply-To: <1413217302-15396-1-git-send-email-prashant@broadcom.com>

From: Prashant Sreedharan <prashant@broadcom.com>
Date: Mon, 13 Oct 2014 09:21:42 -0700

> Ring TX doorbell only if xmit_more is not set or the queue is stopped.
> 
> Suggested-by: Daniel Borkmann <dborkman@redhat.com>
> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied, thanks.
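
The doorbell rule in the changelog can be sketched with a tiny model.
struct tx_ring_model and model_xmit() are hypothetical; they only count how
often the (expensive) doorbell write would happen and do not reflect the
tg3 data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of a TX ring: descriptors queued but not yet announced to
 * the hardware, and the number of doorbell writes issued so far.
 */
struct tx_ring_model {
	unsigned int unannounced;
	unsigned int doorbells;
	bool stopped;	/* queue stopped, e.g. because the ring is full */
};

/* Queue one packet. Ring the doorbell only if the stack signals that no
 * further packet follows immediately (xmit_more is false) or the queue is
 * stopped, in which case no later packet would flush the pending ones.
 */
static void model_xmit(struct tx_ring_model *ring, bool xmit_more)
{
	assert(ring != NULL);

	ring->unannounced++;
	if (!xmit_more || ring->stopped) {
		ring->doorbells++;
		ring->unannounced = 0;
	}
}
```

A burst of four packets, with xmit_more set on the first three, results in
one doorbell write instead of four.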


* Re: Netlink mmap tx security?
From: Andy Lutomirski @ 2014-10-14 19:33 UTC (permalink / raw)
  To: David Miller; +Cc: Linus Torvalds, Patrick McHardy, Network Development
In-Reply-To: <20141014.151949.1967601568480255495.davem@davemloft.net>

On Tue, Oct 14, 2014 at 12:19 PM, David Miller <davem@davemloft.net> wrote:
> From: Andy Lutomirski <luto@amacapital.net>
> Date: Sat, 11 Oct 2014 15:29:17 -0700
>
>> On May 12, 2014 3:08 PM, "Andy Lutomirski" <luto@amacapital.net> wrote:
>>>
>>> [moving to netdev -- this is much lower impact than I thought, since
>>> you can't set up a netlink mmap ring without global CAP_NET_ADMIN]
>>
>> Did anything ever happen here?  Despite not being a privilege
>> escalation in the normal sense, it's still a bug, and it's still a
>> fairly easy bypass of module signatures.
>
> Andy, please review:
>
> ====================
> [PATCH] netlink: Remove TX mmap support.
>
> There is no reasonable manner in which to absolutely make sure that another
> thread of control cannot write to the pages in the mmap()'d area and thus
> make sure that netlink messages do not change underneath us after we've
> performed verifications.

For full honesty, there is now the machinery developed for memfd
sealing, but I can't imagine that this is ever faster than just
copying the buffer.

I think that the NETLINK_SKB_TX declaration in include/linux/netlink.h
should probably go, too.  And there's the last parameter to
netlink_set_ring, too, and possibly even the nlk->tx_ring struct
itself.

--Andy

>
> Reported-by: Andy Lutomirski <luto@amacapital.net>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  net/netlink/af_netlink.c | 135 ++---------------------------------------------
>  1 file changed, 5 insertions(+), 130 deletions(-)
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index c416725..771e6c0 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -289,11 +289,6 @@ static bool netlink_rx_is_mmaped(struct sock *sk)
>         return nlk_sk(sk)->rx_ring.pg_vec != NULL;
>  }
>
> -static bool netlink_tx_is_mmaped(struct sock *sk)
> -{
> -       return nlk_sk(sk)->tx_ring.pg_vec != NULL;
> -}
> -
>  static __pure struct page *pgvec_to_page(const void *addr)
>  {
>         if (is_vmalloc_addr(addr))
> @@ -662,13 +657,6 @@ static unsigned int netlink_poll(struct file *file, struct socket *sock,
>         }
>         spin_unlock_bh(&sk->sk_receive_queue.lock);
>
> -       spin_lock_bh(&sk->sk_write_queue.lock);
> -       if (nlk->tx_ring.pg_vec) {
> -               if (netlink_current_frame(&nlk->tx_ring, NL_MMAP_STATUS_UNUSED))
> -                       mask |= POLLOUT | POLLWRNORM;
> -       }
> -       spin_unlock_bh(&sk->sk_write_queue.lock);
> -
>         return mask;
>  }
>
> @@ -698,104 +686,6 @@ static void netlink_ring_setup_skb(struct sk_buff *skb, struct sock *sk,
>         NETLINK_CB(skb).sk = sk;
>  }
>
> -static int netlink_mmap_sendmsg(struct sock *sk, struct msghdr *msg,
> -                               u32 dst_portid, u32 dst_group,
> -                               struct sock_iocb *siocb)
> -{
> -       struct netlink_sock *nlk = nlk_sk(sk);
> -       struct netlink_ring *ring;
> -       struct nl_mmap_hdr *hdr;
> -       struct sk_buff *skb;
> -       unsigned int maxlen;
> -       bool excl = true;
> -       int err = 0, len = 0;
> -
> -       /* Netlink messages are validated by the receiver before processing.
> -        * In order to avoid userspace changing the contents of the message
> -        * after validation, the socket and the ring may only be used by a
> -        * single process, otherwise we fall back to copying.
> -        */
> -       if (atomic_long_read(&sk->sk_socket->file->f_count) > 2 ||
> -           atomic_read(&nlk->mapped) > 1)
> -               excl = false;
> -
> -       mutex_lock(&nlk->pg_vec_lock);
> -
> -       ring   = &nlk->tx_ring;
> -       maxlen = ring->frame_size - NL_MMAP_HDRLEN;
> -
> -       do {
> -               hdr = netlink_current_frame(ring, NL_MMAP_STATUS_VALID);
> -               if (hdr == NULL) {
> -                       if (!(msg->msg_flags & MSG_DONTWAIT) &&
> -                           atomic_read(&nlk->tx_ring.pending))
> -                               schedule();
> -                       continue;
> -               }
> -               if (hdr->nm_len > maxlen) {
> -                       err = -EINVAL;
> -                       goto out;
> -               }
> -
> -               netlink_frame_flush_dcache(hdr);
> -
> -               if (likely(dst_portid == 0 && dst_group == 0 && excl)) {
> -                       skb = alloc_skb_head(GFP_KERNEL);
> -                       if (skb == NULL) {
> -                               err = -ENOBUFS;
> -                               goto out;
> -                       }
> -                       sock_hold(sk);
> -                       netlink_ring_setup_skb(skb, sk, ring, hdr);
> -                       NETLINK_CB(skb).flags |= NETLINK_SKB_TX;
> -                       __skb_put(skb, hdr->nm_len);
> -                       netlink_set_status(hdr, NL_MMAP_STATUS_RESERVED);
> -                       atomic_inc(&ring->pending);
> -               } else {
> -                       skb = alloc_skb(hdr->nm_len, GFP_KERNEL);
> -                       if (skb == NULL) {
> -                               err = -ENOBUFS;
> -                               goto out;
> -                       }
> -                       __skb_put(skb, hdr->nm_len);
> -                       memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len);
> -                       netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
> -               }
> -
> -               netlink_increment_head(ring);
> -
> -               NETLINK_CB(skb).portid    = nlk->portid;
> -               NETLINK_CB(skb).dst_group = dst_group;
> -               NETLINK_CB(skb).creds     = siocb->scm->creds;
> -
> -               err = security_netlink_send(sk, skb);
> -               if (err) {
> -                       kfree_skb(skb);
> -                       goto out;
> -               }
> -
> -               if (unlikely(dst_group)) {
> -                       atomic_inc(&skb->users);
> -                       netlink_broadcast(sk, skb, dst_portid, dst_group,
> -                                         GFP_KERNEL);
> -               }
> -               err = netlink_unicast(sk, skb, dst_portid,
> -                                     msg->msg_flags & MSG_DONTWAIT);
> -               if (err < 0)
> -                       goto out;
> -               len += err;
> -
> -       } while (hdr != NULL ||
> -                (!(msg->msg_flags & MSG_DONTWAIT) &&
> -                 atomic_read(&nlk->tx_ring.pending)));
> -
> -       if (len > 0)
> -               err = len;
> -out:
> -       mutex_unlock(&nlk->pg_vec_lock);
> -       return err;
> -}
> -
>  static void netlink_queue_mmaped_skb(struct sock *sk, struct sk_buff *skb)
>  {
>         struct nl_mmap_hdr *hdr;
> @@ -842,10 +732,8 @@ static void netlink_ring_set_copied(struct sock *sk, struct sk_buff *skb)
>  #else /* CONFIG_NETLINK_MMAP */
>  #define netlink_skb_is_mmaped(skb)     false
>  #define netlink_rx_is_mmaped(sk)       false
> -#define netlink_tx_is_mmaped(sk)       false
>  #define netlink_mmap                   sock_no_mmap
>  #define netlink_poll                   datagram_poll
> -#define netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group, siocb)    0
>  #endif /* CONFIG_NETLINK_MMAP */
>
>  static void netlink_skb_destructor(struct sk_buff *skb)
> @@ -864,16 +752,11 @@ static void netlink_skb_destructor(struct sk_buff *skb)
>                 hdr = netlink_mmap_hdr(skb);
>                 sk = NETLINK_CB(skb).sk;
>
> -               if (NETLINK_CB(skb).flags & NETLINK_SKB_TX) {
> -                       netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
> -                       ring = &nlk_sk(sk)->tx_ring;
> -               } else {
> -                       if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
> -                               hdr->nm_len = 0;
> -                               netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
> -                       }
> -                       ring = &nlk_sk(sk)->rx_ring;
> +               if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
> +                       hdr->nm_len = 0;
> +                       netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
>                 }
> +               ring = &nlk_sk(sk)->rx_ring;
>
>                 WARN_ON(atomic_read(&ring->pending) == 0);
>                 atomic_dec(&ring->pending);
> @@ -2165,8 +2048,7 @@ static int netlink_setsockopt(struct socket *sock, int level, int optname,
>                 err = 0;
>                 break;
>  #ifdef CONFIG_NETLINK_MMAP
> -       case NETLINK_RX_RING:
> -       case NETLINK_TX_RING: {
> +       case NETLINK_RX_RING: {
>                 struct nl_mmap_req req;
>
>                 /* Rings might consume more memory than queue limits, require
> @@ -2295,13 +2177,6 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
>                         goto out;
>         }
>
> -       if (netlink_tx_is_mmaped(sk) &&
> -           msg->msg_iov->iov_base == NULL) {
> -               err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
> -                                          siocb);
> -               goto out;
> -       }
> -
>         err = -EMSGSIZE;
>         if (len > sk->sk_sndbuf - 32)
>                 goto out;
> --
> 1.7.11.7
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


* Re: [patch net repost] ipv4: fix nexthop attlen check in fib_nh_match
From: David Miller @ 2014-10-14 19:33 UTC (permalink / raw)
  To: jiri; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber, edumazet, tgraf
In-Reply-To: <1413210850-14456-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Mon, 13 Oct 2014 16:34:10 +0200

> fib_nh_match does not match nexthops correctly. Example:
> 
> ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
>                           nexthop via 192.168.122.13 dev eth0
> ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
>                           nexthop via 192.168.122.15 dev eth0
> 
> The del command succeeds and the route is removed. With this patch
> applied, the route is correctly matched and the result is:
> RTNETLINK answers: No such process
> 
> Please consider this for stable trees as well.
> 
> Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config")
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> Acked-by: Eric Dumazet <edumazet@google.com>
> ---
> reposted with example (it was missing for some reason in the original post)

Applied and queued up for -stable, thanks Jiri!


* Re: Fwd: micrel: ksz8051 badly detected as ksz8031
From: Angelo Dureghello @ 2014-10-14 19:33 UTC (permalink / raw)
  To: Florian Fainelli, netdev@vger.kernel.org
In-Reply-To: <543D7507.3060008@gmail.com>

Hi Florian,

> On 10/14/2014 10:24 AM, Angelo Dureghello wrote:
>> Dear,
>>
>> I have to apologize for the confusion: the previous patch is not the proper fix,
>> since it does not completely solve the issue.
>>
>> Also, I largely misunderstood the issue.
>>
>> The issue I am experiencing is:
>>
>> https://lkml.org/lkml/2013/9/18/259
>>
>> Mainly, I have a Micrel chip marked KSZ8051(RNL), but the product ID in the
>> silicon is that of the KSZ8031, and Linux detects it as a KSZ8031.
>> The attempt to write the MDIO boot override register kills the Micrel functionality.
> Ok, so basically your bootloader does something that Linux does, and
> once Linux boots, it will reset the PHY to put it in a known state. If
> you can snoop the MDIO read/writes done in your bootloader environment,
> that might help narrow down the issue.
>
The bootloader is U-Boot, and it seems to use the generic PHY setup, so it works.

At boot, Linux detects the PHY and does a soft_reset. If the detection
selects the KSZ8031 driver, the Ethernet link will not work, because micrel.c
uses the wrong config_init function: it attempts to write to the bootstrap
override register, which cannot be written on the KSZ8051, and so puts the
Micrel chip in a broken state.
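
To illustrate why the ID stored in the silicon decides which init path runs, here is a minimal sketch of mask-based PHY ID matching. The two ID values are the ones from my hack below; the masks, the table order, and all struct and function names are assumptions made up for this sketch, not a copy of micrel.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver table entry; names and masks are illustrative only. */
struct phy_drv_entry {
	uint32_t id;      /* ID the driver claims */
	uint32_t mask;    /* bits of the ID that must match */
	const char *name;
};

/* IDs taken from the thread; masks and ordering are assumed for the sketch. */
static const struct phy_drv_entry drv_table[] = {
	{ 0x00221556, 0x00ffffff, "KSZ8031" },
	{ 0x00221550, 0x00fffff0, "KSZ8051" },
};

/* Return the first entry whose masked ID equals the masked ID read from the chip. */
static const struct phy_drv_entry *phy_match(uint32_t phy_id)
{
	size_t i;

	for (i = 0; i < sizeof(drv_table) / sizeof(drv_table[0]); i++) {
		const struct phy_drv_entry *e = &drv_table[i];

		if ((phy_id & e->mask) == (e->id & e->mask))
			return e;
	}
	return NULL; /* no driver claims this ID */
}
```

With this table, a chip reporting 0x00221556 selects the KSZ8031 entry regardless of what is printed on the package, so whatever init routine is bound to that entry is the one that runs.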

The people in this thread (https://lkml.org/lkml/2013/9/18/259) seem to be
discussing a better solution.

I patched my Linux tree as below, but this is only a quick fixup for me:

diff -rupN drivers/net/phy/phy_device.c ../linux-3.17/drivers/net/phy/phy_device.c
--- drivers/net/phy/phy_device.c        2014-10-14 21:05:56.191117190 +0200
+++ ../linux-3.17/drivers/net/phy/phy_device.c  2014-10-05 21:23:04.000000000 +0200
@@ -310,19 +310,6 @@ static int get_phy_id(struct mii_bus *bu

         *phy_id |= (phy_reg & 0xffff);

-       /*
-        * Angelo - Barix
-        * Micrel produced chips marked KSZ8051 but with KSZ8031 id code
-        * in the silicon. After going crazy trying to understand why
-        * Ethernet was not working in recent kernels, I found it out.
-        *
-        * From the schematic, we assume to use KSZ8051
-        * I hardcode the fix here for ipam390.
-        */
-#if CONFIG_MACH_BARIX_IPAM390
-       if (*phy_id == 0x00221556) *phy_id = 0x00221550;
-#endif
-
         return 0;
  }

Regards
angelo


* Re: [PATCH net] tcp: TCP Small Queues and strange attractors
From: David Miller @ 2014-10-14 19:33 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1413206867.9362.100.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 13 Oct 2014 06:27:47 -0700

> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 8d4eac793700..4a7e97811d71 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -839,26 +839,38 @@ void tcp_wfree(struct sk_buff *skb)
>  {
 ...
>  		local_irq_restore(flags);
> -	} else {
> -		sock_wfree(skb);
> +		return;
>  	}
> +out:
> +	sk_free(sk);
>  }
>  

Why do we need to release the socket here?


* Re: [PATCH net] tcp: fix tcp_ack() performance problem
From: David Miller @ 2014-10-14 19:30 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, willemb, ncardwell, ycheng, vanj
In-Reply-To: <1413065849.9362.72.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 11 Oct 2014 15:17:29 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> We worked hard to improve tcp_ack() performance, by not accessing
> skb_shinfo() in fast path (cd7d8498c9a5 tcp: change tcp_skb_pcount()
> location)
> 
> We still have one spurious access because of ACK timestamping,
> added in commit e1c8a607b281 ("net-timestamp: ACK timestamp for
> bytestreams")
> 
> By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set,
> we can avoid two cache line misses for the common case.
> 
> While we are at it, add two prefetchw() :
> 
> One in tcp_ack() to bring skb at the head of write queue.
> 
> One in tcp_clean_rtx_queue() loop to bring following skb,
> as we will delete skb from the write queue and dirty skb->next->prev.
> 
> Add a couple of [un]likely() clauses.
> 
> After this patch, tcp_ack() is no longer the most consuming
> function in tcp stack.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.
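
The flag test described in the quoted commit message can be sketched in user space as follows. This is only an illustration of the ordering (test a flag that is already hot before dereferencing rarely needed data, and prefetch the next element for write); the structs, field names, and the `TS_TX_ACK` constant are hypothetical stand-ins, not the kernel's definitions, and `__builtin_prefetch` is the GCC/Clang builtin rather than the kernel's prefetchw().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bit; stands in for SOF_TIMESTAMPING_TX_ACK. */
#define TS_TX_ACK (1u << 9)

/* Hypothetical stand-ins for the cold per-packet data and the socket. */
struct cold_info { uint32_t tskey; bool wants_ack_ts; };
struct pkt { struct pkt *next; struct cold_info *cold; };
struct sock_lite { uint32_t tsflags; };

/* Only touch the cold data when the socket asked for ACK timestamps. */
static bool ack_needs_tstamp(const struct sock_lite *sk, const struct pkt *p)
{
	if (!(sk->tsflags & TS_TX_ACK))
		return false;            /* common case: cold data never read */
	return p->cold->wants_ack_ts;
}

/* Walk a queue, prefetching the next element for write before using the current one. */
static unsigned int count_ack_tstamps(const struct sock_lite *sk, struct pkt *head)
{
	unsigned int n = 0;
	struct pkt *p;

	for (p = head; p != NULL; p = p->next) {
		if (p->next != NULL)
			__builtin_prefetch(p->next, 1); /* 1 = prefetch for write */
		if (ack_needs_tstamp(sk, p))
			n++;
	}
	return n;
}
```

When the flag is clear, `ack_needs_tstamp()` returns before `p->cold` is dereferenced, which is the part of the pattern that avoids the extra memory accesses in the common case.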


* Re: [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries
From: David Miller @ 2014-10-14 19:30 UTC (permalink / raw)
  To: nicolas.dichtel
  Cc: netdev, linux-kernel, ebiederm, akpm, adobriyan, rui.xiang, viro,
	oleg, gorcunov, kirill.shutemov, grant.likely, tytso, torvalds
In-Reply-To: <543BB42B.30505@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Mon, 13 Oct 2014 13:14:51 +0200

> I'm not sure who is in charge of taking this patch. Should I resend
> it to someone else or is it already included in a tree?

Just want to make it clear that I don't intend to take this via
the networking tree.


* Re: Netlink mmap tx security?
From: David Miller @ 2014-10-14 19:19 UTC (permalink / raw)
  To: luto; +Cc: torvalds, kaber, netdev
In-Reply-To: <CALCETrWfQe5H2Ht7cjCQLfUw+XUcRvga_H93esaWpAp37=noZg@mail.gmail.com>

From: Andy Lutomirski <luto@amacapital.net>
Date: Sat, 11 Oct 2014 15:29:17 -0700

> On May 12, 2014 3:08 PM, "Andy Lutomirski" <luto@amacapital.net> wrote:
>>
>> [moving to netdev -- this is much lower impact than I thought, since
>> you can't set up a netlink mmap ring without global CAP_NET_ADMIN]
> 
> Did anything ever happen here?  Despite not being a privilege
> escalation in the normal sense, it's still a bug, and it's still a
> fairly easy bypass of module signatures.

Andy, please review:

====================
[PATCH] netlink: Remove TX mmap support.

There is no reasonable manner in which to absolutely make sure that another
thread of control cannot write to the pages in the mmap()'d area and thus
make sure that netlink messages do not change underneath us after we've
performed verifications.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/netlink/af_netlink.c | 135 ++---------------------------------------------
 1 file changed, 5 insertions(+), 130 deletions(-)
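
For readers following along, the concern in the description above is a check-then-use race on memory that the sender can still write. A minimal user-space sketch of the safe pattern (snapshot the frame into private memory, then validate and use only the private copy) is shown here; the struct layout and function names are hypothetical and are not the kernel's nl_mmap_hdr or any netlink API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 64

/* Hypothetical message layout for the sketch. */
struct msg {
	uint32_t len;
	uint8_t payload[MAX_PAYLOAD];
};

/* A message is well formed if its length fits the payload area. */
static int msg_valid(const struct msg *m)
{
	return m->len <= MAX_PAYLOAD;
}

/*
 * Copy the shared frame once into private memory, then validate the copy.
 * Later writes to the shared frame cannot change what was validated.
 * Returns 0 on success, -1 if the snapshot is malformed.
 */
static int receive_copy(const struct msg *shared, struct msg *priv)
{
	memcpy(priv, shared, sizeof(*priv));
	return msg_valid(priv) ? 0 : -1;
}
```

Validating the shared frame in place and then reading it again later would leave a window in which another thread could rewrite `len` or the payload; the copy closes that window at the cost of the zero-copy benefit.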

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c416725..771e6c0 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -289,11 +289,6 @@ static bool netlink_rx_is_mmaped(struct sock *sk)
 	return nlk_sk(sk)->rx_ring.pg_vec != NULL;
 }
 
-static bool netlink_tx_is_mmaped(struct sock *sk)
-{
-	return nlk_sk(sk)->tx_ring.pg_vec != NULL;
-}
-
 static __pure struct page *pgvec_to_page(const void *addr)
 {
 	if (is_vmalloc_addr(addr))
@@ -662,13 +657,6 @@ static unsigned int netlink_poll(struct file *file, struct socket *sock,
 	}
 	spin_unlock_bh(&sk->sk_receive_queue.lock);
 
-	spin_lock_bh(&sk->sk_write_queue.lock);
-	if (nlk->tx_ring.pg_vec) {
-		if (netlink_current_frame(&nlk->tx_ring, NL_MMAP_STATUS_UNUSED))
-			mask |= POLLOUT | POLLWRNORM;
-	}
-	spin_unlock_bh(&sk->sk_write_queue.lock);
-
 	return mask;
 }
 
@@ -698,104 +686,6 @@ static void netlink_ring_setup_skb(struct sk_buff *skb, struct sock *sk,
 	NETLINK_CB(skb).sk = sk;
 }
 
-static int netlink_mmap_sendmsg(struct sock *sk, struct msghdr *msg,
-				u32 dst_portid, u32 dst_group,
-				struct sock_iocb *siocb)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-	struct netlink_ring *ring;
-	struct nl_mmap_hdr *hdr;
-	struct sk_buff *skb;
-	unsigned int maxlen;
-	bool excl = true;
-	int err = 0, len = 0;
-
-	/* Netlink messages are validated by the receiver before processing.
-	 * In order to avoid userspace changing the contents of the message
-	 * after validation, the socket and the ring may only be used by a
-	 * single process, otherwise we fall back to copying.
-	 */
-	if (atomic_long_read(&sk->sk_socket->file->f_count) > 2 ||
-	    atomic_read(&nlk->mapped) > 1)
-		excl = false;
-
-	mutex_lock(&nlk->pg_vec_lock);
-
-	ring   = &nlk->tx_ring;
-	maxlen = ring->frame_size - NL_MMAP_HDRLEN;
-
-	do {
-		hdr = netlink_current_frame(ring, NL_MMAP_STATUS_VALID);
-		if (hdr == NULL) {
-			if (!(msg->msg_flags & MSG_DONTWAIT) &&
-			    atomic_read(&nlk->tx_ring.pending))
-				schedule();
-			continue;
-		}
-		if (hdr->nm_len > maxlen) {
-			err = -EINVAL;
-			goto out;
-		}
-
-		netlink_frame_flush_dcache(hdr);
-
-		if (likely(dst_portid == 0 && dst_group == 0 && excl)) {
-			skb = alloc_skb_head(GFP_KERNEL);
-			if (skb == NULL) {
-				err = -ENOBUFS;
-				goto out;
-			}
-			sock_hold(sk);
-			netlink_ring_setup_skb(skb, sk, ring, hdr);
-			NETLINK_CB(skb).flags |= NETLINK_SKB_TX;
-			__skb_put(skb, hdr->nm_len);
-			netlink_set_status(hdr, NL_MMAP_STATUS_RESERVED);
-			atomic_inc(&ring->pending);
-		} else {
-			skb = alloc_skb(hdr->nm_len, GFP_KERNEL);
-			if (skb == NULL) {
-				err = -ENOBUFS;
-				goto out;
-			}
-			__skb_put(skb, hdr->nm_len);
-			memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len);
-			netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
-		}
-
-		netlink_increment_head(ring);
-
-		NETLINK_CB(skb).portid	  = nlk->portid;
-		NETLINK_CB(skb).dst_group = dst_group;
-		NETLINK_CB(skb).creds	  = siocb->scm->creds;
-
-		err = security_netlink_send(sk, skb);
-		if (err) {
-			kfree_skb(skb);
-			goto out;
-		}
-
-		if (unlikely(dst_group)) {
-			atomic_inc(&skb->users);
-			netlink_broadcast(sk, skb, dst_portid, dst_group,
-					  GFP_KERNEL);
-		}
-		err = netlink_unicast(sk, skb, dst_portid,
-				      msg->msg_flags & MSG_DONTWAIT);
-		if (err < 0)
-			goto out;
-		len += err;
-
-	} while (hdr != NULL ||
-		 (!(msg->msg_flags & MSG_DONTWAIT) &&
-		  atomic_read(&nlk->tx_ring.pending)));
-
-	if (len > 0)
-		err = len;
-out:
-	mutex_unlock(&nlk->pg_vec_lock);
-	return err;
-}
-
 static void netlink_queue_mmaped_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct nl_mmap_hdr *hdr;
@@ -842,10 +732,8 @@ static void netlink_ring_set_copied(struct sock *sk, struct sk_buff *skb)
 #else /* CONFIG_NETLINK_MMAP */
 #define netlink_skb_is_mmaped(skb)	false
 #define netlink_rx_is_mmaped(sk)	false
-#define netlink_tx_is_mmaped(sk)	false
 #define netlink_mmap			sock_no_mmap
 #define netlink_poll			datagram_poll
-#define netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group, siocb)	0
 #endif /* CONFIG_NETLINK_MMAP */
 
 static void netlink_skb_destructor(struct sk_buff *skb)
@@ -864,16 +752,11 @@ static void netlink_skb_destructor(struct sk_buff *skb)
 		hdr = netlink_mmap_hdr(skb);
 		sk = NETLINK_CB(skb).sk;
 
-		if (NETLINK_CB(skb).flags & NETLINK_SKB_TX) {
-			netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
-			ring = &nlk_sk(sk)->tx_ring;
-		} else {
-			if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
-				hdr->nm_len = 0;
-				netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
-			}
-			ring = &nlk_sk(sk)->rx_ring;
+		if (!(NETLINK_CB(skb).flags & NETLINK_SKB_DELIVERED)) {
+			hdr->nm_len = 0;
+			netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
 		}
+		ring = &nlk_sk(sk)->rx_ring;
 
 		WARN_ON(atomic_read(&ring->pending) == 0);
 		atomic_dec(&ring->pending);
@@ -2165,8 +2048,7 @@ static int netlink_setsockopt(struct socket *sock, int level, int optname,
 		err = 0;
 		break;
 #ifdef CONFIG_NETLINK_MMAP
-	case NETLINK_RX_RING:
-	case NETLINK_TX_RING: {
+	case NETLINK_RX_RING: {
 		struct nl_mmap_req req;
 
 		/* Rings might consume more memory than queue limits, require
@@ -2295,13 +2177,6 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 			goto out;
 	}
 
-	if (netlink_tx_is_mmaped(sk) &&
-	    msg->msg_iov->iov_base == NULL) {
-		err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
-					   siocb);
-		goto out;
-	}
-
 	err = -EMSGSIZE;
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
-- 
1.7.11.7


* Re: Fwd: micrel: ksz8051 badly detected as ksz8031
From: Florian Fainelli @ 2014-10-14 19:09 UTC (permalink / raw)
  To: Angelo Dureghello, netdev@vger.kernel.org
In-Reply-To: <543D5C5B.9010703@gmail.com>

On 10/14/2014 10:24 AM, Angelo Dureghello wrote:
> Dear,
> 
> I have to apologize for the confusion: the previous patch is not the proper fix,
> since it does not completely solve the issue.
> 
> Also, I largely misunderstood the issue.
> 
> The issue I am experiencing is:
> 
> https://lkml.org/lkml/2013/9/18/259
> 
> Mainly, I have a Micrel chip marked KSZ8051(RNL), but the product ID in the
> silicon is that of the KSZ8031, and Linux detects it as a KSZ8031.
> The attempt to write the MDIO boot override register kills the Micrel functionality.

Ok, so basically your bootloader does something that Linux does, and
once Linux boots, it will reset the PHY to put it in a known state. If
you can snoop the MDIO read/writes done in your bootloader environment,
that might help narrow down the issue.

> 
> So I just replaced the phy_id (hardcoded) in the MDIO detection
> routine.

According to the link you posted above, the problem is that two
different PHY chips have the same PHY ID, so why would you need to hardcode
the detection here?

> 
> Also, I am not feeding the Micrel any external 50 MHz clock; as per
> default, the Micrel provides the clock to the davinci-emac.
> So no fixups are needed in my case.
> 
> But I still have one last issue: I see the link is up at 100 Mbit, but no
> packets are actually sent, and nothing is received.
> The link LEDs are on.

Do you have access to the Ethernet MAC counters using 'ethtool -S'? That
might tell you whether the problem is between the MAC and PHY, or at the
PHY level. If the PHY maintains its own set of counters (fairly unlikely),
can you try reading those and see if that works?

What about pinmux settings, Ethernet MAC settings etc... are they all
correct? Is there any Ethernet MAC back-pressure mechanism that can be
read to know whether the DMA engine is stuck transmitting?

Thanks

> 
> Time zone set
> Starting network...
> davinci_mdio davinci_mdio.0: resetting idled controller
> net eth0: attached PHY driver [Micrel KSZ8051]
> (mii_bus:phy_addr=davinci_mdio-0:00, id=221550)
> udhcpc (v1.20.2) started
> Sending discover...
> davinci_emac davinci_emac.1 eth0: Link is Up - 100Mbps/Full - flow
> control off
> Sending discover...
> Sending discover...
> No lease, failing
> ....
> 
> 
> [root@barix ~]# ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:08:E1:03:2A:C5
>           inet6 addr: fe80::208:e1ff:fe03:2ac5/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:0 (0.0 B)  TX bytes:2258 (2.2 KiB)
>           Interrupt:33
> 
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> Packets seem to be sent, but they are not sent at all (checked from the WS),
> and no packets are received at the same time.
> 
> Regards,
> Angelo


* Re: [PATCH 00/12] Coverity patches for drivers/isdn
From: David Miller @ 2014-10-14 19:05 UTC (permalink / raw)
  To: tilman; +Cc: netdev, davej, hjlipp, isdn, isdn4linux
In-Reply-To: <cover.1413021630.git.tilman@imap.cc>

From: Tilman Schmidt <tilman@imap.cc>
Date: Sat, 11 Oct 2014 13:46:29 +0200 (CEST)

> Here's a series of patches for the ISDN CAPI subsystem and the
> Gigaset ISDN driver.
> Patches 1 to 7 are specific fixes for Coverity warnings.
> Patches 8 to 11 fix related problems with the handling of invalid
> CAPI command codes I noticed while working on this.
> Patch 12 fixes an unrelated problem I noticed during the subsequent
> regression tests.
> It would be great if these could still be merged.

Series applied, thanks.


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: David Miller @ 2014-10-14 19:03 UTC (permalink / raw)
  To: cwang; +Cc: kkolasa, eric.dumazet, netdev, edumazet
In-Reply-To: <CAHA+R7N9rNUH4aytFG3YLt8nzpERqmf34FgaWbe5R7P0AaWuZw@mail.gmail.com>

From: Cong Wang <cwang@twopensource.com>
Date: Tue, 14 Oct 2014 11:59:25 -0700

> On Tue, Oct 14, 2014 at 6:25 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
>> On 14.10.2014 at 02:09, Cong Wang wrote:
>>
>>> On Mon, Oct 13, 2014 at 4:59 PM, Cong Wang <cwang@twopensource.com> wrote:
>>>>
>>>> Probably not related to this bug, but with regard to the
>>>> offending commit, what's the point of the memmove() in tcp_v4_rcv()
>>>> since ip_rcv() already clears IPCB()?
>>>
>>> Oh, ip options are actually saved in ip_rcv_finish()... Hmm, looks scary
>>> to play with variable-length array with memmove()....
>>>
>> On my other old laptop, with a 32-bit linux-next kernel and an Intel 945GM
>> graphics card, everything works OK once the commit is reverted.
>> Before the revert, the decorations disappear a few seconds after logging in to GNOME Shell.
>>
>> 32 bit Ubuntu 12.04.5 LTS, gnome shell, kernel source next 14-10-2014
>>
>> Can anyone confirm this ?
>>
> 
> Sorry, believe it or not, for me it is hard to find a 32bit machine even VM. :)
> 
> Could the attached patch by any chance help? I noticed cookie_v4_check()
> still uses IPCB() instead of TCPCB() at TCP layer.

Eric, please review.


* Re: [PATCH RFC v4 net 1/3] ipv6: Remove BACKTRACK macro
From: David Miller @ 2014-10-14 19:01 UTC (permalink / raw)
  To: kafai; +Cc: netdev, hannes
In-Reply-To: <1412966888-31384-2-git-send-email-kafai@fb.com>

From: Martin KaFai Lau <kafai@fb.com>
Date: Fri, 10 Oct 2014 11:48:06 -0700

> +struct fib6_node *fib6_backtrack(struct fib6_node *fn,
> +				 struct in6_addr *saddr);
> +

I am completely mystified why you did this, could you explain the
logic?  I want to know what drove you to make this exported.

I marked it static in my example patch, and there is no caller outside
of route.c

Doing this also eliminates inlining opportunities.

Please keep this private inside of route.c


