Netdev List
 help / color / mirror / Atom feed
* Re: [0/3] net: Kill skb_copy_datagram_const_iovec
From: David Miller @ 2014-11-03 20:05 UTC (permalink / raw)
  To: herbert; +Cc: viro, netdev, linux-kernel, bcrl
In-Reply-To: <20141103053751.GA27845@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 13:37:51 +0800

> On Mon, Nov 03, 2014 at 12:45:03AM +0000, Al Viro wrote:
>>
>> Note, BTW, that there's a damn good reason to convert the socket side of
>> things to iov_iter - as it is, ->splice_write() there is basically done with
>> page-by-page mapping and doing kernel_sendmsg(); being able to deal with
>> "map and copy" stuff *inside* ->sendmsg() would not only reduce the overhead,
>> it would allow to get rid of ->sendpage() completely.  Basically, let
>> ->sendmsg() instances check the iov_iter type and play zerocopy games if
>> it's an "array of kernel pages" kind.  Compare ->sendpage() and ->sendmsg()
>> instances for the protocols that have nontrivial ->sendpage(); you'll see
>> that there's a lot of duplication.  Merging them looks very feasible, with
>> divergence happening only very deep in the call chain.
> 
> Honestly I don't really care which way we end up going as long as
> we pick one solution and stick with it.  Right now we have an
> abomination in the form of skb_copy_datagram_const_iovec which is
> the worst of both worlds, plus it duplicates tons of code.
> 
> So here's a few patches to kill this crap.

To pick one direction and go with it, I totally agree with.

But a patch set like this as an interim solution, I am not so happy
with.

If the method says const, we have a contract with the caller to not
modify the iovec.  That caller can assume that we have not done so.

So this patch set violated that contract and can result in real bugs
either now or in the future.

I'll see if I can make some progress converting the networking over
to iov_iter.  It can't be that difficult... albeit perhaps a little
time consuming.

^ permalink raw reply

* [PATCH v2 3/3] drivers: net: xgene: fix: Use separate resources
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
  To: davem, netdev, devicetree
  Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>

This patch fixes the following kernel crash during SGMII based 1GbE probe.

	BUG: Bad page state in process swapper/0  pfn:40fe6ad
	page:ffffffbee37a75d8 count:-1 mapcount:0 mapping:          (null) index:0x0
	flags: 0x0()
	page dumped because: nonzero _count
	Modules linked in:
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0+ #7
	Call trace:
	[<ffffffc000087fa0>] dump_backtrace+0x0/0x12c
	[<ffffffc0000880dc>] show_stack+0x10/0x1c
	[<ffffffc0004d981c>] dump_stack+0x74/0xc4
	[<ffffffc00012fe70>] bad_page+0xd8/0x128
	[<ffffffc000133000>] get_page_from_freelist+0x4b8/0x640
	[<ffffffc000133260>] __alloc_pages_nodemask+0xd8/0x834
	[<ffffffc0004194f8>] __netdev_alloc_frag+0x124/0x1b8
	[<ffffffc00041bfdc>] __netdev_alloc_skb+0x90/0x10c
	[<ffffffc00039ff30>] xgene_enet_refill_bufpool+0x11c/0x280
	[<ffffffc0003a11a4>] xgene_enet_process_ring+0x168/0x340
	[<ffffffc0003a1498>] xgene_enet_napi+0x1c/0x50
	[<ffffffc00042b454>] net_rx_action+0xc8/0x18c
	[<ffffffc0000b0880>] __do_softirq+0x114/0x24c
	[<ffffffc0000b0c34>] irq_exit+0x94/0xc8
	[<ffffffc0000e68a0>] __handle_domain_irq+0x8c/0xf4
	[<ffffffc000081288>] gic_handle_irq+0x30/0x7c

This was due to hardware resource sharing conflict with the firmware. This
patch fixes this crash by using resources (descriptor ring, prefetch buffer)
that are not shared.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 6 +++---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index cc3f955..1236696 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -639,9 +639,9 @@ static int xgene_enet_create_desc_rings(struct net_device *ndev)
 	struct device *dev = ndev_to_dev(ndev);
 	struct xgene_enet_desc_ring *rx_ring, *tx_ring, *cp_ring;
 	struct xgene_enet_desc_ring *buf_pool = NULL;
-	u8 cpu_bufnum = 0, eth_bufnum = 0;
-	u8 bp_bufnum = 0x20;
-	u16 ring_id, ring_num = 0;
+	u8 cpu_bufnum = 0, eth_bufnum = START_ETH_BUFNUM;
+	u8 bp_bufnum = START_BP_BUFNUM;
+	u16 ring_id, ring_num = START_RING_NUM;
 	int ret;
 
 	/* allocate rx descriptor ring */
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index dba647d..f9958fa 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -38,6 +38,9 @@
 #define SKB_BUFFER_SIZE		(XGENE_ENET_MAX_MTU - NET_IP_ALIGN)
 #define NUM_PKT_BUF	64
 #define NUM_BUFPOOL	32
+#define START_ETH_BUFNUM	2
+#define START_BP_BUFNUM		0x22
+#define START_RING_NUM		8
 
 #define PHY_POLL_LINK_ON	(10 * HZ)
 #define PHY_POLL_LINK_OFF	(PHY_POLL_LINK_ON / 5)
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 2/3] drivers: net: xgene: Backward compatibility with older firmware
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
  To: davem, netdev, devicetree
  Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>

This patch adds support when used with older firmware (<= 1.13.28).

- Added xgene_ring_mgr_init() to check whether ring manager is initialized
- Calling xgene_ring_mgr_init() from xgene_port_ops.reset()
- To handle errors, changed the return type of xgene_port_ops.reset()

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c    | 18 +++++++++++++++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h    |  4 ++++
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  |  5 ++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  2 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c |  7 ++++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c |  7 ++++++-
 6 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 63ea194..7ba83ff 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -575,10 +575,24 @@ static void xgene_gmac_tx_disable(struct xgene_enet_pdata *pdata)
 	xgene_enet_wr_mcx_mac(pdata, MAC_CONFIG_1_ADDR, data & ~TX_EN);
 }
 
-static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
+bool xgene_ring_mgr_init(struct xgene_enet_pdata *p)
+{
+	if (!ioread32(p->ring_csr_addr + CLKEN_ADDR))
+		return false;
+
+	if (ioread32(p->ring_csr_addr + SRST_ADDR))
+		return false;
+
+	return true;
+}
+
+static int xgene_enet_reset(struct xgene_enet_pdata *pdata)
 {
 	u32 val;
 
+	if (!xgene_ring_mgr_init(pdata))
+		return -ENODEV;
+
 	clk_prepare_enable(pdata->clk);
 	clk_disable_unprepare(pdata->clk);
 	clk_prepare_enable(pdata->clk);
@@ -590,6 +604,8 @@ static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
 	val |= SCAN_AUTO_INCR;
 	MGMT_CLOCK_SEL_SET(&val, 1);
 	xgene_enet_wr_mcx_mac(pdata, MII_MGMT_CONFIG_ADDR, val);
+
+	return 0;
 }
 
 static void xgene_gport_shutdown(struct xgene_enet_pdata *pdata)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index 3855858..ec45f32 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -104,6 +104,9 @@ enum xgene_enet_rm {
 #define BLOCK_ETH_MAC_OFFSET		0x0000
 #define BLOCK_ETH_MAC_CSR_OFFSET	0x2800
 
+#define CLKEN_ADDR			0xc208
+#define SRST_ADDR			0xc200
+
 #define MAC_ADDR_REG_OFFSET		0x00
 #define MAC_COMMAND_REG_OFFSET		0x04
 #define MAC_WRITE_REG_OFFSET		0x08
@@ -318,6 +321,7 @@ void xgene_enet_parse_error(struct xgene_enet_desc_ring *ring,
 
 int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata);
 void xgene_enet_mdio_remove(struct xgene_enet_pdata *pdata);
+bool xgene_ring_mgr_init(struct xgene_enet_pdata *p);
 
 extern struct xgene_mac_ops xgene_gmac_ops;
 extern struct xgene_port_ops xgene_gport_ops;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 3c208cc..cc3f955 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -852,7 +852,9 @@ static int xgene_enet_init_hw(struct xgene_enet_pdata *pdata)
 	u16 dst_ring_num;
 	int ret;
 
-	pdata->port_ops->reset(pdata);
+	ret = pdata->port_ops->reset(pdata);
+	if (ret)
+		return ret;
 
 	ret = xgene_enet_create_desc_rings(ndev);
 	if (ret) {
@@ -954,6 +956,7 @@ static int xgene_enet_probe(struct platform_device *pdev)
 
 	return ret;
 err:
+	unregister_netdev(ndev);
 	free_netdev(ndev);
 	return ret;
 }
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index 874e5a0..dba647d 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -83,7 +83,7 @@ struct xgene_mac_ops {
 };
 
 struct xgene_port_ops {
-	void (*reset)(struct xgene_enet_pdata *pdata);
+	int (*reset)(struct xgene_enet_pdata *pdata);
 	void (*cle_bypass)(struct xgene_enet_pdata *pdata,
 			   u32 dst_ring_num, u16 bufpool_id);
 	void (*shutdown)(struct xgene_enet_pdata *pdata);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
index c22f326..f5d4f68 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c
@@ -311,14 +311,19 @@ static void xgene_sgmac_tx_disable(struct xgene_enet_pdata *p)
 	xgene_sgmac_rxtx(p, TX_EN, false);
 }
 
-static void xgene_enet_reset(struct xgene_enet_pdata *p)
+static int xgene_enet_reset(struct xgene_enet_pdata *p)
 {
+	if (!xgene_ring_mgr_init(p))
+		return -ENODEV;
+
 	clk_prepare_enable(p->clk);
 	clk_disable_unprepare(p->clk);
 	clk_prepare_enable(p->clk);
 
 	xgene_enet_ecc_init(p);
 	xgene_enet_config_ring_if_assoc(p);
+
+	return 0;
 }
 
 static void xgene_enet_cle_bypass(struct xgene_enet_pdata *p,
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
index 67d0720..a18a9d1 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
@@ -252,14 +252,19 @@ static void xgene_xgmac_tx_disable(struct xgene_enet_pdata *pdata)
 	xgene_enet_wr_mac(pdata, AXGMAC_CONFIG_1, data & ~HSTTFEN);
 }
 
-static void xgene_enet_reset(struct xgene_enet_pdata *pdata)
+static int xgene_enet_reset(struct xgene_enet_pdata *pdata)
 {
+	if (!xgene_ring_mgr_init(pdata))
+		return -ENODEV;
+
 	clk_prepare_enable(pdata->clk);
 	clk_disable_unprepare(pdata->clk);
 	clk_prepare_enable(pdata->clk);
 
 	xgene_enet_ecc_init(pdata);
 	xgene_enet_config_ring_if_assoc(pdata);
+
+	return 0;
 }
 
 static void xgene_enet_xgcle_bypass(struct xgene_enet_pdata *pdata,
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 1/3] dtb: xgene: fix: Backward compatibility with older firmware
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
  To: davem, netdev, devicetree
  Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian
In-Reply-To: <1415044796-5081-1-git-send-email-isubramanian@apm.com>

The following kernel crash was reported when using older firmware (<= 1.13.28).

[    0.980000] libphy: APM X-Gene MDIO bus: probed
[    1.130000] Unhandled fault: synchronous external abort (0x96000010) at 0xffffff800009a17c
[    1.140000] Internal error: : 96000010 [#1] SMP
[    1.140000] Modules linked in:
[    1.140000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0+ #21
[    1.140000] task: ffffffc3f0110000 ti: ffffffc3f0064000 task.ti: ffffffc3f0064000
[    1.140000] PC is at ioread32+0x58/0x68
[    1.140000] LR is at xgene_enet_setup_ring+0x18c/0x1cc
[    1.140000] pc : [<ffffffc0003cec68>] lr : [<ffffffc00053dad8>] pstate: a0000045
[    1.140000] sp : ffffffc3f0067b20
[    1.140000] x29: ffffffc3f0067b20 x28: ffffffc000aa8ea0
[    1.140000] x27: ffffffc000bb2000 x26: ffffffc000a64270
[    1.140000] x25: ffffffc000b05ad8 x24: ffffffc0ff99ba58
[    1.140000] x23: 0000000000004000 x22: 0000000000004000
[    1.140000] x21: 0000000000000200 x20: 0000000000200000
[    1.140000] x19: ffffffc0ff99ba18 x18: ffffffc0007a6000
[    1.140000] x17: 0000000000000007 x16: 000000000000000e
[    1.140000] x15: 0000000000000001 x14: 0000000000000000
[    1.140000] x13: ffffffbeedb71320 x12: 00000000ffffff80
[    1.140000] x11: 0000000000000002 x10: 0000000000000000
[    1.140000] x9 : 0000000000000000 x8 : ffffffc3eb2a4000
[    1.140000] x7 : 0000000000000000 x6 : 0000000000000000
[    1.140000] x5 : 0000000001080000 x4 : 000000007d654010
[    1.140000] x3 : ffffffffffffffff x2 : 000000000003ffff
[    1.140000] x1 : ffffff800009a17c x0 : ffffff800009a17c

The issue was that the older firmware does not support 10GbE and
SGMII based 1GBE interfaces.

This patch changes the address length of the reg property of sgmii0 and xgmii
nodes and serves as preparatory patch for the fix.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Reported-by: Dann Frazier <dann.frazier@canonical.com>
---
 arch/arm64/boot/dts/apm-storm.dtsi | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/boot/dts/apm-storm.dtsi b/arch/arm64/boot/dts/apm-storm.dtsi
index 295c72d..f1ad9c2 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -599,7 +599,7 @@
 			compatible = "apm,xgene-enet";
 			status = "disabled";
 			reg = <0x0 0x17020000 0x0 0xd100>,
-			      <0x0 0X17030000 0x0 0X400>,
+			      <0x0 0X17030000 0x0 0Xc300>,
 			      <0x0 0X10000000 0x0 0X200>;
 			reg-names = "enet_csr", "ring_csr", "ring_cmd";
 			interrupts = <0x0 0x3c 0x4>;
@@ -624,9 +624,9 @@
 		sgenet0: ethernet@1f210000 {
 			compatible = "apm,xgene-enet";
 			status = "disabled";
-			reg = <0x0 0x1f210000 0x0 0x10000>,
-			      <0x0 0x1f200000 0x0 0X10000>,
-			      <0x0 0x1B000000 0x0 0X20000>;
+			reg = <0x0 0x1f210000 0x0 0xd100>,
+			      <0x0 0x1f200000 0x0 0Xc300>,
+			      <0x0 0x1B000000 0x0 0X200>;
 			reg-names = "enet_csr", "ring_csr", "ring_cmd";
 			interrupts = <0x0 0xA0 0x4>;
 			dma-coherent;
@@ -639,7 +639,7 @@
 			compatible = "apm,xgene-enet";
 			status = "disabled";
 			reg = <0x0 0x1f610000 0x0 0xd100>,
-			      <0x0 0x1f600000 0x0 0X400>,
+			      <0x0 0x1f600000 0x0 0Xc300>,
 			      <0x0 0x18000000 0x0 0X200>;
 			reg-names = "enet_csr", "ring_csr", "ring_cmd";
 			interrupts = <0x0 0x60 0x4>;
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 0/3] drivers: net: xgene: Fix crash for backward compatibility
From: Iyappan Subramanian @ 2014-11-03 19:59 UTC (permalink / raw)
  To: davem, netdev, devicetree
  Cc: linux-arm-kernel, patches, kchudgar, Iyappan Subramanian

This patch set fixes the following issues that were reported during regression.

Patch 1,2 : Adds backward compatibility with the older firmware (<= 1.13.28).
Patch 3   : Use separate hardware resources (descriptor ring, prefetch buffer)
	   that are not shared with the firmware
---

Iyappan Subramanian (3):
  dtb: xgene: fix: Backward compatibility with older firmware
  drivers: net: xgene: Backward compatibility with older firmware
  drivers: net: xgene: fix: Use separate resources

 arch/arm64/boot/dts/apm-storm.dtsi                | 10 +++++-----
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c    | 18 +++++++++++++++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h    |  4 ++++
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  | 11 +++++++----
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  5 ++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c |  7 ++++++-
 drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c |  7 ++++++-
 7 files changed, 49 insertions(+), 13 deletions(-)

-- 
1.9.1

^ permalink raw reply

* Re: macvtap: Fix csum_start when VLAN tags are present
From: David Miller @ 2014-11-03 19:52 UTC (permalink / raw)
  To: herbert; +Cc: netdev
In-Reply-To: <20141103060125.GA28295@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 14:01:25 +0800

> When VLAN is in use in macvtap_put_user, we end up setting
> csum_start to the wrong place.  The result is that the whoever
> ends up doing the checksum setting will corrupt the packet instead
> of writing the checksum to the expected location, usually this
> means writing the checksum with an offset of -4.
> 
> This patch fixes this by adjusting csum_start when VLAN tags are
> detected.
> 
> Fixes: f09e2249c4f5 ("macvtap: restore vlan header on user read")
> Cc: stable@vger.kernel.org
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.

^ permalink raw reply

* Re: [PATCH] net: fec: fix suspend broken on multiple MACs sillicons
From: David Miller @ 2014-11-03 19:50 UTC (permalink / raw)
  To: b38611; +Cc: netdev, festevam
In-Reply-To: <1414992410-27886-1-git-send-email-b38611@freescale.com>

From: Fugang Duan <b38611@freescale.com>
Date: Mon, 3 Nov 2014 13:26:50 +0800

> On i.MX6SX sdb platform, there has two same enet MACs, after system up,
> just eth0 is up, and then do suspend/resume test:
 ...
> The root cause is that eth1 is not opened and clock is not enabled, and .suspend() still
> call .fec_enet_clk_enable() to disable clock.
> 
> To avoid the broken, let it check network device up status by calling .netif_running()
> before disable/enable clocks.
> 
> Signed-off-by: Fugang Duan <B38611@freescale.com>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH v1 1/2] dtb: xgene: fix: Disable 10GbE and SGMII based 1GbE by default
From: Iyappan Subramanian @ 2014-11-03 19:45 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel@lists.infradead.org, David Miller, netdev,
	devicetree@vger.kernel.org, Keyur Chudgar, patches
In-Reply-To: <1991367.DGHysZpQVP@wuerfel>

Hi Arnd,

On Thu, Oct 30, 2014 at 3:13 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 29 October 2014 17:56:19 Iyappan Subramanian wrote:
>> @@ -621,7 +621,7 @@
>>                         };
>>                 };
>>
>> -               sgenet0: ethernet@1f210000 {
>> +               sgenet0: sgenet@1f210000 {
>>                         compatible = "apm,xgene-enet";
>>                         status = "disabled";
>>                         reg = <0x0 0x1f210000 0x0 0x10000>,
>>
>
> This looks like you accidentally reverted a bug fix made earlier.
> Network devices should always have the name 'ethernet@...'.

Thanks for the review.  Since our firmware was patching the dtb, based
on the node-name, we thought by changing node-name, we can avoid the
patching and maintain backward compatibility.

Now we know that network devices should have 'ethernet@...', we will
handle the backward compatibility in a different way and will post the
patch v2 shortly.

>
>         Arnd

^ permalink raw reply

* Re: [PATCH] ipv6: do xfrm transform after nat if necessary
From: David Miller @ 2014-11-03 19:42 UTC (permalink / raw)
  To: duanj.fnst; +Cc: netdev
In-Reply-To: <54570A2F.2070206@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Date: Mon, 3 Nov 2014 12:53:03 +0800

> 
> 
> In function nf_nat_ipv6_out, after nat is done, nf_xfrm_me_harder()
> will be called to look up xfrm dst.
> 
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>

This is far from sufficient of a commit log message for a change that
is as serious and has as many implications as this one.

You haven't answered many questions, first of which in my mind is
why we are bypassing all of the fragmentation checks?

We're also bypassing ip6_finish_output2() which does multicast and
hooks up the neighbour.

IPV4 doesn't do this, why doesn't it have the same supposed problem
you are trying to solve?

It is not even clear to me what the problem is, because your commit
message is way too terse.

^ permalink raw reply

* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Eric Dumazet @ 2014-11-03 19:41 UTC (permalink / raw)
  To: Tomasz Mloduchowski; +Cc: netdev
In-Reply-To: <5457D095.9070502@qdot.me>

On Mon, 2014-11-03 at 19:59 +0100, Tomasz Mloduchowski wrote:
> Hi List,
> 
> I hope this is the right place to report a networking issue with
> 3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
> rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
> quickly exhausts tcp_mem limits in a very strange sequence of events.
> 
> It might be scheduler or networking subsystem related.
> 
> It's 100% reproducible on my system, first observed under 3.18.0-rc2.
> 

Sounds a perfect case for a bisection maybe ?

^ permalink raw reply

* Re: [0/2] tun: Fix csum_start and TUN_PKT_STRIP
From: David Miller @ 2014-11-03 19:28 UTC (permalink / raw)
  To: herbert; +Cc: netdev
In-Reply-To: <20141102202929.GA24935@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 3 Nov 2014 04:29:30 +0800

> The first patch fixes a serious problem that breaks checksum offload
> in VMs while the second patch fixes a problem that probably affects
> no one.

Series applied, thanks Herbert.

^ permalink raw reply

* Re: [PATCH] net: eth: realtek: atp: checkpatch errors and warnings corrected
From: Roberto Medina @ 2014-11-03 19:09 UTC (permalink / raw)
  To: Joe Perches; +Cc: netdev, linux-kernel
In-Reply-To: <1415040686.17743.31.camel@perches.com>

On 11/03/2014 07:51 PM, Joe Perches wrote:
>
> Some ancient drivers could be regarded as neolithic
> curiosities that never need updating.  This may be one.
>
> But if you really want to change it, could you please
> make sure that objdiff shows no changes?
>

I see some changes with objdiff, maybe this is caused by the include 
file that I changed to linux/io.h instead of asm/io.h

--- 
/home/vov/Git/linux-next/.tmp_objdiff/a641d0e/drivers/net/ethernet/realtek/atp.dis 
2014-11-03 19:59:18.723954900 +0100
+++ 
/home/vov/Git/linux-next/.tmp_objdiff/5f19b70/drivers/net/ethernet/realtek/atp.dis 
2014-11-03 20:00:34.133954217 +0100
@@ -1753,9 +1753,8 @@
  	...
  :	e3 0a                	jrcxz  4c4 <__gcov_.atp_probe1+0x14>
  :	00 00                	add    %al,(%rax)
-:	e2 e2                	loop   4a0 <__gcov_.atp_init+0x20>
-:	9c                   	pushfq
-:	9a                   	(bad)
+:	4b b4 98             	rex.WXB mov $0x98,%r12b
+:	ac                   	lods   %ds:(%rsi),%al
  :	67 2f                	addr32 (bad)
  :	e4 e3                	in     $0xe3,%al
  :	00 00                	add    %al,(%rax)
@@ -1767,30 +1766,37 @@
  	...
  :	e4 0a                	in     $0xa,%al
  :	00 00                	add    %al,(%rax)
-:	a5                   	movsl  %ds:(%rsi),%es:(%rdi)
-:	21 4f ca             	and    %ecx,-0x36(%rdi)
-:	a0 82 0e 4f 00 00 00 	movabs 0x7000000004f0e82,%al
-:	00 07
+:	d6                   	(bad)
+:	ad                   	lods   %ds:(%rsi),%eax
+:	51                   	push   %rcx
+:	70 a0                	jo     491 <__gcov_.atp_init+0x11>
+:	82                   	(bad)
+:	0e                   	(bad)
+:	4f 00 00             	rex.WRXB add %r8b,(%r8)
+:	00 00                	add    %al,(%rax)
+:	07                   	(bad)
  	...

  0000000000000510 <__gcov_.eeprom_op>:
  	...
  :	e5 0a                	in     $0xa,%eax
  :	00 00                	add    %al,(%rax)
-:	cd c3                	int    $0xc3
-:	0f c4 5c 4b fc 84    	pinsrw $0x84,-0x4(%rbx,%rcx,2),%mm3
-:	00 00                	add    %al,(%rax)
+:	54                   	push   %rsp
+:	0e                   	(bad)
+:	dd ca                	(bad)
+:	5c                   	pop    %rsp
+:	4b fc                	rex.WXB cld
+:	84 00                	test   %al,(%rax)
  :	00 00                	add    %al,(%rax)
-:	07                   	(bad)
+:	00 07                	add    %al,(%rdi)
  	...

  0000000000000540 <__gcov_.net_open>:
  	...
  :	e6 0a                	out    %al,$0xa
  :	00 00                	add    %al,(%rax)
-:	52                   	push   %rdx
-:	e7 5d                	out    %eax,$0x5d
-:	8f                   	(bad)
+:	0f bf 0e             	movswl (%rsi),%ecx
+:	4f                   	rex.WRXB
  :	66 af                	scas   %es:(%rdi),%ax
  :	e2 a2                	loop   4f6 <__gcov_.get_node_ID+0x16>
  :	00 00                	add    %al,(%rax)
@@ -1802,15 +1808,16 @@
  	...
  :	e7 0a                	out    %eax,$0xa
  :	00 00                	add    %al,(%rax)
-:	a9 17 a9 64 dc       	test   $0xdc64a917,%eax
-:	20 8c 1d 00 00 00 00 	and    %cl,0x0(%rbp,%rbx,1)
+:	21 8a b7 70 dc 20    	and    %ecx,0x20dc70b7(%rdx)
+:	8c 1d 00 00 00 00    	mov    %ds,0x0(%rip)        # 588 
<__gcov_.hardware_init+0x18>
  :	0f 00 00             	sldt   (%rax)
  	...

  00000000000005a0 <__gcov_.trigger_send>:
  	...
-:	e8 0a 00 00 b9       	callq  ffffffffb90005b7 
<net_open+0xffffffffb8ffeeb8>
-:	d3 0c 12             	rorl   %cl,(%rdx,%rdx,1)
+:	e8 0a 00 00 38       	callq  380005b7 <net_open+0x37ffeeb8>
+:	2e                   	cs
+:	e7 bb                	out    %eax,$0xbb
  :	b9 43 f3 81 00       	mov    $0x81f343,%ecx
  :	00 00                	add    %al,(%rax)
  :	00 04 00             	add    %al,(%rax,%rax,1)
@@ -1818,10 +1825,10 @@

  00000000000005d0 <__gcov_.write_packet>:
  	...
-:	e9 0a 00 00 63       	jmpq   630005e7 <net_open+0x62ffeee8>
-:	e5 05                	in     $0x5,%eax
-:	96                   	xchg   %eax,%esi
-:	27                   	(bad)
+:	e9 0a 00 00 05       	jmpq   50005e7 <net_open+0x4ffeee8>
+:	54                   	push   %rsp
+:	5f                   	pop    %rdi
+:	23 27                	and    (%rdi),%esp
  :	0d e5 2c 00 00       	or     $0x2ce5,%eax
  :	00 00                	add    %al,(%rax)
  :	15 00 00 00 00       	adc    $0x0,%eax
@@ -1831,7 +1838,7 @@
  	...
  :	ea                   	(bad)
  :	0a 00                	or     (%rax),%al
-:	00 2d d0 35 4c 7a    	add    %ch,0x7a4c35d0(%rip)        # 7a4c3be1 
<net_open+0x7a4c24e2>
+:	00 8e bf 72 94 7a    	add    %cl,0x7a9472bf(%rsi)
  :	c4 61 e7 00          	(bad)
  :	00 00                	add    %al,(%rax)
  :	00 06                	add    %al,(%rsi)
@@ -1841,9 +1848,11 @@
  	...
  :	eb 0a                	jmp    644 <__gcov_.atp_send_packet+0x14>
  :	00 00                	add    %al,(%rax)
-:	78 ac                	js     5ea <__gcov_.write_packet+0x1a>
-:	9b                   	fwait
-:	4f 63 b6 a7 f3 00 00 	rex.WRXB movslq 0xf3a7(%r14),%r14
+:	4e ed                	rex.WRX in (%dx),%eax
+:	5b                   	pop    %rbx
+:	dd 63 b6             	frstor -0x4a(%rbx)
+:	a7                   	cmpsl  %es:(%rdi),%ds:(%rsi)
+:	f3 00 00             	repz add %al,(%rax)
  :	00 00                	add    %al,(%rax)
  :	0c 00                	or     $0x0,%al
  	...
@@ -1852,9 +1861,10 @@
  	...
  :	ec                   	in     (%dx),%al
  :	0a 00                	or     (%rax),%al
-:	00 66 37             	add    %ah,0x37(%rsi)
-:	2f                   	(bad)
-:	d0 3e                	sarb   (%rsi)
+:	00 f9                	add    %bh,%cl
+:	64 4d                	fs rex.WRB
+:	44                   	rex.R
+:	3e                   	ds
  :	16                   	(bad)
  :	72 d7                	jb     64b <__gcov_.atp_send_packet+0x1b>
  :	00 00                	add    %al,(%rax)
@@ -1866,8 +1876,9 @@
  	...
  :	ed                   	in     (%dx),%eax
  :	0a 00                	or     (%rax),%al
-:	00 50 2e             	add    %dl,0x2e(%rax)
-:	24 2a                	and    $0x2a,%al
+:	00 5a 73             	add    %bl,0x73(%rdx)
+:	56                   	push   %rsi
+:	cf                   	iret
  :	b0 28                	mov    $0x28,%al
  :	9e                   	sahf
  :	0c 00                	or     $0x0,%al
@@ -1879,9 +1890,9 @@
  	...
  :	ee                   	out    %al,(%dx)
  :	0a 00                	or     (%rax),%al
-:	00 60 1b             	add    %ah,0x1b(%rax)
-:	f5                   	cmc
-:	b9 ca 01 fa fa       	mov    $0xfafa01ca,%ecx
+:	00 85 af 2d b0 ca    	add    %al,-0x354fd251(%rbp)
+:	01 fa                	add    %edi,%edx
+:	fa                   	cli
  :	00 00                	add    %al,(%rax)
  :	00 00                	add    %al,(%rax)
  :	17                   	(bad)
@@ -1891,7 +1902,8 @@
  	...
  :	ef                   	out    %eax,(%dx)
  :	0a 00                	or     (%rax),%al
-:	00 b3 fb 8e ed 58    	add    %dh,0x58ed8efb(%rbx)
+:	00 e8                	add    %ch,%al
+:	09 7c 81 58          	or     %edi,0x58(%rcx,%rax,4)
  :	32 13                	xor    (%rbx),%dl
  :	9d                   	popfq
  :	00 00                	add    %al,(%rax)
@@ -1902,7 +1914,8 @@
  0000000000000720 <__gcov_.net_close>:
  	...
  :	f0 0a 00             	lock or (%rax),%al
-:	00 ae b2 85 8a 2a    	add    %ch,0x2a8a85b2(%rsi)
+:	00 54 b3 8a          	add    %dl,-0x76(%rbx,%rsi,4)
+:	3b 2a                	cmp    (%rdx),%ebp
  :	24 53                	and    $0x53,%al
  :	13 00                	adc    (%rax),%eax
  :	00 00                	add    %al,(%rax)
@@ -1913,11 +1926,7 @@
  	...
  :	f1                   	icebp
  :	0a 00                	or     (%rax),%al
-:	00 20                	add    %ah,(%rax)
-:	f5                   	cmc
-:	d4                   	(bad)
-:	c7                   	(bad)
-:	96                   	xchg   %eax,%esi
+:	00 b5 7a 16 07 96    	add    %dh,-0x69f8e986(%rbp)
  :	df 25 67 00 00 00    	fbld   0x67(%rip)        # 7ce 
<__gcov_.atp_init_module+0x4e>
  :	00 04 00             	add    %al,(%rax,%rax,1)
  	...
@@ -1925,10 +1934,7 @@
  0000000000000780 <__gcov_.atp_init_module>:
  	...
  :	f2 0a 00             	repnz or (%rax),%al
-:	00 3f                	add    %bh,(%rdi)
-:	2e                   	cs
-:	04 c9                	add    $0xc9,%al
-:	9e                   	sahf
+:	00 a6 4d 6b ed 9e    	add    %ah,-0x611294b3(%rsi)
  :	66                   	data16
  :	b2 b6                	mov    $0xb6,%dl
  :	00 00                	add    %al,(%rax)
@@ -1940,7 +1946,8 @@
  :	30 34 00             	xor    %dh,(%rax,%rax,1)
  	...
  :	00 00                	add    %al,(%rax)
-:	00 6c 48 04          	add    %ch,0x4(%rax,%rcx,2)
+:	00 2c c8             	add    %ch,(%rax,%rcx,8)
+:	06                   	(bad)
  :	77 00                	ja     7d5 <__gcov_.atp_init_module+0x55>
  	...
  :	00 00                	add    %al,(%rax)
@@ -1950,9 +1957,8 @@
  0000000000000840 <__gcov_.atp_cleanup_module>:
  	...
  :	f3 0a 00             	repz or (%rax),%al
-:	00 20                	add    %ah,(%rax)
-:	d2 18                	rcrb   %cl,(%rax)
-:	bd 5e f3 45 61       	mov    $0x6145f35e,%ebp
+:	00 bb a8 99 b6 5e    	add    %bh,0x5eb699a8(%rbx)
+:	f3 45 61             	repz rex.RB (bad)
  :	00 00                	add    %al,(%rax)
  :	00 00                	add    %al,(%rax)
  :	04 00                	add    $0x0,%al
@@ -2076,13 +2082,13 @@
  0000000000000c50 <__gcov0.atp_cleanup_module>:
  	...

-0000000000000c70 <num_tx_since_rx.44203>:
+0000000000000c70 <num_tx_since_rx.44241>:
  	...

-0000000000000c78 <__key.44146>:
+0000000000000c78 <__key.44184>:
  	...

-0000000000000c80 <__key.44121>:
+0000000000000c80 <__key.44159>:
  	...

  0000000000000c88 <root_atp_dev>:

^ permalink raw reply

* Re: [PATCH 0/1] mv643xx_eth: Disable TSO by default
From: Eric Dumazet @ 2014-11-03 19:04 UTC (permalink / raw)
  To: David Laight
  Cc: Ezequiel Garcia, netdev@vger.kernel.org, David Miller,
	Thomas Petazzoni, Gregory Clement, Tawfik Bayouk, Lior Amsalem,
	Nadav Haklai
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D1C9E44F5@AcuExch.aculab.com>

On Mon, 2014-11-03 at 14:51 +0000, David Laight wrote:
> From: Eric Dumazet
> > On Sat, 2014-11-01 at 12:30 -0300, Ezequiel Garcia wrote:
> > > Several users ([1], [2]) have been reporting data corruption with TSO on
> > > Kirkwood platforms (i.e. using the mv643xx_eth driver).
> > >
> > > Until we manage to find what's causing this, this simple patch will make
> > > the TSO path disabled by default. This patch should be queued for stable,
> > > fixing the TSO feature introduced in v3.16.
> > >
> > > The corruption itself is very easy to reproduce: checking md5sum on a mounted
> > > NFS directory gives a different result each time. Same tests using the mvneta
> > > driver (Armada 370/38x/XP SoC) pass with no issues.
> > >
> > > Frankly, I'm a bit puzzled about this, and so any ideas or debugging hints
> > > are well received.
> > 
> > lack of barriers maybe ?
> > 
> > It seems you might need to populate all TX descriptors but delay the
> > first, like doing the populate in descending order.
> > 
> > If you take a look at txq_submit_skb(), you'll see the final
> > desc->cmd_sts = cmd_sts (line 959) is done _after_ frags were cooked by
> > txq_submit_frag_skb()
> > 
> > You should kick the nick only when all TX descriptors are ready and
> > committed to memory.
> 
> Don't forget that the nick might process the first descriptor without
> being given a 'kick' - it will read it when it finishes processing the
> previous frame.
> This also means that you have to be careful about the order of the writes
> to the first descriptor.

This is what I implied and implemented in the patch ;)

^ permalink raw reply

* TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Tomasz Mloduchowski @ 2014-11-03 18:59 UTC (permalink / raw)
  To: netdev

Hi List,

I hope this is the right place to report a networking issue with
3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
quickly exhausts tcp_mem limits in a very strange sequence of events.

It might be scheduler or networking subsystem related.

It's 100% reproducible on my system, first observed under 3.18.0-rc2.

Neither terminating the offending application, nor removing the network
card seems to bring the 'mem' item in /proc/net/sockstat down from it's
extreme values.

http://static.qdot.me/tcp_mem_issue.png contains the plot of the 'mem'
and 'sockets' fields from sockstat - violet is the 'mem', quickly
exhausting the default 512k pages limit after a short period of reliable
operation.


Best Regards,
Tomasz

-- snip --

sched: RT throttling activated
kworker/dying (792) used greatest stack depth: 11984 bytes left
TCP: out of memory -- consider tuning tcp_mem
TCP: out of memory -- consider tuning tcp_mem
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/core/stream.c:201
sk_stream_kill_queues+0x12c/0x130()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Not tainted 3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
 0000000000000009 ffff8800cae2fd88 ffffffff81b6c0ff 0000000000000000
 0000000000000000 ffff8800cae2fdc8 ffffffff810c235c ffff8800cae2fda8
 ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
 [<ffffffff81b6c0ff>] dump_stack+0x46/0x58
 [<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
 [<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
 [<ffffffff81954fbc>] sk_stream_kill_queues+0x12c/0x130
 [<ffffffff819b8b45>] inet_csk_destroy_sock+0x55/0x140
 [<ffffffff819bd9ee>] tcp_close+0x22e/0x430
 [<ffffffff819e3b82>] inet_release+0x72/0x80
 [<ffffffff819455fa>] sock_release+0x1a/0x90
 [<ffffffff8194567d>] sock_close+0xd/0x20
 [<ffffffff811dfdf6>] __fput+0xc6/0x1d0
 [<ffffffff811dff49>] ____fput+0x9/0x10
 [<ffffffff810db94f>] task_work_run+0x8f/0xd0
 [<ffffffff81046b12>] do_notify_resume+0x82/0xa0
 [<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2571 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/ipv4/af_inet.c:153
inet_sock_destruct+0x1d9/0x1e0()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Tainted: G        W      3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
 0000000000000009 ffff8800cae2fd68 ffffffff81b6c0ff 0000000000000000
 0000000000000000 ffff8800cae2fda8 ffffffff810c235c ffff8800cae2fd88
 ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
 [<ffffffff81b6c0ff>] dump_stack+0x46/0x58
 [<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
 [<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
 [<ffffffff819e5179>] inet_sock_destruct+0x1d9/0x1e0
 [<ffffffff81949e7e>] __sk_free+0x1e/0x100
 [<ffffffff81949f79>] sk_free+0x19/0x20
 [<ffffffff819bd918>] tcp_close+0x158/0x430
 [<ffffffff819e3b82>] inet_release+0x72/0x80
 [<ffffffff819455fa>] sock_release+0x1a/0x90
 [<ffffffff8194567d>] sock_close+0xd/0x20
 [<ffffffff811dfdf6>] __fput+0xc6/0x1d0
 [<ffffffff811dff49>] ____fput+0x9/0x10
 [<ffffffff810db94f>] task_work_run+0x8f/0xd0
 [<ffffffff81046b12>] do_notify_resume+0x82/0xa0
 [<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2572 ]---
kworker/dying (679) used greatest stack depth: 11784 bytes left
TCP: out of memory -- consider tuning tcp_mem

-- snip --

^ permalink raw reply

* Re: DMA allocations from CMA and fatal_signal_pending check
From: Florian Fainelli @ 2014-11-03 18:51 UTC (permalink / raw)
  To: Michal Nazarewicz, Joonsoo Kim
  Cc: linux-arm-kernel, Brian Norris, Gregory Fong, linux-kernel,
	linux-mm, lauraa, gioh.kim, aneesh.kumar, m.szyprowski, akpm,
	netdev@vger.kernel.org
In-Reply-To: <xa1tlhnsw7v8.fsf@mina86.com>

On 11/03/2014 08:45 AM, Michal Nazarewicz wrote:
> On Fri, Oct 31 2014, Florian Fainelli wrote:
>> I agree that the CMA allocation should not be allowed to succeed, but
>> the dma_alloc_coherent() allocation should succeed. If we look at the
>> sysport driver, there are kmalloc() calls to initialize private
>> structures, those will succeed (except under high memory pressure), so
>> by the same token, a driver expects DMA allocations to succeed (unless
>> we are under high memory pressure)
>>
>> What are we trying to solve exactly with the fatal_signal_pending()
>> check here? Are we just optimizing for the case where a process has
>> allocated from a CMA region to allow this region to be returned to the
>> pool of free pages when it gets killed? Could there be another mechanism
>> used to reclaim those pages if we know the process is getting killed
>> anyway?
> 
> We're guarding against situations where process may hang around
> arbitrarily long time after receiving SIGKILL.  If user does “kill -9
> $pid” the usual expectation is that the $pid process will die within
> seconds and anything longer is perceived by user as a bug.
> 
> What problem are *you* trying to solve?  If user sent SIGKILL to
> a process that imitated device initialisation, what is the point of
> continuing initialising the device?  Just recover and return -EINTR.

I have two problems with the current approach:

- behavior of a dma_alloc_coherent() call is not consistent between a
CONFIG_CMA=y vs. CONFIG_CMA=n build, which is probably fine as long as
we document that properly

- there is currently no way for a caller of dma_alloc_coherent to tell
whether the allocation failed because it was interrupted by a signal, a
genuine OOM or something else, this is largely made worse by problem 1

> 
>> Well, not really. This driver is not an isolated case, there are tons of
>> other networking drivers that do exactly the same thing, and we do
>> expect these dma_alloc_* calls to succeed.
> 
> Again, why do you expect them to succeed?  The code must handle failures
> correctly anyway so why do you wish to ignore fatal signal?

I guess expecting them to succeed is probably not good, but at we should
at least be able to report an accurate error code to the caller and down
to user-space.

Thanks
--
Florian

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH] net: eth: realtek: atp: checkpatch errors and warnings corrected
From: Joe Perches @ 2014-11-03 18:51 UTC (permalink / raw)
  To: Roberto Medina; +Cc: netdev, linux-kernel
In-Reply-To: <1415039073-20164-1-git-send-email-robertoxmed@gmail.com>

On Mon, 2014-11-03 at 19:24 +0100, Roberto Medina wrote:
> From: Roberto Medina <robertoxmed@gmail.com>
> 
> Several warnings and errors corrected using checkpatch.
> Some warnings like "line over 80" are still present.
> 
> Before the patch:
> 	total: 16 errors, 155 warnings, 38 checks, 883 lines checked
> 
> With the patch:
> 	total: 0 errors, 64 warnings, 24 checks, 905 lines checked
> 	
> Compile tested.

Some ancient drivers could be regarded as neolithic
curiosities that never need updating.  This may be one.

But if you really want to change it, could you please
make sure that objdiff shows no changes?

^ permalink raw reply

* Re: [PATCH net-next 3/7] gue: Add infrastructure for flags and options
From: Tom Herbert @ 2014-11-03 18:39 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Netdev List
In-Reply-To: <20141103.121856.1307429117331581815.davem@davemloft.net>

On Mon, Nov 3, 2014 at 9:18 AM, David Miller <davem@davemloft.net> wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Sat,  1 Nov 2014 15:57:59 -0700
>
>> @@ -20,7 +20,16 @@ static size_t fou_encap_hlen(struct ip_tunnel_encap *e)
>>
>>  static size_t gue_encap_hlen(struct ip_tunnel_encap *e)
>>  {
>> -     return sizeof(struct udphdr) + sizeof(struct guehdr);
>> +     size_t len;
>> +     bool need_priv = false;
>> +
>> +     len = sizeof(struct udphdr) + sizeof(struct guehdr);
>> +
>> +     /* Add in lengths flags */
>> +
>> +     len += need_priv ? GUE_LEN_PRIV : 0;
>
> Add this need_priv logic in patch #6, not here.

I would rather keep it in this patch. This is adding the common
infrastructure to support private option field, remote checksum
offload is an instance that uses that.

Tom

^ permalink raw reply

* Re: [PATCH 1/7] can: m_can: fix possible sleep in napi poll
From: Oliver Hartkopp @ 2014-11-03 18:37 UTC (permalink / raw)
  To: Marc Kleine-Budde, Dong Aisheng, linux-can
  Cc: wg, varkabhadram, netdev, linux-arm-kernel
In-Reply-To: <5457B517.8030805@pengutronix.de>

Hello Marc,

you can omit the 'stable' tag as this driver just emerged in 3.18 - which is
currently in 3.18-rc3 state.

If possible I would suggest to push all 7 patches via can/master as it is
pretty bad to say "M_CAN was introduced in 3.18 but M_CAN with CAN FD is
working since 3.19".

This will produce confusion as the M_CAN core has built-in CAN FD support.

Regards,
Oliver

On 11/03/2014 06:02 PM, Marc Kleine-Budde wrote:
> Hey Dong Aisheng,
> 
> the state of the patches are:
> 
> 1: can/master + stable
> 2: can/mster
> 3: can/mster
> 4: can/master + stable
> 5: waiting for Oliver's opinion due to the user space change
> 6: waiting for your input
> 7: waiting for your input
> 
> Marc
> 

^ permalink raw reply

* Re: [PATCH] VNIC: Adding support for Cavium ThunderX network controller
From: Robert Richter @ 2014-11-03 18:33 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Sunil Kovvuri, Robert Richter, David S. Miller, Sunil Goutham,
	Stefan Assmann, LKML, LAKML, netdev
In-Reply-To: <20141103101651.1d664cc3@urahara>

On 03.11.14 10:16:51, Stephen Hemminger wrote:
> On Fri, 31 Oct 2014 22:44:11 +0530
> Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> 
> > On Fri, Oct 31, 2014 at 8:24 AM, Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > > On Thu, 30 Oct 2014 17:54:34 +0100
> > > Robert Richter <rric@kernel.org> wrote:
> > >
> > >> +#ifdef       VNIC_RSS_SUPPORT
> > >> +static int rss_config = RSS_IP_HASH_ENA | RSS_TCP_HASH_ENA | RSS_UDP_HASH_ENA;
> > >> +module_param(rss_config, int, S_IRUGO);
> > >> +MODULE_PARM_DESC(rss_config,
> > >> +              "RSS hash config [bits 8:0] (Bit0:L2 extended, 1:IP, 2:TCP, 3:TCP SYN, 4:UDP, 5:L4 extended, 6:ROCE 7:L3 bi-directional, 8:L4 bi-directional)");
> > >> +#endif
> > >
> > > This should managed  be via ethtool ETHTOOL_GRXFH rather than a module parameter.
> > Thanks, i will add setting hash options via ETHTOOL_SRXFH as well.
> > The idea here is to have a choice of hash while module load (through
> > module params) and if it needs to be changed runtime then
> > via Ethtool.
> > 
> > Sunil.
> 
> Network developers do not like vendor unique module parameters.
> Anything device specific doesn't work in a generic distro environment.

Do you accept unique module parameters in parallel to ethtool support
or should this be removed?

-Robert

^ permalink raw reply

* Re: [PATCH 5/7] can: clear ctrlmode when close candev
From: Oliver Hartkopp @ 2014-11-03 18:25 UTC (permalink / raw)
  To: Marc Kleine-Budde, Dong Aisheng, Wolfgang Grandegger
  Cc: linux-can, wg, varkabhadram, netdev, linux-arm-kernel
In-Reply-To: <5457AD40.7020606@pengutronix.de>



On 11/03/2014 05:28 PM, Marc Kleine-Budde wrote:
> On 10/29/2014 11:45 AM, Dong Aisheng wrote:
>> Currently priv->ctrlmode is not cleared when close_candev, so next time
>> the driver will still use this value to set controller even user
>> does not set any ctrl mode.
>> e.g.
>> Step 1. ip link set can0 up type can0 bitrate 1000000 loopback on
>> Controller will be in loopback mode
>> Step 2. ip link set can0 down
>> Step 3. ip link set can0 up type can0 bitrate 1000000
>> Controller will still be set to loopback mode in driver due to saved
>> priv->ctrlmode.
>>
>> This patch clears priv->ctrlmode when the CAN interface is closed,
>> and set it to correct mode according to next user setting.
> 
> Oliver, what do you think of this patch? It will introduce a subtle
> change to the userspace.

Hi Marc,

indeed the current policy is that ifdown/ifup doesn't change any setting.
When we start to clear priv->ctrlmode when setting the interface down the
question arises whether the bitrate(s) has/have to be wiped too.

Setting the interface down/up can be done with ip or ifconfig without changing
the controller settings.

So IMO we can not assume that people always use the 'all-in-one'

	ip link set can0 up type can0 bitrate 1000000

command as given in the example above.

You might also do it like this:

	ip link set can0 up type can0 bitrate 1000000 loopback on
	ip link set can0 down
	ip link set can0 type can0 loopback off
	ip link set can0 up

Finally I would vote to reject this patch to stay consistent.

When users like to fiddle with options like loopback they need to take care
about their 'expert' settings too.

Regards,
Oliver


> 
>> Signed-off-by: Dong Aisheng <b29396@freescale.com>
>> ---
>>  drivers/net/can/dev.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
>> index 02492d2..1fce485 100644
>> --- a/drivers/net/can/dev.c
>> +++ b/drivers/net/can/dev.c
>> @@ -671,6 +671,7 @@ void close_candev(struct net_device *dev)
>>  
>>  	del_timer_sync(&priv->restart_timer);
>>  	can_flush_echo_skb(dev);
>> +	priv->ctrlmode = 0;
>>  }
>>  EXPORT_SYMBOL_GPL(close_candev);
>>  
>>
> 
> 

^ permalink raw reply

* [PATCH] net: eth: realtek: atp: checkpatch errors and warnings corrected
From: Roberto Medina @ 2014-11-03 18:24 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, Roberto Medina

From: Roberto Medina <robertoxmed@gmail.com>

Several warnings and errors corrected using checkpatch.
Some warnings like "line over 80" are still present.

Before the patch:
	total: 16 errors, 155 warnings, 38 checks, 883 lines checked

With the patch:
	total: 0 errors, 64 warnings, 24 checks, 905 lines checked
	
Compile tested.

Signed-off-by: Roberto Medina <robertoxmed@gmail.com>
---
 drivers/net/ethernet/realtek/atp.c | 470 +++++++++++++++++++------------------
 1 file changed, 246 insertions(+), 224 deletions(-)

diff --git a/drivers/net/ethernet/realtek/atp.c b/drivers/net/ethernet/realtek/atp.c
index d77d60e..8291260 100644
--- a/drivers/net/ethernet/realtek/atp.c
+++ b/drivers/net/ethernet/realtek/atp.c
@@ -1,43 +1,42 @@
 /* atp.c: Attached (pocket) ethernet adapter driver for linux. */
-/*
-	This is a driver for commonly OEM pocket (parallel port)
-	ethernet adapters based on the Realtek RTL8002 and RTL8012 chips.
-
-	Written 1993-2000 by Donald Becker.
-
-	This software may be used and distributed according to the terms of
-	the GNU General Public License (GPL), incorporated herein by reference.
-	Drivers based on or derived from this code fall under the GPL and must
-	retain the authorship, copyright and license notice.  This file is not
-	a complete program and may only be used when the entire operating
-	system is licensed under the GPL.
-
-	Copyright 1993 United States Government as represented by the Director,
-	National Security Agency.  Copyright 1994-2000 retained by the original
-	author, Donald Becker. The timer-based reset code was supplied in 1995
-	by Bill Carlson, wwc@super.org.
-
-	The author may be reached as becker@scyld.com, or C/O
-	Scyld Computing Corporation
-	410 Severn Ave., Suite 210
-	Annapolis MD 21403
-
-	Support information and updates available at
-	http://www.scyld.com/network/atp.html
-
-
-	Modular support/softnet added by Alan Cox.
-	_bit abuse fixed up by Alan Cox
-
-*/
+/*	This is a driver for commonly OEM pocket (parallel port)
+ *	ethernet adapters based on the Realtek RTL8002 and RTL8012 chips.
+ *
+ *	Written 1993-2000 by Donald Becker.
+ *
+ *	This software may be used and distributed according to the terms of
+ *	the GNU General Public License (GPL), incorporated herein by reference.
+ *	Drivers based on or derived from this code fall under the GPL and must
+ *	retain the authorship, copyright and license notice.  This file is not
+ *	a complete program and may only be used when the entire operating
+ *	system is licensed under the GPL.
+ *
+ *	Copyright 1993 United States Government as represented by the Director,
+ *	National Security Agency.  Copyright 1994-2000 retained by the original
+ *	author, Donald Becker. The timer-based reset code was supplied in 1995
+ *	by Bill Carlson, wwc@super.org.
+ *
+ *	The author may be reached as becker@scyld.com, or C/O
+ *	Scyld Computing Corporation
+ *	410 Severn Ave., Suite 210
+ *	Annapolis MD 21403
+ *
+ *	Support information and updates available at
+ *	http://www.scyld.com/network/atp.html
+ *
+ *	Modular support/softnet added by Alan Cox.
+ *	_bit abuse fixed up by Alan Cox
+ *
+ */
 
 static const char version[] =
 "atp.c:v1.09=ac 2002/10/01 Donald Becker <becker@scyld.com>\n";
 
 /* The user-configurable values.
-   These may be modified when a driver module is loaded.*/
+ * These may be modified when a driver module is loaded.
+ */
 
-static int debug = 1; 			/* 1 normal messages, 0 quiet .. 7 verbose. */
+static int debug = 1;	/* 1 normal messages, 0 quiet .. 7 verbose. */
 #define net_debug debug
 
 /* Maximum events (Rx packets, etc.) to handle at each interrupt. */
@@ -47,80 +46,79 @@ static int max_interrupt_work = 15;
 /* The standard set of ISA module parameters. */
 static int io[NUM_UNITS];
 static int irq[NUM_UNITS];
-static int xcvr[NUM_UNITS]; 			/* The data transfer mode. */
+static int xcvr[NUM_UNITS];		/* The data transfer mode. */
 
 /* Operational parameters that are set at compile time. */
 
 /* Time in jiffies before concluding the transmitter is hung. */
 #define TX_TIMEOUT  (400*HZ/1000)
 
-/*
-	This file is a device driver for the RealTek (aka AT-Lan-Tec) pocket
-	ethernet adapter.  This is a common low-cost OEM pocket ethernet
-	adapter, sold under many names.
-
-  Sources:
-	This driver was written from the packet driver assembly code provided by
-	Vincent Bono of AT-Lan-Tec.	 Ever try to figure out how a complicated
-	device works just from the assembly code?  It ain't pretty.  The following
-	description is written based on guesses and writing lots of special-purpose
-	code to test my theorized operation.
-
-	In 1997 Realtek made available the documentation for the second generation
-	RTL8012 chip, which has lead to several driver improvements.
-	  http://www.realtek.com.tw/
-
-					Theory of Operation
-
-	The RTL8002 adapter seems to be built around a custom spin of the SEEQ
-	controller core.  It probably has a 16K or 64K internal packet buffer, of
-	which the first 4K is devoted to transmit and the rest to receive.
-	The controller maintains the queue of received packet and the packet buffer
-	access pointer internally, with only 'reset to beginning' and 'skip to next
-	packet' commands visible.  The transmit packet queue holds two (or more?)
-	packets: both 'retransmit this packet' (due to collision) and 'transmit next
-	packet' commands must be started by hand.
-
-	The station address is stored in a standard bit-serial EEPROM which must be
-	read (ughh) by the device driver.  (Provisions have been made for
-	substituting a 74S288 PROM, but I haven't gotten reports of any models
-	using it.)  Unlike built-in devices, a pocket adapter can temporarily lose
-	power without indication to the device driver.  The major effect is that
-	the station address, receive filter (promiscuous, etc.) and transceiver
-	must be reset.
-
-	The controller itself has 16 registers, some of which use only the lower
-	bits.  The registers are read and written 4 bits at a time.  The four bit
-	register address is presented on the data lines along with a few additional
-	timing and control bits.  The data is then read from status port or written
-	to the data port.
-
-	Correction: the controller has two banks of 16 registers.  The second
-	bank contains only the multicast filter table (now used) and the EEPROM
-	access registers.
-
-	Since the bulk data transfer of the actual packets through the slow
-	parallel port dominates the driver's running time, four distinct data
-	(non-register) transfer modes are provided by the adapter, two in each
-	direction.  In the first mode timing for the nibble transfers is
-	provided through the data port.  In the second mode the same timing is
-	provided through the control port.  In either case the data is read from
-	the status port and written to the data port, just as it is accessing
-	registers.
-
-	In addition to the basic data transfer methods, several more are modes are
-	created by adding some delay by doing multiple reads of the data to allow
-	it to stabilize.  This delay seems to be needed on most machines.
-
-	The data transfer mode is stored in the 'dev->if_port' field.  Its default
-	value is '4'.  It may be overridden at boot-time using the third parameter
-	to the "ether=..." initialization.
-
-	The header file <atp.h> provides inline functions that encapsulate the
-	register and data access methods.  These functions are hand-tuned to
-	generate reasonable object code.  This header file also documents my
-	interpretations of the device registers.
-*/
+/*	This file is a device driver for the RealTek (aka AT-Lan-Tec) pocket
+ *	ethernet adapter.  This is a common low-cost OEM pocket ethernet
+ *	adapter, sold under many names.
+ *
+ * Sources:
+ *	This driver was written from the packet driver assembly code provided by
+ *	Vincent Bono of AT-Lan-Tec.	 Ever try to figure out how a complicated
+ *	device works just from the assembly code?  It ain't pretty.  The following
+ *	description is written based on guesses and writing lots of special-purpose
+ *	code to test my theorized operation.
+ *
+ *	In 1997 Realtek made available the documentation for the second generation
+ *	RTL8012 chip, which has lead to several driver improvements.
+ *	  http://www.realtek.com.tw/
+ *
+ *				Theory of Operation
+ *
+ *	The RTL8002 adapter seems to be built around a custom spin of the SEEQ
+ *	controller core.  It probably has a 16K or 64K internal packet buffer, of
+ *	which the first 4K is devoted to transmit and the rest to receive.
+ *	The controller maintains the queue of received packet and the packet buffer
+ *	access pointer internally, with only 'reset to beginning' and 'skip to next
+ *	packet' commands visible.  The transmit packet queue holds two (or more?)
+ *	packets: both 'retransmit this packet' (due to collision) and 'transmit next
+ *	packet' commands must be started by hand.
+ *
+ *	The station address is stored in a standard bit-serial EEPROM which must be
+ *	read (ughh) by the device driver.  (Provisions have been made for
+ *	substituting a 74S288 PROM, but I haven't gotten reports of any models
+ *	using it.)  Unlike built-in devices, a pocket adapter can temporarily lose
+ *	power without indication to the device driver.  The major effect is that
+ *	the station address, receive filter (promiscuous, etc.) and transceiver
+ *	must be reset.
+ *
+ *	The controller itself has 16 registers, some of which use only the lower
+ *	bits.  The registers are read and written 4 bits at a time.  The four bit
+ *	register address is presented on the data lines along with a few additional
+ *	timing and control bits.  The data is then read from status port or written
+ *	to the data port.
+ *
+ *	Correction: the controller has two banks of 16 registers.  The second
+ *	bank contains only the multicast filter table (now used) and the EEPROM
+ *	access registers.
+ *
+ *	Since the bulk data transfer of the actual packets through the slow
+ *	parallel port dominates the driver's running time, four distinct data
+ *	(non-register) transfer modes are provided by the adapter, two in each
+ *	direction.  In the first mode timing for the nibble transfers is
+ *	provided through the data port.  In the second mode the same timing is
+ *	provided through the control port.  In either case the data is read from
+ *	the status port and written to the data port, just as it is accessing
+ *	registers.
+ *
+ *	In addition to the basic data transfer methods, several more are modes are
+ *	created by adding some delay by doing multiple reads of the data to allow
+ *	it to stabilize.  This delay seems to be needed on most machines.
+ *
+ *	The data transfer mode is stored in the 'dev->if_port' field.  Its default
+ *	value is '4'.  It may be overridden at boot-time using the third parameter
+ *	to the "ether=..." initialization.
+ *
+ *	The header file <atp.h> provides inline functions that encapsulate the
+ *	register and data access methods.  These functions are hand-tuned to
+ *	generate reasonable object code.  This header file also documents my
+ *	interpretations of the device registers.
+ */
 
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -140,7 +138,7 @@ static int xcvr[NUM_UNITS]; 			/* The data transfer mode. */
 #include <linux/delay.h>
 #include <linux/bitops.h>
 
-#include <asm/io.h>
+#include <linux/io.h>
 #include <asm/dma.h>
 
 #include "atp.h"
@@ -167,20 +165,21 @@ MODULE_PARM_DESC(xcvr, "ATP transceiver(s) (0=internal, 1=external)");
 static char mux_8012[] = { 0xff, 0xf7, 0xff, 0xfb, 0xf3, 0xfb, 0xff, 0xf7,};
 
 struct net_local {
-    spinlock_t lock;
-    struct net_device *next_module;
-    struct timer_list timer;	/* Media selection timer. */
-    long last_rx_time;		/* Last Rx, in jiffies, to handle Rx hang. */
-    int saved_tx_size;
-    unsigned int tx_unit_busy:1;
-    unsigned char re_tx,	/* Number of packet retransmissions. */
-		addr_mode,		/* Current Rx filter e.g. promiscuous, etc. */
+	spinlock_t lock;
+	struct net_device *next_module;
+	struct timer_list timer;	/* Media selection timer. */
+	long last_rx_time;	/* Last Rx, in jiffies, to handle Rx hang. */
+	int saved_tx_size;
+	unsigned int tx_unit_busy:1;
+	unsigned char re_tx,	/* Number of packet retransmissions. */
+		addr_mode,	/* Current Rx filter e.g. promiscuous, etc. */
 		pac_cnt_in_tx_buf;
 };
 
 /* This code, written by wwc@super.org, resets the adapter every
-   TIMED_CHECKER ticks.  This recovers from an unknown error which
-   hangs the device. */
+ * TIMED_CHECKER ticks.  This recovers from an unknown error which
+ * hangs the device.
+ */
 #define TIMED_CHECKER (HZ/4)
 #ifdef TIMED_CHECKER
 #include <linux/timer.h>
@@ -194,41 +193,43 @@ static void get_node_ID(struct net_device *dev);
 static unsigned short eeprom_op(long ioaddr, unsigned int cmd);
 static int net_open(struct net_device *dev);
 static void hardware_init(struct net_device *dev);
-static void write_packet(long ioaddr, int length, unsigned char *packet, int pad, int mode);
+static void write_packet(long ioaddr, int length, unsigned char *packet,
+			 int pad, int mode);
 static void trigger_send(long ioaddr, int length);
 static netdev_tx_t atp_send_packet(struct sk_buff *skb,
 				   struct net_device *dev);
 static irqreturn_t atp_interrupt(int irq, void *dev_id);
 static void net_rx(struct net_device *dev);
-static void read_block(long ioaddr, int length, unsigned char *buffer, int data_mode);
+static void read_block(long ioaddr, int length, unsigned char *buffer,
+		       int data_mode);
 static int net_close(struct net_device *dev);
 static void set_rx_mode(struct net_device *dev);
 static void tx_timeout(struct net_device *dev);
 
-
 /* A list of all installed ATP devices, for removing the driver module. */
 static struct net_device *root_atp_dev;
 
 /* Check for a network adapter of this type, and return '0' iff one exists.
-   If dev->base_addr == 0, probe all likely locations.
-   If dev->base_addr == 1, always return failure.
-   If dev->base_addr == 2, allocate space for the device and return success
-   (detachable devices only).
-
-   FIXME: we should use the parport layer for this
-   */
+ * If dev->base_addr == 0, probe all likely locations.
+ * If dev->base_addr == 1, always return failure.
+ * If dev->base_addr == 2, allocate space for the device and return success
+ * (detachable devices only).
+ *
+ *  FIXME: we should use the parport layer for this
+ */
 static int __init atp_init(void)
 {
 	int *port, ports[] = {0x378, 0x278, 0x3bc, 0};
 	int base_addr = io[0];
 
-	if (base_addr > 0x1ff)		/* Check a single specified location. */
+	if (base_addr > 0x1ff)	/* Check a single specified location. */
 		return atp_probe1(base_addr);
 	else if (base_addr == 1)	/* Don't probe at all. */
 		return -ENXIO;
 
 	for (port = ports; *port; port++) {
 		long ioaddr = *port;
+
 		outb(0x57, ioaddr + PAR_DATA);
 		if (inb(ioaddr + PAR_DATA) != 0x57)
 			continue;
@@ -246,7 +247,7 @@ static const struct net_device_ops atp_netdev_ops = {
 	.ndo_set_rx_mode	= set_rx_mode,
 	.ndo_tx_timeout		= tx_timeout,
 	.ndo_change_mtu		= eth_change_mtu,
-	.ndo_set_mac_address 	= eth_mac_addr,
+	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_validate_addr	= eth_validate_addr,
 };
 
@@ -259,7 +260,8 @@ static int __init atp_probe1(long ioaddr)
 
 	outb(0xff, ioaddr + PAR_DATA);
 	/* Save the original value of the Control register, in case we guessed
-	   wrong. */
+	 *  wrong.
+	 */
 	saved_ctrl_reg = inb(ioaddr + PAR_CONTROL);
 	if (net_debug > 3)
 		printk("atp: Control register was %#2.2x.\n", saved_ctrl_reg);
@@ -330,8 +332,7 @@ static int __init atp_probe1(long ioaddr)
 		printk(KERN_INFO "%s", version);
 #endif
 
-	printk(KERN_NOTICE "%s: Pocket adapter found at %#3lx, IRQ %d, "
-	       "SAPROM %pM.\n",
+	printk(KERN_NOTICE "%s: Pocket adapter found at %#3lx, IRQ %d, SAPROM %pM.\n",
 	       dev->name, dev->base_addr, dev->irq, dev->dev_addr);
 
 	/* Reset the ethernet hardware and activate the printer pass-through. */
@@ -349,7 +350,7 @@ static int __init atp_probe1(long ioaddr)
 	if (dev->mem_end & 0xf)
 		net_debug = dev->mem_end & 7;
 
-	dev->netdev_ops 	= &atp_netdev_ops;
+	dev->netdev_ops		= &atp_netdev_ops;
 	dev->watchdog_timeo	= TX_TIMEOUT;
 
 	res = register_netdev(dev);
@@ -374,7 +375,8 @@ static void __init get_node_ID(struct net_device *dev)
 	write_reg(ioaddr, CMR2, CMR2_EEPROM);	  /* Point to the EEPROM control registers. */
 
 	/* Some adapters have the station address at offset 15 instead of offset
-	   zero.  Check for it, and fix it if needed. */
+	 * zero.  Check for it, and fix it if needed.
+	 */
 	if (eeprom_op(ioaddr, EE_READ(0)) == 0xffff)
 		sa_offset = 15;
 
@@ -385,9 +387,8 @@ static void __init get_node_ID(struct net_device *dev)
 	write_reg(ioaddr, CMR2, CMR2_NULL);
 }
 
-/*
-  An EEPROM read command starts by shifting out 0x60+address, and then
-  shifting in the serial data. See the NatSemi databook for details.
+/* An EEPROM read command starts by shifting out 0x60+address, and then
+ * shifting in the serial data. See the NatSemi databook for details.
  *		   ________________
  * CS : __|
  *			   ___	   ___
@@ -404,6 +405,7 @@ static unsigned short __init eeprom_op(long ioaddr, u32 cmd)
 
 	while (--num_bits >= 0) {
 		char outval = (cmd & (1<<num_bits)) ? EE_DATA_WRITE : 0;
+
 		write_reg_high(ioaddr, PROM_CMD, outval | EE_CLK_LOW);
 		write_reg_high(ioaddr, PROM_CMD, outval | EE_CLK_HIGH);
 		eedata_out <<= 1;
@@ -414,25 +416,25 @@ static unsigned short __init eeprom_op(long ioaddr, u32 cmd)
 	return eedata_out;
 }
 
-
 /* Open/initialize the board.  This is called (in the current kernel)
-   sometime after booting when the 'ifconfig' program is run.
-
-   This routine sets everything up anew at each open, even
-   registers that "should" only need to be set once at boot, so that
-   there is non-reboot way to recover if something goes wrong.
-
-   This is an attachable device: if there is no private entry then it wasn't
-   probed for at boot-time, and we need to probe for it again.
-   */
+ * sometime after booting when the 'ifconfig' program is run.
+ *
+ * This routine sets everything up anew at each open, even
+ * registers that "should" only need to be set once at boot, so that
+ * there is non-reboot way to recover if something goes wrong.
+ *
+ * This is an attachable device: if there is no private entry then it wasn't
+ * probed for at boot-time, and we need to probe for it again.
+ */
 static int net_open(struct net_device *dev)
 {
 	struct net_local *lp = netdev_priv(dev);
 	int ret;
 
 	/* The interrupt line is turned off (tri-stated) when the device isn't in
-	   use.  That's especially important for "attached" interfaces where the
-	   port or interrupt may be shared. */
+	 * use.  That's especially important for "attached" interfaces where the
+	 *  port or interrupt may be shared.
+	 */
 	ret = request_irq(dev->irq, atp_interrupt, 0, dev->name, dev);
 	if (ret)
 		return ret;
@@ -450,40 +452,41 @@ static int net_open(struct net_device *dev)
 }
 
 /* This routine resets the hardware.  We initialize everything, assuming that
-   the hardware may have been temporarily detached. */
+ *   the hardware may have been temporarily detached.
+ */
 static void hardware_init(struct net_device *dev)
 {
 	struct net_local *lp = netdev_priv(dev);
 	long ioaddr = dev->base_addr;
-    int i;
+	int i;
 
 	/* Turn off the printer multiplexer on the 8012. */
 	for (i = 0; i < 8; i++)
 		outb(mux_8012[i], ioaddr + PAR_DATA);
 	write_reg_high(ioaddr, CMR1, CMR1h_RESET);
 
-    for (i = 0; i < 6; i++)
+	for (i = 0; i < 6; i++)
 		write_reg_byte(ioaddr, PAR0 + i, dev->dev_addr[i]);
 
 	write_reg_high(ioaddr, CMR2, lp->addr_mode);
 
 	if (net_debug > 2) {
 		printk(KERN_DEBUG "%s: Reset: current Rx mode %d.\n", dev->name,
-			   (read_nibble(ioaddr, CMR2_h) >> 3) & 0x0f);
+		       (read_nibble(ioaddr, CMR2_h) >> 3) & 0x0f);
 	}
 
-    write_reg(ioaddr, CMR2, CMR2_IRQOUT);
-    write_reg_high(ioaddr, CMR1, CMR1h_RxENABLE | CMR1h_TxENABLE);
+	write_reg(ioaddr, CMR2, CMR2_IRQOUT);
+	write_reg_high(ioaddr, CMR1, CMR1h_RxENABLE | CMR1h_TxENABLE);
 
 	/* Enable the interrupt line from the serial port. */
 	outb(Ctrl_SelData + Ctrl_IRQEN, ioaddr + PAR_CONTROL);
 
 	/* Unmask the interesting interrupts. */
-    write_reg(ioaddr, IMR, ISR_RxOK | ISR_TxErr | ISR_TxOK);
-    write_reg_high(ioaddr, IMR, ISRh_RxErr);
+	write_reg(ioaddr, IMR, ISR_RxOK | ISR_TxErr | ISR_TxOK);
+	write_reg_high(ioaddr, IMR, ISRh_RxErr);
 
 	lp->tx_unit_busy = 0;
-    lp->pac_cnt_in_tx_buf = 0;
+	lp->pac_cnt_in_tx_buf = 0;
 	lp->saved_tx_size = 0;
 }
 
@@ -496,23 +499,22 @@ static void trigger_send(long ioaddr, int length)
 
 static void write_packet(long ioaddr, int length, unsigned char *packet, int pad_len, int data_mode)
 {
-    if (length & 1)
-    {
-    	length++;
-    	pad_len++;
-    }
-
-    outb(EOC+MAR, ioaddr + PAR_DATA);
-    if ((data_mode & 1) == 0) {
+	if (length & 1) {
+		length++;
+		pad_len++;
+	}
+
+	outb(EOC+MAR, ioaddr + PAR_DATA);
+	if ((data_mode & 1) == 0) {
 		/* Write the packet out, starting with the write addr. */
 		outb(WrAddr+MAR, ioaddr + PAR_DATA);
 		do {
 			write_byte_mode0(ioaddr, *packet++);
-		} while (--length > pad_len) ;
+		} while (--length > pad_len);
 		do {
 			write_byte_mode0(ioaddr, 0);
-		} while (--length > 0) ;
-    } else {
+		} while (--length > 0);
+	} else {
 		/* Write the packet out in slow mode. */
 		unsigned char outbyte = *packet++;
 
@@ -528,10 +530,10 @@ static void write_packet(long ioaddr, int length, unsigned char *packet, int pad
 			write_byte_mode1(ioaddr, *packet++);
 		while (--length > 0)
 			write_byte_mode1(ioaddr, 0);
-    }
-    /* Terminate the Tx frame.  End of write: ECB. */
-    outb(0xff, ioaddr + PAR_DATA);
-    outb(Ctrl_HNibWrite | Ctrl_SelData | Ctrl_IRQEN, ioaddr + PAR_CONTROL);
+	}
+	/* Terminate the Tx frame.  End of write: ECB. */
+	outb(0xff, ioaddr + PAR_DATA);
+	outb(Ctrl_HNibWrite | Ctrl_SelData | Ctrl_IRQEN, ioaddr + PAR_CONTROL);
 }
 
 static void tx_timeout(struct net_device *dev)
@@ -539,8 +541,8 @@ static void tx_timeout(struct net_device *dev)
 	long ioaddr = dev->base_addr;
 
 	printk(KERN_WARNING "%s: Transmit timed out, %s?\n", dev->name,
-		   inb(ioaddr + PAR_CONTROL) & 0x10 ? "network cable problem"
-		   :  "IRQ conflict");
+	       inb(ioaddr + PAR_CONTROL) & 0x10 ? "network cable problem"
+	       :  "IRQ conflict");
 	dev->stats.tx_errors++;
 	/* Try to restart the adapter. */
 	hardware_init(dev);
@@ -562,7 +564,8 @@ static netdev_tx_t atp_send_packet(struct sk_buff *skb,
 	netif_stop_queue(dev);
 
 	/* Disable interrupts by writing 0x00 to the Interrupt Mask Register.
-	   This sequence must not be interrupted by an incoming packet. */
+	 * This sequence must not be interrupted by an incoming packet.
+	 */
 
 	spin_lock_irqsave(&lp->lock, flags);
 	write_reg(ioaddr, IMR, 0);
@@ -574,22 +577,23 @@ static netdev_tx_t atp_send_packet(struct sk_buff *skb,
 	lp->pac_cnt_in_tx_buf++;
 	if (lp->tx_unit_busy == 0) {
 		trigger_send(ioaddr, length);
-		lp->saved_tx_size = 0; 				/* Redundant */
+		lp->saved_tx_size = 0;		/* Redundant */
 		lp->re_tx = 0;
 		lp->tx_unit_busy = 1;
-	} else
+	} else {
 		lp->saved_tx_size = length;
+	}
 	/* Re-enable the LPT interrupts. */
 	write_reg(ioaddr, IMR, ISR_RxOK | ISR_TxErr | ISR_TxOK);
 	write_reg_high(ioaddr, IMR, ISRh_RxErr);
 
-	dev_kfree_skb (skb);
+	dev_kfree_skb(skb);
 	return NETDEV_TX_OK;
 }
 
-
 /* The typical workload of the driver:
-   Handle the network interface interrupts. */
+ * Handle the network interface interrupts.
+ */
 static irqreturn_t atp_interrupt(int irq, void *dev_instance)
 {
 	struct net_device *dev = dev_instance;
@@ -611,20 +615,25 @@ static irqreturn_t atp_interrupt(int irq, void *dev_instance)
 	write_reg(ioaddr, CMR2, CMR2_NULL);
 	write_reg(ioaddr, IMR, 0);
 
-	if (net_debug > 5) printk(KERN_DEBUG "%s: In interrupt ", dev->name);
-    while (--boguscount > 0) {
+	if (net_debug > 5)
+		printk(KERN_DEBUG "%s: In interrupt ", dev->name);
+	while (--boguscount > 0) {
 		int status = read_nibble(ioaddr, ISR);
-		if (net_debug > 5) printk("loop status %02x..", status);
+
+		if (net_debug > 5)
+			printk("loop status %02x..", status);
 
 		if (status & (ISR_RxOK<<3)) {
 			handled = 1;
 			write_reg(ioaddr, ISR, ISR_RxOK); /* Clear the Rx interrupt. */
 			do {
 				int read_status = read_nibble(ioaddr, CMR1);
+
 				if (net_debug > 6)
 					printk("handling Rx packet %02x..", read_status);
 				/* We acknowledged the normal Rx interrupt, so if the interrupt
-				   is still outstanding we must have a Rx error. */
+				 * is still outstanding we must have a Rx error.
+				 */
 				if (read_status & (CMR1_IRQ << 3)) { /* Overrun. */
 					dev->stats.rx_over_errors++;
 					/* Set to no-accept mode long enough to remove a packet. */
@@ -636,14 +645,17 @@ static irqreturn_t atp_interrupt(int irq, void *dev_instance)
 				} else if ((read_status & (CMR1_BufEnb << 3)) == 0) {
 					net_rx(dev);
 					num_tx_since_rx = 0;
-				} else
+				} else {
 					break;
+				}
 			} while (--boguscount > 0);
 		} else if (status & ((ISR_TxErr + ISR_TxOK)<<3)) {
 			handled = 1;
-			if (net_debug > 6)  printk("handling Tx done..");
-			/* Clear the Tx interrupt.  We should check for too many failures
-			   and reinitialize the adapter. */
+			if (net_debug > 6)
+				printk("handling Tx done..");
+			/* Clear the Tx interrupt.  We should check for too many
+			 * failures and reinitialize the adapter.
+			 */
 			write_reg(ioaddr, ISR, ISR_TxErr + ISR_TxOK);
 			if (status & (ISR_TxErr<<3)) {
 				dev->stats.collisions++;
@@ -653,38 +665,41 @@ static irqreturn_t atp_interrupt(int irq, void *dev_instance)
 					break;
 				}
 				/* Attempt to retransmit. */
-				if (net_debug > 6)  printk("attempting to ReTx");
+				if (net_debug > 6)
+					printk("attempting to ReTx");
 				write_reg(ioaddr, CMR1, CMR1_ReXmit + CMR1_Xmit);
 			} else {
 				/* Finish up the transmit. */
 				dev->stats.tx_packets++;
 				lp->pac_cnt_in_tx_buf--;
-				if ( lp->saved_tx_size) {
+				if (lp->saved_tx_size) {
 					trigger_send(ioaddr, lp->saved_tx_size);
 					lp->saved_tx_size = 0;
 					lp->re_tx = 0;
-				} else
+				} else {
 					lp->tx_unit_busy = 0;
+				}
 				netif_wake_queue(dev);	/* Inform upper layers. */
 			}
 			num_tx_since_rx++;
 		} else if (num_tx_since_rx > 8 &&
 			   time_after(jiffies, dev->last_rx + HZ)) {
 			if (net_debug > 2)
-				printk(KERN_DEBUG "%s: Missed packet? No Rx after %d Tx and "
-					   "%ld jiffies status %02x  CMR1 %02x.\n", dev->name,
-					   num_tx_since_rx, jiffies - dev->last_rx, status,
-					   (read_nibble(ioaddr, CMR1) >> 3) & 15);
+				printk(KERN_DEBUG "%s: Missed packet? No Rx after %d Tx and %ld jiffies status %02x  CMR1 %02x.\n",
+				       dev->name, num_tx_since_rx, jiffies - dev->last_rx, status,
+				       (read_nibble(ioaddr, CMR1) >> 3) & 15);
 			dev->stats.rx_missed_errors++;
 			hardware_init(dev);
 			num_tx_since_rx = 0;
 			break;
-		} else
+		} else {
 			break;
-    }
+		}
+	}
 
 	/* This following code fixes a rare (and very difficult to track down)
-	   problem where the adapter forgets its ethernet address. */
+	 *  problem where the adapter forgets its ethernet address.
+	 */
 	{
 		int i;
 		for (i = 0; i < 6; i++)
@@ -695,22 +710,24 @@ static irqreturn_t atp_interrupt(int irq, void *dev_instance)
 	}
 
 	/* Tell the adapter that it can go back to using the output line as IRQ. */
-    write_reg(ioaddr, CMR2, CMR2_IRQOUT);
+	write_reg(ioaddr, CMR2, CMR2_IRQOUT);
 	/* Enable the physical interrupt line, which is sure to be low until.. */
 	outb(Ctrl_SelData + Ctrl_IRQEN, ioaddr + PAR_CONTROL);
 	/* .. we enable the interrupt sources. */
 	write_reg(ioaddr, IMR, ISR_RxOK | ISR_TxErr | ISR_TxOK);
-	write_reg_high(ioaddr, IMR, ISRh_RxErr); 			/* Hmmm, really needed? */
+	write_reg_high(ioaddr, IMR, ISRh_RxErr); /* Hmmm, really needed? */
 
 	spin_unlock(&lp->lock);
 
-	if (net_debug > 5) printk("exiting interrupt.\n");
+	if (net_debug > 5)
+		printk("exiting interrupt.\n");
 	return IRQ_RETVAL(handled);
 }
 
 #ifdef TIMED_CHECKER
 /* This following code fixes a rare (and very difficult to track down)
-   problem where the adapter forgets its ethernet address. */
+ * problem where the adapter forgets its ethernet address.
+ */
 static void atp_timed_checker(unsigned long data)
 {
 	struct net_device *dev = (struct net_device *)data;
@@ -727,19 +744,19 @@ static void atp_timed_checker(unsigned long data)
 		lp->last_rx_time = jiffies;
 #else
 		for (i = 0; i < 6; i++)
-			if (read_cmd_byte(ioaddr, PAR0 + i) != atp_timed_dev->dev_addr[i])
-				{
-			struct net_local *lp = netdev_priv(atp_timed_dev);
-			write_reg_byte(ioaddr, PAR0 + i, atp_timed_dev->dev_addr[i]);
-			if (i == 2)
-			  dev->stats.tx_errors++;
-			else if (i == 3)
-			  dev->stats.tx_dropped++;
-			else if (i == 4)
-			  dev->stats.collisions++;
-			else
-			  dev->stats.rx_errors++;
-		  }
+			if (read_cmd_byte(ioaddr, PAR0 + i) != atp_timed_dev->dev_addr[i]) {
+				struct net_local *lp = netdev_priv(atp_timed_dev);
+
+				write_reg_byte(ioaddr, PAR0 + i, atp_timed_dev->dev_addr[i]);
+				if (i == 2)
+					dev->stats.tx_errors++;
+				else if (i == 3)
+					dev->stats.tx_dropped++;
+				else if (i == 4)
+					dev->stats.collisions++;
+				else
+					dev->stats.rx_errors++;
+			}
 #endif
 	}
 	spin_unlock(&lp->lock);
@@ -757,23 +774,26 @@ static void net_rx(struct net_device *dev)
 
 	/* Process the received packet. */
 	outb(EOC+MAR, ioaddr + PAR_DATA);
-	read_block(ioaddr, 8, (unsigned char*)&rx_head, dev->if_port);
+	read_block(ioaddr, 8, (unsigned char *)&rx_head, dev->if_port);
 	if (net_debug > 5)
 		printk(KERN_DEBUG " rx_count %04x %04x %04x %04x..", rx_head.pad,
-			   rx_head.rx_count, rx_head.rx_status, rx_head.cur_addr);
+		       rx_head.rx_count, rx_head.rx_status, rx_head.cur_addr);
 	if ((rx_head.rx_status & 0x77) != 0x01) {
 		dev->stats.rx_errors++;
-		if (rx_head.rx_status & 0x0004) dev->stats.rx_frame_errors++;
-		else if (rx_head.rx_status & 0x0002) dev->stats.rx_crc_errors++;
+		if (rx_head.rx_status & 0x0004)
+			dev->stats.rx_frame_errors++;
+		else if (rx_head.rx_status & 0x0002)
+			dev->stats.rx_crc_errors++;
 		if (net_debug > 3)
 			printk(KERN_DEBUG "%s: Unknown ATP Rx error %04x.\n",
-				   dev->name, rx_head.rx_status);
+			       dev->name, rx_head.rx_status);
 		if  (rx_head.rx_status & 0x0020) {
 			dev->stats.rx_fifo_errors++;
 			write_reg_high(ioaddr, CMR1, CMR1h_TxENABLE);
 			write_reg_high(ioaddr, CMR1, CMR1h_RxENABLE | CMR1h_TxENABLE);
-		} else if (rx_head.rx_status & 0x0050)
+		} else if (rx_head.rx_status & 0x0050) {
 			hardware_init(dev);
+		}
 		return;
 	} else {
 		/* Malloc up new buffer. The "-4" omits the FCS (CRC). */
@@ -787,7 +807,7 @@ static void net_rx(struct net_device *dev)
 		}
 
 		skb_reserve(skb, 2);	/* Align IP on 16 byte boundaries */
-		read_block(ioaddr, pkt_len, skb_put(skb,pkt_len), dev->if_port);
+		read_block(ioaddr, pkt_len, skb_put(skb, pkt_len), dev->if_port);
 		skb->protocol = eth_type_trans(skb, dev);
 		netif_rx(skb);
 		dev->last_rx = jiffies;
@@ -804,7 +824,7 @@ static void read_block(long ioaddr, int length, unsigned char *p, int data_mode)
 	if (data_mode <= 3) { /* Mode 0 or 1 */
 		outb(Ctrl_LNibRead, ioaddr + PAR_CONTROL);
 		outb(length == 8  ?  RdAddr | HNib | MAR  :  RdAddr | MAR,
-			 ioaddr + PAR_DATA);
+		     ioaddr + PAR_DATA);
 		if (data_mode <= 1) { /* Mode 0 or 1 */
 			do { *p++ = read_byte_mode0(ioaddr); } while (--length > 0);
 		} else { /* Mode 2 or 3 */
@@ -844,8 +864,7 @@ net_close(struct net_device *dev)
 	return 0;
 }
 
-/*
- *	Set or clear the multicast filter for this adapter.
+/* Set or clear the multicast filter for this adapter.
  */
 
 static void set_rx_mode(struct net_device *dev)
@@ -860,17 +879,20 @@ static void set_rx_mode(struct net_device *dev)
 	write_reg_high(ioaddr, CMR2, lp->addr_mode);
 }
 
-static int __init atp_init_module(void) {
-	if (debug)					/* Emit version even if no cards detected. */
+static int __init atp_init_module(void)
+{
+	if (debug)	/* Emit version even if no cards detected. */
 		printk(KERN_INFO "%s", version);
 	return atp_init();
 }
 
-static void __exit atp_cleanup_module(void) {
+static void __exit atp_cleanup_module(void)
+{
 	struct net_device *next_dev;
 
 	while (root_atp_dev) {
 		struct net_local *atp_local = netdev_priv(root_atp_dev);
+
 		next_dev = atp_local->next_module;
 		unregister_netdev(root_atp_dev);
 		/* No need to release_region(), since we never snarf it. */
-- 
2.1.3

^ permalink raw reply related

* Re: [Xen-devel] [PATCHv1 net-next] xen-netback: remove unconditional pull_skb_tail in guest Tx path
From: Zoltan Kiss @ 2014-11-03 18:23 UTC (permalink / raw)
  To: David Vrabel, Ian Campbell; +Cc: netdev, Malcolm Crossley, Wei Liu, xen-devel
In-Reply-To: <5457BF80.2000205@citrix.com>



On 03/11/14 17:46, David Vrabel wrote:
> On 03/11/14 17:39, Ian Campbell wrote:
>> On Mon, 2014-11-03 at 17:23 +0000, David Vrabel wrote:
>>> From: Malcolm Crossley <malcolm.crossley@citrix.com>
>>>
>>> Unconditionally pulling 128 bytes into the linear buffer is not
>>> required. Netback has already grant copied up-to 128 bytes from the
>>> first slot of a packet into the linear buffer. The first slot normally
>>> contain all the IPv4/IPv6 and TCP/UDP headers.
>>
>> What about when it doesn't? It sounds as if we now won't pull up, which
>> would be bad.
>
> The network stack will always pull any headers it needs to inspect (the
> frag may be a userspace page which has the same security issues as a
> frag with a foreign page).
I wouldn't bet my life on this, but indeed it should always happen.

>
> e.g., see skb_checksum_setup() called slightly later on in netback.
Fortunately the call to checksum_setup a bit later makes sure that at 
least the TCP/UDP headers are in the linear area, which is quite the 
same as blindly pulling 128 bytes. With any other protocol we rely on 
the network stack that it will enforce packet header in the linear buffer.

Regards,

Zoli

^ permalink raw reply

* Re: [PATCH] VNIC: Adding support for Cavium ThunderX network controller
From: Stephen Hemminger @ 2014-11-03 18:16 UTC (permalink / raw)
  To: Sunil Kovvuri
  Cc: Robert Richter, David S. Miller, Sunil Goutham, Robert Richter,
	Stefan Assmann, LKML, LAKML, netdev
In-Reply-To: <CA+sq2CfV-90CRvo-FCizkzoxHq-AsfOEDW2Z5pXtnNLNXCfTUg@mail.gmail.com>

On Fri, 31 Oct 2014 22:44:11 +0530
Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:

> On Fri, Oct 31, 2014 at 8:24 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > On Thu, 30 Oct 2014 17:54:34 +0100
> > Robert Richter <rric@kernel.org> wrote:
> >
> >> +#ifdef       VNIC_RSS_SUPPORT
> >> +static int rss_config = RSS_IP_HASH_ENA | RSS_TCP_HASH_ENA | RSS_UDP_HASH_ENA;
> >> +module_param(rss_config, int, S_IRUGO);
> >> +MODULE_PARM_DESC(rss_config,
> >> +              "RSS hash config [bits 8:0] (Bit0:L2 extended, 1:IP, 2:TCP, 3:TCP SYN, 4:UDP, 5:L4 extended, 6:ROCE 7:L3 bi-directional, 8:L4 bi-directional)");
> >> +#endif
> >
> > This should managed  be via ethtool ETHTOOL_GRXFH rather than a module parameter.
> Thanks, i will add setting hash options via ETHTOOL_SRXFH as well.
> The idea here is to have a choice of hash while module load (through
> module params) and if it needs to be changed runtime then
> via Ethtool.
> 
> Sunil.

Network developers do not like vendor unique module parameters.
Anything device specific doesn't work in a generic distro environment.

^ permalink raw reply

* [Patch net-next v2] neigh: remove dynamic neigh table registration support
From: Cong Wang @ 2014-11-03 18:14 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Cong Wang

Currently there are only three neigh tables in the whole kernel:
arp table, ndisc table and decnet neigh table. What's more,
we don't support registering multiple tables per family.
Therefore we can just make these tables statically built-in.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
v2: remove useless #ifdef's
    move the assignment to the end of neigh_table_init()

 include/net/neighbour.h |   9 +-
 net/core/neighbour.c    | 244 ++++++++++++++++++++++--------------------------
 net/decnet/dn_neigh.c   |   2 +-
 net/ipv4/arp.c          |   2 +-
 net/ipv6/ndisc.c        |   2 +-
 5 files changed, 121 insertions(+), 138 deletions(-)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index dedfb18..1e65c68 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -220,6 +220,13 @@ struct neigh_table {
 	struct pneigh_entry	**phash_buckets;
 };
 
+enum {
+	NEIGH_ARP_TABLE = 0,
+	NEIGH_ND_TABLE = 1,
+	NEIGH_DN_TABLE = 2,
+	NEIGH_NR_TABLES,
+};
+
 static inline int neigh_parms_family(struct neigh_parms *p)
 {
 	return p->tbl->family;
@@ -240,7 +247,7 @@ static inline void *neighbour_priv(const struct neighbour *n)
 #define NEIGH_UPDATE_F_ISROUTER			0x40000000
 #define NEIGH_UPDATE_F_ADMIN			0x80000000
 
-void neigh_table_init(struct neigh_table *tbl);
+void neigh_table_init(int index, struct neigh_table *tbl);
 int neigh_table_clear(struct neigh_table *tbl);
 struct neighbour *neigh_lookup(struct neigh_table *tbl, const void *pkey,
 			       struct net_device *dev);
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index edd0411..55b7e0b 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -56,7 +56,6 @@ static void __neigh_notify(struct neighbour *n, int type, int flags);
 static void neigh_update_notify(struct neighbour *neigh);
 static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev);
 
-static struct neigh_table *neigh_tables;
 #ifdef CONFIG_PROC_FS
 static const struct file_operations neigh_stat_seq_fops;
 #endif
@@ -87,13 +86,8 @@ static const struct file_operations neigh_stat_seq_fops;
    the most complicated procedure, which we allow is dev->hard_header.
    It is supposed, that dev->hard_header is simplistic and does
    not make callbacks to neighbour tables.
-
-   The last lock is neigh_tbl_lock. It is pure SMP lock, protecting
-   list of neighbour tables. This list is used only in process context,
  */
 
-static DEFINE_RWLOCK(neigh_tbl_lock);
-
 static int neigh_blackhole(struct neighbour *neigh, struct sk_buff *skb)
 {
 	kfree_skb(skb);
@@ -1520,7 +1514,9 @@ static void neigh_parms_destroy(struct neigh_parms *parms)
 
 static struct lock_class_key neigh_table_proxy_queue_class;
 
-static void neigh_table_init_no_netlink(struct neigh_table *tbl)
+static struct neigh_table *neigh_tables[NEIGH_NR_TABLES] __read_mostly;
+
+void neigh_table_init(int index, struct neigh_table *tbl)
 {
 	unsigned long now = jiffies;
 	unsigned long phsize;
@@ -1566,34 +1562,13 @@ static void neigh_table_init_no_netlink(struct neigh_table *tbl)
 
 	tbl->last_flush = now;
 	tbl->last_rand	= now + tbl->parms.reachable_time * 20;
-}
-
-void neigh_table_init(struct neigh_table *tbl)
-{
-	struct neigh_table *tmp;
 
-	neigh_table_init_no_netlink(tbl);
-	write_lock(&neigh_tbl_lock);
-	for (tmp = neigh_tables; tmp; tmp = tmp->next) {
-		if (tmp->family == tbl->family)
-			break;
-	}
-	tbl->next	= neigh_tables;
-	neigh_tables	= tbl;
-	write_unlock(&neigh_tbl_lock);
-
-	if (unlikely(tmp)) {
-		pr_err("Registering multiple tables for family %d\n",
-		       tbl->family);
-		dump_stack();
-	}
+	neigh_tables[index] = tbl;
 }
 EXPORT_SYMBOL(neigh_table_init);
 
 int neigh_table_clear(struct neigh_table *tbl)
 {
-	struct neigh_table **tp;
-
 	/* It is not clean... Fix it to unload IPv6 module safely */
 	cancel_delayed_work_sync(&tbl->gc_work);
 	del_timer_sync(&tbl->proxy_timer);
@@ -1601,14 +1576,6 @@ int neigh_table_clear(struct neigh_table *tbl)
 	neigh_ifdown(tbl, NULL);
 	if (atomic_read(&tbl->entries))
 		pr_crit("neighbour leakage\n");
-	write_lock(&neigh_tbl_lock);
-	for (tp = &neigh_tables; *tp; tp = &(*tp)->next) {
-		if (*tp == tbl) {
-			*tp = tbl->next;
-			break;
-		}
-	}
-	write_unlock(&neigh_tbl_lock);
 
 	call_rcu(&rcu_dereference_protected(tbl->nht, 1)->rcu,
 		 neigh_hash_free_rcu);
@@ -1626,12 +1593,32 @@ int neigh_table_clear(struct neigh_table *tbl)
 }
 EXPORT_SYMBOL(neigh_table_clear);
 
+static struct neigh_table *neigh_find_table(int family)
+{
+	struct neigh_table *tbl = NULL;
+
+	switch (family) {
+	case AF_INET:
+		tbl = neigh_tables[NEIGH_ARP_TABLE];
+		break;
+	case AF_INET6:
+		tbl = neigh_tables[NEIGH_ND_TABLE];
+		break;
+	case AF_DECnet:
+		tbl = neigh_tables[NEIGH_DN_TABLE];
+		break;
+	}
+
+	return tbl;
+}
+
 static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	struct net *net = sock_net(skb->sk);
 	struct ndmsg *ndm;
 	struct nlattr *dst_attr;
 	struct neigh_table *tbl;
+	struct neighbour *neigh;
 	struct net_device *dev = NULL;
 	int err = -EINVAL;
 
@@ -1652,39 +1639,31 @@ static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh)
 		}
 	}
 
-	read_lock(&neigh_tbl_lock);
-	for (tbl = neigh_tables; tbl; tbl = tbl->next) {
-		struct neighbour *neigh;
-
-		if (tbl->family != ndm->ndm_family)
-			continue;
-		read_unlock(&neigh_tbl_lock);
-
-		if (nla_len(dst_attr) < tbl->key_len)
-			goto out;
+	tbl = neigh_find_table(ndm->ndm_family);
+	if (tbl == NULL)
+		return -EAFNOSUPPORT;
 
-		if (ndm->ndm_flags & NTF_PROXY) {
-			err = pneigh_delete(tbl, net, nla_data(dst_attr), dev);
-			goto out;
-		}
+	if (nla_len(dst_attr) < tbl->key_len)
+		goto out;
 
-		if (dev == NULL)
-			goto out;
+	if (ndm->ndm_flags & NTF_PROXY) {
+		err = pneigh_delete(tbl, net, nla_data(dst_attr), dev);
+		goto out;
+	}
 
-		neigh = neigh_lookup(tbl, nla_data(dst_attr), dev);
-		if (neigh == NULL) {
-			err = -ENOENT;
-			goto out;
-		}
+	if (dev == NULL)
+		goto out;
 
-		err = neigh_update(neigh, NULL, NUD_FAILED,
-				   NEIGH_UPDATE_F_OVERRIDE |
-				   NEIGH_UPDATE_F_ADMIN);
-		neigh_release(neigh);
+	neigh = neigh_lookup(tbl, nla_data(dst_attr), dev);
+	if (neigh == NULL) {
+		err = -ENOENT;
 		goto out;
 	}
-	read_unlock(&neigh_tbl_lock);
-	err = -EAFNOSUPPORT;
+
+	err = neigh_update(neigh, NULL, NUD_FAILED,
+			   NEIGH_UPDATE_F_OVERRIDE |
+			   NEIGH_UPDATE_F_ADMIN);
+	neigh_release(neigh);
 
 out:
 	return err;
@@ -1692,11 +1671,14 @@ static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 static int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
+	int flags = NEIGH_UPDATE_F_ADMIN | NEIGH_UPDATE_F_OVERRIDE;
 	struct net *net = sock_net(skb->sk);
 	struct ndmsg *ndm;
 	struct nlattr *tb[NDA_MAX+1];
 	struct neigh_table *tbl;
 	struct net_device *dev = NULL;
+	struct neighbour *neigh;
+	void *dst, *lladdr;
 	int err;
 
 	ASSERT_RTNL();
@@ -1720,70 +1702,60 @@ static int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 			goto out;
 	}
 
-	read_lock(&neigh_tbl_lock);
-	for (tbl = neigh_tables; tbl; tbl = tbl->next) {
-		int flags = NEIGH_UPDATE_F_ADMIN | NEIGH_UPDATE_F_OVERRIDE;
-		struct neighbour *neigh;
-		void *dst, *lladdr;
+	tbl = neigh_find_table(ndm->ndm_family);
+	if (tbl == NULL)
+		return -EAFNOSUPPORT;
 
-		if (tbl->family != ndm->ndm_family)
-			continue;
-		read_unlock(&neigh_tbl_lock);
+	if (nla_len(tb[NDA_DST]) < tbl->key_len)
+		goto out;
+	dst = nla_data(tb[NDA_DST]);
+	lladdr = tb[NDA_LLADDR] ? nla_data(tb[NDA_LLADDR]) : NULL;
 
-		if (nla_len(tb[NDA_DST]) < tbl->key_len)
-			goto out;
-		dst = nla_data(tb[NDA_DST]);
-		lladdr = tb[NDA_LLADDR] ? nla_data(tb[NDA_LLADDR]) : NULL;
+	if (ndm->ndm_flags & NTF_PROXY) {
+		struct pneigh_entry *pn;
+
+		err = -ENOBUFS;
+		pn = pneigh_lookup(tbl, net, dst, dev, 1);
+		if (pn) {
+			pn->flags = ndm->ndm_flags;
+			err = 0;
+		}
+		goto out;
+	}
 
-		if (ndm->ndm_flags & NTF_PROXY) {
-			struct pneigh_entry *pn;
+	if (dev == NULL)
+		goto out;
 
-			err = -ENOBUFS;
-			pn = pneigh_lookup(tbl, net, dst, dev, 1);
-			if (pn) {
-				pn->flags = ndm->ndm_flags;
-				err = 0;
-			}
+	neigh = neigh_lookup(tbl, dst, dev);
+	if (neigh == NULL) {
+		if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
+			err = -ENOENT;
 			goto out;
 		}
 
-		if (dev == NULL)
+		neigh = __neigh_lookup_errno(tbl, dst, dev);
+		if (IS_ERR(neigh)) {
+			err = PTR_ERR(neigh);
+			goto out;
+		}
+	} else {
+		if (nlh->nlmsg_flags & NLM_F_EXCL) {
+			err = -EEXIST;
+			neigh_release(neigh);
 			goto out;
-
-		neigh = neigh_lookup(tbl, dst, dev);
-		if (neigh == NULL) {
-			if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
-				err = -ENOENT;
-				goto out;
-			}
-
-			neigh = __neigh_lookup_errno(tbl, dst, dev);
-			if (IS_ERR(neigh)) {
-				err = PTR_ERR(neigh);
-				goto out;
-			}
-		} else {
-			if (nlh->nlmsg_flags & NLM_F_EXCL) {
-				err = -EEXIST;
-				neigh_release(neigh);
-				goto out;
-			}
-
-			if (!(nlh->nlmsg_flags & NLM_F_REPLACE))
-				flags &= ~NEIGH_UPDATE_F_OVERRIDE;
 		}
 
-		if (ndm->ndm_flags & NTF_USE) {
-			neigh_event_send(neigh, NULL);
-			err = 0;
-		} else
-			err = neigh_update(neigh, lladdr, ndm->ndm_state, flags);
-		neigh_release(neigh);
-		goto out;
+		if (!(nlh->nlmsg_flags & NLM_F_REPLACE))
+			flags &= ~NEIGH_UPDATE_F_OVERRIDE;
 	}
 
-	read_unlock(&neigh_tbl_lock);
-	err = -EAFNOSUPPORT;
+	if (ndm->ndm_flags & NTF_USE) {
+		neigh_event_send(neigh, NULL);
+		err = 0;
+	} else
+		err = neigh_update(neigh, lladdr, ndm->ndm_state, flags);
+	neigh_release(neigh);
+
 out:
 	return err;
 }
@@ -1982,7 +1954,8 @@ static int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh)
 	struct neigh_table *tbl;
 	struct ndtmsg *ndtmsg;
 	struct nlattr *tb[NDTA_MAX+1];
-	int err;
+	bool found = false;
+	int err, tidx;
 
 	err = nlmsg_parse(nlh, sizeof(*ndtmsg), tb, NDTA_MAX,
 			  nl_neightbl_policy);
@@ -1995,19 +1968,21 @@ static int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh)
 	}
 
 	ndtmsg = nlmsg_data(nlh);
-	read_lock(&neigh_tbl_lock);
-	for (tbl = neigh_tables; tbl; tbl = tbl->next) {
+
+	for (tidx = 0; tidx < NEIGH_NR_TABLES; tidx++) {
+		tbl = neigh_tables[tidx];
+		if (!tbl)
+			continue;
 		if (ndtmsg->ndtm_family && tbl->family != ndtmsg->ndtm_family)
 			continue;
-
-		if (nla_strcmp(tb[NDTA_NAME], tbl->id) == 0)
+		if (nla_strcmp(tb[NDTA_NAME], tbl->id) == 0) {
+			found = true;
 			break;
+		}
 	}
 
-	if (tbl == NULL) {
-		err = -ENOENT;
-		goto errout_locked;
-	}
+	if (!found)
+		return -ENOENT;
 
 	/*
 	 * We acquire tbl->lock to be nice to the periodic timers and
@@ -2118,8 +2093,6 @@ static int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 errout_tbl_lock:
 	write_unlock_bh(&tbl->lock);
-errout_locked:
-	read_unlock(&neigh_tbl_lock);
 errout:
 	return err;
 }
@@ -2134,10 +2107,13 @@ static int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 
 	family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family;
 
-	read_lock(&neigh_tbl_lock);
-	for (tbl = neigh_tables, tidx = 0; tbl; tbl = tbl->next, tidx++) {
+	for (tidx = 0; tidx < NEIGH_NR_TABLES; tidx++) {
 		struct neigh_parms *p;
 
+		tbl = neigh_tables[tidx];
+		if (!tbl)
+			continue;
+
 		if (tidx < tbl_skip || (family && tbl->family != family))
 			continue;
 
@@ -2168,7 +2144,6 @@ static int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 		neigh_skip = 0;
 	}
 out:
-	read_unlock(&neigh_tbl_lock);
 	cb->args[0] = tidx;
 	cb->args[1] = nidx;
 
@@ -2351,7 +2326,6 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 	int proxy = 0;
 	int err;
 
-	read_lock(&neigh_tbl_lock);
 	family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family;
 
 	/* check for full ndmsg structure presence, family member is
@@ -2363,8 +2337,11 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 
 	s_t = cb->args[0];
 
-	for (tbl = neigh_tables, t = 0; tbl;
-	     tbl = tbl->next, t++) {
+	for (t = 0; t < NEIGH_NR_TABLES; t++) {
+		tbl = neigh_tables[t];
+
+		if (!tbl)
+			continue;
 		if (t < s_t || (family && tbl->family != family))
 			continue;
 		if (t > s_t)
@@ -2377,7 +2354,6 @@ static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 		if (err < 0)
 			break;
 	}
-	read_unlock(&neigh_tbl_lock);
 
 	cb->args[0] = t;
 	return skb->len;
diff --git a/net/decnet/dn_neigh.c b/net/decnet/dn_neigh.c
index c8121ce..845957a 100644
--- a/net/decnet/dn_neigh.c
+++ b/net/decnet/dn_neigh.c
@@ -591,7 +591,7 @@ static const struct file_operations dn_neigh_seq_fops = {
 
 void __init dn_neigh_init(void)
 {
-	neigh_table_init(&dn_neigh_table);
+	neigh_table_init(NEIGH_DN_TABLE, &dn_neigh_table);
 	proc_create("decnet_neigh", S_IRUGO, init_net.proc_net,
 		    &dn_neigh_seq_fops);
 }
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 16acb59..205e147 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -1292,7 +1292,7 @@ static int arp_proc_init(void);
 
 void __init arp_init(void)
 {
-	neigh_table_init(&arp_tbl);
+	neigh_table_init(NEIGH_ARP_TABLE, &arp_tbl);
 
 	dev_add_pack(&arp_packet_type);
 	arp_proc_init();
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 4cb45c1..c67f339 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1763,7 +1763,7 @@ int __init ndisc_init(void)
 	/*
 	 * Initialize the neighbour table
 	 */
-	neigh_table_init(&nd_tbl);
+	neigh_table_init(NEIGH_ND_TABLE, &nd_tbl);
 
 #ifdef CONFIG_SYSCTL
 	err = neigh_sysctl_register(NULL, &nd_tbl.parms,

^ permalink raw reply related

* Re: [PATCH] stmmac: remove custom implementation of print_hex_dump()
From: Joe Perches @ 2014-11-03 18:11 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Giuseppe Cavallaro, netdev, Kweh Hock Leong, David S . Miller,
	Vince Bridgers
In-Reply-To: <1415037237.472.1.camel@linux.intel.com>

On Mon, 2014-11-03 at 19:53 +0200, Andy Shevchenko wrote:
> On Mon, 2014-11-03 at 09:39 -0800, Joe Perches wrote:
> > On Mon, 2014-11-03 at 19:28 +0200, Andy Shevchenko wrote:
> > > There is a kernel helper to dump buffers in a hexdecimal format. This patch
> > > substitutes the open coded function by calling that helper.
> > []
> > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > []
> > > @@ -191,14 +191,8 @@ static void stmmac_clk_csr_set(struct stmmac_priv *priv)
> > >  
> > >  static void print_pkt(unsigned char *buf, int len)
> > >  {
> > > -	int j;
> > > -	pr_debug("len = %d byte, buf addr: 0x%p", len, buf);
> > > -	for (j = 0; j < len; j++) {
> > > -		if ((j % 16) == 0)
> > > -			pr_debug("\n %03x:", j);
> > > -		pr_debug(" %02x", buf[j]);
> > > -	}
> > > -	pr_debug("\n");
> > > +	pr_debug("len = %d byte, buf addr: 0x%p\n", len, buf);
> > > +	print_hex_dump(KERN_DEBUG, " ", DUMP_PREFIX_OFFSET, 16, 1, buf, len, 0);
> > 
> > Prefix should be ""
> 
> Oh, initially there is an indentation in one space.
> Maybe Giuseppe would comment on this.

I think that's not particularly important.

This changes the output anyway as the the printed offset
is now 8 chars wide not 3.

> > Another (better?) option would be to use:
> > 
> > 	print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, buf, len);
> 
> > so that it can be dynamically controlled like
> > the pr_debug statements.
> 
> Ah, yes, but unfortunately it will print ASCII part always. Giuseppe,
> what do you think?

I'm not bothered by logging message output changes.
It's not a guaranteed stable output.

> P.S. [off topic] Joe, just would like to know if you have you seen my
> last version of the patch series against hexdump.c which adds
> seq_hex_dump() call [1]. If so, could you comment on it?
> [1] https://lkml.org/lkml/2014/9/4/309

seq_<foo> output however is guaranteed.

So any conversion to the seq_hex_dump function
would need to be exactly the same.

New code could use seq_hex_dump.

Other than that, it seems sensible enough.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox